POSITIVE REINFORCEMENT CAN'T STOP A BEHAVIOR
I can’t tell you how many times I’ve seen that phrase repeated in discussions of training methodologies, often as a justification for using tools like e-collars and prong collars. Immediately, those who disagree with such methods leap to their soapboxes, crying, “That’s not true!” or “Shock collars are abusive!!” I myself used to be one of those people. I would become increasingly upset by claims that prongs, e-collars, and other tools that operate via avoidance (the dog performs a behavior, or refrains from one, in order to avoid getting a correction) are the best option out there for training a dog.
Before we continue, I want to make one thing clear: I don’t use prong collars, e-collars, or any form of physical punishment in my training. I opt instead for clickers, toys, treats, and praise to train dogs. I have learned, both through practice and education, that this is the most effective and ethically sound way to get desired results for myself and my clients. However, I will also vehemently agree with the statement that positive reinforcement cannot stop a behavior. Let me explain.
When balanced trainers (a term for those who use both rewards and corrections in training, often utilizing tools like e-collars and prong collars in combination with food or toy rewards) say that positive reinforcement cannot stop a behavior from happening, they are absolutely correct. That is because the term “positive reinforcement” has gotten a bit muddied in the vernacular surrounding dog training. What the term actually means and what people think it stands for are often two entirely different things.
Bear with me while we get a little sciencey for a second. The term positive reinforcement refers to one of the four quadrants of operant conditioning. Anyone who’s ever taken an intro to psychology class has probably perked up with a “Hey, I remember that” thought in the back of their brain.
Operant conditioning is the process through which all organisms learn associations between behaviors and consequences. The scientist who categorized the effects of consequences on behavior is B.F. Skinner, whom you can learn more about in this video: https://www.youtube.com/watch?v=YIEt6TrjJXw. According to Skinner’s research (and that of others, such as Thorndike, who pre-dated Skinner by about 30 years), behavior that is followed by a pleasant consequence is likely to be repeated, while behavior that is followed by an unpleasant consequence is likely not to be repeated. Moreover, these associations can be categorized in four ways: positive reinforcement, negative reinforcement, positive punishment, and negative punishment. These are the four quadrants of operant conditioning. In the science world, positive and negative do not mean “good” and “bad.” They mean “to add” and “to take away.” That is why I described the consequences above as pleasant and unpleasant rather than positive and negative. To make it easier to understand, it helps not to think of reinforcement as good and punishment as bad, either. Instead, remember that reinforcement refers to increasing a behavior and punishment refers to decreasing a behavior. So, with that in mind, the definitions of the quadrants are as follows:
Positive reinforcement: to add a pleasant consequence in order to increase the frequency of a behavior
Positive punishment: to add something unpleasant in order to decrease the frequency of a behavior
The positive ones are much easier to understand. Now, here’s where it gets a bit tricky, as the negative quadrants are a bit more complicated:
Negative reinforcement: to take away something unpleasant to increase the frequency of a behavior
Negative punishment: to take away something pleasant to decrease the frequency of a behavior
Go ahead and read that over again, twice if you have to. I still find myself repeating this like some weird behavioral science mantra in my head, just to make sure I have it straight. Here are some real-life examples to make understanding the quadrants and how they apply to dog training a little easier:
Dog sits down when asked, dog gets a treat. Dog is more likely to sit in the future because it got them something pleasant. (Positive reinforcement)
Dog breaks from heel position and gets corrected by a tug on the leash attached to a prong collar on the dog’s neck. Dog is less likely to break from heel position in the future in order to avoid corrections with the prong collar. (Positive punishment)
Dog is standing in front of handler. Handler tightens the leash, applying constant pressure to a collar around the dog’s neck. Dog sits down, handler releases the pressure. Dog is more likely to sit in the future in order to avoid more pressure being put on his neck. (Negative reinforcement)
Dog jumps at handler when they come through the door. Handler does not interact with the dog until he is calmly standing on the floor, at which point the handler gives the dog a pat. Dog is less likely to jump in the future in order to receive more attention from his handler. (Negative punishment)
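For readers who think in code, the two-by-two structure behind these four examples can be sketched in a few lines of Python. This is only an illustration of the definitions given above, not anything taken from Skinner’s work; the names quadrant, stimulus_added, and behavior_increases, and the short labels for each example, are made up for this sketch.

```python
def quadrant(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant conditioning quadrant for a consequence.

    stimulus_added: True if something is added ("positive"),
        False if something is taken away ("negative").
    behavior_increases: True if the behavior becomes more frequent
        ("reinforcement"), False if it becomes less frequent ("punishment").
    """
    sign = "positive" if stimulus_added else "negative"
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {effect}"


# The four examples above, in the same order (labels are illustrative).
examples = [
    ("treat given for sitting", True, True),
    ("leash tug for breaking heel", True, False),
    ("leash pressure released when dog sits", False, True),
    ("attention withheld while dog jumps", False, False),
]

for description, added, increases in examples:
    print(f"{description}: {quadrant(added, increases)}")
```

Note that “good” and “bad” never appear as inputs: the name of the quadrant depends only on whether something was added or taken away, and on whether the behavior went up or down.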
Now, here’s a test: the last example technically describes two different quadrants. Do you know why? The dog is jumping because he wants attention from his handler. The handler refuses to give that attention while the dog is jumping in order to decrease the frequency of that behavior. That is negative punishment. But when the handler gives the dog attention for standing calmly on the ground (four on the floor), that is technically positive reinforcement; it is more likely the dog will repeat the behavior of standing calmly on the ground when the owner comes through the door because that was the behavior that got reinforced.
This leads us back to the original statement in question: “Positive reinforcement cannot stop a behavior.” Again, by the term’s own definition, this is technically true.
This also explains why stating that a prong collar (or e-collar, choke chain, etc.) is a “punishment” or “positive punishment tool” is also wrong. The collar itself does not apply any consequence to the dog on its own. A prong collar (or other tool) could be used to positively punish a behavior, or it could be used to negatively reinforce a behavior. The same is true for head collars, which are often seen as “less aversive” than prongs, e-collars, and choke chains. A head collar can still be used to positively punish or negatively reinforce a behavior (and is not inherently better or worse than the other tools, in my opinion, particularly if a dog finds a head collar to be unpleasant or scary, but a handler still opts to use it). The same can be said of flat collars, martingale collars, and even body harnesses. A more accurate statement would be that essentially all of the tools described above work via avoidance. Dogs will perform more or less of a behavior in order to avoid getting jerked with a prong, corrected with an e-collar, choked with a chain, etc.
The issue with using the statement “positive reinforcement cannot stop a behavior” as a justification for the use of corrections in training is that it’s extremely dismissive of the other quadrants of operant learning (particularly negative punishment, which is an extremely effective way to diminish an undesirable behavior). To illustrate this point, the inverse of this statement is also true: “Positive punishment cannot teach a behavior.” And it cannot. All it can do is tell the dog, “That behavior led to an unpleasant consequence.” But using that statement as an argument would be just as dismissive of the other quadrants. No, positive reinforcement alone cannot stop a behavior. However, positive reinforcement combined with negative punishment can not only stop an unwanted behavior, but teach a better, more desirable behavior for the dog to perform in its place.
There are many reasons why many dog trainers and behaviorists today opt not to use tools such as prongs, e-collars, and choke chains, or old-school methods like holding a puppy’s mouth closed if they nip at you or pinching a dog’s ear in order to get them to open their mouth to retrieve an object. I’ll list a few here:
Punishing a behavior does not mean the dog has forgotten the behavior; it means they’ve suppressed the behavior. If you constantly punish a dog for pulling on leash and they eventually stop pulling, it doesn’t mean they suddenly unlearned or forgot how to pull. They’re just suppressing the urge to pull in order to avoid getting a correction. However, there’s always a chance that the suppressed behavior can come back (and in my experience, it often does), particularly if you haven’t taught the dog an alternate behavior to perform in its place (which, by definition, can only be done using reinforcement). Learning not to pull and learning to walk on a loose lead are not necessarily the same thing.
Several studies have shown that using positive punishment and negative reinforcement can lead to an increase in aggressive behavior, fear, and excitability. I’ll link some studies down below for further reading on this subject as it could be an entire post on its own.
There are several organizations that have spoken out against the use of such methods, such as the American Veterinary Society of Animal Behavior. There are also professional member organizations, such as the Pet Professional Guild, that prohibit their members from using such tools and methods. Several dog training academies also require their graduates to sign an oath stating they will not use such tools or methods in their training. Anyone who is a member or graduate of such organizations would risk termination of their credentials should they violate those terms.
There is no data to suggest that positive punishment or negative reinforcement is more effective than using positive reinforcement and negative punishment in behavior modification. In fact, there is quite a bit of evidence to suggest the opposite is true.
That last one is really, really important to me. Putting aside any issues of fallout or questions of ethics, if you can train the same behavior without using force, fear, pain, discomfort, or avoidance, then why choose to use them? That is a question that I will answer in a separate post, as this one has already gotten research-paper-length and it is a topic deserving of its own discussion.
Studies:
(Ziv, 2017). The effects of using aversive training methods in dogs—A review. https://www.journalvetbehavior.com/article/S1558-7878(17)30035-7/abstract
(Arhant, Bubna-Littitz, Bartels, Futschik, & Troxler, 2010). Behaviour of smaller and larger dogs: effects of training methods, inconsistency of owner behaviour and level of engagement in activities with the dog. https://doi.org/10.1016/j.applanim.2010.01.003
(Hiby, Rooney, & Bradshaw, 2004). Dog training methods: their use, effectiveness and interaction with behaviour and welfare. Animal Welfare.
(Blackwell, Bolster, Richards, Loftus, & Casey, 2012). The use of electronic collars for training domestic dogs: estimated prevalence, reasons and risk factors for use, and owner perceived success as compared to other training methods. BMC Veterinary Research. https://doi.org/10.1186/1746-6148-8-93