Operant Conditioning and Shaping
The two best tools you will ever find for training!
Operant Conditioning (Click and reward)
Operant conditioning is the process of teaching the dog (conditioning him) that he can operate (control) the outcome (the click) with his own behavior. Thus the process is called Operant Conditioning.
This involves taking something that has no meaning to the dog (like the word, “Good!” or “Yes!” or the sound of a clicker) and pairing it with something that has value to the dog (like food, play, or some other reward), to create an association between the two. This can be used to teach the animal behaviors and cues as explained below.
Food is called a “primary reinforcer” because the dog will work to get it; it is valuable on its own. After the click sound becomes paired with the reward (food), and the dog learns that the click predicts a reward, the click itself becomes a valuable signal. It tells the dog, “Marvelous! You just earned a treat!” We call this click a “bridge” or a “secondary reinforcer.” We also often call it a “Reward Marker” because it marks the behavior that earned the dog his reward. Brenda Aloff calls it a “Memory Marker” because the click tells the dog to “remember what you were doing when you heard the click” so he can repeat that behavior to get another click and reward.
The sequence is: desired behavior, then click, then reward.
To begin, take a whole pile of small treats, hide them from view within reach and sit with the dog. Click the clicker then reach for and give the dog a treat. Click the clicker then reach for and give the dog a treat. Click the clicker then reach for and give the dog a treat. Do this till you’ve used up the pile of treats (40 to 100 clicks then treats.) Remember to use tiny treats (pea size or smaller) --you’re going to be using them regularly. Just use enough to give him a taste of what you have. It is just enough to let him know he has been rewarded. They can taste it even if it’s just a crumb!
After you have completed the above exercise, your dog should start perking up at the sound of the click. Keep the treats hidden in a hand, on the table in a bag, between your crossed legs, or wherever works. The only indicator he has of getting the treat is hearing the click first. We don’t want him to associate getting the treat with seeing the treat or with your movement toward the treat; we want him to associate getting the treat with hearing the reward marker (the click). Now, when the dog perks at the sound of the click, you know he recognizes it as a predictor of good things to come. This is the first lesson.
Next, instead of clicking at random, you will click for a pre-determined behavior. It’s important that you decide what behavior you want from the dog so you will know when to click. Let’s use the behavior of having your dog touch your hand with his nose. This is known as a nose target. Be ready to click then hold out your other hand, palm open, about 4-8” from your dog’s nose. As soon as the dog comes near it to investigate with his nose, click and then deliver a reward. Hold out your hand again and repeat this process several times. Be sure to click as soon as the dog gets his nose near your hand the first several times. Then wait and see if the dog will make contact with your hand. Click as contact is made.
If the dog stops investigating your hand, you can help him by hiding a tiny piece of treat between your fingers for a few repetitions in a row. Then go back to offering your hand without the treat between your fingers. You will know that your dog understands that you want him to touch your hand with his nose when he is doing that as soon as you present your hand for at least 10 tries in a row without hesitation. Your dog has now learned that he can control when you click, through his own behavior! This is the basic principle of Operant Conditioning. When he is doing something (a nose target in this case), and he hears the click, his brain will instantly file that information and he will work to repeat the behavior that he was doing at the instant he heard the click. When the dog understands that, you have the power of the universe in your hand!
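If you like to keep a training log, the "10 tries in a row" test can be sketched as a tiny check. This is only an illustration: the function name `is_fluent` and the idea of recording each try as True (touched right away) or False (hesitated or missed) are assumptions made for the example, not part of any standard method.

```python
def is_fluent(tries, required_streak=10):
    """Return True if the most recent `required_streak` tries were all
    immediate nose touches.

    `tries` is a list of booleans, oldest first (hypothetical log format):
    True  = the dog touched your hand as soon as you presented it
    False = the dog hesitated or did not touch
    """
    if len(tries) < required_streak:
        return False  # not enough tries yet to judge
    return all(tries[-required_streak:])


# Hypothetical session: a couple of hesitations, then ten clean touches.
session = [True, False, True, False] + [True] * 10
print(is_fluent(session))
```

One hesitation inside the last ten tries resets the count, which matches the "in a row, without hesitation" wording above.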
When you catch the dog performing a behavior you want, click as the dog is doing what you want and then give a reward. Let’s say you want to teach the dog to sit. You could lure the dog into putting his butt on the floor with food, or wait for him to sit on his own. The instant his cute little butt plops on the floor, you click, and then give him his treat. It doesn’t matter if you are a half-second late delivering the treat--it’s the click that needs to be precise because it is telling him to choose THAT behavior among all of the others he might have been doing in the last minute (like standing, turning his head, breathing, barking, looking away, pawing you, sniffing the floor...).
Timing is critical. If you miss the behavior, don’t click. You snooze, you LOSE! You have lost the opportunity to reward THAT sit forever. Don’t panic. There will be more sits. Just be careful to reward them when they happen. If you click too late, and the dog is already switching to some other behavior (like getting up, lying down, sniffing, barking, etc.), then you will be rewarding that other behavior, and NOT the sit!
New clicker trainers need practice developing their timing. You can practice without the dog by watching TV and clicking each time a commercial comes on, or each time a certain actor or actress comes on the screen. Or practice at work during lunch, clicking each time someone passes by a certain spot or each time someone across the room takes a bite. Or sit in your car in the parking lot of a busy store and click each time someone goes in. Then make it harder and try to click as each person comes out. How far are they out the door before you click? You should be able to click as they are walking through the frame of the door. This is harder if you can’t see them approaching the door. Be sure you pick exactly what behavior you want to click in advance. Break the behavior down into its various pieces and decide which piece to click. Taking a bite of food, for example, involves lifting the food, opening the mouth, putting the food in the mouth, biting down, removing the fork/spoon, etc. What piece are you going to click?
Various studies have shown that beginning dog trainers:
A. DO NOT CLICK OFTEN ENOUGH. They are daydreaming, doing an impersonation of a “post,” not anticipating their dog’s behaviors, or they are expecting too much too fast. Be ready to click when the dog does what you want.
B. ARE TYPICALLY “THREE BEHAVIORS LATE” delivering a click. They are daydreaming, doing an impersonation of a “post,” or not anticipating their dog’s behaviors, so that they could be ready to click exactly when the dog does the behavior.
C. CANNOT DISTINGUISH BETWEEN WHAT TO CLICK AND WHAT NOT TO CLICK. They don’t have a clear picture in their mind of exactly what they are looking for from the dog. Their expectations are far too high (trying to get the dog to do more than he’s ready to understand), or they do not settle for a move toward the behavior they are looking for (instead they wait for the dog to do it all). See the information on “shaping” below.
This will help you: Don’t daydream! Pay attention to your dog. How else can you avoid missing that all-important first attempt at the desired behavior, so that you can click it?
Don’t be a post! If you stand there like a tree that your dog is tied to, and don’t give any feedback to your dog, you are like a post! When a dog is tied to a post he gets something we call “barrier frustration.” He’s tied there, nothing’s happening, he wants to be somewhere else, and he starts to dig, bark and pull at the end of the leash. You are NOTHING to the dog but an annoying anchor-point. YOU are making him uninterested in you! Work on things that are easy for the dog and work up to the harder stuff in small steps.
Anticipate the dog’s moves and be ready to click them. If he even glances at you for one millisecond, click him at that moment! And deliver a reward quickly. If he sits and looks up at you, well, holy cow, click! And give him a jackpot! But don’t expect a more advanced behavior like sitting and looking up at you in the beginning. Reward that first glint of eye contact, and soon you will have a dog which is ignoring the floor and the other dogs and is sitting there looking at you. You are the keeper of all that is wonderful! Treats and praise flow forth from you like magic. Suddenly, you have become more interesting than the floor!
Look at all the pieces of the behavior you want the dog to do. Are there behaviors that lead up to the one you want? The dog has to shift his weight and bend his knees before he can sit. Often the tail is raised out of the way as well. Once the dog is able to get into the sit position quickly, if you want “sit” to mean “stay there till I release you,” you will need to wait for a little more time in the sit position before you click. Decide (before the dog offers the behavior) how much time you plan to require before you click. Is it 1 second, 4 seconds, 10 seconds? If you don’t know, how can you expect the dog to know how to get the click?! If the dog can’t hold the sit for 4 seconds, then only require 1, 2 or 3 seconds until he starts understanding that it is duration you want.
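For those who think in numbers, here is one rough way to picture "decide the time before you click." It is a sketch under stated assumptions: the names `seconds_to_require` and `earns_click` are made up for this example, and the rule of asking for the whole seconds the dog has already managed (never less than 1, never more than your goal) is just one reasonable reading of the advice above.

```python
def seconds_to_require(goal_seconds, longest_hold_seconds):
    """Decide, BEFORE the dog sits, how long he must hold the sit.

    Assumption for this sketch: ask for the whole seconds the dog has
    already managed, but at least 1 second and no more than the goal.
    """
    achievable = int(longest_hold_seconds)  # drop fractions of a second
    return max(1, min(goal_seconds, achievable))


def earns_click(held_seconds, required_seconds):
    """Click only if the dog held the sit at least as long as required."""
    return held_seconds >= required_seconds


# Goal is a 4-second sit, but so far the dog's best hold is about 2.5 seconds.
required = seconds_to_require(goal_seconds=4, longest_hold_seconds=2.5)
print(required, earns_click(held_seconds=2, required_seconds=required))
```

The point is only that the number is fixed before the sit happens, so you already know whether a given sit earns the click.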
I have been clicker training my dogs for 13 years. It’s only in the past four years that we have been incorporating the clicker during the obedience class. We have asked people to teach their dogs to “wave” using the clicker (at home). Recent evidence has shown that using clickers during class is not as distracting as was originally believed. The dogs can distinguish between the owner’s click (which is “working” for the dog and bringing him what he wants) and someone else’s click five feet away (which doesn’t do anything for the dog). We are now encouraging our students to go to ALL training with the clicker.
Using a clicker as a reward marker has many advantages over using a verbal reward marker:
1. There is no tonal inflection, as there is with a voice. It sounds the same every time. Therefore, each member of the family can use it, and it will sound the same. Also, you can never make it sound “emotional” if you are upset, angry, or a “lousy praiser.”
2. Men (and inhibited, shy, or grumpy women and children) can use it without feeling stupid. We no longer have to brow-beat the men in class to get them to fork up some excited praise for their dog (it’s easier to get blood from a turnip!) This is not meant to “bash” men--I realize that they are socialized completely differently from women, and aren’t prone to gushy, appreciative remarks, bursts of joyful adoration, or giddy excitement at seemingly small accomplishments (that’s what we women are here for!)
3. It is instantaneous. A click only takes a quarter of a second, so you can mark a behavior which is occurring amidst an onslaught of rapid-fire behaviors being offered by the dog. If you took the time to say, “Gooooood Dogggggg!” the dog would wonder which one of the 18 behaviors he offered during that time span actually earned him the treat!
4. Once conditioned to the click as a reinforcer system, the dog will actually work harder to earn the click (secondary reinforcer) which brings the reward than he did to earn the reward (primary reinforcer) with no click.
5. The click is processed by the primitive part of the brain, the part that controls the instinctive reactions. Words are processed by the front brain (where the dog is also trying to figure out what you want). So using a verbal marker interrupts the dog’s thinking process, whereas the click is processed by a different part of the dog’s brain, allowing him to continue to think about the training puzzle yet still absorb the meaning of the click and how it relates to his own behavior.
6. Words are used around the dog all the time. Most words the dog does hear (on TV, when we talk with others, when we babble to them, etc.) mean nothing to him and have no consequences for the dog. So the dog is more likely to tune out a verbal marker. The click is unique and because of that, it is automatically more interesting and memorable to the dog.
Some rules for training with a clicker:
1. Never let young children (or an adult idiot) get their hands on the clicker. In one afternoon, they could extinguish much of the work you have done making the clicker mean something to the dog! “Extinguishing” is what happens when a behavior is repeated without being rewarded until the animal doesn’t offer the behavior any more. How long would you continue at your job if you didn’t get a paycheck?
2. Don’t click the clicker close to your ear or the dog’s ear. The sound is quite loud at close range.
3. Watch your dog’s behavior when you first start using the clicker. If the dog shies away, puts his ears back or startles every time you click, you may need to make a quieter click. This can be done by putting the clicker in your pocket, wrapping it in a towel, or using a retractable pen as your clicker. There is also a device called a “Click +” which has different sounds and volume levels.
4. Always follow the click with something the dog really likes. This can mean food, play, toy or anything the dog really wants to have or do again.
5. Remember, when you start shaping a behavior, THE QUALITY GOES IN BEFORE THE NAME GOES ON! Don’t NAME the behavior (create a cue word) until you have been getting the behavior regularly and rewarding it for a while. When that behavior becomes predictable (you know the dog is about to do the behavior correctly and completely), then, and only then, is it safe to name the behavior. This is explained in more detail below.
CUES
Cues are the names of the behaviors you want your dog to do (behaviors are done on cue). We used to call them commands, but that sounded too controlling and authoritative. We also want to think of cues a little differently than commands. A cue is an
The only punishment for the dog is withholding the reward. If a no-reward situation does not affect your dog, then your reward needs to be more powerful. If I gave out stickers to people who were on time for class, and you were late, you’d say “Ah No Big Deal!” Right? Stickers aren’t a very powerful reinforcer for most people. What if you found out I had given out hundred dollar bills to everyone who was on time, and you were late? DARN! You’d be a little upset, right? The rewards you choose to use need to be like $100 doggie bills. He has to have a strong desire to earn them. There has to be SOMETHING the dog really wants and enjoys!
But what if the dog gets something wrong?
I suggest you don’t use “NO” for this. First of all, “NO” is a swear-word for dogs (in my book). Second of all, people tend to use it for EVERYTHING, which is ineffective. And thirdly, the word NO tells the dog to stop, but not what he should do instead. He WILL be doing SOMETHING instead, so why not just tell him or help him understand what you want him to do in the first place? You can train completely without a punishing word or action (like a leash pop), using just the reward marker (click) when the dog is on track. If you are used to the outdated correction methods, you may be too focused on telling the dog when he's not right, instead of looking for an opportunity to click. Sometimes it's fun just to watch your dog figure something out on his own, observing which behaviors pay off, and not saying anything to distract him when he's off track. Operant Conditioning with Positive Reinforcement is the best way to teach anything. This goes for dogs, children, employees, and spouses, too.
For some reason, though, our society is very dependent on the punishment system to get cooperation from pets, kids, underlings and citizens. The punishment model fails miserably as a behavior modification tool. To quote Morgan Spector, author of Clicker Training for Obedience, "Behaviors built on a foundation of punishment REQUIRE punishment to maintain the behavior." A classic example of this is the speeding ticket. If cops did not give speeding tickets, how many of us would speed? For that matter, how many of us speed when we know there is no cop around, and only mind the speed limit when there is danger of us getting slapped with a heavy fine? However, if the police pulled us over unexpectedly, and gave us gift certificates for exemplary driving practices, we would be far more likely to drive well, even when there were no police in sight. Like at Dog Scout Camp, for example, when the "poop police" check to see if you have your baggies with you (for dog waste clean-up.) Those with baggies get tickets to get into drawings for neat prizes. Those that don't get nothing - Too bad! Our camp is always clean. People don't leave any excrement behind. As a matter of fact, we even have a drawing for people who go beyond the call of "dooty" and pick up "unclaimed" droppings that may have been missed by the dog's owner. The only danger this has presented is that there have been a few near collisions with potential head injuries as two or more people have claimed, "I've got it!" as they all dove in the direction of the stray poop at once, to earn the ticket!
Shaping is a process of rewarding successive approximations
Shaping is a tool which incorporates the click and reward to get a more complex or multi-step behavior. Shaping is how we teach the dogs to paint pictures at Dog Scout Camp (see the “Art of shaping” page in the “Dog Activities” section of the website.) Not too many dogs would just walk up and start painting a picture on their own. It's a complex behavior, made up of many parts.
Shaping takes you along a path from the very beginnings of a rudimentary behavior, to the completed product by successive steps. You have to know what to look for along the path, to know what to reward, and when to go on to reward what comes close (an approximation) to resembling the next part. This is the Art of Shaping.
The use of the clicker is crucial in shaping. It is important to communicate to the dog the exact instant he is performing the desired behavior. If your goal is to try to get him to raise his foot a half-inch higher than he did the last time, you have to time your click to match the instant his paw is reaching up that very slight bit HIGHER than it has before. It won't do to try to reward the higher wave AFTER the fact. He'll know that something he did was good, but he won't know what part of it was good, or better than before.
To get a behavior like painting, first the dog must lift his foot, so we would reward any movement on the dog's part which involved sitting or standing still and lifting one foot. Remember, you're not looking for the finished product, here. You'd never get it in a million years by accident. So don't be stingy! Reward anything in the direction of what you want. Reward the dog shifting his weight to the other foot so that he can lift the remaining front foot. Stare at the foot you want him to raise and be ready to click as soon as it leaves the ground! You may be able to help the dog along by trying to get the dog to paw at something, like a treat in your closed hand or a toy hiding under a blanket. But it needs to be the dog’s choice to lift his foot (for whatever reason.)
The thing about shaping is you have to know when to increase the criteria. If you continue to click a certain level of performance, it will be difficult for the dog to go beyond that level of performance. You have to "up the ante," to make the dog see that there is more and more to this new behavior. Just as with regular operant conditioning, you do not name the behavior until after you have the behavior completely finished. You could teach sit in one step, and once the dog "has it," you can name it. But with waving, or painting, there are many steps to go through, and you have to be careful not to get impatient and start blurting cues or instructions to your dog prematurely. Dogs don’t understand our language until certain words are paired with consequences. So telling the dog “raise your foot” is nothing but babble that interrupts his thinking.
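Here is a small sketch of what "upping the ante" might look like if you wrote it down as rules. Everything specific in it is an assumption chosen for illustration: the function name `shape`, measuring paw lifts in inches, starting at half an inch, and raising the bar by half an inch after three clicked lifts. Real dogs will need you to adjust these by feel.

```python
def shape(paw_heights, start_criterion=0.5, step=0.5, clicks_to_advance=3):
    """Replay a list of paw-lift heights (inches) against a rising criterion.

    A lift is clicked only if it meets the current criterion. After
    `clicks_to_advance` clicked lifts at one level, the criterion is
    raised by `step`. Returns (list of click decisions, final criterion).
    All numbers are illustrative assumptions, not training rules.
    """
    criterion = start_criterion
    clicks_at_level = 0
    clicks = []
    for height in paw_heights:
        clicked = height >= criterion
        clicks.append(clicked)
        if clicked:
            clicks_at_level += 1
            if clicks_at_level == clicks_to_advance:
                criterion += step      # up the ante
                clicks_at_level = 0    # start counting at the new level
    return clicks, criterion


# Hypothetical session: the fourth lift would have earned a click earlier,
# but by then the criterion has moved up, so it goes unclicked.
clicks, final_criterion = shape([0.5, 0.6, 0.5, 0.5, 1.0, 1.2, 1.0, 0.8])
print(clicks, final_criterion)
```

Notice that a lift which was good enough at the start stops paying off once the dog has shown he can do more. That is the whole idea of raising criteria.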
You must gradually get the dog to raise his paw higher and higher. You can do this in a number of ways. You can have him try to reach for something, or paw at something down low (eliciting the behavior) and gradually raise that object higher and higher, or you can wait for the behavior to happen on its own, which can take much longer. When the dog will stroke his paw through the air, then we will teach him to do several things: stroke several times for one click, stroke on a cardboard "easel," stroke while wearing the "Paintin' Paw" bootie, and finally, to stroke repeatedly on a cardboard easel while wearing a Paintin' Paw which has tempera paint loaded onto the sponge applicator. Now you can add a verbal cue, like "Paint" to the action your dog has learned to perform.
If you added the cue “paint” back when the dog was waving his paw in the air, and also said “paint” when the dog hit the board with his paw, and also said “paint” when the dog was wearing the bootie for the first time and was shaking his foot like a cat with a wet paw, just what does the cue “paint” mean? By waiting until you have the final, complex behavior being done repeatedly and correctly before you add the cue (just before the dog starts doing that complex behavior) there is no doubt exactly what the cue means. It becomes a predictor for the dog of what behavior will “pay off.”
The key element with operant conditioning AND shaping is that the dog's behavior is voluntary. You do not use any force to get him to do what you want. If you were to try to pick up the dog's paw and FORCE him to lift it, he would probably pull in the opposite direction. The dog will not learn to do things for himself if you are always doing them for him or jumping in too quickly to help him. This is why operant conditioning and shaping are such valuable tools. When the dog does something for himself, and gets rewarded, he will try to perform that behavior again and again. He is learning how to learn! Just like a young child. If you pushed or molded the dog to get him to do a particular behavior, he would be so busy resisting you, and not performing the act on his own, that he would not know how to "graduate" to doing the behavior without help. In other words, it is not only more humane and more fun to teach things with operant conditioning, but it is faster and easier as well.
Give it a try!