Anticipatory Conditioning? [B. F. Skinner]

Out of favor.
These days, for the most part, the behaviorist school of psychology has
been superseded by the more cognitive schools of psychology and is
currently relegated to the background in psychological research.
However, the behaviorist school was the primary mover in psychological
research for decades after it was inaugurated by John B. Watson in 1913,
and it reached its peak in the meticulous work of B. F. Skinner from the
1940s to the 1980s. This research is vast and still has much to tell us
today, if only we are able to see past the ideas that ignore internal
cognitive processes and try to understand the research in terms that
allow for them.
It should also be noted that although
behaviorist ideas are no longer at the forefront of academic
acceptability, they still have an enormous impact on the average
person in the street. Behaviorist ideas have seeped into the general
consciousness of people in today's societies, so much so that
they are often taken to be common sense. They have further
been widely accepted by our educational institutions, which seem only
dimly aware of the newer ideas in psychology and social psychology. Thus
it seems any discussion of learning must begin with some attempt to deal
with behaviorist ideas and find a way of perceiving behaviorist research
from a different perspective.
B. F. Skinner
and the other behaviorists have tried to create a science
of learning that is free of subjective experience. To this end,
they like to talk about only behavior and environmental conditions.
They ignore what is going on inside organisms in favor of that
which can be observed. They have some idea that anything which cannot
be observed is not real. However, if pushed, most will provide
a subjective explanation of what they believe is happening.
The behaviorist's
explanation is that behavior is channeled by the association of pleasure
and pain with other events. They will explain that some events are
associated with pleasure and that organisms try to replicate
those events. Similarly, some events are associated with pain,
and organisms try to escape from or avoid such events. They believe that
pleasure moves us forward and that pain holds us back. This seems
to make sense until we look clearly at what the implications
are for an organism. Are we just pleasure-seeking, pain-avoiding
organisms? Even a cursory glance at human activity shows that
this cannot be the whole truth.
Throwing out the baby with the bath water.
We have a choice:
we can either say that behaviorist ideas are deficient and must
be discarded, or we can try to reinterpret and modify their ideas
in terms of more viable theories of learning, e.g. Kelly's, Maslow's,
Deci and Ryan's, Dweck's or Popper's. Thus we can draw on the vast body
of experiments performed by the behaviorists and see how their findings
work when perceived in terms of Kelly's constructs or Popper's idea that
organisms need to search for consistency in the universe. This site chooses
to do the latter. We do not have to throw out the baby with
the bath water.
Operant
Conditioning?
Positive
Reinforcement? This is the cornerstone idea of the behaviorists.
The behaviorists commandeered the word reinforcement in order to
avoid any implication of intervening (unobservable) events. Reinforcement
previously meant making a structure stronger, more durable
and more resistant to change. The behaviorists use it to mean making
a behavior stronger, more durable and more resistant to change. This
completely ignores the organism performing the behavior and in particular
its intentions.
Be that as it may, the behaviorists imply that pleasure, or as they put it
"reinforcement", becomes associated with a behavior, which makes that
behavior likelier to recur. These behaviorists tend to apply their ideas to all
organisms. By this they could mean a human, a rat or an insect. Skinner
worked with what he called emitted behaviors. These are behaviors
performed by the organism for whatever reason, which the experimenter
tries to make more likely to occur again by reinforcing them. Reinforcement
usually takes the form of some kind of reward. For an animal the reward
might be food. For a human it might be praise. The reward is given to the
organism within moments of completing the required behavior. Not surprisingly,
the organism tends to perform the behavior statistically more often than
before.
But what is
really happening in the behaviorist experiments? First of all,
such things as food or praise might be better characterized as
part of satisfying that creature's needs. Most important here,
however, is the point of view of the organism. How does the organism
see, understand or construe what is happening? The moment we
put ourselves in the organism's place, what is really happening
becomes quite clear. The organism is anticipating being rewarded
with food or praise if it performs the required behavior. This
is to say, it has consciously or unconsciously formed a conjecture
that this behavior will be followed by something it wishes to
happen. The organism is not at the mercy of drives beyond its
control. It chooses to do something that it anticipates will
cause something pleasant or satisfying to occur.
Negative
Reinforcement? Negative reinforcement is nearly the same as
positive reinforcement. The behavior of the organism is still
rewarded, and thus reinforced, but this is accomplished using a
negative stimulus. The organism is placed in a situation where
it is continually unpleasantly stimulated. The reward or reinforcement
occurs when the organism is released from the discomfort of the
unpleasant stimulation or even just by its reduction. Negative
reinforcement usually takes the form of some kind of escape from
punishment. An animal might be allowed to escape from an unpleasant
environment, or a human might escape or be allowed to seek relief
from social pressures. If cessation of punishment occurs within moments of completing
the required behavior, of course the organism tends to perform
the behavior statistically more often than before.
To really
understand why the behaviorist view is likely to be only partially
correct, we must again put ourselves in the position of the organism.
Surely the organism is anticipating being able to escape from
discomfort if it performs the required behavior. This is to
say that more is happening than can be observed: the organism
has formed a conscious or unconscious conjecture that this behavior
will be followed by being able to avoid something it does not
wish to happen. Organisms are not at the mercy of automatic mechanisms
that determine their response; rather, this response is mediated
by anticipation. Thus they choose to perform an action that they anticipate
will allow them to avoid something that is likely to be unpleasant.
Positive
Aversion Inducement?
Positive aversion could be said to be the opposite of positive
reinforcement in that it has as its goal the reduction or extinction
of a behavior. The idea is that pain, discomfort, or any unpleasantness
(i.e. a negative stimulus) is associated with a behavior. This
is accomplished by following the behavior with a negative stimulus.
This, the behaviorists initially believed, would lead to the statistical
reduction of the behavior and its eventual extinction. Working
with emitted behaviors the experimenters tried to make some behavior
less likely to occur again by creating an aversion to it. Clearly
the action or behavior was being punished. The organism was punished
within moments of completing the behavior. The behaviorists were
most likely surprised to find that, while the organism tended
to perform the behavior statistically less often than before,
the effect was even less lasting than that of positive reinforcement.
Not only that, but they discovered that when previously reinforced
behaviors were subjected to strenuous and continuous negative
stimuli, the behaviors became less regular, and the organism
eventually had what could only be interpreted as a nervous breakdown.
In other words, it was no longer able to be guided properly
by whatever intellect it had.
These problems
with punishment, or the creating of an aversion, become clearer
if we are willing to accept that something else is going on, and
not just a reflexive response. Organisms process the information
coming in, and anticipate being punished if they perform that
behavior. The organisms consciously or unconsciously form a conjecture
that the behavior will be followed by something they wish to
avoid. Again the organisms are deciding what to do.
They are choosing not to do something because they anticipate that
refraining will enable them to avoid unpleasantness. However, if the behavior
is also rewarding they may decide to risk the punishment (unpleasantness)
in order to get the reward. Too much of this reward / punishment
conflict causes all creatures, including humans, to become very
disturbed. Of course it is possible to create real aversions
both in humans and animals, but these are mental disorders. We
call them phobias and they are understood to be irrational.
Negative
Aversion Inducement?
Although to my knowledge behaviorists have never mentioned this idea, it seems
there must be such a thing by virtue of logical symmetry. The idea with negative
aversion is still to make a behavior of an organism less likely to occur, but
this is accomplished using the absence of pleasure rather than punishment as such.
The organism is placed in a situation where it is continually pleasantly stimulated.
The organism is then deprived of this pleasure during or just following the behavior
that is to be reduced or eliminated. Here the organism is not trying to escape
punishment but rather trying to maintain the situation of continual reward. This
kind of absence of reward is often used by parents in punishing children as in
the idea of being grounded. If pleasure ceases during or within moments of the
completion of a behavior, the organism of course tends to perform the behavior
statistically less often than before.
But what is
the point of view of the organism? How and why does the organism react to
what is happening? Only by putting ourselves in the organism's position can we
begin to see past the behaviorist's view of what is happening. Again the organism
is anticipating being deprived of food or losing the privilege of
going out if it
performs the behavior. This is to say, it has consciously or unconsciously
formed a conjecture that this behavior will be followed by the loss of
something it enjoys. Again the organism is not at the mercy of drives beyond
its control. It chooses not to do something because it anticipates that
refraining will allow it to avoid the loss of something pleasant. However, if
the behavior is also rewarding, the organism may decide to risk the loss of
pleasantness in order to get something more pleasant. This kind of punishment
is obviously less dangerous in that it is less likely to produce a nervous
breakdown, but it is clearly not very effective. To the organism it now seems
that it is being asked to choose between two pleasures or two rewards, and it
can obviously choose the one it prefers.
Interest
and Disinterest. This site wishes to suggest that what the behaviorists
call the environmental conditioning of learning is actually the
development of interest and disinterest. All creatures,
far from being conditioned to learn, learn instead because they
anticipate. They anticipate learning will be followed by the
satisfaction of needs or be accompanied by pleasure. All creatures
(organisms) will choose to learn because they anticipate its
desirability. Likewise all creatures, far from being conditioned not to
learn, fail to learn because they anticipate that learning
will be accompanied by failure in the satisfaction of needs or
be followed by displeasure. All creatures, organisms if you will,
choose not to learn because they anticipate its undesirability.
Causation and superstition.
While, as explained above, humans can react rationally to form
expectations and anticipate events, we do not always do so. Our brains
and the brains of other creatures are pattern seeking devices that seek
to find ways of controlling events. Scientifically we understand that we
can control events through the process of causation. If we can see a
relation between action A and event B, where we understand that A causes B, we
can bring about event B by performing action A. The behaviorists of course
ignore all this subjective intervention, and instead of invoking
causality and anticipation, simply rely on associations of time and
place. Our brains seek causation, and sometimes we seem to find it
although it isn't there. When this occurs we cease real learning and
become superstitious. We find ourselves performing actions to cause or
avoid events where there is no causal relationship. Two things happen in
close proximity of time and space and we react as if there is a causal
effect between them, even when we understand on a rational level that
there is no causal relation. This mechanism in our brains that seeks
causality is usually fairly good at finding causality, but sometimes it
leads us astray.
Coincidence, gambling and sport.
The real relationship in these cases is that of coincidence. We tend to
ignore this however, and rationalize these actions as rituals we perform
for luck. Some of these are curious cultural leftovers from primitive
times such as unlucky 13, black cats, knocking on wood and spilt salt,
which are all easily recognized as socially pervasive superstitions. However,
most of what I would call superstition is personal, specific to
individual persons, and has formed out of coincidences that have
occurred in that person's life. Gamblers and sports people in particular
are very prone to these kinds of false, scientifically ridiculous
beliefs. In his book
"Don't Believe Everything You Think" Thomas Kida
provides the following examples:
"Wade Boggs was one of the most
proficient hitters in the history of baseball. He won the batting title
five times and had a lifetime batting average of .363. He is also highly
superstitious. Early on in his career he formed the belief that he could
hit better after eating chicken. For that reason, he ate chicken almost
every day for twenty years when he played baseball. He is not alone in
his superstitious behavior. Wayne Gretzky, the great hockey star, always
tucked in the right side of his jersey behind the hip pads. Jim Kelly,
the Buffalo Bills quarterback, forced himself to vomit before every
game. Bjorn Borg did not shave after he began to play in a major tennis
tournament. Bill Parcells would buy coffee from two different coffee
shops before every game when he coached the New York Giants."
Superstition
in Pigeons. Perhaps Skinner's most important contribution in
understanding how learning works is his paper "Superstition in the
Pigeon". This paper convincingly demonstrated that coincidence develops
superstitious behavior. Again in his book
"Don't Believe Everything You Think"
Thomas Kida explains the experiment behind Skinner's paper:
"Skinner put pigeons into separate
cages and had a prize (food) dropped periodically (remarkably similar to
slot machine payouts!) After just a few minutes, each bird exhibited a
different bizarre behavior. Some bobbed their heads up and down, others
walked in circles, while still others thrust their heads into different
places in the cage. It turned out that the birds repeated the behaviors
they performed just prior to receiving the food. Since they were
doing different things just before the food arrived, they developed
different rituals. In essence, the pigeons' behavior was the result of
coincidence based on what they were doing when the food appeared. So it
is with many human superstitions."
Superstition and persistence. The above experiment is easy to repeat, and it is
almost impossible to imagine another explanation for the strange
behaviors of the birds. In other words all the variables are
accounted for and nothing needs to be controlled for. If the birds are
not fed they die. If plenty of food is placed in each of the cages the
birds are not motivated. More food coming into the cage is unimportant
to the birds if they are well fed. The birds feel the necessity to try
and control their feed drop because it is not quite enough and the birds
are hungry. Basically the pigeons want something to happen (the food to
drop down). It seems to happen when they do something. So they try doing
that thing again. Of course it does not work but the birds do not stop
trying. The birds try again and again. Suddenly it does seem to work.
The pigeons try again and it does not work. But now the pigeons are much
more committed to their ritual. It has worked on at least two occasions.
The pigeons become persistent in their actions. They perform the actions
over and over again until eventually it does seem to work again.
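The way coincidence alone can lock each bird into its own ritual can be
sketched as a small simulation. This is only an illustration under stated
assumptions, not a reconstruction of Skinner's procedure: the behavior
names, the fixed food interval and the "boost" rule (whatever the bird
happened to be doing when food arrived becomes a little more likely) are
all hypothetical choices made for the sketch.

```python
import random

# Hypothetical repertoire; the names are illustrative only.
BEHAVIORS = ["bob head", "turn in circle", "peck corner", "flap wings"]


def simulate_pigeon(seed, steps=2000, food_every=15, boost=1.0):
    """Simulate one bird fed on a fixed timer, regardless of what it does.

    Assumption: the bird picks a behavior with probability proportional
    to a weight, and the behavior that happens to coincide with a food
    drop has its weight increased. Food never depends on the behavior.
    Returns each behavior's final share of the total weight.
    """
    rng = random.Random(seed)
    weights = {b: 1.0 for b in BEHAVIORS}
    for t in range(1, steps + 1):
        current = rng.choices(
            BEHAVIORS, weights=[weights[b] for b in BEHAVIORS]
        )[0]
        if t % food_every == 0:
            # Pure coincidence: food arrives on schedule, and whatever
            # the bird was doing at that moment gets "credited".
            weights[current] += boost
    total = sum(weights.values())
    return {b: weights[b] / total for b in BEHAVIORS}


if __name__ == "__main__":
    # Different seeds stand in for different birds.
    for bird in range(4):
        shares = simulate_pigeon(seed=bird)
        favorite = max(shares, key=shares.get)
        print(f"bird {bird}: favorite ritual = {favorite} "
              f"({shares[favorite]:.0%} of weight)")
```

Running it for several seeds lets one see whether different "birds" drift
toward different favorite actions even though no action ever caused food
to appear.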
Superstition and intermittent
reward. In the parlance of the behaviorists, these
pigeons are being reinforced in their actions. This kind of
reinforcement is called intermittent reinforcement. It is intermittent
because the pigeons are not always rewarded. The birds are only rewarded
sometimes, but this kind of schedule has been examined at length in the
behaviorist literature and found to be very effective in sustaining behavior.
The development of
human superstitious ritual. How does this work in the world of human
superstition? Let us consider a gambler. Suppose this gambler blows on
his dice just before he rolls them. Suppose he wins. Next time he blows on
the dice and loses. Does he give up blowing on the dice? No. He blows on
the dice and rolls again. Perhaps he loses again. Perhaps he blows and
loses three times in a row. Does he give up? Probably not. Perhaps he
blows and wins. Now the desire to blow on the dice is greater. It
has worked twice. (Never mind all the times it didn't work.) Perhaps if
he blew on the dice twenty times in a row and lost each time he would
stop blowing on the dice. But the likelihood of losing twenty times in a
row is small. In this way the human, too, seems to be intermittently
rewarded for blowing on the dice.
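How small is the chance of twenty losses in a row? A rough calculation
shows it, assuming each roll is independent and the gambler wins with some
fixed probability. The even-odds figure of 0.5 used here is an assumption
made for illustration, not the odds of any particular dice game.

```python
def prob_all_losses(p_win, n_rolls):
    """Probability of losing every one of n independent rolls."""
    return (1.0 - p_win) ** n_rolls


def prob_at_least_one_win(p_win, n_rolls):
    """Probability that the ritual 'works' at least once in n rolls."""
    return 1.0 - prob_all_losses(p_win, n_rolls)


if __name__ == "__main__":
    p = 0.5  # assumed even-odds bet, for illustration only
    for n in (3, 10, 20):
        print(n, prob_all_losses(p, n), prob_at_least_one_win(p, n))
```

Under that assumption, three losses in a row happen one time in eight,
while twenty in a row happen about once in a million tries, so the ritual
is almost certain to be "rewarded" again before the gambler abandons it.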
Logic and intermittent
reward. What happens if it is pointed out to the
person that there cannot be a causal connection between the
blowing on the dice and whether he wins or not? Scientifically this is a
given, so the person may be able to be convinced that there is no causal
relationship. Does this mean he will stop blowing on the dice? Not
likely. Even though people fully understand it is irrational, they will
usually still persist in performing the superstitious ritual. If they
perform the ritual enough times they may not be able to stop performing
the action. It may become a compulsion.
Justification and rationalization. If asked why they are continuing to perform
such an irrational action superstitious people may try to rationalize or
justify the action. In the case of the man blowing on the dice he may
say that doing it hurts no one. He may suggest that there are
possibilities beyond those understood in science. He may say that a
possibility, no matter how remote, is worth pursuing if the reward
is high enough and the effort required is small enough.
Humans are not completely rational. So
there it is. We humans are not completely rational. We can be rational,
but even the most rigorous scientific minds can be drawn into
performing irrational actions akin to magic. This kind of superstitious
behavior is so prevalent in human behavior that we are hardly aware that
we are indulging in it. All of us have little behavioral quirks where we
do things that make no rational sense. All of these behaviors can be
traced back to what we were doing when something good or bad happened.
If something bad happened we try to never again perform that action. If
something good happened we will try to perform the action again, as
often as is possible. There is in these cases no causal relation, and we are
aware that there is no causation, but we perform the action anyway. For
instance, although most people will say they do not believe in
horoscopes, many will still read them in magazines and sometimes act on
their advice.
Changing anticipation & expectation. Elsewhere in this site it is explained that
extrinsic reward is difficult to make work because its motivational
power tends to evaporate when it is removed. (This is particularly true
of extrinsic reward for learning.) However,
some behaviors persist after a reward or punishment has been
withdrawn. This is of course not incompatible with the idea that
expectation or anticipation will continue after a reward has
been withdrawn. The conjectures that we form about reality are
dogmatic, so a single event where a reward or punishment is not
forthcoming does not invalidate the conjecture, and thus the expectation
or anticipation does not weaken immediately. Of course,
if the reward or punishment continues not to be
forthcoming over a long period, then our conjecture must be, and is,
revised to fit the events, and the expectation or anticipation gradually fades.
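One way to picture this gradual fading is a simple running estimate that
moves only a little after each event. The update rule below is an
assumption chosen for illustration (an exponential moving average with a
hypothetical `rate` parameter); it is not offered as the mechanism
organisms actually use.

```python
def update_expectation(expectation, rewarded, rate=0.1):
    """Nudge the expectation a small step toward the latest outcome.

    A small rate models a "dogmatic" conjecture: one surprising event
    barely changes it, but many such events in a row eventually do.
    """
    outcome = 1.0 if rewarded else 0.0
    return expectation + rate * (outcome - expectation)


def expectation_history(outcomes, start=0.9, rate=0.1):
    """Return the expectation before and after each outcome in turn."""
    history = [start]
    for rewarded in outcomes:
        history.append(update_expectation(history[-1], rewarded, rate))
    return history


if __name__ == "__main__":
    # Thirty trials in a row with the reward withheld.
    history = expectation_history([False] * 30)
    print("after 1 miss:   ", round(history[1], 3))
    print("after 30 misses:", round(history[-1], 3))
```

With these illustrative numbers, a single missed reward lowers the
expectation only from 0.9 to 0.81, while thirty misses in a row bring it
close to zero.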
Behaviors
also sometimes do not fade despite the fact that a reward or
punishment is no longer forthcoming. Expectation and
anticipation can explain this far better than the behaviorist
associations ever could. The explanation is simple. The behavior
continues but the expectation or anticipation changes. A new conjecture
is formed involving a different reward, probably a different type of reward.
In other words the reward has been withdrawn, but is replaced
by another reward, one probably not taken into account by the
experimenter. With humans, we see evidence of this around us
all the time. We start off doing something for one reason (for
one type of reward) and end up doing the same thing for quite
a different reason (for another type of reward).
Rewards and
punishments in clusters.
Any behavior a living organism may perform will involve many conjectures
or many anticipations of pleasure and pain. Many of our needs, for
instance, can be satisfied by a single action. The person who does good
work in the community may satisfy every level on Maslow's hierarchy. The
behavior may put food in his mouth, it may make him feel safer. It may
help him to gain love and friendship and it will certainly increase the
regard in which he is held by others. On top of that, many meta needs
may be satisfied such as the need for creation and the need for justice.
These satisfactions can all be considered rewards. Learning
and accomplishment can also be considered rewards. In the final analysis
all these rewards and possible punishments must be considered in the
expectations of someone doing good community works.
Needs necessitate
anticipation. Our needs determine what we will try to anticipate.
But if needs provide the expectations of reward,
what provides the punishments?
The expectations of punishment can be found in the same needs.
When these needs are successfully satisfied they are rewards, and
when they are left unsatisfied they
are punishments. In making a choice as to whether to perform
a behavior, an organism must take into account many anticipated
rewards and punishments. It must decide if the pleasures it
anticipates are worth the pain it also anticipates. Of course this is
probably never done consciously.
Skinner boxes and the need to learn.
The Skinner box was invented by B.
F. Skinner to enable small animals and birds to have some control over
whether they get fed or not. A Skinner box is basically a cage with a
lever that, when pressed, causes a pellet of food to drop into a
dispenser in the cage.
In his book "The Upside of Irrationality" Dan Ariely describes an
experiment performed by psychologist Glen Jensen. In his experiment
Jensen places a hungry rat in a Skinner box. The rat accidentally pushes
the lever and learns that when it pushes the lever, a pellet of food
drops into the food dispenser, which enables it to eat. The rat then
engages in a lot of lever pushing activity. As soon as the rat seems to
get the hang of this, Jensen turns off the light in the box and
simultaneously deactivates the ability of the lever to dispense pellets
of food. The rat of course continues to push the lever but no pellets
are forthcoming. Eventually Jensen turns the light back on and reengages
the lever's power to dispense pellets. After having experienced this
several times the rat learns that the lever only works when the light is
on. Jensen then turns the light off in the box and places a cup filled
with food pellets into the box. The rat eventually finds this and begins
to eat. Then Jensen turns the light back on. Contrary to what might be
expected, out of hundreds of rats tested, all of them eventually returned
to pushing the lever. Dan Ariely in "The Upside of Irrationality" tells
us the following: "Jensen
discovered (and many subsequent experiments confirmed) that many animals
- including fish, gerbils, rats, mice, monkeys and chimpanzees - tend to
prefer a longer, more indirect route to food than a shorter, more direct
one. That is, as long as fish, birds, gerbils, rats, mice,
monkeys, and chimpanzees don't have to work too hard they frequently [seem
to] prefer to earn their food."
Contrafreeloading. Jensen's
theory about why this behavior occurs in the rats is what he calls 'contrafreeloading'.
He proposes that the rats go back to lever pressing, despite
the free food available in the cup, because they prefer to work for
food rather than get it for free.
Saving against tomorrow. There
can, however, be other reasons why animals might prefer to return to
lever pressing instead of gobbling up the free food. One possible reason
is that the animals may be motivated to keep their free food as a kind
of secure storage against a future where the lever is not working, as
in the lights-off condition. The rat may feel more secure knowing the
storage is there. In this case the rat can feed itself and still have
plenty in dire times.
Learning as finding limits through iteration. But there is another possibility, and a
more compelling one, that has to do with learning. From an evolutionary
point of view it is more important for an animal to learn how to provide
food for itself than it is to just eat. It may be possible, then,
that the act of pressing the lever can be construed by
the animal as an act of learning. Now you might be tempted to say the rat
is not learning, as it has already learned well that the lever pushing
produces food when the lights are on. But this is a misunderstanding of
how learning takes place.
Limits and iteration. Learning can be seen as a process of finding the limits
within which a principle works. Thus learning is the testing of hypotheses to find
the conditions within which those hypotheses are not valid. So the question is this:
when a rat presses the lever, is that mere repetition of an action it
has learned, or is it a variation of past actions used experimentally?
Is it an action testing an hypothesis about how to obtain food in the
future? If each action the animal performs is a unique effort to test an
hypothesis, the action is not simple repetition, but rather an iteration
of previous actions meant to test the hypothesis under slightly
different conditions. In the
case of of the rat, or any other creature provided with an opportunity
to choose between free food and pressing a lever to obtain food, the
variations in self performance they could be testing are infinite. What
might make a difference, as to whether pressing a lever causes food a
pellet to drop down? Well the amount of pressure the animal places on
the lever will obviously make a difference. Perhaps the speed at which
the lever is pressed could make a difference. Maybe how the animal
stands or where it stands when it pushes the lever will make a
difference. Maybe how far the lever is pushed down will make a
difference. The point is there may be many limits, and the animal, to be
secure about its future ability to procure food for itself, must learn as
many of them, and as much about them, as it can.
This is a profoundly important
principle of learning. Humans, animals, indeed all organisms, never truly
just repeat their actions. Rather we are continually testing hypotheses,
continually learning something new, continually refining our actions to
be closer and closer to perfection.
The environment. Behaviorists are so concerned with getting
something or someone to behave in a particular way that they seem
oblivious to the idea that controlling others might not be a good
thing. It is one thing to believe that the associations formed in what
they call reinforcement mean we could control the actions and the
direction of interest taken by others, but it is another thing to believe
that this would be good for society and therefore morally acceptable.
Manipulation of people has become such a part of western society that we
have become blind to the fact that such control is immoral and
unproductive. This is especially true when it comes to learning. As
science delves further and further into motivation and how the brain
works it is becoming clearer that learning should be in the hands of
each individual learner and not some other controlling person. Many
educational theorists now agree that the productive direction for the
study of learning is to discover how we can direct ourselves to become
more interested in many and various things and how we can successfully
avoid becoming disinterested in anything. In other words, if we are to
structure an environment to facilitate learning, it should be an environment
that engenders interest in us for anything and everything and does not
conveniently herd us in a particular direction.