Monday, July 14, 2014

On the Importance of Mathematics

I came to mathematics relatively late in life, as such things go.  Generally, when someone is passionate about something, they get “bit by the bug” early.  I’m rather unconventional in that regard: many of my passions came to me much later than they did to my colleagues.  I’m a magician who, though I did have a magic kit as a child, never became serious about the art until I began to realize its deep connections with psychology.  While I was always interested in science, I didn’t actually choose to go to university to pursue it until much later than most people make that decision (indeed, I didn’t go to university at all right out of high school, instead choosing to take several years off to pursue other things).  And when it comes to mathematics, most people likewise get bit by the bug early.  They learn a little trick for computing some arithmetic operation more quickly, and they develop a lifelong passion.  Not so for me.  I left high school barely knowing any algebra at all.  But now I’m pursuing a degree in mathematics.

Why do I point this out?  It’s to set the stage for a discussion that, while nothing new, is perhaps more relevant than ever: math education.  What I’m talking about today is not how to improve math education, though I will touch on some of those issues toward the end.  It’s simply a defense of math education as something worth the effort to improve.  I’m afraid this is the discussion we need to have first, because it has become common for people not only to unfairly dislike mathematics, but to actually believe they have no use for it.

Every mathematics teacher in the world is familiar with the dreaded question: “When am I ever going to use this?”  This is annoying enough, I’m sure, because it means the students are questioning the very validity of the subject the teacher is spending her time and energy trying to bring to their attention.  But while it may be annoying coming from inexperienced students, it’s hard to blame them, because the question is indicative of a much larger societal devaluation of mathematics, particularly in the United States (though I’m sure such attitudes are common, to varying degrees, throughout much of the world).

Evidence of this trend against mathematics can be found in that great new repository of society’s attitudes: the Facebook meme.  I’ve seen all of these come across my newsfeed at one time or another, and have found independent sources of the images so I can share them here.

Here is an entire blog post devoted to the idea that its author, as an adult, has never used long division, fractions, or algebra.  The page is broken up with embedded images, many of which I’ve seen before.

Similar sentiments are commonplace.  Consider this one from an author who claims that “As a marketer and communicator, I’ve never had much use for sophisticated math,” but that he found value in taking algebra only in that it taught him the “life lesson” of “being forced to do something uninteresting simply because someone in authority told me I had to do it.” (Never mind the fact that algebra hardly qualifies as “sophisticated math,” as evidenced by the fact that universities consider everything up to and including Calculus to be lower-level mathematics courses.)

There are other examples in the same vein.

Let me make something absolutely clear at this point.  It’s probably true that everyone is good at something, and everyone is bad at something.  This is not about some people being better at mathematics than other people.  Indeed, it’s not at all necessary that everyone should be a mathematician.  But it’s curious, isn’t it, that mathematics is (as far as I’m aware) the only field so commonly hated that people actually take pride in their ignorance of it?  No one takes such pride in illiteracy as they do in innumeracy.

There are many things of which I am ignorant.  For instance, I speak only one language.  Though I hope soon to rectify that, it is the truth at the moment.  But you don’t see me going around pridefully remarking that I have never needed to know Spanish or French.  I also don’t know how to knit, but I don’t spend any time trying to convince people that being a non-knitter is a good thing.  Only mathematics is so widely hated as a school subject and so widely regarded as a difficult field of study that people actually make snide remarks about their inability to perform mathematical operations, while trying to convince others that their ignorance is justified because the entire field is useless (unless you happen to be a scientist or engineer--this being the limited concession offered by most of the anti-mathematical crowd).

Now that I’ve set the stage by showing you the attitude mathematics educators are up against, I want to spend some time talking about why these people are wrong, and perhaps to help you to understand why mathematics is not only useful (yes, even to you), but fun, rewarding, and profitable.

Let’s begin by looking at why it’s patently false that all of these people don’t use mathematics in their day-to-day lives.  We should first clarify that I am, in fact, talking about mathematics as distinguished from arithmetic (a small subset of mathematics).  For the purposes of this essay, arithmetic refers to the basic operations: addition, subtraction, multiplication, division, exponents, roots.  While it is absolutely important, as part of a complete education, that someone learn how these arithmetic operations work and how to do them by hand (yes, that does include long division), this is, I would argue, the least important application of any sort of mathematics in daily life.  It’s also the only application most people think of, which is why we hear arguments like “I don’t need to learn math because I can balance my checkbook with a calculator.”  And that’s very true.  Because calculators are now ubiquitous, it’s significantly less important to be adept at arithmetic than it once was.  Arithmetic education’s importance these days has more to do with understanding how the operations work and knowing when to apply them than with being able to recite multiplication tables from memory.

Mathematics, however, which we can broadly (and not quite accurately, but it’s good enough for our purposes here) describe as anything you learn in a math course beginning with algebra, is another matter entirely.  This is the sort of mathematics that people think they do not use in day-to-day life, and yet it is precisely the sort of thinking we all must do every day.

To be sure, most people probably never see a problem like this: 3x^2+x-2=0 (the solutions, by the way, are x=-1 and x=2/3).  That’s very true, and that’s what people are thinking of when they say they never use algebra.  But that isn’t all algebra is.  It is algebra, yes, but it’s the very formalized algebra you learn in school.  It’s important to learn how to do algebra formally because once you’ve developed that knowledge and ability, you’re able to incorporate algebraic concepts into your life.
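
None of that formal machinery is mysterious, either; the quadratic formula is mechanical enough to write down as a short program.  Here is a minimal Python sketch (the function name and the sample equation are my own, chosen purely for illustration):

```python
import math

def solve_quadratic(a, b, c):
    """Return the real roots of a*x^2 + b*x + c = 0, sorted, if any exist."""
    disc = b * b - 4 * a * c  # the discriminant decides how many real roots there are
    if disc < 0:
        return []             # no real roots
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# x^2 - 5x + 6 = 0 factors as (x - 2)(x - 3), so the roots are 2 and 3:
print(solve_quadratic(1, -5, 6))  # → [2.0, 3.0]
```

The point is not that anyone needs this particular function, but that the formal rules you learn in school are exactly the steps a program like this carries out.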

Here’s a very brief (and certainly partial) list of some day-to-day applications in which one must use mathematics:

Tax returns
Calculating gratuities
Managing your car (fuel economy, distance traveled, etc.)
Making purchases

Depending on one’s profession, there are plenty of others, ranging from the very basic to the very advanced.

Another application recently came up when a family member was shopping for a television.  They knew the diagonal measurement and aspect ratio of the television but not its vertical and horizontal dimensions and needed to determine whether it would fit where they wanted to put it before making a purchase.  It’s a matter of simple geometry or trigonometry (trig makes it easier if you happen to know trig, but the Pythagorean Theorem and a bit of algebra will do the job, too) to figure that problem out.  I managed to provide the correct answer in a couple of minutes, ensuring that the purchase would work out for this person.
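
For the curious, that television calculation generalizes nicely: if a screen has a given diagonal and aspect ratio a:b, the Pythagorean Theorem pins down its width and height.  A minimal Python sketch (the 50-inch, 16:9 numbers below are hypothetical; the actual television’s measurements aren’t reproduced here):

```python
import math

def screen_dimensions(diagonal, ratio_w, ratio_h):
    """Width and height of a screen, from its diagonal and aspect ratio.

    By the Pythagorean Theorem, a screen of ratio_w:ratio_h units has a
    diagonal of sqrt(ratio_w^2 + ratio_h^2) units, so one "unit" is
    diagonal / sqrt(ratio_w^2 + ratio_h^2).
    """
    unit = diagonal / math.hypot(ratio_w, ratio_h)
    return ratio_w * unit, ratio_h * unit

# A hypothetical 50-inch, 16:9 television:
w, h = screen_dimensions(50, 16, 9)
print(f"{w:.1f} x {h:.1f} inches")  # roughly 43.6 x 24.5
```

A couple of minutes with pencil and paper does the same job; the formula is the interesting part, not the code.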

Mathematics is everywhere, and everyone can use it in direct applications.

But that’s not why it’s important.  That’s just a quick couple of examples to give the lie to the claim that people don’t regularly use algebra in day-to-day life.  The real importance of mathematics is that it’s a formalized way of analyzing information about the world around us.  As in my example with the television, we can use numbers to solve problems.  Mathematics is abstract, but its value is that we use that abstraction to tell us something about the concrete.

Algebra comes from the Arabic word al-jabr, which roughly means “the reunion of broken parts.”  Algebra is about manipulating variables to learn something about some unknown.  It’s not necessarily about “solving for x,” though that’s an easy symbolic way to solve problems.  Consider a simple example.

Story problem: Jim and Jon together have five apples.  Jon has three.  How many does Jim have?

Algebra: 3+x=5

In either case, we use some basic algebra, move a couple of numbers around, and we can determine that x=2, so Jim has two apples.

If you’ve ever done a problem like that (and yes you have--we all do problems like that every day), you’ve done algebra.  You may not have written it symbolically, and that’s fine.  You’ve still done the math.  Of course, the value of doing things symbolically is that problems become very complex, even in “real world” examples, so it’s useful to have a shorthand that lets you write everything out and make sure you solve it correctly.  Our ancestors did not have such systems, and their problem-solving abilities were limited by that lack.  We do have such systems, so it’s pointless to ignore the tools at our disposal.  Solving the problem symbolically is certainly no more difficult than solving it by the more “intuitive” method anyone would use in one’s head.  It’s exactly the same operation!  The symbolic method just gives us a tool to solve more complicated problems than those we can do mentally.
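
In fact, the symbolic recipe is so mechanical that a computer can carry it out.  A minimal Python sketch (the function name is my own) for any equation of the form ax + b = c:

```python
def solve_linear(a, b, c):
    """Solve a*x + b = c by doing symbolically what we do in our heads:
    subtract b from both sides, then divide both sides by a."""
    if a == 0:
        raise ValueError("not a linear equation in x")
    return (c - b) / a

# Jim and Jon's apples: x + 3 = 5, so a=1, b=3, c=5
print(solve_linear(1, 3, 5))  # → 2.0
```

The same two moves--undo the addition, undo the multiplication--solve every equation of this shape, which is exactly the value of writing the problem symbolically in the first place.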

Mathematics is also not just a set of operations.  There are plenty of people who will seldom (if ever) need to solve a quadratic equation.  Does that mean it has no utility for those people? Certainly not!  Mathematics is a method of formalized problem solving.  Learning mathematics, even if you never use it (which you do, though I will actually argue here that the direct applications are less important than the mental training), trains your mind to be able to tackle more and more difficult problems.  Even problems that have nothing to do with mathematics directly can benefit from mathematical thinking.  For instance, when I play chess, I tend not to mathematically analyze each move in the game (though I would be remiss if I didn’t point out that I do calculate some types of positions by “brute force” mathematical operations).  I’m not counting pawns and performing complex operations in order to determine the best move.  But I am problem-solving, and those analytical skills are not something we’re born with.  They are developed over a lifetime of education, and mathematics is the discipline which most directly builds our analytical abilities.

Suggesting that mathematics is an unimportant part of a standard education is clearly wrong, and for similar reasons to the suggestion that art is an unimportant part of a standard education.  I would argue that mathematics is of higher value in the limited sense that it is of greater utility in every (and yes, I do mean EVERY) profession, but let’s not worry about ranking.  Let’s just consider that the very people who seem to take pride in their failure to understand mathematics are often the first in line to defend the teaching of art and music in the public schools.  They are right to do so, but if they object to mathematics on the misguided view that they never use it, it’s fair to ask them when they last used their knowledge of Leonardo da Vinci in day-to-day life.  I would guess they probably never have.  And yet, no one would dare suggest that a study of Leonardo is anything less than essential to a complete education.

For the more economically minded readers, it’s also worth pointing out that mathematics is a great opportunity for profit.  The most obvious way is simply that those with a degree in mathematics are among the most employed (and highest paid) of college graduates.  A professor once told me that a greater proportion of math majors are admitted to medical school, for instance, than pre-med majors.  The same professor also told me of an acquaintance who specialized in consulting for government and industry on the mathematical optimization of their systems, a service for which he was easily able to command fees in excess of $10,000 per hour.  While these types of financial incentives may apply more to those who actually spend their lives studying mathematics (you don’t get $10,000 per hour just by knowing basic calculus), there is also a more general argument that any skill one possesses adds value to one’s marketability to employers, and I would suggest that mathematics, with its applicability in any field or profession, is arguably one of the greatest (if not the greatest) adders of value to one’s resume.  The more math you know, in other words, the better your job prospects will be.

Just as much as our culture is defined by our art, our cinema, our literature, it is also defined by mathematics.  This has always been true, but it is perhaps truer today than ever before.  In The Demon-Haunted World, Carl Sagan famously wrote, “We’ve arranged a global civilization in which the most crucial elements--transportation, communications, and all other industries; agriculture, medicine, education, entertainment, protecting the environment; and even the key democratic institution of voting, profoundly depend on science and technology.  We have also arranged things so that almost no one understands science and technology. This is a prescription for disaster. We might get away with it for a while, but sooner or later this combustible mixture of ignorance and power is going to blow up in our faces.”  He might just as well have been writing about mathematics.  In many ways, he was, because I have long contended that a significant (perhaps not the most significant, but surely a significant) reason for this cultural ignorance of science and technology is the widespread understanding that science and technology depend on mathematics, and people are afraid of mathematics.

Let us imagine, then, that you have no desire to extract yourself from the well of ignorance, that you don’t mind falling behind as mathematically literate people change the world, that you still don’t believe you use mathematics in your daily life (though you bloody well DO), and that you’re perfectly happy to divorce your consciousness from an important part of your culture.  Why should the person I have just described still care about mathematics?  Let us move away from the daily toil and look at some other applications of mathematics which arise, not every day, but still in every lifetime.

We can begin this discussion with the legal profession.  Mathematical literacy amongst lawyers (and hence, judges) is famously low.  Many of my colleagues share my hypothesis that a large part of the reason the market is saturated with more lawyers than we as a society know what to do with is the perception that a law degree is the most prestigious and highest-yielding graduate program one can undertake without significant amounts of mathematics.  People who desire a post-graduate education but remain afraid of mathematics, or insecure in their ignorance of it, flock to the law schools in an attempt to do the best they can without ever having to crack open a calculus book.  And yet, this is precisely the opposite of the way things should be.  The legal profession is failing to properly utilize one of its greatest tools, with the result that trials are presented to juries in which statistical analyses play a determining role, and none of the players involved (the judge, the jury, or the lawyers on either side) realize that the mathematics has been misapplied, misunderstood, or simply gotten wrong.

I have been reading a remarkable book lately entitled Math on Trial, a compendium of criminal cases in which the lawyers got their statistics wrong.  It includes such novice mistakes as multiplying non-independent probabilities to create the illusion of guilt where happenstance may be the more likely explanation.  Of course we need mathematics in the courtroom, but mathematics only works if the players involved know enough of it to do it correctly.  Otherwise, they might as well bring in a kindergartener to write numbers on the whiteboard at random for all the good it will do the pursuit of justice.
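
The independence mistake is easy to demonstrate with a toy joint distribution (the numbers below are invented for illustration and are not drawn from any real case).  Suppose two events each occur 10% of the time, but tend to occur together:

```python
# Hypothetical joint distribution over two events A and B.
# Each entry is P(A = a, B = b).
joint = {
    (True, True): 0.08,
    (True, False): 0.02,
    (False, True): 0.02,
    (False, False): 0.88,
}

p_a = sum(p for (a, _), p in joint.items() if a)  # marginal P(A) = 0.10
p_b = sum(p for (_, b), p in joint.items() if b)  # marginal P(B) = 0.10
p_both = joint[(True, True)]                      # true P(A and B) = 0.08
naive = p_a * p_b                                 # what a careless analyst multiplies

print(f"true P(A and B) = {p_both}, naive product = {naive:.2f}")
# The naive product understates the true joint probability eightfold--
# making the "coincidence" look eight times rarer than it really is.
```

That eightfold distortion is exactly the shape of error that turns an unremarkable coincidence into apparent proof of guilt.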

Why do I spend precious paragraphs talking about the legal profession when I know very well that a small minority of my readers are lawyers?  Because it isn’t just the lawyers who get it wrong.  So do the juries.  Any one of you can--and probably will--be called to serve on a jury (why you shouldn’t try to get out of that duty is a topic for another essay).  I think it’s safe to say that, if impaneled on a jury, every one of you would want to get things right.  None of you want to let the guilty go free and you certainly don’t want to wrongfully imprison the innocent.  There is also a fair chance, if the trial is of any importance at all, that statistical analysis will play some role in the evidence you’re presented when you’re sitting in that jury box.  And if not statistical evidence, then perhaps some other branch of mathematics.  The case in question may hinge completely on the geometry of the crime scene or the physics of the objects involved.  However it might manifest itself, there will likely be some mathematics.  So when you’re sitting in that jury box, do you believe the prosecutor when he tells you the odds that the defendant is guilty?  He could be mistaken; he could be lying.  There’s strong precedent for both.  Lawyers often get their mathematics wrong, and certainly prosecutorial misconduct is, unfortunately, not unheard of in the pursuit of winning the case.  But if you can’t trust him, do you trust the defense attorney to be able to adequately explain to you why he’s wrong?  Even if they both make mathematical arguments, how are you to determine whose mathematics is correct?  In order to serve justice in such a situation, you must be mathematically literate enough, if not to check their calculations, at least to ask the right questions.

Okay, so maybe you’re not worried about jury duty.  Perhaps you’re content to assume that the other jurors know enough mathematics that you can rely on them.  (This is obviously not the case, as most of the jurors are likely to be thinking exactly the same thing, but let’s imagine for a moment.)  How can you be a fully functional citizen and protect your own interests without at least a rudimentary understanding of mathematics?  Consider one of my past blog posts, in which I explain how pyramid schemes work.  The mathematics there is relatively simple (especially in the simplified form in which I presented it).  Most people can understand that math, but it illustrates the kind of thing I mean.  If you did not understand the math, you might be inclined to think the scheme was a good idea.  Similarly, if you don’t understand the math, you might be inclined to think more subtle schemes are good ideas.
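
I won’t reproduce that post here, but a quick sketch of the arithmetic shows why every pyramid scheme must collapse.  Assume, hypothetically, that each participant must recruit six new members:

```python
# Hypothetical pyramid scheme: each participant recruits 6 new members.
RECRUITS_PER_PERSON = 6
WORLD_POPULATION = 7_000_000_000  # rough 2014 figure

level, level_size, total = 0, 1, 1
while total < WORLD_POPULATION:
    level += 1
    level_size *= RECRUITS_PER_PERSON  # 6, 36, 216, ... people at this level
    total += level_size

print(f"level {level}: {total:,} total participants needed")
# By level 13 the scheme needs more participants than there are people on Earth.
```

Exponential growth runs out of planet almost immediately, which is the whole con: everyone below the last viable level necessarily loses.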

The mathematically illiterate are ill-equipped to handle the challenges of the world, particularly in the 21st century.  They may be the victims of fraud, they may misunderstand science, they may fall victim to predatory practices within the economic system.  They are not prepared to conduct business or vote as educated actors, but instead must rely on the often biased (whether intentionally or otherwise) information provided to them by outside sources.  No one knows everything, so of course it’s not reasonable to expect all people to be able to do all types of mathematics, but I think it is very reasonable to expect people to have a sufficient understanding of mathematics to a) solve the mathematical problems that occur in their lives, b) recognize mathematical problems with sufficient sophistication that they can seek answers from the correct sources, and c) know well enough when basic errors in mathematics have produced erroneous conclusions in the information they’ve been provided.

Journalists certainly have failed in their duty to remain mathematically literate.  Examples abound of newspapers publishing shoddy statistics or even simple arithmetic errors.  Perhaps you’ve seen political polling data.  Do you believe it?  Well, do you know enough math to understand how those numbers work--what a margin of error or a confidence level actually means?

And it’s not just about practical uses, either.  Mathematics is the language of the universe.  It is the language of nature.  It amazes me how many people can claim to be lovers of nature but fail to see the mathematical beauty it holds.  Consider the spiral of a nautilus shell, which closely follows a logarithmic spiral: there is a deep mathematical principle at work in that simple geometric pattern.  The motion of every object can be expressed mathematically, from the gentle floating of a dandelion seed drifting on a gust of wind to the motion of the entire Earth around the Sun or the entire solar system about the center of the Milky Way Galaxy.  Mathematics is everywhere, and it can be used to reveal the hidden patterns of everything we ever see or do.  Even the relatively mundane is the result of mathematics.  The advertisements you see online are determined by algorithms designed to match advertisements specifically to you, based on data profiles of people who visit websites similar to the ones you visit.  Whether you like that or not, it’s mathematics.  And it is an understanding of mathematics that allows one to form an informed opinion about such matters.  I’m certainly not saying that everyone should be able to write an algorithm capable of effectively targeting advertising.  But I am saying that everyone ought at least to have some idea of the mathematics behind that process.

No less a person than Abraham Lincoln made a point of mastering Euclid’s Elements (arguably one of the greatest products the human mind has ever produced).  Lincoln, a self-taught lawyer who never attended law school, recounted that he studied and nearly mastered the six books of Euclid after becoming a member of Congress, working through the proofs until he understood what it truly means to demonstrate something.  He had no intention of becoming a mathematician, but he recognized in mathematical thinking the foundation of an agile mind.  The sixteenth President of the United States of America considered his legal reasoning incomplete until he had mastered Euclid.  Think about that when you consider whether mathematics has relevance in non-mathematical fields.

So mathematics is important.  Why, then, are so many people afraid of it?  Why do people think they’re bad at it?  Why have they allowed themselves to persist in the delusion that they have never had need of it in the “real world,” despite mathematics’ deep connections to literally everything in the real world?

I think it comes down to a number of relatively simple factors.  The first is arguably the most insidious.  Isaac Asimov knew a thing or two about a thing or two.  He wrote over 500 books and is a very rare person indeed, having been published in nine of the ten major categories of the Dewey Decimal System.  So it is with deference to his great mind that we note his famous observation: “There is a cult of ignorance in the United States, and there always has been. The strain of anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that ‘my ignorance is just as good as your knowledge.’”  I think he nailed it.  I think people feel justified in their ignorance--and indeed, even proud of it to some extent--precisely because of the bizarre idea Asimov described.

My personal feeling is that this dangerous attitude stems from a fundamental misreading of the Declaration of Independence.  Thomas Jefferson did write that “all men are created equal,” but surely he could not have meant that everyone should achieve equal results in life despite varying degrees of effort.  Yet that is the view we collectively seem to have taken.  We live in a culture where children’s sporting events no longer have winners and losers and every child is given a ribbon for participating.  We are taught to value self-esteem above introspection and hard work.  Similarly, we have come to believe that “all men are created equal” means, beyond equal protection of the law, that all men’s ideas are created equal.  This is clearly an unjustifiable position, but evidence of its pervasiveness in our culture is overwhelming.  A simple look at commentary on Facebook (authored in reply to my postings and those of my acquaintances) provides clear examples: “it’s just my opinion, and it’s as valid as yours,” or “we’re all entitled to our opinions,” or “let’s just agree to disagree.”  These phrases signal an end to intellectual debate and a surrender to anti-intellectualism, because they mistake facts for opinions and assume that all of these “opinions” are of equal merit.

Once we as a people decided that we were collectively going to buy into this deadly lie, it was not a great stretch for many of us to come to the conclusion that mathematical illiteracy is just as good as mathematical literacy.  Similarly, I have recently seen it argued that belief in astrology is just as valid as belief in astronomy and that creationism and evolution should be given equal time in public school science classes.  We seem to have decided that mathematics is only for some people, and that the pride due to those who have spent their time developing a true mastery of advanced mathematics is also due to those who have no mathematical ability at all.

The second and, I think, far more important factor is the way mathematics is taught.  I don’t believe there is a country in the world whose educational system quite gets mathematics right, but I think our educational system in the United States gets it more wrong than most in the industrialized world.  While the attitude described above is, I think, the reason people feel pride rather than shame in their ignorance, I think educational failure is the reason for that ignorance in the first place, and that failure is a combination of several factors.

1) Peer influences poison the well.  It has been allowed to become common knowledge that mathematics is an intellectually difficult field of study that should be left to the “nerds.”  I will write elsewhere about why nerdiness is indeed something to be proud of, but for many students, this cultural stereotype creates a major problem.  It convinces students that, unless they are part of the “nerdy” crowd, they will struggle with mathematics.  Self-fulfilling prophecies are well documented in psychology, and this is no exception.  As Guinan said in Star Trek: The Next Generation (4.01; “The Best of Both Worlds, Part II”), “When a man is convinced he’s going to die tomorrow, he’ll probably find a way to make it happen.”  Similarly, if you’re convinced that you’re going to struggle or fail at mathematics, you will probably unconsciously sabotage yourself to the point that you do.

2) Mathematical opportunities are limited.  American primary and secondary schools offer relatively few mathematical options.  Students in elementary and middle school learn arithmetic and pre-algebra.  In high school, they take Algebra 1, Geometry, Algebra 2, and then (depending on the school) some further course or courses, probably in the following sequence: Trigonometry, maybe Pre-Calculus, then Calculus.  Rarely are they exposed to any other branch of mathematics, such as probability and statistics, number theory, or game theory.  Even at university, many of those fields remain entirely unknown to students who do not major in mathematics.

Now, I’m not saying that everyone should master all of these. Not at all.  I do think that statistics should be added to the mandatory sequence, at least to the extent that students learn the basics of interpreting statistical information so they can be informed consumers of information.  But do I think they need to study game theory? No.  On the other hand, do I think students should hear about game theory and number theory and many others? Absolutely.  I think part of the reason people think they dislike mathematics is because by the time they get through all the arithmetic (which is, quite frankly, the most boring part of mathematics), they’ve decided that mathematics isn’t for them, and they never realize the beauty of the more advanced disciplines.

In a short but powerful TED talk, Arthur Benjamin argues (go and watch that video now--it’s worth it) that calculus is the wrong summit of mathematical education for most students, and suggests teaching statistics before calculus.  Given the limited time and resources of the high school math department, I’m not sure if I would go quite as far as he does, because those students who do wish to study science or engineering have a significant leg up if they learn calculus while in high school (certainly, I would have saved a fair amount of time and money if I had entered university with a more complete math education).  But on the other hand, Benjamin is absolutely correct that statistics is a more important discipline in mathematics for the majority of students (every single person would benefit greatly from an understanding of probabilities and statistics), and it is beyond doubt that our educational system does a great disservice to our children by not at least offering statistics as an alternative for students who prefer not to take the “calculus track.”  And even those who do want or need to take calculus (which really is a lot of fun) would benefit greatly from some greater exposure to statistics.

While discussing the issue with my girlfriend, she suggested that Trigonometry is a longer course than it needs to be.  Our schools devote an entire year to it.  She suggested as an alternative that students should take a semester of Statistics and a semester of Trigonometry.  I would amend that slightly to say that every student should, after completing Algebra 2, take a semester of Statistics.  Then, those students who wish to move on to Calculus should take Trigonometry for the second semester of that year, while the others should continue with a course devoted to just exploring the big picture of mathematics.  In such a course, they needn’t develop the skills to do advanced mathematics, but would be exposed to both the history of mathematics and the beauty underlying some of the current areas of mathematical research.  This leaves even the non-scientist and non-mathematician with an education in mathematics comparable to the education in the arts that non-artists receive.

Because the educational opportunities in mathematics are narrowly tailored to a particular type of student with particular needs and expectations, mathematical education has failed to reach the rest of the students who might very well have been more interested in mathematics if they knew there were other disciplines.  Yes, you do need calculus to actually do a lot of those other types of math, but at least students would feel less ignorant and less proud of that ignorance if they knew what work was being done.

3) Many people had bad teachers.  This is not to put teachers down by any stretch.  Unfortunately, though, the bad teachers who are currently working can do more harm in mathematics than in many other fields, because our culture has already begun to turn students off to the subject.  When such a student encounters a bad teacher, they often just give up instead of pushing through and waiting for a good teacher, as they might in a subject they’re more forgiving of because they already have a passion for it.

What do I mean by bad teachers in this context?  I mean those who, instead of working to help their students understand, make their students feel inadequate for not understanding.  While I do think there is shame in life-long mathematical illiteracy, there is never shame in ignorance when the ignorant person has a legitimate desire to learn.  I use the word “ignorance” here in its true sense, referring to a lack of knowledge, not to belittle the people in question.  All of us are ignorant of something, so there’s no shame in ignorance if we are willing to correct those deficiencies.

I have worked with students who have had teachers bluntly call them “stupid” for not understanding something.  That’s enough to make anyone not go back to their lectures.  So those bad teachers truly are to blame for at least some of the mathematical illiteracy in the world.

4) Teacher selection and training is inadequate.  Separate from the bad teachers mentioned above, there are many who are simply unqualified to teach mathematics.  Teaching mathematics requires knowledge of two fields: mathematics, and education.  That’s true of any field.  You need to know your subject, and you also need to know how to teach it.  Many teachers at the high school level and below did not study mathematics.  They studied education.  What’s more--many of them were the very same students who disliked and feared mathematics a generation earlier.  They get through their education degree with as little math as possible, and end up teaching elementary mathematics simply because that’s where there was a teaching job available.  How can anyone teach a subject about which they are not passionate and knowledgeable?  And yet, that is exactly what is expected of far too many teachers.

On the other extreme, there are those few who did study mathematics, but many of them don’t understand how to teach mathematics.  They never struggled with mathematics (because most people with math degrees never had a hard time with the subject), and so they are ill-equipped to understand the struggles their students have with the subject.  We need a community of mathematics teachers who understand both mathematics and education.  There are some of them out there, but we need a concentrated effort to create more of them.

5) Mathematics is cumulative but math education is age-based.  Mathematics, as much as any and more than most fields, builds upon itself.  You cannot master fractions until you master multiplication and division; you cannot master algebra until you master fractions; you cannot master logarithms until you understand exponents; you cannot master calculus until you master algebra.  Each new course in mathematics assumes the previous courses as prerequisite knowledge.  There’s nothing wrong with that--indeed, that’s the way it must be (though I have some crazy ideas about tinkering with the order things are taught, this is not the essay to go into that).   So where’s the problem?

The problem is that, with the rare exceptions of students who are exceptionally good or bad, most students will progress through their education based on age rather than ability.  Let us imagine a typical young student.  In his first math course, he gets 90% on his final exam.  He’s labeled an “A” student and passed along to the next course.  But there’s still 10% of the information he’s missing.  That might not seem like much, but that next course will build upon everything he’s previously learned.  He struggles to catch up with that extra 10%, and in this course, he gets 80% on his final because he lost time catching up.  In the next course, he gets 70%.  Then 60%.  Sooner or later, he convinces himself that he’s just bad at mathematics and doesn’t like it.  But that’s not really the case.  If education were based on mastery instead of on pushing children through according to their age, he would have eventually gotten that initial 10% he missed, and the entire problem would have been averted.  Quality teachers would be free to deviate from the standard curriculum and teach students at their own pace, passing them along to the next course when they demonstrate mastery rather than at the end of each year.
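The compounding effect in this hypothetical can be sketched as a toy model (the numbers are illustrative, not real data): suppose each course’s attainable score is capped by how much of the prerequisite material the student actually mastered.

```python
# Toy model of compounding knowledge gaps (hypothetical numbers only).
# Assumption: a student absorbs 90% of whatever material is accessible
# to him, and material is accessible only in proportion to prior mastery.

def age_based(per_course_rate, n_courses):
    """Mastery per course when gaps are never filled: each gap compounds."""
    mastery = 1.0
    history = []
    for _ in range(n_courses):
        mastery *= per_course_rate  # the old gap limits the new course
        history.append(round(mastery * 100, 1))
    return history

def mastery_based(per_course_rate, n_courses, passing=0.95):
    """Mastery when the student repeats material until reaching `passing`
    before being passed along, so no gap carries forward."""
    return [round(passing * 100, 1)] * n_courses

print(age_based(0.9, 4))      # scores decay course after course
print(mastery_based(0.9, 4))  # scores hold steady
```

The age-based track produces exactly the slide described above (90, 81, 72.9, 65.6, …), even though the student’s rate of learning never changed; only the unfilled gap did.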

This requires a complete rethinking of how our educational system is structured, so I hold out limited hope for the immediate future.  However, it remains true that these cumulative failures are part of what convinces people that they are bad at mathematics.  I would contend that, putting aside those who have legitimate mental disabilities, no one is just “bad at math.”  Some will take to it more quickly than others for a variety of mysterious reasons psychologists might struggle to understand, but anyone can learn it.  It is only through failures in education that people come to believe that math is just not for them.

6) Informal education is limited.  We can bemoan the rates of illiteracy and scientific illiteracy right along with that of mathematical illiteracy.  However, I think the problem is worse in mathematics, and part of the reason is that there are not many opportunities to gain informal education in mathematics.  Just as with any field, you could actively seek out that education at a library or bookstore (though even then, you’ll struggle to find books that simultaneously provide a depth of understanding and an ease of comprehension befitting the autodidactic student).

Public television (and some network television) is full of programs for children trying to help them understand science or teach them to read.  LeVar Burton recently raised millions of dollars to bring back Reading Rainbow.  For adults, Neil deGrasse Tyson’s reboot of Carl Sagan’s Cosmos was a great television series teaching about science.  Where are the equivalents for mathematics?  The closest I have been able to find are the little segments about counting in Sesame Street.  While admirable, that just isn’t enough.

People often use the word “infotainment” as a derogatory phrase.  It’s meant to convey displeasure at the state of media, often news media, placing more emphasis on keeping people entertained than keeping them informed.  However, there is no reason education should not be entertaining.  We desperately need more high-quality entertaining books, websites, television programs, etc., capable of keeping audiences engaged long enough that they will learn some good information while reading or watching.  Though we need more of them in all disciplines, in mathematics, these programs are virtually nonexistent.

7) There is a fundamental disconnect between the student and the mathematician.  A video on the YouTube channel Numberphile compares this disconnect to art in this way: the way we teach mathematics is akin to teaching someone how to paint a fence and calling it art education.  I hadn’t thought of it in those terms before, but of course that’s absolutely correct!  The average mathematics student learns a series of operations to perform in order to solve the type of math problem that shows up in textbooks.  While they’re doing so (even more so if they’re engaged and paying attention), they’re developing the analytical tools I mentioned earlier, but on the surface, they’re solving “cookie-cutter” problems which bear little resemblance to the real-world applications of what they’re learning.  Never do they gain an understanding of the history of mathematics as art students would learn about the great masters.  This leaves students not only exhausted by the work they’ve been frustrated about (for the reasons I’ve discussed), but absolutely unaware of the kind of work that real mathematicians are doing.  Students should be taught mathematics in a way that gets them excited about mathematics, rather than in a way that leaves them afraid of it.

8) Cultural stereotypes limit student performance.  As I mentioned earlier, someone who feels doomed to fail at mathematics will probably do so, for purely psychological reasons that have nothing whatever to do with their actual ability to perform mathematical operations or to reason mathematically.  There is an added cultural stereotype which I think adds to this problem, and which is worth pointing out.  Namely, it seems to be a popular belief in the United States that natural ability determines success in mathematics; that some people “naturally” are better analytical thinkers while others are better at, say, emotions.  Psychologists have not fully unlocked the secrets of why some people perform better than others.  While there may be some genetic elements, it is almost certain that environmental influences and hard work make a greater difference.  In many Asian cultures which routinely outperform the United States in tests of students’ mathematical abilities, the success of those students is attributed to hard work and discipline.  In the United States, success is often attributed to “talent.”

I’m not going to sit here and tell you there is no such thing as natural talent.  What I will say is that this cultural view that only a relative few who are “mathematically gifted” can succeed in mathematics is false.  I firmly believe that anyone, with hard work, can become an expert mathematician if they so desire.  It means putting aside our culture of instant gratification and putting in the time it takes to master these skills, but it is achievable for everyone except those with the severest of mental disabilities.

So why do I care?  Why do I take the time to write an essay of this length to tell people a) that mathematics is important and b) that mathematics is within the reach of everyone?  Well, there are a number of reasons.  Of course, the reasons I mentioned above hold true, that mathematics is important in day-to-day life for more people than seem willing to admit it.  But that may be viewed as personal for those people.  However, when mathematical illiteracy affects the way juries rule, the way the news is reported, and the way people vote, it affects all of us.  I don’t think I hyperbolize when I suggest that our survival depends on greater levels of mathematical literacy.

But there’s also a more personal reason.  I find mathematics beautiful, interesting, and fun.  It’s the language of the universe.  With mathematics, we can understand our world, we can understand each other.  We can unlock the secrets of nature if only we speak their language, and that language is mathematics.  Speaking of science, which is also a passion of mine, Carl Sagan said “When you’re in love, you want to tell the world.”

Tuesday, July 8, 2014

In Defense of Replication Studies

There’s been a recent fluttering of activity on the Internet about a paper written by Harvard social psychologist Jason Mitchell, the full text of which can be read online.  The crux of the issue seems to be that Dr. Mitchell apparently sees little value in replication studies or in the publication of negative results, a noted and alarming inversion of the current trend among reputable scientists to decry the lack of those very types of publications in most major journals, for reasons I will discuss briefly (though by no means completely) in this response.

Dr. Mitchell received his B.A. and M.S. from Yale and his Ph.D. from Harvard, and is now a professor of psychology at Harvard, where he is the principal investigator at the University’s Social Cognitive and Affective Neuroscience Lab.  I say this to point out that Dr. Mitchell’s credentials appear impeccable, at least on paper.  He’s a professor at one of the world’s most prestigious universities (though the merit of such prestige in education is often called into question, that is a discussion for another day), and appears to have a consistent flow of publications in the scientific literature, much of which, though I am completely unfamiliar with his work beyond this single paper in question, appears to be of significant interest.  Having established those credentials, the duty now falls upon my shoulders to convince you that despite an apparently productive career in social science, Dr. Mitchell appears never to have received even the most rudimentary education on the basics of the scientific method, either through oversight on the parts of his instructors or, more likely, inattention on Dr. Mitchell’s part during those key lectures.

It is strongly recommended that you either read Dr. Mitchell’s paper, “On the emptiness of failed replications,” in its entirety before returning to this document or read it alongside this discussion so that his argument can be made to you in his own words.  I would not wish to be accused of misrepresenting his argument.  Nevertheless, I will proceed through the article point-by-point, providing significant commentary along the way and quoting the source material, though sparingly, so as to provide direct refutations.

Dr. Mitchell’s article begins with a bullet-pointed listing of six postulates, each one of which is dead wrong.  I will attempt my exploration of the faults in Dr. Mitchell’s paper by examining each of these points in turn.  The bulk of the paper is simply Dr. Mitchell’s supporting arguments and evidence (such as they are) for these six points.  As such, the bulk of the paper, though not often directly quoted here, will be addressed under the headings of the six claims.

1) “Recent hand-wringing over failed replications in social psychology is largely pointless, because unsuccessful experiments have no meaningful scientific value.”

Several years ago, I had a chance encounter on the Internet with a gentleman who was pursuing his doctorate in applied physics, specializing in acoustics.  We became acquainted through commentary on my girlfriend’s page on a social media website during a discussion of evolutionary science and creationist dogma, during which debate this gentleman revealed that, despite his scientific training, he was a young earth creationist and that, further, he believed physics supported his position.  Amongst his misunderstandings were claims that because the Sun is burning up, it should be getting smaller, and a belief that Einstein’s theory of special relativity suggests that as an object approaches the speed of light, it loses mass (when in reality, objects approaching light-speed approach infinite mass).  I mention this frustrating conversation because until now, it was the greatest misunderstanding of science I have ever heard from someone claiming any degree of professional training in the sciences.  Dr. Mitchell has the dubious honor of having surpassed that creationist’s achievement.  This creationist, at least, made a show of doing real science and claiming the evidence supported his arguments (however misguided those claims were).  Dr. Mitchell’s approach to science, if I dare call it an approach to science, appears to suggest that any study failing to confirm the experimenter’s hypothesis is useless.

For those of you who aren’t already either rolling off your chair in fits of uncontrollable laughter at Dr. Mitchell’s expense or banging your head against your desk in frustration for much the same reason, I will pause for a moment to explain the ludicrousness of Dr. Mitchell’s position (and offer the promise of further hilarity to follow).

To begin with, the “hand-wringing,” as Dr. Mitchell dismissively refers to a growing collective concern amongst scientists, is very well-deserved.  If you follow the scientific world, you may have heard of something called “publication bias.”  The idea is that journals tend to publish positive results of exciting experiments because those grab headlines and help sell the publication to professional readers.  There’s nothing particularly evil about this on its face, except when you realize that replication is a key part of the scientific process for reasons we’ll discuss in greater depth later on (but it basically comes down to being sure that a published result wasn’t just a phantom due to random chance or experimenter error), and that these replication studies (being the “un-sexy” sort of work that just sets out to question or to establish the credibility of previously published work) find extremely limited venues for their publication.  When they are published, and there is certainly no guarantee they will be, it is often in obscure journals that fail to reach even a sizeable fraction of the readership of the original paper.  The result of this, concern over which is dismissed by Dr. Mitchell as “pointless” and “hand-wringing,” is that erroneous papers which reach publication (yes, despite all the best efforts, erroneous information does get published, either due to oversight or, more rarely, deliberate misrepresentation of research) may wait a considerably long time before they are corrected--if, indeed, they are ever corrected.  This means there is a distinct possibility (nay: probability) that some indeterminate amount of the information accepted into the body of scientific knowledge is wrong.
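The arithmetic behind this worry is easy to demonstrate with a deliberately crude simulation (hypothetical numbers, no real data): imagine many labs each testing a hypothesis that happens to be false, journals printing only the “positives,” and a single round of replication clearing out most of the phantoms.

```python
import random

# Toy simulation of publication bias (illustrative only).
# 1000 labs each test a hypothesis that is actually false.  At a
# standard significance threshold of 0.05, roughly 5% of them will
# get a "positive" result by pure chance.

random.seed(0)  # fixed seed so the run is repeatable
ALPHA = 0.05

def run_study():
    """A study of a false hypothesis 'succeeds' with probability ALPHA."""
    return random.random() < ALPHA

# Journals publish only the chance positives...
published = [lab for lab in range(1000) if run_study()]
print(f"Chance 'positives' published: {len(published)}")

# ...but even one replication attempt per published result
# clears the overwhelming majority of them.
survive = [lab for lab in published if run_study()]
print(f"Still standing after one replication: {len(survive)}")
```

On the order of fifty phantom findings get published, and only a handful survive a single replication, which is precisely why dismissing replication as pointless is so dangerous.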

None of this is intended to cast doubt upon science as a method of knowing. Indeed, the scientific method, when properly applied, is specifically designed to avoid just this sort of situation.  The problem we currently face with the issue of publication bias in the sciences is not a problem with the science, but with the politics that have come to dominate within the halls of academia, and to which science unfortunately often takes a backseat in the minds of the administrators who perpetuate the problem.  This, however, is not intended to be a referendum on politics in academia, but a discussion of the flaws with Dr. Mitchell’s little paper, so I will refrain from heading down the rabbit hole (some might call it a black hole) of academic politics.

Even if replication studies were not of any importance, however--even if Dr. Mitchell’s apparent assumption that original research is always flawless were completely and undeniably true--there would still be much to find fault with in just this first bullet point.  He claims that “unsuccessful experiments have no meaningful scientific value.”  There is a bit of an ambiguity in that statement, and the Principle of Charity would compel me to address the best possible interpretation of his claim.  I will do so, though I will then explore the more troubling interpretation because I actually believe the more troubling interpretation to be the interpretation Dr. Mitchell originally intended.

The ambiguity has to do with the phrase “unsuccessful experiments.”  By that does Dr. Mitchell mean an experiment which has been compromised by error?  Or does he mean an experiment which yields negative results?

Let us examine the former.  If he does indeed mean to discuss experiments which have gone wrong, and yielded inaccurate information due to some experimental error (or even chance fluctuations), then he is arguably correct (though barely so) in suggesting that these experiments have no meaningful scientific value.  The problem, however, is that by conflating this statement with a condemnation of replication studies, he betrays an assumption that original research is always performed with greater accuracy than replication studies.  To be sure, this is sometimes the case.  I am by no means suggesting that a replication study is of greater merit than its predecessor.  What I am saying, and what I believe any competent scientist would say, is that when two studies show up with contradictory results, it indicates that at least one of them contains some kind of error.  It is then for the scientific community to conduct further examination (whether that is a closer reexamination of the data or a completely new experiment) in order to determine which.  Certainly it is of scientific value to determine which of two contradictory studies is invalid, even if that means we then determine that this particular study is completely invalid and without value.  Unless we assume the infallibility of original research, these negative replication studies do provide scientific value because they help us to determine which of the original studies need to be reexamined.  Furthermore, even completely failed experiments often lead scientists to explore new, previously unconsidered hypotheses, so there is indirect scientific value in that way as well.

I do not, however, suspect that this is what Dr. Mitchell intended to say.  Rather, it is my assumption, based on phrasing later in the article equating the term “scientific failure” with “an experiment [that] is expected to yield certain results, and yet… fails to do so,” that Dr. Mitchell means an “unsuccessful experiment” to refer to any experiment which fails to support the researcher’s hypothesis.  This is a much more troubling interpretation of his words, however, for two primary reasons.

The first, and arguably less important (though it is of particular importance to me personally as a student of not only the practice but the philosophy of science) problem with this statement is that it equates the negative result with a failure.  Yes, we all become attached to our pet hypotheses, but a negative experiment, if viewed through the proper lens of pure scientific inquiry, is not a failure.  It is a monumental success, for it has shown that the experimenter’s assumptions had been incorrect.  There is something else at work.  There is something new to learn.  Isaac Asimov famously said that “The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka’ but ‘That's funny....’”  What he meant by that is that true scientific discovery stems not from experiments that confirm what we already suspect to be true, but from those that show us there is something entirely unexpected, just waiting to be discovered.  Science would be a sorry practice indeed if we all just went around trying desperately to prove ourselves right without the slightest consideration that there might be more to the universe than we suspected.  And so it is the negative experimental result which often leads us down those unexpected but fruitful paths upon which the most profound discoveries are made.  Surely Dr. Mitchell is familiar with this philosophical approach to pure scientific inquiry, but his paper gives no indication of it.

Of greater significance is the fact that, putting philosophy aside, his statement is just plain wrong. Negative results are of great “meaningful scientific value.” Science is as much about figuring out what isn’t so as it is about figuring out what is. Indeed, the very essence of the scientific method, apparently taught more thoroughly to fifth-grade science fair competitors than to Harvard researchers, is the practice of formulating testable hypotheses and then attempting to falsify them in order to determine the likelihood of their accuracy. The hypothesis that is not falsified may be tentatively accepted as true (though subject, much to Dr. Mitchell’s apparent displeasure, to further testing and review), while the hypothesis that is falsified is discarded so the scientist may move on to more fruitful pastures. This is the most basic principle of scientific research, and to have to explain it in a paper in response to a credentialed professor at a prestigious center of learning is troublesome to say the least. Negative experimental results indicate falsified hypotheses. Yes, false negatives can occur, so it is worth replicating even negative results, but that certainly doesn’t mean they’re of no scientific value.

Perhaps Dr. Mitchell or someone of his opinion would counter by saying something along the lines of, “Well, that’s all very good, but it’s not important to publish the negative findings. Falsified results may direct a researcher away from a point of inquiry, but are of no value to the larger community in and of themselves.” Obviously this is not so. Knowing of work that has not been supported is valuable to the scientific community at large for precisely the same reason it is important to the individual researcher: it helps us to direct further research. Even putting aside scientific curiosity and a drive to understand the world as much as we possibly can, there is a very good economic reason to desire greater publication of negative results. Grant money is notoriously hard to come by. Even Dr. Mitchell makes a nod toward this fact when he writes, “Science is a tough place to make a living. Our experiments fail much of the time, and even the best scientists meet with a steady drum of rejections from journals, grant panels, and search committees.” This is all very true, and having it spelled out in Dr. Mitchell’s own essay saves me the trouble of having to make exactly the same point in opposition to his thesis. Science is, as Dr. Mitchell says, a tough business. It is very difficult to get grant money. The more involved the work, the more difficult it is to fund. This is Economics 101. So why, oh why, should we want to endlessly reinvent the wheel? Replication studies are essential to avoid both false positives and false negatives, but they are specifically designed as replications. Imagine if Scientist A falsifies his hypothesis after ten years of hard work and then, either by choice or because publications shy away from such things, his work is not published. Later, Scientist B stumbles upon a similar (or identical) hypothesis. She then applies for and receives a grant to look into it. 
She spends her six figures of grant money and ten years of her life, and finds an identical result.  Had Scientist A published his findings, she might never have made the investment.

Make no mistake, if Scientist B wishes to conduct the study as a replication study, she is well-advised to do so.  Replication is essential.  It’s very possible that Scientist A made some mistake in his original experiment, and Scientist B might be able to correct that mistake.  However, such replications become meaningless when negative results are never published.  This view that negative results are of no scientific value dooms generations of scientists to endlessly follow the same dead-end trails.  It slows scientific progress, costs millions of dollars of grant money which could be better spent elsewhere, and wastes the productive time of countless scientists.  Let’s not pretend we have an overabundance of qualified scientists, either.  Every man-hour is precious, especially in a world where so much of the general population is far more content to spend their lives watching television than working in a laboratory.

I will close this discussion of Dr. Mitchell’s first bullet-point (oh yes, we still have five more of his inane bullet-points, plus several points from the main body of the article to get through before we draw this discussion to an end) with a personal story.  Some years back, I was asked to participate as a judge for a local private school’s science fair, a duty I was happy to perform.  While wandering from presentation to presentation with my fellow judges, I noticed something of a trend amongst the entries.  Namely, most were very traditional (one might be tempted to say clichéd) science fair projects.  This is about what one would expect from a school limited to kindergarten through eighth grade, so I did not judge particularly harshly, but I did make a mental note that for many of the students, the science fair was about producing a flashy display.  There was a remote-controlled robot or two, several volcanoes, and many presentations along those lines.  The quality of display was occasionally impressive, but there was very little science actually being done.  Then I happened across one of the last entries of the day.  It was from a student whose family had recently immigrated from Mexico.  His English, though far more impressive than my Spanish would be given a similar amount of time to study, was extremely limited, and his family had very little money with which to purchase supplies, but he wanted to enter the science fair nonetheless.  Unable to afford flashy props, he did a simple experiment.  He filled basketballs to various levels of air pressure to determine which was the most bouncy.  He hypothesized that the fullest ball would be the bounciest.  To test this, he filled one ball to regulation pressure, overfilled one, and underfilled another.  He found, contrary to his hypothesis, that the medium-filled ball was actually the bounciest.
Granted, this was not a rigorously controlled scientific experiment that would be worthy of publication in even the most lenient of journals.  However, this student was the only one of the many entries to actually do real science.  He conducted a proper experiment, achieved a result that did not support his hypothesis, and wrote up his display (with his teacher’s help to get his English right) to tell us all about what he had found.  I do not recall the results of the science fair once all the judges’ scores were compiled, but he received my highest marks.  If he had taken Dr. Mitchell’s postulate that “failed” experiments are of no scientific value to heart, that would never have taken place.

2) “Because experiments can be undermined by a vast number of practical mistakes, the likeliest explanation for any failed replication will always be that the replicator bungled something along the way.  Unless direct replications are conducted by flawless experimenters, nothing interesting can be learned from them.”

Upon reading this statement, I withheld some hope that clarification would be forthcoming in the body of the text; clarification that might serve to negate the glaring oversight in Dr. Mitchell’s claim.  Indeed, further clarification was provided, but instead of negating his error, Dr. Mitchell doubled down on his mistake.

Lest I get ahead of myself as I explore this idea (albeit in much briefer terms than the previous point), allow me to bludgeon you, dear reader, with the obvious: Dr. Mitchell fails to account for the fact that the replicator may be a more skilled experimenter than the scientist who produced the original finding.

Dr. Mitchell is correct about one thing in this analysis.  It is clearly possible that the replicator might have “bungled something along the way.”  It happens.  As humans, we err.  This is undeniable and hardly worth pointing out.  Except, it seems that Dr. Mitchell struggles not only with the philosophical side of science, but also with the self-evident traits of humanity.  Certainly, this is a forgivable oversight, however.  He is, after all, only a scientist working in a discipline dedicated to understanding the traits of humanity.  But I digress.

The problem is that the statement can easily be reversed.  Let me give it a try: “Because experiments can be undermined by a vast number of practical mistakes, the likeliest explanation for any positive experimental result will always be that the researcher bungled something along the way.  Unless original research is conducted by flawless experimenters, nothing interesting can ever be learned from it.”  If that sounds to you like absolute garbage, you are absolutely correct.  Dr. Mitchell’s great failure is in assuming inerrancy on the part of original researchers and incompetence on the part of replicators.  In reality, replicators and original researchers are often the very same people.  As a reputable scientist, it should be part of every researcher’s job to do both original research and replication studies as the need arises for either.  There would be nothing wrong with specializing in one or the other, but putting aside the old bugaboo of academic politics, a well-balanced approach of doing some of both is probably the best way to advance not only the collective scientific knowledge but one’s personal knowledge of one’s own discipline.  Never mind all that, though.  Let’s assume for the moment that we have entered into a fantasy world where scientists are allowed to do either original research or replications but not both.  Is there some magical force that bestows competence disproportionately upon one rather than the other?  Of course not.  There will be incompetents and geniuses on both sides, and the average will always be average.

Dr. Mitchell is correct that experimental error is a problem that must be addressed in any replication study, and though he seems to forget that the same is true of original research, he is right to suggest that examining replications for experimental error is a worthwhile pursuit.

What Dr. Mitchell seems not to understand is that replication is not an argument that an experiment is somehow better the second time it is performed, or better when done in a different laboratory.  The point of replication is that, just as he argues that there can be mistakes in replication experiments, there can be mistakes or unknown factors in original research, too.  Replication is essential to determine the robustness of a finding.  If ten studies show a finding to be valid and a new study fails to replicate it, we still examine all eleven, though we do so with the working assumption that the fault likely lies in the new study.  However, if only two studies have been done, we must examine both very carefully to determine which is more likely correct.  There is the further possibility that all of the studies, even with their conflicting results, are valid, and that some small change in experimental conditions renders the studies different.  This could lead to entirely new discoveries.

I will illustrate with this example (note: these studies are fictitious and not based on any real data of any kind).  Let us imagine that Scientist X from the University of Timbuktu conducts an experiment and finds that when given 12-volt electric shocks, people perform better at chess than a control group.  Then, Scientist Y from the University of Nantucket conducts a replication trial.  The experiment is performed under exactly the same conditions, but Scientist Y finds no such effect.  What could be happening?  Scientist Z from the University of Neverland reads both papers.  He writes letters to both scientists to make sure the experiments were identical, and reexamines the raw data from both experiments to determine which of the studies was wrong, but he finds no experimental error on either side, no problems with data entry, certainly no fraud, and nothing at all to indicate which study was correct.  Can you solve this little problem?  Certainly it would seem that Dr. Mitchell would immediately assume that Scientist X is correct and Scientist Y has made some undetectable mistake.  However, perhaps the real solution is that they are both correct.  There is no flaw in the University of Timbuktu study, but it is incomplete.  It fails to account for the fact that, in Nantucket, they rather enjoy electric shocks due to some previously undiscovered environmental factor, so they are immune to the effects of the experimental manipulation in the study by Scientist X.  Of course it’s a stupid example, but I think it vividly illustrates the point that Dr. Mitchell, for all his laudable attempts to keep experimental error out of the literature, has ignored the possibility that replication studies can bring new insights in addition to oversight.

If an original study is superior to the replication study that finds different results, it should be very easy for the original researchers to defend their work.  They could point out the flaws in the replication, or they could conduct further research or call for independent research.  Any of these approaches could vindicate the original study and show the replication to be incorrect.  Instead of taking this proper approach, Dr. Mitchell suggests that we should ignore replication entirely because sometimes a replicator might get it wrong.  He forgets that in science, truth is determined not by who published first but by who has the best evidence.  All of his anticipated problems with replication are easily dismissed simply by providing the evidence that shows the original study correct.

3) “Three standard rejoinders to this critique are considered and rejected.  Despite claims to the contrary, failed replications do not provide meaningful information if they closely follow original methodology; they do not necessarily identify effects that may be too small or flimsy to be worth studying; and they cannot contribute to a cumulative understanding of scientific phenomena.”

More than the other five points, this one relies heavily on the body of the essay for its meaning.  The basic idea is that Dr. Mitchell is considering three responses to his critique.  While I’m sure that these responses are real ones, I question his selection because they were not the first three that came to my mind.  Could Dr. Mitchell be attempting to subtly erect a straw man?  At the very least, he seems not to be arguing against the best form of his opponents’ arguments.  Nevertheless, these three points are worth examining.

The first point is one which I must, unfortunately, quote in its entirety, so that you may fully appreciate the ineptitude of the argument:

There are three standard rejoinders to these points.  The first is to argue that because the replicator is closely copying the method set out in an earlier experiment, the original description must in some way be insufficient or otherwise defective.  After all, the argument goes, if someone cannot reproduce your results when following your recipe, something must be wrong with either the original method or in the findings it generated. 

This is a barren defense.  I have a particular cookbook that I love, and even though I follow the recipes as closely as I can, the food somehow never quite looks as good as it does in the photos. Does this mean that the recipes are deficient, perhaps even that the authors have misrepresented the quality of their food?  Or could it be that there is more to great cooking than simply following a recipe?  I do wish the authors would specify how many millimeters constitutes a “thinly” sliced onion, or the maximum torque allowed when “fluffing” rice, or even just the acceptable range in degrees Fahrenheit for “medium” heat.  They don’t, because they assume that I share tacit knowledge of certain culinary conventions and techniques; they also do not tell me that the onion needs to be peeled and that the chicken should be plucked free of feathers before browning.  If I do not possess this tacit know-how (perhaps because I am globally incompetent, or am relatively new to cooking, or even just new to cooking Middle Eastern food specifically), then naturally, my outcomes will differ from theirs.

Likewise, there is more to being a successful experimenter than merely following what’s printed in a method section.  Experimenters develop a sense, honed over many years, of how to use a method successfully.  Much of this knowledge is implicit.  Collecting meaningful neuroimaging data, for example, requires that participants remain near-motionless during scanning, and thus in my lab, we go through great lengths to encourage participants to keep still.  We whine about how we will have spent a lot of money for nothing if they move, we plead with them not to sneeze or cough or wiggle their foot while in the scanner, and we deliver frequent pep talks and reminders throughout the session.  These experimental events, and countless more like them, go unreported in our method section for the simple fact that they are part of the shared, tacit know-how of competent researchers in my field; we also fail to report that the experimenters wore clothes and refrained from smoking throughout the session.  Someone without full possession of such know-how (perhaps because he is globally incompetent, or new to science, or even just new to neuroimaging specifically) could well be expected to bungle one or more of these important, yet unstated, experimental details.  And because there are many more ways to do an experiment badly than to do one well, recipe-following will commonly result in failure to replicate.

Of course, the myriad problems with Dr. Mitchell’s analogy should not require great lengths to expose.

The first problem is the same one encountered above.  Dr. Mitchell assumes that all producers of original research are, as if by some divine right, more competent practitioners than producers of replication studies.  This is simply not so.  It should be clearly stated that cooking and science are two entirely different practices and that any analogy is bound to be imperfect (cooking is, after all, much more of an art than a science).  However, in the interest of proceeding along established terms, allow me to offer a better analogy.  Dr. Mitchell compared replication studies to his amateur attempts to reproduce recipes from his favorite cookbook.  I fancy myself a rather good cook, but I can sympathize--my food doesn’t always come out looking as good as the photo in the cookbook.  Do I think that this means the authors misrepresented their recipes?  No.  Dr. Mitchell is right to think not.  As an amateur, he is not expected to cook as well as the professionals who wrote his cookbook.  However, suppose Chef Gordon Ramsay or Chef Wolfgang Puck (or whoever your favorite chef might be) attempted to recreate the recipes, following them precisely and combining their detailed descriptions with the tacit culinary knowledge that Dr. Mitchell points out is generally understood but not explicitly stated.  If the food still came out significantly worse than the photograph would indicate, then I might begin to suspect that the cookbook has some flaw.  Dr. Mitchell assumes in his argument that he is the one trying to recreate the recipe.  The reality of replication is that it could just as easily be Chef Ramsay.

None of this is to say that science should be judged based on the fame or credentials of the scientist.  No, scientific questions must be determined based on the evidence.  But it is the height of both arrogance and short-sightedness to assume that anyone who would bother to replicate a study must be new to science and thus less worthy of attention than the author of the original paper.

Replication is essential precisely because (amongst other reasons), people who are new to a particular discipline conduct original research as well, and their mistakes could lead to erroneous papers.

However, there is another claim within this section worthy of attention.  This is the idea that some of the “real work” (to borrow a phrase from the magicians) is not explicitly published.  There is both truth and falsehood to this.  It is certainly true that the most mundane details of experimental practice are not explicitly stated in every paper.  However, if there is a practice which is not expected to be common knowledge, it should be explicitly stated.  Dr. Mitchell explains that subjects must remain near-motionless during neuroimaging scans, and alludes to techniques used in his lab to make sure this is the case.  That the subject needs to remain motionless needn’t be stated, because anyone doing such a scan will already know it.  However, the specific actions taken to ensure this motionless state should be noted, either in the paper reporting the original research or in a separate paper establishing experimental methodology, which can then be cited whenever that methodology is used.  I do not suspect this to be the case with the methods detailed in Dr. Mitchell’s footnote (in which he lists several such techniques never mentioned explicitly in the methods sections of his papers), but it is an ever-present possibility that an experimental result could be affected by conditions the experimenter finds unimportant.  If such notes make a paper too long for publication, they should be published elsewhere (perhaps on the same website that would be better used for methodological tips than mindless ramblings about how useless replication is), so that both potential replicators and the merely interested can fully understand the experimental procedure behind any experiment upon which they will base a scientific belief.  The techniques used to keep subjects still, while apparently innocent enough, can vary from laboratory to laboratory and should probably be noted somewhere so that no errors are made.  Similarly, though Dr. Mitchell’s cookbook probably doesn’t say so, I’m sure there is a publication somewhere that would gladly specify the important detail that a bird must be plucked of feathers prior to cooking.

The second argument is that a phenomenon which has a small effect size or is difficult to replicate might nonetheless be real.  True.  But how does one determine that?  Through further studies.  The studies should be replicated both with the same techniques and with new ones to tease out the reality of the situation.  No one has ever suggested that a failed replication necessarily means an unreal phenomenon in every case.  It means an attempt at replication has failed, nothing more and nothing less.  The implications of that failure are a subject both for discussion and for further experimental investigation.  Dr. Mitchell’s examples fall short because in the very same paragraph where he decries replication because it might have “killed” fields of inquiry we now know to be important, he makes reference to further study validating the original findings.  It would seem that Dr. Mitchell only objects to replication when it falsifies original research, and frankly, that’s just bad science.

It’s also worth noting that if the evidence for a claim is flimsy, it would be unwise to believe it.  That doesn’t mean it’s wrong, but the scientific method is based upon skeptical inquiry.  We should have been skeptical about the findings Dr. Mitchell uses as his examples because the evidence was flimsy in the early days.  It wasn’t until new methods were found to investigate these phenomena (as Dr. Mitchell points out) that the original studies were vindicated.  So the time to believe them is now that the evidence is in, not early on when they were little more than promising hypotheses.  And it is not our side that is trying to shut down inquiry.  It is Dr. Mitchell’s side (if indeed anyone beyond one lone misguided soul subscribes to his view) that would stifle inquiry by tacitly accepting original research without even considering its replicability.  Replicability is not the only factor that makes a theory robust, but it is certainly an important one.

The final counterargument that Dr. Mitchell attempts to address is, I think, one of the stronger points.  As I mentioned earlier when I explained publication bias, there is an asymmetry between positive and negative results, even in studies of the very same phenomenon.  Dr. Mitchell claims that science requires an asymmetry between positive and negative results, harking back to that old chestnut that absence of evidence is not evidence of absence.  He claims that no matter how many papers might be published claiming that swans are only white, it only took one study to prove that there can be black ones.  This is all very true, but a better analogy would be Sasquatch (or Bigfoot or Yeti, depending upon your region).  Would Dr. Mitchell seriously suggest that if one person publishes a photograph of a Sasquatch, we should immediately ignore any paper which argues to the contrary?  Certainly it is true that there could be such a being, but it, like everything else in science, should be treated with the same skepticism that is necessary for science to work.  We believe claims when the evidence is robust enough to outweigh the skeptical counterarguments.  No one is saying that we should believe scientific claims based entirely upon the number of papers suggesting one position or the other (although certainly that is an important factor to bear in mind when formulating opinions).  But it is certainly important to read those papers that show a published effect might not really exist.  If the evidence in one paper is stronger than the other, believe that one.  If the evidence in one is not clearly stronger than the other, we need a new experiment.  But we can’t possibly begin to consider any of this until replication has been attempted and has either succeeded or failed.

Dr. Mitchell then offers this nugget of wisdom: “After all, the argument goes, if an effect has been reported twice, but hundreds of other studies have failed to obtain it, isn’t it important to publicize that fact? No, it isn’t.”  Actually, that’s exactly the kind of information the scientific community needs.  We needn’t know the numbers of studies on one side or the other.  We need to know the quality of research on both sides, and we can only do that when all of that research is published.  It’s quite possible there could be two great positive studies and hundreds of other studies all of which were conducted by idiots or baboons.  It’s more likely that either two researchers made a mistake, or that there is some other factor causing the difference.  If the latter is the case, it’s important to have all of the information on the table, so we can attempt to isolate that other factor.
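The arithmetic behind this point is worth making concrete.  Here is a quick simulation (purely illustrative, like my earlier fictitious studies, and assuming a simple one-sample z-test with known variance; the sample sizes and lab count are arbitrary) of what happens when a nonexistent effect is tested by many independent labs at the conventional p < 0.05 threshold: a handful of “successes” appear by chance alone, and a literature that prints only those looks like a replicated finding.

```python
import random

random.seed(1)

ALPHA_Z = 1.96   # two-sided critical value for p < 0.05
N = 30           # subjects per study
LABS = 200       # independent labs testing a truly null effect

significant = 0
for _ in range(LABS):
    # Each lab samples N subjects from N(0, 1): the effect does not exist.
    sample = [random.gauss(0, 1) for _ in range(N)]
    # z-statistic for a one-sample test with known sigma = 1
    z = sum(sample) / N * N ** 0.5
    if abs(z) > ALPHA_Z:
        significant += 1

# Roughly 5% of the 200 labs (about ten) will "find" the effect by
# chance.  If only those positive studies reach print, and none of the
# ~190 negative ones do, the literature shows a robust effect that
# does not exist.
print(significant)
```

The exact count varies from run to run, but its expected value is simply the number of labs times the false positive rate, which is why the denominator (the unpublished failures) matters every bit as much as the numerator.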

4) “Replication efforts appear to reflect strong prior expectations that published findings are not reliable, and as such, do not constitute scientific output.”

Well, I didn’t realize that a scientist’s intentions were how we judged whether or not a paper constitutes scientific output.  I thought the validity of scientific claims was judged based on the strength of the evidence.  Silly me.

The basis of this argument is that, just as belief in a hypothesis can bias a researcher toward positive results, a replicator’s belief that a result is invalid can bias the replication toward negative results.  These biases are real.  And it is possible that many replicators are interested only in falsifying results that disagree with their preconceptions, though Dr. Mitchell seems to have an abnormally low view of scientists when he assumes that this is almost universally the case.  Indeed, the two main reasons to replicate a study are either to detect possible errors if one thinks the study was in error, or to offer further independent support if one thinks the original work was valid.  But the scientific process is specifically designed to minimize the impact of these biases.

Once again, I must allow Dr. Mitchell’s own words to condemn him: “But consider how the replication project inverts this procedure--instead of trying to locate the sources of experimental failure, the replicators and other skeptics are busy trying to locate the sources of experimental success.  It is hard to imagine how this makes any sense unless one has a strong prior expectation that the effect does not, in fact, obtain. When an experiment fails, one will work hard to figure out why if she has strong expectations that it should succeed.  When an experiment succeeds, one will work hard to figure out why to the extent that she has strong expectations that it should fail.  In other words, scientists try to explain their failures when they have prior expectations of observing a phenomenon, and try to explain away their successes when they have prior expectations of that phenomenon’s nonoccurrence.”

It is perfectly valid to explore the causes of either positive or negative results (I refuse to frame this in terms of experimental success or failure, for reasons detailed above).  The point of an experiment is to isolate cause and effect, so if there is another possible cause for an effect (whether that effect is a positive or a negative result), it is within the proper purview of the scientist to try to find it.  This is a good thing.  Dr. Mitchell seems to think that the point of science is to offer proof of one’s predetermined conclusions, but this is not the case at all.  While supporting a pet hypothesis or falsifying a rival hypothesis may be the initial motivation to embark upon a study, any reputable scientist places truth above personal preference and seeks the best explanation for a given phenomenon.

I am reminded of a story once told by Richard Dawkins (who is actually a proper scientist, in the real sense of the word). Dawkins writes: “I have previously told the story of a respected elder statesman of the Zoology Department at Oxford when I was an undergraduate. For years he had passionately believed, and taught, that the Golgi Apparatus (a microscopic feature of the interior of cells) was not real: an artifact, an illusion. Every Monday afternoon it was the custom for the whole department to listen to a research talk by a visiting lecturer. One Monday, the visitor was an American cell biologist who presented completely convincing evidence that the Golgi Apparatus was real. At the end of the lecture, the old man strode to the front of the hall, shook the American by the hand and said--with passion--"My dear fellow, I wish to thank you. I have been wrong these fifteen years." We clapped our hands red. No fundamentalist would ever say that. In practice, not all scientists would. But all scientists pay lip service to it as an ideal--unlike, say, politicians who would probably condemn it as flip-flopping. The memory of the incident I have described still brings a lump to my throat.”

Unfortunately, Dr. Mitchell has shown Professor Dawkins wrong on one small point.  Apparently not all scientists even bother to pay lip-service to the scientific ideal.  Real scientists have no interest in explaining away results they dislike, whether positive or negative.  They may be initially skeptical, and they certainly demand evidence, and they may even embark upon a replication study in order to further examine that evidence.  But once that evidence is in, if it conflicts with their views, they must admit they had been wrong.

5) “The field of social psychology can be improved, but not by the publication of negative findings.  Experimenters should be encouraged to restrict their "degrees of freedom," for example, by specifying designs in advance.”

Actually, putting aside a few phrases, Dr. Mitchell is to be commended for this small section of his essay.  For the reasons already discussed and for those I will discuss below, he’s dead wrong in his opposition to the publication of negative findings.  However, setting aside his insistence that this is not the way to improve the field of social psychology, the suggestions he does make are quite reasonable ones.  I won’t rehash everything he said in that section here, but it boils down to increased standards for published research.  On that point, we can all agree.

There is a phrase that bothers me a bit, though, and I want to address it: “All scientists are motivated to find positive results, and social psychologists are no exception.”  This is true, of course, but I think it points to a problem that Dr. Mitchell would have us completely ignore.  Scientists are motivated to find positive results partly because they like to confirm their pet hypotheses.  True.  However, this is small motivation indeed when one realizes that most people become scientists because they want to understand the world.  If that means rejecting a pet hypothesis, most scientists (as Richard Dawkins points out) at the very least pay lip service to the ideal.  For me, rejecting a pet hypothesis may be unpleasant for a day or two, but that sting soon gives way to something far more profound: the realization that I have eliminated a false belief and may now substitute a true one.  I think most scientists share that desire to follow the evidence wherever it leads.

So why, then, are scientists so motivated to find positive results?  Precisely because there is such a bias against publishing negative results.  In academia, if you don’t publish research, your career is doomed to be a short one.  But if you find negative results, you often find yourself with work that can’t find a venue in which to be published.  Never mind that this research might be the result of five years’ work involving dozens of collaborators and research assistants--if it’s negative, it doesn’t get published.  So of course there’s a bias toward finding positive results.  But it’s not necessarily a philosophical bias.  Indeed, there are plenty of scientists (I know--I’ve spoken to them) who actually like negative results because they show us there is more to be learned (“My dear fellow, I wish to thank you…”).  But if we’re trying to meet publication requirements for career advancement, negative results are politically (not scientifically) undesirable.

6) “Whether they mean to or not, authors and editors of failed replications are publicly impugning the scientific integrity of their colleagues.  Targets of failed replications are justifiably upset, particularly given the inadequate basis for replicators’ extraordinary claims.”

Whether he means to or not, I think Dr. Mitchell is revealing his true motivation for writing this article here.  He has conflated replication studies with accusations of deliberate misrepresentation of data!  A replication study, even if it is negative, does not impugn anything.  Nor is a replication study an attempted pissing contest between the replicator and the author of the original research.  Indeed, it is possible to perform a replication study while maintaining the greatest of respect for the original author or while having no opinion of him or her at all.  Failed replication does not, need not, and should not be considered an insult to the integrity of the original author unless there is very good reason to suspect deliberate fraud.

Let us imagine a failed replication has been published. What are some possible reasons for this eventuality?

a) The original research is valid, and the replicator made a mistake.
b) The original research is valid, and the replication study failed due to chance.
c) The original research is valid, and the replicator falsified his findings.
d) The original research is invalid, and the original author made a mistake.
e) The original research is invalid, and the original author falsified his findings.
f) The original research is invalid, and the original finding was due to chance.
g) The original research is valid but incomplete, and there are other factors at work.

In only one of those situations is the original author’s integrity challenged.  In only one other is his competence even slightly called into question.  It may be uncomfortable to have your work questioned, but that’s just part of science.  It shouldn’t be taken as an attack unless it is coupled with a direct accusation of impropriety.  Such accusations should never be made lightly; they should be taken seriously, and false accusations should be met with strict consequences.  Science is an honorable profession, and fraud is rare but intolerable.  False accusations of fraud are similarly rare and similarly intolerable.  This is not what replication is about, however.  Replication is simply about determining whether original findings hold up.

By convention, we consider a finding to be statistically significant at a p-level of less than 0.05.  That means that, whenever the effect under study does not actually exist, we accept a 5% chance of declaring a false positive due simply to statistical chance (not even considering experimental error).  So, all else being equal, a substantial fraction of what gets published could be wrong just based on accepted standards for publication--and given the bias toward publishing positive results, that fraction can easily exceed 5%.  We could restrict our p-levels to less than 0.01 if we wanted to, but that still leaves the door open to false positives.  Replication, if nothing else, is about minimizing those probabilities by re-running experiments to see if the same results happen again.  Even if we put aside all possibility of experimental error, misrepresentation, or incomplete understanding of contributory factors, we must replicate research in order to weed out statistical anomalies.  Restricting p-levels to prohibitively low probabilities won’t do, either, because the more restrictive our statistical tests, the more likely we are to reject findings that are actually real.  That’s just as bad.  So what do we do?  We replicate.
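The effect of replication on those probabilities can be sketched numerically.  The following simulation is an illustration under simplified assumptions of my own choosing (a one-sample z-test with known variance, thirty subjects per study, and an arbitrary true effect of 0.7 standard deviations): requiring one successful independent replication cuts the false positive rate from about 5% to a fraction of a percent, while letting the vast majority of real effects through.

```python
import random
import statistics

random.seed(0)

def experiment(effect, n=30, alpha_z=1.96):
    """One simulated study: n subjects drawn from N(effect, 1).
    Returns True ("significant") if a two-sided one-sample z-test
    rejects the null hypothesis of zero mean at the 5% level."""
    sample = [random.gauss(effect, 1) for _ in range(n)]
    z = statistics.mean(sample) * n ** 0.5  # sigma is known to be 1
    return abs(z) > alpha_z

def literature_rate(effect, trials=2000, replicate=False):
    """Fraction of trials that end up 'in the literature': a single
    significant study, or (if replicate=True) a significant study
    that is also confirmed by one independent replication."""
    hits = 0
    for _ in range(trials):
        if experiment(effect) and (not replicate or experiment(effect)):
            hits += 1
    return hits / trials

# Null effect: ~5% of lone studies are false positives; demanding a
# successful replication cuts that to roughly 0.05 * 0.05 = 0.25%.
print(literature_rate(0.0), literature_rate(0.0, replicate=True))

# Real effect: most studies detect it, and most of those also
# replicate, so genuine findings easily survive the extra hurdle.
print(literature_rate(0.7), literature_rate(0.7, replicate=True))
```

In this toy setup, the trade-off described above is also visible: tightening the p-level on a single study enough to match the replicated false positive rate would cost far more real findings than replication does, because replication compounds the evidence without raising the bar on any individual study.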

Dr. Mitchell himself points out, “On the occasions that our work does succeed, we expect others to criticize it mercilessly, in public and often in our presence.”  No doubt, it can be quite uncomfortable.  Science is hard work, and it’s a tough business.  If someone thinks you’re wrong, they have no problem saying so, and they expect the same of you.  That’s the way it should be.  There’s no ill will about it--it’s just a matter of subjecting all claims to the strictest of scrutiny.  Anyone who has ever so much as presented a poster understands the feeling of coming under fire.  Anyone who has defended a thesis knows it better than the rest.  When we think someone is wrong, we say so.  When we aren’t sure, we test it, and then we say what the results were.  There’s very little coddling or hand-holding in this field, and there needn’t be.  Scientists are adults, and as such should be able to take professional criticism for what it is and avoid taking it personally.  Replication studies are one more type of potential criticism (though they can also support the original research, as Dr. Mitchell regularly forgets).

He concludes his essay with the following line: “One senses either a profound naiveté or a chilling mean-spiritedness at work, neither of which will improve social psychology.”

It seems that exactly one senses such things at work here and that one is called Dr. Jason Mitchell.  The rest of the scientific community seems to understand that replication is not a mean-spirited personal attack, but just part of the job.  Dr. Mitchell’s complaints seem, though I admittedly speak only of a general impression and not from any sort of evidence here, to be the whiny complaints of someone whose pet theory has been called into question.  Instead of calling replicators (who, need I remind you, are just other scientists, just like anyone else, and most often also producers of their own original research) “mean-spirited,” the mature scientist realizes that replication is an essential component of the scientific process and that we neglect it at our peril.

This essay prompted science journalist Ben Lillie to take to Twitter with this comment: “Do you get points in social psychology for publicly declaring you have no idea how science works?”  I think that sums up the quality of Dr. Mitchell’s essay quite nicely, though I object to the association of Dr. Mitchell with the rest of the field of social psychology.  The social and behavioral sciences have struggled long and hard to achieve strict scientific standards.  Ill-informed tirades like Dr. Mitchell’s contribute to a popular misconception that these fields are not “true” sciences.  They are, and they deserve to be treated as such.  It is unfortunate that some of their practitioners seem to disagree, but let us not besmirch the image of entire fields based on the “contributions” of a few of their members who prefer not to follow the rules of science.

Throughout this response, harsh though I may have been (though I assure you, my commentary is no more biting than what is generally expected of any controversial statement among scientists), I have striven to avoid making any sort of personal attack or commentary about Dr. Mitchell.  I don’t know him personally, so it would be improper to do so.  I have attempted to restrict my commentary to his arguments themselves and to his apparent lack of understanding of the scientific process.  However, since he chose to close his article by calling scientists who conduct replication studies “naïve” and “mean-spirited,” I feel no guilt at closing my response by pointing out one additional quotation buried in Dr. Mitchell’s essay: “I was mainly educated in Catholic schools….”

Yeah, we can tell.  Which might explain why Dr. Mitchell prefers to treat social psychology as a religion rather than a science.