Sir:
Observer effects are rooted in the universal human tendency to interpret data in a manner consistent with one's expectations (1). This tendency is particularly likely to distort the results of a
scientific test when the underlying data are ambiguous and the scientist is exposed to domain-irrelevant information that engages emotions or desires (2). Despite impressions to the contrary,
forensic DNA analysts often must resolve ambiguities, particularly when interpreting difficult evidence samples such as those that contain mixtures of DNA from two or more individuals, degraded or
inhibited DNA, or limited quantities of DNA template. The full potential of forensic DNA testing can only be realized if observer effects are minimized. We met on December 1 and 2, 2007, in Washington,
D.C. to discuss the implications of observer effects in forensic DNA testing and ways to minimize them.
The interpretation of an evidentiary DNA profile should not be influenced by information about a suspect's DNA profile (3-6). Each item of evidence must be interpreted independently of other items of evidence or reference samples. Yet forensic analysts are commonly aware of submitted reference profiles when interpreting DNA test results, creating the opportunity for a confirmatory bias, despite the best intentions of the analyst. Furthermore, analysts are sometimes exposed to information about the suspects, such as their history or motives, eyewitness identifications, the presence or absence of a confession, and the like. Such information should have no bearing on how the results of a DNA test are interpreted, yet may compound an unintentional confirmatory bias. This bias can result in false inclusions under the not-uncommon conditions of ambiguity encountered in actual casework. It can also render currently used frequency statistics or likelihood ratios misleading.
These problems can be minimized by preventing analysts from knowing the profile of submitted references (i.e., known samples) when interpreting testing results from evidentiary (i.e., unknown or questioned) samples. The necessary filtering or masking of submitted reference profiles can be accomplished in several ways, perhaps most easily by sequencing the laboratory workflow such that evidentiary samples are interpreted, and the interpretation is fully documented, before reference samples are compared. A simple protocol would dictate a separation of tasks between a qualified individual familiar with case information (a case manager) and an analyst from whom domain-irrelevant information is masked.
Such a protocol would proceed in stages. First, the analyst interprets the results of testing on the evidentiary samples. Laboratory documentation should include an enumeration of alleles that would cause a person to be included or excluded as a possible contributor at this juncture.
After the results of the initial interpretation are documented, information about reference samples should be unmasked in a sequential manner. In cases where an individual is expected to be a contributor to a sample (e.g., the victim's DNA in a sexual assault sample), the analyst should next compare this reference sample to the evidence profile and evaluate the foreign donor profile in light of this unmasked information (and document again the alleles that would cause any other person to be included or excluded as a possible contributor). At this stage (before knowing the profiles of any suspects) the laboratory should also compute the frequency in appropriate populations of individuals who would be included as possible additional contributors. Only when these computations are recorded should the laboratory undertake the final step of determining whether the other submitted reference samples have the documented genotypes of potential contributors. Cold hits illustrate that it is feasible to interpret evidence samples without knowledge of the reference profile(s). In cases in which a suspect has been identified, a masked interpretation of the evidentiary profiles should have the same utility.
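For readers who think in code, the ordering constraint at the heart of sequential unmasking can be sketched as a small state machine. This is only an illustrative sketch: the class and method names, and the representation of profiles as flat sets of allele labels, are assumptions made for exposition, not part of the protocol or of any laboratory's actual software.

```python
class SequentialUnmasking:
    """Toy state machine enforcing the documented order of interpretation:
    evidence first, then expected contributors, then statistics, and only
    then comparison against suspect references."""

    STAGES = ["evidence", "expected_contributors", "statistics", "suspects"]

    def __init__(self, evidence_alleles):
        self.evidence_alleles = set(evidence_alleles)
        self.log = []          # running laboratory documentation
        self._stage = 0

    def _advance(self, expected_stage, label, detail):
        # Steps may only occur in the documented order; jumping ahead
        # (e.g., seeing a suspect profile early) raises an error.
        if self.STAGES[self._stage] != expected_stage:
            raise RuntimeError(
                f"cannot perform '{label}' at stage '{self.STAGES[self._stage]}'")
        self.log.append((label, detail))
        self._stage += 1

    def interpret_evidence(self, inclusion_alleles):
        # Stage 1: with all reference profiles masked, document the alleles
        # that would cause a person to be included as a possible contributor.
        self._advance("evidence", "initial interpretation", set(inclusion_alleles))

    def unmask_expected(self, reference_alleles):
        # Stage 2: unmask an expected contributor (e.g., the victim) and
        # re-document the alleles attributable to a foreign donor.
        foreign = self.log[0][1] - set(reference_alleles)
        self._advance("expected_contributors", "foreign donor alleles", foreign)

    def record_statistics(self, frequency_estimate):
        # Stage 3: record the population frequency of possible contributors
        # before any suspect profile has been seen.
        self._advance("statistics", "frequency estimate", frequency_estimate)

    def compare_suspect(self, suspect_alleles):
        # Stage 4: only now is a suspect reference compared against the
        # already-documented genotypes of potential contributors.
        if self.STAGES[self._stage] != "suspects":
            raise RuntimeError("statistics must be recorded before any suspect comparison")
        return set(suspect_alleles) <= self.log[1][1]
```

The point of the sketch is that `compare_suspect` is unreachable until the earlier interpretations have been documented, mirroring the masking sequence described above.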
We are not suggesting that forensic scientists be blind to information that might afford them the greatest opportunity to generate reliable information from evidentiary samples. For instance, the nature of the substrate associated with a sample may dictate that certain extraction procedures be used. The case manager should decide what to test and how to test it and could supervise testing through to the development of a DNA profile. However, a sequential unmasking procedure must be used to shield the analyst from task-irrelevant information when interpreting results in order to minimize observer effects. Such procedures can and should be adopted immediately by all forensic DNA testing laboratories.
Sequential unmasking is the most efficacious means of reducing the compromising influence of observer effects on the utility of forensic DNA evidence. We hope this letter will also initiate a dialogue about other safeguards that might be employed to combat observer effects in DNA testing and other areas of forensic science. In the long run, organizational changes may be required to ensure the integrity of the masking process and a reliable separation between DNA analysts and domain-irrelevant information. A properly designed information firewall, for example, could reduce the danger that case managers will inadvertently leak information to analysts, thereby undermining the masking procedure.
With advances in technology, DNA testing has increasingly been used to analyze marginal samples that are likely to produce ambiguous results, such as older samples, samples exposed to environmental insult, and limited samples resulting from incidental contact. Consequently, the need for measures to minimize the consequences of observer effects in forensic DNA testing is growing.
Dan E. Krane, Ph.D., Professor of Biological Sciences, Wright State University, Dayton, OH;
Simon Ford, Ph.D., President, Lexigen Science and Law Consultants, Inc., San Francisco, CA;
Jason R. Gilder, Ph.D., Senior Systems Engineer, Forensic Bioinformatics, Inc., Fairborn, OH;
Keith Inman, M. Crim., Senior Forensic Scientist, Forensic Analytical Sciences, Inc., Hayward, CA;
Allan Jamieson, Ph.D., Director, The Forensic Institute, Glasgow, UK;
Roger Koppl, Ph.D., Director, Institute for Forensic Science Administration, Fairleigh Dickinson University, Madison, NJ;
Irving L. Kornfield, Ph.D., Professor of Biology and Molecular Forensics, University of Maine, Orono, ME;
D. Michael Risinger, J.D., Professor of Law, Seton Hall University School of Law, South Orange, NJ;
Norah Rudin, Ph.D., Forensic DNA Consultant, Mountain View, CA;
Marc Scott Taylor, Laboratory Director, Technical Associates, Inc., Ventura, CA;
William C. Thompson, J.D., Ph.D., Professor and Chair, Department of Criminology, Law and Society, University of California, Irvine, CA
References
1. R. Rosenthal, Experimenter Effects in Behavioral Research (New York: Appleton-Century-Crofts, 1966).
2. D.M. Risinger, M.J. Saks, W.C. Thompson, R. Rosenthal. The Daubert/Kumho implications of observer effects in forensic science: Hidden problems of expectation and suggestion. California Law Review 2002;90(1):1-56.
3. E. S. Lander. DNA fingerprinting on trial. Nature. 1989;339:501-505.
4. National Research Council (U.S.), DNA Technology in Forensic Science (National Academy Press, Washington, D.C. 1992).
5. National Research Council (U.S.), The Evaluation of Forensic DNA Evidence (National Academy Press, Washington, D.C. 1996).
6. P.C. Giannelli, Confirmation Bias, Crim. Just. 60 (Fall 2007).
Sir,
I share Krane et al.’s concern about the potential danger of anti-suspect bias in forensic DNA analysis, and I agree that “interpretation of an evidentiary DNA profile should not be influenced by information about a suspect’s DNA profile.” However, judging by their letter, the incidence of such bias is unknown. It appears that we don’t know that there is a problem, and, if Krane et al.’s recommended policies were adopted, we wouldn’t know whether or not any good had been done. With some exceptions (e.g. http://www.cstl.nist.gov/div831/strbase/interlab/MIX05.htm) there has been little effort to systematically survey subjective decisions by forensic DNA analysts.
The authors are to be commended for the careful thought they have brought to this issue, but I hope that compelling theory doesn’t distract us from the need for empirical data.
Jeffrey D. Wells, Ph.D., Department of Biology, West Virginia University, Morgantown, WV.
We agree that there is a need for empirical research on the extent to which (and the circumstances under which) observer effects can influence the interpretation of DNA test results. We also think it would be foolish to assume, in the absence of such research, that observer effects are not a problem for DNA interpretation. Observer effects are a basic phenomenon of human psychology that has been observed in a broad variety of contexts (1,2). The tendency of human observers to interpret data in a manner consistent with their expectations and desires has been called "one of the most venerable ideas of traditional epistemology" as well as "one of the better demonstrated findings of twentieth-century psychology" (3). Empirical studies have confirmed that observer effects can influence latent print examinations (4,5,6,7,8), microscopic hair analysis (9), and forensic psychological assessment (10). To assume without evidence that forensic DNA analysts are somehow immune to this apparently universal human tendency requires an unwarranted leap of faith.
Observer effects are strongest when the data are ambiguous and when observers are influenced by strongly held expectations and motives (1,2,11,12,13). Both of these circumstances can arise during interpretation of DNA evidence. The potential for ambiguity in DNA test results has been widely noted (14,15,16,17,18), particularly in cases involving mixtures and limited quantities of DNA that may result in incomplete profiles. The authors of the NIST 2005 mixture study quoted prominent forensic scientist Peter Gill as saying, "If you show 10 colleagues a mixture, you will probably end up with 10 different answers" (19). Furthermore, DNA analysts often approach such data with strongly held expectations about what they will find.
Scientists in most fields use "blind" or "double-blind" procedures when relying on subjective judgment to interpret data (1,2). They do so because they recognize the importance of minimizing observer effects in scientific analyses. It is time for forensic scientists to join the rest of the scientific community in recognizing this problem and in taking obvious, common sense steps to deal with it, such as the sequential unmasking procedure that we have proposed (20).
Dan E. Krane, Ph.D., Professor of Biological Sciences, Wright State University, Dayton, OH;
Simon Ford, Ph.D., President, Lexigen Science and Law Consultants, Inc., San Francisco, CA;
Jason R. Gilder, Ph.D., Senior Systems Engineer, Forensic Bioinformatics, Inc., Fairborn, OH;
Keith Inman, M. Crim., Senior Forensic Scientist, Forensic Analytical Sciences, Inc., Hayward, CA;
Allan Jamieson, Ph.D., Director, The Forensic Institute, Glasgow, UK;
Roger Koppl, Ph.D., Director, Institute for Forensic Science Administration, Fairleigh Dickinson University, Madison, NJ;
Irving L. Kornfield, Ph.D., Professor of Biology and Molecular Forensics, University of Maine, Orono, ME;
D. Michael Risinger, J.D., Professor of Law, Seton Hall University School of Law, South Orange, NJ;
Norah Rudin, Ph.D., Forensic DNA Consultant, Mountain View, CA;
Marc Scott Taylor, Laboratory Director, Technical Associates, Inc., Ventura, CA;
William C. Thompson, J.D., Ph.D., Professor and Chair, Department of Criminology, Law and Society, University of California, Irvine, CA
References
1. Risinger DM, Saks MJ, Thompson WC, Rosenthal R. The Daubert/Kumho implications of observer effects in forensic science: Hidden problems of expectation and suggestion. California Law Review 2002;90(1):1-56.
2. Saks MJ, Risinger DM, Rosenthal R, Thompson WC. Context effects in forensic science. Science & Justice 2003;43(2):77-90.
3. Nisbett R, Ross L. Human Inference. Englewood Cliffs, N.J.: Prentice Hall, Inc. 1980, p. 67.
4. Dror IE, Peron A, Hind SL, Charlton D. When emotions get the better of us: The effect of contextual top-down processing on matching fingerprints. Applied Cognitive Psychology 2005;19(6):799-809.
5. Dror IE, Charlton D, Peron A. Contextual information renders experts vulnerable to making erroneous identifications. Forensic Science International 2006;156:74-78.
6. Dror IE, Charlton D. Why experts make errors. Journal of Forensic Identification 2006;56(4):600-616.
7. Dror IE, Rosenthal R. Meta-analytically quantifying the reliability and biasability of forensic experts. Journal of Forensic Sciences 2008;53(4):900-903.
8. Schiffer B, Champod C. The potential (negative) influence of observational biases at the analysis stage of fingermark individualization. Forensic Science International 2007;167:116-120.
9. Miller LS. Procedural bias in forensic examination of hair. Law and Human Behavior 1987;11(2):157-163.
10. Beckham JC, Annis LV, Gustafson DJ. Decision making and examiner bias in forensic expert recommendations for not guilty by reason of insanity. Law and Human Behavior 1989;13(1):79-87.
11. Plous S. The Psychology of Judgment and Decision Making. New York: McGraw-Hill, 1993.
12. Schneider DJ, Hastorf AH, Ellsworth PC. Person Perception (2nd ed.). Reading, MA: Addison-Wesley, 1979.
13. Gilovich T. How We Know What Isn't So: The Fallibility of Human Reason in Everyday Life. New York: The Free Press, 1991.
14. Thompson WC. Subjective interpretation, laboratory error and the value of forensic DNA evidence: Three case studies. Genetica 1995;96:153-168.
15. Thompson WC. A sociological perspective on the science of forensic DNA testing. UC Davis Law Review 1997;30(4):1113-1136.
16. Thompson WC. Accepting lower standards: The National Research Council's second report on forensic DNA evidence. Jurimetrics Journal 1997;37(4):405-424.
17. Thompson WC, Ford S, Doom T, Raymer M, Krane D. Evaluating forensic DNA evidence: Essential elements of a competent defense review: Part 1. The Champion 2003;27(3):16-25.
18. Thompson WC, Cole SA. Psychological aspects of forensic identification evidence. In: Costanzo M, Krauss D, Pezdek K, editors. Expert Psychological Testimony for the Courts. New York: Lawrence Erlbaum Associates, 2007.
19. Butler JM, Kline MC. NIST mixture interpretation interlaboratory study 2005 (MIX05). http://www.cstl.nist.gov/div831/strbase/interlab/MIX05.htm.
20. Krane DE, Ford S, Gilder JR, Inman K, Jamieson A, Koppl R, et al. Sequential unmasking: A means of minimizing observer effects in forensic DNA interpretation (letter). Journal of Forensic Sciences 2008;53(4):1006-1007.
Sir,
Let me say at the outset that I agree with the call by both Wells and Krane et al. for additional research into potential observer effects and/or bias in decision-making by forensic examiners. However, in my opinion the situation is not nearly as clear as Krane et al. suggest in their response to Dr. Wells. In that response, eight studies are cited in the statement "Empirical studies have confirmed that observer effects can influence [the results in various forensic disciplines]." My first impression upon reading this was that these studies showed that observer effects and/or bias were "proven" to be concerns in the listed disciplines.
But one reference, in particular, caught my eye. The 1984 Miller (1) study may be familiar to some as it has been mentioned at least twice before: once at the 2008 AAFS meeting in Washington, DC (2) and again at a recent "Expert Forensic Evidence" conference in Toronto, ON (3). Despite its provocative title, this study provides absolutely no data that could be construed even remotely as pertaining to qualified forensic document examiners. Rather, "Twelve college students, trained in the forensic examination of questioned documents, were utilized in the experiment" (p. 409). No details were provided by the author about the nature of the training given to the students, but I find it difficult to understand how the results of a study based entirely on college students can be extended to professional, qualified examiners in any meaningful manner. Upon seeing this reference in the list of citations, I decided to review all of the studies. My review showed that these studies provide limited, even ambiguous, data with respect to whether or not observer effects or bias are issues with qualified forensic examiners. Indeed, like the Miller study described above, three of the cited studies did not involve professional forensic examiners at all and instead used only students as test subjects. One study was a meta-analysis based upon two earlier studies. The remaining three studies provide rather conflicting data about the potential for bias/observer effects. I would encourage everyone to review these articles for themselves, but a short discussion of each is provided here to clarify my position on this matter.
Both of Miller's studies from 1984 and 1987 (4) used students and included no trained examiners. In the 1987 study the author stated "Fourteen students enrolled in advanced crime laboratory college courses were selectively trained in human hair identification techniques. The training consisted of 60 academic hours of lecture and 60 academic hours of laboratory experience under the instruction of court-qualified human hair experts. The 14 students met the basic requirements for expert testimony on human hair identification in courts of law. Each of the 14 examiners was independently advised to examine and compare human hair evidence in four criminal investigations" (p. 160). The author did not say if the students had successfully completed their courses; only that they were enrolled in such courses. The author's assertion that the students "met the basic requirements for expert testimony" is open to interpretation. As most readers know, meeting the basic requirements to be qualified as a court expert may not, in fact, make someone qualified to do the work. Beyond this, even if the college students were considered to be advanced trainees (e.g., novice examiners) rather than naive subjects, the extension of these results to fully qualified forensic examiners is dubious.
The Beckham et al. study (5) in 1989 was quite extensive in that it involved 180 mental health experts who were asked to assess NGRI (Not Guilty by Reason of Insanity) submissions. Beckham et al. commented in part that "in the current study, no statistically significant bias was detected between group..." (p. 86). In their discussion of this finding the authors further commented, "Even though such bias has been demonstrated in clinical psychology graduate students, practicing forensic evaluators may be more attuned to such detrimental possibilities and therefore actively strive to be as objective as they can" (p. 86). If anything, this particular study suggests that bias was less of a problem than the researchers had anticipated, though, of course, there is the issue of extension of the results to other types of forensic work. At any rate, it certainly does not support the belief that observer/bias effects are present in all types of forensic work.
The Dror et al. study (6) in 2005 involved "...27 university student volunteers, with a mean age of 23 (9 were males and 18 were females)." In their discussion of the results, the authors noted this limitation and commented, "Second, our findings need to be examined within the context of routine everyday work of fingerprint experts. The training, experience, and work procedures of fingerprint experts may play an interesting and crucial role in if and how top-down components play a role in fingerprint identification. On the one hand, fingerprint experts may be less susceptible to top-down interference, perhaps even immune, to such effects. Given their highly specialized skills, they may be able to focus solely on the bottom-up component and be data driven without the external influences that we have observed in the research reported here. On the other hand, and in contrast, fingerprint experts may be even more susceptible to such top-down components" (pp. 807–808). Overall, these comments suggest to me a rather inconclusive position; a very reasonable position, since the data from the study had nothing to do with qualified examiners.
Following the 2005 study, two studies (7,8) from 2006 by Dror et al. used qualified fingerprint experts (five and six, respectively). These studies are arguably the most intriguing to date insofar as they provide some support for the belief that observer effects may influence fingerprint examiners in at least some situations. At the same time, the nature of the influence/bias effect is not entirely clear. Aside from the issue of generalization of results from relatively small sample sizes (a point discussed by the original authors), the bias effect seems to be mostly unidirectional. That is, bias attempts may shift some conclusions toward exclusion, but they were not very successful in moving conclusions toward individualization. In their Journal of Forensic Identification article, the authors speculated about this result, saying, "It seems that the threshold to make a decision of exclusion is lower than that to make a decision of individualization. Indeed our data support this claim, as reflected by the fact that most of the conflicting decisions were past individualizations. We did, however, observe a case in which an exclusion decision was now judged to be an individualization. This relates to the decision-making model used by experts in the fingerprint domain" (p. 613). In the end, I would agree there is evidence to support the belief that observer effects can be a factor in these types of comparisons, but the precise nature, and the limits, of the influence are not clear from these studies.
Schiffer and Champod's study (9) in 2007 did not use fully trained examiners. Their test subjects were 48 forensic science students. How the education of these students might compare with that in the 1987 Miller study is unknown, but the domains were clearly different. With respect to possible bias effects, the authors wrote, "Contrary to our initial expectations for test II on the potential effects of stimuli inducing observational biases, no effect of availability of known print nor context information has been observed. This was true for all fingermarks used in the test. These results do, to a certain degree, contradict previous findings or hypotheses, for instance Risinger et al. (10) and their overview of studies on the detrimental effects of expectation on reasoning and perception" (p. 119). The authors pointed out limitations in their study. It is interesting that the findings did not support the idea that observer effects/bias are a problem with this type of examination. However, the key issue with this study is the same as with the other student-based efforts; namely, how well would these results extend to fully qualified forensic examiners?
Dror and Rosenthal's 2008 study (11) was a meta-analysis of the two 2006 studies (discussed above) and, as such, does not provide additional empirical support beyond those studies. The analysis was intended to clarify the strength of the results from the earlier studies, which were limited to a small number of subjects. The authors concluded, "The first two studies to examine these questions established that experts are far from being perfect. These studies demonstrated circumstances in which experts were both relatively unreliable and biasable, and in the analyses reported here we quantify these effects statistically and subject them to meta-analytic procedures. The data are based on forensic decision-making made by latent fingerprint experts, but because this forensic domain is the most widely used and well established, we can be confident that the problems exposed within this domain are also prevalent in other forensic domains" (p. 903). In my opinion, there is no reason to consider this particular type of work to be representative of other forensic domains simply because it is widespread and well established. Nonetheless, the idea that bias effects are present in at least some situations has support according to the authors. At the same time, they also commented, "The fact that fingerprint experts can be unreliable and biasable does not mean that they are not ordinarily reliable and unbiasable" (p. 903).
In general, the use of students as test subjects, whether they be completely naive students or novice examiners, is inappropriate if the intent is to learn about the behavior of fully qualified examiners. Once studies based on students are removed from consideration, the remaining works are not conclusive one way or the other. Indeed, the two studies by Dror et al. in 2006 and the 1989 Beckham et al. study provide conflicting information. The results suggest, on the one hand, that contextual information can selectively bias the results for fingerprint examiners while, on the other hand, mental health experts did not seem to be subject to a significant bias effect at all. There may be a number of reasons why the results of these studies diverged so much including, for example, the different domains under consideration or the very different experimental designs. But the bottom line is that evidence regarding the existence or impact of observer effects or bias cannot be considered conclusive or even consistent.
Therefore, I have to agree with Dr. Wells in his assessment of the situation when he wrote "the incidence of such bias is unknown." In their response to Dr. Wells, Krane et al. suggested that the issue of observer effects has been confirmed in several other forensic domains and, based on this, it is only logical that it will be present in DNA analyses as well. Yet in reality there is no clear empirical evidence to support the belief that this is a problem in the other disciplines, let alone for DNA interpretations. Perhaps there is a problem and perhaps there isn’t.
I personally believe there are grounds to warrant well-designed research studies aimed at gaining a better understanding of the situation. Indeed, I would support this based solely on the belief that "observer effects are a basic phenomenon of human psychology." As such, these issues may ultimately prove to be important factors in forensic decision-making. Or they may turn out to be relatively meaningless.
The implementation of any solution before the "problem" is fully understood is not a good idea. That approach may well result in new and unanticipated issues that end up being worse than the original concern. Let us do the research and understand the situation more fully before we begin "fixing" things that may or may not need to be fixed.
Brent Ostrum, Senior forensic document examiner, Canada Border Services Agency, Ottawa, ON.
References
1. Miller LS. Bias among forensic document examiners: a need for procedural change. J Police Sci Admin. 1984;12:407.
2. Risinger DM. The impact of confirmational bias and context effect on report writing in the forensic science laboratory. Proceedings of the 60th Annual Meeting of the American Academy of Forensic Sciences, February 18-23, 2008; Washington, DC. Colorado Springs, CO: American Academy of Forensic Sciences, 2008.
3. Saks MJ. Building an evidence-based report accounting for confirmation bias. Presented at the Expert Forensic Evidence in Criminal Proceedings: Avoiding Wrongful Convictions Conference, May 9, 2009; Toronto, ON. Toronto, ON: Osgoode Hall Law School of York University and the Centre for Forensic Science & Medicine at the University of Toronto, 2009.
4. Miller LS. Procedural bias in forensic examination of hair. Law and Human Behavior 1987;11(2):157-163.
5. Beckham JC, Annis LV, Gustafson DJ. Decision making and examiner bias in forensic expert recommendations for not guilty by reason of insanity. Law and Human Behavior 1989;13(1):79-87.
6. Dror IE, Peron A, Hind SL, Charlton D. When emotions get the better of us: The effect of contextual top-down processing on matching fingerprints. Applied Cognitive Psychology 2005;19(6):799-809.
7. Dror IE, Charlton D, Peron A. Contextual information renders experts vulnerable to making erroneous identifications. Forensic Science International 2006;156:74-78.
8. Dror IE, Charlton D. Why experts make errors. Journal of Forensic Identification 2006;56(4):600-616.
9. Schiffer B, Champod C. The potential (negative) influence of observational biases at the analysis stage of fingermark individualization. Forensic Science International 2007;167:116-120.
10. Risinger DM, Saks MJ, Thompson WC, Rosenthal R. The Daubert/Kumho implications of observer effects in forensic science: Hidden problems of expectation and suggestion. California Law Review 2002;90(1):1-56.
11. Dror IE, Rosenthal R. Meta-analytically quantifying the reliability and biasability of forensic experts. Journal of Forensic Sciences 2008;53(4):900-903.
We appreciate the careful attention Brent Ostrum (1) has given to our recent letters and his close examination of the studies we have cited in support of the hypothesis that observer effects are both real and important in forensic science. Ostrum calls for more research on observer effects, as do Wells (2) and the recent NAS report (3). We agree that more research is necessary and view these calls as a favorable development in forensic science. However, we disagree with the suggestion that it is somehow prudent to postpone implementing sequential-unmasking-type safeguards until we have accumulated more data on observer effects in forensic science. Before explaining why we consider such a posture imprudent, we explain why we think the existing evidence is stronger than Ostrum apparently allows.
Ostrum states that Miller's 1984 study (4) "provides absolutely no data that could be construed even remotely as pertaining to qualified forensic document examiners." He bases this judgment on the fact that Miller describes his experimental subjects as "college students." Risinger (5), however, reports information obtained from Miller (6) that is relevant to this criticism. While Miller's 12 subjects were all part-time college students, four of them were "court-qualified document examiners working for police agencies" and the remaining eight "had completed training but had not yet testified in court." These 12 examiners were divided into two groups of six. Group I was given the unknown samples (three forged checks) and exemplars from one known source and were exposed to potentially biasing information before examining the evidence. Group II was given the same unknown samples and known samples from three sources and were shielded from any potentially biasing case information. "Four examiners in Group I, including one of the court-qualified examiners, incorrectly concluded that the suspect (as represented by the known exemplar) wrote the signatures on the three checks. The other court-qualified examiner declared the results of his examination to be 'inconclusive,' asserting that the known exemplars of the suspect's handwriting 'bore disguised handwriting characteristics.' The last examiner (a trainee) correctly eliminated the suspect. All six examiners in Group II correctly eliminated all three of their suspects." An inexperienced, though fully trained, examiner in Group I outperformed both court-qualified examiners.
Ostrum also states that Miller's 1987 study (5) "used students and included no trained examiners." He views Miller's statement that they "met the basic requirements for expert testimony" as "open to interpretation" and remarks that "meeting the basic requirements to be qualified as a court expert may not, in fact, make someone qualified to do the work." Ostrum views the "extension" of Miller's 1987 results to "fully qualified forensic examiners" as "dubious." Risinger has obtained from Miller (personal communication, 2009) further information regarding the subjects of his 1987 study and finds that Miller's subjects were again more qualified than Ostrum believes. The test subjects were, as before, students at East Tennessee State, but all had completed training in visual hair analysis under instructors who were qualified to give such instruction professionally, and whose trainees either currently worked in law enforcement or were regularly hired by law enforcement laboratories. While Professor Miller is unsure whether any of those subjects had testified at the time of the study, in the normal course of events they would have been testifying to their results in court within a short period of time.
Ostrum goes on to say that the 2006 studies by Dror and his co-authors (7, 8) "provide some support for the belief that observer effects may influence fingerprint examiners in at least some situations." He says, "the nature of the influence/bias effect," however, "is not entirely clear." Passing over the question of sample size, Ostrum comes to his main point in this regard: "the bias effect seems to be mostly uni-directional. That is, bias attempts may shift some conclusions towards exclusion, but they were not very successful in moving conclusions towards individualization." It is true that the Dror and Charlton study (7) demonstrated more switching from inclusion to exclusion than vice versa. Their subjects had a total of 24 opportunities to switch away from individualization and 24 opportunities to switch away from exclusion. They switched away from individualization 5 of 24 times, four times to exclusion and once to "cannot decide." They switched away from exclusion, however, only 1 of 24 times, that one being a switch to individualization.
This difference in effect size may suggest that fingerprint examiners tend to set their decision thresholds higher for individualization than for exclusion. Even if they do, four considerations suggest that we should give more weight to the one switch to inclusion than Ostrum seems to have done.
First, in many forensic contexts, biasing stimuli often point in the direction of individualization. Evidence is more likely to be submitted when it is thought to be incriminating rather than exculpatory, thus tending to create a bias toward individualization even when other domain-irrelevant information is absent. In a detailed study of four different crime laboratories (8), on average, more than 90% of reported results connected a suspect to a crime scene or to the victim. This high rate of inculpation may reflect the fact that each piece of evidence submitted in connection with a suspect has, a priori, a greater likelihood of being inculpatory. Thus, the false individualization rate in practice may be higher than the results of Dror and Charlton (7) seem to suggest.
Second, false exclusions are also undesirable errors in that they may let the guilty go free. Even if observer effects were somehow shown to produce only false exclusions, it would be appropriate to adopt sequential unmasking to minimize such errors.
Third, Wertheim et al. (9) recently found that, at least in some situations, bias may lead mostly to inconclusive results. While this may seem neutral and harmless on its face, "inconclusives" have the potential to mutate to "can't excludes" in court testimony directed by a persistent prosecutor.
Finally, even if we accept that the size of the effect observed by Dror and Charlton (7) is representative of the size of the effect in practice, it implies a false-positive rate of at least 1 in 48 (roughly 2%), which is surely high enough to warrant a preventive response.
We also give more weight than Ostrum does to the other studies we have cited regarding the question of whether training and experience reduce one's susceptibility to observer effects. Ostrum seems to regard training and experience as curative when he says, "the use of students as test subjects, whether they be completely naïve students or novice examiners, is inappropriate if the intent is to learn about the behavior of fully qualified examiners." Even if training and experience were curative, that would raise the alarming prospect that each examiner has an initial period of practice in which he or she may be providing unreliable analysis and testimony. Such a possibility would itself be an argument in support of sequential unmasking.
Because susceptibility to observer effects is a human universal, it is implausible to suggest that experience would somehow eliminate all susceptibility to it. Little evidence exists that de-biasing strategies work well in forensic science practice. Even if the effect size were small, it is better to minimize it.
The notion that experience as a forensic scientist could cause one to transcend observer effects is akin to the notion that experience as a pilot or painter could cause one to transcend color blindness. In both cases the infirmity is built into the human architecture and cannot be willed away. The main difference is that observer effects are universal, while color blindness afflicts only a fraction of the population. We saw earlier that a trainee outperformed experienced examiners in identification (4). Statistical analysis of the various CTS handwriting proficiency tests by the CTS itself has never shown accuracy to be a function of years in practice. The Miller study could even be read to suggest that field experience may make one more susceptible to observer effects. Several mechanisms suggest themselves. First, experience produces routine, and routine reduces one's alertness to possible errors (10). In other words, one grows complacent, and perhaps more susceptible to subtle context cues. Second, interacting primarily with law enforcement can limit the type of information to which a forensic scientist is exposed. As the recent NAS (3) study notes, "Forensic scientists who sit administratively in law enforcement agencies or prosecutors' offices, or who are hired by those units, are subject to a general risk of bias". Finally, experience may reduce or eliminate the nervous novice tendency to self-doubt, thus tending to increase overconfidence among experienced examiners. Just as most drivers report their skills are better than average (11), most experienced forensic scientists may consider themselves less likely than average to commit an error. While these considerations are speculative, they shift the burden of proof onto those claiming that experience reduces susceptibility to observer effects.
We said earlier that we do not accept Ostrum's suggestion that it is somehow prudent to postpone sequential unmasking until we have accumulated more data (how much is unclear) on observer effects in forensic science in general and DNA profiling in particular. Ostrum remarks that sequential unmasking "may well result in new and unanticipated issues that end up being worse than the original concern." This worry is vague, particularly given that sequential unmasking has already been instituted in some crime laboratories. Indeed, sequential unmasking is nothing more than the application of common practice in other sciences to forensic science.
It would be wrong to delay a response to the possibility of observer effects until they have been "proven" to exist for experienced, fully trained examiners. First, the ubiquity of observer effects is "one of the better demonstrated findings of twentieth-century psychology" (12). Second, virtually no cost is associated with the implementation of sequential unmasking. Third, if a verdict of guilty requires proof beyond a reasonable doubt, then those who deny the importance of observer effects should bear the burden of proving their absence. Ostrum concedes all that is required when he states, "these issues may ultimately prove to be important factors in forensic decision-making." When in doubt, sequentially unmask.
Dan E. Krane, Ph.D., Professor of Biological Sciences, Wright State University, Dayton, OH;
Simon Ford, Ph.D., President, Lexigen Science and Law Consultants, Inc., San Francisco, CA;
Jason R. Gilder, Ph.D., Senior Systems Engineer, Forensic Bioinformatics, Inc., Fairborn, OH;
Keith Inman, M. Crim., Senior Forensic Scientist, Forensic Analytical Sciences, Inc., Hayward, CA;
Allan Jamieson, Ph.D., Director, The Forensic Institute, Glasgow, UK;
Roger Koppl, Ph.D., Director, Institute for Forensic Science Administration, Fairleigh Dickinson University, Madison, NJ;
Irving L. Kornfield, Ph.D., Professor of Biology and Molecular Forensics, University of Maine, Orono, ME;
D. Michael Risinger, J.D., Professor of Law, Seton Hall University School of Law, South Orange, NJ;
Norah Rudin, Ph.D., Forensic DNA Consultant, Mountain View, CA;
Marc Scott Taylor, Laboratory Director, Technical Associates, Inc., Ventura, CA;
William C. Thompson, J.D., Ph.D., Professor and Chair, Department of Criminology, Law and Society, University of California, Irvine, CA
References
1. Ostrum B. Commentary on: "Sequential unmasking: a means of minimizing observer effects in forensic DNA interpretation". J Forensic Sci 2009;54(6):1498-1499.
2. Wells JD. Commentary on: "Sequential unmasking: a means of minimizing observer effects in forensic DNA interpretation". J Forensic Sci 2009;54(2):500.
3. National Research Council. Strengthening Forensic Science in the United States: A Path Forward. Washington, DC: National Academies Press; 2009.
4. Miller LS. Bias among forensic document examiners: a need for procedural change. J Police Science and Administration 1984;12(407).
5. Miller LS. Procedural bias in forensic examination of hair. Law and Human Behavior 1987;11(2):157-163.
6. Risinger DM. Appendix to Goodbye to All That, or A Fool's Errand, by One of the Fools: How I Stopped Worrying About Court Responses to Handwriting Identification (and Forensic Science in General) and Learned to Love Misinterpretations of Kumho Tire v. Carmichael. Tulsa L Rev 2007;42(2):477-596.
7. Dror IE, Charlton D. Why experts make errors. Journal of Forensic Identification 2006;56(4):600-616.
8. Dror IE, Charlton D, Peron A. Contextual information renders experts vulnerable to making erroneous identifications. Forensic Science International 2006;156:74-78.
9. Wertheim K, Langenburg G, Moenssens A. Report of latent print examiner accuracy during comparison training exercises. J Forensic Identification 2006;56(1):55-127.
10. Perrow C. Normal Accidents: Living with High-Risk Technologies. Princeton University Press; 1984.
11. Guerin B. What do people think about the risks of driving? Implications for traffic safety interventions. J Applied Social Psychology 1994;24:994-1021.
12. Nisbett R, Ross L. Human inference. Englewood Cliffs, NJ: Prentice Hall; 1980.