Table 1: Langmuir's Symptoms of Pathological Science

1. The maximum effect that is observed is produced by a causative agent of barely detectable intensity, and the magnitude of the effect is substantially independent of the intensity of the cause.
2. The effect is of a magnitude that remains close to the limit of detectability; or, many measurements are necessary because of the very low statistical significance of the results.
3. There are claims of great accuracy.
4. Fantastic theories contrary to experience are suggested.
5. Criticisms are met by ad hoc excuses thought up on the spur of the moment.
6. The ratio of supporters to critics rises up to somewhere near 50% and then falls gradually to oblivion.

3. THE LOGICAL STRUCTURE OF SCIENCE

3.1 Baconian Inductivism vs. Data Selection

As a basis for their discussion of how science actually works, Woodward and Goodstein examine critically the theories of the scientific method that are due to Francis Bacon ([1620] 1994) and Karl Popper (1972). Baconian inductivism prescribes that scientific investigation should begin with the careful recording of observations; and as far as possible, these observations should be uninfluenced by any theoretical preconceptions. When a sufficiently large body of such observations has been accumulated, the scientist uses the process of induction to generalize from these observations a hypothesis or theory that describes the systematic effects seen in the data.

In contrast, Woodward and Goodstein assert that "Historians, philosophers, and those scientists who care are virtually unanimous in rejecting Baconian inductivism as a general characterization of good scientific method." Woodward and Goodstein argue that it is impractical to record all one observes and that some selectivity is required. They make the following statement:

    But decisions about what is relevant inevitably will be influenced heavily by background assumptions, and these ... are often highly theoretical in character.
    The vocabulary we use to describe the results of measurements, and even the instruments we use to make the measurements, are highly dependent on theory. This point is sometimes expressed by saying that all observation in science is "theory-laden" and that a "theoretically neutral" language for recording observations is impossible.

I claim that in the context of computer simulation experiments, this statement is simply untrue. By using portable simulation software, we can achieve exact reproducibility of simulation experiments across computer platforms--that is, the same results can be obtained whether the simulation model is executed on a notebook computer with a 16-bit operating system or on a supercomputer with a 64-bit operating system. Moreover, the accumulation of relevant performance measures within the simulation model can be precisely specified in a way that is completely independent of any theory under investigation. Thus we can attain Feynman's ideal of "a kind of utter honesty," in which every simulation analyst has available the same information with which to evaluate the performance of proposed theoretical or methodological contributions to the field.

In my view, it is impossible to overstate the fundamental importance of this advantage of simulated experimentation; and we are deeply indebted to the developers and vendors of simulation software who have taken the trouble and expense to provide us with the tools needed to achieve the reproducibility that is an essential feature of all legitimate scientific studies.

According to Woodward and Goodstein, Baconian inductivism leads to the potentially erroneous and harmful conclusion that data selection and overinterpretation of data are forms of scientific misconduct, whereas a less restrictive view of how science actually works would lead to a different set of conclusions.
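As a concrete sketch of one mechanism behind such cross-platform reproducibility, consider the classical Park-Miller "minimal standard" Lehmer random-number generator. Because it can be computed in exact integer arithmetic, every conforming platform produces bit-for-bit identical streams from the same seed. (The generator and its standard check value are classical; their use here as an illustration, and the function name, are mine and not from the text.)

```python
# Park-Miller "minimal standard" Lehmer generator, computed entirely in
# exact integer arithmetic so that no floating-point rounding can make
# the stream differ from one platform to another.

M = 2**31 - 1   # modulus: the Mersenne prime 2^31 - 1
A = 16807       # multiplier

def lehmer_stream(seed, n):
    """Return the first n states of the generator started from seed."""
    x = seed
    out = []
    for _ in range(n):
        x = (A * x) % M   # exact integer arithmetic at every step
        out.append(x)
    return out

# The same seed yields the same stream on a 16-bit notebook or a
# 64-bit supercomputer, so a simulation driven by this generator is
# exactly reproducible.
print(lehmer_stream(1, 3))
```

The classical correctness check for this generator is that, starting from seed 1, the state after 10,000 steps is 1043618065; a portable implementation can verify this at startup before any experiment is run.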
In many prominent cases of pathological science, the root of the problem was data selection ("cooking") that may have been subconscious but was nonetheless grossly misleading. In addition to the case of Blondlot's nonexistent N rays, Langmuir and Hall (1989) and Broad and Wade (1982) detail several other noteworthy cases of such cooking and overinterpretation of experimental data in the fields of archaeology, astronomy, geology, parapsychology, physics, and psychology. I claim that whatever the theoretical deficiencies of Baconian inductivism may be, they have no bearing on the field of computer simulation; moreover, there are sound practical reasons for insisting that researchers in all fields avoid any selection or overinterpretation of data that has even the appearance of pathological science.

3.2 Validating vs. "Cooking" Simulation Models

Because simulationists work far more closely with the end users of their technology than do specialists in many other scientific disciplines, we are sometimes exposed to greater pressure from clients or sponsors to fudge or "cook" our models to yield anticipated or desired results. With the advent of powerful special- and general-purpose simulation environments that include extensive animation capabilities, such model-cooking is far easier for simulationists to carry out than it is for, say, atmospheric physicists.

In addition to intentional model-cooking, there is the danger of unintentional self-deception resulting from faulty output analysis. In many of the cases of self-deception documented in Langmuir and Hall (1989) and Broad and Wade (1982), the most notable common feature was the experimenter's attempt to detect visually an extremely faint signal in situations where auxiliary clues let the experimenter know, for each trial observation, whether or not the signal was supposed to be present.
For example, in the N-ray experiments described previously, Blondlot could see the scale measuring the current position of the thread coated with luminous paint. With each change in the thread's position, Blondlot knew whether he was supposed to see a brightening of the thread--and thus he was able to deceive himself into "seeing" effects that other experimenters could not reproduce.

In the context of simulation experiments, animation can be one of the primary visual means of self-deception. Equally dangerous is faulty output analysis based on visual inspection of correlograms, histograms, confidence intervals, etc., computed from an inadequate volume of simulation-generated data. With all of these simulation tools, there is the ever-present danger of seeing things that simply do not exist or of failing to see things that do exist.

To guard against cooking a simulation model or its outputs, simulationists should place much greater emphasis on meaningful, honest validation of their models as accurate representations of the corresponding target systems. To reemphasize the role of validation in the field of computer simulation, we need fundamental advances in both the practice and theory of model validation. So far as I know, the simulation literature contains very little documentation of real-world applications in which a simulation model was carefully validated. A comprehensive methodology for validating simulation models is detailed in Knepell and Arangno (1993) and Sargent (1996), but it is not clear that many practitioners and researchers have given due consideration to either the implementation or the extension of this methodology. I believe that we need to pay much greater attention to simulation model validation in teaching and research as well as in practical applications.
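The output-analysis pitfall mentioned above--a confidence interval computed as if autocorrelated simulation output were independent--can be made concrete with a small sketch. The AR(1) model, the batch count, and the seed below are illustrative assumptions of mine, not from the text; the batch-means technique itself is a standard remedy in simulation output analysis.

```python
import math
import random

def ar1(n, phi=0.9, seed=12345):
    """A stationary AR(1) process standing in for autocorrelated simulation output."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def half_width(values, z=1.96):
    """Half-width of a nominal 95% interval that treats the values as i.i.d."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return z * math.sqrt(var / n)

data = ar1(20000)

# Naive analysis: pretend the 20,000 correlated observations are independent.
naive_hw = half_width(data)

# Batch means: average non-overlapping batches so the batch means are
# approximately independent, then build the interval from those means.
b = 40                      # number of batches (an illustrative choice)
m = len(data) // b          # observations per batch
batch_means = [sum(data[i * m:(i + 1) * m]) / m for i in range(b)]
batched_hw = half_width(batch_means)

# With positive autocorrelation the naive half-width is deceptively narrow,
# inviting exactly the kind of self-deception discussed in the text.
print(f"naive: {naive_hw:.3f}  batch means: {batched_hw:.3f}")
```

An analyst eyeballing the naive interval would "see" a precision that the experiment does not actually possess; the batch-means interval, several times wider here, is the honest one.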