Correlated random numbers
I have a problem that I had initially thought would be straightforward (but then, what is?). For a Monte Carlo-type simulation study, I want to be able to generate sets of pseudorandom numbers having correlations equal to (or differing only randomly from) a target correlation matrix that I specify up front, based on postulated relationships among variables. This is very easy to do using the classic method of Kaiser & Dickman (1962), as long as the target correlation matrix is positive definite (PD) (i.e., has all positive eigenvalues). If not, the algorithm (programmed in Matlab) returns complex numbers, which are not satisfactory for my purposes.

So, for a non-PD target correlation matrix, I decided to find the PD matrix that is "closest" to the target matrix in some sense. Somewhere in the past I had gotten the idea that, for a correlation matrix to be PD, all of the pairwise correlations must be internally consistent with respect to all of their partial correlations. So I wrote another function that iteratively and minimally adjusts all correlations until each is within the possible range predicted by all possible partial correlations. To my surprise, the resulting matrix is still not positive definite, which means that my idea about positive definiteness is wrong -- or at least that this kind of internal consistency is necessary but not sufficient.

So my question is: in what way should I be adjusting pairwise correlations so as to find the PD matrix that is "closest" to the target? After a reasonably thorough literature search and perusal of texts on linear algebra and related topics, I've failed to find any literature relevant to this problem. Any suggestions on how to proceed, or citations that I've missed? Thanks.

Rich Strauss

Dr Richard E Strauss
Biological Sciences
Texas Tech University
Lubbock TX 79409-3131
Email: [EMAIL PROTECTED]
Phone: 806-742-2719
Fax: 806-742-2963
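For concreteness, the generation step can be sketched as follows. This is a Python/NumPy rendering rather than the poster's Matlab, and the function name and interface are illustrative, not the original code:

```python
import numpy as np

def correlated_normals(R, n, rng=None):
    # Kaiser-Dickman-style generation: if R = F F', then the rows of Z F'
    # (with Z holding i.i.d. standard normals) have correlation matrix R
    # in expectation.  F is built from the eigendecomposition of the
    # symmetric target matrix R.
    rng = np.random.default_rng(rng)
    vals, vecs = np.linalg.eigh(R)
    if np.any(vals < 0):
        # This is exactly the failure described above: negative eigenvalues
        # would make sqrt(vals), and hence the generated data, complex.
        raise ValueError("target matrix is not positive (semi)definite")
    F = vecs * np.sqrt(vals)              # so that F @ F.T reproduces R
    Z = rng.standard_normal((n, len(vals)))
    return Z @ F.T
```

With a large sample, the sample correlations of the output differ only randomly from the target, which is the behavior asked for above.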
Re: Correlated random numbers
Thanks for the various comments I've gotten (most sent directly to me) on my problem with random sampling from correlation matrices. For those who've requested it, here's a little bit of background information.

I'm interested in a biological phenomenon known as morphological integration, and I work on skeletal development in vertebrates (mostly fishes). Animals become regionally compartmentalized during development, such that some suites of bones (those of the jaw, for example, or of the forelimb) become more tightly integrated with one another (i.e., more highly correlated in their sizes and shapes) than they are with bones in other suites, although all are correlated at some level. This can be modeled as a time-dependent correlation matrix, in which the correlations change with age or size, with increasing within-suite correlations and decreasing among-suite correlations. (Actually, we use covariances rather than correlations because the scaling is important, but the principles are the same.)

I'm interested in modeling this for several reasons. First, several different quantitative measures of morphological integration (indices) are in use in the literature, and I'm interested in their (largely unknown) distributional properties. Second, morphological integration relates to several other biological aspects of development, such as fluctuating bilateral asymmetry, allometric gradients, and metamorphosis, all of which can also be modeled with time-dependent covariances.

So, what I want to be able to do is to postulate a set of "target" correlation matrices, varying such things as the numbers of character (=variable) suites, the numbers of characters per suite, and the strengths of the within-suite and among-suite correlations, and for each of these to generate samples of potential "morphologies".
Although most such matrices will be similar to those observed for real organisms and thus very well behaved, I occasionally want to gradually push the envelope to extreme conditions, and that's when I bump into statistically incompatible or ill-conditioned sets of correlations. It seemed reasonable to me in such cases to step back to the "closest" correlation matrix that is internally consistent, which is where my problem arose.

Several people have suggested to me the following numerical solution: get the eigenvectors and eigenvalues, set the negative eigenvalues to zero (there's generally only one that's negative), proportionately adjust the others to maintain the same sum (total variance), and reconstruct the correlation matrix. I've tried it, and so far it seems to work very well in practice. However, Rich Ulrich has raised the spectre of "nearly invalid" results, and so what I plan to do is to begin with a well-conditioned correlation matrix and gradually change it until it becomes indefinite (is that the correct term?), and check whether the adjustment is consistent with the changes I made in the matrix leading up to the ill-conditioning.

So if anyone has any further thoughts on this, or if you're interested in the results, please let me know. And thanks again for the help I've gotten so far.

Rich Strauss

At 12:00 PM 11/17/99 -0500, you wrote: >On 16 Nov 1999 13:29:31 -0800, [EMAIL PROTECTED] (Rich Strauss) >([EMAIL PROTECTED]) wrote: > >> I have a problem that I had initially thought would be straightforward (but >> then, what is?). For a Monte Carlo-type simulation study, I want to be >> able to generate sets of pseudorandom numbers having correlations equal >> to (or differing only randomly from) a target correlation matrix that I >> specify up front, based on postulated relationships among variables.
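The numerical fix described in this message can be sketched like so (Python/NumPy for illustration; the final renormalization to a unit diagonal is an added step beyond what the message describes, since zeroing eigenvalues generally perturbs the diagonal slightly):

```python
import numpy as np

def adjust_to_psd_correlation(R, eps=0.0):
    # The suggested repair: clip negative eigenvalues to zero (or a small
    # eps to get strict positive definiteness), rescale the rest so their
    # sum -- the total variance -- is preserved, and rebuild the matrix.
    vals, vecs = np.linalg.eigh(R)
    vals = np.clip(vals, eps, None)          # zero out negative eigenvalues
    vals *= np.trace(R) / vals.sum()         # preserve the eigenvalue sum
    R2 = vecs @ np.diag(vals) @ vecs.T
    # Added step (assumption, not from the post): restore exact 1's on the
    # diagonal; this congruence transform preserves positive semidefiniteness.
    d = np.sqrt(np.diag(R2))
    return R2 / np.outer(d, d)
```

Applied to an internally inconsistent target (e.g., two strong positive correlations combined with a strong negative one), this returns a valid correlation matrix with no negative eigenvalues.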
This >> is very easy to do using the classic method of Kaiser & Dickman (1962), as >> long as the target correlation matrix is positive definite (PD) (ie, has >> all positive eigenvalues). If not, the algorithm (programmed in Matlab) >> returns complex numbers, which are not satisfactory for my purposes. >> >> So, for a non-PD target correlation matrix, I decided to find the PD matrix >> that is "closest" to the target matrix in some sense.... > >Slow down; stop; back up. > >You don't say what your Monte Carlo is for, and why you are putting in >a variety of correlations, but you don't seem to be taking this "bad >conditioning" seriously enough. -- Look at it this way: If you set >yourself up with a matrix that is the next-closest thing to an invalid >correlation matrix, you are going to get the next-closest thing to >invalid results -- In this case, it seems that you are planning to do >it without ever measuring or recording just how close you are to the >limit, because you are just (blindly) approximating some target.
Re: Disadvantage of Non-parametric vs. Parametric Test
At 12:04 PM 12/8/99 -0500, Rich Ulrich wrote: -- snip -- >Similarly, bootstrapping is a method of "robust variance estimation" >but it does not change the metric like a power transformation does, or >abandon the metric like a rank-order transformation does. If it were >proper terminology to say randomization is nonparametric, you would >probably want to say bootstrapping is nonparametric, too. (I think >some people have done so; but it is not widespread.)

In my fields of interest (ecology and evolutionary biology), it is becoming increasingly common to refer to two "kinds" of bootstrapping: nonparametric bootstrapping, in which replicate samples are drawn randomly with replacement from the original sample; and parametric bootstrapping, in which samples are drawn randomly from a (usually normal) distribution having the same mean and variance as the original sample. The former is bootstrapping in the traditional sense, of course, while the latter is a form of Monte Carlo simulation. Unfortunately, the new terminology seems to be spreading rapidly.

Rich Strauss
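The two usages distinguished here can be sketched as follows (a Python illustration; the function names are mine):

```python
import numpy as np

def nonparametric_boot(x, n_boot, rng=None):
    # Traditional bootstrap: resample with replacement from the data itself.
    rng = np.random.default_rng(rng)
    return rng.choice(x, size=(n_boot, len(x)), replace=True)

def parametric_boot(x, n_boot, rng=None):
    # What the newer usage calls a "parametric bootstrap": draw from a
    # normal distribution fitted to the sample's mean and variance --
    # really a Monte Carlo simulation from a fitted distribution.
    rng = np.random.default_rng(rng)
    return rng.normal(np.mean(x), np.std(x, ddof=1), size=(n_boot, len(x)))
```

The contrast is visible in the output: nonparametric replicates contain only values seen in the original sample, while parametric replicates generally do not.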
Re: ANOVA with proportions
At 12:52 PM 12/14/99 -0800, Dale Berger wrote: >Just a reminder that transformations can be used on proportions as a dv to reduce >the skew, important if some values approach 0 or 1. These include arcsine, >probit, and logit. Each needs special treatment when p=0 or p=1. Cohen and Cohen >(2nd ed. of Applied MR/C) has a section on transformations for proportions (pp. >265-270).

I'll just add the usual caveat that hasn't yet been mentioned in these responses about proportions: the transformations, the use of the binomial, and the comment about proportions just being means all assume that the data really are proportions, not ratios -- that is, that the denominator is fixed among all values, not variable. The problem is that many people use the terms interchangeably, talking about proportions or percentages when they're actually dealing with ratios.

Rich Strauss
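As an illustration of the transformations mentioned (generic forms, not the specific treatment in Cohen and Cohen), here are the arcsine transform and a logit with one conventional fix for p = 0 or p = 1:

```python
import numpy as np

def arcsine_transform(p):
    # Variance-stabilizing angular transform for true proportions.
    return 2.0 * np.arcsin(np.sqrt(p))

def empirical_logit(successes, n):
    # Logit with a 0.5 continuity correction in numerator and denominator,
    # one common convention that keeps p = 0 and p = 1 finite.
    return np.log((successes + 0.5) / (n - successes + 0.5))
```

Note that both assume a fixed denominator n, which is exactly the proportions-versus-ratios caveat above.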
Re: Ocean Waves: Stationary Random Theory
At 10:53 AM 1/21/00 +, you wrote: >In the 1983 Guinness Book of World Records under OCEANS, the following >appears concerning the heights of waves: >"It has been calculated on the statistics of the Stationary Random theory >that one wave in more than 300,000 may exceed the average by a factor of 4." > >What is a reference on Stationary Random statistical theory? >What assumptions are involved in modeling random interactions of waves? >What is the sampling distribution for the heights of "random" waves?

For the latter question, you might check out the literature on extreme value theory, such as Castillo's book "Extreme value theory in engineering" (1988, I believe, but I don't know the publisher). There's a good but scattered geomorphometry literature on the occurrence of extreme earthquakes, floods, waves, etc.

Rich Strauss

=== This list is an open list and occasionally, people lacking respect for the other members of the list sometimes send messages that are inappropriate in regards to the list discussion topics. Please just delete the offensive email. For information concerning the list please see the following web page: http://jse.stat.ncsu.edu/ ===
Re: cluster analysis in one-dimensional "circular" space
Since clustering methods begin with pairwise distances among observations, why not measure these distances as minimum arc lengths along the best-fitting circle (or minimum chord lengths, or minimum angular deviations with respect to the centroid, etc.)? This is how geographic distances are measured (in two dimensions rather than one) and clustered, and also how distances are measured among observations in Kendall's shape spaces (e.g., Procrustes distances), so there's a well-established literature.

Rich Strauss

At 05:32 PM 4/14/00 +0200, you wrote: >Hi everybody. >I face the problem of clustering one-dimensional data that can range in a >circular way. Does anybody know the best way to solve this problem without the >aid of an additional variable? Using a well-suited trigonometric >transform? Using an ad-hoc metric? >Thanks. > >Carl
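The minimum-angular-deviation suggestion can be sketched as a distance-matrix computation (Python/NumPy illustration; the function name is mine):

```python
import numpy as np

def circular_distance_matrix(theta):
    # Pairwise minimum angular deviations (radians) between points on a
    # circle: d(i, j) = min(|a - b|, 2*pi - |a - b|), i.e. the shorter arc
    # between the two angles (equal to arc length on a unit circle).
    theta = np.asarray(theta, dtype=float)
    diff = np.abs(theta[:, None] - theta[None, :])
    return np.minimum(diff, 2.0 * np.pi - diff)
```

The resulting matrix handles the wrap-around correctly (angles just below 2*pi are close to angles just above 0) and can be fed to any distance-based clustering routine, e.g. average-linkage hierarchical clustering on its condensed form.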
Re: differences between groups/treatments ?
At 04:31 PM 6/22/00 +, Gene Gallagher wrote: >This pattern was described in an obit about two-three years ago in the >NY Times. A statistician's obit noted that he'd found a flaw in the >Israeli air force's training program. Apparently, the Israeli air force >was punishing the worst performers in a test because this usually >produced a better performance in subsequent tests and was supposedly >much more effective than positive reinforcement. They'd found that >positive reinforcement of the best performers often resulted in a poorer >performance on the next test. This now-deceased statistician pointed >out the confounding effect of regression to the mean on this assessment >of negative and positive reinforcement. The effectiveness of negative >reinforcement (punishment) could be nothing more than a chance effect.

A few years ago the journal "Statistical Methods in Medical Research" published an issue on regression to the mean (vol 6, no 2, 1997). It included the following five papers:

Stigler, S. M. Regression towards the mean, historically considered. pp. 103-114.
Chuang-Stein, C., Tong, D. M. The impact and implication of regression to the mean on the design and analysis of medical investigations. pp. 115-128.
Lin, H., Hughes, M. Adjusting for regression toward the mean when variables are normally distributed. pp. 129-146.
Chesher, A. Non-normal variation and regression to the mean. pp. 147-166.
Copas, J. Using regression models for prediction: shrinkage and regression to the mean. pp. 167-183.

Rich Strauss
Re: Adjusting a Correlation Matrix
At 03:46 PM 7/6/00 +, Christian A. Walter wrote: >Does anyone know if there is a structured way to adjust a negative >definite matrix such that it becomes semi-definite, while "minimizing" >the induced changes to the matrix? > >Cheers, >Christian

I posed a similar question to edstat last fall. I was specifically concerned with non-positive-definite correlation matrices. Several people suggested to me the following numerical solution: get the eigenvectors and eigenvalues, set the negative eigenvalues to zero (there's generally only one that's negative), proportionately adjust the others to maintain the same sum (total variance), and reconstruct the correlation matrix. This seems to work very well in practice. I've also done some simulations, beginning with a well-conditioned correlation matrix and gradually changing it until it becomes slightly ill-conditioned. The eigen procedure successfully 'corrects' the matrix.

Rich Strauss
Re: Question
Reference: Lande, R. 1977. On comparing coefficients of variation. Systematic Zoology 26:214-217.

Simplest approach: since the squared CV is approximately equal to the variance of the log-transformed data for CV < 30% or so, compare the squared CVs with an F-test or equivalent. Or, compare the variances of the log-transformed data directly using Levene's test or equivalent.

Rich Strauss

At 07:56 AM 10/30/00 -0500, you wrote: >Hi! > >My question is on a test to compare CVs. The CVs are computed using the >same data but two different variance methods and I have to compare them. >Been told there is no real test and as of yet have not checked the Current >Index of Stat books but wondered if someone in the group has had this >problem. Someone suggested that I take one of the CVs and make it the >population CV and do a 95% C.I. around that. Any suggestions?? Thanks.
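The second suggestion (comparing log-scale variances with Levene's test) might look like this in Python with SciPy; the function name is illustrative:

```python
import numpy as np
from scipy import stats

def compare_cvs_via_logs(x, y):
    # For positive-valued data with modest CVs (< ~30%), Var(log X) is
    # approximately CV^2, so equality of CVs can be tested as equality
    # of log-scale variances.  Levene's test is used here because it is
    # far more robust to non-normality than the plain F-test.
    return stats.levene(np.log(x), np.log(y))
```

The returned statistic and p-value test the null hypothesis of equal CVs under the log-variance approximation above.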
Problem on the probability of death
I have what seems to be a straightforward question involving a conditional probability, but I must be missing something because I can't quite get a handle on it. Let's say I have treatment and control groups with individuals preassigned to each, with T individuals in the treatment group and C in the control group. I observe mortality after some period of time, with t of T dying in the treatment group and c of C in the control group. I would like a measure of the probability of death due to the treatment, over and above (in some sense) the probability of death in the control group.

I know that P(x of T) is hypergeometric, assuming that the probabilities of death for treatment and control are identical, so I know how to determine whether (t of T) is significantly greater than (c of C). And I've just verified that this probability is the same as the chi-square probability for the 2 x 2 contingency table. But how do I measure this effect? As a simple difference between the probabilities for the two groups? I initially guessed that the value I wanted was just P(death | treatment), but of course this turns out to be just the ratio t/T, which contains no information about the control group.

I'm sure this must be commonly done, as, for example, in estimating the additional probability of death at a particular age due to smoking, but I've scanned the resources (texts, personnel, etc.) I have available and can't find the relevant information. Can someone point me in the right direction? Thanks in advance.

Rich Strauss
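One reading of the question, sketched in Python with SciPy: report the simple difference in death probabilities, t/T - c/C (often called the risk difference or attributable risk), as the effect measure, alongside the hypergeometric (Fisher exact) p-value for the 2 x 2 table already described. This is an illustration of that combination, not a claim that it is the standard epidemiological treatment:

```python
from scipy import stats

def treatment_effect(t, T, c, C):
    # Effect size: excess probability of death under treatment, over and
    # above the control group's death probability.
    risk_difference = t / T - c / C
    # Significance: one-sided Fisher exact test on the 2x2 table, which
    # is the hypergeometric tail probability mentioned in the post.
    table = [[t, T - t], [c, C - c]]
    _, p = stats.fisher_exact(table, alternative='greater')
    return risk_difference, p
```

For example, with 8 of 10 treated individuals dying versus 2 of 10 controls, the excess probability is 0.6 and the one-sided hypergeometric tail is about 0.012.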
Nonrandomness of binary matrices
Say I have a binary data matrix for which both the rows (observations) and columns (variables) are completely permutable. (In practice, about 5-20% of the cells will contain 1's, and the remainder will contain 0's.) Assume that the expected probability of a cell containing a '1' is identical for all cells in the matrix. I'd like to be able to test this assumption by measuring (and testing the significance of) the degree of 'nonrandomness' of the 1's in the matrix.

If the rows and columns were fixed in sequence, then this would be an easy problem involving spatial statistics, but the permutability seems to really complicate things. I think that I can test the rows or columns separately by comparing the row or column totals against a corresponding binomial distribution using a goodness-of-fit test, but I can't get a handle on how to do this for the entire matrix. I'd really appreciate ideas about this. Thanks in advance.

Rich Strauss
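One way to make the marginal-totals idea concrete is a randomization test (Python/NumPy sketch; the names and the particular statistic are illustrative, and because rows and columns are fully permutable only the marginal totals are used, so this tests marginal structure, not every conceivable departure from randomness):

```python
import numpy as np

def margin_nonrandomness_test(B, n_rand=1000, rng=None):
    # Statistic: summed squared deviations of the row and column totals of
    # a binary matrix from their expected values.  Null distribution: the
    # same number of 1's scattered uniformly at random over the cells.
    rng = np.random.default_rng(rng)
    B = np.asarray(B)
    nr, nc = B.shape
    k = int(B.sum())

    def stat(M):
        return (np.sum((M.sum(axis=1) - k / nr) ** 2)
                + np.sum((M.sum(axis=0) - k / nc) ** 2))

    obs = stat(B)
    exceed = 0
    for _ in range(n_rand):
        flat = np.zeros(nr * nc, dtype=int)
        flat[rng.choice(nr * nc, size=k, replace=False)] = 1
        if stat(flat.reshape(nr, nc)) >= obs:
            exceed += 1
    return obs, (exceed + 1) / (n_rand + 1)   # randomization p-value
```

This sidesteps the row/column dependence problem entirely, since the randomization builds the joint null distribution of both margins at once rather than combining two non-independent goodness-of-fit tests.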
Re: Nonrandomness of binary matrices
Thanks to Rich Ulrich for the suggestion below -- that was the direction I was heading, but there seem to be difficulties. The general problem is that I have a standard [nxp] data matrix, but (skipping over the scientific details) some of the values are "special", typically 5-20% of them, and I want to know whether their distribution within the matrix is structured in some way. In particular, they might be concentrated in particular rows or columns, but beyond that I have no notion of "nonrandom". I'm hoping that they're uniformly randomly distributed (or rather, not significantly different from random) because then I can basically ignore the fact that they're special, for the scientific problem at hand. I'd like to have two things: a nicely behaved index of "nonrandomness" (perhaps a test statistic, rescaled to the interval 0-1?) and a significance test.

So I recoded the matrix as binary, with the special values coded as 1's. I presumed that the null marginal distributions would be binomial rather than Poisson because the frequency of occurrence is so high, but either way I could test that. And if I measured the deviations of the marginal totals from expected (as a chi-square statistic, perhaps, or a mean squared deviation), that would provide both an index and a goodness-of-fit significance test for the entire matrix.

But the problem is: what if the row totals and column totals are not independent? I've done a few 2-way chi-square contingency tests on these matrices (using randomized null distributions, of course, since the matrices are binary), and some of the results are statistically significant. Doesn't this mean that I can't simply accumulate the row and column totals for a goodness-of-fit test, since they're not always independent? And even if I did the goodness-of-fit tests for rows and columns independently, how do I combine the p-values to get a single level of significance for the entire matrix, if the tests are not independent?
I have the feeling that I'm missing something obvious here but I can't quite get a handle on it, and this little problem is holding up the analysis of the results from a much larger study. I've talked to statisticians on campus, with little progress, so basically I'm begging for help.

Rich Strauss

At 10:47 AM 7/25/01 -0400, you wrote: >On 23 Jul 2001 14:22:58 -0700, [EMAIL PROTECTED] (Rich Strauss) >wrote: > >> Say I have a binary data matrix for which both the rows (observations) and >> columns (variables) are completely permutable. (In practice, about 5-20% of >> the cells will contain 1's, and the remainder will contain 0's.) Assume >> that the expected probability of a cell containing a '1' is identical for >> all cells in the matrix. I'd like to be able to test this assumption by >> measuring (and testing the significance of) the degree of 'nonrandomness' >> of the 1's in the matrix. >> >> If the rows and columns were fixed in sequence, then this would be an easy >> problem involving spatial statistics, but the permutability seems to really >> complicate things. I think that I can test the rows or columns separately >> by comparing the row or column totals against a corresponding binomial >> distribution using a goodness-of-fit test, but I can't get a handle on how >> to do this for the entire matrix. I'd really appreciate ideas about this. >> Thanks in advance. > >I'm not sure that I grasp what you are after, but - an idea. > >If they are completely permutable, then "permute": >sort them by decreasing counts for row and for column. >This puts me in mind of certain alternatives to "random." > >The set of counts on a margin should be ... Poisson? >The table can be drawn into quadrants or smaller sections, >so that the number of 1s in each can be tabulated, to make >ordinary contingency tables.
> >-- >Rich Ulrich, [EMAIL PROTECTED] >http://www.pitt.edu/~wpilib/index.html
Re: diff in proportions
At 05:12 PM 11/16/01 +, you wrote: >>On Thu, 15 Nov 2001, Jerry Dallal wrote: >>> But, if the null hypothesis is that the means are the same, why >>> isn't(aren't) the sample variance(s) calculated about a pooled >>> estimate of the common mean?

I've just done some quick simulations in Matlab, constructing randomized null distributions of the t-statistic under both scenarios: (1) sample variances based on sample means vs. (2) variances about the pooled mean. Assuming I've done everything correctly, the result is that the null distribution of the t-statistic in the second case consistently approximates the theoretical t-distribution more closely than that of the first case. This seems to be true regardless of sample sizes and of whether the two sample sizes are identical or different. This result implies that the t-statistic should indeed be calculated about a pooled estimate of the common mean, as Jerry Dallal suggested.

I could pass on the details of my simulation if anyone is interested, but mostly I'd appreciate it if someone could repeat this simulation independently of mine to see whether it holds up.

Rich Strauss
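For anyone who wants to repeat the simulation independently, here is one possible version in Python/NumPy -- a sketch of my understanding of the setup, not the original Matlab code. I've used n1 + n2 - 2 degrees of freedom for both variants; the thread leaves the proper df for the grand-mean variant open:

```python
import numpy as np

def null_t_statistics(n1=10, n2=15, n_sim=20000, rng=None):
    # Under a true null (both samples N(0,1)), compute the two-sample
    # t-statistic two ways: (1) pooled variance from deviations about each
    # sample's own mean, and (2) variance from deviations about the grand
    # mean of the combined sample.  Returns both sets of simulated values.
    rng = np.random.default_rng(rng)
    x = rng.standard_normal((n_sim, n1))
    y = rng.standard_normal((n_sim, n2))
    mx, my = x.mean(axis=1), y.mean(axis=1)
    df = n1 + n2 - 2
    se_fac = np.sqrt(1.0 / n1 + 1.0 / n2)

    # (1) the usual pooled-variance t-statistic
    ss1 = ((x - mx[:, None]) ** 2).sum(1) + ((y - my[:, None]) ** 2).sum(1)
    t1 = (mx - my) / (np.sqrt(ss1 / df) * se_fac)

    # (2) variance computed about the grand mean of the combined sample
    gm = (x.sum(1) + y.sum(1)) / (n1 + n2)
    ss2 = ((x - gm[:, None]) ** 2).sum(1) + ((y - gm[:, None]) ** 2).sum(1)
    t2 = (mx - my) / (np.sqrt(ss2 / df) * se_fac)
    return t1, t2
```

Each set can then be compared against the theoretical t distribution with n1 + n2 - 2 df (e.g., with a Q-Q plot or a Kolmogorov-Smirnov test). Note that ss2 >= ss1 always, since the grand-mean sum of squares equals ss1 plus a nonnegative between-groups term, so |t2| never exceeds |t1|.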
Fwd: Re: diff in proportions
This is true. I simulated the null distributions, those obtained when the null hypothesis is true, which is what the centered t-distribution represents. I didn't look at the sampling distributions for different effect sizes.

>Date: Sat, 17 Nov 2001 00:19:06 -0600 >From: jim clark <[EMAIL PROTECTED]> >Subject: Re: diff in proportions >To: [EMAIL PROTECTED] >Organization: The University of Winnipeg > >Hi > >On 16 Nov 2001, Rich Strauss wrote: >> I've just done some quick simulations in Matlab, constructing randomized >> null distributions of the t-statistic under both scenarios: (1) sample >> variances based on sample means vs. (2) variances about the pooled mean. >> Assuming I've done everything correctly, the result is that the null >> distribution of the t-statistic in the second case consistently >> approximates the theoretical t-distribution more closely than that of the >> first case. This seems to be true regardless of sample sizes and of >> whether the two sample sizes are identical or different. This result >> implies that the t-statistic should indeed be calculated about a pooled >> estimate of the common mean, as Jerry Dallal suggested. >> >> I could pass on the details of my simulation if anyone is interested, but >> mostly I'd appreciate it if someone could repeat this simulation >> independently of mine to see whether it holds up. > >This simply cannot be generally true. It probably only applies >when the null is in fact true, which may be the case for your >simulations. To appreciate the illogical nature of this >recommendation, consider creating a real difference of x between >your population means, then 2x, then 3x, and so on.
By the >common mean approach, you are treating the variability between >groups as though it were noise (i.e., a component in your >estimate of sigma^2, the variance about the null-hypothesis of >a common mean). It is critical to keep in mind that the null >hypothesis is in fact just that, a hypothesis that may or may >not be correct. Computing the within-group variance about the >group means is the correct way to estimate sigma^2, however, >irrespective of whether the Ho about the means is true or not. > >Best wishes >Jim > > >James M. Clark (204) 786-9757 >Department of Psychology (204) 774-4134 Fax >University of Winnipeg 4L05D >Winnipeg, Manitoba R3B 2E9 [EMAIL PROTECTED] >CANADA http://www.uwinnipeg.ca/~clark
Re: N.Y. Times: Statistics, a Tool for Life, Is Getting Short Shrift
>If the trend continues nationwide, this newspaper could someday report >that an apparently alarming cluster of cancer cases has arisen in an >innocuous normal distribution, and students will be able to explain to >their parents what that means.

The reporting of cancer clusters already happens on a regular basis, including in the NY Times. An excellent article on "The Cancer-Cluster Myth" by Atul Gawande was published in The New Yorker, 8 Feb 99. It was reprinted in "The Best American Science and Nature Writing" last year (2000, Houghton Mifflin).

=== Richard E. Strauss (806) 742-2719 Biological Sciences (806) 742-2963 Fax Texas Tech University [EMAIL PROTECTED] Lubbock, TX 79409-3131 http://www.biol.ttu.edu/Faculty/FacPages/Strauss/Strauss.html ===
Re: N.Y. Times: Statistics, a Tool for Life, Is Getting Short Shrift
This has nothing to do with normal distributions, as Robert Dawson noted yesterday. The article I cited makes no mention of normal distributions, and I didn't mean to imply that it did.

Rich Strauss

At 04:29 AM 11/29/01 +, Jerry Dallal wrote: >Rich Strauss <[EMAIL PROTECTED]> wrote: >:>If the trend continues nationwide, this newspaper could someday report >:>that an apparently alarming cluster of cancer cases has arisen in an >:>innocuous normal distribution, and students will be able to explain to >:>their parents what that means. > >: The reporting of cancer clusters already happens on a regular basis, >: including in the NY Times. An excellent article on "The Cancer-Cluster >: Myth" by Atul Gawande was published in The New Yorker, 8 Feb 99. It was >: reprinted in "The Best American Science and Nature Writing" last year >: (2000, Houghton Mifflin). > >I'd be happy if *anyone* could explain to me what "an apparently >alarming cluster of cancer cases has arisen in an innocuous normal >distribution" means! I *think* there's an unfortunate use of the word >"normal" here, but I can't be sure.