Re: about a problem of khi2 test
On Sun, 01 Jul 2001 14:19:31 +0200, Bruno Facon [EMAIL PROTECTED] wrote:

> I work in the area of intelligence differentiation. I would like to know how to use the khi2 (chi-squared) statistic to determine whether the number of statistically different correlations between two groups is due or not to random variations. In particular I would like to know how to determine the expected number of statistically different correlations due to chance.
>
> Let me take an example. Suppose I compare two correlation matrices of 45 coefficients obtained from two independent groups (A and B). If there is no true difference between the two matrices, the number of statistically different correlations should be equal to 1.25 in favor of group A and equal to 1.25 in favor of group B (in case of alpha = .05). Consequently, the expected number of nonsignificant differences should be 42.75. Is my reasoning correct?

Yes, that is the number. But there is no legitimate test that I know of, unless you are willing to make the strong assumption that no pair of the variables should be correlated.

I had never heard of the khi2 statistic before this. I searched with Google and found a respectable number of references, and here is something I had not seen with a statistic: khi2 appears to be solely French in its use. Of the first 50 hits, most were in French, at French ISPs (.fr). The few that were in English were also from French sources. One article had a reference (not available in my local libraries): Freilich MH and Chelton DB, J Phys Oceanogr 16, 741-757.

It would be nice to test the numbers, but I don't credit that reference as a good one, yet. I don't remember for sure, but I think you might be able to compare two correlation matrices with programs from Jim Steiger's site, http://www.interchg.ubc.ca/steiger/multi.htm

On the other hand, you would be better off if you can compare the entire covariance structures, to keep from making accidental assumptions about variances. (Does Jim provide for that?)
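The arithmetic behind those expected counts can be sketched as follows. This is a minimal illustration, not from the original post: the 45 coefficients and alpha = .05 come from the question, and the binomial check assumes the 45 tests are independent, which they are not, since the correlations share variables -- exactly the reason no legitimate test is readily available.

```python
from scipy.stats import binom

n_corr = 45    # pairs of correlations compared between groups A and B
alpha = 0.05   # significance level for each comparison

expected_sig = n_corr * alpha            # 2.25 significant by chance alone
expected_nonsig = n_corr - expected_sig  # 42.75 nonsignificant

# Under the (false) assumption of independent tests, the count of
# significant differences would be Binomial(45, .05); for example,
# the chance of seeing 5 or more "significant" differences:
p_5_or_more = 1 - binom.cdf(4, n_corr, alpha)
```

The binomial line is where the strong no-correlation assumption bites: with correlated tests, the true distribution of the count is lumpier than Binomial(45, .05).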
-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: cigs figs
- in respect of the upcoming U.S. holiday -

On Mon, 25 Jun 2001 11:49:47 GMT, mackeral@remove~this~first~yahoo.com (J. Williams) wrote:

> On Sun, 24 Jun 2001 16:37:48 -0400, Rich Ulrich [EMAIL PROTECTED] wrote:
>
> > What rights are denied to smokers?
>
> Many smokers, including my late mother, feel being unable to smoke on a commercial aircraft, sit anywhere in a restaurant, etc. were violations of her rights. I don't agree as a non-smoker, but that was her viewpoint until the day she died.

What's your point: she was a crabby old lady, whining (or whinging) about fancied 'rights'? You don't introduce anything that seems inalienable or self-evident (if I may introduce July-4th language). Nobody stopped her from smoking as long as she kept it away from other people-who-would-be-offended.

Okay, we form governments to help assure each other of rights. Lately, the law sees fit to stop some assaults from happening, even though it did not always do that in the past - the offender still has quite a bit of leeway; if you don't cause fatal diseases, you legally can offend quite a lot. We finally have laws about smoking. But she wants the law to stop at HER convenience?

[ snip, various ]

> Talking about confused and/or politically driven, what do Scalia and Thomas have to do with smoking rights? Please cite the case law.

I mention rights because that did seem to be an attitude you mentioned that was (as you see) provocative to me. I toss in Scalia and Thomas because I think that, to a large extent, they share your mother's preference for a casual, self-centered definition of rights. And they are Supreme Court justices. [ Well, they don't say, "This is what *I* want" - these two translate the blame/credit to Nature (euphemism for God). ]

So: I don't fault your mother *too* harshly, when Justices hardly do better. Even though a prolonged skew was needed, to end up with two like this.
Re: Maximum Likelihood
On 28 Jun 2001 20:39:18 -0700, [EMAIL PROTECTED] (Mark W. Humphries) wrote:

> Hi, does anyone have references to a simple/intuitive introduction to maximum log-likelihood methods? References to algorithms would also be appreciated.

Look on the Internet. I used www.google.com to search on "maximum likelihood tutorial" (put the phrase in quotes to keep it together; or you can use Advanced search). There were MANY hits, and the second reference was in a tutorial that begins at http://statgen.iop.kcl.ac.uk/bgim/mle/sslike_2.html

The third reference was for some programs and examples in Gauss (a programming language) by Gary King at Harvard, in his application area. If these aren't worthwhile (I did not try to download anything), there are plenty of other sites to check.

[ I am intrigued by G. King, a little. This is the fellow who putatively has a method, not Heckman's, for overcoming or compensating for aggregation bias, which I never found available for free. But, too bad, the page says these programs go with his 1989 book, and I think his Method is more recent. ]
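As a toy illustration of what those tutorials cover (the data here are invented): write down the log-likelihood, maximize it numerically, and check the numeric optimum against the closed-form answer.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical data: 62 heads in 100 coin flips; estimate p by
# maximum likelihood.
heads, n = 62, 100

def neg_log_lik(p):
    # Negative Bernoulli log-likelihood (optimizers minimize, so negate).
    return -(heads * np.log(p) + (n - heads) * np.log(1 - p))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
# The closed-form MLE is heads/n = 0.62; the numeric optimum agrees.
```

The same pattern (define a likelihood, hand it to a general-purpose optimizer) carries over to models with no closed-form solution, which is where ML methods earn their keep.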
Re: cigs figs
- re: some outstandingly confused thinking. Or writing.

On Sat, 23 Jun 2001 15:25:31 GMT, mackeral@remove~this~first~yahoo.com (J. Williams) wrote:

[ snip; Slate reference, etcetera ]

> ... My mother was 91 years old when she died a year ago and chain smoked since her college days. She defended the tobacco companies for years saying, "it didn't hurt me." She outlived most of her doctors. Upon quoting statistics and research on the subject, her view was that I, like other do-gooders and non-smokers, wanted to deny smokers their rights.

What statistics would her view quote? To show that someone wants to deny smokers 'their rights'? [ Hey, I didn't write the sentence. ] I just love it, how a 'natural right' works out to be *exactly* what the speaker wants to do. And not a whit more. (Thomas and Scalia are probably going to give us tons of that bad philosophy, over the next decades.) What rights are denied to smokers? You know, you can't build your outhouse right on the riverbank, either.

> Obviously, there is a health connection. How strong that connection is, is what makes this a unique statistical conundrum.

How strong is that connection? Well, quite strong. I once considered that it might not be so bad to die 9 years early, owing to smoking, if that cut off years of bad health and suffering. Then I realized, the smoking grants you most of the bad health of old age, EARLY. (You do miss the Alzheimer's.) One day, I might give up smoking my pipe.

What is the statistical conundrum? I can almost imagine an ethical conundrum. (How strongly can we legislate, to encourage cyclists to wear helmets?) I sure don't spot a statistical conundrum. Is this word intended? If so, how so?
Re: Marijuana
On Fri, 22 Jun 2001 18:45:52 GMT, Steve Leibel [EMAIL PROTECTED] wrote:

> In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Eamon) wrote:
>
> > (c) Reduced motor co-ordination, e.g. when driving a car
>
> Numerous studies have shown that marijuana actually improves driving ability. It makes people more attentive and less aggressive. You could look it up.

An intoxicant does *that*? I think I recall in the literature that people getting stoned, on whatever, occasionally *think* that their reaction time or sense of humor or other performance is getting better. Improving your driving by getting mildly stoned (omitting the episodes of hallucinating) seems unlikely enough, to me, that *I* think the burden of proof is on the stranger named Steve.
Re: a form of censoring I have not met before
On 21 Jun 2001 00:35:11 -0700, [EMAIL PROTECTED] (Margaret Mackisack) wrote:

> I was wondering if anyone could direct me to a reference about the following situation. In a 3-factor experiment, measurements of a continuous variable, which is increasing monotonically over time, are made every 2 hours from 0 to 192 hours on the experimental units (this is an engineering experiment). If the response exceeds a set maximum level, the unit is not observed any more (so we only know that the response is at least that level). If the measuring equipment could do so, it would be preferred to observe all units for the full 192 hours. The time to censoring is of no interest as such; the aim is to estimate the form of the response for each unit, which is the trace of some curve that we observe every 2 hours. Ignoring the censored traces in the time period after they are censored puts a huge downward bias into the results and is clearly not the thing to do, although that's what has been done in the past with these experiments. Any suggestions of where people have addressed data of this or related form would be very gratefully received.

Well, it certainly *sounds* as if the time to censoring should be of great interest, if you had an adequate model. Thus, when you say that ignoring them gives a huge downward bias, it sounds to me as if you are admitting that you do not have an acceptable model. Who can you blame for that? What leverage do you have, if you try to toss out those bad results? (Surely, you do have some ideas about forming estimates that *do* take the hours into account. The problem belongs in the hands of someone who does.)

- Maybe you want to segregate trials into the ones with 192 hours, or less than 192 hours, and figure two (maximum likelihood) estimates for the parameters, which you then combine.
Re: Help me, please!
On 18 Jun 2001 01:18:37 -0700, [EMAIL PROTECTED] (Monica De Stefani) wrote:

> 1) Are there some conditions under which I can apply normality to Kendall's tau?

Tau is *lumpy* in its distribution for N less than 10. And all rank-order statistics are a bit problematic when you try to use them on rating scales with just a few discrete scores -- the tied values give you bad scaling intervals, and the estimate of variance won't be very good, either. For correlations, your assumption of 'normality' is usually applied to the values at zero.

> I was wondering if x's observations must be independent and y's observations must be independent to apply the asymptotically normal limiting distribution. (Null hypothesis = x and y are independent.) Could you tell me something about this?

- Independence is needed for just about any test. I started to say (as a minor piece of exaggeration) that independence is needed absolutely; but the correct statement, I think, is that independence is always demanded relative to the error term.

[ snip, non-linear? ] Monotonic is the term.

[ snip, T(z): I don't know what that is. ]
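The lumpiness at small N is easy to see by brute force. This sketch (not from the original exchange) enumerates Kendall's tau over every permutation of N = 5 untied scores:

```python
from itertools import permutations
from scipy.stats import kendalltau

# Enumerate tau over all 120 permutations of N = 5 untied scores.
n = 5
x = list(range(n))
taus = sorted({round(kendalltau(x, list(p))[0], 6) for p in permutations(x)})
# Only 11 distinct values of tau are possible for N = 5 (one for each
# possible count of discordant pairs, 0 through 10) -- the "lumpy"
# null distribution that makes a normal approximation poor at small N.
```

With ties (a rating scale with few discrete scores), the support gets even more irregular, which is the second half of the complaint above.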
Re: Probability Of an Unknown Event
On Sat, 16 Jun 2001 23:05:52 GMT, W. D. Allen Sr. [EMAIL PROTECTED] wrote:

> It's been years since I was in school, so I do not remember if I have the following statement correct. Pascal said that if we know absolutely nothing about the probability of occurrence of an event, then our best estimate for the probability of occurrence of that event is one half. Do I have it correctly? Any guidance on a source reference would be greatly appreciated!

I did a little bit of Web searching and could not find that. Here is an essay about Bayes, which (dis)credits him and his contemporaries as assuming something like that, years before Laplace. I found it with a Google search on "know absolutely nothing" probability.

http://web.onetel.net.uk/~wstanners/bayes.htm
Re: meta-analysis
On 17 Jun 2001 04:34:26 -0700, [EMAIL PROTECTED] (Marc) wrote:

> I have to summarize the results of some clinical trials. Unfortunately the reported information is not complete. The information given in the trials contains: (1) mean effect in the treatment group (days of hospitalization); (2) mean effect in the control group (days of hospitalization); (3) numbers of patients in the control and treatment groups; (4) p-values of a t-test (between the differences of treatment and control). My question: How can I calculate the variance of the treatment difference, which I need to perform a meta-analysis? Note that the numbers of patients in the

Aren't you going too far? You said you have to summarize. Well, summarize. The difference is in terms of days. Or it is in terms of percentage of increase. And you have the t-test and p-values. You might be right in what you propose, but I think you are much more likely to produce a useful report if you keep it simple. You are right; meta-analyses are complex. And a majority of the published ones are (in my opinion) awful.
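For what it's worth, if one did want to back out a variance from the reported pieces, the usual route is through the t statistic: since t = difference / SE(difference), the standard error is |difference| / |t|, and |t| can itself be recovered from the two-sided p-value and the degrees of freedom. All the numbers below are invented for illustration:

```python
from scipy.stats import t as t_dist

# Hypothetical trial: reported means, group sizes, and two-sided p-value.
mean_treat, mean_ctrl = 6.1, 8.4   # days of hospitalization (assumed)
n_treat, n_ctrl = 40, 38
p_value = 0.03                     # two-sided, from the reported t-test

df = n_treat + n_ctrl - 2
t_stat = t_dist.ppf(1 - p_value / 2, df)   # |t| recovered from p
diff = mean_treat - mean_ctrl
se_diff = abs(diff) / t_stat               # SE of the treatment difference
var_diff = se_diff ** 2                    # variance, for meta-analysis weights
```

This assumes the reported test was the standard two-sample t-test with pooled variance; exact p-values are needed (a "p < .05" cutoff only bounds the variance).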
Re: Marijuana
On 15 Jun 2001 02:04:36 -0700, [EMAIL PROTECTED] (Eamon) wrote:

[ snip, Paul Jones. About marijuana statistics. ]

> Surely this whole research is based upon a false premise. Isn't it like saying that 90%, say, of heroin users previously used soft drugs; therefore, soft-drug use usually leads to hard-drug use -- which does not logically follow. ("A implies B" does not give "B implies A.") Conclusions drawn from the set of people who have had heart attacks cannot be validly applied to the set of people who smoke dope. Rather than collect data from a large number of people who had heart attacks and look for a backward link, they should monitor a large number of people who smoke dope. But, of course, this is much more expensive.

It is much more expensive, but it is also totally stupid to carry out the expensive research if the *cheap* and lousy research didn't give you a hint that there might be something going on. The numbers that he was asking about do pass the simple test. I mean, there were not 1 million people contributing one hour each, but we should still ask, *would* this say something? If it would not, then the whole question is *totally* arid.

The 2x2 table is approximately (dividing the first column by 100, and subtracting from a total):

  10687   124
    175     9

That gives a contingency test of 21.2 or 18.2 (without or with the Yates correction), with p-values under .001. The odds ratio on that is 4.4. That is pretty convincing that there is SOMETHING going on, POSSIBLY something that merits an explanation. The expectation for the cell with 9 is just 2.2 -- the tiny cell is the cell that matters for contributions to the test -- which is why it is okay to lop the hundreds off the first column (to make it readable).

Now, you may return to your discussion of why the table is not any good, and what is needed for a proper test.
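The numbers quoted above can be reproduced directly from the table (a quick check, not part of the original post):

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[10687, 124],
                  [  175,   9]])

chi2_plain, p_plain, _, expected = chi2_contingency(table, correction=False)
chi2_yates, p_yates, _, _ = chi2_contingency(table, correction=True)

odds_ratio = (table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0])
# chi2_plain is about 21.2, chi2_yates about 18.2, both p < .001;
# odds ratio about 4.4; expected count in the small cell about 2.2,
# which is why that cell dominates the test statistic.
```

The expected count of 2.2 in the critical cell also flags the usual caveat: with an expectation that small, an exact test would be the more defensible choice than the chi-squared approximation.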
Re: individual item analysis
On 15 Jun 2001 14:24:39 -0700, [EMAIL PROTECTED] (Doug Sawyer) wrote:

> I am trying to locate a journal article or textbook that addresses whether or not exam questions can be normalized when the questions are grouped differently. For example, could a question bank be developed where any subset of questions could be selected, and the assembled exam is normalized? What is the name of this area of statistics? What authors or keywords would I use for such a search? Do you know whether or not this can be done?

I believe that they do this sort of thing in scholastic achievement tests, as a matter of course. Isn't that how they make the transition from year to year? I guess this would be "norming." A few weeks ago, I discovered that there is a whole series of tech reports put out by one of the big test companies. I would look back to it, for this sort of question.
Re: multivariate techniques for large datasets
On 13 Jun 2001 20:32:51 -0700, [EMAIL PROTECTED] (Tracey Continelli) wrote:

> Sidney Thomas [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...
>
> > srinivas wrote:
> > Hi, I have a problem in identifying the right multivariate tools to handle a dataset of dimension 1,00,000*500. The problem is further complicated by a lot of missing data. Can anyone suggest a way to reduce the data set and also to estimate the missing values? I need to know which clustering tool is appropriate for grouping the observations (based on 500 variables).
>
> One of the best ways in which to handle missing data is to impute the mean for other cases with the selfsame value. If I'm doing psychological research and I am missing some values on my depression scale for certain individuals, I can look at their, say, reported locus of control and impute the mean value. Let's say [common finding] that I find a pattern -- individuals with a high locus of control report low levels of depression -- and I have a scale ranging from 1-100 listing locus of control. If I have a missing value for depression at level 75 for one case, I can take the mean depression level for all individuals at level 75 of locus of control and impute that for all missing cases in which 75 is the listed locus of control value. I'm not sure why you'd want to reduce the size of the data set, since for the most part the larger the N the better.

Do you draw numeric limits for a variable, and for a person? Do you make sure, first, that there is not a pattern? That is -- do you do something different depending on how many are missing? Say, estimate the value if it is an oversight in filling blanks on a form, BUT drop a variable if more than 5% of responses are unexpectedly missing, since (obviously) there was something wrong in the conception of it, or the collection of it. Psychological research (possibly) expects fewer missing than market research.
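The conditional-mean imputation Tracey describes can be sketched in a few lines (the data below are invented, and note that any single-imputation scheme like this understates the variance of the imputed variable, one of the concerns raised in the reply):

```python
import numpy as np
import pandas as pd

# Hypothetical data: locus-of-control score (1-100) and a depression
# scale with some missing values.
df = pd.DataFrame({
    "locus":      [75, 75, 75, 40, 40, 75],
    "depression": [10.0, 14.0, np.nan, 60.0, 64.0, np.nan],
})

# Conditional mean imputation: fill each missing depression score with
# the mean among cases sharing the same locus-of-control value.
df["depression"] = df.groupby("locus")["depression"].transform(
    lambda s: s.fillna(s.mean())
)
```

Here both missing cases have locus = 75, so both are filled with the mean of the observed depression scores at that level, (10 + 14) / 2 = 12.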
As to the N -- as I suggested before, my computer takes more time to read 50 megabytes than one megabyte. But a psychologist should understand that it is easier to look at and grasp and balance raw numbers that are only two or three digits, compared to 5 and 6.

A COMMENT ABOUT HUGE DATA-BASES. As a statistician, I keep noticing that HUGE databases tend to consist of aggregations. And these are random samples only in the sense that they are uncontrolled, and their structure is apt to be ignored. If you start to sample, you are more likely to ask yourself about the structure -- by time, geography, what-have-you. An N of millions gives you tests that are wrong; estimates ignoring relevant structure carry a spurious report of precision.

To put it another way: the Error (or real variation) that *exists* between a fixed number of units (years, or cities, for what I mentioned above) is something that you want to generalize across. With a small N, that error term is (we assume?) small enough to ignore. However, that error term will not decrease with N, so with a large N, it will eventually dominate. The test based on N becomes increasingly irrelevant.
Re: About kendall
On 12 Jun 2001 08:43:53 -0700, [EMAIL PROTECTED] (Monica De Stefani) wrote:

> When I apply Kendall's tau or Kendall's partial tau to a time series, do I have to calculate ranks or not? In fact, a time series has a natural temporal order.

... but you are not partialing out time. Surely your program that does the Kendall tau must do some ranking, as part of the algorithm. Why do you think you might have to calculate ranks?
Re: Diagnosing and addressing collinearity in Survival Analysis
On 06 Jun 2001 06:46:55 GMT, [EMAIL PROTECTED] (ELANMEL) wrote:

> Any assistance would be appreciated: I am attempting to run some survival analyses using Stata STCOX, and am getting messages that certain variables are collinear and have been dropped. Unfortunately, these variables are the ones I am testing in my analysis! I would appreciate any information or recommendations on how best to diagnose and explore solutions to this problem.

If there are 3 groups (classes), then you can have only two dummy variables to refer to their degrees of freedom. You can code those in the most convenient and informative way. If your problem arises otherwise, then you have a fundamental problem in the logic of what is being tested.

Google shows some examples of problems when I search for "statistical confounding" (use the quotes for the search). And "confounded designs" seems to obtain discussions.
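The k-1 dummy-variable point can be illustrated outside Stata (a generic sketch, with an invented grouping variable):

```python
import pandas as pd

# Hypothetical 3-class grouping variable.
groups = pd.Series(["a", "b", "c", "a", "c"], name="group")

# drop_first=True keeps k - 1 = 2 dummies; entering all 3 indicator
# columns alongside an intercept makes them sum to the constant term,
# which is exactly the collinearity the software complains about.
dummies = pd.get_dummies(groups, drop_first=True)
```

The dropped class ("a" here) becomes the reference category; its effect is absorbed into the intercept rather than lost.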
Re: please help
On 10 Jun 2001 07:27:55 -0700, [EMAIL PROTECTED] (Kelly) wrote:

> I have the gage repeatability and reproducibility (gage R&R) analysis done on two instruments. What hypothesis test can I use to show that the repeatability variances (expected sigma values of repeatability) of the two instruments are significantly different from each other, or to say that one has a lower variance than the other? Any insight will be greatly appreciated. Thanks in advance for your help.

I am not completely sure I understand, but I will make a guess. There is hardly any power for comparing two ANOVAs that are done on different samples, until you make strong assumptions about the samples being equivalent, in various regards.

If the ANOVAs are on the same sample, then a Chow test can be used on the improved prediction, if one hypothesis consists of an extra d.f. of prediction. If the ANOVAs are on separate samples, I wonder if you could compare the residual variances by the simple variance-ratio F-test -- well, you could do it, but I don't know what arguments should be raised against it, for your particular case. There are criteria resembling the Chow test that are used less formally, for incommensurate ANOVAs (not the same predictors) -- Akaike's criterion and others.

If your measures are done on the same (exact) items, you might have a paired test: instrument A gets closer values on how many of the measurements that are done. Finally, if you can do a bunch of separate experiments, you can test whether A or B does better in more than half of them.
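The variance-ratio F-test mentioned above looks like this in outline (the variances and degrees of freedom are invented; one standing argument against the test is its sensitivity to non-normality of the measurement errors):

```python
from scipy.stats import f as f_dist

# Hypothetical repeatability variances from the two gage R&R studies.
s2_a, df_a = 0.040, 29   # instrument A: variance estimate, its d.f.
s2_b, df_b = 0.090, 29   # instrument B

F = s2_b / s2_a                      # larger variance in the numerator
p_two_sided = 2 * (1 - f_dist.cdf(F, df_b, df_a))
```

A small p would suggest instrument B's repeatability variance really is larger, granted equivalent samples and roughly normal errors.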
Re: Need Good book on foundations of statistics
On 1 Jun 2001 19:07:31 GMT, [EMAIL PROTECTED] wrote:

> Can anyone refer me to a good book on the foundations of statistics?

Stigler's "The History of Statistics" is the most widely read of recent popular histories. It covers pre-1900. His newer book is "Statistics on the Table," and I enjoyed that one, too. It includes the founding of *modern* statistics in, say, the 1930s, in addition to much older anecdotes.

> I want to know of the limitations, assumptions, and philosophy behind statistics. A discussion of how the quantum world may have different laws of statistics might be a plus.

That last sentence makes me think that you don't know any answers to the sentence just previous to it. "... have different laws" is certainly not the way statisticians would put it. Leptons *obey* different laws than baryons do (I think), but the laws are descriptions that were imagined by human beings. I suppose one way to describe the dilemma of physics might be: it is trying to force all of these particles into fitting descriptions that are less than ideal (or, so it keeps working out).

I think it is curious and interesting that the physicists at the highest levels of abstraction -- cosmology, and high-energy particles/relativity -- are beginning to use fairly ordinary 'statistical tests' to judge whether they have anything. IS there oscillation in the measured background of stars, near 4 degrees Kelvin, across the whole universe? IF they continued CERN for another 18 months, would there have been another dozen or so *apparent* particles of the right type, so they could conclude that the number observed was 'significant' at the one-in-a-million level, instead of just one-in-two-hundred?
Re: fit-ness
On Thu, 31 May 2001 12:05:24 +0100, Alexis Gatt [EMAIL PROTECTED] wrote:

> Hi, a basic question from a MSc student in England. First of all, yeah, I read the FAQ and I didn't find anything answering my question, which is fairly simple: I am trying to analyse how well several mathematical methods perform in modeling a scanner. So I have, for every input data, the corresponding output given by the scanner and the values given by the mathematical models I am using. First, given the distribution of the errors, I can use the usual mean-StdDev if the distro is normal, or median-95th percentile otherwise. Any other known methods to enhance the pertinence of the analysis? Any ideas welcome.

I can think of two or three meanings of 'scanner,' and not one of them would have a simple, indisputable measure of 'error':

1) Some measures would be biased toward one 'method' or another, so a winner would be obvious.
2) Some samples to be tested would be biased (similarly) toward a winner by one method or another. So you select your winner by selecting your mix of samples.

If you have fine measures, then you can give histograms of your results (assuming 1-dimensional, as your alternatives suggest). Is it enough to have the picture? What would your audience demand? What is your need?

Average squared error (giving the SD) is popular. Average absolute error de-emphasizes the extremes. A count of errors beyond a critical limit sometimes fills a need. A more complicated way is to build in a cost function.
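Those three summary measures are one-liners each (the error values below are invented; note how differently the single large error of 1.5 is treated by each):

```python
import numpy as np

# Hypothetical model errors: measured minus predicted scanner output.
errors = np.array([0.2, -0.1, 0.4, -0.3, 1.5, -0.2])

rmse = np.sqrt(np.mean(errors ** 2))      # root mean squared error:
                                          #   emphasizes the extremes
mae = np.mean(np.abs(errors))             # mean absolute error:
                                          #   de-emphasizes the extremes
n_beyond = int(np.sum(np.abs(errors) > 1.0))  # count past a critical limit
```

A cost function generalizes the last idea: replace the 0/1 threshold with a weight for each error magnitude that reflects what an error of that size actually costs.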
Re: ONLY ONE
FYI - that piece of HTML code is a SPAM advertisement, which does seem to invoke other Web addresses.

On 27 May 2001 18:51:32 -0700, [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:

> [ HTML with a JavaScript redirect to http://www.moodysoft.com, advertising "SPX® v2.0," a screen-capture utility: "Best screen capture on earth and in cyberspace. In fact the only one. Anything else is just a long learning process. ... hold right-click longer than usual until the cursor turns into the cross graphical cursor. Make your selection and as soon as you release the mouse, SPX® will send it to the destination of your choice: Clipboard, File, Mail, Printer/Fax ... Very useful, no? www.moodysoft.com" ]
Re: Standardized testing in schools
On Thu, 24 May 2001 23:25:42 GMT, W. D. Allen Sr. [EMAIL PROTECTED] wrote:

> > And this proved to me, once again, why nuclear power plants are too hazardous to trust: ...
>
> Maybe you better rush to tell the Navy how risky nuclear power plants are! They have only been operating nuclear power plants for almost half a century with NO, I repeat NO, failures that have ever resulted in any radiation poisoning or the death of any ship's crew. In fact, the most extensive use of Navy nuclear power plants has been under the most constrained possible conditions, and that is aboard submarines! Beware of our imaginary boogy bears!!

As I construct an appropriate sampling frame, one out of two nuclear navies has a good long-term record. Admiral Rickover had a fine success. The other navy was not so lucky, or suffered because it was more pressed for resources.

> You are right, though. There is nothing really hazardous about the operation of nuclear power plants. The real problem has been civilian management's ignorance or laziness! [...]

I'm glad you see the problem - though I see it more as 'ordinary management' than ignorance or laziness. It might not even have to be 'poor' management by conventional terms; the conventions don't take into account extraordinarily dangerous materials. The Japanese power plant's nuke-fluke of last year was an illustration of employee inventiveness and 'shop-floor innovation'. Unfortunately for them, they 'solved a problem' that had been a (too-) cleverly designed safety precaution.
Re: The False Placebo Effect
On 24 May 2001 21:39:17 -0700, [EMAIL PROTECTED] (David Heiser) wrote:

> Be careful on your assumptions in your models and studies!
>
> ---
> "Placebo Effect An Illusion, Study Says," by Gina Kolata, New York Times (published in the Sacramento Bee, Thursday, May 24, 2001)
>
> In a new report that is being met with a mixture of astonishment and some disbelief, two Danish researchers say that the placebo effect is a myth.

Do you think they will not believe in voudon/voodoo, either?

> The investigators analyzed 114 published studies involving about 7,500 patients with 40 different conditions. They found no support for the common notion that, in general, about one-third of patients will improve if they are given a dummy pill and told it is real. [ ... ]

The story goes on. The authors look at studies where the placebo effect is probably explained by regression-to-the-mean.

- I was a bit surprised by the newspaper coverage. I tend to forget that most people, including scientists, do *not* blame regression-to-the-mean as the FIRST suspicious cause whenever there is a pre-post design: because they have scarcely heard of it. On the other hand, I have expected for a long time that the best that a light-weight placebo will do is a light-weight improvement.

> ... The researchers said they saw a slight effect of placebos on subjective outcomes reported by patients, like their descriptions of how much pain they experienced. But Hrobjartsson said he questioned that effect. "It could be a true effect, but it also could be a reporting bias," he said. "The patient wants to please the investigator and tells the investigator, 'I feel slightly better.'"

Pain is a hugely subjective report. It is notorious. I would not want to do a summary across the papers of the whole field of pain researchers, since -- based on difficulty, and not on knowing those researchers -- I expect an enormous amount of bad research in that area.
- I don't know if the researchers are quite unwise here, or if they only seem that way because of bad news reporting. - Oh, I did read a meta-analysis a while ago, that one from Steve Simon. It was based on pain research (and, basically, only relevant to pain research), and the authors insisted that the vast majority of studies were not very good. About the studies these authors found, using 3 groups: They found 114, published between 1946 and 1998. When they analyzed the data, they could detect no effects of placebos on objective measurements, like cholesterol levels or blood pressure. - That is interesting. 114 is a big enough number. Controlled medical research, however, seemed to undergo big changes across those decades. I expect that double-blind and triple-blind studies did not get much use until halfway through that interval. If someone does look into the original publication, and will tell us about it -- I am interested, especially, in what the authors say about pain studies, and what they say about time trends. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Standardized testing in schools
Standardized tests and their problems? Here was a problem with equating the scores between years. The NY Times had a long front-page article on Monday, May 21: When a test fails the schools, careers and reputations suffer. It was about a minor screw-up in standardizing, in 1999. Or, since the company stonewalled and refused to admit any problems, and took a long time to find the problems, it sounds like it became a moderately *bad* screw-up. The article about CTB/McGraw-Hill starts on page 1, and covers most of two pages on the inside of the first section. It seems highly relevant to the 'testing' that the Bush administration advocates, to substitute for having an education policy. CTB/McGraw-Hill runs the tests for a number of states, so they are one of the major players. And this proved to me, once again, why nuclear power plants are too hazardous to trust: we can't yet trust Managements to spot problems, or to react to credible problem reports in a responsible way. In this example, there was one researcher from Tennessee who had strong longitudinal data to back up his protest to the company; the company arbitrarily (it sounds like) fiddled with *his* scores, to satisfy that complaint, without ever facing up to the fact that they did have a real problem. Other people, they just talked down. The company did not necessarily lose much business from the episode because, as someone was quoted, all the companies who sell these tests have histories of making mistakes. (But, do they have the same history of responding so badly?) -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Variance in z test comparing purcenteges
- BUT, Robert, the equal-N case is different from cases with unequal N - - or did I lose track of what the topic really is... - On 22 May 2001 06:52:27 -0700, [EMAIL PROTECTED] (Robert J. MacG. Dawson) wrote: and Rich Ulrich responded: Aren't we looking at the same contrast as the t-test with pooled and unpooled variance estimates? Then - Similar, but not identical. With the z-for-proportion we have the additional twist that the amount of extra power from the unpooled test is linked to the size of the effect we're trying to measure, in such a way that we get it precisely when we don't need it. Or, to avoid being too pessimistic, let's say that the pooled test only costs us power when we can afford to lose some <grin>. - Robert wrote on May 18: And, clearly, the pooled variance is larger; as the function is convex up, the linear interpolation is always less. Back to my example in the previous post: Whenever you do a t-test, you get exactly the same t if the Ns are equal. For unequal N, you get a bigger t when the group with the smaller variance gets more weight. I think your z-tests on proportions have to work the same way. I can do a t-test with a dichotomous variable as the criterion, testing 1 of 100 versus 3 of 6: the 2x2 table is (1+99), (3+3). That gives me a pooled t of 6 or 7, that is p < .001; and a separate-variance t that is p = 0.06. - I like that pooled test, but I do think that it has stronger assumptions than the 2x2 table. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
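The pooled/unpooled contrast in the example above can be checked numerically. This is a sketch, not from the original post: it computes the two z statistics for proportions for the 2x2 table (1+99) versus (3+3). The post's own numbers were t statistics, so the p-levels need not match exactly, but the ordering does.

```python
from math import sqrt

# Example from the post: 1 success of 100 in group A, 3 of 6 in group B.
x1, n1 = 1, 100
x2, n2 = 3, 6
p1, p2 = x1 / n1, x2 / n2

# Pooled z: estimate one common proportion under the null hypothesis.
p_pool = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_pooled = (p2 - p1) / se_pooled

# Unpooled z: each group contributes its own variance estimate.
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_unpooled = (p2 - p1) / se_unpooled

print(z_pooled, z_unpooled)   # about 6.12 versus about 2.40
```

With these counts the pooled z is about 6.1 while the unpooled z is about 2.4, mirroring the post's point: pooling keeps the wild, small group from being weighted as heavily.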
Re: Elementary cross-sectional statistics
On Mon, 21 May 2001 13:41:16 GMT, Sakke [EMAIL PROTECTED] wrote: Hello Everybody! We have a probably very simple question. We are doing cross-sectional regressions. We are doing one regression per month for a period of ten years, resulting in 120 regressions. As we understood, it is possible to just take an arithmetic average for every coefficient. Well, sure, it is possible to take an arithmetic average and then you can tell people, Here is the arithmetic average. It's a lot harder to have any certainty that the average of a time series means much. What we do not know, is how to calculate the t-statistics for these coefficients. Can we just do the same, arithmetic average? Can anybody help us? No, you certainly can't compute an average of some t-tests and claim that it is a t-test. What you absolutely have to have (in some sense) is a model of what happens over 10 years. For instance: If it is the same experience over and over again (that is your model of 'what happens'), *maybe* it would be proper to average each Variable over the 120 time points; and then do the regression. That is the easiest case I can think of -- the mean is supposed to represent something, and you conclude that it represents the whole thing. Otherwise: What is there? What are you trying to conclude? Why? (Who cares?) Are the individual regressions 'significant'? Highly? Are there mean-differences over time? - variations between years or seasons? Are the lagged correlations near zero? -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
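The reply rules out averaging t-statistics. One standard model-dependent alternative from the finance literature (Fama-MacBeth style inference), sketched here with made-up coefficients rather than the posters' data, treats each month's estimated coefficient as a single observation and computes a t statistic from the time-series spread of those estimates. It presumes the monthly estimates are roughly independent and identically distributed, which is exactly the kind of model assumption the reply says must be justified first.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)

# Hypothetical stand-ins for 120 monthly cross-sectional slope estimates.
monthly_b = [0.5 + random.gauss(0, 1.0) for _ in range(120)]

T = len(monthly_b)
b_bar = mean(monthly_b)
# t statistic for H0: E[b] = 0, using the time-series spread of the
# monthly estimates -- NOT an average of the monthly t statistics.
t_stat = b_bar / (stdev(monthly_b) / sqrt(T))
print(b_bar, t_stat)
```

If the monthly coefficients are autocorrelated (seasonality, persistent regimes), this standard error is too small and needs further adjustment, which is the reply's point about needing a model for the ten years.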
Re: Intepreting MANOVA and legitimacy of ANOVA
The usual problem of MANOVA, which is hard to avoid, is that even if a test comes out significant, you can't say what you have shown except 'different.' You get a clue by looking at the univariate tests and correlations. Or drawing up the interesting contrasts and testing them to see if they account for everything. I have a problem, here, that might be avoidable -- I can't tell what you are describing. Part of that is 'ugly abbreviations,' part is 'I do not like the terminology, DV and IV, abbreviated or not' so I will not take much time at it. On Fri, 18 May 2001 14:57:49 -0500, auda [EMAIL PROTECTED] wrote: Hi, all, In my experiment, two dependent variables were measured (say, DV1 and DV2). I found that when analyzed separately with ANOVA, the independent variable (say, IV, with two levels IV_1 and IV_2) modulated DV1 and DV2 differentially: mean DV1 in IV_1 differed from mean DV1 in IV_2 in one direction, while mean DV2 in IV_1 differed from mean DV2 in IV_2 in the opposite direction. If analyzed with MANOVA, the effect of IV was significant, Rao R(2,14) = 112.60, p < .001. How to interpret this result of MANOVA? Can I go ahead and claim IV modulated DV1 and DV2 differentially based upon the result from MANOVA? Or do I have to do other tests? Moreover, can I treat DV1 and DV2 as two levels of a factor, say, type of dependent variable, and then go ahead and test the data with repeated-measures ANOVA and see if there is an interaction between IV and type of dependent variable? -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Variance in z test comparing purcenteges
On 18 May 2001 07:51:21 -0700, [EMAIL PROTECTED] (Robert J. MacG. Dawson) wrote: [ ... ] OK, so what *is* going on here? Checking a dozen or so sources, I found that indeed both versions are used fairly frequently (BTW, I myself use the pooled version, and the last few textbooks I've used do so). Then I did what I should have done years ago, and I tried a MINITAB simulation. I saw that for (say) n1=n2=10, p1=p2=0.5, the unpooled statistic tends to have a somewhat heavy-tailed distribution. This makes sense: when the sample sizes are small the pooled variance estimator is computed using a sample size for which the normal approximation works better. The advantage of the unpooled statistic is presumably higher power; however, in most cases, this is illusory. When p1 and p2 are close together, you do not *get* much extra power. When they are far apart and have moderate sample sizes you don't *need* extra power. And when [ snip, rest] Aren't we looking at the same contrast as the t-test with pooled and unpooled variance estimates? Then - (a) there is exactly the same t-test value when the Ns are equal; the only change is in DF. (b) Which test is more powerful depends on which group is larger, the one with *small* variance, or the one with *large* variance. -- it is a large difference when Ns and variances are both different by (say) a fourfold factor or more. If the big N has the small variance, then the advantage lies with 'pooling' so that the wild, small group is not weighted as heavily. If the big N has the large variance, then the separate-variance estimate lets you take advantage of the precision of the smaller group. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: bootstrap, and testing the mean of variance ratio
On Wed, 16 May 2001 11:50:07 + (UTC), [EMAIL PROTECTED] (rpking) wrote: For each of the two variance ratios, A=var(x)/var(y) and B=var(w)/var(z), I bootstrapped with 2000 replications to obtain confidence intervals. Now I want to test whether the means are equal, ie. E(A) = E(B), and I am wondering whether I could just use the 2000 data points, calculate the standard deviations, and do a simple t test. This raises questions, questions, questions. What do you mean by a data point? by bootstrapping? Why do you want ratios of the variances? If you are concerned with variances, why aren't you considering the logs of V? If you are concerned with ratios, why aren't you considering the logs of the ratios? With 2000 replications each, there would seem to be 4000 points. Or, what relation is there among x-y-z-w? If these give you 2000 vectors, then why don't you have a paired comparison in mind? Bootstrapping is tough enough to figure what's proper, that I don't want to bother with it. Direct tests are usually enough: So, if you were considering a direct test, what would you be testing? (I figure there is a really good chance that you are wrong in what you are trying to bootstrap, or how you are doing it.) I have concerns because A and B are bounded below at 0 (but not bounded above), so the distribution may not be asymptotically normal. ... and that is relevant to what? Distributions of raw data are seldom (if ever) asymptotically normal. But I also found the bootstrapped A and B are well away from zero; the 1% percentile has a value of 0.78. ... well, I should hope they are away from zero. Relevance? So could a t test be used in this situation? Or should I do another bootstrapping for the test? Take your original problem to a statistician. Bootstrap is something to retreat to when you can't estimate error directly, and you have given no clue why you might need it.
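For reference, here is a minimal percentile-bootstrap sketch for a single variance ratio A = var(x)/var(y), on made-up data. It does not settle the poster's E(A) = E(B) question, which, as the reply says, depends on how x, y, z, and w are related; it only shows what "2000 replications" of one ratio looks like mechanically.

```python
import random
from statistics import variance

random.seed(1)

# Hypothetical data standing in for the poster's x and y;
# here the true ratio var(x)/var(y) is 1/4.
x = [random.gauss(0, 1) for _ in range(200)]
y = [random.gauss(0, 2) for _ in range(200)]

def ratio(xs, ys):
    return variance(xs) / variance(ys)

# Percentile bootstrap: resample each sample with replacement,
# recompute the variance ratio each time.
reps = []
for _ in range(2000):
    xb = random.choices(x, k=len(x))
    yb = random.choices(y, k=len(y))
    reps.append(ratio(xb, yb))

reps.sort()
lo, hi = reps[49], reps[1949]   # rough 95% percentile interval
print(ratio(x, y), lo, hi)
```

Note that the 2000 bootstrap replications are a device for estimating the sampling variability of one statistic; they are not 2000 independent "data points" to feed into a t test, which is the mistake the reply is warning against.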
-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: (none)
[ note, Jay: HTML-formatting makes this hard to read ] On 11 May 2001 00:30:06 -0700, [EMAIL PROTECTED] (Jay Warner) wrote: [snip, HTML header] I've had occasion to talk with a number of educator types lately, at different application and responsibility levels of primary & secondary Ed. Only one recalled the term, regression toward the mean. Some (granted, the less analytically minded) vehemently denied that such could be causing the results I was discussing. Lots of other causes were invoked. In an MBA course I teach, which frequently includes teachers wishing to escape the trenches, the textbook never once mentions the term. I don't recall any other intro stat book including the term, much less an explanation. The explanation I worked out required some refinement to become rational to those educator types (if it has yet :). - I am really sorry to learn that - Not even the texts! that's bad. By the way, there are two relevant chapters in the 1999 history, Statistics on the Table by Stephen Stigler (see pages 157-179). Stigler documents a big, embarrassing blunder by a noted economist, published in 1933. Horace Secrist wrote a book with tedious detail, much of it being accidental repetitions of the regression fallacy. Hotelling panned it in a review in JASA. Next, Secrist replied in a letter, calling Hotelling wholly mistaken. Hotelling tromped back, ... and when one version of the thesis is interesting but false and the other is true but trivial, it becomes the duty of the reviewer to give warning at least against the false version. Maybe Stigler's user-friendly anecdote will help to spread the lesson, eventually. So I'm not surprised that even the NYT would miss it entirely. Rich, I hope you penned a short note to the editor, pointing out its presence. Someone has to, soon. I did not write, yet. But I see an e-mail address, which is not usual in the NYTimes.
I guess they identify Richard Rothstein as [EMAIL PROTECTED] because this article was laid out as a feature (Lessons) instead of an ordinary news report. I'm still considering what I should say, if someone else doesn't tell me that they have passed the word. BTW, Campbell's text, A primer on regression artifacts, mentions a correction factor/method, which I haven't understood yet. Does anyone in education and other social science circles use this correction, and may I have a worked-out example? Since you mentioned it, I checked my new copy of the Campbell/ Kenny book. Are you in Chapter 5? There is a lot going on, but I don't grasp that there is any well-recommended correction. Except, maybe, Structural-equations-modeling, and they just gesture vaguely in the direction of that. Give me a page number? I thought that they reinforced my own prejudices, that when two groups are not matched at Pre, you have a lot of trouble forming clear conclusions. You can be a bit assertive if one group wins by all three standards (raw score, change score, regressed-change score), but you still can't be 100% sure. When your groups don't match, you draw the graphs to help you clarify trends, since the eyeball is great at pattern analysis. Then you see if any hostile interpretations can undercut your optimistic ones, and you sigh regrets when they do. - Jay. Rich Ulrich wrote: [ snip, my earlier note, with HTML format imposed. ] -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
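Regression toward the mean, which the educators above denied, is easy to demonstrate by simulation. Here is a sketch with synthetic test scores (the means, SDs, and cutoff are hypothetical): select the low scorers at pre-test, and their post-test mean drifts back toward the population mean with no intervention at all.

```python
import random
from statistics import mean

random.seed(42)

# True ability plus independent measurement noise at each testing.
N = 10_000
ability = [random.gauss(100, 10) for _ in range(N)]
pre  = [a + random.gauss(0, 10) for a in ability]
post = [a + random.gauss(0, 10) for a in ability]

# Select the 'failing' students: bottom of the pre-test distribution.
cutoff = sorted(pre)[N // 10]          # roughly the 10th percentile
low = [i for i in range(N) if pre[i] < cutoff]

pre_mean = mean(pre[i] for i in low)
post_mean = mean(post[i] for i in low)
print(pre_mean, post_mean)   # post mean sits much closer to 100
```

Because pre and post correlate at about 0.5 here (signal variance 100 against total variance 200), the selected group recovers roughly half its deficit on retest, purely from the noise. That is exactly the artifact that makes the "worst schools improved most" stories suspect.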
Re: Variance in z test comparing percenteges
On 11 May 2001 22:29:37 -0700, [EMAIL PROTECTED] (Donald Burrill) wrote: On Sat, 12 May 2001, Alexandre Kaoukhov (RD [EMAIL PROTECTED]) wrote: I am puzzled with the following question: In a z test for continuous variables we just use the sum of estimated variances to calculate the variance of a difference of two means, i.e. s^2 = s1^2/n1 + s2^2/n2. [ snip, Q and A, AK and DB ... ] On the other hand the chi2 is derived from Z^2 as assumed by the first approach. DB Sorry; the relevance of this comment eludes me. Well -- every (normal) z score can be squared, to produce a chi-squared score. One particular formula for a z matches the Pearson chi-squared test statistic. Finally, I would like to know whether the second formula is ever used and if so does it have any name. DB Ever is a wider universe of discourse than I would dare pretend to. Perhaps colleagues on the list may know of applications. I would be surprised if it had been named, though. I don't remember a name, either. I think I do remember seeing a textbook that presented that t as their preferred test for proportions. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
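The squaring relationship mentioned above can be verified directly: for a 2x2 table, the pooled z for comparing two proportions, squared, reproduces the Pearson chi-squared statistic exactly. A sketch with arbitrary counts:

```python
from math import sqrt

# Arbitrary 2x2 table: successes/failures in two independent groups.
a, b = 30, 70   # group 1: 30 of 100
c, d = 45, 75   # group 2: 45 of 120
n1, n2 = a + b, c + d
N = n1 + n2

# Pooled z for the difference of two proportions.
p1, p2 = a / n1, c / n2
p = (a + c) / N
z = (p1 - p2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

# Pearson chi-squared via the shortcut formula for a 2x2 table.
chi2 = N * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

print(z ** 2, chi2)   # identical up to floating-point rounding
```

This is the algebraic identity behind the comment in the post: the *pooled* z is the one whose square is the Pearson chi-squared; the unpooled version squares to something else.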
Re: Question
On 11 May 2001 07:34:38 -0700, [EMAIL PROTECTED] (Magill, Brett) wrote: Don and Dennis, Thanks for your comments, I have some points and further questions on the issue below. For both Dennis and Don: I think the option of aggregating the information is a viable one. I would call it unavoidable rather than just viable. The data that you show is basically aggregated already; there's just one item per-person. Yet, I cannot help but think there is some way to do this taking into account the fact that there is variation within organizations. I mean, if I have an organizational salary mean of .70 (70%) with a very tiny [ snip, rest] - I agree, you can use the information concerning within-variation. I think it is totally proper to insist on using it, in order to validate the conclusions, to whatever degree is possible. You might be able to turn around that 'validation' to incorporate it into the initial test; but I think the role as validation is easier to see by itself, first. Here's a simple example where the 'variance' is Poisson. (Ex.) A town experiences some crime at a rate that declines steadily, from 20,000 incidents to 19,900 incidents, over a 5-year period. The linear trend fitted to the several points is highly significant by a regression test. Do you believe it? (Answer) What I would believe is: No, there is no trend, but it is probably true that someone is fudging the numbers. The *observed variation* in means is far too small for the totals to be seen by chance. And the most obvious sources of error would work in the opposite direction. [That is, if there were only a few criminals responsible for many crimes each, and the number-of-criminals is what was subject to Poisson variation, THEN the number-of-crimes should be even more variable.]
In your present case, I think you can estimate on the basis of your factory (aggregate) data, and then you figure what you can about how consistent those numbers are with the un-aggregated data, in terms of means or variances. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
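The crime-count example lends itself to a standard dispersion check, sketched below with hypothetical numbers matching the story. Under a Poisson model, yearly counts near 20,000 should wobble by about sqrt(20,000), roughly 141 incidents, and the statistic sum((x - mean)^2 / mean) should come out near its degrees of freedom.

```python
from statistics import mean

# Hypothetical yearly counts declining smoothly from 20,000 to 19,900.
counts = [20000, 19975, 19950, 19925, 19900]

m = mean(counts)
# Poisson index-of-dispersion statistic: approximately chi-squared
# with len(counts) - 1 = 4 degrees of freedom under the Poisson null.
dispersion = sum((x - m) ** 2 / m for x in counts)
print(dispersion)   # about 0.31
```

A value near 0.31 against an expectation of about 4 is far too regular: the "too smooth to be chance" pattern that makes fudged numbers the more plausible explanation, exactly as the reply argues.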
Re: (none)
- selecting from CH's article, and re-formatting. I don't know if I am agreeing, disagreeing, or just rambling on. On 4 May 2001 10:15:23 -0700, [EMAIL PROTECTED] (Carl Huberty) wrote: CH: Why do articles appear in print when study methods, analyses, results, and conclusions are somewhat faulty? - I suspect it might be a consequence of Sturgeon's Law, named after the science fiction author. Ninety percent of everything is crap. Why do they appear in print when they are GROSSLY faulty? Yesterday's NY Times carried a report on how the WORST schools have improved more than the schools that were only BAD. That was much-discussed, if not published. - One critique was, the absence of peer review. There are comments from statisticians in the NY Times article; they criticize, but (I thought) they don't get it on the simplest point. The article, while expressing skepticism by numerous people, never mentions REGRESSION TOWARD the MEAN, which did seem (to me) to account for every single claim of the original authors whose writing caused the article. CH: [] My first, and perhaps overly critical, response is that the editorial practices are faulty[ ... ] I can think of two reasons: 1) journal editors cannot or do not send manuscripts to reviewers with statistical analysis expertise; and 2) manuscript originators do not regularly seek methodologists as co-authors. Which is more prevalent? APA Journals have started trying for both, I think. But I think that statistics only scratches the surface. A lot of what arises are issues of design. And then there are issues of data analysis. Becoming a statistician helped me understand those so that I could articulate them for other people; but a lot of what I know was never important in any courses.
I remember taking just one course on epidemiology, where we students were responsible for reading and interpreting some published report, for the edification of the whole class -- I thought I did mine pretty well, but the rest of the class really did stagger through the exercise. Is this critical reading something that can be learned, and improved? -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: 2x2 tables in epi. Why Fisher test?
- I offer a suggestion of a reference. On 10 May 2001 17:25:36 GMT, Ronald Bloom [EMAIL PROTECTED] wrote: [ snip, much detail ] It has become the custom, in epidemiological reports, to use always the hypergeometric inference test -- The Fisher Exact Test -- when treating 2x2 tables arising from all manner of experimental setups -- e.g. a.) the prospective study b.) the cross-sectional study c.) the retrospective (or case-control) study [ ... ] I don't know what you are reading, to conclude that this has become the custom. Is that a standard for some journals, now? I would have thought that the Logistic formulation was what was winning out, if anything. My stats-FAQ has mention of the discussion published in JRSS (Series B) in the 1980s. Several statisticians gave ambivalent support to Fisher's test. Yates argued the logic of the exact test, and he further recommended the X2 test computed with his (1935) adjustment factor, as a very accurate estimator of Fisher's p-levels. I suppose that people who hate naked p-levels will have to hate Fisher's Exact test, since that is all it gives you. I like the conventional chi-squared test for the 2x2, computed without Yates's correction -- for pragmatic reasons. Pragmatically, it produces a good imitation of what you describe, a randomization with a fixed N but not fixed margins. That is ironic, as Yates points out (cited above), because the test assumes fixed margins when you derive it. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
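To make the two tests under discussion concrete, here is a from-scratch sketch (not from the post): a two-sided Fisher exact p computed by summing hypergeometric probabilities no larger than the observed table's, and the uncorrected chi-squared statistic, both evaluated on the classic table with 4/4 margins.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p: sum the hypergeometric probabilities
    that do not exceed the probability of the observed table."""
    r1, r2 = a + b, c + d
    c1 = a + c
    N = r1 + r2
    denom = comb(N, c1)
    def prob(k):          # P(first cell = k) given fixed margins
        return comb(r1, k) * comb(r2, c1 - k) / denom
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs + 1e-12)

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared for a 2x2 table, without Yates's correction."""
    N = a + b + c + d
    return N * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# The 'lady tasting tea' layout: 3/1 versus 1/3.
print(fisher_exact_2x2(3, 1, 1, 3), chi2_2x2(3, 1, 1, 3))
```

For this table the exact two-sided p is 34/70, about 0.486, while the uncorrected chi-squared statistic is 2.0; the gap between the exact p and the chi-squared approximation is largest in exactly these tiny-margin tables.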
Re: Analysis of a time series of categorical data
On 3 May 2001 09:46:12 -0700, [EMAIL PROTECTED] (R. Mark Sharp; Ext. 476) wrote: If there is a better venue for this question, please advise me. - an epidemiology mailing list? [ snip, much detail ]

Hosts    Time 1 (Inf, Not-Inf)   Time 2 (Inf, Not-Inf)   Time 3 (Inf, Not-Inf)   Time 4 (Inf, Not-Inf)   Tested
G1-S1    1, 14                   11, 4                   11, 1                   13, 2                   57
G1-S2    7, 8                    12, 3                   14, 2                   15, 8                   69
G1-S3    1 246 18815915                                                                                  95
G2-S4    3, 12                   12, 4                   10, 4                   14, 2                   61
G2-S5    5 105 68 7 1114                                                                                 57
G2-S6    2, 26                   12, 12                  11, 16                  14, 12                  105

The questions are how group 1 (G1) can be compared to group 2 (G2) and how subgroups can be compared. I maintain that the heterogeneity within each group does not prevent pooling of the subgroup data within each group, because the groupings were made a priori based on genetic similarity. Mostly, heterogeneity prevents pooling. What's an average supposed to mean? Only if the Ns represent naturally-occurring proportions, and so does your hypothesis, then you MIGHT want to analyze the numbers that way. How much do you know about the speed of expected onset, and offset, of the disease? If this were real, it looks to me like you would want special software. Or special evaluation of a likelihood function. I can put the hypothesis in simple ANOVA terms, comparing species (S). Then, the within-variability of G1 and G2 -- which is big -- would be used to test the difference Between: according to some parameter. Would that be an estimate of the maximum number afflicted? -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Simple ? on standardized regression coeff.
On Tue, 17 Apr 2001 16:32:06 -0500, "d.u." [EMAIL PROTECTED] wrote: Hi, thanks for the reply. But is beta really just b/SD_b? In the standardized case, the X and Y variables are centered and scaled. If Rxx is the corr matrix [ ... ] No. b/SD_b is the t-test. Beta is b, after it is scaled by the SD of X and the SD of Y. Yes, beta is the b if X and Y are 'scaled' to unit normal. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Simple ? on standardized regression coeff.
On Mon, 16 Apr 2001 20:24:10 -0500, "d.u." [EMAIL PROTECTED] wrote: Hi everyone. In the case of standardized regression coefficients (beta), do they have a range that's like a correlation coefficient's? In other words, must they be within (-1,+1)? And why, if they do? Thanks! There is no limit on the raw coefficient, b, so there is no limit on beta = b/SD. In practice, b gets large when there is a suppressor relationship, so that the x1-x2 difference is what matters, e.g., (10x1-9x2). Beta is about the size of the univariate correlation when the co-predictors balance out in their effects. I usually want to consider a different equation if any beta is greater than 1 or has the opposite sign from its corresponding, initial r -- for instance, I might combine (X1, X2) in a rational way. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
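A standardized coefficient well outside (-1, +1) is easy to manufacture along the lines of the suppressor pair mentioned above. A sketch with synthetic data: make x2 nearly a copy of x1 and let y depend only on their small difference; the raw slopes are modest (+1 and -1), but the standardized coefficients blow up because SD(y) is tiny.

```python
import random
from statistics import stdev

random.seed(3)

n = 400
x1 = [random.gauss(0, 1) for _ in range(n)]
# x2 is almost the same variable: x1 plus a little noise.
x2 = [a + 0.1 * random.gauss(0, 1) for a in x1]
# y depends only on the small difference between the two predictors.
y = [a - b for a, b in zip(x1, x2)]

def center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

# Two-predictor least squares via the normal equations;
# centering the variables plays the role of the intercept.
x1c, x2c, yc = center(x1), center(x2), center(y)
s11 = sum(a * a for a in x1c)
s22 = sum(a * a for a in x2c)
s12 = sum(a * b for a, b in zip(x1c, x2c))
s1y = sum(a * b for a, b in zip(x1c, yc))
s2y = sum(a * b for a, b in zip(x2c, yc))
det = s11 * s22 - s12 * s12
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

beta1 = b1 * stdev(x1) / stdev(y)
beta2 = b2 * stdev(x2) / stdev(y)
print(b1, b2, beta1, beta2)   # slopes 1 and -1; betas roughly +10 and -10
```

Each univariate correlation of y with x1 or x2 is small, yet both betas are near plus or minus 10: exactly the "greater than 1, opposite sign from the initial r" pattern that the reply treats as a signal to re-think the equation.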
Re: In realtion to t-tests
On Mon, 09 Apr 2001 10:44:40 -0400, Paige Miller [EMAIL PROTECTED] wrote: "Andrew L." wrote: I am trying to learn what a t-test will actually tell me, in simple terms. Dennis Roberts and Paige Miller have helped a lot, but I still don't quite understand the significance. Andy L A t-test compares a mean to a specific value...or two means to each other... [ ... ] I remember my estimation classes, where the comparison was always to ZERO for means. To ONE, I guess, for ratios. Technically speaking, or writing. For instance, if the difference in averages X1, X2 is expected to be zero, then "{(X1-X2) - 0}" ... is distributed as t. It might look like a lot of equations with the 'minus zero' seemingly tacked on, but I consider this to be good form. It formalizes as 'term minus Expectation of term'. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: rotations and PCA
- Intelligence, figuring what it might be, and categorizing it, and measuring it... I like the topics, so I have to post more. On Thu, 05 Apr 2001 22:09:33 +0100, Colin Cooper [EMAIL PROTECTED] wrote: In article [EMAIL PROTECTED], Rich Ulrich [EMAIL PROTECTED] wrote: I liked Gould's book. I know that he offended people by pointing to gross evidence of racism and sexism in 'scientific reports.' But he has (I think) offended Carroll in a more subtle way. Gould is certainly partial to ideas that Carroll is not receptive to; I think that is what underlies this critique. ===snip I've several problems with Gould's book. (1) Sure - some of the original applications of intelligence testing (screening immigrants who were ignorant of the language using tests which were grossly unfair to them) were unfair, immoral and wrong. But why impugn the whole area as 'suspect' because of the politically-dubious activities of some researchers a century ago? It I think Gould "impugned" more than just one area. The message, as I read it, was, "Be leery of social scientists who provide self-congratulatory and self-serving, simplistic conclusions." In recent decades, I imagine that economists have been bigger at that than psychologists. Historians have quite a bit of 20th-century history-writing to live down, too. seems to me to be exceptionally surprising to find that ALL abilities - musical, aesthetic, abstract-reasoning, spatial, verbal, memory etc. correlate not just significantly but substantially. Here is one URL for references to Howard Gardner, who has shown some facets of independence of abilities (and who you mention, below). http://www.newhorizons.org/trm_gardner.html (2) Gould's implication is that since Spearman found one factor (general ability) whilst Thurstone found about 9 identifiable factors, then factor analysis is a method of dubious use, since it seems to generate contradictory models.
There are several crucial differences - I read Gould as being more subtle than that. between the work of Spearman and Thurstone that may account for these differences. For example, (a) Spearman (stupidly) designed tests containing a broad spectrum of abilities: his 'numerical' test, for example, comprised various sorts of problems - addition, fractions, etc. Thurstone used separate tests for each: so Thurstone's factors essentially corresponded to Spearman's tests. (b) Thurstone's work was with students, where the limited range of abilities would reduce the magnitude of correlations between tests. (c) More recent work (e.g., Gustafsson, 1981; Carroll, 1993) using exploratory factoring and CFA finds good evidence for a three-stratum model of abilities: 20+ first-order factors, half a dozen second-order factors, or a single 3rd-order factor. (3) Interestingly, Gardner's recent work has come to almost exactly the same conclusions from a very different starting point. Gardner identified groups of abilities which, according to the literature, tended to covary - for example, which tend to develop at the same age, all change following drugs or brain injury, which interfere with each other in 'dual-task' experiments and so on. His list of abilities derived in this way is very similar to the factors identified by Gustafsson, Carroll and others. - but Gardner has "groups of abilities" that are, therefore, distinct from each other. And also, only a couple of abilities are usually rewarded (or even measured) in our educational system. When I read his book, I thought Gardner was being overly "scholastic" in his leaning, and restrictive in his data, too. I have a feeling that we're going to get on to the issue of whether factors are merely arbitrary representations of sets of data or whether some solutions are more meaningful than others - the rotational indeterminacy problem - but I'm off to bed! Well, how much data can you load into one factor analysis?
How much virtue can you assign to one 'central ability'? - I see the problem as philosophical instead of numeric. What you will *identify* as a single factor (by techniques of today) will be more trivial than you want. Daniel Dennett, in "Consciousness Explained," does a clever job of defining consciousness. And trivializing it; what I was interested in (I reflect to myself) was something much grander, something more meaningful. But intelligence and self-awareness are separate topics, and big ones. Julian Jaynes's book was more useful on the bigger picture -- setting a framework, so to speak, and establishing the size of the problem. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: attachments
On Fri, 06 Apr 2001 13:34:03 GMT, Jerry Dallal [EMAIL PROTECTED] wrote: "Drake R. Bradley" wrote: While I agree with the sentiments expressed by others that attachments should not be sent to email lists, I take exception that this should apply to small (only a few KB or so) gif or jpeg images. Pictures *are* often worth a thousand words, and certainly it makes sense that the subscribers to a stat It's worth noting that some lists have gateways to Usenet groups. Usenet does not support attachments, so they will be lost to Usenet readers. [ break ] - my Usenet connection seems to give me all the attachments. But if I depended on a modem and a 7-bit protocol, I would be pleased if my ISP filtered out the occasional, 100 kilobyte 8-bit attachment. (Some folk still use 7-bit protocols, don't they?) Also, even in the anything-goes early 21st-century climate of the Internet, one big no-no remains the posting of binaries to non-binary groups. Right; that's partly because of size. My vendor has the practice, these days, of saving ordinary groups for a week, binary groups (which are the BULK of their internet feed) for 24 hours. Binary strings may be treated as screen-commands, if your Reader doesn't know to package them as an 'attachment' or otherwise ignore them. Some attachments are binary, some are not. Standard HTML files are ASCII, with the added 'risk' (I sometimes look at it that way) of invoking an immediate internet connection. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Fw: statistics question
I reformatted this. Quoting a letter from Carmen Cummings to himself, On 6 Apr 2001 08:48:38 -0700, [EMAIL PROTECTED] wrote: The below question was on my Doctorate Comprehensives in Education at the University of North Florida. Would one of you learned scholars pop me back with possible appropriate answers. The question: An educational researcher was interested in developing a predictive scheme to forecast success in an elementary statistics course at a local university. He developed an instrument with a range of scores from 0 to 50. He administered this to 50 incoming freshmen signed up for the elementary statistics course, before the class started. At the end of the semester he obtained each of the 50 students' final averages. Describe an appropriate design to collect data to test the hypothesis. = end of cite. I hope the time of the Comprehensives is past. Anyway, this might be better suited for facetious answers than serious ones. The "appropriate design" in the strong sense: Consult with a statistician IN ORDER TO "develop an instrument". Who decided only a single dimension should be of interest? (How else does one interpret a score with a "range" from 0 to 50?) Consult with a statistician BEFORE administering something to -- selected? unselected? -- freshmen; and consult (perhaps) in order to develop particular hypotheses worth testing. I mean, the kids scoring over 700 on Math SATs will ace the course, and the kids under 400 will have trouble. Generalizing, of course. If "final average" (as suggested) is the criterion, instead of "learning." But you don't need a new study to tell you those results. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
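[A sketch of the analysis the exam question seems to call for: the 0-to-50 screening score, given to 50 freshmen, is a predictor; the final average is the criterion; predictive validity is the correlation between them, plus a least-squares prediction equation. All data below are invented for illustration.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data for the exam scenario: a 0-50 screening score for
# 50 incoming freshmen, and their end-of-semester final averages.
n = 50
score = rng.uniform(0, 50, n)
final = 40 + 0.8 * score + rng.normal(0, 8, n)  # assumed linear relation plus noise

# Predictive validity: Pearson correlation between predictor and criterion.
r = np.corrcoef(score, final)[0, 1]

# Least-squares prediction equation: final ~ b0 + b1 * score.
b1, b0 = np.polyfit(score, final, 1)

print(f"r = {r:.2f}; predicted final = {b0:.1f} + {b1:.2f} * score")
```

With real data one would of course also plot the scatter and check for range restriction among the enrolled students.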
Re: 1 tail 2 tail mumbo jumbo
On Mon, 19 Mar 2001 13:14:39 -0500, Bruce Weaver [EMAIL PROTECTED] wrote: On Fri, 16 Mar 2001, Rich Ulrich wrote: [ snip, including earlier post ] That ANOVA is inherently a 2-sided test. So is the traditional 2x2 contingency table. That is because sides refer to hypotheses. snip I agree with you Rich, except that I don't find "2-sided" all that appropriate for describing ANOVA. For an ANOVA with more than 2 groups, there are MULTIPLE patterns of means that invalidate the null hypothesis, not just 2. With only 3 groups, for example: A B C A C B B A C [ ... ] And then if you included all of the cases where 2 of the means are equal to each other, but not equal to the 3rd mean, there are several more possibilities. And these ways of departing from 3 equal means do not correspond to tails in some distribution. There's my attempt to add to the confusion. ;-) If I convince people that they want only one *contrast* for their ANOVA, then it is just two-sided. I've been talking people out of blindly testing multiple groups and multiple periods, for years. Then I have to start over on the folks, to convince them about MANOVA. If there are two groups and two variables, there are FOUR sides -- and that's if you just count what is 'significant' by the single variables. Most of the possible results are not useful ones; that is, they are not easily interpretable, when no variable is 'significant' by itself, or when logical directions seem to conflict. We can interpret "group A is better than B." And we analyze measures that have the scaled meaning, where one end is better. So the sensible analysis uses a defined contrast, the 'composite score'; and then you don't have to use the MANOVA packages, and you have the improved power of testing just one or two sides. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
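[The contrast-versus-omnibus point above can be made concrete. A minimal sketch with invented data for three groups: the omnibus F rejects for any pattern of unequal means, while a single planned contrast (here a linear ordering A < B < C, one arbitrary but theory-driven choice) is an ordinary one-degree-of-freedom t test, so it can legitimately be one- or two-sided.]

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Invented data: three groups, with a hypothesized ordering A < B < C.
a = rng.normal(0.0, 1.0, 30)
b = rng.normal(0.4, 1.0, 30)
c = rng.normal(0.8, 1.0, 30)
groups = [a, b, c]

# Omnibus one-way ANOVA: "many-sided" -- any pattern of mean differences rejects.
F, p_omnibus = stats.f_oneway(a, b, c)

# One planned contrast (linear weights -1, 0, +1): a 1-df test, directional if desired.
w = np.array([-1.0, 0.0, 1.0])
means = np.array([g.mean() for g in groups])
ns = np.array([len(g) for g in groups])
df_err = ns.sum() - len(groups)
ms_err = sum(((g - g.mean()) ** 2).sum() for g in groups) / df_err  # pooled error
t = (w @ means) / np.sqrt(ms_err * (w ** 2 / ns).sum())
p_one_sided = stats.t.sf(t, df_err)  # directional test of the predicted ordering

print(F, p_omnibus, t, p_one_sided)
```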
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
On 14 Mar 2001 21:55:48 GMT, [EMAIL PROTECTED] (Radford Neal) wrote: In article [EMAIL PROTECTED], Rich Ulrich [EMAIL PROTECTED] wrote: (This guy is already posting irrelevant rants as if I've driven him up the wall or something. So this is just another poke in the eye with a blunt stick, to see what he will swing at next) I think we may take this as an admission by Mr. Ulrich that he is incapable of advancing any sensible argument in favour of his position. Certainly he's never made any sensible response to my criticism. - In a new thread, I have now provided a response that is sensible, or, at least, somewhat numeric. I notice that Jim C. has taken up the cudgel, in trying to explain the basics of t-tests to Jim S, and that "furthers my position." I figure that after I state my position in one post, explicate it in another, and try that again while refining the language -- then I may as well call it quits with JS, when he still doesn't get the points from the first (or from the couple of other people who were posting them before I was). I may not be saying it all that well, but I wasn't inventing the position. You and I are in agreement, now, on one minor conclusion: "The t-test isn't good evidence about a difference in averages." But for me, that's true because the numbers are crappy indicators of performance -- which was clued *first* by the distribution. Whereas, you seem to have much more respect for crude averages, compared to the several of us who object. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
- I hate having to explain jokes - On 14 Mar 2001 15:34:45 -0800, [EMAIL PROTECTED] (dennis roberts) wrote: At 04:10 PM 3/14/01 -0500, Rich Ulrich wrote: Oh, I see. You do the opposite. Your own flabby rationalizations might be subtly valid, and, on close examination, *do* have some relationship to the questions could we ALL please lower a notch or two ... the darts and arrows? i can't keep track of who started what and who is tossing the latest flames but ... somehow, i think we can do a little better than this ... Dennis, Please, where is YOUR sense of humor? My post was a literary exercise -- I intentionally posted his lines immediately before mine, so the reader could follow my re-write phrase by phrase. I'm still hoping "Irving" will lighten up. You chopped out the original that I was paraphrasing, and you did *not* indicate those important [snip]s -- You would mislead the casual reader to think someone other than JimS is originating lines like that, or intend them as critique in this group. - I'm not always kind, but I think I am never that wild. - It's probably been a dozen years since I purely flamed like that. (Or maybe I never flamed, if you talk about the really empty ones. In the olden days of local Bulletin Boards, with political topics, I discarded 1/3 of my compositions without ever posting, because of poor content or tone. I still use some judgment in what I post.) Compare his original line about 'little or no ... relationship' with my clever reversal, "... on close examination, *do* have some relationship to the questions." Well, I was trying for humor, anyway. Sorry, if I missed. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
On Thu, 08 Mar 2001 10:38:59 -0800, Irving Scheffe [EMAIL PROTECTED] wrote: On Fri, 02 Mar 2001 16:28:53 -0500, Rich Ulrich [EMAIL PROTECTED] wrote: On Tue, 27 Feb 2001 07:49:23 GMT, [EMAIL PROTECTED] (Irving Scheffe) wrote: My comments are written as responses to the technical comments to Jim Steiger's last post. This is shorter than his post, since I omit redundancy and mostly ignore his 'venting.' I think I offer a little different perspective on my previous posts. [ snip, intro. ] Mr. Ulrich's latest post is a thinly veiled ad hominem, and I'd urge him to rethink this strategy, as it does not present him in a favorable light. - I have a different notion of ad-hominem, since I think it is something directed towards 'the person' rather than at the presentation. Or else, I don't follow what he means by 'thinly veiled.' When a belligerent and nasty and arrogant tone seems to be an essential part of an argument, I don't consider myself to be reacting 'ad-hominem' when I complain about it -- it's not that I hate to be ad-hominem, but I don't like to be misconstrued. I'm willing, at times, to plunk for the 'ad-hominem'. For instance, since my last post on the subject, I looked at those reports. Also, I searched with google for the IWF -- who printed the anti-MIT critiques. I see the organization characterized as an 'anti-feminist' organization, with some large funding from Richard Scaife. 'Anti-feminist' could mean a reasoned-opposition, or a reflex opposition. Given these papers, it appears to me to qualify as 'reflex' or kneejerk opposition. Oh, ho! I say, this explains where the arguments came from, and why Jim keeps on going -- Now, THIS PARAGRAPH is what I consider an ad-hominem argument. And I'll give you some more. Scaife is a paranoid moneybags and publisher who infests this Pittsburgh region (which is why I have noticed him more than a westerner like Coors). His cash was important in persecuting Clinton for his terms in office. 
For example, Scaife kept alive Vince Foster's suicide for years. He held out money for anyone willing to chase down Clinton-scandals. Oh, he funded the chair at Pepperdine that Starr had intended to take. Now: My comment on the original reports: I am happy to say that it looks to me as if MIT is setting a good model for other universities to follow. The senior administrator listens to his faculty, especially his senior faculty, and responds. MIT makes no point about numbers in their statements, and it does seem to be wise and proper that they don't do so. I see now, Jim is not really arguing with MIT. They won't argue back. Jim's purpose is to create a hostile presence, a shadow to threaten other administrators. He goes, like, "If you try to 'cut a break' for women, we'll be watching and threatening and undermining, threatening your job if we can." I suppose state universities are more vulnerable than the private universities like MIT. On the other hand, with the numbers that Jim has put into the public eye, the next administrator can point to the precedent of MIT and assert that, clearly, the simple numbers on 'quality' are substantially irrelevant to the issues, since they were irrelevant at MIT. Hope this helps. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Trend analysis question: follow-up
On 5 Mar 2001 16:41:22 -0800, [EMAIL PROTECTED] (Donald Burrill) wrote: On Mon, 5 Mar 2001, Philip Cozzolino wrote in part: Yeah, I don't know why I didn't think to compute my eta-squared on the significant trends. As I said, trend analysis is new to me (psych grad student) and I just got startled by the results. The "significant" 4th and 5th order trends only account for 1% of the variance each, so I guess that should tell me something. The linear trend accounts for 44% and the quadratic accounts for 35% more, so 79% of the original 82% omnibus F (this is all practice data). I guess, if I am now interpreting this correctly, the quadratic trend is the best solution. DB Well, now, THAT depends in part on what the spectrum of candidate solutions is, doesn't it? For all that what you have is "practice data", I cannot resist asking: Are the linear and quadratic components both positive, and is the overall relationship monotonically increasing? Then, would the context have an interesting interpretation if the relationship were exponential? Does plotting [ snip, rest ] "Interesting interpretation" is important. In this example, the interest (probably) lies mainly with the variance-explained: in the linear and quadratic. It's hard for me to be highly interested in an order-5 polynomial, and sometimes a quadratic seems unnecessarily awkward. What you want is the convenient, natural explanation. If "baseline" is far different from what follows, that will induce a bunch of high order terms if you insist on modeling all the periods in one repeated measures ANOVA. A sensible interpretation in that case might be, to describe the "shock effect" and separately describe what happened later. Example. The start of Psychotropic medications has a huge, immediate, "normalizing" effect on some aspects of sleep of depressed patients (sleep latency, REM latency, REM time, etc.). 
Various changes *after* the initial jolt can be described as no-change; continued improvement; or return toward the initial baseline. In real life, linear trends worked fine for describing the on-meds followup observation nights (with - not accidentally - increasing intervals between them). -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
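[For readers new to trend analysis: the variance partition discussed in this exchange can be sketched numerically. The code below uses invented occasion means at equally spaced time points, builds orthogonal polynomial contrasts via a QR decomposition of the Vandermonde matrix, and reports each trend's share of the between-occasion variability.]

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented repeated-measures means at 6 equally spaced occasions:
# mostly linear decline with some quadratic curvature, plus small noise.
t = np.arange(6.0)
means = 10 - 3.0 * t + 0.4 * t**2 + rng.normal(0, 0.2, 6)

# Orthonormal polynomial contrasts from QR of the Vandermonde matrix.
V = np.vander(t, 6, increasing=True)  # columns: 1, t, t^2, ..., t^5
Q, _ = np.linalg.qr(V)                # column k = degree-k orthogonal polynomial

effects = Q.T @ means                 # projection of the means on each trend
ss = effects[1:] ** 2                 # sums of squares, skipping the constant
share = ss / ss.sum()                 # shares: linear, quadratic, ..., quintic

print(np.round(share, 3))             # linear share should dominate here
```

With these made-up means, the linear and quadratic shares carry nearly everything and the 4th/5th-order shares are negligible, mirroring the 44% / 35% / 1% pattern in the post.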
Re: Census Bureau nixes sampling on 2000 count
On Fri, 02 Mar 2001 12:16:42 GMT, [EMAIL PROTECTED] (J. Williams) wrote: The Census Bureau urged Commerce Secretary Don Evans on Thursday not to use adjusted results from the 2000 population count. Evans must now weigh the recommendation from the Census Bureau, and will make the decision next week. If the data were adjusted statistically it could be used to redistribute and remap political district lines. William Barron, the Bureau Director, said in a letter to Evans that he agreed with a Census Bureau committee recommendation "that unadjusted census data be released as the Census Bureau's official redistricting data." Some say about 3 million or so people make up a disenfranchising undercount. Others disagree, viewing sampling as a method to "invent" people who have not actually been counted. Politically, the stakes are high on Evans' final decision. People may wonder, "Why did the Census Bureau say this, and why is there little criticism of them?" According to the reports of a few weeks ago, the inner-city counts, etc., of this census were quite a bit more accurate than they were 10 years ago. That means that we couldn't be so sure that adjustment would make a big improvement, or any improvement. This frees Republicans of some blame, for this one instance, of pushing specious technical arguments for short-term GOP gain. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
[You s]hould have pointed Gene and Dennis politely to the details, instead of blundering around and making it appear that "this one is huge" is your whole basis. My commentary is devoted to your presentation, here. [ snip, "importance of issue" and more redundancy.] Hope that helps. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Post-hoc comparisons
On 2 Mar 2001 07:27:16 -0800, [EMAIL PROTECTED] (Esa M. Rantanen) wrote: [ snip, detail ] contingency table. I have used a Chi-Sq. analysis to determine if there is a statistically significant difference between the (treatment) groups (all 4!), and indeed there is. I assume, however, that I cannot simply do pairwise comparisons between the groups using Chi-Sq. and 2 x 2 matrices without inflating the probability of Type I error, (1-alpha)^4 in this case. As far as I know, there are no equivalents to Duncan's or Tukey's tests for the type of data (binary) I have to deal with. Well, if you want to do the ANOVA on the dichotomous variable, I won't complain. My reaction is, you are assuming that, somewhere, great precision matters. But being precise in your thinking will gain you most, so that you do and report just ONE important test, that you figured out beforehand, instead of trying to cope with 6 tests that happen to fall into your lap. I would probably (a) Let the Overall test justify all my followup testing, where the followup testing is descriptive, among categories of equal N and equivalent importance; or (b) Do a few specified tests with Bonferroni correction, and report those tests. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
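[Option (b) above can be sketched in a few lines: an omnibus 4x2 chi-square, then the six pairwise 2x2 tests at a Bonferroni-corrected alpha. The counts below are invented for illustration; the tests come from scipy.]

```python
import numpy as np
from itertools import combinations
from scipy.stats import chi2_contingency

# Invented counts: four treatment groups, binary outcome (success, failure).
counts = {"A": (30, 10), "B": (22, 18), "C": (15, 25), "D": (28, 12)}

# Omnibus 4x2 chi-square first.
table = np.array(list(counts.values()))
chi2, p_omnibus, df, _ = chi2_contingency(table)
print(f"omnibus: chi2 = {chi2:.2f}, df = {df}, p = {p_omnibus:.4f}")

# Pairwise 2x2 tests, Bonferroni-corrected for the 6 comparisons.
pairs = list(combinations(counts, 2))
alpha = 0.05 / len(pairs)
for g1, g2 in pairs:
    sub = np.array([counts[g1], counts[g2]])
    _, p, _, _ = chi2_contingency(sub)  # Yates correction applied by default on 2x2
    print(g1, g2, round(p, 4), "significant" if p < alpha else "ns")
```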
Re: Cronbach's alpha and sample size
On Wed, 28 Feb 2001 12:08:55 +0100, Nicolas Sander [EMAIL PROTECTED] wrote: How is Cronbach's alpha affected by the sample size apart from questions related to generalizability issues? - apart from generalizability, "not at all." I find it hard to trace down the mathematics related to this question clearly, and whether there might be a trade-off between N of items and N of subjects (i.e. compensating for lack of subjects by high number of items). I don't know what you mean by 'trade-off.' I have trouble trying to imagine just what it is, that you are trying to trace down. But, NO. Once you assume some variances are equal, Alpha can be seen as a fairly simple function of the number of items and the average correlation -- more items, higher alpha. The average correlation has a tiny bias by N, but that's typically, safely ignored. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
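[That "fairly simple function" can be written out. Under the equal-variance assumption, standardized alpha depends only on the number of items k and the average inter-item correlation r_bar; the number of subjects N appears nowhere in it, which is the point of the answer above.]

```python
def standardized_alpha(k: int, r_bar: float) -> float:
    """Standardized Cronbach's alpha for k items whose average inter-item
    correlation is r_bar (assumes equal item variances)."""
    return k * r_bar / (1.0 + (k - 1) * r_bar)

# More items -> higher alpha, at the same average correlation:
for k in (5, 10, 20):
    print(k, round(standardized_alpha(k, 0.3), 3))
# prints: 5 0.682 / 10 0.811 / 20 0.896
```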
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
- I want to comment a little more thoroughly about the lines I cited: what Garson said about inference, and his citation of Oakes. On Thu, 22 Feb 2001 18:21:41 -0500, Rich Ulrich [EMAIL PROTECTED] wrote: [ snip, previous discussion ] me I think that Garson is wrong, and the last 40 years of epidemiological research have proven the worth of statistics provided on non-random, "observational" samples. When handled with care. From G. David Garson, "PA 765 Notes: An Online Textbook." On Sampling http://www2.chass.ncsu.edu/garson/pa765/sampling.htm Significance testing is only appropriate for random samples. Random sampling is assumed for inferential statistics (significance testing). "Inferential" refers to the fact that conclusions are drawn about relationships in the data based on inference from knowledge of the sampling distribution. Significance tests are based on a sampling theory which requires that every case have a chance of being selected known in advance of sample selection, usually an equal chance. Statistical inference assesses the significance of estimates made using random samples. For enumerations and censuses, such inference is not needed since estimates are exact. Sampling error is irrelevant and therefore inferential statistics dealing with sampling error are irrelevant. - I agree with most of what he says, throughout; there will be a matter of nuances on interpretation and actions. For enumerations and censuses, a limited sort of statistics on 'finite populations,' he says sampling error is irrelevant. Irrelevant is a good and fitting word here. This is not 'illegal and banned,' but rather 'unwanted and totally beside the point.' Garson Significance tests are sometimes applied arbitrarily to non-random samples but there is no existing method of assessing the validity of such estimates, though analysis of non-response may shed some light. 
The following is typical of a disclaimer footnote in research based on a non random sample: Here is my perspective on testing, which does not match his. - For a randomized experimental design, a small p-level on a "test of hypothesis" establishes that *something* seemed to happen, owing to the treatment; the test might stand pretty-much by itself. - For a non-random sample, a similar test establishes that *something* seems to exist, owing to the factor in question *or* to any of a dozen factors that someone might imagine. The test establishes, perhaps, the _prima facie_ case but the investigator has the responsibility of trying to dispute it. That is, it is an investigator's responsibility (and not just an option) to consider potential confounders and covariates. If the small p-level stands up robustly, that is good for the theory -- but not definitive. If there are vital aspects or factors that cannot be tested, then opponents can stay unsatisfied, no matter WHAT the available tests may say. Garson "Because some authors (ex., Oakes, 1986) note the use of inferential statistics is warranted for nonprobability samples if the sample seems to represent the population, and in deference to the widespread social science practice of reporting significance levels for nonprobability samples as a convenient if arbitrary assessment criterion, significance levels have been reported in the tables included in this article." See Michael Oakes (1986). Statistical inference: A commentary for social and behavioral sciences. NY: Wiley. Garson is telling his readers and would-be statisticians a way to present p-levels, even when the sampling doesn't justify it. And, I would say, when the analysis doesn't justify it. I am not happy with the lines -- The disclaimer does not assume that a *good* analysis has been done, nor does it point to what makes up a good analysis. '... 
if the sample seems to represent the population' seems to be a weak reminder of the proper effort to overcome 'confounding factors'; it is not an assurance that the effects have proven to be robust. So, the disclaimer should recognize that the non-random sample is potentially open to various interpretations; the present analysis has attempted to control for several possibilities; certain effects do seem robust statistically, in addition to being supported by outside chains of inference, and data collected independently. I suggested earlier that this is the status of epidemiological, observational studies. For the most part, those studies have been quite fruitful. But not always. They have been especially likely to mislead, I think, when the designs pretend that binomial variability is the only source of error in a large survey, and attempt to interpret small effects. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Sample size question
On 23 Feb 2001 12:08:45 -0800, [EMAIL PROTECTED] (Scheltema, Karen) wrote: I tried the site but received errors trying to download it. It couldn't find the FTP site. Has anyone else been able to access it? As of a few minutes ago, it downloaded fine for me, when I clicked on it with Internet Explorer. The .zip file expanded okay. I used right-click (I just learned that last week) in order to download the .pdf version of the help. [ ... ] Earlier Q and Answer "Can anyone point me to software for estimating ANCOVA or regression sample sizes based on effect size?" Look here: http://www.interchg.ubc.ca/steiger/r2.htm Hmm. Placing limits on R^2. I haven't read the accompanying documentation. On the general principle that you can't compute power if you don't know what power you are looking for, I suggest reading the relevant chapters in Jacob Cohen's book (1988+ edition). -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
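[As a companion to the pointer to Cohen's chapters: a rough power calculation for the simplest case, detecting a single nonzero correlation, can use the Fisher z approximation (z = atanh(r), SE = 1/sqrt(n-3)). This is a generic textbook approximation, not the method of the r2.htm program mentioned above.]

```python
from math import atanh, sqrt
from scipy.stats import norm

def correlation_power(rho: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of the two-sided test of H0: rho = 0 at level alpha,
    for true correlation rho and sample size n, via the Fisher z transform."""
    z_crit = norm.ppf(1 - alpha / 2)
    delta = atanh(rho) * sqrt(n - 3)   # shift of the test statistic under H1
    # power = P(reject in either tail)
    return norm.sf(z_crit - delta) + norm.cdf(-z_crit - delta)

print(round(correlation_power(0.3, 50), 2))   # a "medium" effect, n = 50
print(round(correlation_power(0.3, 100), 2))  # same effect, larger sample
```

As expected, power rises with n at a fixed effect size; Cohen's tables can be used as a cross-check.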