Re: Salaries and gender
Include in your list of references on salary differences and gender: Birnbaum, M. H. (1985, July). Relationships among models of salary bias. American Psychologist, 862-866.

~~~
Karl L. Wuensch, Department of Psychology, East Carolina University, Greenville NC 27858-4353
Voice: 252-328-4102  Fax: 252-328-6283
mailto:[EMAIL PROTECTED]  http://core.ecu.edu/psyc/wuenschk/klw.htm
Re: Normalizing a non-normal distribution
Brian MacDonald wrote:

> I am doing a series of analyses using discriminant analysis to predict group membership. Several of the variables I am using show distributions that are not normal. My question is: can these (and, for that matter, should they) be somehow transformed so that the resulting distribution looks (and presumably acts in the analyses) like a normal distribution?

It depends. For some distributions the transformation is easy (e.g., a log transform is often appropriate for positive skew). An alternative approach is to consider logistic regression, which has several advantages over discriminant analysis and doesn't require normality.

Thom
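[A quick illustration of the log-transform point — a minimal Python sketch with simulated data; the lognormal sample and the skewness check are my illustrative assumptions, not from the original post:]

    import numpy as np
    from scipy.stats import skew

    rng = np.random.default_rng(0)
    x = rng.lognormal(mean=0.0, sigma=1.0, size=1000)  # positively skewed variable

    print(skew(x))          # strongly positive skewness before transforming
    print(skew(np.log(x)))  # roughly zero after the log transform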
Re: Double mediation
Sylvia J. Hysong, Ph.D. [EMAIL PROTECTED] wrote:

> Hello, I'm hoping someone can help me with this. I have looked at a multitude of resources, including the David Kenny page, this and other newsgroups, Pedhazur (1982), Cohen & Cohen (1983), and Darlington (1990?), to no avail. I am hoping someone can direct me to the right resource. I am trying to conduct a test of double mediation. In other words, I am trying to test the hypothesis that x --> z1 --> z2 --> y. Is there a way to do this (and if so, what is it?), or must I resort to a path analysis or a structural equation model? Thanks in advance for any help.

If I understand the question correctly, this implies a number of conditional independence relationships which can be tested: x conditionally independent of z2 and y given z1; x and z1 conditionally independent of y given z2; x conditionally independent of y given z1 and z2. If these, and only these, independence relationships hold, then you have either x --> z1 --> z2 --> y or its reversal, x <-- z1 <-- z2 <-- y. To decide which, you need some background knowledge or to conduct an experiment. You might want to check out links at http://www.cs.berkeley.edu/~murphyk/Bayes/bnsoft.html
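[One simple, linearity-assuming way to check a single conditional independence such as "x independent of y given z1, z2" is to test whether the partial correlation of x and y given the z's is near zero. A minimal Python sketch — the helper function and arrays are my illustration, not part of the original exchange:]

    import numpy as np

    def partial_corr(x, y, z):
        """Correlate x and y after regressing each on the columns of z (plus intercept)."""
        zmat = np.column_stack([np.ones(len(x)), z])
        rx = x - zmat @ np.linalg.lstsq(zmat, x, rcond=None)[0]
        ry = y - zmat @ np.linalg.lstsq(zmat, y, rcond=None)[0]
        return np.corrcoef(rx, ry)[0, 1]

[A value near zero is consistent with the conditional independence; the formal tests, and the search over all of them, are what the Bayes-net software linked above automates.]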
Re: How calculate 95%=1.96 stdv
Hi Stefan,

s.petersson [EMAIL PROTECTED] wrote:

> Let's say I want to calculate this constant with a security level of 93.4563, how do I do that? Basically I want to unfold a function like this: f(95)=1.96, where I can replace 95 with any number ranging from 0-100.

To Eric's reply I'd just add that use of a table is unnecessary. Especially in a computer program, it is easier to use a numerical function to calculate the confidence interval.

The tables you've seen are for the cumulative probabilities of the standard normal curve--otherwise known as the standard normal cumulative distribution function (cdf). The standard normal cdf is the function

    p = PHI(z) = integral from -infinity to z of phi(t) dt

where z is a standard normal deviate, PHI(z) is the probability (p) of observing a score at or below z, and phi() is the formula for the standard normal curve:

    phi(t) = 1/sqrt(2*pi) * exp(-t^2/2)

Note that PHI() and phi() (these stand for the Greek letter, upper-case and lower-case, respectively) are different: PHI() is the integral of phi(). With the function above, one supplies a value for z and is given a cumulative probability.

You seek the inverse function of PHI(), sometimes called the probit function. With the probit function, one supplies a value for p and is returned the value of z such that the area under the standard normal curve from -infinity to z equals p. (As Eric noted, you may need to adjust p to handle issues of 1- vs. 2-tailed intervals.)

Both the PHI() and probit() functions are well approximated in simple applications (such as calculating confidence intervals) by polynomial formulas of a few terms. Some of these take as few as 2 or 3 lines of code. A good reference for such approximations is: Abramowitz, M., and I. A. Stegun, 1972: Handbook of Mathematical Functions. Dover.

Hope this helps.

John Uebersax
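[For instance, here is the Abramowitz & Stegun rational approximation 26.2.23 (absolute error below about 4.5e-4, ample for confidence intervals) as a minimal Python sketch; the two-tailed wrapper at the end is the f(95) = 1.96 mapping the original poster asked for:]

    import math

    def probit(p):
        """Approximate inverse standard normal cdf (Abramowitz & Stegun 26.2.23)."""
        if not 0.0 < p < 1.0:
            raise ValueError("p must be strictly between 0 and 1")
        q = min(p, 1.0 - p)                 # work in the lower tail, use symmetry
        t = math.sqrt(-2.0 * math.log(q))
        z = t - ((0.010328*t + 0.802853)*t + 2.515517) / \
                (((0.001308*t + 0.189269)*t + 1.432788)*t + 1.0)
        return z if p >= 0.5 else -z

    def f(conf):
        """Two-tailed z for a confidence level in percent, e.g. f(95) ~= 1.96."""
        return probit(0.5 + conf / 200.0)

    print(f(95))        # ~1.960
    print(f(93.4563))   # the poster's example level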
Re: How to calculate on-line the SD of a population?
Luigi Bianchi wrote:

> Hi to all, it's the first time that I post to this NG, so I hope it is the right place. I have the following problem: I read data from an A/D board and I have to provide an estimate of the SD of the population on-line, that is, each time I read a sample I have to update the mean and SD. While it is really easy to update the mean, I don't remember how to do the same thing with the SD. I remember that there was a formula, but I don't remember it. Could anyone help me? Thanks in advance, Luigi

You update the mean and the sum of squares of deviations from the mean. For each new case (new value x):

    n = n + 1
    dev = x - mean
    mean = mean + dev/n
    ssq = ssq + dev*(x - mean)

Then the usual standard deviation estimate is

    sd = sqrt(ssq/(n-1))

If you want an approximately unbiased estimate of the standard deviation, use

    sd = sqrt(ssq/(n-1.5))

-- Alan Miller (Honorary Research Fellow, CSIRO Mathematical Information Sciences) http://www.ozemail.com.au/~milleraj http://users.bigpond.net.au/amiller/
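[A direct transcription of Alan's update (Welford's online algorithm) into a small class — a Python sketch for illustration; the original poster's C++ would look the same line for line:]

    import math

    class RunningStats:
        """Online mean and SD via the update formulas above."""
        def __init__(self):
            self.n = 0
            self.mean = 0.0
            self.ssq = 0.0   # running sum of squared deviations from the mean

        def push(self, x):
            self.n += 1
            dev = x - self.mean
            self.mean += dev / self.n
            self.ssq += dev * (x - self.mean)

        def sd(self):
            return math.sqrt(self.ssq / (self.n - 1)) if self.n > 1 else float("nan")

    rs = RunningStats()
    for sample in [1.0, 2.0, 4.0, 8.0]:   # stand-in for A/D board reads
        rs.push(sample)
    print(rs.mean, rs.sd())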
Re: cigs figs
Mr. Ulrich complains that my 91-year-old deceased mother's concept of her right to smoke is "provocative" to him. Wow! Either he has too much time on his hands or some really serious problems that can't be solved through a statistics newsgroup. How my dead mother's attitude toward smoking created such an emotional tirade is beyond me. The bizarre and convoluted allusion to Justices Scalia and Thomas seems to be another of Mr. Ulrich's hot buttons my mother inadvertently pushed from her grave. I suppose he thinks she was a part of the vast right-wing conspiracy he seems to be railing about. What Mr. Ulrich doesn't know is she was not only a lifelong smoker, but a Democratic Party activist as well. As yet, Mr. Ulrich has not provided the case law attributed to the two Justices re: smoking rights vis-a-vis Natural Law. IMHO, his apparent need to spout Democratic Party ideology would be more appropriate for a political science group. Possibly, his political ranting plays well to the gallery in Pittsburgh. Are they lucky, or what?

On Sun, 01 Jul 2001 19:08:44 -0400, Rich Ulrich [EMAIL PROTECTED] wrote:

> [ snip; full post quoted -- see Rich Ulrich's "Re: cigs figs" below ]
Re: Marijuana
On Sun, 01 Jul 2001 17:05:52 GMT, [EMAIL PROTECTED] (John R Ramsden) sat on a tribble, which squeaked:

> One clever use for GPFs in an old OS called Primos (anyone remember that?) was to detect kernel stack overflows. [ snip -- quoted in full in John's "Re: Marijuana" post below ]
>
> I'd be surprised if the same trick isn't used, even more extensively, in Windoze these days, since many ex-Primates probably migrated to Microsoft after Prime Computer Inc's woes in the early '90s.

Windoze does use a trick like that to detect when it needs to read a page in from swap. A swapped-out page is marked void in the virtual address table, and access to it triggers a page fault. The page is then swapped in, unless it's truly bogus, in which case an application fault occurs. On comp.os.msdos.djgpp there's been some discussion about having the runtime environment detect stack overflows by exactly the mechanism you just described.

> > As opposed to [2] the GPF's this guy is hiding - these are not GPF's that are supposed to happen.
>
> Mind you, I can see how this might make more efficient and streamlined the kind of code in which references through null pointers were an anticipated but infrequent event.

Yeah, and I can see how this might be the most god-awful kluge in world history, particularly when you can't distinguish accessing a null pointer deliberately from doing so due to a bug.

--
Bill Gates: "No computer will ever need more than 640K of RAM." -- 1980
"There's nobody getting rich writing software that I know of." -- 1980
"This antitrust thing will blow over." -- 1998
Combine neo, an underscore, and one thousand sixty-one to make my hotmail addy.
Re: about a problem of khi2 test
On Sun, 01 Jul 2001 14:19:31 +0200, Bruno Facon [EMAIL PROTECTED] wrote:

> I work in the area of intelligence differentiation. I would like to know how to use the khi2 statistic to determine whether the number of statistically different correlations between two groups is due or not to random variations. In particular I would like to know how to determine the expected number of statistically different correlations due to chance. Let me take an example. Suppose I compare two correlation matrices of 45 coefficients obtained from two independent groups (A and B). If there is no true difference between the two matrices, the number of statistically different correlations should be equal to 1.125 in favor of

Yes, that is the number. But there is not a legitimate test that I know of, unless you are willing to make a strong assumption that no pair of the variables should be correlated.

I never heard of the "khi2" statistic before this. I searched with google, and found a respectable number of references, and here is something that I had not seen with a statistic: khi2 appears to be solely French in its use. Of the first 50 hits, most were in French, at French ISPs (.fr). The few that were in English were also from French sources. One article had a reference (not available in my local libraries): Freilich MH and Chelton DB, J Phys Oceanogr 16, 741-757.

> group A and equal to 1.125 in favor of group B (in case of alpha = .05). Consequently, the expected number of nonsignificant differences should be 42.75. Is my reasoning correct?

It would be nice to test the numbers, but I don't credit that reference as a good one, yet. I don't remember for sure, but I think you might be able to compare two correlation matrices with programs from Jim Steiger's site, http://www.interchg.ubc.ca/steiger/multi.htm

On the other hand, you would be better off if you can compare the entire covariance structures, to keep from making accidental assumptions about variances. (Does Jim provide for that?)

-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: cigs figs
- in respect of the up-coming U.S. holiday -

On Mon, 25 Jun 2001 11:49:47 GMT, mackeral@remove~this~first~yahoo.com (J. Williams) wrote:

> > What rights are denied to smokers?
>
> Many smokers, including my late mother, feel being unable to smoke on a commercial aircraft, sit anywhere in a restaurant, etc. were violations of her rights. I don't agree as a non-smoker, but that was her viewpoint until the day she died.

What's your point: She was a crabby old lady, whining (or whinging) about fancied 'rights'? You don't introduce anything that seems inalienable or self-evident (if I may introduce July-4th language). Nobody stopped her from smoking as long as she kept it away from other people-who-would-be-offended.

Okay, we form governments to help assure each other of rights. Lately, the law sees fit to stop some assaults from happening, even though it did not always do that in the past - the offender still has quite a bit of leeway; if you don't cause fatal diseases, you legally can offend quite a lot. We finally have laws about smoking. But she wants the law to stop at HER convenience?

[ snip, various ]

> Talking about confused and/or politically driven, what do Scalia and Thomas have to do with smoking rights? Please cite the case law.

I mention rights because that did seem to be an attitude you mentioned that was (as you see) provocative to me. I toss in S&T because I think that, to a large extent, they share your mother's preference for a casual, self-centered definition of rights. And they are Supreme Court justices. [ Well, they don't say, "This is what *I* want" - these two translate the blame/credit to Nature (euphemism for God). ]

So: I don't fault your mother *too* harshly, when Justices hardly do better. Even though a prolonged skew was needed, to end up with two like this.

-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: cigs figs
Actually, the word is "unalienable."

reg

----- Original Message ----- From: Rich Ulrich [EMAIL PROTECTED] Sent: Sunday, July 01, 2001 7:08 PM Subject: Re: cigs figs

> [ snip; full post quoted -- see Rich Ulrich's "Re: cigs figs" above ]
Re: cigs figs
Yes, historically correct. Mr. Jefferson and colleagues used "unalienable" in the Declaration of Independence, though "inalienable" is the overwhelming preference nowadays.

---Jerry Zar

Reg Jordan [EMAIL PROTECTED] 07/03/01 04:10PM wrote:

> Actually, the word is "unalienable." reg
>
> [ snip; remainder of quoted exchange ]
Re: Marijuana
[EMAIL PROTECTED] (David C. Ullrich) wrote:

> > And yet he never made the connection that maybe Michael Caracena's code *is* the code in Windows that regularly GPFs...
>
> Um, no. In [1] I wasn't talking about the GPF's that we see when Windows crashes. I forget the details, but these are _intentional_ GPF's that don't give error messages - they're part of how the system works.

One clever use for GPFs in an old OS called Primos (anyone remember that?) was to detect kernel stack overflows. The idea was that you positioned the stack in virtual address space so that its end abutted onto a page marked void in the address translation tables, in effect a hole in the virtual address space. Then, if some rogue code overflowed the stack it would try and reference an address in this page and immediately throw up a page fault error. I think they did the same with the (smaller but even more critical) fault stacks, e.g. to catch recursive page fault errors.

I'd be surprised if the same trick isn't used, even more extensively, in Windoze these days, since many ex-Primates probably migrated to Microsoft after Prime Computer Inc's woes in the early '90s.

> As opposed to [2] the GPF's this guy is hiding - these are not GPF's that are supposed to happen.

Mind you, I can see how this might make more efficient and streamlined the kind of code in which references through null pointers were an anticipated but infrequent event. The code would need to be well-regulated though, by always maintaining a context indicator to allow the signal handler to, say, allocate and initialize the right kind of structure for the pointer and then restart the offending instruction.

Cheers

--- John R Ramsden ([EMAIL PROTECTED]) --- "The new is in the old concealed, the old is in the new revealed." St Augustine.
Re: Do you known?
In sci.stat.edu Monica De Stefani [EMAIL PROTECTED] wrote:

> Hi, does anybody know Quade, D. (1976). Nonparametric partial correlation. In Measurement in the Social Sciences (H. M. Blalock, Jr., ed.), Aldine Publishing Company: Chicago, 369-398? I would like to know how he calculates Kendall's partial tau (precisely), please.

Kendall's partial tau is calculated in exactly the same way as Pearson's partial r.
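[Spelled out, the analogy means substituting pairwise Kendall taus into the first-order partial correlation formula. A minimal Python sketch — the helper function is my illustration of that analogy, not Quade's own procedure:]

    import math
    from scipy.stats import kendalltau

    def kendall_partial_tau(x, y, z):
        """Partial Kendall tau of x and y controlling for z, by the Pearson analogue:
        (t_xy - t_xz*t_yz) / sqrt((1 - t_xz^2) * (1 - t_yz^2))."""
        t_xy = kendalltau(x, y)[0]
        t_xz = kendalltau(x, z)[0]
        t_yz = kendalltau(y, z)[0]
        return (t_xy - t_xz * t_yz) / math.sqrt((1 - t_xz**2) * (1 - t_yz**2))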
Re: Maximum Likelihood
In article [EMAIL PROTECTED], Mark W. Humphries [EMAIL PROTECTED] wrote:

> Hi, does anyone have references to a simple/intuitive introduction to Maximum Log Likelihood methods? References to algorithms would also be appreciated. Cheers, Mark W. Humphries

Any decent text on mathematical statistics has this. As for algorithms, it is a problem of numerical analysis, not of statistics.

-- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399 [EMAIL PROTECTED] Phone: (765) 494-6054 FAX: (765) 494-0558
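[To make the "numerical analysis" remark concrete: in practice one writes the negative log-likelihood and hands it to a general-purpose optimizer. A minimal Python sketch fitting a normal mean and SD to simulated data — the data, seed, and parameterization are illustrative assumptions, not from the thread:]

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    data = rng.normal(loc=5.0, scale=2.0, size=500)

    def neg_log_lik(params):
        mu, log_sigma = params
        sigma = np.exp(log_sigma)          # parameterize so sigma stays positive
        z = (data - mu) / sigma
        return np.sum(0.5 * z**2 + np.log(sigma) + 0.5 * np.log(2 * np.pi))

    res = minimize(neg_log_lik, x0=[0.0, 0.0])
    print(res.x[0], np.exp(res.x[1]))      # estimates should be near 5.0 and 2.0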
Re: Maximum Likelihood
On 28 Jun 2001 20:39:18 -0700, [EMAIL PROTECTED] (Mark W. Humphries) wrote:

> Hi, does anyone have references to a simple/intuitive introduction to Maximum Log Likelihood methods? References to algorithms would also be appreciated.

Look on the Internet. I used www.google.com to search on "maximum likelihood tutorial" (put the phrase in quotes to keep it together; or you can use Advanced search). There were MANY hits, and the second reference was in a tutorial that begins at http://statgen.iop.kcl.ac.uk/bgim/mle/sslike_2.html

The third reference was for some programs and examples in Gauss (a programming language) by Gary King at Harvard, in his application area. If these aren't worthwhile (I did not try to download anything), there are plenty of other sites to check.

[ I am intrigued by G. King, a little. This is the fellow who putatively has a method, not Heckman's, for overcoming or compensating for aggregation bias. Which I never found available for free. But, too bad, the page says these programs go with his 1989 book, and I think his Method is more recent. ]

-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Marijuana
Ellen Hertz [EMAIL PROTECTED] wrote:

> I think you need 8760*(number of subjects followed for a year) assuming the 124 heart attacks were from more than one subject.

If the data show that one subject suffered 124 heart attacks, then SOMEBODY'S been smoking marijuana for SURE.

-- Ross Presser * [EMAIL PROTECTED] "A free-range shoggoth is a happy shoggoth, and a happy shoggoth is generally less inclined to eat all of you at once." - Tim Morgan
RE: cigs figs
The thing is, of course, in the case of the car accident survivors etc., in each of those individual cases we can usually gain some insight into what contributed to the survival. It would be very interesting to similarly discover the basis of the long lives of the old-lady smokers.

-----Original Message----- From: Thom Baguley [mailto:[EMAIL PROTECTED]] Sent: 26 June 2001 12:22 Subject: Re: cigs figs

> [ snip; Thom's post quoted in full -- see his "Re: cigs figs" below ]
Re: Marijuana
Paul,

I think you need 8760*(number of subjects followed for a year), assuming the 124 heart attacks were from more than one subject. Then you could do a test as to whether or not marijuana in a given hour is associated with heart attack in that hour. The hours for a fixed subject are not independent, so you shouldn't lump them together in a contingency table. One possible approach would be to do a logistic regression with the person-hours being the observations, so that there are 8760*(number of subjects followed for a year) observations. A positive response is a heart attack, and the predictors are MJ use that hour and N-1 dummy variables for the subjects. Then you want to look at the sign and p-value of the MJ coefficient. Hope this helps.

Ellen Hertz

Paul Jones [EMAIL PROTECTED] wrote:

> There was some research recently linking heart attacks with marijuana smoking. I'm trying to work out the correlation and, most importantly, its statistical significance. In essence the problem comes down to: of 8760 hours in a year, 124 had heart attacks in them, 141 had MJ smokes in them, and 9 had both. What statistical tests apply? Most importantly, what is the statistical significance of the correlation between smoking MJ in any hour and having a heart attack in that same hour? What is the probability that the null hypothesis (that smoking marijuana and having a heart attack are unrelated) can be rejected? How reliable are the results from a dataset of this size? I'm not very literate in maths and stats - please help me out someone. I'm interested in this research from the perspective of medicinal marijuana. Thanks and take care, Paul. All About MS - the latest MS News and Views http://www.mult-sclerosis.org/
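[A sketch of the person-hour logistic regression Ellen describes, using statsmodels. The data frame here is simulated stand-in data — the number of subjects, event rates, and seed are all illustrative assumptions, not the study's numbers:]

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n_subj, hours = 3, 8760                      # one row per subject-hour
    df = pd.DataFrame({
        "subject": np.repeat(np.arange(n_subj), hours),
        "mj": rng.binomial(1, 141 / 8760, n_subj * hours),
        "attack": rng.binomial(1, 0.0005, n_subj * hours),
    })

    X = pd.get_dummies(df["subject"], prefix="s", drop_first=True)  # N-1 subject dummies
    X["mj"] = df["mj"]
    X = sm.add_constant(X.astype(float))
    fit = sm.Logit(df["attack"].astype(float), X).fit(disp=0)
    print(fit.params["mj"], fit.pvalues["mj"])   # sign and p-value of the MJ coefficient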
Re: cigs figs
J. Williams wrote:

> She maintained, in spite of the Surgeon General's report and other studies I quoted, that smoking doesn't cause cancer or heart disease. Her proof was she and her sister (my aunt) both lived to be over 90 and were chain smokers. She insisted there are other factors which accounted for the deaths of long-time smokers.

A common type of argument. By the same logic, being shot, being run over by a car, or falling out of an aeroplane aren't causes of death. There are documented cases of people doing all three and not dying. QED.

Thom
Re: Edstat: I. J. Good and Walker
dennis roberts wrote:

> At 06:08 PM 6/19/01 +0000, Jerry Dallal wrote:
>
> > Alex Yu wrote:
> > > In 1940 Helen M. Walker wrote an article in the Journal of Educational Psychology regarding the concept "degrees of freedom." In the 1970s, I. J. Good wrote something to criticize Walker's idea. I forgot the citation. I tried many databases and even searched the internet but got no result. Does anyone know the citation? Thanks in advance.
> >
> > Good, I. J. (1973). What are degrees of freedom? The American Statistician, 27, 227-228.
>
> answer??? the number of options you have in deciding what courses you can take during your freshperson year in college ... MINUS ONE

The minus one is for the option of 0 courses - dropping out. It is presumed that you have already decided to stick it out. :)

Cheers, Jay

-- Jay Warner, Principal Scientist, Warner Consulting, Inc., North Green Bay Road, Racine, WI 53404-1216 USA. Ph: (262) 634-9100, FAX: (262) 681-1133, email: [EMAIL PROTECTED], web: http://www.a2q.com. The A2Q Method (tm) -- What do you want to improve today?
Re: Normality in Factor Analysis
Robert Ehrlich [EMAIL PROTECTED] wrote:

> Calculation of eigenvalues and eigenvectors requires no assumption. However, evaluation of the results IMHO implicitly assumes at least a unimodal distribution and reasonably homogeneous variance, for the same reasons as ANOVA or regression. So think of the consequences of calculating means and variances of a strongly bimodal distribution where no sample occurs near the mean and all samples are tens of standard deviations from the mean.

The largest number of standard deviations all data can be from the mean is 1. To get some data further away than that, some of it has to be less than 1 s.d. from the mean.

Glen
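[A one-line check of Glen's point, using the population (divide-by-n) SD: if every observation sits exactly k standard deviations from the mean, then

    sigma^2 = (1/n) * sum_i (x_i - mu)^2 = (1/n) * sum_i (k*sigma)^2 = k^2 * sigma^2,

so k = 1. For example, the data {-1, +1} have mean 0 and population SD 1: each point is exactly one SD from the mean.]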
Re: cigs figs
In article [EMAIL PROTECTED], Rich Ulrich [EMAIL PROTECTED] wrote:

> - re: some outstandingly confused thinking. Or writing.
>
> On Sat, 23 Jun 2001 15:25:31 GMT, mackeral@remove~this~first~yahoo.com (J. Williams) wrote:
>
> [ snip; Slate reference, etcetera ]
>
> > ... My mother was 91 years old when she died a year ago and chain smoked since her college days. She defended the tobacco companies for years saying, "it didn't hurt me." She outlived most of her doctors. Upon quoting statistics and research on the subject, her view was that I, like other do-gooders and non-smokers, wanted to deny smokers their rights.
>
> What statistics would her view quote? to show that someone wants to deny smokers 'their rights'? [ Hey, I didn't write the sentence ]

NO amount of demographic statistics can PROVE, even statistically, that smoking is harmful to the person doing it. Statistical arguments based on such data are at most indications, and may even be wrong. The woman who died recently at 120, a claimant for the title of the oldest living person, gave up smoking at the age of 114.

> I just love it, how a 'natural right' works out to be *exactly* what the speaker wants to do.

That is essentially it. The only meaningful rights are the rights to do what others do not want you to do.

> And not a whit more. (Thomas and Scalia are probably going to give us tons of that bad philosophy, over the next decades.)
>
> What rights are denied to smokers? You know, you can't build your outhouse right on the riverbank, either.

This only applies to second-hand smoke, where the rights of others are directly involved. In some places, you can build your outhouse right on the riverbank; the only reason that you cannot or should not do so generally is that it would threaten others.

> > Obviously, there is a health connection. How strong that connection is, is what makes this a unique statistical conundrum.
>
> How strong is that connection? Well, quite strong.

Personally, I believe that there is a connection. But it is a situation where the prior probabilities of the various states make a big difference.

> I once considered that it might not be so bad to die 9 years early, owing to smoking, if that cut off years of bad health and suffering. Then I realized, the smoking grants you most of the bad health of old age, EARLY. (You do miss the Alzheimer's.) One day, I might give up smoking my pipe.

Why are you smoking a pipe? Pipe smokers produce second-hand smoke, and lots of objectionable odors. Can you cite any benefits which cigarette smokers cannot also claim? Everything involves risks and benefits, and the individual should decide.

> What is the statistical conundrum? I can almost imagine an ethical conundrum. (How strongly can we legislate, to encourage cyclists to wear helmets?) I sure don't spot a statistical conundrum.

I see no statistical conundrum, either, but merely a situation where the regulators are using a very large amount of prior assumptions to justify the legislation. Now this does not mean that most of those assumptions may not be correct, but that this is what they are going by. I believe that one MUST use prior assumptions, as otherwise one will be strongly inconsistent, and it is even possible that nothing will be done.

-- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399 [EMAIL PROTECTED] Phone: (765) 494-6054 FAX: (765) 494-0558
Re: Help with stats please
In article 006901c0fce2$d07c7640$[EMAIL PROTECTED], Melady Preece [EMAIL PROTECTED] wrote:

> Hi. I am teaching educational statistics for the first time, and although I can go on at length about complex statistical techniques, I find myself at a loss with this multiple choice question in my test bank. I understand why the range of (b) is smaller than (a) and (c), but I can't figure out how to prove that it is smaller than (d). If you can explain it to me, I will be humiliated, but grateful.
>
> 1. Which one of the following classes had the smallest range in IQ scores?
> A) Class A has a mean IQ of 106 and a standard deviation of 11.
> B) Class B has an IQ range from 93 to 119.
> C) Class C has a mean IQ of 110 with a variance of 200.
> D) Class D has a median IQ of 100 with Q1 = 90 and Q3 = 110.
>
> The test bank says the answer is b. Melady

What are the sizes of the classes? What are the distributions of the scores in the various classes? If the scores are random from some probability distribution, and other than the sample data there is no additional information about the actual scores, then for other than extremely small classes (10 is large here), not many absolute statements can be made.

I CAN tell that class C cannot have a range smaller than 29, because otherwise the variance cannot be 200, given that scores are reported as integers. If they are not integers, it goes down slightly. Even if the model is the totally untenable normal distribution, the scores are RANDOM, and the samples need not look at all normal. As to what was bothering you: what are the quantiles of the normal distribution?

-- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399 [EMAIL PROTECTED] Phone: (765) 494-6054 FAX: (765) 494-0558
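[The range bound Herman states for class C follows from Popoviciu's inequality on variances (stated here for the population, divide-by-n variance; with the n-1 denominator and a very small class the bound weakens):

    sigma^2 <= (max - min)^2 / 4   =>   range >= 2*sigma = 2*sqrt(200) ~ 28.3,

so with integer scores the range must be at least 29.]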
Re: Marijuana
On Mon, 25 Jun 2001 09:09:52 GMT, [EMAIL PROTECTED] (Graaagh the Mighty) wrote:

> On Sun, 24 Jun 2001 14:39:06 GMT, [EMAIL PROTECTED] (David C. Ullrich) sat on a tribble, which squeaked:
>
> > [1] That's one scary thing - in fact there are places in Windows95 where the system _regularly_ creates GPF's; something to do with thunking or something. [2] But the scary thing about the quote is that the guy was advocating _hiding_ AV's in programs we write instead of fixing them. AV's can be hard to debug - the easiest way is to make certain they don't arise in the first place. And given this guy's attitude, one of the steps involved in ensuring that your code contains no hard-to-debug AV's is making sure you never use anything he wrote. Hence the sig - it's a public-service thing. "Sometimes you can have access violations all the time and the program still works." (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01)
>
> And yet he never made the connection that maybe Michael Caracena's code *is* the code in Windows that regularly GPFs...

Um, no. In [1] I wasn't talking about the GPF's that we see when Windows crashes. I forget the details, but these are _intentional_ GPF's that don't give error messages - they're part of how the system works. As opposed to [2], the GPF's this guy is hiding - these are not GPF's that are supposed to happen.

> (Seriously though -- core parts of Windoze are written in Pascal, and it is known that Windoze does hide some AVs it commits, especially those involving reading through a null pointer!)

How do you know some parts are in Pascal, and what does that have to do with AV's?

> -- Bill Gates: "No computer will ever need more than 640K of RAM." -- 1980. "There's nobody getting rich writing software that I know of." -- 1980. "This antitrust thing will blow over." -- 1998. Combine neo, an underscore, and one thousand sixty-one to make my hotmail addy.

David C. Ullrich * "Sometimes you can have access violations all the time and the program still works." (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01)
Re: Help with stats please
Melady Preece wrote:

> Hi. I am teaching educational statistics for the first time, and although I can go on at length about complex statistical techniques, I find myself at a loss with this multiple choice question in my test bank. I understand why the range of (b) is smaller than (a) and (c), but I can't figure out how to prove that it is smaller than (d). If you can explain it to me, I will be humiliated, but grateful.

I'm not sure why you would be humiliated, even if the answer were obvious. You can't prove the range of (b) is smaller than (d). The question isn't even worded clearly: (b) says "an IQ range from 93 to 119." The scores range from 93 to 119 and have a range of 26 (subject to any typographical errors I might make!), but "a range from ... to ..." is just...sloppy. If (d) were a small class, say 2 students, the upper and lower quartiles could be 90 and 110, depending on the precise definition of quartile being used, and the range would be 20, even with normality, etc.

> 1. Which one of the following classes had the smallest range in IQ scores?
> A) Class A has a mean IQ of 106 and a standard deviation of 11.
> B) Class B has an IQ range from 93 to 119.
> C) Class C has a mean IQ of 110 with a variance of 200.
> D) Class D has a median IQ of 100 with Q1 = 90 and Q3 = 110.
>
> The test bank says the answer is b.
Re: M/G/1 model
It's the Kendall notation A/B/C/D/E:

A = interarrival time distribution (M: exponential; D: deterministic; E_k: Erlang-k; G: general)
B = service time distribution (M: exponential; D: deterministic; E_k: Erlang-k; G: general)
C = number of parallel servers
D = system capacity
E = queueing discipline: FIFO, LIFO, SIRO (service in random order), PRI (priority), GD (general discipline)

M/G/1 stands for exponential interarrival times / general service time distribution / 1 server.

Hope this helps.

JCB, France

----- Original Message ----- From: *Silvia* [EMAIL PROTECTED] Sent: Monday, June 25, 2001 12:11 PM Subject: M/G/1 model

> I am studying the M/G/1 model for retrial queues. I know that "1" in M/G/1 means that there is a single server. Can anyone tell me what M and G exactly stand for? Thanks in advance, Silvia
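[Closely related background, stated from standard queueing texts rather than from the post: for a stable M/G/1 queue, the Pollaczek-Khinchine formula gives the mean waiting time in queue in terms of the first two moments of the service time S,

    W_q = lambda * E[S^2] / (2 * (1 - rho)),   with rho = lambda * E[S] < 1,

which is exactly where the "G" enters: only E[S] and E[S^2] of the general service distribution matter for the mean wait.]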
Re: Marijuana
On Sat, 23 Jun 2001 23:35:06 GMT, Tetsuo [EMAIL PROTECTED] wrote:

> in article [EMAIL PROTECTED], David C. Ullrich at [EMAIL PROTECTED] wrote on 23-06-2001 16:06: [obvious jokes] [explanation of why the assertions in the obvious jokes are wrong] [...]
>
> Sorry for that indeed, ppl actually have this kind of opinion on this sometimes so I assumed I encountered just another one and got irritated. I should've realized the poster would not spout such stupidity in a serious manner though, of course...heh, certainly not in this ng.

No problem, actually I enjoyed reading it. Slightly disappointing that you finally figured out I was being sarcastic - when I read your post I was looking forward to stringing you along a bit.

> Well, sorry again

David C. Ullrich * "Sometimes you can have access violations all the time and the program still works." (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01)
Re: Marijuana
On Sat, 23 Jun 2001 21:12:40 -0700, Chas F Brown [EMAIL PROTECTED] wrote:

> David C. Ullrich wrote: [...]
>
> In the back-of-envelope calculations I did, this is really the key missing information. If heart attacks are evenly distributed through the day, while MJ smoking (as far as I know!) clearly isn't for most users, then the temporal correlation is going to be a lot more marked.

Or they tend to smoke before meals (I knew some people like that years ago in college) and tend to have heart attacks after meals. Or they tend to smoke when they start to feel little chest pains, as someone suggested.

> But you're reading something into what I said, that I didn't say - I'm not saying that the data imply that smoking _causes_ an increased risk of heart attack in the hour after smoking (although this evidence would support further investigation that that _may_ be the case).

Ok. [...]

> > David C. Ullrich * "Sometimes you can have access violations all the time and the program still works." (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01)
>
> The scary thing is - he's right.

That's one scary thing - in fact there are places in Windows95 where the system _regularly_ creates GPF's; something to do with thunking or something. But the scary thing about the quote is that the guy was advocating _hiding_ AV's in programs we write instead of fixing them. AV's can be hard to debug - the easiest way is to make certain they don't arise in the first place. And given this guy's attitude, one of the steps involved in ensuring that your code contains no hard-to-debug AV's is making sure you never use anything he wrote. Hence the sig - it's a public-service thing.

> (Ooops! Netscape just locked up - time to reboot again...)
>
> Cheers - Chas --- C Brown Systems Designs: Multimedia Environments for Museums and Theme Parks

David C. Ullrich * "Sometimes you can have access violations all the time and the program still works." (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01)
Re: Help with stats please
At 12:20 PM 6/24/01 -0700, Melady Preece wrote:

> Hi. I am teaching educational statistics for the first time, and although I can go on at length about complex statistical techniques, I find myself at a loss with this multiple choice question in my test bank. I understand why the range of (b) is smaller than (a) and (c), but I can't figure out how to prove that it is smaller than (d). If you can explain it to me, I will be humiliated, but grateful.
>
> 1. Which one of the following classes had the smallest range in IQ scores?

of course, there is nothing about the shape of the distribution of any class ... so, does the item assume sort of normal? in fact, since each of these classes is probably on the small side ... it would be hard to assume that but, for the sake of the item ... pretend

in addition, it does not say to assume the population of IQ scores has mean = 100 and sd about 15 ... so, whether this plays a role or not, i am not sure BUT ...

> A) Class A has a mean IQ of 106 and a standard deviation of 11.

at least about 2 units of 11 = 22 on each side of 106 ... range about 45 or so or more

> B) Class B has an IQ range from 93 to 119.

well, range here is about 26 ... less than in A for sure

> C) Class C has a mean IQ of 110 with a variance of 200.

variance of 200 means an sd of about 14 ... so 2 units of 14 = 28 on each side of 110 ... range must be 50 or more ... similar to A but more than B

> D) Class D has a median IQ of 100 with Q1 = 90 and Q3 = 110.

25th PR = 90 and 75th PR = 110 ... IF we assumed the class was ND ... then the mean would be about 100 too ... and since -1 SD below the mean and +1 SD above the mean would give you roughly the 16th PR and 84th PR ... Q1 and Q3 are NOT that far out ... so, the SD must be at least 10 or more ... thus, 2 units of at least 10 = 20 on either side of 100 = range of at least about 40 ... probably less than A or C ... but more than B ... B is probably the best of the lot

BUT, i am NOT sure what the real purpose of this item is ...

> The test bank says the answer is b. Melady

_
dennis roberts, educational psychology, penn state university, 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: cigs figs
- re: some outstandingly confused thinking. Or writing.

On Sat, 23 Jun 2001 15:25:31 GMT, mackeral@remove~this~first~yahoo.com (J. Williams) wrote:

[ snip; Slate reference, etcetera ]

> ... My mother was 91 years old when she died a year ago and chain smoked since her college days. She defended the tobacco companies for years saying, "it didn't hurt me." She outlived most of her doctors. Upon quoting statistics and research on the subject, her view was that I, like other do-gooders and non-smokers, wanted to deny smokers their rights.

What statistics would her view quote? to show that someone wants to deny smokers 'their rights'? [ Hey, I didn't write the sentence ]

I just love it, how a 'natural right' works out to be *exactly* what the speaker wants to do. And not a whit more. (Thomas and Scalia are probably going to give us tons of that bad philosophy, over the next decades.) What rights are denied to smokers? You know, you can't build your outhouse right on the riverbank, either.

> Obviously, there is a health connection. How strong that connection is, is what makes this a unique statistical conundrum.

How strong is that connection? Well, quite strong. I once considered that it might not be so bad to die 9 years early, owing to smoking, if that cut off years of bad health and suffering. Then I realized, the smoking grants you most of the bad health of old age, EARLY. (You do miss the Alzheimer's.) One day, I might give up smoking my pipe.

What is the statistical conundrum? I can almost imagine an ethical conundrum. (How strongly can we legislate, to encourage cyclists to wear helmets?) I sure don't spot a statistical conundrum. Is this word intended? If so, how so?

-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Help with stats please
On Sun, 24 Jun 2001, Melady Preece wrote in part:

> I am teaching educational statistics for the first time, and although I can go on at length about complex statistical techniques, I find myself at a loss with this multiple choice question in my test bank. I understand why the range of (b) is smaller than (a) and (c), but I can't figure out how to prove that it is smaller than (d).
>
> 1. Which one of the following classes had the smallest range in IQ scores?
> A) Class A has a mean IQ of 106 and a standard deviation of 11.
> B) Class B has an IQ range from 93 to 119.
> C) Class C has a mean IQ of 110 with a variance of 200.
> D) Class D has a median IQ of 100 with Q1 = 90 and Q3 = 110.
>
> The test bank says the answer is b.

Right. Since you're happy that range(B) < range(A) and range(B) < range(C), I'll focus on (B) vs. (D).

In (B), the entire _range_ is from 93 to 119: 26 (or 27, depending on how you choose to define range) points. In (D), the central half of the distribution is from 90 to 110: the interquartile range (IQR) is 20 points, symmetric about the median; the full range must therefore be greater than 20.

Now, _if_ the distribution is normal (which may be what we were to assume from the allegation that these are IQ scores; although as Dennis has pointed out, ille non sequitur -- unless these are rather large classes AND NOT SELECTED BY I.Q. (or by any variable strongly related to I.Q.)), then 10 points from Q1 to median (or from median to Q3) represents 0.67 standard deviation, which implies a standard deviation of about 15, which is larger than the standard deviation in (A) and slightly larger than that in (C).

However, we need not invoke the normal distribution. We observe that the distribution in (D) is at least approximately symmetric (insofar as the quartiles are equidistant from the median). If we may assume also that the distribution is unimodal (which I should think reasonable), it then follows (from the tailing off of distributions as one approaches the extremes) that the distance from minimum to Q1 (and the distance from Q3 to maximum) is greater than the distance from Q1 to median (or median to Q3). This implies that the range of the distribution exceeds twice the interquartile range: that is, range(D) > 2*20 = 40. Since the range in (B) is only 26, clearly the range of (B) is less than the range of (D).

If any part of this argument remains unclear, I'd be happy to attack it again. A rough sketch should make things pretty obvious, but it's a bit of a nuisance to draw pictures in ASCII characters! --DFB.

Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128
Re: cigs figs
On 17 Jun 2001 14:47:14 GMT, [EMAIL PROTECTED] (EugeneGall) wrote:

> On Slate, there is quite a good discussion of the meaning and probabilistic basis of the statement that 1 in 3 teen smokers will die of cancer. It is written by a math prof and it is one of the most effective lay discussions I've seen of the use of probabilities in describing health risks. http://slate.msn.com/math/01-06-14/math.asp

Maybe I just notice it more, but it seems to me as I move about that more and more young people are smoking. Could it be that even with all of the negatives, smoking is still popular and/or growing among teeny boppers and young adults? Recent jury awards to long-time smokers seem to intimate that even with printed warnings, etc., the tobacco companies are ultimately responsible for respiratory and circulatory ailments. Smokers, it is assumed, are addicts and consequently not responsible for their actions. A salient point in Mr. Ellenberg's treatise is the query: of a sample of 100,000 deaths of male smokers, would 60,000 still be alive had they eschewed coffin nails throughout their lifetimes?

My mother was 91 years old when she died a year ago and chain smoked since her college days. She defended the tobacco companies for years saying, "it didn't hurt me." She outlived most of her doctors. Upon quoting statistics and research on the subject, her view was that I, like other do-gooders and non-smokers, wanted to deny smokers their rights. Obviously, there is a health connection. How strong that connection is, is what makes this a unique statistical conundrum.
Re: probability that Xi >= X1,...,Xn
You say the X1,...,Xn are independent. Are they also identically distributed? If not, you will have some very cumbersome expressions. If we use f(Xk) as the density and F(Xk) as the cdf of the k'th r.v., then in the identically distributed case the density for the largest (which we call U) is n*F(U)^(n-1)*f(U); that is, the size of the sample times the (n-1) power of the cdf times the density at U. The most complete reference on such issues is Sarhan and Greenberg's Contributions to Order Statistics, about 1960, from John Wiley and Sons.

Fabio Ulisse Pardi wrote:

> Can anybody give me a hint about this problem? Let the random variables X1,...,Xn be independent and let M be the index of the maximum among them (i.e., M=i implies Xi >= X1,...,Xn). We want to find nice formulas that calculate the distribution of M from the distributions of X1,...,Xn, which we suppose belong to the same class of distributions: for instance, if we assume that all of X1,...,Xn are normally distributed, with parameters (m1,v1),...,(mn,vn), we would like to obtain a formula of the kind Pr[M=i] = Fi(m1,...,mn,v1,...,vn) for every i=1..n. The problem is that the integral that calculates Pr[M=i] is quite complicated, and I haven't figured out how to express its value as a simple function of the parameters.
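[For the normal case the poster asks about there is generally no closed form, but for independent (not necessarily identical) variables Pr[M=i] reduces to the one-dimensional integral

    Pr[M=i] = integral over x of f_i(x) * prod_{j != i} F_j(x) dx,

which is easy to evaluate numerically. A minimal Python sketch; the parameter values are hypothetical:]

    import numpy as np
    from scipy import integrate
    from scipy.stats import norm

    m = [0.0, 0.5, 1.0]   # hypothetical means m1..mn
    s = [1.0, 1.5, 2.0]   # hypothetical SDs sqrt(v1)..sqrt(vn)

    def prob_index_of_max(i):
        """Pr[X_i is the largest] = integral of f_i(x) * prod_{j!=i} F_j(x) dx."""
        def integrand(x):
            p = norm.pdf(x, m[i], s[i])
            for j in range(len(m)):
                if j != i:
                    p *= norm.cdf(x, m[j], s[j])
            return p
        return integrate.quad(integrand, -np.inf, np.inf)[0]

    probs = [prob_index_of_max(i) for i in range(len(m))]
    print(probs, sum(probs))   # the probabilities should sum to ~1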
Re: Marijuana
in article [EMAIL PROTECTED], David C. Ullrich at [EMAIL PROTECTED] wrote on 23-06-2001 16:06: On Fri, 22 Jun 2001 20:49:02 GMT, Steve Leibel [EMAIL PROTECTED] wrote: Hallucinating? On pot? What are YOU smokin'? Pot doesn't cause hallucinations Where are you getting your facts from here? You've obviously never seen Reefer Madness or you wouldn't spout nonsense like this. Normal pot doesn't cause hallucinations; exceptions have to be made for allergies to it, or for abnormal-potency pot (artificially enhanced, sprayed with LSD, etc.). Once both of these factors are set aside, reefer madness is a myth, sorry to disappoint you. I wonder if there's any data about correlation between alcohol use and traffic fatalities? Probably not, I certainly don't see why there would be any connection. I mean if alcohol were more dangerous than pot in just about any way a person could name that would mean that the laws in this country were all backwards. Alcohol *is* more dangerous than pot. The laws are made to create a perfect balance between social and economic wealth (although the former drives the latter), so alcohol and tobacco are forced into it, so *yes* they are backwards (that is, in your country). Wake up, please. About the driving, the other poster already supplied enough resources, but it has to be said that I have yet to meet an adequate doctor or psychologist who shares your opinion on this. -- Tetsuo | Artpage: http://zap.to/m_mortier = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
in article [EMAIL PROTECTED], Tetsuo at [EMAIL PROTECTED] wrote on 24-06-2001 00:17: snip of Tetsuo's previous reply to David C. Ullrich, quoted in full above Sorry for that, indeed; people actually hold that kind of opinion sometimes, so I assumed I had encountered just another one and got irritated. I should've realized the poster would not spout such stupidity in a serious manner, of course...heh, certainly not in this ng. Well, sorry again. -- Tetsuo = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
David C. Ullrich wrote: On Thu, 21 Jun 2001 21:14:44 -0700, Chas F Brown [EMAIL PROTECTED] wrote: snip That seems to be the type of correlation that was reported here - some distribution of MJ smoking, and its *temporal* correlation with heart attacks. Now, that says exactly nothing about whether MJ use increases or decreases the likelihood of having a heart attack in general (it could in fact in general *decrease* heart attacks, even in our data set); That's exactly right. When I said that there's nothing we can conclude from the data given, I didn't mean there's _nothing_ we can conclude, rather nothing we can conclude _concerning_ the question of whether smoking increases the risk of a heart attack. I'm with you there. I don't see how we can even quite conclude that the risk of a heart attack is higher among users immediately after smoking, for various reasons: I doubt that most users' use is uniformly distributed during the 24 hours of the day, Well, we all have to sleep _sometime_ :) ... I have no idea whether heart attacks are uniformly distributed throughout the day, so it could well be that the times people tend to smoke are the same as the times they tend to have heart attacks. In the back-of-envelope calculations I did, this is really the key missing information. If heart attacks are evenly distributed through the day, while MJ smoking (as far as I know!) clearly isn't for most users, then the temporal correlation is going to be a lot more marked. Or they tend to smoke before meals (I knew some people like that years ago in college) and tend to have heart attacks after meals. Or they tend to smoke when they start to feel little chest pains, as someone suggested. But you're reading something into what I said that I didn't say - I'm not saying that the data imply that smoking _causes_ an increased risk of heart attack in the hour after smoking (although this evidence would support further investigation of whether that _may_ be the case). I'm saying that it is certainly not unreasonable that the risk is increased during the hour after smoking. That may very well be because, purely coincidentally, you habitually puff at exactly the same time when heart attacks are, for unrelated reasons, most likely to occur. Or because you have intimations of an oncoming heart attack, or for any other reason. But we don't need to know the reason for this temporal correlation. I noted only that without further information, it is logical to be aware of this temporal correlation, and take action appropriately. That action wasn't stop smoking ganja!, but just an awareness of the higher risk during this time period - perhaps taking extra precautions such as being near medical assistance (for those for whom this is not just an intellectual exercise). Then even if it _is_ true that a smoker is more likely to have a heart attack immediately after smoking a joint, that does _NOT_ show that smoking increases the risk! Could be, as you say, that it actually decreases the risk, but regardless the time immediately after smoking is the riskiest time. Yah! That's what I was saying! And what does it make sense to do during the riskiest time? Take actions that reduce your risk. (If we assume that the report correctly controlled for the obvious elements of heart attack temporal distribution and MJ usage temporal distribution...etc.) So it seems clear to me that there is _nothing_ we can conclude about whether smoking increases the risk of a heart attack - it also seems clear that that is _the_ question of interest here. 
For me personally, I agree (and I think the reason why this study was so widely reported was with the unjustified implication in mind). But for somebody with MS and possibly a whole bunch of other related health problems, they might have a different perspective. snip of reasonable description of why Just Say NO! is just sad, sad, sad Weaker (but much less interesting) conclusions about correlations might be possible, but there are _so_ many ways that a correlation could exist by accident that I don't see why one would care. (Unless one was planning on blurring the distinction between correlation and causation for political reasons...) Ding! Gee, what kind of modern military/industrial/health care system could _possibly_ want to do that? What I want to know is why alcohol and tobacco are legal: http://www.drugwarfacts.org/causes.htm http://www.drugwarfacts.org/addictiv.htm Gosh! Don't take away my frosty cold malty one, too! (I promise I won't drive!) And just let me stub out this cigarette before I continue... snip David C. Ullrich * Sometimes you can have access violations all the time and the program still works. (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01) The scary thing is - he's right. (Ooops! Netscape just locked up - time to reboot again...) Cheers - Chas
Re: Marijuana
Dammit, I promised I wouldn't get involved in this absurd and off-topic thread, but I've got to set the record straight here: In article [EMAIL PROTECTED], Tetsuo [EMAIL PROTECTED] wrote: Normal pot doesn't cause hallucinations; exceptions have to be made for allergies to it, or for abnormal-potency pot (artificially enhanced, sprayed with LSD, etc.). Once both of these factors are set aside, reefer madness is a myth, sorry to disappoint you. Two things: (1) High-dosage marijuana _can_ cause hallucinations, most commonly auditory, in a significant minority of the population. This is not an allergy issue, but a sensitivity to particular psychoactive effects. Not everyone will experience such, but they definitely are experienced by some. (2) Marijuana is never laced with LSD. That's a waste of perfectly good LSD. Heat-vaporization is not a usable means of LSD ingestion, simply because LSD breaks down at high temperatures. -- "A mathematician is a device for turning coffee into theorems." -Paul Erdos | Jake Wildstrom = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Eamon) wrote: (c) Reduced motor co-ordination, e.g. when driving a car Numerous studies have shown that marijuana actually improves driving ability. It makes people more attentive and less aggressive. You could look it up. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
On Fri, 22 Jun 2001 18:45:52 GMT, Steve Leibel [EMAIL PROTECTED] wrote: In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Eamon) wrote: (c) Reduced motor co-ordination, e.g. when driving a car Numerous studies have shown that marijuana actually improves driving ability. It makes people more attentive and less aggressive. You could look it up. An intoxicant does *that*? I think I recall, in the literature, that people getting stoned, on whatever, occasionally *think* that their reaction time or sense of humor or other performance is getting better. Improving your driving by getting mildly stoned (omitting the episodes of hallucinating) seems unlikely enough, to me, that *I* think the burden of proof is on the stranger named Steve. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Normality in Factor Analysis
Calculation of eigenvalues and eigenvectors requires no assumption. However, evaluation of the results IMHO implicitly assumes at least a unimodal distribution and reasonably homogeneous variance, for the same reasons as ANOVA or regression. So think of the consequences of calculating means and variances of a strongly bimodal distribution where no sample occurs near the mean and all samples are tens of standard deviations from the mean. Hi, I have a question regarding factor analysis: Is normality an important precondition for using factor analysis? If no, are there any books that justify this. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
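A minimal numerical sketch of that bimodal warning (my own illustration in Python, not from the post): for an even mixture of two tight, well-separated normals, the sample mean lands in a region containing no observations at all, and every observation sits many within-component standard deviations away from it.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-10, 0.1, 5000),
                        rng.normal(10, 0.1, 5000)])
    print(x.mean())                    # ~0: a value where no data occur
    print(np.abs(x - x.mean()).min())  # ~9.6, i.e. ~96 component sds away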
Re: Marijuana
In article [EMAIL PROTECTED], Rich Ulrich [EMAIL PROTECTED] wrote: On Fri, 22 Jun 2001 18:45:52 GMT, Steve Leibel [EMAIL PROTECTED] wrote: In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Eamon) wrote: (c) Reduced motor co-ordination, e.g. when driving a car Numerous studies have shown that marijuana actually improves driving ability. It makes people more attentive and less aggressive. You could look it up. An intoxicant does *that*? I think I recall, in the literature, that people getting stoned, on whatever, occasionally *think* that their reaction time or sense of humor or other performance is getting better. Improving your driving by getting mildly stoned (omitting the episodes of hallucinating) seems unlikely enough, to me, that *I* think the burden of proof is on the stranger named Steve. Hallucinating? On pot? What are YOU smokin'? Pot doesn't cause hallucinations -- although a lot of anti-drug hysteria certainly does. A cursory web search turned up these links among many others to support my statement. Naturally this subject is controversial and there are lots of conflicting studies. The consensus is that at worst pot causes minor driving impairment similar to many prescription medications. At least one study showed that pot users had FEWER fatal crashes than non-users! And stranger named Steve? I've been on this newsgroup since 1995. Not as famous as James Harris, maybe, but certainly no stranger. This is a small sample of what came up when I entered marijuana driving into Google. Read and learn. http://www.norml.org/canorml/myths/myth1.shtml http://www.reconsider.org/issues/marijuana/driving.htm http://www.cannabisnews.com/news/thread1016.shtml http://www.marijuana-hemp.com/cin/facts/drivehi.shtml When the data were analyzed, cannabis consumers actually showed a lower likelihood of being involved in a fatal crash than that of a drug-free control group, though the difference was not judged to be statistically significant. http://www.hoboes.com/pub/Prohibition/Drug%20Information/Marijuana/Driving/Driving http://www.taima.org/en/driving.htm It was of some interest that cannabis tended to show a negative effect on relative risk when other drug groups showed an increase. http://www.norml.org.nz/norml/Marijuana/Driving.htm#abc981014 Steve L = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
On Thu, 21 Jun 2001 21:14:44 -0700, Chas F Brown [EMAIL PROTECTED] wrote: David C. Ullrich wrote: On Fri, 15 Jun 2001 15:23:03 +0100, Paul Jones [EMAIL PROTECTED] wrote: David C. Ullrich wrote: But analyzing it this way simply makes no sense. Those trials you're talking about are _far_ from independent; each trial is associated with a particular person, and there will be a very strong correlation between various trials for the same person at different hours. Okay then, how should it be analysed? I've explained at least twice why I do not believe it is _possible_ to draw the sort of inference you want to draw from the data you've given us. You must be reading _some_ of those posts or you wouldn't keep replying. Well, although I've agreed with most of your complaints about trying to derive any information from the scanty data shown, there is *something* we can notice about the data set which has some relevance. Let's say we look at a sampling of 100 people who have both had heart attacks within the last year and have smoked an aspirin an average of once a week during that year. Now, without knowing the average percentage of people who smoke aspirin each year, and the average percentage of people who have heart attacks each year without smoking aspirin, these numbers alone would be pretty useless. But if 95% of the people in the data set had their 1 heart attack inside of 1 minute after smoking an aspirin, you'd have some reason to further examine the hypothesis that, for some segment of the population, smoking an aspirin could trigger a heart attack. (Of course it could also be that impending heart attacks bring on the desire to smoke aspirin, or some other hypothesis that correlates the two phenomena.) On the other hand, one would expect that if there were no immediate correlation between smoking aspirin and heart attacks, the average time between smoking aspirin and heart attack would be more like 1/2 week. This would then indicate that it was not particularly worthwhile to investigate an immediate link between aspirin smoking and heart attacks. That seems to be the type of correlation that was reported here - some distribution of MJ smoking, and its *temporal* correlation with heart attacks. Now, that says exactly nothing about whether MJ use increases or decreases the likelihood of having a heart attack in general (it could in fact in general *decrease* heart attacks, even in our data set); That's exactly right. When I say that there's nothing we can conclude from the data given I didn't mean there's _nothing_ we can conclude, rather nothing we can conclude _concerning_ the question of whether smoking increases the risk of a heart attack. I don't see how we can even quite conclude that the risk of a heart attack is higher among users immediately after smoking, for various reasons: I doubt that most users' use is uniformly distributed during the 24 hours of the day, I have no idea whether heart attacks are uniformly distributed throughout the day, so it could well be that the times people tend to smoke are the same as the times they tend to have heart attacks. Or they tend to smoke before meals (I knew some people like that years ago in college) and tend to have heart attacks after meals. Or they tend to smoke when they start to feel little chest pains, as someone suggested. Then even if it _is_ true that a smoker is more likely to have a heart attack immediately after smoking a joint, that does _NOT_ show that smoking increases the risk! 
Could be, as you say, that it actually decreases the risk, but regardless the time immediately after smoking is the riskiest time. So it seems clear to me that there is _nothing_ we can conclude about whether smoking increases the risk of a heart attack - it also seems clear that that is _the_ question of interest here. Not that I'm claiming that it _is_ the case that smoking decreases the risk of heart attack although the hour immediately afterwards is the riskiest time. I have no reason to think that's so. Also no reason to think it's not so: People who assume such a thing is ridiculous think so because they've classified the world into Good things and Bad things - actual things in the world are not that simple: (i) Aspirin is a Good thing. Good for pain and fever relief, and actually an aspirin a day helps prevent heart attack or stroke, I forget which. The reason I forget which is that it's irrelevant to me: For me aspirin is a Very Bad thing, because of other medical problems. (ii) Alcohol is a Bad thing. Except for that bit about how a glass of red wine a day is good for you, in terms of risk of heart attack or stroke, again I forget which. Alas, it doesn't follow that a quart of whiskey a day is good for you. Given that there _are_ plenty of legitimate medical uses for marijuana and given that the interaction between the body and chemicals is simply _not_ a matter of some chemicals Good and some Bad, the idea that
Re: calculation of an effect size with medians
Marc wrote: As a part of a report I have to perform a meta-analysis of some clinical trials. These trials report the median effect in the treatment group and the median effect in the control group (days of hospitalization). P-values from Mann-Whitney U-tests are reported, as are the numbers of patients in treatment and control. My question: How can I calculate an effect size (e.g., the median difference between treatment and control) and confidence intervals with that data? You'd need the raw data to calculate the effect size for the Mann-Whitney test. Regards, Konrad = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
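For what it's worth, meta-analysts often fall back on an approximation when only a two-sided p-value and the group sizes are reported: convert p to a standard normal z and take r = z/sqrt(N) (Rosenthal, 1991). A sketch of that workaround (my own, in Python with scipy; it is an approximation, not a substitute for the raw data Konrad asks for, and the function name is made up):

    import math
    from scipy import stats

    def r_from_p(p_two_sided, n1, n2):
        # invert the two-sided p to a z score, then r = z / sqrt(N)
        z = stats.norm.ppf(1 - p_two_sided / 2.0)
        return z / math.sqrt(n1 + n2)

    print(r_from_p(0.03, 40, 35))  # ~0.25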
Re: Marijuana
On Fri, 15 Jun 2001 15:23:03 +0100, Paul Jones [EMAIL PROTECTED] wrote: David C. Ullrich wrote: But analyzing it this way simply makes no sense. Those trials you're talking about are _far_ from independent; each trial is associated with a particular person, and there will be a very strong correlation between various trials for the same person at different hours. Okay then, how should it be analysed? I've explained at least twice why I do not believe it is _possible_ to draw the sort of inference you want to draw from the data you've given us. You must be reading _some_ of those posts or you wouldn't keep replying. Take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ David C. Ullrich *** Sometimes you can have access violations all the time and the program still works. (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01) = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: a form of censoring I have not met before
On 21 Jun 2001 00:35:11 -0700, [EMAIL PROTECTED] (Margaret Mackisack) wrote: I was wondering if anyone could direct me to a reference about the following situation. In a 3-factor experiment, measurements of a continuous variable, which is increasing monotonically over time, are made every 2 hours from 0 to 192 hours on the experimental units (this is an engineering experiment). If the response exceeds a set maximum level the unit is not observed any more (so we only know that the response is above that level). If the measuring equipment could do so it would be preferred to observe all units for the full 192 hours. The time to censoring is of no interest as such; the aim is to estimate the form of the response for each unit, which is the trace of some curve that we observe every 2 hours. Ignoring the censored traces in the time period after they are censored puts a huge downward bias into the results and is clearly not the thing to do, although that's what has been done in the past with these experiments. Any suggestions of where people have addressed data of this or related form would be very gratefully received. Well, it certainly *sounds* as if the time to censoring should be of great interest, if you had an adequate model. Thus, when you say that ignoring them gives a huge downward bias, it sounds to me as if you are admitting that you do not have an acceptable model. Who can you blame for that? What leverage do you have, if you try to toss out those bad results? (Surely, you do have some ideas about forming estimates that *do* take the hours into account. The problem belongs in the hands of someone who does.) Maybe you want to segregate trials into the ones with 192 hours, or less than 192 hours, and figure two (Maximum Likelihood) estimates for the parameters, which you then combine. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
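The Maximum Likelihood idea Rich gestures at has a standard form: uncensored readings contribute their density, censored ones contribute the probability of exceeding the ceiling. A toy sketch (mine, in Python with scipy; it ignores the time-course structure entirely and pretends each unit contributes one normal reading truncated at a known ceiling c, which is a strong simplification of Margaret's problem):

    import numpy as np
    from scipy import stats
    from scipy.optimize import minimize

    def fit_censored_normal(y, c):
        # MLE of (mu, sigma) when values above the ceiling c are recorded as c
        cens = y >= c
        def nll(params):
            mu, log_sd = params
            sd = np.exp(log_sd)
            ll = stats.norm.logpdf(y[~cens], mu, sd).sum()   # observed values
            ll += stats.norm.logsf(c, mu, sd) * cens.sum()   # censored values
            return -ll
        res = minimize(nll, x0=[y.mean(), np.log(y.std())])
        mu, log_sd = res.x
        return mu, np.exp(log_sd)

    rng = np.random.default_rng(1)
    true = rng.normal(5.0, 2.0, 500)
    obs = np.minimum(true, 7.0)           # censor at 7
    print(fit_censored_normal(obs, 7.0))  # ~(5, 2); the naive mean is biased low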
Re: Marijuana
David C. Ullrich wrote: On Fri, 15 Jun 2001 15:23:03 +0100, Paul Jones [EMAIL PROTECTED] wrote: David C. Ullrich wrote: But analyzing it this way simply makes no sense. Those trials you're talking about are _far_ from independent; each trial is associated with a particular person, and there will be a very strong correlation between various trials for the same person at different hours. Okay then, how should it be analysed? I've explained at least twice why I do not believe it is _possible_ to draw the sort of inference you want to draw from the data you've given us. You must be reading _some_ of those posts or you wouldn't keep replying. Well, although I've agreed with most of your complaints about trying to derive any information from the scanty data shown, there is *something* we can notice about the data set which has some relevance. Let's say we look at a sampling of 100 people who have both had heart attacks within the last year and have smoked an aspirin an average of once a week during that year. Now, without knowing the average percentage of people who smoke aspirin each year, and the average percentage of people who have heart attacks each year without smoking aspirin, these numbers alone would be pretty useless. But if 95% of the people in the data set had their 1 heart attack inside of 1 minute after smoking an aspirin, you'd have some reason to further examine the hypothesis that, for some segment of the population, smoking an aspirin could trigger a heart attack. (Of course it could also be that impending heart attacks bring on the desire to smoke aspirin, or some other hypothesis that correlates the two phenomena.) On the other hand, one would expect that if there were no immediate correlation between smoking aspirin and heart attacks, the average time between smoking aspirin and heart attack would be more like 1/2 week. This would then indicate that it was not particularly worthwhile to investigate an immediate link between aspirin smoking and heart attacks. That seems to be the type of correlation that was reported here - some distribution of MJ smoking, and its *temporal* correlation with heart attacks. Now, that says exactly nothing about whether MJ use increases or decreases the likelihood of having a heart attack in general (it could in fact in general *decrease* heart attacks, even in our data set); but instead would say, there is a segment of the population for whom MJ use is followed by a high likelihood of a heart attack. Would those people have had a heart attack anyway? Is this some small segment of the population that reacts this way? These questions would still remain without any further figures. Even in the absence of this data, though, one might want to take some precautions during the hour following MJ usage, for those with an otherwise high likelihood of heart attack, such as: be near medical facilities, etc. Cheers - Chas --- C Brown Systems Designs Multimedia Environments for Museums and Theme Parks --- = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: comparing 2 slopes
mccovey@psych [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]... in article [EMAIL PROTECTED], Tracey Continelli at [EMAIL PROTECTED] wrote on 6/13/01 4:14 PM: Mike Tonkovich [EMAIL PROTECTED] wrote in message news:3b20f210_1@newsfeeds... Was hoping someone might be able to confirm that my approach for comparing 2 slopes was correct. I ran an analysis of covariance using PROC GLM (in SAS) with an interaction statement. My understanding was that a nonsignificant interaction term meant that the slopes were the same, and vice versa for a significant interaction term. Is this correct and is this the best way to approach this problem with SAS? Any help would certainly be appreciated. Mike Tonkovich -- Michael J. Tonkovich, Ph.D. Wildlife Research Biologist ODNR, Division of Wildlife [EMAIL PROTECTED] The slopes need not be the same if the interaction term is non-significant, BUT, the difference between them will not be statistically significant. If the differences between the slopes *are* statistically significant, this will be reflected in a statistically significant product term. I have preferred using regression analyses with interaction terms, which can be easily incorporated by simply multiplying the variables together and then running the regression equation with each independent variable plus the product term [which is simply another name for the interaction term]. The results are much more straightforward in my mind. Tracey Continelli SUNY at Albany I agree completely but there can be problems interpreting the regression output (e.g., mistakes like talking about main effects). For advice on avoiding the common interpretation pitfalls, see Aiken & West (1991). Multiple regression: Testing and interpreting interactions. Sage. Irwin & McClelland (2001), Journal of Marketing Research. Gary McClelland Univ of Colorado Quite so. Once you add the product term, the interpretation changes, and the parameter estimates are now known as simple main effects. The interpretation is pretty straightforward however. The parameter estimate, or slope, for your focal independent variable in the interaction model simply represents the effect of your independent variable upon your dependent variable when your moderator variable is equal to zero, holding constant all other independent variables in your model. The same may be said for the slope of your moderator variable - it represents the effect of that variable upon your dependent variable when your focal independent variable is equal to zero. Because in my research [the social science variety] that information isn't terribly useful [because most of the time you won't realistically see the moderator variable at zero, i.e., a zero crime rate or a zero poverty rate], what I will do is a mean centering trick. I'll subtract the mean from the moderator variable, rerun the equation with the new mean-centered variable and product term, and NOW the parameter estimates of the simple main effects are meaningful for me. Now, when I look at the parameter estimate of the focal independent variable, it is telling me the effect of that independent variable upon the dependent variable when my moderator variable is at its mean. The actual product term remains identical to the original equation [of course], but now the simple main effects are realistically meaningful. I'll also apply the same technique for when the moderator variable is 2 standard deviations below the mean, 1 below the mean, all the way up to 2 standard deviations above the mean. 
This gives one a nice graphic sense of the way in which the slope between your focal independent variable and your dependent variable changes with successive changes in your moderator variable. Tracey Continelli Doctoral candidate SUNY at Albany = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
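Tracey's recipe translates directly into code. A sketch (my own, in Python with statsmodels; the variable names and true coefficients are invented for illustration): fit y on x, the mean-centered moderator zc, and their product; the coefficient on x is then the simple effect of x at the mean of the moderator.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 500
    x = rng.normal(size=n)
    z = rng.normal(5, 2, size=n)
    y = 1 + 0.5 * x + 0.3 * z + 0.4 * x * z + rng.normal(size=n)

    zc = z - z.mean()                      # mean-center the moderator
    X = sm.add_constant(np.column_stack([x, zc, x * zc]))
    fit = sm.OLS(y, X).fit()
    print(fit.params)  # x slope ~2.5 = 0.5 + 0.4*mean(z): effect of x at mean(z)

Re-running with zc shifted by +/-1 and +/-2 standard deviations gives the simple slopes Tracey describes plotting.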
Re: trimming data
At 11:24 AM 6/20/01 -0500, Mike Granaas wrote: A colleague has approached me about locating references discussing the trimming of data, with primary emphasis on psychological research. He is primarily interested in books/chapters/articles that emphasize the when and how. I am at a loss on this one and was wondering if anyone could offer a couple of references. Thanks, Michael *** Michael M. Granaas, Associate Professor, Department of Psychology, University of South Dakota *** other than what some software programs do ... i don't have ready references ... but, the notion is that for some distributions ... particularly with some outliers at ONE end ... if you trim say 5% from each end ... it will reduce the impact of the outliers on your descriptive stats ... in minitab, there is a trimmed mean that you get as part of the DESCRIBE command which axes 5% from each end and THEN finds the mean for the middle 90% ... if you think about it ... you can trim different % values from the ends ... and, if you did a full trim of 50% from EACH end ... you are at the median! clearly, the more you trim the data, the narrower the data set is ... one should only consider trimming in the broader context of: are there outliers, and if there are, what (if anything) should we do about them? in some cases ... you do nothing since, from all accounts, the data are legitimate values ... but, if you find BAD data at the ends (due to miskeying, scoring error, etc.), then the first thing is to justify WHAT values to eliminate if any ... _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: trimming data
here is some help info from minitab about trimmed means ... === Trimmed mean The trimmed mean (TrMean) is like the mean, but it excludes the most extreme values in the data set. The highest and lowest 5% of the values (rounded to the nearest integer) are dropped, and the mean is calculated for the remaining values. For the precipitation data, 5% of 11 observations is 0.55, which rounds to 1. Thus, the highest value and the lowest value are dropped, and the mean is calculated for the remaining data: 1 2 2 3 3 3 3 4 4 5 10 This yields a value of 3.222. Like the median, the trimmed mean is less sensitive to extreme values than the mean. For example, the trimmed mean of this data set would be 3.222 even if there were 30 days with precipitation in April instead of 10. © All Rights Reserved. 2000 Minitab, Inc. == keep in mind that if the data set is symmetrical ... then, trimming really accomplishes nothing ... when it comes to the mean ... even if there are extreme values ... in a seriously + skewed distribution ... then trimming (for the mean) will back up the mean more to the LEFT ... compared to non trimming ... and just the opposite for a seriously - skewed distribution ... as i said earlier, trimming will necessarily DECREASE the variability ... = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
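The Minitab example reproduces easily (my own check, in Python with scipy; scipy's trim_mean takes the fraction to cut from EACH end, and 0.1 of 11 observations truncates to one value per end, matching Minitab's rounded 5%):

    from scipy import stats

    precip = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 10]
    print(stats.trim_mean(precip, 0.1))  # drops the 1 and the 10 -> 3.222...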
Re: trimming data
in article [EMAIL PROTECTED], Mike Granaas at [EMAIL PROTECTED] wrote on 6/20/01 10:56 AM: A colleague has approached me about locating references discussing the trimming of data, with primary emphasis on psychological research. He is primarily interested in books/chapters/articles that emphasize the when and how. I am at a loss on this one and was wondering if anyone could offer a couple of references. McClelland, G.H. (2000). Nasty data: Unruly, ill-mannered observations can ruin your analysis. In H.T. Reis & C.M. Judd (Eds.), Handbook of research methods in social and personality psychology. [Chpt 15] Judd, C.M., & McClelland, G.H. (1989). Data analysis: A model comparison approach. HBJ. [see Chpt 9] Madansky, A. (1988). Prescriptions for working statisticians. Springer-Verlag. Atkinson, A.C. (1985). Plots, transformations, and regression: An introduction to graphical methods of diagnostic regression analysis. Clarendon Press. Gary McClelland Univ of Colorado = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: comparing 2 slopes
in article [EMAIL PROTECTED], Tracey Continelli at [EMAIL PROTECTED] wrote on 6/20/01 7:06 AM: snip of the earlier exchange with Mike Tonkovich and Gary McClelland, quoted in full above Quite so. Once you add the product term, the interpretation changes, and the parameter estimates are now known as simple main effects. The interpretation is pretty straightforward however. The parameter estimate, or slope, for your focal independent variable in the interaction model simply represents the effect of your independent variable upon your dependent variable when your moderator variable is equal to zero, holding constant all other independent variables in your model. The same may be said for the slope of your moderator variable - it represents the effect of that variable upon your dependent variable when your focal independent variable is equal to zero. Because in my research [the social science variety] that information isn't terribly useful [because most of the time you won't realistically see the moderator variable at zero, i.e., a zero crime rate or a zero poverty rate], what I will do is a mean centering trick. I'll subtract the mean from the moderator variable, rerun the equation with the new mean-centered variable and product term, and NOW the parameter estimates of the simple main effects are meaningful for me. Now, when I look at the parameter estimate of the focal independent variable, it is telling me the effect of that independent variable upon the dependent variable when my moderator variable is at its mean. The actual product term remains identical to the original equation [of course], but now the simple main effects are realistically meaningful. 
I'll also apply the same technique for when the moderator variable is 2 standard deviations below the mean, 1 below the mean, all the way up to 2 standard deviations above the mean. This gives one a nice graphic sense of the way in which the slope between your focal independent variable and your dependent variable changes with successive changes in your moderator variable. Tracey Continelli Doctoral candidate SUNY at Albany I hope everyone in the social sciences using product terms or moderator regression reads Tracey's thoughtful comments above. Failing to realize that the coefficient for one of the components of a product is the effect of that variable when the other variable of the product is zero is one of my candidates for the most common statistical error in the social sciences. Mean centering is indeed quite useful, even if one does not have products in the model. Also note that mean centering will always reduce the correlation between the product and its components, and if the component distributions are symmetric it will reduce it to zero. There always exists a change of origin for the components that will make the correlation zero; hence, the collinearity warnings when testing products are not meaningful. Gary McClelland Univ of Colorado = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
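Gary's claim about centering and correlation is easy to verify numerically (my own check, in Python): for independent symmetric components, the raw component-product correlation is clearly nonzero, and mean-centering drives it to essentially zero.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.normal(10, 1, 100000)
    z = rng.normal(-4, 1, 100000)
    print(np.corrcoef(x, x * z)[0, 1])     # raw: ~ -0.37 here
    xc, zc = x - x.mean(), z - z.mean()
    print(np.corrcoef(xc, xc * zc)[0, 1])  # centered: ~0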
Re: Help me, please!
On 18 Jun 2001 01:18:37 -0700, [EMAIL PROTECTED] (Monica De Stefani) wrote: 1) Are there some conditions under which I can apply normality to Kendall's tau? tau is *lumpy* in its distribution for N less than 10. And all rank-order statistics are a bit problematic when you try to use them on rating scales with just a few discrete scores -- the tied values give you bad scaling intervals, and the estimate of variance won't be very good, either. For correlations, your assumption of 'normality' is usually applied to the values at zero. I was wondering if x's observations must be independent and y's observations must be independent to apply the asymptotically normal limiting distribution. (null hypothesis = x and y are independent). Could you tell me something about this? - Independence is needed for just about any test. I started to say (as a minor piece of exaggeration) that independence is needed absolutely; but the correct statement, I think, is that independence is always demanded relative to the error term. [ snip, non-linear?] Monotonic is the term. [ snip, T(z): I don't know what that is.] -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
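The "lumpy for small N" point can be seen by brute force (my own illustration, in Python with scipy): enumerating all permutations for N = 5 shows the null distribution of tau is supported on only a handful of values, so a normal approximation is rough.

    from itertools import permutations
    import numpy as np
    from scipy import stats

    n = 5
    taus = [stats.kendalltau(list(range(n)), list(p))[0]
            for p in permutations(range(n))]
    vals, counts = np.unique(np.round(taus, 6), return_counts=True)
    print(dict(zip(vals, counts)))  # only 11 attainable values for n = 5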
Re: Consistency quotation
G. B. Shaw - Pygmalion. My Fair Lady, maybe too. "I can tell a woman's age in half a minute - and I do." Surely H. Higgins prided himself on consistency :) Jay [EMAIL PROTECTED] wrote: I remember reading something like the following: "Consistency alone is not necessarily a virtue. One can be consistently obnoxious." I believe it was in a discussion to an RSS read paper, maybe from about 30 years ago, but I have not been able to find it again. A web-search for "consistently obnoxious" taught me more about asbestos corks than I care to know, but was otherwise unhelpful. Can anyone provide the source, or at least a lead? Many thanks, Ewart Shaw. -- J.E.H.Shaw [Ewart Shaw] [EMAIL PROTECTED] TEL: +44 2476 523069 Department of Statistics, University of Warwick, Coventry CV4 7AL, U.K. http://www.warwick.ac.uk/statsdept/Staff/JEHS/ The opposite of a profound truth is not also a profound truth. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX: (262) 681-1133 email: [EMAIL PROTECTED] web: http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
W. D. Allen Sr. wrote: There is medical research that shows marijuana is more lethal than tobacco regarding lung cancer. Maybe there is a correlation between lung cancer susceptibility and heart attacks? We know there is for tobacco! We know there is a correlation between alcohol, doctor's bills, and the tuition for Big Fart science schools and heart attacks too! But why is that always swept under the rug by the probability theory geniuses? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: comparing 2 slopes
in article [EMAIL PROTECTED], Tracey Continelli at [EMAIL PROTECTED] wrote on 6/13/01 4:14 PM: Mike Tonkovich [EMAIL PROTECTED] wrote in message news:3b20f210_1@newsfeeds... Was hoping someone might be able to confirm that my approach for comparing 2 slopes was correct. I ran an analysis of covariance using PROC GLM (in SAS) with an interaction statement. My understanding was that a nonsignificant interaction term meant that the slopes were the same, and vice versa for a significant interaction term. Is this correct and is this the best way to approach this problem with SAS? Any help would certainly be appreciated. Mike Tonkovich -- Michael J. Tonkovich, Ph.D. Wildlife Research Biologist ODNR, Division of Wildlife [EMAIL PROTECTED] The slopes need not be the same if the interaction term is non-significant, BUT, the difference between them will not be statistically significant. If the differences between the slopes *are* statistically significant, this will be reflected in a statistically significant product term. I have preferred using regression analyses with interaction terms, which can be easily incorporated by simply multiplying the variables together and then running the regression equation with each independent variable plus the product term [which is simply another name for the interaction term]. The results are much more straightforward in my mind. Tracey Continelli SUNY at Albany I agree completely but there can be problems interpreting the regression output (e.g., mistakes like talking about main effects). For advice on avoiding the common interpretation pitfalls, see Aiken & West (1991). Multiple regression: Testing and interpreting interactions. Sage. Irwin & McClelland (2001), Journal of Marketing Research. Gary McClelland Univ of Colorado = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: multivariate techniques for large datasets
you might want to go to http://www.pitt.edu/~csna/ and then cross-post your question to CLASS-L The Classification Society meeting this weekend had a lot of discussion of these topics. My first question is whether you intend to interpret the clusters? If so, what is the nature of the 500 variables? What is the nature of your cases? What does the set of cases represent? How much data is missing? What kinds of missing data do you have? What do you want to do with the cluster results? Are you interested in a tree or a simple clustering? Many users of clustering use data reduction techniques such as factor analysis to summarize the variability of the 500 with a smaller number of dimensions. srinivas wrote: Hi, I have a problem in identifying the right multivariate tools to handle a dataset of dimension 100,000 x 500. The problem is further complicated by a lot of missing data. Can anyone suggest a way to reduce the data set and also to estimate the missing values? I need to know which clustering tool is appropriate for grouping the observations (based on 500 variables). = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
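One concrete version of that reduce-then-cluster advice (my own sketch, in Python with scikit-learn; the mean imputation, 20 components, and 5 clusters are arbitrary illustrative choices, not recommendations, and the data are a random stand-in):

    import numpy as np
    from sklearn.impute import SimpleImputer
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    from sklearn.pipeline import make_pipeline

    X = np.random.default_rng(4).normal(size=(1000, 500))        # stand-in data
    X[np.random.default_rng(5).random(X.shape) < 0.05] = np.nan  # 5% missing

    pipe = make_pipeline(SimpleImputer(strategy="mean"),  # crude imputation
                         PCA(n_components=20),            # summarize the 500 vars
                         KMeans(n_clusters=5, n_init=10))
    labels = pipe.fit_predict(X)
    print(np.bincount(labels))

Whether mean imputation and PCA are defensible depends entirely on the answers to the questions above about what the cases and variables represent and why values are missing.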
Re: Probability Of an Unknown Event
The only time I can think of this being meaningful is in determining what size sample to draw. If we don't have any prior information about what proportion of events in a population have a particular characteristic (the probability of a characteristic), then we assume the worst case (widest variance) of 50%. W. D. Allen Sr. wrote: It's been years since I was in school so I do not remember if I have the following statement correct. Pascal said that if we know absolutely nothing about the probability of occurrence of an event then our best estimate for the probability of occurrence of that event is one half. Do I have it correctly? Any guidance on a source reference would be greatly appreciated! Thanks, WDA [EMAIL PROTECTED] end = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
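The worst-case claim is just that the binomial variance term p(1-p), which drives sample-size formulas of the form n = z^2 p(1-p)/e^2, peaks at p = 0.5. A trivial numerical check (mine):

    import numpy as np

    p = np.linspace(0.0, 1.0, 101)
    print(p[np.argmax(p * (1 - p))])  # 0.5: widest variance, largest required n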
Re: 3rd degree polynomial curve fitting, correlation needed
Matti Overmark wrote: I have fitted a 3rd degree curve to a sample (least squares method), and I want to compare this particular R2 with that of a (similarly) fitted 2nd degree polynomial. I can assure you that the 3rd degree polynomial will fit as well or better than the 2nd degree polynomial, as measured by R-squared. If you want a statistical test of the hypothesis that the 3rd degree model yields a significantly better fit than the second degree model, then you should do an extra-sums-of-squares test, as explained in the fine textbook by Draper and Smith, Applied Regression Analysis. I want to see which of the two models is the best. Any suggestion of a good book? A plot would work just fine, if you want to see how the models fit. -- Paige Miller Eastman Kodak Company [EMAIL PROTECTED] It's nothing until I call it! -- Bill Klem, NL Umpire When you get the choice to sit it out or dance, I hope you dance -- Lee Ann Womack = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
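A sketch of the extra-sums-of-squares test Paige recommends (my own, in Python; the simulated data, with a quadratic truth, are invented for illustration): F = ((RSS2 - RSS3)/1) / (RSS3/(n - 4)), comparing the nested degree-2 and degree-3 fits.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    x = np.linspace(0, 1, 50)
    y = 1 + 2*x - 3*x**2 + rng.normal(0, 0.1, x.size)  # truth is quadratic

    def rss(deg):
        resid = y - np.polyval(np.polyfit(x, y, deg), x)
        return (resid**2).sum()

    rss2, rss3 = rss(2), rss(3)
    F = (rss2 - rss3) / (rss3 / (x.size - 4))
    print(F, stats.f.sf(F, 1, x.size - 4))  # large p here: cubic term not needed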
Re: Probability Of an Unknown Event
Thanks Robert! WDA end - Original Message - From: Robert J. MacG. Dawson [EMAIL PROTECTED] To: W. D. Allen Sr. [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Sunday, June 17, 2001 6:35 PM Subject: Re: Probability Of an Unknown Event W. D. Allen Sr. wrote: It's been years since I was in school so I do not remember if I have the following statement correct. Pascal said that if we know absolutely nothing about the probability of occurrence of an event then our best estimate for the probability of occurrence of that event is one half. [snipped] = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Probability Of an Unknown Event
On Sat, 16 Jun 2001 23:05:52 GMT, W. D. Allen Sr. [EMAIL PROTECTED] wrote: It's been years since I was in school so I do not remember if I have the following statement correct. Pascal said that if we know absolutely nothing about the probability of occurrence of an event then our best estimate for the probability of occurrence of that event is one half. Do I have it correctly? Any guidance on a source reference would be greatly appreciated! I did a little bit of Web searching and could not find that. Here is an essay about Bayes, which (dis)credits him and his contemporaries as assuming something like that, years before Laplace. I found it with a google search on "know absolutely nothing probability". http://web.onetel.net.uk/~wstanners/bayes.htm -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Probability Of an Unknown Event
The problem comes because there is often no unique way of defining events. It is hard to think of a real example where we literally know nothing. The equal probability answer is often just a cop-out for not thinking about what we do know. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Maximum likelihood Was: Re: Factor Analysis
In article [EMAIL PROTECTED], Ken Reed [EMAIL PROTECTED] wrote: It's not really possible to explain this in lay person's terms. The difference between principal factor analysis and common factor analysis is roughly that PCA uses raw scores, whereas factor analysis uses scores predicted from the other variables and does not include the residuals. That's as close to lay terms as I can get. I have never heard a simple explanation of maximum likelihood estimation, but -- MLE compares the observed covariance matrix with a covariance matrix predicted by probability theory and uses that information to estimate factor loadings etc that would 'fit' a normal (multivariate) distribution. MLE factor analysis is commonly used in structural equation modelling, hence Tracey Continelli's conflation of it with SEM. This is not correct though. I'd love to hear simple explanation of MLE! MLE is triviality itself, if you do not make any attempt to state HOW it is to be carried out. For each possible value X of the observation, and each state of nature \theta, there is a probability (or density with respect to some base measure) P(X | \theta). There is no assumption that X is a single real number; it can be anything; the same holds about \theta. What MLE does is to choose the \theta which makes P(X | \theta) as large as possible. That is all there is to it. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
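Rubin's one-sentence description is easy to make concrete (a toy sketch of my own, in Python with scipy, not his code): write down -log P(X | theta) for an observed sample and let a numerical optimizer choose theta, here (mu, sigma) for a normal sample.

    import numpy as np
    from scipy import stats
    from scipy.optimize import minimize

    x = np.random.default_rng(7).normal(3.0, 1.5, 200)  # observed data

    def neg_log_lik(theta):
        mu, log_sd = theta                 # log-parameterize sd to keep it positive
        return -stats.norm.logpdf(x, mu, np.exp(log_sd)).sum()

    mu_hat, log_sd_hat = minimize(neg_log_lik, x0=[0.0, 0.0]).x
    print(mu_hat, np.exp(log_sd_hat))      # ~3.0, ~1.5

MLE factor analysis does the same thing with theta = (\Lambda, M, S) and the multivariate normal likelihood for the observed covariance matrix; only the bookkeeping is heavier.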
Re: Help me, please!
Monica De Stefani [EMAIL PROTECTED] wrote in message [EMAIL PROTECTED]">news:[EMAIL PROTECTED]... 2) Can Kendall discover nonlinear dependence? He used to be able to, but he died. (Look at how Kendall's tau is calculated. Notice that it is not affected by any monotonic increasing transformation. So Kendall's tau measures monotonic association - the tendency of two variables to be in the same order.) Glen = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
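Glen's parenthetical checks out numerically (my own illustration, in Python with scipy): applying a monotone increasing transformation to one variable leaves tau exactly unchanged, which is why tau measures monotonic rather than linear association.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    x = rng.normal(size=200)
    y = x**3 + rng.normal(0, 0.5, 200)        # nonlinear but monotone-ish link
    print(stats.kendalltau(x, y)[0])
    print(stats.kendalltau(np.exp(x), y)[0])  # identical: exp() is monotone increasing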
Re: 3rd degree polynomial curve fitting, correlation needed
Judd & McClelland, _Data Analysis: A Model Comparison Approach_, chapter 8. MG On 18 Jun 2001, Matti Overmark wrote: Hi group! I'm new to this group, so...just so you know. I have fitted a 3rd degree curve to a sample (least squares method), and I want to compare this particular R2 with that of a (similarly) fitted 2nd degree polynomial. I want to see which of the two models is the best. Any suggestion of a good book? Thanks in advance, Matti Ö. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = *** Michael M. Granaas Associate Professor [EMAIL PROTECTED] Department of Psychology University of South Dakota Phone: (605) 677-5295 Vermillion, SD 57069 FAX: (605) 677-6604 *** All views expressed are those of the author and do not necessarily reflect those of the University of South Dakota, or the South Dakota Board of Regents. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Normality in Factor Analysis
In article 9gg7ht$qa3$[EMAIL PROTECTED], haytham siala [EMAIL PROTECTED] wrote: Hi, I have a question regarding factor analysis: Is normality an important precondition for using factor analysis? If no, are there any books that justify this. Factor analysis is quite robust against non-normality. The essential factor structure is little affected by it at all, although the representation may get somewhat sensitive if data-dependent normalizations are used, such as using correlations rather than covariances, or forcing normalization on the covariance matrix of the factors. Some of this is in my paper with Anderson in the Proceedings of the Third Berkeley Symposium. The result on the asymptotic distribution, not at all difficult to derive, is in one of my abstracts in _Annals of Mathematical Statistics_, 1955. It is basically this: Suppose the factor model is x = \Lambda f + s, f the common factors and s the specific factors. Further suppose that f and s, and also the elements of s, are uncorrelated, and there is adequate normalization and smooth identification of the model by the elements of \Lambda alone. Now estimate \Lambda, M, the covariance matrix of f, and S, the diagonal covariance matrix of s. Assuming the usual assumptions for asymptotic normality of the sample covariances of the elements of f with s, and of the pairs of different elements of s, the deviations of the estimates of \Lambda, and of the SAMPLE values of M and S, from their actual values will have the expected asymptotic joint normal distribution. This makes no assumption about the distribution of M and S about their expected values, which is the main place where there is an effect of normality. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: meta-analysis
On 17 Jun 2001 04:34:26 -0700, [EMAIL PROTECTED] (Marc) wrote: I have to summarize the results of some clinical trials. Unfortunately the reported information is not complete. The information given in the trials contains: (1) Mean effect in the treatment group (days of hospitalization) (2) Mean effect in the control group (days of hospitalization) (3) Numbers of patients in the control and treatment groups (4) p-values of a t-test (between the differences of treatment and control) My question: How can I calculate the variance of the treatment difference, which I need to perform a meta-analysis? Note that the numbers of patients in the groups are not equal. Aren't you going too far? You said you have to summarize. Well, summarize. The difference is in terms of days. Or it is in terms of percentage increase. And you have the t-tests and p-values. You might be right in what you propose, but I think you are much more likely to produce a useful report if you keep it simple. You are right; meta-analyses are complex. And a majority of the published ones are (in my opinion) awful. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
On 15 Jun 2001 02:04:36 -0700, [EMAIL PROTECTED] (Eamon) wrote: [ snip, Paul Jones. About marijuana statistics.] Surely this whole research is based upon a false premise. Isn't it like saying that 90%, say, of heroin users previously used soft drugs, therefore soft-drug use usually leads to hard-drug use - which does not logically follow. (A implies B does not entail B implies A.) Conclusions drawn from the set of people who have had heart attacks cannot be validly applied to the set of people who smoke dope. Rather than collect data from a large number of people who had heart attacks and look for a backward link, they should monitor a large number of people who smoke dope. But, of course, this is much more expensive. It is much more expensive, but it is also totally stupid to carry out the expensive research if the *cheap* and lousy research didn't give you a hint that there might be something going on. The numbers that he was asking about do pass the simple test. I mean, there were not 1 million people contributing one hour each, but we should still ask, *Would* this say something? If it would not, then the whole question is *totally* arid. The 2x2 table is approximately (dividing the first column by 100, and subtracting from the totals):

    10687   124
      175     9

That gives a contingency test of 21.2 or 18.2, with p-values under .001. The odds ratio on that is 4.4. That is pretty convincing that there is SOMETHING going on, POSSIBLY something that merits an explanation. The expectation for the cell with 9 is just 2.2 -- the tiny cell is the one that matters for contributions to the test -- which is why it is okay to lop the hundreds off the first column (to make it readable). Now, you may return to your discussion of why the table is not any good, and what is needed for a proper test. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
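For anyone who wants to reproduce Rich's arithmetic, a minimal sketch (his approximate table, both contingency statistics he quotes, the odds ratio, and the small expected cell):

    import numpy as np
    from scipy.stats import chi2_contingency

    table = np.array([[10687, 124],
                      [  175,   9]])

    chi2_raw, p_raw, dof, expected = chi2_contingency(table, correction=False)
    chi2_yates, p_yates, _, _ = chi2_contingency(table, correction=True)
    odds_ratio = (table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0])

    print(chi2_raw, chi2_yates)  # ~21.2 (uncorrected) and ~18.2 (Yates-corrected)
    print(p_raw, p_yates)        # both well under .001
    print(odds_ratio)            # ~4.4
    print(expected[1, 1])        # ~2.2, the expectation for the cell with 9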
Re: meta-analysis
On 17 Jun 2001, Marc wrote (edited): I have to summarize the results of some clinical trials. The information given in the trials contains: mean effects (days of hospitalization) in treatment and control groups; numbers of patients in the groups; p-values of a t-test (of the difference between treatment and control). My question: How can I calculate the variance of the treatment difference, which I need to perform meta-analysis? Note that the numbers of patients in the groups are not equal. Is it possible to do it like this: s^2 = (difference between contr and treatm)^2 / ((1/n1 + 1/n2)*t^2) Yes, if you know t. If all you know is that p < alpha for some alpha, then you know only that t > the t corresponding to alpha (AND you need to know whether the test had been one-sided or two-sided -- of course, you need to know that in any case); you can substitute that corresponding t to obtain an upper bound on s^2 -- ASSUMING that the t was calculated using a pooled variance (your s^2), not using the expression for separate variances in the denominator: (s1^2/n1 + s2^2/n2). Note that this s^2 is NOT the variance of the treatment difference, which you said you wanted to know; it is the pooled estimate of the variance within each group. The variance of the difference in treatment means, which _may_ be what you are interested in, would be (difference)^2 / t^2, with the same caveats concerning what you know about t. How exact would such an approximation be? That depends on the precision with which p was reported. Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
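Burrill's two formulas as a short sketch (hypothetical function names; assumes a pooled-variance t test, and a two-sided p if only p is reported):

    from scipy import stats

    def pooled_var_from_t(diff, n1, n2, t):
        # Pooled within-group variance: s^2 = diff^2 / ((1/n1 + 1/n2) * t^2)
        return diff ** 2 / ((1.0 / n1 + 1.0 / n2) * t ** 2)

    def var_of_mean_diff(diff, t):
        # Variance of the difference in means itself: (diff / t)^2
        return (diff / t) ** 2

    def t_from_two_sided_p(p, n1, n2):
        # If only a two-sided p-value is reported, recover |t| from it first.
        return stats.t.isf(p / 2.0, n1 + n2 - 2)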
Re: individual item analysis
On 15 Jun 2001 14:24:39 -0700, [EMAIL PROTECTED] (Doug Sawyer) wrote: I am trying to locate a journal article or textbook that addresses whether or not exam questions can be normalized when the questions are grouped differently. For example, could a question bank be developed where any subset of questions could be selected, and the assembled exam is normalized? What is the name of this area of statistics? What authors or keywords would I use for such a search? Do you know whether or not this can be done? I believe that they do this sort of thing in scholastic achievement tests, as a matter of course. Isn't that how they make the transition from year to year? I guess this would be norming. A few weeks ago, I discovered that there is a whole series of tech reports put out by one of the big test companies. I would look back to those for this sort of question. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Factor Analysis
It's not really possible to explain this in lay person's terms. The difference between principal components analysis and common factor analysis is roughly that PCA uses the raw scores, whereas factor analysis uses scores predicted from the other variables and does not include the residuals. That's as close to lay terms as I can get. I have never heard a simple explanation of maximum likelihood estimation, but -- MLE compares the observed covariance matrix with a covariance matrix predicted by probability theory and uses that information to estimate factor loadings etc. that would 'fit' a (multivariate) normal distribution. MLE factor analysis is commonly used in structural equation modelling, hence Tracey Continelli's conflation of it with SEM. This is not correct, though. I'd love to hear a simple explanation of MLE! From: [EMAIL PROTECTED] (Tracey Continelli) Organization: http://groups.google.com/ Newsgroups: sci.stat.consult,sci.stat.edu,sci.stat.math Date: 15 Jun 2001 20:26:48 -0700 Subject: Re: Factor Analysis Hi there, would someone please explain in lay person's terms the difference between principal components, common factors, and maximum likelihood estimation procedures for factor analyses? Should I expect my factors obtained through maximum likelihood estimation to be highly correlated? Why? When should I use a maximum likelihood estimation procedure, and when should I not use it? Thanks. Rita [EMAIL PROTECTED] Unlike the other methods, maximum likelihood allows you to estimate the entire structural model *simultaneously* [i.e., the effects of every independent variable upon every dependent variable in your model]. Most other methods only permit you to estimate the model in pieces, i.e., as a series of regressions whereby you regress every dependent variable upon every independent variable that has an arrow directly pointing to it. Moreover, maximum likelihood actually provides a statistical test of significance, unlike many other methods which only provide generally accepted cut-off points but not an actual test of statistical significance. There are very few cases in which I would use anything except a maximum likelihood approach, which you can use in either LISREL or, if you use SPSS, you can add on the module AMOS which will do this as well. Tracey = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
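A quick way to see the PCA-versus-common-factor distinction is to run both on the same data and compare what they recover. A minimal sketch using scikit-learn (synthetic data and hypothetical parameter choices, not anyone's actual analysis):

    import numpy as np
    from sklearn.decomposition import PCA, FactorAnalysis

    rng = np.random.default_rng(0)
    f = rng.normal(size=(500, 2))                  # two latent common factors
    loadings = rng.normal(size=(2, 6))
    x = f @ loadings + rng.normal(scale=0.5, size=(500, 6))  # plus specific noise

    pca = PCA(n_components=2).fit(x)
    fa = FactorAnalysis(n_components=2).fit(x)

    print(pca.components_)      # directions of maximal *total* variance
    print(fa.components_)       # loadings for the *common* variance only
    print(fa.noise_variance_)   # the specific variances that PCA ignores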
Re: Factor Analysis
Dear Haytham, another issue concerning a measure of the latent construct is unidimensionality. Hair et al. (1998): unidimensionality is an assumption underlying the calculation of reliability and is demonstrated when indicators of a construct have acceptable fit on a single-factor (one-dimensional) model. (...) The use of reliability measures, such as Cronbach's alpha, does not ensure unidimensionality but instead assumes it exists. The researcher is encouraged to perform unidimensionality tests on all multiple-indicator constructs before assessing their reliability. This reference is very important: Gerbing, David W., & Anderson, James C., An updated paradigm for scale development incorporating unidimensionality and its assessment. Best regards, Alexandre Moura. P.S. Please accept my apologies for my English mistakes. - Original Message - From: haytham siala [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Friday, June 15, 2001 5:40 PM Subject: Factor Analysis Hi, I will appreciate it if someone can help me with this question: if factors extracted from a factor analysis were found to be reliable (using an internal consistency test like Cronbach's alpha), can they be used to represent a measure of the latent construct? If yes, are there any references or books that justify this technique? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
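Since Cronbach's alpha keeps coming up in this thread, here is the textbook formula -- alpha = k/(k-1) * (1 - sum of item variances / variance of the total score) -- as a minimal sketch:

    import numpy as np

    def cronbach_alpha(items):
        # items: (n_respondents, k_items) array of item scores
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

As the Hair et al. passage stresses, a high alpha by itself does not establish that the items form a single dimension.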
Re: Factor Analysis
The complete reference: Gerbing, David W., & Anderson, James C., An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of Marketing Research, Vol. XXV (May 1988). Alexandre Moura. - Original Message - From: Alexandre Moura [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Saturday, June 16, 2001 9:26 AM Subject: Re: Factor Analysis [big snip - message quoted in full in the previous post] = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
There is medical research that shows marijuana is more lethal than tobacco regarding lung cancer. Maybe there is a correlation between lung cancer susceptibility and heart attacks? We know there is for tobacco! WDA end Paul Jones [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]... There was some research recently linking heart attacks with marijuana smoking. I'm trying to work out the correlation and, most importantly, its statistical significance. In essence the problem comes down to: of 8760 hours in a year, 124 had heart attacks in them, 141 had MJ smokes in them and 9 had both. What statistical tests apply? Most importantly, what is the statistical significance of the correlation between smoking MJ in any hour and having a heart attack in that same hour? What is the probability that the null hypothesis (that smoking marijuana and having a heart attack are unrelated) can be rejected? How reliable are the results from a dataset of this size? I'm not very literate in maths and stats - please help me out someone. I'm interested in this research from the perspective of medicinal marijuana. Thanks and take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Normality in Factor Analysis
In sci.stat.consult haytham siala [EMAIL PROTECTED] wrote: I have a question regarding factor analysis: Is normality an important precondition for using factor analysis? It's necessary for testing hypotheses about factors extracted by Joreskog's maximum-likelihood method. Otherwise, no. If not, are there any books that justify this? Any book on factor analysis, or on multivariate statistics in general. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
In article XhRW6.14316$[EMAIL PROTECTED], W. D. Allen Sr. [EMAIL PROTECTED] wrote: There is medical research that shows marijuana is more lethal than tobacco regarding lung cancer. Thanks for playing, but sorry, no. There's a lot of research which says a lot of different things about marijuana's deleterious effects on the lungs. Off the top of my head: A Berkeley study of the late '70s concluded that marijuana is one-and-a-half times as carcinogenic as tobacco. This assessment took into account _only_ quantities of tar. Tar, while a carcinogen, is not the primary cancer-causing agent in tobacco, or even close; polonium-210 and lead-210 are considerably more hazardous and conspicuously absent from marijuana. Add to this the fact that marijuana smokers are unlikely to consume nearly as much net weight of smokable material as tobacco smokers, and you're talking apples and oranges. Actual tests on real live people bear this out. Multiple population samples show no correlation between marijuana use exclusive of tobacco use and lung cancer: Tashkin, D.P. et al., Longitudinal Changes in Respiratory Symptoms and Lung Function in Non-smokers, Tobacco Smokers, and Heavy, Habitual Smokers of Marijuana With or Without Tobacco, pp. 25-36 in G. Chesher et al. (eds), Marijuana: An International Research Report, Canberra: Australian Government Publishing Service (1988). Sherrill, D.L. et al., Respiratory Effects of Non-Tobacco Cigarettes: A Longitudinal Study in General Population, International Journal of Epidemiology 20:132-37 (1991). Fligiel, S.E.G. et al., Bronchial Pathology in Chronic Marijuana Smokers: A Light and Electron Microscope Study, Journal of Psychoactive Drugs 20:33-42 (1988). Maybe there is a correlation between lung cancer susceptibility and heart attacks? We know there is for tobacco! Well, inhaling smoke of _any_ sort actually puts some strain on your heart. I believe specific toxins in tobacco exacerbate the problem, but it's present for all types of smokables. Of course, we're very off-topic here. Anyone want to crosspost this thread to sci.med.*, or talk.politics.drugs? +--First Church of Briantology--Order of the Holy Quaternion--+ | A mathematician is a device for turning coffee into theorems. -Paul Erdos | | Jake Wildstrom | +-+ = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Pooled relative risks
JFC [EMAIL PROTECTED] wrote: I am interested in the way of calculating pooled relative risks, since it is of interest in some *meta-analytical* applications. See, e.g., Rothman, Modern Epidemiology, p. 196 et seq. -- | David Duffy. ,-_|\ | email: [EMAIL PROTECTED] ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, The Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia v = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
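For concreteness: one standard pooled estimator covered in that part of Rothman is the Mantel-Haenszel risk ratio. A minimal sketch (hypothetical function name; one 2x2 table of counts per stratum or study):

    def mantel_haenszel_rr(strata):
        # strata: iterable of (a, b, c, d) per study, where a = exposed cases,
        # b = exposed non-cases, c = unexposed cases, d = unexposed non-cases.
        num = den = 0.0
        for a, b, c, d in strata:
            n = a + b + c + d
            num += a * (c + d) / n
            den += c * (a + b) / n
        return num / den

    # Made-up counts, purely to show the call:
    print(mantel_haenszel_rr([(9, 115, 120, 980), (12, 200, 30, 1500)]))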
Re: Marijuana
David C. Ullrich wrote: considerable benefit for neurogenic bladder problems, I did not know that, but I know that the topic is of considerable interest to people with various other conditions. Yes, recent work at the National Hospital of Neurology and Neurosurgery in London, UK has shown that two cannabinoids administered in a spray considerably reduce urinary frequency and the number of times PwMS have to get up to pee during the night (a big problem). The researcher I was talking to said that there are cannabinoid receptors in the bladder and the cortex but not in the micturition control areas of the brainstem nor in the spinal cord. As is the fact that the Supreme Court seems to have decided that pi = 3 again... More like -6. Here I get a little lost again. Exactly what does it mean to say the relative risk is 4.8? I assumed it meant event A happened 4.8 times as much as would be expected if the two events were unrelated. And here again I'm _totally_ lost. Okay, put it like this: Of 1086240 trials, A happened in 17484 of them, B happened in 124 and both A and B happened in 9. I really need to know how to calculate the statistical implications here. Please someone help me! What I want to know is: what is the correlation between these two events? Most importantly, how statistically significant is the result? Can any reasonable conclusions be drawn from these data - especially in view of the small dataset size? Take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
David Petry wrote: Keep in mind that correlation is not the same as causation. That's of particular importance in a study like this one. That is, if people are taking marijuana to treat pain and general discomfort, and if heart attacks are preceded by pain and discomfort, then there will be a strong correlation between marijuana use and later heart attacks, but it won't be proof of causation. I know the study is flawed in ever so many ways. I just want to get at the statistical implications. I wish I hadn't mentioned marijuana or the trial. Please help me to find the appropriate statistical test (e.g. two-tailed t-test, Spearman rank correlation, chi2 test or whatever) and help me work out the statistical significance of any correlation between events A and B where: In 1086240 trials, A happened in 17484 of them, B happened in 124 and both A and B happened in 9. Is there a statistical association between A and B? How significant is that association? I would be ever so grateful if someone could help. Take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
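Taking Paul's counts at face value as 1,086,240 independent hour-trials (later replies explain why that independence assumption is the real weak point), the association can be computed directly; a minimal sketch:

    from scipy.stats import fisher_exact

    #                  B (heart attack)    no B
    # A (smoked MJ)            9          17475
    # no A                   115        1068641
    table = [[9, 17475], [115, 1068641]]
    odds_ratio, p = fisher_exact(table)

    relative_risk = (9 / 17484) / (115 / (1086240 - 17484))
    print(relative_risk)    # ~4.8, matching the reported relative risk
    print(odds_ratio, p)    # formally a very significant association

So the answer to "is it significant?" is yes under independence; the honest caveat, per the rest of the thread, is that the hour-trials are not independent, which is a far bigger problem than the dataset's size.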
Re: Marijuana
Paul Jones [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]... snip So the research says that of a large number of people who had heart attacks at a centre, 124 people had used MJ in the year preceding the HA. Of these, 9 reported that they had used MJ in the hour preceding the HA. All MJ users were questioned on the frequency with which they used MJ. The relative risk was reported as 4.8 - I used this to back-calculate that the average number of MJ usages per year rounded to 141 - (9/n)/(115/(8760-n)) = 4.8 snip ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º° Surely this whole research is based upon a false premise. Isn't it like saying that 90%, say, of heroin users previously used soft drugs, therefore soft-drug use usually leads to hard-drug use - which does not logically follow. (A implies B does not entail B implies A.) Conclusions drawn from the set of people who have had heart attacks cannot be validly applied to the set of people who smoke dope. Rather than collect data from a large number of people who had heart attacks and look for a backward link, they should monitor a large number of people who smoke dope. But, of course, this is much more expensive. Just my humble tupennyworth, Eamon ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º°`°º¤ø,¸¸,ø¤º° = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
In his redoubtable Re: Marijuana dated: 6/14/2001 5:47:23 PM Central Daylight Time, Jim Ferry wrote I was surprised to see this subject heading on sci.math. I thought it might have to do with the following lyrics (I forget the name of the group and the song): "I smoke two joints at two o'clock; I smoke two joints at four. I smoke two joints before I smoke two joints, And then I smoke two more." Given an infinite supply of marijuana, even granting immortality to Cheech and Chong would not make the above feat possible. One would need to have existed for an infinite amount of time. And even then, smoking a joint takes at least one Planck time unit, so if you plot on a time-line the points at which each joint-pair-smoking finishes, there can't be any accumulation points. This would seem to preclude any such feat of pot-smoking . . . unless you somehow exist in a strange temporal topology (e.g., the long line). So then, how much marijuana would one have to smoke to actually change the nature of (one's personal) time in such a way? I'm guessing that no finite amount would suffice, but do not hazard a guess as to the precise cardinality required. I read the above message by starting with the first word [I] and ending with the last word [required]. Therefore I read the whole thing along some sort of time line (to use Jim's term). Now hear this (I don't mean that literally of course): I am utterly incapable of understanding the logic adduced in the last three paragraphs that come after the first paragraph as well as after the lyric which comes after the first paragraph and before the last three paragraphs. So now my question to Jim becomes: is what I'm about to say possible according to his line of thinking [pun intended]? I perused one paragraph at the beginning, I perused three paragraphs at the end. I perused one paragraph before I perused three paragraphs And then I perused three paragraphs more. To my untutored mind 'tis entirely possible because the third and fourth lines merely iterate the first and second. See the following exegesis: I perused one paragraph [that at the beginning] before I perused three paragraphs [those at the end] and then [having perused the paragraph at the beginning before going on to peruse the three paragraphs at the end] I perused three paragraphs more [i.e. the last three paragraphs]. Now if you tell me that the foregoing is impossible then --- whoopee --- I have done the impossible; because that is precisely what I did. Of course, you may retort: "Well hows come ya didunt say ya whatcha meant from da git go?" To which I could only sigh and reply: "Because I was engaging in a bit of whimsical wordplay, good sir --- and am inconsolably, irremediably, not to mention insincerely, sorry that you failed to comprehend what I was up to." Now think about this Jim, think real hard!!! Which seems more plausible: that the unknown author(s) of the lyric you cite were 1) just having some fun with words, or 2) that they were deliberately constructing a verbal paradox in the sense of a self-contradictory statement that at first seems true but could be mathematically demonstrated to be false? Oh yes!!! If you opt for the second option, please support your decision mathematically. (I won't understand it of course. But what difference does *that* make? I will nevertheless be tremenjusly impressed.) Su servidor [Sp: your servant --- but don't take that literally] Harley Upchurch, M.D. (No no!!! Not medical doctor, mathematical dummy.)
Re: Marijuana
On Fri, 15 Jun 2001 08:02:23 +0100, Paul Jones [EMAIL PROTECTED] wrote: David C. Ullrich wrote: considerable benefit for neurogenic bladder problems, I did not know that, but I know that the topic is of considerable interest to people with various other conditions. Yes, recent work at the National Hospital of Neurology and Neurosurgery in London, UK has shown that two cannabinoids administered in a spray considerably reduce urinary frequency and the number of times PwMS have to get up to pee during the night (a big problem). The researcher I was talking to said that there are cannabinoid receptors in the bladder and the cortex but not in the micturition control areas of the brainstem nor in the spinal cord. As is the fact that the Supreme Court seems to have decided that pi = 3 again... More like -6. Here I get a little lost again. Exactly what does it mean to say the relative risk is 4.8? I assumed it meant event A happened 4.8 times as much as would be expected if the two events were unrelated. And here again I'm _totally_ lost. Okay, put it like this: Of 1086240 trials, A happened in 17484 of them, B happened in 124 and both A and B happened in 9. But analyzing it this way simply makes no sense. Those trials you're talking about are _far_ from independent; each trial is associated with a particular person, and there will be a very strong correlation between various trials for the same person at different hours. I really need to know how to calculate the statistical implications here. Please someone help me! I know that the way you've been putting things makes no sense. I suspect, but I don't know for sure, that to get the sort of information you want you need more data than what you've told us - you also need data on how many people in the general population, without heart attacks, do and do not smoke evil weeds. What I want to know is: what is the correlation between these two events? Most importantly, how statistically significant is the result? Can any reasonable conclusions be drawn from these data - especially in view of the small dataset size? You keep asking this. The size of the dataset is not the reason we cannot draw the sort of inferences you're interested in. Take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ David C. Ullrich * Sometimes you can have access violations all the time and the program still works. (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01) = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
On Fri, 15 Jun 2001 08:02:23 +0100, Paul Jones [EMAIL PROTECTED] wrote: Of 1086240 trials, A happened in 17484 of them, B happened in 124 and both A and B happened in 9. I really need to know how to calculate the statistical implications here. Please someone help me! It is simple to solve this problem using a Monte Carlo simulation, that is, an approximate permutation test. I would gladly do that, but I need to know the frequency of pot smoking among those 124. That is, how many hours each one spends smoking pot in a year. From this information we can calculate how likely it is that there would be 9 or more coincidences of smoking pot and having a heart attack, given statistical independence. Sturla Molden = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
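A minimal sketch of the simulation Sturla describes, assuming (for illustration only) that each of the 124 patients smokes the back-calculated average of 141 hours a year and has a single heart attack in a uniformly random hour:

    import numpy as np

    rng = np.random.default_rng(0)
    HOURS = 8760            # hours in a year
    N_PATIENTS = 124        # MJ-using heart-attack patients
    SMOKE_HOURS = 141       # assumed average smoking hours per patient per year
    OBSERVED = 9            # observed smoke/heart-attack coincidences
    TRIALS = 100_000

    # Under independence, each patient's heart-attack hour lands in one of
    # his or her smoking hours with probability SMOKE_HOURS / HOURS.
    coincidences = rng.binomial(N_PATIENTS, SMOKE_HOURS / HOURS, size=TRIALS)
    print((coincidences >= OBSERVED).mean())   # Monte Carlo P(>= 9 | independence)

With the actual per-patient smoking frequencies in hand, the single binomial draw would be replaced by one Bernoulli draw per patient with that patient's own hours/8760.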
Re: Marijuana
On Thu, 14 Jun 2001, Chas F Brown wrote: Jim Ferry wrote: [ ... ] it might have to do with the following lyrics (I forget the name of the group and the song): I smoke two joints at two o'clock; I smoke two joints at four. I smoke two joints before I smoke two joints, And then I smoke two more. Surprisingly, the name of this song is Smoke Two Joints (by Sublime, available on the Mallrats soundtrack album). It is also a vague echo of one of the Earl of Rochester's poems, which begins, I rise at Eleven, I dine about Two, / I get drunk before Sev'n; and the next Thing I do... The rest is unpublishable in this dignified company. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
In article [EMAIL PROTECTED], Axel Harvey [EMAIL PROTECTED] wrote: It is also a vague echo of one the Earl of Rochester's poems which begins, I rise at Eleven, I dine about Two, / I get drunk before Sev'n; and the next Thing I do... The rest is unpublishable in this dignified company. He designs a web site dedicated to his proof of FLT? Wade = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Factor Analysis
Hi there, would someone please explain in lay person's terms the difference between principal components, common factors, and maximum likelihood estimation procedures for factor analyses? Should I expect my factors obtained through maximum likelihood estimation to be highly correlated? Why? When should I use a maximum likelihood estimation procedure, and when should I not use it? Thanks. Rita [EMAIL PROTECTED] Unlike the other methods, maximum likelihood allows you to estimate the entire structural model *simultaneously* [i.e., the effects of every independent variable upon every dependent variable in your model]. Most other methods only permit you to estimate the model in pieces, i.e., as a series of regressions whereby you regress every dependent variable upon every independent variable that has an arrow directly pointing to it. Moreover, maximum likelihood actually provides a statistical test of significance, unlike many other methods which only provide generally accepted cut-off points but not an actual test of statistical significance. There are very few cases in which I would use anything except a maximum likelihood approach, which you can use in either LISREL or, if you use SPSS, you can add on the module AMOS which will do this as well. Tracey = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Factor Analysis
_Psychometric Theory_, by Jum Nunnally, to name one. haytham siala wrote: Hi, I will appreciate it if someone can help me with this question: if factors extracted from a factor analysis were found to be reliable (using an internal consistency test like Cronbach's alpha), can they be used to represent a measure of the latent construct? If yes, are there any references or books that justify this technique? -- Timothy Victor [EMAIL PROTECTED] Policy Research, Evaluation, and Measurement Graduate School of Education University of Pennsylvania = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: multivariate techniques for large datasets
In article 9g9k9f$h4c$[EMAIL PROTECTED], Eric Bohlman [EMAIL PROTECTED] wrote: In sci.stat.consult Tracey Continelli [EMAIL PROTECTED] wrote: [...] I'm not sure why you'd want to reduce the size of the data set, since for the most part the larger the N the better. Actually, for datasets of the OP's size, the increase in power from the large size is a mixed blessing, for the same reason that many hard-of-hearing people don't terribly like wearing hearing aids: they bring up the background noise just as much as the signal. With an N of one million, practically *any* effect you can test for is going to be significant, regardless of how small it is. This just points out another stupidity of the use of significance testing. Since the null hypothesis is false anyhow, why should we care what happens to be the probability of rejecting when it is true? State the REAL problem, and attack this. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399 [EMAIL PROTECTED] Phone: (765) 494-6054 FAX: (765) 494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: multivariate techniques for large datasets
On 13 Jun 2001 20:32:51 -0700, [EMAIL PROTECTED] (Tracey Continelli) wrote: Sidney Thomas [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]... srinivas wrote: Hi, I have a problem in identifying the right multivariate tools to handle a dataset of dimension 100,000 x 500. The problem is further complicated by a lot of missing data. Can anyone suggest a way to reduce the data set and also to estimate the missing values? I need to know which clustering tool is appropriate for grouping the observations (based on 500 variables). One of the best ways in which to handle missing data is to impute the mean for other cases with the selfsame value. If I'm doing psychological research and I am missing some values on my depression scale for certain individuals, I can look at their, say, reported locus of control and impute the mean value. Let's say [common finding] that I find a pattern - individuals with a high locus of control report low levels of depression - and I have a scale ranging from 1-100 listing locus of control. If I have a missing value for depression at level 75 for one case, I can take the mean depression level for all individuals at level 75 of locus of control and impute that for all missing cases in which 75 is the listed locus of control value. I'm not sure why you'd want to reduce the size of the data set, since for the most part the larger the N the better. Do you draw numeric limits for a variable, and for a person? Do you make sure, first, that there is not a pattern? That is -- do you do something different depending on how many are missing? Say, estimate the value if it is an oversight in filling blanks on a form, BUT drop a variable if more than 5% of responses are unexpectedly missing, since (obviously) there was something wrong in the conception of it, or the collection of it. Psychological research (possibly) expects fewer missing than market research. As to the N - as I suggested before - my computer takes more time to read 50 megabytes than one megabyte. But a psychologist should understand that it is easier to look at and grasp and balance raw numbers that are only two or three digits, compared to 5 and 6. A COMMENT ABOUT HUGE DATABASES. As a statistician, I keep noticing that HUGE databases tend to consist of aggregations. And these are random samples only in the sense that they are uncontrolled, and their structure is apt to be ignored. If you start to sample, you are more likely to ask yourself about the structure - by time, geography, what-have-you. An N of millions gives you tests that are wrong; estimates ignoring relevant structure have a spurious report of precision. To put it another way: the Error (or real variation) that *exists* between a fixed number of units (years, or cities, for what I mentioned above) is something that you want to generalize across. With a small N, that error term is (we assume?) small enough to ignore. However, that error term will not decrease with N, so with a large N, it will eventually dominate. The test based on N becomes increasingly irrelevant. -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
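Tracey's conditional-mean imputation, as a minimal pandas sketch (hypothetical column names and toy values): each missing depression score is replaced by the mean depression of cases sharing the same locus-of-control level.

    import pandas as pd

    df = pd.DataFrame({
        "locus_of_control": [75, 75, 75, 40, 40],
        "depression":       [12.0, 18.0, None, 30.0, None],
    })
    # Group by locus of control; fill each missing value with its group mean.
    df["depression"] = df["depression"].fillna(
        df.groupby("locus_of_control")["depression"].transform("mean")
    )
    print(df)

Rich's caveats still apply: check the amount and pattern of missingness before imputing anything.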
Re: Marijuana
David C. Ullrich wrote in message [EMAIL PROTECTED]... On Thu, 14 Jun 2001 15:22:25 +0100, Paul Jones [EMAIL PROTECTED] wrote: There was some research recently linking heart attacks with Marijuana smoking. I'm trying to work out the correlation and, most importantly, its statistical significance. In essence the problem comes down to: Of 8760 hours in a year, 124 had heart attacks in them, 141 had MJ smokes in them and 9 had both. What statistical tests apply? None. What you've said here makes no sense - what does it mean for an _hour_ to have MJ smoke? If you're actually reporting on actual research it would be interesting to know what the actual researchers actually said - if there's actual research out there that talks about the number of hours in a year containing smoke that will be remarkable. If otoh this is a homework question you should quote the question more accurately. (If the homework question _really_ reads _exactly_ the way you put it then you should complain to whoever assigned it that it makes no sense.) Most importantly, what is the statistical significance of the correlation between smoking MJ in any hour and having a heart attack in that same hour? Now this sounds more like you're talking about one person. This is an actual person who actually had 124 heart attacks in one year? I doubt it. What is the probablity that the null hypothesis (that smoking marijuana and having a heart attack are unrelated) can be rejected? How reliable are the results from a dataset of this size? I'm not very literate in maths and stats - please help me out someone. I'm interested in this research from the perspective of medicinal marijuana. Fascinating topic. If this is not actually homework you need to explain the question much more accurately. The data presented may refer to a much-reported study. (See, for example, http://www.eurekalert.org/releases/bidm-bsf022800.html ) To quote from there: The findings are the latest to emerge from a multicenter study of 3,882 patients who survived heart attacks. In this report, 124 people reported using marijuana regularly. Of these, 37 people reported using marijuana within 24 hours of their heart attacks, and nine smoked marijuana within an hour of their heart attacks. Note: 124 people... 9 within an hour... And 3882/37 + 37 = 141 MU = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
Steve Leibel wrote: So the people who died from heart attacks weren't even considered in the study. Perhaps of all the people who had heart attacks, recent MJ use was statistically correlated with saving their lives. That would be consistent with what you just described. So the methodology sounds bogus. That's not all - the MJ users had an excess of males, cigarette smokers and obese people - all increased risks for myocardial infarction. These articles rarely show statistical significance, and it's hard to get hold of the full text without paying loads for it - besides, the full text might not quote p-values. I want to know how statistically significant the association is, even given the study's obvious weaknesses. I need to know how to calculate a p-value. If anyone could help it would be of great value to myself and a number of other PwMS. Thanks and take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
In article 9galk6$fjr$[EMAIL PROTECTED], Mr Unreliable [EMAIL PROTECTED] wrote: David C. Ullrich wrote in message [EMAIL PROTECTED]... On Thu, 14 Jun 2001 15:22:25 +0100, Paul Jones [EMAIL PROTECTED] wrote: There was some research recently linking heart attacks with Marijuana smoking. I'm trying to work out the correlation and, most importantly, its statistical significance. In essence the problem comes down to: Of 8760 hours in a year, 124 had heart attacks in them, 141 had MJ smokes in them and 9 had both. What statistical tests apply? None. What you've said here makes no sense - what does it mean for an _hour_ to have MJ smoke? If you're actually reporting on actual research it would be interesting to know what the actual researchers actually said - if there's actual research out there that talks about the number of hours in a year containing smoke that will be remarkable. If otoh this is a homework question you should quote the question more accurately. (If the homework question _really_ reads _exactly_ the way you put it then you should complain to whoever assigned it that it makes no sense.) Most importantly, what is the statistical significance of the correlation between smoking MJ in any hour and having a heart attack in that same hour? Now this sounds more like you're talking about one person. This is an actual person who actually had 124 heart attacks in one year? I doubt it. What is the probablity that the null hypothesis (that smoking marijuana and having a heart attack are unrelated) can be rejected? How reliable are the results from a dataset of this size? I'm not very literate in maths and stats - please help me out someone. I'm interested in this research from the perspective of medicinal marijuana. Fascinating topic. If this is not actually homework you need to explain the question much more accurately. The data presented may refer to a much-reported study. (See, for example, http://www.eurekalert.org/releases/bidm-bsf022800.html ) To quote from there: The findings are the latest to emerge from a multicenter study of 3,882 patients who survived heart attacks. In this report, 124 people reported using marijuana regularly. Of these, 37 people reported using marijuana within 24 hours of their heart attacks, and nine smoked marijuana within an hour of their heart attacks. So the people who died from heart attacks weren't even considered in the study. Perhaps of all the people who had heart attacks, recent mj use was statistically correlated with saving their lives. That would be consistent with what you just described. So the methodology sounds bogus. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
Thanks for replying, David. I'll try to frame the problem better. First, I shall explain my motivations. There has recently been some research implying that smoking MJ increases the risk of heart attack in the hour following use. I haven't got the full text of the article - I've just seen the abstract, the press releases and the resultant press coverage. There is a lot of dodgy research about, and I want to know how statistically valid this research is. As you can imagine, this topic is of great interest to people who use medicinal marijuana for multiple sclerosis, as it has considerable benefit for neurogenic bladder problems, neuropathic pain and muscle spasms. The headline that MJ may increase heart attack risk in the hour following smoking it is extremely pertinent to people with MS. This explains my motives. This is not homework - I have MS. So the research says that of a large number of people who had heart attacks at a centre, 124 people had used MJ in the year preceding the HA. Of these, 9 reported that they had used MJ in the hour preceding the HA. All MJ users were questioned on the frequency with which they used MJ. The relative risk was reported as 4.8 - I used this to back-calculate the average number of MJ usages per year: (9/n)/(115/(8760-n)) = 4.8, which solves to n = 78840/561, about 140.5, rounding to 141. I see an immediate mistake in what I wrote before - I had used the average number of MJ smokes but the total number of heart attacks. Restating the problem: Event A is smoking MJ. Event B is having an HA. Let's assume that both events can only happen once per hour and that each person only had one HA. Of 1,086,240 hours, A happened 17,484 times, B happened 124 times and both A and B happened 9 times. What I want to know is: what is the correlation between these two events? Most importantly, how statistically significant is the result? Can any reasonable conclusions be drawn from these data - especially in view of the small dataset size? I would appreciate being corrected. Take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ David C. Ullrich wrote: On Thu, 14 Jun 2001 15:22:25 +0100, Paul Jones [EMAIL PROTECTED] wrote: There was some research recently linking heart attacks with Marijuana smoking. I'm trying to work out the correlation and, most importantly, its statistical significance. In essence the problem comes down to: Of 8760 hours in a year, 124 had heart attacks in them, 141 had MJ smokes in them and 9 had both. What statistical tests apply? None. What you've said here makes no sense - what does it mean for an _hour_ to have MJ smoke? If you're actually reporting on actual research it would be interesting to know what the actual researchers actually said - if there's actual research out there that talks about the number of hours in a year containing smoke that will be remarkable. If otoh this is a homework question you should quote the question more accurately. (If the homework question _really_ reads _exactly_ the way you put it then you should complain to whoever assigned it that it makes no sense.) Most importantly, what is the statistical significance of the correlation between smoking MJ in any hour and having a heart attack in that same hour? Now this sounds more like you're talking about one person. This is an actual person who actually had 124 heart attacks in one year? I doubt it. What is the probability that the null hypothesis (that smoking marijuana and having a heart attack are unrelated) can be rejected? How reliable are the results from a dataset of this size?
I'm not very literate in maths and stats - please help me out someone. I'm interested in this research from the perspective of medicinal marijuana. Fascinating topic. If this is not actually homework you need to explain the question much more accurately. Thanks and take care, Paul All About MS - the latest MS News and Views http://www.mult-sclerosis.org/ David C. Ullrich * Sometimes you can have access violations all the time and the program still works. (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01) = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: multivariate techniques for large datasets
Herman Rubin wrote: In article 9g9k9f$h4c$[EMAIL PROTECTED], Eric Bohlman [EMAIL PROTECTED] wrote: In sci.stat.consult Tracey Continelli [EMAIL PROTECTED] wrote: [...] I'm not sure why you'd want to reduce the size of the data set, since for the most part the larger the N the better. Actually, for datasets of the OP's size, the increase in power from the large size is a mixed blessing, for the same reason that many hard-of-hearing people don't terribly like wearing hearing aids: they bring up the background noise just as much as the signal. With an N of one million, practically *any* effect you can test for is going to be significant, regardless of how small it is. This just points out another stupidity of the use of significance testing. Since the null hypothesis is false anyhow, why should we care what happens to be the probability of rejecting when it is true? State the REAL problem, and attack this. How true! The only drawback there can be to more rather than less data for inferential purposes would have to center around the extra cost of computation, rather than the inconvenience posed to significance-testing methodology. There is a significant philosophical question lurking here. It is a reminder of how we get so attached to the tools we use that we sometimes turn their bugs into features. Significance testing is a make-do construction of classical statistical inference, in some sense an indirect way of characterizing the uncertainty surrounding a parameter estimate. The Bayesian approach of attempting to characterize such uncertainty directly, rather than indirectly, and further of characterizing directly, through some function transformation of the parameter in question, the uncertainty surrounding some consequential loss or profit function critical to some real-world decision, is clearly laudable... if it can be justified. Clearly, from a classicist's perspective, the Bayesians have failed at this attempt at justification; otherwise one would have to be a masochist to stick with the sheer torture of classical inferential methods. Besides, the Bayesians indulge not a little in turning bugs into features themselves. At any rate, I say all that to say this: once it is recognized that there is a valid (extended) likelihood calculus, as easy to manipulate as the probability calculus, in attempting a direct characterization of the uncertainty surrounding statistical model parameters, the gap between these two ought to be closed. I'm not holding my breath, as this may take several generations. We all reach for the tool we know how to use, not necessarily for the best tool for the job. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399 [EMAIL PROTECTED] Phone: (765) 494-6054 FAX: (765) 494-0558 Regards, S. F. Thomas = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
I was surprised to see this subject heading on sci.math. I thought it might have to do with the following lyrics (I forget the name of the group and the song): I smoke two joints at two o'clock; I smoke two joints at four. I smoke two joints before I smoke two joints, And then I smoke two more. Given an infinite supply of marijuana, even granting immortality to Cheech and Chong would not make the above feat possible. One would need to have existed for an infinite amount of time. And even then, smoking a joint takes at least one Planck time unit, so if you plot on a time-line the points at which each joint-pair-smoking finishes, there can't be any accumulation points. This would seem to preclude any such feat of pot-smoking . . . unless you somehow exist in a strange temporal topology (e.g., the long line). So then, how much marijuana would one have to smoke to actually change the nature of (one's personal) time in such a way? I'm guessing that no finite amount would suffice, but do not hazard a guess as to the precise cardinality required. | Jim Ferry | Center for Simulation of Advanced Rockets | http://www.uiuc.edu/ph/www/jferry/ | jferry@[delete_this]uiuc.edu | University of Illinois | = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
On Thu, 14 Jun 2001 16:37:02 +0100, Mr Unreliable [EMAIL PROTECTED] wrote: David C. Ullrich wrote in message [EMAIL PROTECTED]... On Thu, 14 Jun 2001 15:22:25 +0100, Paul Jones [EMAIL PROTECTED] wrote: There was some research recently linking heart attacks with Marijuana smoking. [...] Fascinating topic. If this is not actually homework you need to explain the question much more accurately. The data presented may refer to a much-reported study. (See, for example, http://www.eurekalert.org/releases/bidm-bsf022800.html ) To quote from there: The findings are the latest to emerge from a multicenter study of 3,882 patients who survived heart attacks. In this report, 124 people reported using marijuana regularly. Of these, 37 people reported using marijuana within 24 hours of their heart attacks, and nine smoked marijuana within an hour of their heart attacks. Right. Seems to me (although I really know nothing about this sort of thing) that to draw any reliable conclusions (not that _you_'d care about that) we need to know a little more, like what fraction of the people who did _not_ get heart attacks smoke, regularly or otherwise. Note: 124 people... 9 within an hour... And 3882/37 + 37 = 141 MU David C. Ullrich * Sometimes you can have access violations all the time and the program still works. (Michael Caracena, comp.lang.pascal.delphi.misc 5/1/01) = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
Brother! That topic sure drew a crowd! :) Paul Jones wrote: There was some research recently linking heart attacks with Marijuana smoking. [big snip] Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX: (262) 681-1133 email: [EMAIL PROTECTED] web: http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Marijuana
Paul Jones wrote ... So the research says that of a large number of people who had heart attacks at a centre, 124 people had used MJ in the year preceding the HA. Of these 9 reported that they had used MJ in the hour preceding the HA. All MJ users were questioned on the frequency with which they used MJ. Keep in mind that correlation is not the same as causation. That's of particular importance in a study like this one. That is, if people are taking marijuana to treat pain and general discomfort, and if heart attacks are preceded by pain and discomfort, then there will be a strong correlation between marijuana use and later heart attacks, but it won't be proof of causation. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Average Distance to nearest neighbour
Hello! Thank you very much for the answers. But I also want to know the pdf for this distance. And if you know, please give me references to books where I can see this formula. Thanks, John Gerber John Garber [EMAIL PROTECTED] wrote in message I am looking for a solution of the following problem: Assume a square area with sides of length L. N points are randomly distributed within the area. The location of each point is independent of the other points. The location of a point is a uniform random variable - a point is equally likely to be anywhere within the square. Find the expected value of the distance from a randomly selected point to its nearest neighbor. Thanks, John Gerber = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
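For what it's worth, the standard large-N approximation (ignoring edge effects) treats the points as a Poisson process of intensity lambda = N/L^2. Then, for the distance R from a point to its nearest neighbour,

    P(R > r) ~ exp(-lambda*pi*r^2)
    f(r)     ~ 2*pi*lambda*r * exp(-lambda*pi*r^2)
    E[R]     ~ 1/(2*sqrt(lambda)) = L/(2*sqrt(N))

The survival function is just the chance that a disc of radius r around the chosen point contains none of the other points; differentiating it gives the pdf, and integrating r*f(r) gives the mean. One classical reference for this nearest-neighbour distribution (cited from memory, worth verifying) is Clark & Evans, Ecology 35 (1954); edge-corrected versions appear in the spatial-statistics literature.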
Re: please help
Kelly wrote: I have the gage repeatability & reproducibility (gage R&R) analysis done on two instruments. What hypothesis test can I use to test that the repeatability variances (expected sigma values of repeatability) of the two instruments are significantly different from each other, or to say one has a lower variance than the other? Any insight will be greatly appreciated. Thanks in advance for your help. One approach is to form the likelihood function in each case and to eliminate the nuisance parameters (the means) by marginalization. Although it is well known that marginalization by maximization will give misleading answers for both the location and precision of your estimate of the variances, I have shown how another method, based on marginalization by the rule of product-sum, can avoid the problems known to exist with the former. (See _Fuzziness and Probability_ (ACG Press, 1995).) This method also avoids the assumptions of the Bayesian approach -- effectively a method of marginalization by integration -- which have been considered and rejected, and with good reason in my opinion, by those of the classical school. The product-sum method may be relatively easily implemented within an extensible stat package such as R, and I would be happy to apply my implementation of it to your problem if you would send me the two datasets. Essentially, once the nuisance parameters (the one or more means) are eliminated, what is left in each case is the (marginal) likelihood function of the variance, and one could effectively compare directly the plots of the two variance marginal likelihoods and also, if need be, the likelihood function of the difference, to see how different it is from zero. This is not a classicist's answer, but tests of hypothesis and all that can be obviated if the likelihood function can be directly manipulated in the way I describe. This has been the whole point of the Bayesian method, except of course for the inadequate justification provided not only for its insistent subjectiveness, but also for treating model parameters as though they were random variables in their own right. Hope this is helpful. Regards, S. F. Thomas = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
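For completeness, the textbook classical answer to Kelly's question (not Thomas's product-sum method) is a variance-ratio F test on the two sets of repeatability measurements. A minimal sketch, with the usual caveat that the F test for variances is quite sensitive to non-normality:

    import numpy as np
    from scipy import stats

    def variance_ratio_test(x, y):
        # Two-sided F test of equal variances for two independent samples.
        x, y = np.asarray(x, float), np.asarray(y, float)
        f = np.var(x, ddof=1) / np.var(y, ddof=1)
        dfx, dfy = len(x) - 1, len(y) - 1
        tail = stats.f.sf(f, dfx, dfy) if f > 1 else stats.f.cdf(f, dfx, dfy)
        return f, min(1.0, 2 * tail)

For a one-sided question ("instrument A has lower variance than B"), use the single tail probability instead of doubling it.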