Re: [R] Using t tests
(Trying again after confirming that the earlier post did not appear.)

> From: marchy...@hotmail.com
> To: r-help@r-project.org
> Subject: RE: [R] Using t tests
> Date: Sun, 10 Jul 2011 10:13:51 -0400
>
> (Sorry if this is a repost, but I meant to post to the list and never
> received any indication that it was sent. I am asking for comments
> about the approach to data analysis.)
>
> > From: marchy...@hotmail.com
> > To: j...@bitwrit.com.au; gwanme...@aol.com
> > CC: r-help@r-project.org
> > Subject: RE: [R] Using t tests
> > Date: Sun, 10 Jul 2011 07:35:32 -0400
> >
> > > Date: Sat, 9 Jul 2011 18:40:43 +1000
> > > From: j...@bitwrit.com.au
> > > To: gwanme...@aol.com
> > > CC: r-help@r-project.org
> > > Subject: Re: [R] Using t tests
> > >
> > > On 07/08/2011 07:22 PM, gwanme...@aol.com wrote:
> > > > Dear Sir,
> > > >
> > > > I am doing some work on a population of patients. About half of
> > > > them are admitted into hospital with albumin levels less than 33.
> > > > The other half have albumin levels greater than 33, so I stratify
> > > > them into 2 groups, x and y respectively.
> > > >
> > > > I suspect that the average length of stay in hospital for the
> > > > group of patients (x) with albumin levels less than 33 is greater
> > > > than those with albumin levels greater than 33 (y).
> > > >
> > > > What command function do I use (assuming that I will be using the
> > > > chi square test) to show that the length of stay in hospital of
> > > > those in group x is statistically significantly different from
> > > > those in group y?
> > > >
> > > Hi Ivo,
> > > Just to make things even more complicated for you, Mark's
> > > suggestion that the length_of_stay measure is unlikely to be
> > > normally distributed might lead you to look into a non-parametric
> > > test like the Wilcoxon (aka Mann-Whitney in your case) test.
> >
> > (Please correct anything in the following that is wrong, but note
> > that the discussion is more interesting and useful with details of
> > your goals.)
> >
> > I'm curious why people still jump to setting arbitrary cutoff
> > points, in this case based on what you happen to have sampled,
> > rather than trying to find a functional relationship between the
> > two variables. Generally, the thing that separates a likely cause
> > from noise is smoothness, or something you can at least rationalize
> > in terms of physical mechanisms. If your question relates to the
> > reproducibility of a given result ("well, this experiment showed
> > that high and low were significantly different on hospital stays,
> > maybe the next experiment will show the same"), you would probably
> > like to consider the data in relation to possible causes. I'm not
> > sure your disease process would know about your median test result
> > when patients walk in.
> >
> > BTW, what is terminating the hospital stay: cure, death, or
> > insurance exhaustion? This sounds like you are just trying to
> > reproduce something that is already in the literature: the cutoff
> > is on the low side of normal, and hypoproteinemia is often
> > suspected of being bad, so the higher group would usually be
> > expected to do better, no? I suppose this could have something to
> > do with dehydration etc., but the point of course is that data
> > interpretation is difficult to do in a vacuum.
> >
> > > You will have to split your length_of_stay measure into two like
> > > this (assume your data frame is named "losdf" and has columns
> > > albumin and length_of_stay):
> > >
> > >   albumin_hilo <- losdf$albumin > 33
> > >   wilcox.test(losdf$length_of_stay[albumin_hilo],
> > >               losdf$length_of_stay[!albumin_hilo])
> > >
> > > or if you use wilcox_test in the "coin" package (which needs the
> > > grouping variable to be a factor):
> > >
> > >   losdf$albumin_hilo <- factor(losdf$albumin > 33)
> > >   wilcox_test(length_of_stay ~ albumin_hilo, data = losdf)
> > >
> > > Do remember that the chi-square test is used for categorical
> > > variables, for instance if you dichotomized your length_of_stay
> > > into less than 10 days or 10 days and over.
> > >
> > > Jim

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
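The suggestion above to look for a functional relationship rather than an arbitrary cutoff could be sketched roughly as follows. This is only an illustration on simulated data: the variable names, the sample size, and the assumed log-linear dependence of length of stay on albumin are all hypothetical and do not come from the poster's data.

```r
# All data below are simulated; names and parameters are illustrative assumptions.
set.seed(3)
n <- 200
albumin <- rnorm(n, mean = 33, sd = 5)

# Hypothetical relationship: log length of stay falls linearly as albumin rises
length_of_stay <- exp(2.3 - 0.04 * (albumin - 33) + rnorm(n, sd = 0.6))

# Rank-based association between the two continuous variables (no cutoff needed)
sp <- cor.test(albumin, length_of_stay, method = "spearman")
print(sp)

# Linear model on the log scale, which is often more reasonable for skewed durations
fit <- lm(log(length_of_stay) ~ albumin)
print(summary(fit)$coefficients)
```

Treating albumin as continuous uses all of the information in the measurement, whereas splitting at 33 treats a patient at 32.9 and one at 20 as equivalent.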
Re: [R] Using t tests
On 07/08/2011 07:22 PM, gwanme...@aol.com wrote:
> Dear Sir,
>
> I am doing some work on a population of patients. About half of them
> are admitted into hospital with albumin levels less than 33. The other
> half have albumin levels greater than 33, so I stratify them into 2
> groups, x and y respectively.
>
> I suspect that the average length of stay in hospital for the group of
> patients (x) with albumin levels less than 33 is greater than those
> with albumin levels greater than 33 (y).
>
> What command function do I use (assuming that I will be using the chi
> square test) to show that the length of stay in hospital of those in
> group x is statistically significantly different from those in group y?

Hi Ivo,
Just to make things even more complicated for you, Mark's suggestion
that the length_of_stay measure is unlikely to be normally distributed
might lead you to look into a non-parametric test like the Wilcoxon (aka
Mann-Whitney in your case) test. You will have to split your
length_of_stay measure into two like this (assume your data frame is
named "losdf" and has columns albumin and length_of_stay):

  albumin_hilo <- losdf$albumin > 33
  wilcox.test(losdf$length_of_stay[albumin_hilo],
              losdf$length_of_stay[!albumin_hilo])

or if you use wilcox_test in the "coin" package (which needs the
grouping variable to be a factor):

  losdf$albumin_hilo <- factor(losdf$albumin > 33)
  wilcox_test(length_of_stay ~ albumin_hilo, data = losdf)

Do remember that the chi-square test is used for categorical variables,
for instance if you dichotomized your length_of_stay into less than 10
days or 10 days and over.

Jim
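For readers who want to try the two approaches Jim describes, here is a minimal self-contained sketch using base R only. The data frame losdf is simulated, and the column names albumin_group and long_stay, the sample size, and the distributions are assumptions made purely for illustration.

```r
# Simulated stand-in for the poster's data; every value here is hypothetical.
set.seed(1)
n <- 200
losdf <- data.frame(albumin = rnorm(n, mean = 33, sd = 5))

# Two groups split at 33, as in the original question
losdf$albumin_group <- factor(ifelse(losdf$albumin > 33, "high", "low"),
                              levels = c("low", "high"))

# Right-skewed stays in whole days, longer on average in the low-albumin group
losdf$length_of_stay <- ceiling(
  rexp(n, rate = ifelse(losdf$albumin_group == "low", 1 / 12, 1 / 8))
)

# Wilcoxon rank-sum (Mann-Whitney) test via the formula interface
w <- wilcox.test(length_of_stay ~ albumin_group, data = losdf)
print(w)

# Chi-square test only applies once length of stay is made categorical,
# here dichotomized at 10 days as in Jim's example
losdf$long_stay <- losdf$length_of_stay >= 10
tab <- table(group = losdf$albumin_group, long_stay = losdf$long_stay)
print(tab)
ch <- chisq.test(tab)
print(ch)
```

The formula interface avoids subsetting the vector by hand, which is where the variable-name typo in hand-written subsetting code tends to creep in.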
Re: [R] Using t tests
Just to add onto Greg's comments, you may want to review this thread
over on MedStats, since this topic was just discussed extensively this
week, initially as a query about using length of stay (LOS) as a
covariate:

http://groups.google.com/group/medstats/browse_thread/thread/f875fdeeaf48dc38?hl=en

It is highly unlikely that LOS is normally distributed.

HTH,

Marc Schwartz

On Jul 8, 2011, at 10:43 AM, Greg Snow wrote:

> How are you measuring length of stay? A chi-square test suggests that
> you have it categorized, a t-test assumes it is continuous (and
> relatively symmetric with the amount depending on sample size).
>
> Do you have any censoring? (patients dying or transferring before
> discharge) if so you should look at survival analysis.
>
> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of gwanme...@aol.com
> Sent: Friday, July 08, 2011 3:23 AM
> To: r-help@r-project.org
> Subject: [R] Using t tests
>
> Dear Sir,
>
> I am doing some work on a population of patients. About half of them
> are admitted into hospital with albumin levels less than 33. The other
> half have albumin levels greater than 33, so I stratify them into 2
> groups, x and y respectively.
>
> I suspect that the average length of stay in hospital for the group of
> patients (x) with albumin levels less than 33 is greater than those
> with albumin levels greater than 33 (y).
>
> What command function do I use (assuming that I will be using the chi
> square test) to show that the length of stay in hospital of those in
> group x is statistically significantly different from those in group y?
>
> I look forward to your thoughts.
>
> Ivo Gwanmesia
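One quick way to see why length of stay is rarely treated as normal is to compare the mean with the median and to run a normality test on the raw and log scales. The sketch below does this on simulated values; the log-normal distribution and its parameters are assumptions chosen for illustration, not a claim about any real hospital data.

```r
# Hypothetical length-of-stay values in whole days, drawn from a log-normal
set.seed(4)
los <- ceiling(rlnorm(150, meanlog = 2, sdlog = 0.8))

# In a right-skewed sample the long tail pulls the mean above the median
print(c(mean = mean(los), median = median(los)))

# Shapiro-Wilk normality test on the raw scale and on the log scale
sw_raw <- shapiro.test(los)
sw_log <- shapiro.test(log(los))
print(sw_raw)
print(sw_log)
```

A formal test is only one check; a histogram or normal quantile plot of the actual data is usually more informative than a single p-value.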
Re: [R] Using t tests
How are you measuring length of stay? A chi-square test suggests that
you have it categorized; a t-test assumes it is continuous (and
relatively symmetric, with the degree of symmetry needed depending on
sample size).

Do you have any censoring (patients dying or transferring before
discharge)? If so, you should look at survival analysis.

-----Original Message-----
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of gwanme...@aol.com
Sent: Friday, July 08, 2011 3:23 AM
To: r-help@r-project.org
Subject: [R] Using t tests

Dear Sir,

I am doing some work on a population of patients. About half of them
are admitted into hospital with albumin levels less than 33. The other
half have albumin levels greater than 33, so I stratify them into 2
groups, x and y respectively.

I suspect that the average length of stay in hospital for the group of
patients (x) with albumin levels less than 33 is greater than those
with albumin levels greater than 33 (y).

What command function do I use (assuming that I will be using the chi
square test) to show that the length of stay in hospital of those in
group x is statistically significantly different from those in group y?

I look forward to your thoughts.

Ivo Gwanmesia
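If some stays end in death or transfer rather than discharge, the survival-analysis route mentioned above might look something like the following sketch. It uses the survival package (a recommended package included with standard R installations) on simulated data; the variable names, the censoring mechanism, and the rates are all hypothetical, and treating death as simple censoring is itself an assumption that a competing-risks analysis might replace.

```r
library(survival)

# Simulated data; every name and parameter below is an illustrative assumption.
set.seed(2)
n <- 200
albumin <- rnorm(n, mean = 33, sd = 5)
group <- factor(ifelse(albumin > 33, "high", "low"))

# Hypothetical time to discharge, longer on average for the low-albumin group
time_to_discharge <- rexp(n, rate = ifelse(group == "low", 1 / 12, 1 / 8))

# Hypothetical time at which the stay would be cut short by death or transfer
time_to_censor <- rexp(n, rate = 1 / 40)

# Observed length of stay and event indicator (1 = discharge was observed)
los <- pmin(time_to_discharge, time_to_censor)
discharged <- as.integer(time_to_discharge <= time_to_censor)

# Kaplan-Meier estimates of time to discharge by group
fit <- survfit(Surv(los, discharged) ~ group)
print(fit)

# Log-rank test comparing the two groups
lr <- survdiff(Surv(los, discharged) ~ group)
print(lr)
```

Unlike a t-test or rank test on the raw stays, this keeps the censored patients in the analysis for as long as they were observed instead of dropping them or treating their truncated stay as complete.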
[R] Using t tests
Dear Sir,

I am doing some work on a population of patients. About half of them
are admitted into hospital with albumin levels less than 33. The other
half have albumin levels greater than 33, so I stratify them into 2
groups, x and y respectively.

I suspect that the average length of stay in hospital for the group of
patients (x) with albumin levels less than 33 is greater than those
with albumin levels greater than 33 (y).

What command function do I use (assuming that I will be using the chi
square test) to show that the length of stay in hospital of those in
group x is statistically significantly different from those in group y?

I look forward to your thoughts.

Ivo Gwanmesia