Re: [R] Using t tests

2011-07-10 Thread Mike Marchywka


( after getting confirmation of lack of posts try again, LOL )


> From: marchy...@hotmail.com
> To: r-help@r-project.org
> Subject: RE: [R] Using t tests
> Date: Sun, 10 Jul 2011 10:13:51 -0400
>
>
> ( sorry if this is a repost, but I meant to post to the list and never received
> any indication it was sent to the list; thanks, asking for comments about the
> approach to data analysis ).
>
>
> > From: marchy...@hotmail.com
> > To: j...@bitwrit.com.au; gwanme...@aol.com
> > CC: r-help@r-project.org
> > Subject: RE: [R] Using t tests
> > Date: Sun, 10 Jul 2011 07:35:32 -0400
> >
> > > Date: Sat, 9 Jul 2011 18:40:43 +1000
> > > From: j...@bitwrit.com.au
> > > To: gwanme...@aol.com
> > > CC: r-help@r-project.org
> > > Subject: Re: [R] Using t tests
> > >
> > > On 07/08/2011 07:22 PM, gwanme...@aol.com wrote:
> > > > Dear Sir,
> > > >
> > > > I am doing some work on a population of patients. About half of them are
> > > > admitted into hospital with albumin levels less than 33. The other half
> > > > have albumin levels greater than 33, so I stratify them into 2 groups,
> > > > x and y respectively.
> > > >
> > > > I suspect that the average length of stay in hospital for the group of
> > > > patients (x) with albumin levels less than 33 is greater than those with
> > > > albumin levels greater than 33 (y).
> > > >
> > > > What command function do I use (assuming that I will be using the chi
> > > > square test) to show that the length of stay in hospital of those in
> > > > group x is statistically significantly different from those in group y?
> > > >
> > > Hi Ivo,
> > > Just to make things even more complicated for you, Marc's suggestion
> > > that the length_of_stay measure is unlikely to be normally distributed
> > > might lead you to look into a non-parametric test like the Wilcoxon (aka
> >
> > ( please correct any of the following that is wrong, but note that
> > the discussion is more interesting and useful with details of your goals )
> > I'm curious why people still jump to setting arbitrary cutoff points,
> > in this case based on what you happen to have sampled, rather than
> > trying to find a functional relationship between the two continuous
> > variables. Generally the thing that separates likely cause from
> > noise is smoothness, or something you can at least rationalize
> > in terms of physical mechanisms. If your question relates
> > to the reproducibility of a given result ("well, this experiment showed
> > high and low were significantly different on hospital stays, maybe the next
> > experiment will show the same"), you'd probably like to consider
> > the data in relation to possible causes. I'm not sure your disease process
> > would know about your median test result when patients walk in. BTW,
> > what is terminating the hospital stay: cure, death, or insurance exhaustion?
> > This sounds like you are just trying to reproduce something that is already
> > in the literature: the cutoff is on the low side of normal, and
> > hypoproteinemia is often suspected of being bad, so the higher group would
> > usually be expected to do better, no? Although
> > I suppose this could have something to do with dehydration etc., the point
> > of course is that data interpretation is difficult to do in a vacuum.
> >
> > > Mann-Whitney in your case) test. You will have to split your
> > > length_of_stay measure into two like this (assume your data frame is
> > > named "losdf"):
> > >
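> > > # assumes the albumin values are stored in losdf$albumin
> > > albumin_hilo <- losdf$albumin > 33
> > > wilcox.test(losdf$length_of_stay[albumin_hilo],
> > >             losdf$length_of_stay[!albumin_hilo])
> > >
> > > or if you use wilcox_test in the "coin" package:
> > >
> > > library(coin)
> > > # wilcox_test() needs the grouping variable as a factor
> > > losdf$albumin_hilo <- factor(losdf$albumin > 33)
> > > wilcox_test(length_of_stay ~ albumin_hilo, data = losdf)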
> > >
> > > Do remember that the chi-square test is used for categorical variables,
> > > for instance if you dichotomized your length_of_stay into less than 10
> > > days or 10 days and over.
> > >
> > > Jim
> > >
> > > __
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide 
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
>
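
The suggestion above, to look for a functional relationship between albumin and length of stay rather than split at a cutoff, could be sketched roughly as follows. This is only an illustration on simulated data: the data frame losdf, its column names, the seed, and the assumed log-linear relationship are all invented here and are not the poster's data or anyone's recommended model.

```r
# Simulated stand-in data; every number below is made up for illustration.
set.seed(3)
n <- 200
losdf <- data.frame(albumin = rnorm(n, mean = 33, sd = 5))
# Assumption: log(length of stay) falls linearly with albumin, plus noise
losdf$length_of_stay <- exp(3.5 - 0.06 * losdf$albumin + rnorm(n, sd = 0.6))

# Spearman rank correlation: tests for a monotone association
# without choosing a cutoff and without assuming normality
sp <- cor.test(losdf$albumin, losdf$length_of_stay, method = "spearman")
print(sp)

# Linear model on the log scale: the slope is the change in
# log(length of stay) per unit increase in albumin
fit <- lm(log(length_of_stay) ~ albumin, data = losdf)
print(summary(fit)$coefficients)
```

Plotting length_of_stay against albumin before fitting anything is probably the more useful first step, since it shows whether any smooth trend is visible at all.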
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using t tests

2011-07-09 Thread Jim Lemon

On 07/08/2011 07:22 PM, gwanme...@aol.com wrote:

> Dear Sir,
>
> I am doing some work on a population of patients. About half of them are
> admitted into hospital with albumin levels less than 33. The other half have
> albumin levels greater than 33, so I stratify them into 2 groups, x and y
> respectively.
>
> I suspect that the average length of stay in hospital for the group of
> patients (x) with albumin levels less than 33 is greater than those with
> albumin levels greater than 33 (y).
>
> What command function do I use (assuming that I will be using the chi
> square test) to show that the length of stay in hospital of those in group x is
> statistically significantly different from those in group y?


Hi Ivo,
Just to make things even more complicated for you, Marc's suggestion 
that the length_of_stay measure is unlikely to be normally distributed 
might lead you to look into a non-parametric test like the Wilcoxon (aka 
Mann-Whitney in your case) test. You will have to split your 
length_of_stay measure into two like this (assume your data frame is 
named "losdf"):


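# assumes the albumin values are stored in losdf$albumin
albumin_hilo <- losdf$albumin > 33
wilcox.test(losdf$length_of_stay[albumin_hilo],
            losdf$length_of_stay[!albumin_hilo])

or if you use wilcox_test in the "coin" package:

library(coin)
# wilcox_test() needs the grouping variable as a factor
losdf$albumin_hilo <- factor(losdf$albumin > 33)
wilcox_test(length_of_stay ~ albumin_hilo, data = losdf)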
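
For anyone who wants to try the base R wilcox.test call on something runnable, here is a minimal sketch on simulated data. The contents of losdf, the column name albumin, the seed, and the exponential distributions are assumptions made up for illustration; they are not the poster's data.

```r
# Simulated stand-in for the poster's data; all values are invented.
set.seed(42)
n <- 200
losdf <- data.frame(albumin = round(rnorm(n, mean = 33, sd = 5)))
# Right-skewed length of stay in days, longer on average when albumin is low
losdf$length_of_stay <- ceiling(
  rexp(n, rate = ifelse(losdf$albumin > 33, 1 / 5, 1 / 8))
)

# TRUE = albumin above the cutoff, FALSE = at or below it
losdf$albumin_hilo <- losdf$albumin > 33

# Two-sample Wilcoxon rank-sum (Mann-Whitney) test, two-vector interface
res <- wilcox.test(losdf$length_of_stay[losdf$albumin_hilo],
                   losdf$length_of_stay[!losdf$albumin_hilo])
print(res)

# Same test through the formula interface
res_formula <- wilcox.test(length_of_stay ~ albumin_hilo, data = losdf)
print(res_formula)
```

The two calls order the groups differently, so the reported W statistic differs, but the two-sided p-value is the same.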

Do remember that the chi-square test is used for categorical variables, 
for instance if you dichotomized your length_of_stay into less than 10 
days or 10 days and over.
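
A rough sketch of that dichotomized version, again on simulated data (the data frame, column names, seed, and distributions are invented for illustration; only the 33 and 10-day cutoffs come from the thread):

```r
# Simulated stand-in data; all values are invented.
set.seed(1)
n <- 200
losdf <- data.frame(albumin = round(rnorm(n, mean = 33, sd = 5)))
losdf$length_of_stay <- ceiling(
  rexp(n, rate = ifelse(losdf$albumin > 33, 1 / 5, 1 / 8))
)

# Dichotomize both variables into two-level factors
albumin_hilo <- factor(losdf$albumin > 33,
                       levels = c(FALSE, TRUE), labels = c("low", "high"))
los_long <- factor(losdf$length_of_stay >= 10,
                   levels = c(FALSE, TRUE),
                   labels = c("<10 days", ">=10 days"))

# 2 x 2 contingency table of albumin group by length-of-stay category
tab <- table(albumin = albumin_hilo, stay = los_long)
print(tab)

# Chi-square test of independence (Yates correction is the default for 2 x 2)
res <- chisq.test(tab)
print(res)
```

Dichotomizing throws away information in both variables, so this is mainly useful when the categories themselves are what you care about.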


Jim



Re: [R] Using t tests

2011-07-08 Thread Marc Schwartz
Just to add onto Greg's comments, you may want to review this thread over on 
MedStats, since this topic was just discussed extensively this week, initially 
as a query about using LOS as a covariate:

  
http://groups.google.com/group/medstats/browse_thread/thread/f875fdeeaf48dc38?hl=en

It is highly unlikely that LOS is normally distributed.

HTH,

Marc Schwartz

On Jul 8, 2011, at 10:43 AM, Greg Snow wrote:

> How are you measuring length of stay?  A chi-square test suggests that you 
> have it categorized; a t-test assumes it is continuous (and relatively 
> symmetric, with the degree of symmetry needed depending on sample size).
> 
> Do you have any censoring (patients dying or transferring before discharge)? 
> If so, you should look at survival analysis.
> 
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of gwanme...@aol.com
> Sent: Friday, July 08, 2011 3:23 AM
> To: r-help@r-project.org
> Subject: [R] Using t tests
> 
> Dear Sir,
> 
> I am doing some work on a population of patients. About half of them are  
> admitted into hospital with albumin levels less than 33. The other half have  
> albumin levels greater than 33, so I stratify them into 2 groups, x and y  
> respectively.
> 
> I suspect that the average length of stay in hospital for the group of  
> patients (x) with albumin levels less than 33 is greater than those  with 
> albumin levels greater than 33 (y).
> 
> What command function do I use (assuming that I will be using the chi
> square test) to show that the length of stay in hospital of those in group x
> is statistically significantly different from those in group y?
> 
> I look forward to your thoughts.
> 
> Ivo Gwanmesia



Re: [R] Using t tests

2011-07-08 Thread Greg Snow
How are you measuring length of stay?  A chi-square test suggests that you have 
it categorized; a t-test assumes it is continuous (and relatively symmetric, 
with the degree of symmetry needed depending on sample size).
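
As a rough illustration of the t-test case, here is a sketch on simulated, right-skewed data. The data frame losdf, its column names, the seed, and the lognormal distribution are assumptions invented for the example. The log transform is a common device for skewed durations and is my addition, not something proposed in this thread.

```r
# Simulated stand-in data; all values are invented.
set.seed(7)
n <- 200
losdf <- data.frame(albumin = rnorm(n, mean = 33, sd = 5))
losdf$length_of_stay <- rlnorm(
  n, meanlog = ifelse(losdf$albumin > 33, 1.4, 1.8), sdlog = 0.7
)
losdf$albumin_hilo <- factor(losdf$albumin > 33,
                             levels = c(FALSE, TRUE),
                             labels = c("low", "high"))

# Quick look at skewness: a mean well above the median hints at a right tail
print(tapply(losdf$length_of_stay, losdf$albumin_hilo, summary))

# Welch two-sample t-test (R's default, var.equal = FALSE) on the raw scale
t_raw <- t.test(length_of_stay ~ albumin_hilo, data = losdf)
print(t_raw)

# Same test on the log scale, which compares geometric rather than
# arithmetic means
t_log <- t.test(log(length_of_stay) ~ albumin_hilo, data = losdf)
print(t_log)
```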

Do you have any censoring (patients dying or transferring before discharge)? If 
so, you should look at survival analysis.
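
If censoring is present, a minimal sketch of the survival-analysis route could look like the following. It uses the survival package (a recommended package that ships with standard R installations). The data, the column names, the seed, and the 15% censoring rate are all invented for illustration, and treating death as simple censoring is itself an assumption that may not suit every study.

```r
library(survival)

# Simulated stand-in data; all values are invented.
set.seed(11)
n <- 200
losdf <- data.frame(albumin = rnorm(n, mean = 33, sd = 5))
losdf$albumin_hilo <- factor(losdf$albumin > 33,
                             levels = c(FALSE, TRUE),
                             labels = c("low", "high"))
losdf$length_of_stay <- ceiling(
  rexp(n, rate = ifelse(losdf$albumin > 33, 1 / 5, 1 / 8))
)
# discharged = 1 if the stay ended in discharge (the event of interest),
# 0 if it was censored (death or transfer); the 85% rate is arbitrary
losdf$discharged <- rbinom(n, size = 1, prob = 0.85)

# Kaplan-Meier estimate of time to discharge, by albumin group
km <- survfit(Surv(length_of_stay, discharged) ~ albumin_hilo, data = losdf)
print(km)

# Log-rank test for a difference between the two curves
lr <- survdiff(Surv(length_of_stay, discharged) ~ albumin_hilo, data = losdf)
print(lr)
```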

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of gwanme...@aol.com
Sent: Friday, July 08, 2011 3:23 AM
To: r-help@r-project.org
Subject: [R] Using t tests

Dear Sir,
 
I am doing some work on a population of patients. About half of them are  
admitted into hospital with albumin levels less than 33. The other half have  
albumin levels greater than 33, so I stratify them into 2 groups, x and y  
respectively.
 
I suspect that the average length of stay in hospital for the group of  
patients (x) with albumin levels less than 33 is greater than those  with 
albumin levels greater than 33 (y).
 
What command function do I use (assuming that I will be using the chi  
square test) to show that the length of stay in hospital of those in group x is 
statistically significantly different from those in group y?
 
I look forward to your thoughts.
 
Ivo Gwanmesia



[R] Using t tests

2011-07-08 Thread Gwanmesia
Dear Sir,
 
I am doing some work on a population of patients. About half of them are  
admitted into hospital with albumin levels less than 33. The other half have  
albumin levels greater than 33, so I stratify them into 2 groups, x and y  
respectively.
 
I suspect that the average length of stay in hospital for the group of  
patients (x) with albumin levels less than 33 is greater than those  with 
albumin levels greater than 33 (y).
 
What command function do I use (assuming that I will be using the chi  
square test) to show that the length of stay in hospital of those in group x is 
statistically significantly different from those in group y?
 
I look forward to your thoughts.
 
Ivo Gwanmesia
