[R] paired t-test with bootstrap

2004-07-13 Thread Vito Ricci
Hi

On 13 Jul 2004 at 12:28, luciana wrote:

> Dear Sirs,
>
> I have a sample of diabetic people, matched (by age and sex) with a
> control sample. The variable I would like to compare is their monthly
> drug and hospital cost. The cost variable has a distribution that is
> very far from Gaussian, but I need some way to compare the means of
> the two groups. So, in the specific case of a paired-sample t-test,
> I aim to test whether the difference in cost is close to 0. What is
> the best way to do that?

I suggest looking at:

?pairwise.wilcox.test
?wilcox.test

that is, using non-parametric tests instead of the t-test.
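
A minimal sketch of the paired non-parametric test, assuming two
hypothetical vectors cost_case and cost_ctrl whose i-th elements belong
to the same matched pair (the data below are simulated, not taken from
the original post). Note that the signed-rank test concerns the location
(pseudomedian) of the paired differences rather than their mean, so it
answers a slightly different question from the paired t-test.

```r
# Simulated, skewed monthly costs for 30 matched pairs (illustrative only).
set.seed(1)
cost_ctrl <- rlnorm(30, meanlog = 5, sdlog = 1)
cost_case <- cost_ctrl * rlnorm(30, meanlog = 0.3, sdlog = 0.5)

# Paired Wilcoxon signed-rank test on the within-pair differences.
# conf.int = TRUE also returns an estimate of the pseudomedian difference.
res <- wilcox.test(cost_case, cost_ctrl, paired = TRUE, conf.int = TRUE)
print(res)
```

The p-value is in res$p.value and the interval for the pseudomedian
difference in res$conf.int.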

Cordially
Vito



__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] paired t-test with bootstrap

2004-07-13 Thread Marc Schwartz
On Tue, 2004-07-13 at 07:28, Petr Pikal wrote:
> Hi
> 
> On 13 Jul 2004 at 12:28, luciana wrote:
> 
> > Dear Sirs,
> > 
> > I am a beginning R user. By means of R I would like to apply the
> > bootstrap to my data in order to test cost differences between
> > independent or paired samples of people affected by a certain
> > disease.
> >
> > My problem is that, even though I am reading the book by Efron
> > (An Introduction to the Bootstrap), looking at the examples on the
> > internet and those available in R, and learning a lot of theory
> > about the bootstrap, I can't apply the bootstrap to my data in R
> > because of many doubts and difficulties. This is why I have decided
> > to ask the experts for help.
> >
> > I have a sample of diabetic people, matched (by age and sex) with a
> > control sample. The variable I would like to compare is their monthly
> > drug and hospital cost. The cost variable has a distribution that is
> > very far from Gaussian, but I need some way to compare the means of
> > the two groups. So, in the specific case of a paired-sample t-test,
> > I aim to test whether the difference in cost is close to 0. What is
> > the best way to do that?
> >
> > Another question is that sometimes I have missing data in my dataset
> > (for example, I have the cost for a patient but not for a control).
> > If I enter NA or a dot, R doesn't estimate the statistic I need (for
> > instance the mean). To overcome this problem I have replaced the
> > missing data with the mean computed from the remaining data. However,
> > I think R can actually compute the mean even in the presence of
> > missing data. Is that right? What can I do?
> 
> your.statistic(your.data, na.rm = TRUE)
> 
> e.g.
> mean(your.data, na.rm = TRUE)
> 
> or see ?na.action, e.g. mean(na.omit(your.data))
> 
> Cheers
> Petr Pikal


A couple of other thoughts here with respect to the use of a paired
t-test for the comparison.

As Luciana notes above, cost data is typically highly skewed, raising
doubt as to the use of a simple parametric test to compare the two
groups.

One of the many reasons such data is skewed is that there are notable
differences in the populations that are not accounted for when using
simple characteristics for matching as is done here. What makes a
patient an "outlier" with respect to cost and how does the distribution
of these patients differ between the two groups and the individual
pairs?

For example, are all the patients in both groups insulin dependent or
are some controlled with oral agents or diet alone? If all are using
insulin, are some using self-administered injections while others are
using implanted infusion pumps? What is the interval from disease onset?
Have any had pancreas or islet cell transplants? Do the matched patients
have similar diabetes-related sequelae, such as diabetic retinopathy,
neuropathy, vasculopathy, renal dysfunction and others? If not, the
costs to treat these other issues, such as dialysis and wound care
alone, can dramatically alter the cost profile for patients even when
matched by age and gender.

If you are not accounting for these issues (e.g. through
inclusion/exclusion criteria), you risk significant challenges to your
conclusions with respect to the comparison of costs for these two
groups. I would raise similar concerns about using a sample mean as the
imputed value for missing data.
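
One well-known side effect of mean imputation can be seen in a small
sketch on simulated data (the vector cost and the number of missing
values are hypothetical, chosen only for illustration). Replacing
missing values with the sample mean leaves the mean unchanged but
shrinks the standard deviation, so a test run on the imputed data would
tend to understate the uncertainty.

```r
set.seed(2)
cost <- rlnorm(40, meanlog = 5, sdlog = 1)   # skewed, cost-like values
cost[sample(40, 10)] <- NA                   # make 10 values missing

observed <- cost[!is.na(cost)]
imputed  <- ifelse(is.na(cost), mean(observed), cost)  # mean imputation

# The mean is unchanged, but the spread is artificially reduced.
print(c(mean_observed = mean(observed), mean_imputed = mean(imputed)))
print(c(sd_observed = sd(observed), sd_imputed = sd(imputed)))
```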

If you have not done so already, a Medline search of the literature
would be in order to better understand what others have done in this
area for diabetic treatment costs and the pros and cons of their
respective approaches. I suspect that others here will have additional
recommendations.

HTH,

Marc Schwartz



Re: [R] paired t-test with bootstrap

2004-07-13 Thread Christoph Lehmann
Just a hint for further bootstrapping examples (worked out in R):
"Bootstrap Methods and Their Application" by A.C. Davison and D.V. Hinkley
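
As a starting point before turning to the book, here is a minimal base-R
sketch of a bootstrap for the mean paired difference. The data are
simulated and the names (cost_case, cost_ctrl, B) are hypothetical; the
percentile interval is only one of several bootstrap intervals, chosen
here for simplicity. The boot package, which accompanies the book,
provides boot() and boot.ci() for more refined intervals.

```r
set.seed(3)
n <- 30
cost_ctrl <- rlnorm(n, meanlog = 5, sdlog = 1)
cost_case <- cost_ctrl * rlnorm(n, meanlog = 0.3, sdlog = 0.5)

d <- cost_case - cost_ctrl   # within-pair differences
B <- 2000                    # number of bootstrap resamples

# Resample whole pairs (i.e. their differences) with replacement and
# recompute the mean difference each time.
boot_means <- replicate(B, mean(sample(d, replace = TRUE)))

# 95% percentile interval for the mean paired difference.
ci <- quantile(boot_means, c(0.025, 0.975))
print(mean(d))
print(ci)
```

If the interval excludes 0, that suggests a nonzero mean difference at
roughly the 5% level; with very skewed costs and few pairs the
percentile interval can be inaccurate, so treat it as a first look.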
cheers
christoph


Re: [R] paired t-test with bootstrap

2004-07-13 Thread Petr Pikal
Hi

On 13 Jul 2004 at 12:28, luciana wrote:

> Dear Sirs,
> 
> I am a beginning R user. By means of R I would like to apply the
> bootstrap to my data in order to test cost differences between
> independent or paired samples of people affected by a certain
> disease.
>
> My problem is that, even though I am reading the book by Efron
> (An Introduction to the Bootstrap), looking at the examples on the
> internet and those available in R, and learning a lot of theory
> about the bootstrap, I can't apply the bootstrap to my data in R
> because of many doubts and difficulties. This is why I have decided
> to ask the experts for help.
>
> I have a sample of diabetic people, matched (by age and sex) with a
> control sample. The variable I would like to compare is their monthly
> drug and hospital cost. The cost variable has a distribution that is
> very far from Gaussian, but I need some way to compare the means of
> the two groups. So, in the specific case of a paired-sample t-test,
> I aim to test whether the difference in cost is close to 0. What is
> the best way to do that?
>
> Another question is that sometimes I have missing data in my dataset
> (for example, I have the cost for a patient but not for a control).
> If I enter NA or a dot, R doesn't estimate the statistic I need (for
> instance the mean). To overcome this problem I have replaced the
> missing data with the mean computed from the remaining data. However,
> I think R can actually compute the mean even in the presence of
> missing data. Is that right? What can I do?

your.statistic(your.data, na.rm = TRUE)

e.g.
mean(your.data, na.rm = TRUE)

or see ?na.action, e.g. mean(na.omit(your.data))
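
For paired data, a sketch of how this might look, assuming two
hypothetical vectors cost_case and cost_ctrl with NA marking a missing
cost (the numbers are made up). A pair contributes a difference only if
both members are observed; complete.cases() makes that explicit.

```r
cost_case <- c(120, 340, NA, 80, 560)
cost_ctrl <- c(100, NA, 90, 60, 200)

mean(cost_case)                 # NA: missing values propagate by default
mean(cost_case, na.rm = TRUE)   # 275: mean of the four observed values

# Keep only pairs where both the case and the control are observed.
ok <- complete.cases(cost_case, cost_ctrl)
d  <- cost_case[ok] - cost_ctrl[ok]   # differences for complete pairs
mean(d)                               # 400 / 3, from pairs 1, 4 and 5
```

Dropping incomplete pairs this way avoids having to fill in the missing
costs by hand.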

Cheers
Petr Pikal



Petr Pikal
[EMAIL PROTECTED]



[R] paired t-test with bootstrap

2004-07-13 Thread luciana
Dear Sirs,

I am a beginning R user. By means of R I would like to apply the
bootstrap to my data in order to test cost differences between
independent or paired samples of people affected by a certain
disease.

My problem is that, even though I am reading the book by Efron
(An Introduction to the Bootstrap), looking at the examples on the
internet and those available in R, and learning a lot of theory
about the bootstrap, I can't apply the bootstrap to my data in R
because of many doubts and difficulties. This is why I have decided
to ask the experts for help.

I have a sample of diabetic people, matched (by age and sex) with a
control sample. The variable I would like to compare is their monthly
drug and hospital cost. The cost variable has a distribution that is
very far from Gaussian, but I need some way to compare the means of
the two groups. So, in the specific case of a paired-sample t-test,
I aim to test whether the difference in cost is close to 0. What is
the best way to do that?

Another question is that sometimes I have missing data in my dataset
(for example, I have the cost for a patient but not for a control).
If I enter NA or a dot, R doesn't estimate the statistic I need (for
instance the mean). To overcome this problem I have replaced the
missing data with the mean computed from the remaining data. However,
I think R can actually compute the mean even in the presence of
missing data. Is that right? What can I do?

Thank you very much for your attention and, I hope, your help.

Best wishes

Luciana Scalone
Center of Pharmacoeconomics
University of Milan
