[R] paired t-test with bootstrap
Hi

On 13 Jul 2004 at 12:28, luciana wrote:

> Dear Sirs,
>
> I have a sample of diabetic people, matched (by age and sex) with a control sample. The variable I would like to compare is their monthly drug and hospital cost. The cost variable has a distribution very far from Gaussian, but I need some way to compare the mean between the two groups. So, in the specific case of a paired-sample t-test, I aim at testing whether the difference in cost is close to 0. What is the best way to do that?

I suggest looking at:

?pairwise.wilcox.test
?wilcox.test

i.e. using non-parametric tests instead of the t-test.

Cordially
Vito

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
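A minimal sketch of the non-parametric suggestion above, assuming the two cost vectors are stored in the same pair order. The names cost_diabetic and cost_control and the simulated values are hypothetical, not the poster's data. Note that the signed-rank test addresses the location (pseudomedian) of the within-pair differences rather than their mean, so it may not answer the question as posed if a comparison of means is really required.

```r
# Hypothetical paired data: monthly cost for each diabetic patient and the
# matched control (simulated and skewed; not the poster's data).
set.seed(1)
n <- 30
cost_control  <- rlnorm(n, meanlog = 5, sdlog = 1)
cost_diabetic <- cost_control * rlnorm(n, meanlog = 0.3, sdlog = 0.5)

# Wilcoxon signed-rank test on the within-pair differences.
# conf.int = TRUE also returns an estimate and interval for the pseudomedian.
res <- wilcox.test(cost_diabetic, cost_control, paired = TRUE, conf.int = TRUE)
print(res)
```

pairwise.wilcox.test() is mainly of interest when more than two groups are compared and the p-values need adjusting for multiple testing; with a single diabetic/control comparison, wilcox.test() alone should be enough.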
Re: [R] paired t-test with bootstrap
On Tue, 2004-07-13 at 07:28, Petr Pikal wrote:
> Hi
>
> On 13 Jul 2004 at 12:28, luciana wrote:
>
> > Dear Sirs,
> >
> > I am a beginning R user: by means of R I would like to apply the bootstrap to my data in order to test cost differences between independent or paired samples of people affected by a certain disease.
> >
> > My problem is that even though I am reading the book by Efron (An Introduction to the Bootstrap), looking at the examples on the internet and those available in R, and learning a lot of theory about the bootstrap, I can't apply the bootstrap with R to my data because of many doubts and difficulties. This is the reason why I have decided to ask the experts for help.
> >
> > I have a sample of diabetic people, matched (by age and sex) with a control sample. The variable I would like to compare is their monthly drug and hospital cost. The cost variable has a distribution very far from Gaussian, but I need some way to compare the mean between the two groups. So, in the specific case of a paired-sample t-test, I aim at testing whether the difference in cost is close to 0. What is the best way to do that?
> >
> > Another question is that sometimes I have missing data in my dataset (for example, I have the cost for a patient but not for a control). If I introduce NA or a dot, R doesn't estimate the statistic I need (for instance the mean). To overcome this problem I have replaced the missing data with the mean computed from the remaining data. Anyway, I think R can actually compute the mean even in the presence of missing data. Is that right? What can I do?
>
> your.statistic(your.data, na.rm=TRUE)
>
> e.g.
> mean(your.data, na.rm=TRUE)
>
> or look at ?na.action, e.g. mean(na.omit(your.data))
>
> Cheers
> Petr Pikal

A couple of other thoughts here with respect to the use of a paired t-test for the comparison.
As Luciana notes above, cost data is typically highly skewed, raising doubt as to the use of a simple parametric test to compare the two groups.

One of the many reasons such data is skewed is that there are notable differences in the populations that are not accounted for when using simple characteristics for matching, as is done here. What makes a patient an "outlier" with respect to cost, and how does the distribution of these patients differ between the two groups and within the individual pairs?

For example, are all the patients in both groups insulin dependent, or are some controlled with oral agents or diet alone? If all are using insulin, are some using self-administered injections while others are using implanted infusion pumps? What is the interval from disease onset? Have any had pancreas or islet cell transplants?

Do the matched patients have similar diabetes-related sequelae such as diabetic retinopathy, neuropathy, vasculopathy, renal dysfunction and others? If not, the costs to treat these other issues, such as dialysis and wound care alone, can dramatically alter the cost profile for patients even when matched by age and gender.

If you are not considering these issues (e.g. through inclusion/exclusion criteria), you risk significant challenges to your conclusions with respect to the comparison of costs for these two groups.

I would raise similar concerns about using a sample mean as the imputed value for missing data.

If you have not done so already, a Medline search of the literature would be in order to better understand what others have done in this area for diabetic treatment costs and the pros and cons of their respective approaches.

I suspect that others here will have additional recommendations.

HTH,

Marc Schwartz
Re: [R] paired t-test with bootstrap
Just a hint for further bootstrapping examples (worked out with R): "Bootstrap Methods and their Application" by A.C. Davison and D.V. Hinkley.

cheers
christoph
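The functions accompanying the Davison and Hinkley book are available in R as the boot package. A minimal sketch of how they might be applied to the paired design in this thread, assuming one cost difference per matched pair; the data are simulated, and the names control, diabetic and mean_diff are hypothetical. If the resulting interval for the mean difference excludes 0, that is evidence against "no difference in mean cost".

```r
library(boot)  # recommended package distributed with R

set.seed(42)
n <- 40
# Hypothetical, simulated skewed costs; replace with the real paired data.
control  <- rlnorm(n, meanlog = 5, sdlog = 1)
diabetic <- control + rlnorm(n, meanlog = 4, sdlog = 1) - 30
d <- diabetic - control   # one difference per matched pair

# Statistic in the form boot() expects: the data first, then resampled indices.
mean_diff <- function(x, idx) mean(x[idx])

# Resample whole pairs (i.e. their differences) with replacement.
b  <- boot(data = d, statistic = mean_diff, R = 2000)

# Percentile and BCa confidence intervals for the mean difference.
ci <- boot.ci(b, conf = 0.95, type = c("perc", "bca"))
print(ci)
```

Resampling the differences keeps each patient together with the matched control, which is what makes this a paired analysis; for independent samples, each group would be resampled separately instead (see the strata argument of boot()).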
Re: [R] paired t-test with bootstrap
Hi

On 13 Jul 2004 at 12:28, luciana wrote:

> Dear Sirs,
>
> I am a beginning R user: by means of R I would like to apply the bootstrap to my data in order to test cost differences between independent or paired samples of people affected by a certain disease.
>
> My problem is that even though I am reading the book by Efron (An Introduction to the Bootstrap), looking at the examples on the internet and those available in R, and learning a lot of theory about the bootstrap, I can't apply the bootstrap with R to my data because of many doubts and difficulties. This is the reason why I have decided to ask the experts for help.
>
> I have a sample of diabetic people, matched (by age and sex) with a control sample. The variable I would like to compare is their monthly drug and hospital cost. The cost variable has a distribution very far from Gaussian, but I need some way to compare the mean between the two groups. So, in the specific case of a paired-sample t-test, I aim at testing whether the difference in cost is close to 0. What is the best way to do that?
>
> Another question is that sometimes I have missing data in my dataset (for example, I have the cost for a patient but not for a control). If I introduce NA or a dot, R doesn't estimate the statistic I need (for instance the mean). To overcome this problem I have replaced the missing data with the mean computed from the remaining data. Anyway, I think R can actually compute the mean even in the presence of missing data. Is that right? What can I do?

your.statistic(your.data, na.rm=TRUE)

e.g.
mean(your.data, na.rm=TRUE)

or look at ?na.action, e.g. mean(na.omit(your.data))

Cheers
Petr Pikal

> Thank you very much for your attention and, I hope, your help.
>
> Best wishes
>
> Luciana Scalone
> Center of Pharmacoeconomics
> University of Milan

Petr Pikal
[EMAIL PROTECTED]
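A small sketch of the na.rm advice above applied to paired data, using made-up numbers (the vectors diabetic and control are hypothetical). It illustrates one point worth keeping in mind: dropping NAs separately in each group compares means computed over different sets of pairs, whereas a paired analysis should use only the pairs in which both costs are observed.

```r
# Hypothetical paired costs with missing values (NA); not real data.
diabetic <- c(120, 340,  NA,  95, 410, 230)
control  <- c( 80, 150,  60,  NA, 200, 210)

# Dropping NAs separately per group: each mean uses a different set of pairs.
mean(diabetic, na.rm = TRUE)   # 239
mean(control,  na.rm = TRUE)   # 140

# For a paired comparison, keep only pairs where both costs are observed.
ok <- complete.cases(diabetic, control)
d  <- diabetic[ok] - control[ok]
mean(d)                        # 115

# Equivalent shortcut: the difference is NA whenever either member is NA.
mean(diabetic - control, na.rm = TRUE)   # 115
```

Here the difference of the two separately computed means (239 - 140 = 99) does not equal the mean paired difference (115), because the two group means are based on different patients.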
[R] paired t-test with bootstrap
Dear Sirs,

I am a beginning R user: by means of R I would like to apply the bootstrap to my data in order to test cost differences between independent or paired samples of people affected by a certain disease.

My problem is that even though I am reading the book by Efron (An Introduction to the Bootstrap), looking at the examples on the internet and those available in R, and learning a lot of theory about the bootstrap, I can't apply the bootstrap with R to my data because of many doubts and difficulties. This is the reason why I have decided to ask the experts for help.

I have a sample of diabetic people, matched (by age and sex) with a control sample. The variable I would like to compare is their monthly drug and hospital cost. The cost variable has a distribution very far from Gaussian, but I need some way to compare the mean between the two groups. So, in the specific case of a paired-sample t-test, I aim at testing whether the difference in cost is close to 0. What is the best way to do that?

Another question is that sometimes I have missing data in my dataset (for example, I have the cost for a patient but not for a control). If I introduce NA or a dot, R doesn't estimate the statistic I need (for instance the mean). To overcome this problem I have replaced the missing data with the mean computed from the remaining data. Anyway, I think R can actually compute the mean even in the presence of missing data. Is that right? What can I do?

Thank you very much for your attention and, I hope, your help.

Best wishes

Luciana Scalone
Center of Pharmacoeconomics
University of Milan
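For the paired question above, one common way to turn the bootstrap into a test of "mean difference = 0" can be sketched in base R as follows. This is only an illustration under stated assumptions: the differences d are simulated, the helper t_stat is hypothetical, and the approach shown (centering the differences to impose the null hypothesis, then resampling a t-type statistic) is one option among several rather than the definitive method.

```r
set.seed(123)
n <- 40
# Hypothetical skewed paired differences (simulated; not real cost data).
d <- rlnorm(n, meanlog = 4, sdlog = 1) - rlnorm(n, meanlog = 4, sdlog = 1)

# Studentized mean: the usual one-sample / paired t statistic.
t_stat <- function(x) mean(x) / (sd(x) / sqrt(length(x)))
t_obs  <- t_stat(d)

# Impose the null hypothesis (mean difference = 0) by centering the
# differences, then resample them with replacement.
d0 <- d - mean(d)
B  <- 5000
t_boot <- replicate(B, t_stat(sample(d0, replace = TRUE)))

# Two-sided bootstrap p-value, with the usual +1 correction.
p_value <- (sum(abs(t_boot) >= abs(t_obs)) + 1) / (B + 1)
p_value
```

The reference distribution of the t statistic is estimated from the data instead of being taken from Student's t, which is the point of using the bootstrap when the costs are far from Gaussian.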