Re: [R] significance testing for the difference in the ratio of means

2013-06-14 Thread Bert Gunter
Sigh...

(Again!) These are primarily statistical, not R, issues. I would urge
you to seek local statistical help. You appear to be approaching this
with a good deal of semi-informed ad-hockery. Standard methodology
should be applicable, but it would be presumptuous and ill-advised of
me to offer specifics remotely without understanding in detail the
goals of your research, the nature of your design (e.g., protocols,
randomization?), and the behavior of your data (what do appropriate
plots tell you?).

Others may be bolder. Proceed at your own risk.

Cheers,
Bert

On Fri, Jun 14, 2013 at 2:07 PM, Rahul Mahajan mahaj...@vcu.edu wrote:
 I have a question regarding significance testing for the difference in
 the ratio of means.
 The data consist of a control and a test group, each with and without
 treatment.  I am interested in testing whether the treatment has a
 significantly different effect (say, in terms of fold-activation) on
 the test group compared to the control.

 The form of the data, with arbitrary n and not assuming equal variances:

 m1 = mean of (control group), n = 7
 m2 = mean of (control group w/ treatment), n = 10
 m3 = mean of (test group), n = 8
 m4 = mean of (test group w/ treatment), n = 9

 H0: m2/m1 = m4/m3
 restated,
 H0: m2/m1 - m4/m3 = 0

 Method 1: Fieller's intervals
 Use Fieller's theorem, available in R as part of the mratios package.
 This is a promising way to compute standard errors/confidence
 intervals for each of the two ratios, but it will not yield a p-value
 for the comparison between the two ratios.  Judging significance by
 non-overlap of the two confidence intervals is too stringent a test
 and will lead to frequent type II errors.
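
 For concreteness, a minimal base-R sketch of a Fieller interval for
 the ratio of two independent group means (hypothetical vectors y =
 numerator group, x = denominator group; the conservative df choice is
 an assumption, not the only reasonable one):

 fieller.ci <- function(y, x, conf.level = 0.95) {
   my <- mean(y); mx <- mean(x)
   vy <- var(y) / length(y); vx <- var(x) / length(x)
   df <- min(length(x), length(y)) - 1         # conservative df choice
   t2 <- qt(1 - (1 - conf.level) / 2, df)^2
   # Interval = all rho with (my - rho*mx)^2 <= t2*(vy + rho^2*vx),
   # i.e. where a*rho^2 + b*rho + cc <= 0:
   a <- mx^2 - t2 * vx; b <- -2 * mx * my; cc <- my^2 - t2 * vy
   disc <- b^2 - 4 * a * cc
   if (a <= 0 || disc < 0) return(c(NA, NA))   # interval unbounded/undefined
   sort((-b + c(-1, 1) * sqrt(disc)) / (2 * a))
 }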

 Method 2: Bootstrap
 Abandoning an analytical solution, we try a numerical one.  I can
 repeatedly (1,000 or 10,000 times) draw, with replacement, samples of
 size 7, 10, 8, 9 from the four groups underlying m1, m2, m3, m4,
 respectively.  On each iteration, I can compute the ratios m2/m1 and
 m4/m3 as well as their difference.  Standard deviations of the m2/m1
 and m4/m3 bootstrap distributions can give me standard errors for the
 two ratios.  Then, I can see where 0 falls on the third distribution,
 the distribution of the difference of the ratios.  If 0 falls on one
 of the tails, beyond the 2.5th or 97.5th percentile, I can declare a
 significant difference in the two ratios.  My question here is whether
 I can correctly report the percentile location of 0 as the p-value.
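
 A rough sketch of this bootstrap (hypothetical data vectors ctrl,
 ctrl.trt, test.grp, test.trt for the four groups; the two-sided
 "bootstrap p-value" convention in the last line is a common but only
 approximate answer to the question above):

 set.seed(1)
 B <- 10000
 boot.diff <- replicate(B, {
   r1 <- mean(sample(ctrl.trt, replace = TRUE)) /
         mean(sample(ctrl,     replace = TRUE))    # bootstrap m2/m1
   r2 <- mean(sample(test.trt, replace = TRUE)) /
         mean(sample(test.grp, replace = TRUE))    # bootstrap m4/m3
   r1 - r2
 })
 quantile(boot.diff, c(0.025, 0.975))  # percentile CI for the difference
 2 * min(mean(boot.diff <= 0), mean(boot.diff >= 0))  # approximate p-value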

 Method 3: Permutation test
 I understand the best way to obtain a p-value for the significance
 test would be to resample under the null hypothesis.  However, as I am
 comparing the ratio of means, I do not have individual observations to
 randomize between the groups.  The best I can think to do is create an
 exhaustive list of all 10 x 7 = 70 possible pairwise ratios for m2/m1
 from the data, then create a similar list of all 9 x 8 = 72 possible
 pairwise ratios for m4/m3.  Pool all 70 + 72 = 142 ratios and
 repeatedly randomly assign them to two groups of size 70 and 72 to
 represent the two ratios, computing the difference in means each time.
 This distribution could represent the distribution under the null
 hypothesis, and I could then measure where my observed value falls to
 compute the p-value.  This, however, makes me uncomfortable, as it
 seems to treat the data as a mean of ratios rather than a ratio of
 means.
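
 A literal sketch of this pairwise-ratio scheme (same hypothetical
 vectors as above; it implements the procedure exactly as described,
 including its mean-of-ratios character):

 r.ctrl <- as.vector(outer(ctrl.trt, ctrl,     "/"))  # 10 x 7 = 70 ratios
 r.test <- as.vector(outer(test.trt, test.grp, "/"))  # 9 x 8  = 72 ratios
 obs    <- mean(r.ctrl) - mean(r.test)
 pooled <- c(r.ctrl, r.test)
 perm <- replicate(10000, {
   idx <- sample(length(pooled), length(r.ctrl))
   mean(pooled[idx]) - mean(pooled[-idx])
 })
 mean(abs(perm) >= abs(obs))   # two-sided permutation p-value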

 Method 4: Combination of bootstrap and permutation test
 Draw with-replacement samples of size 7, 10, 8, 9 from the four
 groups, as in Method 2 above.  Calculate the two ratios (m2/m1 and
 m4/m3) for these 4 samples and record them in a list.  Repeat this
 process an arbitrary number (B) of times, appending the two ratios to
 the growing list each time.  Hence if B = 10, we will have 20
 observations of the ratios.  Then proceed with permutation testing on
 these 20 ratio observations by repeatedly randomizing them into two
 equal groups of 10 and computing the difference in means of the two
 groups, as in Method 3 above.  This could potentially yield a
 distribution under the null hypothesis, and p-values could be obtained
 by locating the observed value on this distribution.  I am unsure of
 appropriate values for B, or whether this method is valid at all.
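
 For concreteness, a literal transcription of Method 4 (same
 hypothetical vectors; whether the procedure is valid is exactly the
 open question above):

 B <- 10
 both <- replicate(B, c(
   mean(sample(ctrl.trt, replace = TRUE)) / mean(sample(ctrl,     replace = TRUE)),
   mean(sample(test.trt, replace = TRUE)) / mean(sample(test.grp, replace = TRUE))))
 obs <- mean(both[1, ]) - mean(both[2, ])   # difference of the two ratio sets
 ratios <- as.vector(both)                  # pooled 2*B = 20 bootstrap ratios
 perm <- replicate(10000, {
   idx <- sample(length(ratios), B)
   mean(ratios[idx]) - mean(ratios[-idx])
 })
 mean(abs(perm) >= abs(obs))                # p-value under this scheme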

 Another complication is the concern for multiple comparisons if I
 wished to include additional test groups (m5 = testgroup2; m6 =
 testgroup2 w/ treatment; m7 = testgroup3; m8 = testgroup3 w/
 treatment; etc.) and how that might be appropriately handled.
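
 If several test groups were each compared to the control this way,
 one simple option would be to adjust the resulting p-values in base R;
 a minimal sketch with hypothetical per-comparison p-values:

 pvals <- c(0.012, 0.034, 0.21)     # hypothetical p-values, one per test group
 p.adjust(pvals, method = "holm")   # or "bonferroni", "BH", ...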

 Method 2 seems the most intuitive to me.  Bootstrapping this way will
 likely yield appropriate standard errors for the two ratios.  However,
 I am very much interested in appropriate p-values for the comparison,
 and I am not sure whether locating 0 on the bootstrap distribution of
 the difference of the ratios is appropriate.

 Thank you in advance for your suggestions.

 -Rahul


Re: [R] significance testing for the difference in the ratio of means

2013-06-14 Thread Rahul Mahajan
My apologies if my request is off topic, and for my admittedly
half-baked understanding of the topic.  I'm afraid that talking with
the local statistical help and posting on several general statistics
forums in search of proper guidance has not yielded any response, much
less a helpful one.  I turned to this forum in desperation because 1)
I will be using R to implement the chosen strategy, and 2) looking
through the archives of this forum seemed promising, especially given
past helpful posts such as this one:

https://stat.ethz.ch/pipermail/r-help/2009-April/194843.html

Perhaps you can suggest a resource that covers the applicable standard
methodology, and perhaps its implementation in R?  I would truly
appreciate any guidance.

My protocols/design: each observation within the 4 groups represents a
recording of a continuous variable (whole-cell current from one cell
in electrophysiology measurements).  The data for each group appear
roughly normal (albeit with small n, 7-10 per group).  The variance is
not equal among the groups because it seems to vary with the mean,
i.e., larger currents mean larger absolute variance.  There is no
explicit randomization involved, as these observations are merely
measurements of whole-cell currents from cells receiving an identical
experimental treatment.  I am interested in comparing the
fold-activation effect of the treatment for control cells versus
test-group cells, which have differing baseline pre-treatment current
values.

Best,
Rahul





Re: [R] significance testing for the difference in the ratio of means

2013-06-14 Thread Robert A LaBudde
The facts that your currents are apparently intrinsically positive,
that the variance increases with the current, and that you are
interested in ratio statistics all suggest that your data would
benefit from an initial log transform.  All of your issues would then
disappear, provided the log-transformed data are roughly normally
distributed.
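
On the log scale the null hypothesis m2/m1 = m4/m3 becomes a
group-by-treatment interaction of zero in an ordinary two-way layout,
so it can be tested directly with a linear model.  A minimal sketch,
assuming a hypothetical long-format data frame d with one row per cell
and columns current, group, and treatment (note this compares
geometric rather than arithmetic means, which is usually acceptable
for positive data whose spread grows with the mean):

fit <- lm(log(current) ~ group * treatment, data = d)
summary(fit)   # the group:treatment coefficient estimates the
               # log fold-change difference; its p-value tests H0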

