Can any of you help me with this problem?  I'm not a very good theoretical
statistician, but the following problem bears directly on an important
research project I'm working on.  Any help I get is greatly appreciated.

Thank you for your time.

Anita

--------------------------------------------------------------------
Here is the experiment.  The goal is to show people who have been doing a
certain type of analysis that their conclusions are of very limited value.

There are two barrels.  Each contains a very large number of colored
balls, and each ball is one of two colors.  I draw 10 balls out of each
barrel and get a certain number of red balls and white balls from each
barrel.  I want my conclusions to be significant at the 95% level (p
less than or equal to 0.05).  Let's say I got 6 white and 4 red balls
from the first barrel.  With 95% confidence I can say that the real
fraction of white balls in barrel one is between what and what?  I
sample the second barrel and the 10 balls come out different from the 6
and 4 ratio sampled from barrel one.  How different does the ratio have
to be to say with 95% confidence that the fraction of white balls in
barrel two is really different from the fraction of white balls in
barrel one?  How does the quality of the conclusion improve if one
samples with twenty balls from each barrel rather than 10?
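
To make the two questions concrete, here is a rough sketch of one way
they could be computed.  It assumes each draw is an independent binomial
trial (reasonable if the barrels are very large), uses the exact
Clopper-Pearson interval for the fraction in one barrel, and uses
Fisher's exact test (two-sided) to compare the barrels.  Those method
choices and the function names are assumptions for illustration; other
intervals and tests would give somewhat different numbers.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a fraction, given k of n."""
    def solve(f, target):
        # f is decreasing in p on [0, 1]; bisect until f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

def fisher_p(a, n1, b, n2):
    """Two-sided Fisher exact p-value: a of n1 white vs b of n2 white."""
    white, total = a + b, n1 + n2
    denom = comb(total, n1)
    def prob(x):
        # Hypergeometric chance of x whites in sample one, margins fixed.
        return comb(white, x) * comb(total - white, n1 - x) / denom
    lo, hi = max(0, white - n2), min(n1, white)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Question 1: interval for barrel one, 6 white of 10, then 12 of 20.
print("6 of 10:", clopper_pearson(6, 10))
print("12 of 20:", clopper_pearson(12, 20))

# Question 2: which white counts in barrel two differ from 6 of 10?
print("significant counts:",
      [k for k in range(11) if fisher_p(6, 10, k, 10) <= 0.05])
```

Running the same two calls with 20 balls per barrel instead of 10 shows
how much the interval narrows and how many more outcomes in barrel two
would count as different.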

I would assume that the absolute value of the ratio (whether it is 6/4
or 8/2) doesn't affect this calculation except for the fact that the
level of uncertainty for a sampled ball can't go to zero (i.e. a
sampling that gave five might really be between 3 and 7 but a sampling
of one can only be between 1/Total and 3).  Also, I would assume that if
there were more than two species these limitations on sampling would
still hold for estimating the fraction of any one species.
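
A quick way to check the first assumption is to tabulate an interval for
every possible count out of 10.  The sketch below uses the Wilson score
interval, which is an approximation chosen here only because it is short
to write down; the function name is hypothetical.  Under a binomial
model the width does depend on the observed fraction (the standard error
is sqrt(p(1-p)/n)), so the table should come out widest near 5 of 10 and
narrower, and lopsided, near 0 or 10.  With more than two species,
counting one species against all the others is still a binomial count,
so the same calculation would apply to any single species.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(k, n, conf=0.95):
    """Approximate Wilson score interval for a fraction, given k of n."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    p_hat = k / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)

# Interval and width for every possible white count in a sample of 10.
for k in range(11):
    lo, hi = wilson_interval(k, 10)
    print(f"{k:2d} of 10: {lo:.2f} to {hi:.2f} (width {hi - lo:.2f})")
```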

There are many examples in the literature where people use 10 or 20
clones to estimate the structure of the sequence distribution of a viral
population.  Then they will take another time point or virus from a
different bodily compartment and sample 10 or 20 clones and compare the
two groups of sequences.  I think a statistical analysis would show that
they have very little power to draw conclusions about the nature of the
structure of either population, let alone to compare them.  This has
become an important issue for us because we look at the population using
a different tool that overcomes these sampling problems.  I would like
to highlight what is wrong with the old approach as a way of emphasizing
the advantage of our new approach.
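
The claim about power could be put in numbers along the following lines.
This is only a sketch: it assumes clones are independent draws, compares
the two samples with a two-sided Fisher exact test at 0.05, and uses
made-up true fractions (0.6 and 0.3) for one sequence variant in the two
populations.  It then adds up the probability of every pair of outcomes
that the test would call different.

```python
from math import comb

def fisher_p(a, n1, b, n2):
    """Two-sided Fisher exact p-value: a of n1 vs b of n2 clones."""
    hits, total = a + b, n1 + n2
    denom = comb(total, n1)
    def prob(x):
        # Hypergeometric chance of x hits in sample one, margins fixed.
        return comb(hits, x) * comb(total - hits, n1 - x) / denom
    lo, hi = max(0, hits - n2), min(n1, hits)
    p_obs = prob(a)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

def binom_pmf(k, n, p):
    """P(X == k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def power(p1, p2, n, alpha=0.05):
    """Chance that two samples of n clones are declared different."""
    total = 0.0
    for a in range(n + 1):
        for b in range(n + 1):
            if fisher_p(a, n, b, n) <= alpha:
                total += binom_pmf(a, n, p1) * binom_pmf(b, n, p2)
    return total

# Hypothetical true fractions 0.6 and 0.3; 10 vs 20 clones per sample.
for n in (10, 20):
    print(f"n = {n}: power = {power(0.6, 0.3, n):.3f}")
```

Changing the two fractions and the sample size in the last loop gives
the power for other scenarios, including larger samples.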

Can you please help?

Anita




