Re: Testing whether my subsample represents my population...help!

Donald Burrill Wed, 05 Mar 2003 06:18:58 -0800

It is not clear whether your questions have answers.  A little
clarification of the problem (and of some of your ambivalent language)
may help.


On Tue, 4 Mar 2003, Jeremy Bauer wrote (edited):

> Out of a population of 220 children, I need to randomly select 10
> for use in a pilot experiment.  The selection of 10 subjects is
> based on financial reasons, not on any power calculations.

        O.K. so far ...  You do say "randomly", but the sequel suggests
that you didn't really mean it.

> My question is, how do I make sure that the subsample of 10 subjects
> represents my population of 220?

The short answer is, "You don't."
 A longer answer is, "If you select at random, which is easier said than
done, you buy about as much representativeness as is possible in this
life.  Of course, you have no *guarantee* (which is what you seem to be
asking for!), and a small proportion of the time random selection can be
expected to produce a highly UNrepresentative sample."
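For what it is worth, here is a minimal Python sketch of both halves of
that answer: drawing 10 of 220 so that every subset of 10 is equally
likely, and estimating how often such a draw lands far from the
population mean.  The roster numbering, the simulated heights, and the
function names are all assumptions made up for illustration; nothing
here comes from the actual data.

```python
import math
import random
import statistics

def draw_simple_random_sample(ids, n, rng):
    """Draw n distinct ids; every subset of size n is equally likely."""
    return rng.sample(list(ids), n)

def fraction_far_off(values, n, trials, rng, k=2.0):
    """Share of random samples whose mean is more than k standard errors
    from the population mean (standard error uses the finite-population
    correction, since we sample without replacement)."""
    pop_mean = statistics.fmean(values)
    pop_sd = statistics.pstdev(values)
    big_n = len(values)
    se = pop_sd / math.sqrt(n) * math.sqrt((big_n - n) / (big_n - 1))
    far = 0
    for _ in range(trials):
        sample_mean = statistics.fmean(rng.sample(values, n))
        if abs(sample_mean - pop_mean) > k * se:
            far += 1
    return far / trials

rng = random.Random(2003)            # fixed seed so the draw can be audited
roster = list(range(1, 221))         # hypothetical ids for the 220 children
pilot = draw_simple_random_sample(roster, 10, rng)

# Made-up heights (cm), only to illustrate the simulation.
heights_cm = [rng.gauss(130.0, 8.0) for _ in roster]
share = fraction_far_off(heights_cm, 10, 5000, rng)

print("pilot ids:", sorted(pilot))
print("share of samples more than 2 SE from the population mean:", share)
```

Recording the seed is a design choice: it lets someone else reproduce
the selection and confirm that nobody hand-picked the 10.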

> The question seems basic at face value, but I'm just not comfortable
> with any solutions.

Nobody promised you life would be comfortable.

> It is important that the subsample ...

"SUBsample"?  So far, you'd only been discussing one sample of 10, to be
drawn from a population of 220.  Are you now thinking of the 220 as
itself a sample of a larger population?  Perhaps you need to think more
deeply about what your "population" -- the reference collective to which
you will wish to generalize -- REALLY is.

> ... have similar age, height & weight.

Why?  Is the "population" (the 220) then so heterogeneous on these
measures?  If so, should you be considering selecting several clusters
of subjects, in some kind of stratified sampling plan?
 (Not easy with n=10, but you need to address your logic, I think.)
 And what other variables, not considered here, will conspire to
guarantee that your sample of 10 is highly UNrepresentative on those
dimensions (e.g., IQ, ethnic origin, wealth, religious affiliation,
...)?  (That's the chief problem with any attempt to guarantee
representativeness:  the influential variables you hadn't noticed, or
didn't know about, or had forgotten about.)
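If you do go the stratified route, the mechanics might look like the
Python sketch below.  It assumes a single stratifying variable (age
band), proportional allocation, and largest-remainder rounding; the
bands, the counts in each band, and the function name are hypothetical,
chosen only so that the example runs.  Note that it balances the sample
on the stratifying variable only, and does nothing about the variables
you did not think of.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, stratum_of, n, rng):
    """Proportionally allocate n draws across strata, then sample at
    random (without replacement) within each stratum."""
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    total = len(records)
    quotas = {key: n * len(members) / total for key, members in strata.items()}
    alloc = {key: int(q) for key, q in quotas.items()}

    # Hand the leftover draws to the strata with the largest remainders.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda key: quotas[key] - alloc[key],
                          reverse=True)
    for key in by_remainder[:leftover]:
        alloc[key] += 1

    sample = []
    for key, members in strata.items():
        sample.extend(rng.sample(members, alloc[key]))
    return sample

# Hypothetical population: (child id, age band); the counts are invented.
bands = ["6-8"] * 88 + ["9-11"] * 66 + ["12-14"] * 66
children = list(enumerate(bands, start=1))

rng = random.Random(2003)
chosen = stratified_sample(children, lambda rec: rec[1], 10, rng)
per_band = Counter(band for _, band in chosen)
print(per_band)
```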

> Do I just run one t-test for each of the 3 dependent variables?

I take it you are interested in "representativeness" only with respect
to the value of the sample (& population) mean, not with respect to
variability nor to relationships among whatever variables you will find
interesting enough to analyze?

> While the sample sizes are unequal, ...

Echo of earlier ambiguity.  Is there in fact more than one sample to be
compared?  (If so, why, and what utility will the comparison(s) have?)

> ... am I correct in saying that unequal sample sizes are "ok" as
> long as the variances are equal?

If this is a technical question about t-tests in general, the answer is
(a) "Yes" and (b) "The variances don't need to be equal" (use the form
of the t-test that permits unequal variances and doesn't use a pooled
variance estimate).
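The unequal-variances form is usually called Welch's t-test.  As a
sketch, using only the Python standard library, the statistic and its
approximate (Welch-Satterthwaite) degrees of freedom can be computed as
below.  The two data lists are invented for illustration, and the
function name is my own; a p-value would additionally need the t
distribution, which this sketch leaves out.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test.

    No pooled variance is used, so neither equal sample sizes nor
    equal variances are assumed.
    """
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # n-1 divisor
    se_sq = v1 / n1 + v2 / n2
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Invented data with unequal sizes and visibly unequal spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12]
t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```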

> Any help will be very appreciated!  Thank you!
> Jeremy

This may not have been so helpful as you'd hoped.  Sorry about that.
 Good luck!  (I think you'll need it.)
                                         -- DFB.
 -----------------------------------------------------------------------
 Donald F. Burrill                                            [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110                 (603) 626-0816

