In article <[EMAIL PROTECTED]>,
Paige Miller  <[EMAIL PROTECTED]> wrote:
>Suppose I have three (or more) samples, from three (or more) different 
>populations. My subject matter expert wants to estimate a linear 
>combination of the means, for example

>      0.5*mu1 + 0.5*mu2 - mu3

>where mu1, mu2 and mu3 are the population means. I know how to compute 
>this estimate: simply replace the population means with the sample 
>means. If I assume the original populations are normal and that the 
>population variances are equal, I can compute the variance of this 
>linear combination. Pretty straightforward stuff.
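
For concreteness, a minimal sketch of that plug-in estimate and its
equal-variance standard error in Python; the data and sample sizes here
are made up, and the pooled-variance formula is the standard one, not
anything specific to your setup:

    import numpy as np

    # Made-up data standing in for the three samples.
    rng = np.random.default_rng(0)
    samples = [rng.normal(10.0, 2.0, size=n) for n in (8, 10, 12)]
    a = np.array([0.5, 0.5, -1.0])        # contrast weights

    ns = np.array([len(s) for s in samples])
    means = np.array([s.mean() for s in samples])
    estimate = a @ means                  # estimate of 0.5*mu1 + 0.5*mu2 - mu3

    # Equal-variance case: pool the within-sample variances, then
    # Var(\sum_j a_j*m_j) = sigma^2 * \sum_j a_j^2/n_j.
    pooled = sum((n - 1) * s.var(ddof=1)
                 for n, s in zip(ns, samples)) / (ns - 1).sum()
    se = np.sqrt(pooled * np.sum(a**2 / ns))
    print(estimate, se)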

>However, I want to create a t-test of the null hypothesis that this 
>linear combination of means is equal to zero, using an estimate of 
>variance derived from the data, since the population variance is 
>unknown. In doing so, I run into the mathematical difficulty that I do 
>not know the proper degrees of freedom for this test. (And yes, I know 
>that for the special case of estimating mu1 - mu2 there are textbook 
>formulas for this t-test; however, I am really interested in linear 
>combinations of more than two means.)

>I feel like I am missing something very obvious; if someone knows the 
>proper formula for a t-test on a linear combination of means, or can 
>point me to it, it would be greatly appreciated. 

The textbook formulas are only approximate for that case
anyhow.  However, there is a procedure which is easy to
derive, which has exactly the t distribution under normality,
and whose only cost is some loss of power compared with what
can be done otherwise.  It may even be a little more robust
against non-normality.

From your description, you have samples of size n_j with
population means mu_j, and you want a t-test of the null
hypothesis \sum_j a_j*mu_j = 0.  Let N be the smallest of
the n_j, let X_jk, k = 1, ..., n_j, be the observations
from the j-th sample, and let m_j be the sample means.
Choose, at random or otherwise, coefficients b_jkr,
r = 1, ..., N-1, so that

        \sum_k b_jkr = 0;
        \sum_k b_jkr^2 = 1;
        \sum_k b_jkr*b_jks = 0          if r != s.
        
For each j, the b_jkr form N-1 orthonormal rows, each
orthogonal to the constant vector (the direction of the
sample mean).  Then the numbers Y_jr = \sum_k b_jkr*X_jk
are uncorrelated with each other and with m_j (independent,
if normal), and have mean 0 and variance \sigma_j^2.  So
the sums

        Z_r = \sum_j a_j*Y_jr/sqrt(n_j),        r = 1, ..., N-1,

are uncorrelated (independent normal) random variables with
mean 0 and the same variance as \sum_j a_j*m_j.  Under the
null hypothesis, then,

        t = (\sum_j a_j*m_j) / sqrt( \sum_r Z_r^2 / (N-1) )

has the t distribution with N-1 degrees of freedom, which
gives the test.
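
In Python, a minimal sketch of the whole procedure might look like
this; it is an illustration, not code from anyone's package.  The
Helmert rows are one concrete, deterministic choice of the b_jkr
satisfying the three constraints above, and the data are made up:

    import numpy as np
    from scipy import stats

    def helmert_rows(n, m):
        # First m orthonormal Helmert contrasts in R^n: each row sums
        # to 0, has unit length, and is orthogonal to the others.
        B = np.zeros((m, n))
        for r in range(1, m + 1):
            B[r - 1, :r] = 1.0
            B[r - 1, r] = -float(r)
            B[r - 1] /= np.sqrt(r * (r + 1))
        return B

    def contrast_t_test(samples, a):
        # t-test of H0: \sum_j a_j*mu_j = 0 with N-1 degrees of freedom.
        ns = [len(s) for s in samples]
        N = min(ns)
        estimate = sum(aj * s.mean() for aj, s in zip(a, samples))
        # Z_r = \sum_j a_j*Y_jr/sqrt(n_j), with Y_jr = \sum_k b_jkr*X_jk.
        Z = np.zeros(N - 1)
        for aj, s, n in zip(a, samples, ns):
            Z += aj * (helmert_rows(n, N - 1) @ s) / np.sqrt(n)
        t = estimate / np.sqrt(np.sum(Z**2) / (N - 1))
        p = 2 * stats.t.sf(abs(t), N - 1)
        return t, N - 1, p

    # Made-up data; H0 holds here, since 0.5*10 + 0.5*10 - 10 = 0,
    # and the variances are deliberately unequal.
    rng = np.random.default_rng(1)
    samples = [rng.normal(10.0, sd, size=n)
               for sd, n in zip((1.0, 2.0, 3.0), (8, 10, 12))]
    print(contrast_t_test(samples, [0.5, 0.5, -1.0]))

Any coefficients satisfying the three constraints give the same null
distribution; the Helmert rows are just a convenient choice.  The
price is that only N-1 degrees of freedom are used, which is the
power loss mentioned above.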


-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558