Wuzzy <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> > And that sounds impossible.  I suspect a programming error.
> >
> > -Jay
>
> you're right i programmed a food database incorrectly but i've redone
> it and yep the correlation was only 0.20 for kcal or so.
> it is hard to program a database *into* another database easy to make
> errors..
>
> i've made many errors in my trials.
> dumbest mistake: i listed people who left one question blank as a
> dummy variable, "9999", but i forgot to filter those subjects out and
> so it altered my correlation coefficient.. because people who leave
> one question blank will also leave another blank..  and i got very
> spurious correlations, hehe..

Those 9999s can be...um...non-linearizing
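
A minimal sketch of how this goes wrong, assuming a blank answer is coded
as 9999 as in the quoted post.  The answers in q1 and q2 are made up for
illustration, and pearson() is a hand-rolled helper, not a library call.
Two subjects who left both questions blank are enough to drag a weak
correlation up to nearly 1.

```python
import math

MISSING = 9999  # sentinel code for a blank answer, as in the quoted post


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical answers (scale 1-5) from ten subjects who answered both
# questions, followed by two subjects who left both questions blank.
q1 = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5] + [MISSING, MISSING]
q2 = [3, 5, 1, 4, 2, 4, 1, 5, 2, 3] + [MISSING, MISSING]

# Unfiltered: the two 9999/9999 pairs act as extreme outliers.
r_raw = pearson(q1, q2)

# Filtered: drop any subject with a missing code on either question.
kept = [(a, b) for a, b in zip(q1, q2) if a != MISSING and b != MISSING]
r_clean = pearson([a for a, _ in kept], [b for _, b in kept])

print("with 9999s:   ", r_raw)    # very close to 1
print("without 9999s:", r_clean)  # weak, about -0.2
```

The filter has to run before the statistic is computed; a safer habit is
to store blanks as a true missing value rather than as a number.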

> One of the things i have been unable to figure out is if you are
> allowed to draw conclusions on very low R^2 equations.  Like if only
> 1% of the variance is predicted by your equation but the p-value is
> very small and the coefficient is very large, does that mean that this
> variable has a huge effect on the dependent variable?

A large regression coefficient for an exposure means that the effect of the
exposure on the outcome is strong.  A small r^2 means that the exposure
explains little of the variation in the outcome in the study population.
These two things can happen simultaneously; indeed, since chronic diseases
have multi-factorial causes, small r^2's are common.
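
On the p-value part of the question: in a regression with a single
predictor, the size of the t statistic for the slope depends only on r^2
and the sample size n, through |t| = sqrt(r^2 (n - 2) / (1 - r^2)).  The
sketch below assumes that one-predictor setting and uses a large-sample
normal approximation for the p-value (an assumption; an exact calculation
would use the t distribution).  The function names and sample sizes are
illustrative only.

```python
import math


def slope_t_statistic(r_squared, n):
    """Magnitude of the slope's t statistic in a one-predictor regression."""
    return math.sqrt(r_squared * (n - 2) / (1.0 - r_squared))


def approx_two_sided_p(t):
    """Large-sample (normal) approximation to the two-sided p-value."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Hold r^2 fixed at 1% and let only the sample size grow.
for n in (100, 1_000, 10_000):
    t = slope_t_statistic(0.01, n)
    print(f"n = {n:6d}  |t| = {t:6.2f}  approx p = {approx_two_sided_p(t):.3g}")
```

So an r^2 of 1% with a very small p-value mostly says the sample was
large; it says nothing by itself about how big the coefficient is.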

Take, for example, the effect of the exposure "being heterozygous for
familial hypercholesterolemia" on serum cholesterol.  Heterozygotes for this
disease have serum cholesterol levels of around 400 mg/dL, compared with an
average of about 200 for persons without this condition.  Hence, the
regression coefficient for this exposure would be "large", in the sense that
I can't think of any other single exposure that would double a person's
serum cholesterol level (although a dummy variable for "not being on a
low-fat vegan diet" would be close).  However, the prevalence of heterozygous
familial hypercholesterolemia is only about 1:500, so this exposure contributes
little to the variance in serum cholesterol in the population; its r^2 would
be small.
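
The arithmetic can be sketched for a yes/no exposure.  If a fraction p of
the population is exposed, the exposed mean sits d above the unexposed
mean, and both groups have the same within-group standard deviation s,
then the regression slope is d and r^2 = p(1-p)d^2 / (p(1-p)d^2 + s^2).
Below, d = 200 mg/dL comes from the 400 versus 200 figures above; the
within-group SD of 40 mg/dL and the list of prevalences are assumptions
chosen only to show the pattern.

```python
def binary_exposure_r_squared(prevalence, mean_difference, within_sd):
    """r^2 from regressing an outcome on a 0/1 exposure.

    Assumes both groups share the same within-group standard deviation.
    """
    explained = prevalence * (1.0 - prevalence) * mean_difference ** 2
    return explained / (explained + within_sd ** 2)


DIFFERENCE = 200.0  # mg/dL: 400 vs 200, from the example in the text
WITHIN_SD = 40.0    # mg/dL: assumed for illustration

# The slope is the same 200 mg/dL in every row; only the prevalence changes.
for prevalence in (1 / 500_000, 1 / 500, 0.5):
    r2 = binary_exposure_r_squared(prevalence, DIFFERENCE, WITHIN_SD)
    print(f"prevalence {prevalence:.6f}  slope {DIFFERENCE:.0f}  r^2 {r2:.5f}")
```

The coefficient measures how much the exposure matters to a person who
has it; r^2 also depends on how many people have it.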

-Jay




=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
                  http://jse.stat.ncsu.edu/
=================================================================
