Dear John,

I fully understand your point that an IV might not be significantly
correlated with the DV in the bivariate case but might be significantly
correlated with the DV in the presence of other IVs. But does this significant
partial relationship reflect the true relation between the IV and the DV, and
does it really help to predict the DV?
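
To make the question concrete, here is a minimal sketch on simulated data.
Everything in it is an assumption for illustration: the variable names, the
sample size, and the coefficients are made up, chosen so that x1 and x2 are
correlated and the population marginal slope of y on x1 is zero while its
partial slope is 1, with no interaction term. In any one sample the marginal
slope will only be close to zero, not exactly zero.

```r
# Simulated illustration only; names and coefficients are hypothetical.
set.seed(1)
n  <- 200
x2 <- rnorm(n)
x1 <- 0.8 * x2 + rnorm(n, sd = 0.6)        # x1 and x2 correlated (about 0.8)
y  <- x1 - 1.25 * x2 + rnorm(n, sd = 0.5)  # additive model, no interaction

fit_marginal <- lm(y ~ x1)       # bivariate fit: slope of x1 near 0
fit_x2       <- lm(y ~ x2)       # model without x1
fit_full     <- lm(y ~ x1 + x2)  # partial slope of x1 near 1

marginal_slope <- unname(coef(fit_marginal)["x1"])
partial_slope  <- unname(coef(fit_full)["x1"])
partial_p      <- summary(fit_full)$coefficients["x1", "Pr(>|t|)"]
r2_x2   <- summary(fit_x2)$r.squared
r2_full <- summary(fit_full)$r.squared

print(c(marginal = marginal_slope, partial = partial_slope))
print(c(r2_x2 = r2_x2, r2_full = r2_full))
print(anova(fit_x2, fit_full))   # F test for adding x1 to the x2-only model
```

In this setup the gain in R-squared from adding x1 is one way to judge
whether the partial relationship helps prediction, at least within the
sample used for fitting.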

From here, let's go one step further. If I resample repeatedly from the
original dataset, fit a bivariate LM between the IV and the DV on each
sample, and still can't get a significant result, do you think I should give
this IV a chance by looking at its partial relationship with the DV?
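
The resampling idea can be sketched as follows. This is only one possible
reading of it: a row-wise bootstrap on simulated data, where the data frame
`dat`, the number of resamples `B`, and the 0.05 cutoff are all assumptions
of mine. For each resample it records the p-value of x1 in the bivariate fit
and in the fit that also includes x2.

```r
# Simulated illustration only; names, B, and the 0.05 cutoff are hypothetical.
set.seed(2)
n  <- 200
x2 <- rnorm(n)
x1 <- 0.8 * x2 + rnorm(n, sd = 0.6)
y  <- x1 - 1.25 * x2 + rnorm(n, sd = 0.5)
dat <- data.frame(y, x1, x2)

B <- 200
p_values <- t(replicate(B, {
  # Draw n rows with replacement from the original data
  boot <- dat[sample(nrow(dat), replace = TRUE), ]
  c(marginal = summary(lm(y ~ x1, data = boot))$coefficients["x1", "Pr(>|t|)"],
    partial  = summary(lm(y ~ x1 + x2, data = boot))$coefficients["x1", "Pr(>|t|)"])
}))

# Share of resamples in which x1 is significant at the 0.05 level
prop_significant <- colMeans(p_values < 0.05)
print(prop_significant)
```

With data built this way, repeated bivariate fits mostly stay
non-significant while the partial fit is significant in nearly every
resample, so resampling the bivariate model alone says little about the
partial relationship.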

Thank you so much!

On 2/18/06, John Fox <[EMAIL PROTECTED]> wrote:
> Dear Wensui and Andy,
> When the explanatory variables are correlated it's perfectly possible for
> the marginal relationship between an X and Y to be zero and a partial
> relationship nonzero (even in the absence of interactions) -- this is
> simply a reflection of the more general point that partial and marginal
> relationships can differ.
> Regards,
> John
> --------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 905-525-9140x23604
> --------------------------------
> > -----Original Message-----
> > [mailto:[EMAIL PROTECTED] On Behalf Of Wensui Liu
> > Sent: Saturday, February 18, 2006 2:03 PM
> > To: Liaw, Andy
> > Cc:
> > Subject: Re: [R] Question about variable selection
> >
> > Thank you so much for your reply, Andy.
> >
> > But what if I am only interested in main effects instead of
> > interactions?
> >
> >
> >
> > On 2/18/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
> > >
> > > That depends on whether the IV could have some significant
> > > interactions with other IVs not considered in the bivariate
> > > analysis.
> > > E.g.,
> > >
> > > > iv <- expand.grid(-2:2, -2:2)
> > > > y <- 3 + iv[,1] * iv[,2] + rnorm(nrow(iv), sd=0.1)
> > > > summary(lm(y ~ iv[,1]))
> > >
> > > Call:
> > > lm(formula = y ~ iv[, 1])
> > >
> > > Residuals:
> > >      Min       1Q   Median       3Q      Max
> > > -4.06259 -1.06048 -0.02377  1.05901  4.04315
> > >
> > > Coefficients:
> > >             Estimate Std. Error t value Pr(>|t|)
> > > (Intercept)  3.01908    0.41482   7.278 2.09e-07 ***
> > > iv[, 1]      0.01417    0.29332   0.048    0.962
> > > ---
> > > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > >
> > > Residual standard error: 2.074 on 23 degrees of freedom
> > > Multiple R-Squared: 0.0001014,  Adjusted R-squared: -0.04337
> > > F-statistic: 0.002333 on 1 and 23 DF,  p-value: 0.9619
> > >
> > > > summary(lm(y ~ iv[,1] * iv[,2]))
> > >
> > > Call:
> > > lm(formula = y ~ iv[, 1] * iv[, 2])
> > >
> > > Residuals:
> > >      Min       1Q   Median       3Q      Max
> > > -0.22390 -0.08894 -0.01279  0.13525  0.17608
> > >
> > > Coefficients:
> > >                  Estimate Std. Error t value Pr(>|t|)
> > > (Intercept)      3.019083   0.026330 114.665   <2e-16 ***
> > > iv[, 1]          0.014167   0.018618   0.761    0.455
> > > iv[, 2]         -0.005486   0.018618  -0.295    0.771
> > > iv[, 1]:iv[, 2]  0.992865   0.013165  75.418   <2e-16 ***
> > > ---
> > > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > >
> > > Residual standard error: 0.1316 on 21 degrees of freedom
> > > Multiple R-Squared: 0.9963,     Adjusted R-squared: 0.9958
> > > F-statistic:  1896 on 3 and 21 DF,  p-value: < 2.2e-16
> > >
> > >
> > >
> > >
> > > Andy
> > >
> > > From: Wensui Liu
> > > >
> > > > Dear Lister,
> > > >
> > > > I have a question about variable selection for regression.
> > > >
> > > > If the IV is not significantly related to the DV in the bivariate
> > > > analysis, does it make sense to include this IV in the full model
> > > > with multiple IVs?
> > > >
> > > > Thank you so much!
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
WenSui Liu
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center

