If variables are collinear, then looking at interactions among them doesn't make much sense. High collinearity means that one variable is nearly a linear combination of the others; in other words, that variable is not adding much information. So, if you look at the interaction, you are ALMOST looking at a quadratic (e.g., if the collinearity involves only 2 variables, then one is close to a linear function of the other, so X1*X2 is almost a multiple of X1*X1 plus a multiple of X1). The output will be confusing, to say the least.
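To make that concrete, here is a small simulated sketch (made-up data, not the baseball data discussed in this thread; the names x1, x2, y, r_inter_quad and vif_x1 are mine, chosen only for illustration). It builds two nearly collinear predictors, checks how closely the interaction column tracks the squared term, and computes a variance inflation factor by hand as 1 / (1 - R^2).

```r
# Illustrative simulation only; all names and numbers here are assumptions.
set.seed(42)
n  <- 200
x1 <- rnorm(n, mean = 10, sd = 2)
x2 <- x1 + rnorm(n, sd = 0.1)      # x2 is almost a copy of x1
y  <- 1 + 0.5 * x1 + rnorm(n)      # y depends on x1 only

# Correlation between the interaction column and the quadratic column
r_inter_quad <- cor(x1 * x2, x1^2)

# VIF for x1 by hand: regress x1 on the other predictor, then 1 / (1 - R^2)
r2_x1  <- summary(lm(x1 ~ x2))$r.squared
vif_x1 <- 1 / (1 - r2_x1)

# Model with both main effects and their interaction
fit <- lm(y ~ x1 * x2)

print(r_inter_quad)
print(vif_x1)
print(summary(fit)$coefficients)
```

In a run like this, the correlation should come out very close to 1 and the VIF should be very large, and the standard errors for x1, x2 and x1:x2 are typically inflated even though y depends on x1 alone.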
Worse, when you include collinear variables, the resulting equation is highly sensitive to small (sometimes very small) changes in the data. Belsley gives an example where changes in the third decimal place result in totally different equations. For details see Belsley's book, titled something like "collinearity and weak data in regression" (sorry, the book and my files are at the office, but this should let you find it).

HTH

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
www.peterflom.com
(212) 845-4485 (voice)
(917) 438-0894 (fax)

>>> "Devshruti Pahuja" <[EMAIL PROTECTED]> 06/11/04 5:35 AM >>>
Hi

I have a set of data with both quantitative and categorical predictors. After scaling the response variable, I looked for multicollinearity (VIF values) among the predictors and removed the predictors that were hiding some of the other significant predictors. I'm curious to know whether the predictors that are not significant in a simple 'lm' fit can still be involved in interactions. How do I take into account interactions of the predictors that I removed just on the basis of multicollinearity?

I'd appreciate it if someone could throw some light on this matter and on how to use R to detect the interactions effectively.

Thanks
Regards
Dev

> ------Final 'lm model'--------------------
>
> logmodelfull_minus_run_hr_walk_batting <- lm(log(salary) ~ hit+rbi + walk
> + obp + strike.out+free.agent.eligible+free.agent.1991+arbitr.elgible.)
>
> summary(logmodelfull_minus_run_hr_walk_batting)
>
> Call:
> lm(formula = log(salary) ~ hit + rbi + walk + obp + strike.out +
>     free.agent.eligible + free.agent.1991 + arbitr.elgible.)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -2.41786 -0.28911 -0.02814  0.31890  1.49007
>
> Coefficients:
>                       Estimate Std. Error t value Pr(>|t|)
> (Intercept)           5.340782   0.251218  21.260  < 2e-16 ***
> hit                   0.004479   0.001158   3.867 0.000133 ***
> rbi                   0.011102   0.002195   5.059 7.05e-07 ***
> walk                  0.005421   0.002206   2.457 0.014533 *
> obp                  -1.385584   0.824105  -1.681 0.093653 .
> strike.out           -0.005399   0.001438  -3.755 0.000205 ***
> free.agent.eligible1  1.611521   0.080657  19.980  < 2e-16 ***
> free.agent.19911     -0.301243   0.103481  -2.911 0.003848 **
> arbitr.elgible.1      1.293059   0.086696  14.915  < 2e-16 ***
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> Residual standard error: 0.5351 on 328 degrees of freedom
> Multiple R-Squared: 0.7981,     Adjusted R-squared: 0.7932
> F-statistic: 162.1 on 8 and 328 DF,  p-value: < 2.2e-16
>
> ------------------------------------------------------------
>
> --------------with interactions-----------------------------
>
> summary(baseball.lgmodel_with_interactions_ALL_arbid)
>
> Call:
> lm(formula = log(salary) ~ hit + rbi + strike.out + free.agent.eligible +
>     free.agent.1991 + arbitr.elgible. + hit * free.agent.1991 +
>     hit * arbitr.elgible. + hit * rbi + rbi * free.agent.eligible +
>     rbi * arbitr.elgible. + rbi * arbitr.1991 + hit * strike.out +
>     strike.out * free.agent.eligible + strike.out * arbitr.elgible. +
>     strike.out * run + strike.out * hr + hit * free.agent.eligible +
>     free.agent.eligible * run + hit * free.agent.1991 + strike.out *
>     free.agent.1991 + free.agent.1991 * batting + free.agent.1991 *
>     obp + arbitr.elgible. * run + batting * double + obp * run +
>     obp * hr + walk * stolen.base + hit * arbitr.1991 + free.agent.eligible *
>     double + arbitr.elgible. * double + strike.out * triple +
>     triple * batting + triple * walk + triple * walk + hit *
>     hr + rbi * hr + free.agent.eligible * hr + free.agent.1991 *
>     hr + arbitr.elgible. * hr + hr * arbitr.1991 + hit * walk +
>     free.agent.eligible * walk + walk * rbi + rbi * stolen.base +
>     strike.out * stolen.base + stolen.base * batting + stolen.base *
>     walk + stolen.base * rbi + stolen.base * walk + arbitr.elgible. *
>     error)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -2.29352 -0.28287 -0.03748  0.29790  1.31590
>
> Coefficients:
>                                   Estimate Std. Error t value Pr(>|t|)
> (Intercept)                      5.217e+00  3.467e-01  15.048  < 2e-16 ***
> hit                              6.927e-03  6.226e-03   1.112 0.266889
> rbi                              1.908e-02  1.150e-02   1.658 0.098350 .
> strike.out                      -5.692e-03  4.586e-03  -1.241 0.215517
> free.agent.eligible1             1.287e+00  2.259e-01   5.699 3.05e-08 ***
> free.agent.19911                 3.828e-01  6.575e-01   0.582 0.560914
> arbitr.elgible.1                 1.038e+00  2.195e-01   4.726 3.63e-06 ***
> arbitr.19911                    -1.024e+00  4.392e-01  -2.331 0.020443 *
> run                              4.932e-02  2.905e-02   1.698 0.090682 .
> hr                              -1.093e-01  7.208e-02  -1.516 0.130543
> batting                         -1.814e-01  2.558e+00  -0.071 0.943522
> obp                             -1.375e+00  2.253e+00  -0.610 0.542099
> double                          -5.259e-02  4.489e-02  -1.172 0.242349
> walk                             1.395e-02  9.757e-03   1.430 0.153808
> stolen.base                     -1.685e-02  4.299e-02  -0.392 0.695372
> triple                          -1.367e-01  1.600e-01  -0.854 0.393807
> error                           -4.097e-03  6.879e-03  -0.595 0.552007
> hit:free.agent.19911             8.248e-04  4.611e-03   0.179 0.858174
> hit:arbitr.elgible.1             4.873e-03  6.448e-03   0.756 0.450395
> hit:rbi                         -1.382e-04  7.709e-05  -1.792 0.074184 .
> rbi:free.agent.eligible1         5.352e-03  9.555e-03   0.560 0.575855
> rbi:arbitr.elgible.1            -3.384e-03  1.136e-02  -0.298 0.766072
> rbi:arbitr.19911                 3.596e-02  2.179e-02   1.650 0.100046
> hit:strike.out                   5.480e-06  5.446e-05   0.101 0.919917
> strike.out:free.agent.eligible1 -2.570e-03  4.282e-03  -0.600 0.548890
> strike.out:arbitr.elgible.1     -9.703e-04  5.234e-03  -0.185 0.853068
> strike.out:run                   1.685e-04  1.246e-04   1.352 0.177345
> strike.out:hr                   -3.088e-04  2.277e-04  -1.356 0.176229
> hit:free.agent.eligible1        -1.359e-03  6.224e-03  -0.218 0.827363
> free.agent.eligible1:run         1.248e-02  9.109e-03   1.370 0.171917
> strike.out:free.agent.19911     -1.851e-02  5.974e-03  -3.099 0.002140 **
> free.agent.19911:batting         7.076e-01  6.200e+00   0.114 0.909215
> free.agent.19911:obp            -1.421e+00  3.952e+00  -0.360 0.719394
> arbitr.elgible.1:run            -8.541e-03  8.773e-03  -0.974 0.331100
> batting:double                   2.346e-01  1.609e-01   1.458 0.145884
> run:obp                         -1.825e-01  7.492e-02  -2.436 0.015462 *
> hr:obp                           3.687e-01  2.116e-01   1.742 0.082608 .
> walk:stolen.base                -6.789e-05  1.557e-04  -0.436 0.663083
> hit:arbitr.19911                -5.835e-03  7.084e-03  -0.824 0.410808
> free.agent.eligible1:double     -1.151e-02  1.663e-02  -0.692 0.489334
> arbitr.elgible.1:double          2.169e-03  1.938e-02   0.112 0.910985
> strike.out:triple               -8.106e-04  6.023e-04  -1.346 0.179475
> batting:triple                   5.179e-01  5.599e-01   0.925 0.355841
> walk:triple                      8.755e-04  9.262e-04   0.945 0.345349
> hit:hr                          -3.320e-04  2.626e-04  -1.264 0.207180
> rbi:hr                           4.748e-04  3.015e-04   1.575 0.116414
> free.agent.eligible1:hr          1.840e-02  2.313e-02   0.796 0.426972
> free.agent.19911:hr              7.216e-02  1.889e-02   3.819 0.000165 ***
> arbitr.elgible.1:hr              4.111e-02  2.803e-02   1.467 0.143564
> arbitr.19911:hr                 -2.368e-02  4.647e-02  -0.510 0.610723
> hit:walk                         3.173e-05  7.826e-05   0.405 0.685442
> free.agent.eligible1:walk       -5.423e-03  4.984e-03  -1.088 0.277472
> rbi:walk                        -7.569e-05  1.313e-04  -0.577 0.564598
> rbi:stolen.base                  3.980e-05  1.605e-04   0.248 0.804409
> strike.out:stolen.base          -2.611e-04  1.615e-04  -1.617 0.107004
> batting:stolen.base              1.552e-01  1.434e-01   1.082 0.280020
> arbitr.elgible.1:error           3.930e-03  1.390e-02   0.283 0.777495
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> Residual standard error: 0.4925 on 280 degrees of freedom
> Multiple R-Squared: 0.854,      Adjusted R-squared: 0.8248
> F-statistic: 29.24 on 56 and 280 DF,  p-value: < 2.2e-16
>

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html