Dear all,

I have one dependent variable y and two independent variables, x1 and x2, which I would like to use to explain y. x1 and x2 are design factors in an experiment and are uncorrelated with each other. For example, assume that:
x1 <- rbind(1,1,1,2,2,2,3,3,3)
x2 <- rbind(1,2,3,1,2,3,1,2,3)
cor(x1, x2)

The problem is that I want to analyze not only the effect of x1 and x2 on y but also that of their interaction, x1*x2. Evidently this interaction term has a substantial correlation with both x1 and x2:

x3 <- x1*x2
cor(x1, x3)
cor(x2, x3)

I therefore expect that a simple regression of y on x1, x2 and x1*x2 will lead to biased results due to multicollinearity. For example, even when y is completely random and unrelated to x1 and x2, I obtain a substantial R2 for a simple linear model that includes all three variables. This evidently does not make sense:

y <- rnorm(9)
model <- lm(y ~ x1 + x2 + x1*x2)
summary(model)

Is there some function within R, or in some separate package, that allows me to estimate such a regression without obtaining inconsistent results?

Thanks for your help in advance,

Michael

Michael Haenlein
Associate Professor of Marketing
ESCP Europe
Paris, France

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
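[Editor's note: one standard way to address the collinearity described above is to mean-center the predictors before forming the product term. The sketch below is an illustration, not part of the original post; in this balanced 3x3 design, centering makes the main effects exactly uncorrelated with the interaction, without changing the interaction test itself.]

# Sketch (assumed remedy, not from the original post): mean-center the
# design factors before forming the interaction term.
x1 <- rep(1:3, each = 3)    # same values as rbind(1,1,1,2,2,2,3,3,3)
x2 <- rep(1:3, times = 3)   # same values as rbind(1,2,3,1,2,3,1,2,3)
x1c <- x1 - mean(x1)
x2c <- x2 - mean(x2)

# In this balanced design the centered product is uncorrelated with
# both centered main effects (the cross-products sum to zero):
cor(x1c, x1c * x2c)  # 0 for this balanced design
cor(x2c, x1c * x2c)  # 0 as well

# The formula x1c * x2c expands to x1c + x2c + x1c:x2c:
set.seed(1)          # for a reproducible random y
y <- rnorm(9)
model <- lm(y ~ x1c * x2c)
summary(model)

Note that a nonzero R2 will still appear by chance here: with 9 observations and 4 estimated coefficients, the multiple R2 is inflated in any small sample, so the adjusted R2 and the overall F-test are the quantities to look at, regardless of collinearity.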