Dear all,

I have a dependent variable y that I would like to explain with two
independent variables, x1 and x2. x1 and x2 are design factors in an
experiment and are not correlated with each other. For example, assume that:

x1 <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
x2 <- c(1, 2, 3, 1, 2, 3, 1, 2, 3)
cor(x1, x2)  # exactly 0: the design is a balanced 3 x 3 grid

The problem is that I want to analyze not only the effects of x1 and x2 on
y but also that of their interaction x1*x2. Evidently, this interaction term
has a substantial correlation with both x1 and x2:

x3 <- x1 * x2
cor(x1, x3)  # about 0.68
cor(x2, x3)  # about 0.68 as well, by symmetry of the design
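
To put a number on this collinearity, the variance inflation factor of the
interaction can be computed directly in base R by regressing x3 on the other
two predictors (a quick sketch; the variable name r2_x3 is just a local label):

r2_x3 <- summary(lm(x3 ~ x1 + x2))$r.squared  # how well x1 and x2 explain x3
1 / (1 - r2_x3)                               # VIF of the interaction: 13 for this design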

I therefore expect that a simple regression of y on x1, x2 and x1*x2 will
lead to biased results due to multicollinearity. For example, even when y is
completely random and unrelated to x1 and x2, I obtain a substantial R^2 for
a simple linear model that includes all three variables. This evidently
does not make sense:

set.seed(1)  # arbitrary seed so the example is reproducible
y <- rnorm(9)
model <- lm(y ~ x1 * x2)  # x1 * x2 expands to x1 + x2 + x1:x2
summary(model)
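
Repeating this with many random draws of y (the replication count below is
arbitrary) shows that R^2 values of this size are routine when three slopes
are estimated from only 9 observations:

r2 <- replicate(1000, summary(lm(rnorm(9) ~ x1 * x2))$r.squared)
mean(r2)  # near 3/8, i.e. p/(n - 1) with p = 3 slopes and n = 9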

Is there a function in base R or in a contributed package that allows me
to estimate such a regression without obtaining inconsistent results?
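
One remedy that is often suggested is to mean-center x1 and x2 before
forming the interaction; in this balanced design that removes the
correlation with the main effects entirely. A sketch in base R (the
centered model is a reparameterization of the original one, so fitted
values and R^2 are unchanged):

x1c <- x1 - mean(x1)
x2c <- x2 - mean(x2)
cor(x1c, x1c * x2c)  # exactly 0 in this balanced design
cor(x2c, x1c * x2c)  # also exactly 0
model_c <- lm(y ~ x1c * x2c)  # same fit as the model above
summary(model_c)

Would centering be the recommended approach here, or is there a more
principled alternative?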

Thanks in advance for your help,

Michael


Michael Haenlein
Associate Professor of Marketing
ESCP Europe
Paris, France
