On 04/05/2017 10:28 AM, Nick Brown wrote:
Hello,
I hope I am posting to the right place. I was advised to try this list by Ben
Bolker (https://twitter.com/bolkerb/status/859909918446497795). I also posted
this question to StackOverflow
(http://stackoverflow.com/questions/43771269/lm-gives-different-results-from-lm-ridgelambda-0).
I am a relative newcomer to R, but I wrote my first program in 1975 and have
been paid to program in about 15 different languages, so I have some general
background knowledge.
I have a regression from which I extract the coefficients like this:
lm(y ~ x1 * x2, data=ds)$coef
That gives: x1=0.40, x2=0.37, x1*x2=0.09
When I do the same regression in SPSS, I get:
beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14.
So the main effects are in agreement, but there is quite a difference in the
coefficient for the interaction.
I don't know about this instance, but a common cause of this sort of
difference is a different parametrization. If that's the case, then
predictions in the two systems would match, even if coefficients don't.
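A minimal illustration of that point, using simulated data standing in for ds (the numbers are made up; the pattern - same predictions, different coefficients - is what matters): centering x1 and x2 is one such reparametrization.

## Simulated data, roughly matching the described correlation of 0.75
set.seed(1)
sim <- data.frame(x1 = rnorm(1000))
sim$x2 <- 0.75 * sim$x1 + rnorm(1000, sd = sqrt(1 - 0.75^2))
sim$y  <- 0.4 * sim$x1 + 0.37 * sim$x2 + 0.1 * sim$x1 * sim$x2 + rnorm(1000)

fit_raw <- lm(y ~ x1 * x2, data = sim)

## Same model after centering the predictors (a reparametrization)
sim_c <- sim
sim_c$x1 <- sim_c$x1 - mean(sim_c$x1)
sim_c$x2 <- sim_c$x2 - mean(sim_c$x2)
fit_centered <- lm(y ~ x1 * x2, data = sim_c)

coef(fit_raw)       # main-effect coefficients differ from ...
coef(fit_centered)  # ... these (the interaction coefficient stays the same)
all.equal(fitted(fit_raw), fitted(fit_centered))  # TRUE: identical predictions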
Duncan Murdoch
X1 and X2 are correlated at about .75 (yes, yes, I know - this model wasn't my
idea, but it got published), so there is quite possibly something going on with
collinearity. I therefore thought I'd try lm.ridge() to get an idea of where the
problems are occurring.
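One quick way to see how bad the collinearity is, including for the interaction column, is to look at the model matrix directly. A sketch, assuming the fitted lm object from above is called fit:

fit <- lm(y ~ x1 * x2, data = ds)
round(cor(model.matrix(fit)[, -1]), 2)   # correlations among the x1, x2 and x1:x2 columns
kappa(scale(model.matrix(fit)[, -1]), exact = TRUE)   # condition number as a rough diagnostic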
The starting point is to run lm.ridge() with lambda=0 (i.e., no ridge penalty)
and check that we get the same results as with lm():
lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef
x1=0.40, x2=0.37, x1*x2=0.14
So lm.ridge() agrees with SPSS, but not with lm(). (Of course, lambda=0 is the default,
so it can be omitted; I can alternate between including and deleting ".ridge" in
the function call and watch the coefficient for the interaction change.)
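For anyone who wants to poke at this without the full dataset, the comparison boils down to something like the following (a sketch; it assumes MASS is loaded and ds is the data frame above, and the commented values are just the ones I reported earlier). It may also be worth comparing the $coef component against the coef() method of the ridgelm object, since they need not report the same thing:

library(MASS)

fit_lm    <- lm(y ~ x1 * x2, data = ds)
fit_ridge <- lm.ridge(y ~ x1 * x2, lambda = 0, data = ds)

fit_lm$coef        # x1=0.40, x2=0.37, x1:x2=0.09 (plus the intercept)
fit_ridge$coef     # the $coef component: x1=0.40, x2=0.37, x1:x2=0.14
coef(fit_ridge)    # the coef() method; worth checking which of the above it matches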
What seems slightly strange to me here is that I had assumed lm.ridge() just piggybacks
on lm() anyway, so in the specific case where lambda=0 and there is no
"ridging" to do, I'd expect exactly the same results.
Unfortunately there are 34,000 cases in the dataset, so a "minimal" reprex will
not be easy to make, but I can share the data via Dropbox or something if that would help.
I appreciate that when there is strong collinearity then all bets are off in
terms of what the betas mean, but I would really expect lm() and lm.ridge() to
give the same results. (I would be happy to ignore SPSS, but for the moment
it's part of the majority!)
Thanks for reading,
Nick
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel