Um, the link to StackOverflow does not seem to contain the same question. It does contain a stern warning not to use the $coef component of lm.ridge...

Is it perhaps the case that x1 and x2 have already been scaled to have standard deviation 1? In that case, x1*x2 won't be. Also note that SPSS tends to use "Beta" for standardized regression coefficients and (AFAIR) "b" for the unstandardized ones.
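A minimal sketch of the mechanism, on simulated data since the original 34,000-case dataset isn't available (the numbers are illustrative only, not Nick's):

library(MASS)   # for lm.ridge()

set.seed(1)
n  <- 1000
x1 <- rnorm(n)                                  # sd ~ 1 by construction
x2 <- 0.75 * x1 + sqrt(1 - 0.75^2) * rnorm(n)   # corr(x1, x2) ~ .75, sd ~ 1
y  <- 0.4 * x1 + 0.37 * x2 + 0.1 * x1 * x2 + rnorm(n)
ds <- data.frame(y, x1, x2)

fit.lm    <- lm(y ~ x1 * x2, data = ds)
fit.ridge <- lm.ridge(y ~ x1 * x2, lambda = 0, data = ds)

coef(fit.lm)      # ordinary least-squares coefficients
coef(fit.ridge)   # the same values: coef() undoes the internal scaling
fit.ridge$coef    # NOT the same: coefficients for the scaled predictors

## lm.ridge() rescales every column of the model matrix to sd 1
## (denominator n, not n-1) before fitting. The x1 and x2 columns
## already have sd ~ 1, so their scaled coefficients barely move, but
## the x1:x2 column does not, which is why only the interaction differs:
fit.ridge$scales  # per-column scale factors; the x1:x2 entry is not 1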
-pd

> On 4 May 2017, at 16:28 , Nick Brown <nick.br...@free.fr> wrote:
>
> Hello,
>
> I hope I am posting to the right place. I was advised to try this list by Ben Bolker (https://twitter.com/bolkerb/status/859909918446497795). I also posted this question to StackOverflow (http://stackoverflow.com/questions/43771269/lm-gives-different-results-from-lm-ridgelambda-0). I am a relative newcomer to R, but I wrote my first program in 1975 and have been paid to program in about 15 different languages, so I have some general background knowledge.
>
> I have a regression from which I extract the coefficients like this:
> lm(y ~ x1 * x2, data=ds)$coef
> That gives: x1=0.40, x2=0.37, x1*x2=0.09
>
> When I do the same regression in SPSS, I get:
> beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14
> So the main effects are in agreement, but there is quite a difference in the coefficient for the interaction.
>
> X1 and X2 are correlated at about .75 (yes, yes, I know - this model wasn't my idea, but it got published), so there is quite possibly something going on with collinearity. So I thought I'd try lm.ridge() to see if I can get an idea of where the problems are occurring.
>
> The starting point is to run lm.ridge() with lambda=0 (i.e., no ridge penalty) and check that we get the same results as with lm():
> lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef
> That gives: x1=0.40, x2=0.37, x1*x2=0.14
> So lm.ridge() agrees with SPSS, but not with lm(). (Of course, lambda=0 is the default, so it can be omitted; I can alternate between including and deleting ".ridge" in the function call and watch the coefficient for the interaction change.)
>
> What seems slightly strange to me here is that I had assumed lm.ridge() just piggybacks on lm() anyway, so in the specific case where lambda=0 and there is no "ridging" to do, I'd expect exactly the same results.
>
> Unfortunately there are 34,000 cases in the dataset, so a "minimal" reprex will not be easy to make, but I can share the data via Dropbox or something if that would help.
>
> I appreciate that when there is strong collinearity all bets are off in terms of what the betas mean, but I would really expect lm() and lm.ridge() to give the same results. (I would be happy to ignore SPSS, but for the moment it's part of the majority!)
>
> Thanks for reading,
> Nick

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com
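A follow-up sketch on the SPSS side of the comparison, reusing the simulated ds and fit.lm from the sketch above (std_beta is a hypothetical helper, not a function from base R or MASS). SPSS's "Beta" standardizes each entered column, including a hand-computed product term, so it treats x1*x2 the same way lm.ridge()'s internal scaling does:

std_beta <- function(fit) {
  ## standardized coefficients: b_j * sd(x_j) / sd(y)
  X <- model.matrix(fit)[, -1, drop = FALSE]   # drop the intercept column
  b <- coef(fit)[-1]
  b * apply(X, 2, sd) / sd(model.response(model.frame(fit)))
}
std_beta(fit.lm)   # the interaction moves away from its raw lm() value

If y also has sd 1 (as appears to be the case in Nick's data, given the matching main effects), these agree with lm.ridge()$coef up to the n versus n-1 denominator in the sd.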