On Sat, 26-Feb-2011 at 08:46AM -0800, Ridgeway, Greg wrote:

|> I have heard of this happening on other platforms before.
|> Frankly, I'm not sure how it happens. My best guess is that
|> there's a tiny bit of numeric instability in the 9th decimal
|> place or beyond, so that on a given iteration one variable,
|> essentially at random, looks better than the other.
|> Any other ideas? Greg
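As a rough illustration of Greg's guess, here is a base-R sketch. It is not gbm's actual split search: `sse()` and `split_gain()` are hypothetical helpers written only for this example, and the 1e-13 perturbation is an assumed stand-in for platform-dependent rounding. It uses the data from Josh's simpler example quoted below, where the two predictors differ by a constant and so produce the same partition of the response.

```r
# Hypothetical helpers (not part of gbm): squared-error reduction from
# splitting the response y on x <= cut.
sse <- function(v) sum((v - mean(v))^2)
split_gain <- function(x, y, cut) {
  left  <- y[x <= cut]
  right <- y[x > cut]
  sse(y) - (sse(left) + sse(right))
}

# Data from Josh's example: x2 is x1 shifted by 10, so they are
# perfectly correlated and split the response identically.
y  <- 1:50
x1 <- 50:1
x2 <- 60:11

g1 <- split_gain(x1, y, 25)   # x1 <= 25 selects the same rows as ...
g2 <- split_gain(x2, y, 35)   # ... x2 <= 35
identical(g1, g2)             # TRUE: an exact tie

# With an exact tie, which.max() returns the first candidate.
names(which.max(c(x1 = g1, x2 = g2)))

# Assumed stand-in for rounding noise far below the printed precision:
# the tie is broken in favour of the other variable.
g2_noisy <- g2 * (1 + 1e-13)
names(which.max(c(x1 = g1, x2 = g2_noisy)))
```

If something like this happens at each boosting iteration, the influence could end up on one variable or be shared between the two, depending on how each platform rounds.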
I played around with this some time ago and noticed that it happens
only when there's perfect or very nearly perfect correlation. I even
tried a third variable and it was ignored almost completely. I
concluded it's highly unlikely to cause a problem since real data
wouldn't have perfectly correlated variables -- or if they did, they'd
be easy enough to detect.

|>
|> ----- Original Message -----
|> From: Joshua Wiley <jwiley.ps...@gmail.com>
|> To: Axel Urbiz <axel.ur...@gmail.com>
|> Cc: R-help@r-project.org <R-help@r-project.org>; Ridgeway, Greg
|> Sent: Fri Feb 25 22:16:02 2011
|> Subject: Re: [R] Reproducibility issue in gbm (32 vs 64 bit)
|>
|> Hi Axel,
|>
|> I do not have a nice explanation why the results differ off the top of
|> my head. I can say I can replicate what you get on 32/64 bit (both
|> Windows 7) with the development version of R and gbm_1.6-3.1.
|>
|> Here is an even simpler example that shows the difference:
|>
|> gbmfit <- gbm(1:50 ~ I(50:1) + I(60:11), distribution = "gaussian")
|> summary(gbmfit)
|>
|> I copied the package maintainer.
|>
|> Cheers,
|>
|> Josh
|>
|> On Fri, Feb 25, 2011 at 7:29 PM, Axel Urbiz <axel.ur...@gmail.com> wrote:
|> > Dear List,
|> >
|> > The gbm package on Win 7 produces different results for the
|> > relative importance of input variables in R 32-bit relative to R 64-bit. Any
|> > idea why? Any idea which one is correct?
|> >
|> > Based on this example, it looks like the relative importance of 2 perfectly
|> > correlated predictors is "diluted" by half in 32-bit, whereas in 64-bit, one
|> > of these predictors gets all the importance and the other gets none. I found
|> > this interesting.
|> >
|> > ### Sample code
|> >
|> > library(gbm)
|> > set.seed(12345)
|> > xc=matrix(rnorm(100*20),100,20)
|> > y=sample(1:2,100,replace=TRUE)
|> > xc[,2] <- xc[,1]
|> > gbmfit <- gbm(y~xc[,1]+xc[,2] +xc[,3], distribution="gaussian")
|> > summary(gbmfit)
|> >
|> > ### Results on R 2.12.0 (32-bit)
|> >
|> >       var  rel.inf
|> > 1 xc[, 3] 49.76143
|> > 2 xc[, 1] 27.27432
|> > 3 xc[, 2] 22.96425
|> >
|> > ### Results on R 2.12.0 (64-bit)
|> >
|> > > summary(gbmfit)
|> >       var  rel.inf
|> > 1 xc[, 1] 50.23857
|> > 2 xc[, 3] 49.76143
|> > 3 xc[, 2]  0.00000
|> >
|> > Thanks,
|> > Axel.
|> >
|> > ______________________________________________
|> > R-help@r-project.org mailing list
|> > https://stat.ethz.ch/mailman/listinfo/r-help
|> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|> > and provide commented, minimal, self-contained, reproducible code.
|> >
|>
|> --
|> Joshua Wiley
|> Ph.D. Student, Health Psychology
|> University of California, Los Angeles
|> http://www.joshuawiley.com/

--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___    Patrick Connolly
 {~._.~}                   Great minds discuss ideas
 _( Y )_                 Average minds discuss events
(:_~*~_:)                  Small minds discuss people
 (_)-(_)                              ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
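P.S. On the point above that perfectly correlated predictors are easy enough to detect: a minimal base-R sketch, reusing the data from Axel's sample code. The tolerance `tol` and the variable names `r` and `dup` are assumptions for illustration, not anything gbm provides.

```r
# Rebuild the predictors from the sample code: column 2 copies column 1.
set.seed(12345)
xc <- matrix(rnorm(100 * 20), 100, 20)
xc[, 2] <- xc[, 1]

# Pairwise correlations among the three predictors used in the model.
r <- cor(xc[, 1:3])

# Flag pairs whose absolute correlation is within `tol` of 1.
# upper.tri() keeps each pair once and drops the diagonal.
tol <- 1e-8   # assumed threshold; loosen it to catch near-perfect cases
dup <- which(abs(r) > 1 - tol & upper.tri(r), arr.ind = TRUE)
dup           # one row per offending pair: here columns 1 and 2
```

Running a check like this before calling gbm() would let you drop one of each flagged pair, so the relative influence no longer depends on how ties are broken.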