[R] Reproducibility issue in gbm (32 vs 64 bit)

2014-07-26 Thread Bruce.W Morlan
Absolutely, even though the seed means the random number sequence starts in
the same place, the sequence generated will certainly drift in different
directions on different precision machines.

-- 
Information (and analysis) is power, and I am all about Power to the People.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

2011-03-03 Thread Patrick Connolly
On Sat, 26-Feb-2011 at 08:46AM -0800, Ridgeway, Greg wrote:

| I have heard about this happening before on other
| platforms. Frankly, I'm not positive how it happens. My best guess
| is that there's a tiny bit of numeric instability in the 9th or later
| decimal place, so that on a given iteration one variable, more or less
| at random, looks better than the other. Any other ideas?  Greg

I played around with this some time ago and noticed that it happens
only when there's perfect or very nearly perfect correlation.  I even
tried a third variable and it was ignored almost completely.  I
concluded it's highly unlikely to cause a problem since real data
wouldn't have perfectly correlated variables -- or if they did, they'd
be easy enough to detect.
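
A minimal sketch of how such pairs might be detected before fitting, using only base R. The helper name find_collinear_pairs and the 0.9999 threshold are assumptions made for illustration; neither comes from gbm.

```r
# Hypothetical helper (not part of gbm): list predictor pairs whose absolute
# correlation is at or above a threshold.
find_collinear_pairs <- function(x, threshold = 0.9999) {
  cm <- abs(cor(x))
  cm[lower.tri(cm, diag = TRUE)] <- 0      # keep each pair only once
  idx <- which(cm >= threshold, arr.ind = TRUE)
  data.frame(var1 = unname(idx[, "row"]),
             var2 = unname(idx[, "col"]),
             abs_cor = cm[idx])
}

# Same data construction as in the original post: column 2 copies column 1.
set.seed(12345)
xc <- matrix(rnorm(100 * 20), 100, 20)
xc[, 2] <- xc[, 1]

pairs_found <- find_collinear_pairs(xc[, 1:3])
print(pairs_found)
```

Any pair reported here is a candidate for the importance-splitting behaviour discussed in this thread.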





Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

2011-02-28 Thread Guelman, Leo
I'm guessing this has something to do with numerical precision on the two 
platforms.

Leo.
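
As a small illustration of the kind of last-bit difference numerical precision can produce, the sketch below adds the same three numbers in two orders. It only shows that double-precision results can depend on evaluation order; whether this is what actually happens inside gbm on the two platforms is an assumption, not something established in this thread.

```r
# The same three numbers, summed in two different orders.
a <- (0.1 + 0.2) + 0.3
b <- 0.1 + (0.2 + 0.3)

bitwise_equal <- identical(a, b)            # exact comparison
nearly_equal  <- isTRUE(all.equal(a, b))    # comparison with a tolerance

print(c(bitwise_equal = bitwise_equal, nearly_equal = nearly_equal))
print(a - b)                                # the size of the discrepancy
```

When comparing fits across machines, a tolerance-based check such as all.equal() is usually a safer test than identical().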



Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

2011-02-26 Thread Ridgeway, Greg
I have heard about this happening before on other platforms. Frankly, I'm not
positive how it happens. My best guess is that there's a tiny bit of numeric
instability in the 9th or later decimal place, so that on a given iteration one
variable, more or less at random, looks better than the other. Any other ideas?
Greg
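
A rough sketch of that guess in base R. The function best_split_gain is a hypothetical stand-in for a tree's split score (the best single-split reduction in squared error); it is not gbm's internal code. Two identical predictors tie exactly, and a perturbation far below the printed precision is enough to change which one wins.

```r
# Hypothetical split score: largest reduction in squared error from one split on x.
best_split_gain <- function(x, y) {
  y <- y[order(x)]
  n <- length(y)
  total_ss <- sum((y - mean(y))^2)
  gains <- sapply(seq_len(n - 1), function(i) {
    left  <- y[1:i]
    right <- y[(i + 1):n]
    total_ss - sum((left - mean(left))^2) - sum((right - mean(right))^2)
  })
  max(gains)
}

set.seed(12345)
x1 <- rnorm(100)
x2 <- x1                                    # perfectly correlated copy
y  <- sample(1:2, 100, replace = TRUE)

gains <- c(x1 = best_split_gain(x1, y), x2 = best_split_gain(x2, y))
tie <- identical(gains[["x1"]], gains[["x2"]])

# which.max() returns the first maximum, so an exact tie goes to x1.
winner_exact <- names(which.max(gains))

# A perturbation of 1e-12 on x2's score (an assumed stand-in for platform
# rounding differences) flips the choice.
winner_perturbed <- names(which.max(gains + c(x1 = 0, x2 = 1e-12)))

print(c(exact = winner_exact, perturbed = winner_perturbed))
```

If the tie is resolved the same way on every iteration, one variable collects all of the importance; if it is resolved differently from iteration to iteration, the importance is shared between the two.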



[R] Reproducibility issue in gbm (32 vs 64 bit)

2011-02-25 Thread Axel Urbiz
Dear List,

The gbm package on Win 7 produces different results for the
relative importance of input variables in R 32-bit relative to R 64-bit. Any
idea why? Any idea which one is correct?

Based on this example, it looks like the relative importance of 2 perfectly
correlated predictors is diluted by half in 32-bit, whereas in 64-bit, one
of these predictors gets all the importance and the other gets none. I found
this interesting.

### Sample code

library(gbm)
set.seed(12345)
xc=matrix(rnorm(100*20),100,20)
y=sample(1:2,100,replace=TRUE)
xc[,2] <- xc[,1]   # make column 2 an exact copy of column 1
gbmfit <- gbm(y~xc[,1]+xc[,2] +xc[,3], distribution="gaussian")
summary(gbmfit)

### Results on R 2.12.0 (32-bit)

      var  rel.inf
1 xc[, 3] 49.76143
2 xc[, 1] 27.27432
3 xc[, 2] 22.96425

### Results on R 2.12.0 (64-bit)
> summary(gbmfit)
      var  rel.inf
1 xc[, 1] 50.23857
2 xc[, 3] 49.76143
3 xc[, 2]  0.00000

Thanks,
Axel.
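
A quick arithmetic check on the two result tables above, using only the numbers as posted: the combined importance of the two identical predictors is the same on both platforms, and only the way it is divided between them differs.

```r
# Relative influence values copied from the two tables in this post.
rel_inf_32 <- c("xc[, 3]" = 49.76143, "xc[, 1]" = 27.27432, "xc[, 2]" = 22.96425)
rel_inf_64 <- c("xc[, 1]" = 50.23857, "xc[, 3]" = 49.76143, "xc[, 2]" = 0)

pair <- c("xc[, 1]", "xc[, 2]")             # the perfectly correlated columns
combined_32 <- sum(rel_inf_32[pair])
combined_64 <- sum(rel_inf_64[pair])

same_total <- isTRUE(all.equal(combined_32, combined_64))
print(c(combined_32 = combined_32, combined_64 = combined_64))
print(same_total)
```

The importance of xc[, 3] is also identical in both tables, so the two fits appear to differ only in how the duplicated predictor is credited.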



Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

2011-02-25 Thread Joshua Wiley
Hi Axel,

I do not have a nice explanation off the top of my head for why the
results differ.  I can say that I can replicate what you get on 32-bit and
64-bit R (both on Windows 7) with the development version of R and
gbm_1.6-3.1.

Here is an even simpler example that shows the difference:

gbmfit <- gbm(1:50 ~ I(50:1) + I(60:11), distribution = "gaussian")
summary(gbmfit)

I copied the package maintainer.

Cheers,

Josh
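
For what it is worth, the two predictors in the simpler example above are themselves perfectly correlated, since 60:11 is just 50:1 shifted by 10. A short base R check follows; the variable names are chosen here for illustration.

```r
x_a <- 50:1
x_b <- 60:11

# Every element of x_b is exactly 10 more than the matching element of x_a.
constant_shift <- all(x_b - x_a == 10)

# So their correlation is 1, up to floating-point rounding.
r <- cor(x_a, x_b)

print(constant_shift)
print(r)
```

So this example exercises the same situation as the original post: two predictors that carry identical information about the response.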

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
