On 11-02-10 6:37 AM, Graham Williams wrote:
Should one expect minor numerical differences between 64bit and 32bit R on
Windows? Hunting around the lists I've not been able to find a definitive
answer yet. Seems plausible using different precision arithmetic, but wanted
to confirm from those who might know for sure.

I think our goal is that those results should be as close as possible. R uses the same precision in both 32 bit and 64 bit; the differences are all in pointers, not floating point values.
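
A quick way to see this on a given build is to inspect `.Machine`. This is only a sketch of the check, not part of the original report: run on both builds, the floating-point fields should match while the pointer size differs.

```r
# Floating-point characteristics and pointer size of the running R build.
fp <- .Machine[c("double.eps", "double.digits", "sizeof.pointer")]
print(fp)

# On IEEE 754 platforms doubles carry a 53-bit significand in both
# 32 bit and 64 bit R; sizeof.pointer is what distinguishes the builds
# (4 bytes on 32 bit, 8 bytes on 64 bit).
```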

However, the two versions use different run-time libraries, and it is possible that there are precision differences coming from there. I think we'd be interested in knowing what they are even if they are beyond our control, so I would appreciate it if you could track down where the difference arises.
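
One possible way to track this down, sketched here under assumptions: save an intermediate result on one build, load it on the other, and compare with zero tolerance, printing all 17 significant digits so that last-bit differences become visible. The helper `full_precision()` and the sample `values` are hypothetical placeholders; in practice `values` would be something like `model$cptable`.

```r
# Hypothetical helper (not a base R function): format doubles with enough
# digits to distinguish any two distinct values.
full_precision <- function(x) sprintf("%.17g", x)

# Placeholder data standing in for an intermediate result.
values <- c(0.1, 1/3, exp(1))

# Save on one machine ...
path <- tempfile(fileext = ".rds")
saveRDS(values, path)

# ... and compare on the other (here the same session, for illustration).
restored <- readRDS(path)
print(identical(values, restored))                          # bit-for-bit check
print(isTRUE(all.equal(values, restored, tolerance = 0)))   # zero-tolerance check
print(full_precision(values))
```

Applying the same comparison to progressively earlier intermediate values should narrow down the first step at which the two builds diverge.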

Duncan Murdoch


BACKGROUND

A colleague was trying to replicate some modelling results (from a soon to
be published book) using rpart, ada, and randomForest, among others. My 64bit
Linux and 64bit Windows 7 always agree (so far), but their 32bit Windows
does not. I've distilled it to a few simple lines of code that reproduce the
differences (but had to stay with the weather dataset from rattle, since I
could not reproduce them on standard datasets yet).

library(rpart)
library(rattle)
set.seed(41)
model <- rpart(RainTomorrow ~ ., data=weather[-c(1, 2, 23)],
               control=rpart.control(minbucket=0))
print(model$cptable)

Final row on 32bit: 9 0.01000000     23 0.1515152 1.1060606 0.1158273
Final row on 64bit: 9 0.01000000     23 0.1515152 1.0909091 0.1152273

Pretty minor, but different. I've not found any seed other than 41 (only
tried a few) that results in a difference.
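
As a rough sanity check on the size of that difference, the xerror values can be converted back into counts. This is a sketch resting on an assumption: that the root node has 66 misclassified cases, which is inferred from the rel error 0.1515152 being 10/66 and is not stated in the report. Under that assumption the two builds differ by a single cross-validated misclassification, which would point to one observation being assigned or split differently rather than to small rounding in the printed values.

```r
# ASSUMPTION: 66 errors at the root node (inferred from 0.1515152 = 10/66).
n_root <- 66

# xerror values from the final cptable row reported above.
xerror <- c(bit32 = 1.1060606, bit64 = 1.0909091)

# rpart reports xerror relative to the root node error, so multiplying
# back should give (near-)integer counts of cross-validated errors.
cv_errors <- round(xerror * n_root)
print(cv_errors)

# How far each product is from a whole number (should be tiny).
print(abs(xerror * n_root - cv_errors))
```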

library(ada) # using rpart underneath
set.seed(41)
model <- ada(RainTomorrow ~ ., data=weather[-c(1, 2, 23)])
print(model)

On 32bit: Train Error: 0.057
On 64bit: Train Error: 0.055

Changing the seed to 42, for example, brings them into sync.

library(randomForest)
set.seed(41)
model <- randomForest(RainTomorrow ~ ., data=weather[-c(1, 2, 23)],
                      importance=TRUE, na.action=na.roughfix)
print(model)

On 32bit:  OOB estimate of  error rate: 12.84%
On 64bit:  OOB estimate of  error rate: 11.75%


sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] randomForest_4.5-36 pmml_1.2.27         XML_3.2-0.2
[4] colorspace_1.0-1    RGtk2_2.20.3        ada_2.0-2
[7] rattle_2.6.2        rpart_3.1-47

loaded via a namespace (and not attached):
[1] tools_2.12.1

sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-mingw32/x64 (64-bit)
...


Thanks,
Graham


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
