On 04/06/2015 3:59 AM, Thierry Onkelinx wrote:
> Dear Duncan,
>
> I had been thinking about FAQ 7.31. I tried to create a dummy dataset
> with the same structure to replicate the problem without the need of
> sending my dataset. However all of them gave identical() results between
> 32-bit and 64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it
> correct to infer that tiny differences between 32-bit and 64-bit are
> possible but have a low probability of occurring?

Differences are rare, but it's hard to assign a probability to them.

Duncan Murdoch

> signif() makes indeed more sense than round(). Using 20 digits gives
> identical results, 21 digits gives non identical results.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to
> say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of
> data. ~ John Tukey
>
> 2015-06-03 18:09 GMT+02:00 Duncan Murdoch <murdoch.dun...@gmail.com>:
>
> On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
> > Dear all,
> >
> > I'm a bit puzzled by the difference in an object when created in R 32-bit
> > and R 64-bit.
> >
> > Consider the code below.
> > test.rda is available at
> >
> > https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
> >
> > # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
> > library(lme4)
> > load("test.rda")
> > coef.32 <- coef(test)
> > save(coef.32, file = "32bit.rda")
> >
> > # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
> > library(lme4)
> > load("~/test.rda")
> > coef.64 <- coef(test)
> > save(coef.64, file = "64bit.rda")
> >
> >
> > # Compare the results
> > # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
> > # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
> > library(lme4)
> > load("32bit.rda")
> > load("64bit.rda")
> > identical(coef.32, coef.64) # FALSE
> > identical(coef.32$fRow, coef.64$fRow) # FALSE
> > identical(coef.32$fLocation, coef.64$fLocation) # TRUE
> > identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
> >
> > The first comparison is FALSE, because the second is FALSE. But why is the
> > second FALSE and the third and fourth TRUE?
> >
> > My goal is to calculate a SHA1 hash on the coef(test) to track if the
> > coefficients of test have changed. I'd like to get the same hash on a
> > 32-bit and 64-bit system. A simple hack would be to calculate the hash on
> > round(coef(test), 20). Is that a good or bad idea?
> >
> > identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE
>
> Different math libraries round differently, so small differences are
> expected. This is FAQ 7.31. In many cases the 32 bit calculations are
> more accurate, because they tend to use more 80 bit extended precision
> intermediate values, but that is not guaranteed.
>
> Rounding before comparing makes sense, but I would use signif() instead
> of round(), I would choose a relatively small number of significant
> digits, and I would expect to see a few false positives: if the true
> value is 0 but some "random" noise is added, I'd expect values rounded
> by signif() to be unequal.
>
> Duncan Murdoch
>
> > Best regards,
> >
> > ir.
> > Thierry Onkelinx

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
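
The advice quoted in this thread (signif() rather than round(), a relatively small number of significant digits, and the caveat about values whose true value is 0) can be sketched in base R. This is only an illustration under stated assumptions: the numbers are made up and do not come from test.rda, the choice of 10 significant digits is arbitrary, and the optional hashing step assumes the digest package is installed.

```r
# Illustrative values only: two doubles that differ in their last bits,
# standing in for one coefficient computed on 32-bit and on 64-bit R.
x32 <- 0.1 + 0.2
x64 <- 0.3

same_raw    <- identical(x32, x64)                          # FALSE
same_signif <- identical(signif(x32, 10), signif(x64, 10))  # TRUE

# round() keeps a fixed number of decimal places, signif() keeps a fixed
# number of significant digits, so round() wipes out small values entirely.
small <- 1.23456789e-12
rounded_small <- round(small, 6)    # 0
signif_small  <- signif(small, 6)   # still non-zero

# The caveat from the thread: a true zero plus platform-dependent noise
# still differs after signif(), because the noise has its own digits.
noise32 <- 1e-17   # hypothetical noise on one platform
noise64 <- 3e-17   # hypothetical noise on the other
same_noise <- identical(signif(noise32, 10), signif(noise64, 10))  # FALSE

# Optional: hash the rounded values (assumes the digest package is installed).
if (requireNamespace("digest", quietly = TRUE)) {
  h32 <- digest::digest(signif(x32, 10), algo = "sha1")
  h64 <- digest::digest(signif(x64, 10), algo = "sha1")
  print(identical(h32, h64))
}
```

For a data.frame such as coef(test)$fRow, signif() could be applied in the same way round() is applied in the thread, assuming all columns are numeric. Values that are noise around zero would need separate handling before hashing.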