Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
On Fri, 18 Aug 2023 12:17:51 +0200 Martin Maechler wrote: > I think it would be nice to provide the average R user with a > (possibly super small) R package that allows to turn on (and off) > such CNR reproducibility. Would it be possible to effect this on/off via options()? cheers, Rolf -- Honorary Research Fellow Department of Statistics University of Auckland Stats. Dep't. (secretaries) phone: +64-9-373-7599 ext. 89622 Home phone: +64-9-480-4619 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
> Bill Dunlap > on Thu, 17 Aug 2023 07:31:12 -0700 writes: > MKL's results can depend on the number of threads running and perhaps other > things. They blame it on the non-associativity of floating point > arithmetic. This article gives a way to make results repeatable: > https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-the-conditional-numerical-reproducibility-cnr.html > -Bill Thanks a lot, Bill, for your answer and this pointer to MKL creators explanation and details on how to achieve reproducibility ! Conditional reproducibility with some cost - speedwise: If limited to a particular code-path, Intel MKL performance can in some circumstances degrade by more than half. This can be easily understood by noting that matrix-multiply performance nearly doubled with the introduction of new processors supporting AVX instructions. In other cases, performance may degrade by 10-20% if algorithms are restricted to maintain the order of operations. It confirms what I have been claiming for about a dozen years, that in my experience, *all* optimizing BLAS/Lapack libraries trade speed for accuracy to some extent, and hence an unoptimized BLAS/Lapack (as the one in R's sources), should be considered gold-standard. {Consequently, I have not been fully happy that we switched to using an *external* Lapack, when one is found during configuration. Of course there *are* other good reasons to do such a switch, notably compatibility with other math software running on the same system.} Now, in the above report, they show how to ensure CNR := conditional numerical reproducibility when using MKL ("conditional" means e.g. the same level of parallelization). on Intel compatible chips. I think it would be nice to provide the average R user with a (possibly super small) R package that allows to turn on (and off) such CNR reproducibility. Strict reproducibility when running on the same hardware and software should be a very high desideratum for all scientists and I hope also for all data "wranglers" etc.. Martin -- Martin Maechler ETH Zurich and R Core team > On Wed, Aug 16, 2023 at 8:11 PM Shu Fai Cheung > wrote: >> Hi All, >> >> When addressing an error in one of my packages in the CRAN check at CRAN, >> I found something strange which, I believe, is unrelated to my package. >> I am not sure whether it is a bug or a known issue. Therefore, I would like >> to have advice from experts here. >> >> The error at CRAN check occurred in a test, and only on R-devel with MKL. >> No >> issues on all other platforms. The test compares two sets of random >> numbers, >> supposed to be identical because generated from the same seed. However, >> when >> I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked >> to >> MKL (2023.2.0), I found that this is not the case in one situation. >> >> I don't know why but somehow only one particular set of means and >> covariance matrix, among many others in the code, led to that problem. >> Please find >> below the code to reproduce the means and the covariance matrix (pardon me >> for the long code): >> >> mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, >> 1.6296642535174521, >> 2.9045763671121816, -0.19816874840500939, 0.42610841566522556, >> 0.30155498531316366, 1.0339601619394503, 3.4125587827873192, >> 13.125481598475405, 19.275480386183503, 658.94225353462195, >> 1.0997193726987393, >> 9.9980286642877214, 6.4947188998602092, -12.952617780813679, >> 8.4991882784024799, 106.82209520278377, 0.19623712449219838), class = >> c("numeric")) >> >> sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17, >> -2.5269697696885822e-17, >> -1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19, >> -2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17, >> 1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15, >> -2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16, >> 2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33, >> 2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183, >> 1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17, >> -9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17, >> -7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18, >> -1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18, >> -0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16, >> 1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17, >> 7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231, >> -1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
Thanks a lot! I am not aware of this behavior! I will take this into account in my work. Regards, Shu Fai On Thu, Aug 17, 2023 at 10:31 PM Bill Dunlap wrote: > > MKL's results can depend on the number of threads running and perhaps other > things. They blame it on the non-associativity of floating point arithmetic. > This article gives a way to make results repeatable: > > https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-the-conditional-numerical-reproducibility-cnr.html > > -Bill > > On Wed, Aug 16, 2023 at 8:11 PM Shu Fai Cheung > wrote: >> >> Hi All, >> >> When addressing an error in one of my packages in the CRAN check at CRAN, >> I found something strange which, I believe, is unrelated to my package. >> I am not sure whether it is a bug or a known issue. Therefore, I would like >> to have advice from experts here. >> >> The error at CRAN check occurred in a test, and only on R-devel with MKL. No >> issues on all other platforms. The test compares two sets of random numbers, >> supposed to be identical because generated from the same seed. However, when >> I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked to >> MKL (2023.2.0), I found that this is not the case in one situation. >> >> I don't know why but somehow only one particular set of means and >> covariance matrix, among many others in the code, led to that problem. >> Please find >> below the code to reproduce the means and the covariance matrix (pardon me >> for the long code): >> >> mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, >> 1.6296642535174521, >> 2.9045763671121816, -0.19816874840500939, 0.42610841566522556, >> 0.30155498531316366, 1.0339601619394503, 3.4125587827873192, >> 13.125481598475405, 19.275480386183503, 658.94225353462195, >> 1.0997193726987393, >> 9.9980286642877214, 6.4947188998602092, -12.952617780813679, >> 8.4991882784024799, 106.82209520278377, 0.19623712449219838), class = >> c("numeric")) >> >> sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17, >> -2.5269697696885822e-17, >> -1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19, >> -2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17, >> 1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15, >> -2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16, >> 2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33, >> 2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183, >> 1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17, >> -9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17, >> -7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18, >> -1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18, >> -0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16, >> 1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17, >> 7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231, >> -1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16, >> 1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15, >> 1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17, >> 5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102, >> -1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32, >> -1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133, >> 0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17, >> -2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15, >> 7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15, >> -1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16, >> -0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31, >> -5.058673986695e-33, 2.0667225433033359e-19, 1.0141735013231455e-18, >> 3.2845977347050714e-17, -2.9662734037779067e-19, 9.9595082331331027e-05, >> -1.9982466114987023e-18, 2.6420113469249318e-18, 5.2415327077690342e-19, >> 3.007533948014542e-18, -5.1128158961673558e-17, 6.7791246559883483e-19, >> -3.042327665547861e-15, 1.9523738271941284e-19, -4.0556768902105205e-20, >> -1.024372770865438e-17, -0.010638955366526863, 2.5786757042756338e-17, >> 1.3546130005558746e-32, 1.7947249868721723e-33, 1.9228405363682382e-19, >> -4.1209014001548407e-18, 1.5228291529275058e-17, 2.0443162438349059e-17, >> -1.9572693345249383e-18, 0.0012709581028842473, -0.0018515998737074948, >> 9.3128024867175073e-20, -5.1895788671618993e-17, 2.7373981615289174e-16, >> 1.2812977711597223e-17, -2.2792319486263267e-15, 4.1599503721278813e-19, >> -3.7733269771394201e-20, 4.16234419478765e-17, -1.5986158133468129e-16, >> -0.016037670900735834, -5.0763064708173244e-33, -1.0176066166236902e-50, >> -5.9296704797665404e-18, -8.0698801402861772e-17, 2.9646619457173492e-16, >> -2.8879431119718227e-17, 2.
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
Ah, super interesting! Thanks for the pointer, Bill. I might not have seen the issue because I have export MKL_NUM_THREADS=1 in my .profile, so MKL never goes beyond using one thread on my machine. That's a potential source of non-reproducibility I have not come across. So the 'wisdom' that one can only guarantee full reproducibility if one locks the machine in a safe isn't even true unless one takes extra steps when using MKL. Best, Wolfgang >-Original Message- >From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bill Dunlap >Sent: Thursday, 17 August, 2023 16:31 >To: Shu Fai Cheung >Cc: r-help@r-project.org >Subject: Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when >the seed is the same? > >MKL's results can depend on the number of threads running and perhaps other >things. They blame it on the non-associativity of floating point >arithmetic. This article gives a way to make results repeatable: > >https://www.intel.com/content/www/us/en/developer/articles/technical/introduction >-to-the-conditional-numerical-reproducibility-cnr.html > >-Bill > >On Wed, Aug 16, 2023 at 8:11 PM Shu Fai Cheung >wrote: > >> Hi All, >> >> When addressing an error in one of my packages in the CRAN check at CRAN, >> I found something strange which, I believe, is unrelated to my package. >> I am not sure whether it is a bug or a known issue. Therefore, I would like >> to have advice from experts here. >> >> The error at CRAN check occurred in a test, and only on R-devel with MKL. >> No >> issues on all other platforms. The test compares two sets of random >> numbers, >> supposed to be identical because generated from the same seed. However, >> when >> I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked >> to >> MKL (2023.2.0), I found that this is not the case in one situation. >> >> I don't know why but somehow only one particular set of means and >> covariance matrix, among many others in the code, led to that problem. >> Please find >> below the code to reproduce the means and the covariance matrix (pardon me >> for the long code): >> >> mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, >> 1.6296642535174521, >> 2.9045763671121816, -0.19816874840500939, 0.42610841566522556, >> 0.30155498531316366, 1.0339601619394503, 3.4125587827873192, >> 13.125481598475405, 19.275480386183503, 658.94225353462195, >> 1.0997193726987393, >> 9.9980286642877214, 6.4947188998602092, -12.952617780813679, >> 8.4991882784024799, 106.82209520278377, 0.19623712449219838), class = >> c("numeric")) >> >> sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17, >> -2.5269697696885822e-17, >> -1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19, >> -2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17, >> 1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15, >> -2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16, >> 2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33, >> 2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183, >> 1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17, >> -9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17, >> -7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18, >> -1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18, >> -0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16, >> 1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17, >> 7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231, >> -1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16, >> 1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15, >> 1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17, >> 5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102, >> -1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32, >> -1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133, >> 0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17, >> -2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15, >> 7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15, >> -1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16, >> -0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31, >> -5.058673986695e-33, 2.066
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
Thanks, I was missing the point that this was *non-repeatability on the same platform*. On 2023-08-17 10:31 a.m., Bill Dunlap wrote: MKL's results can depend on the number of threads running and perhaps other things. They blame it on the non-associativity of floating point arithmetic. This article gives a way to make results repeatable: https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-the-conditional-numerical-reproducibility-cnr.html -Bill On Wed, Aug 16, 2023 at 8:11 PM Shu Fai Cheung wrote: Hi All, When addressing an error in one of my packages in the CRAN check at CRAN, I found something strange which, I believe, is unrelated to my package. I am not sure whether it is a bug or a known issue. Therefore, I would like to have advice from experts here. The error at CRAN check occurred in a test, and only on R-devel with MKL. No issues on all other platforms. The test compares two sets of random numbers, supposed to be identical because generated from the same seed. However, when I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked to MKL (2023.2.0), I found that this is not the case in one situation. I don't know why but somehow only one particular set of means and covariance matrix, among many others in the code, led to that problem. Please find below the code to reproduce the means and the covariance matrix (pardon me for the long code): mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, 1.6296642535174521, 2.9045763671121816, -0.19816874840500939, 0.42610841566522556, 0.30155498531316366, 1.0339601619394503, 3.4125587827873192, 13.125481598475405, 19.275480386183503, 658.94225353462195, 1.0997193726987393, 9.9980286642877214, 6.4947188998602092, -12.952617780813679, 8.4991882784024799, 106.82209520278377, 0.19623712449219838), class = c("numeric")) sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17, -2.5269697696885822e-17, -1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19, -2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17, 1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15, -2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16, 2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33, 2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183, 1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17, -9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17, -7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18, -1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18, -0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16, 1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17, 7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231, -1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16, 1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15, 1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17, 5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102, -1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32, -1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133, 0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17, -2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15, 7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15, -1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16, -0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31, -5.058673986695e-33, 2.0667225433033359e-19, 1.0141735013231455e-18, 3.2845977347050714e-17, -2.9662734037779067e-19, 9.9595082331331027e-05, -1.9982466114987023e-18, 2.6420113469249318e-18, 5.2415327077690342e-19, 3.007533948014542e-18, -5.1128158961673558e-17, 6.7791246559883483e-19, -3.042327665547861e-15, 1.9523738271941284e-19, -4.0556768902105205e-20, -1.024372770865438e-17, -0.010638955366526863, 2.5786757042756338e-17, 1.3546130005558746e-32, 1.7947249868721723e-33, 1.9228405363682382e-19, -4.1209014001548407e-18, 1.5228291529275058e-17, 2.0443162438349059e-17, -1.9572693345249383e-18, 0.0012709581028842473, -0.0018515998737074948, 9.3128024867175073e-20, -5.1895788671618993e-17, 2.7373981615289174e-16, 1.2812977711597223e-17, -2.2792319486263267e-15, 4.1599503721278813e-19, -3.7733269771394201e-20, 4.16234419478765e-17, -1.5986158133468129e-16, -0.016037670900735834, -5.0763064708173244e-33, -1.0176066166236902e-50, -5.9296704797665404e-18, -8.0698801402861772e-17, 2.9646619457173492e-16, -2.8879431119718227e-17, 2.9393253663483117e-18, -0.0018515998737074959, 0.090335687736246548, 5.6849412731758135e-19, 8.8934458836007799e-17, -4.1390858756690365e-16, 4.120323677410211e-16, 2.8000915545933503e-15, 2.8094462743052983e-17, 1.1636214841356605e-18, 8.1510367497
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
Sorry for the confusion caused by that sentence. I did not expect using the same seed generates the same sequence of numbers unconditionally. What puzzled me is not having the same sequence when running the same code repeatedly in the same computer in the same session of R. The following four sets of results were generated from a script with four copies of the two lines, ran in the same session: > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6872818 2.7644787 -0.2023346 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6872818 2.7644787 -0.2023346 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 I could not figure out the pattern. I could not even reproduce the order above. Sometimes, the order can be different. Sometimes, the four sets of results can be identical. I did a few more tests, on another computer (Windows 10), with several copies of R installed. For R 4.3.1 and R-devel (2023-08-12 r84939 ucrt), for which I did not touch anything about BLAS and LAPACK, they consistently gave the following results: > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6872818 2.7644787 -0.2023346 Therefore, it is possible that in my test with MKL, somehow BLAS and LAPACK came with R were sometimes used, even though the sessionInfo() output shows that MKL is used. I am not sure whether it is because I did anything wrong in setting up R to use MKL. More on the machine on which I found that issue. It was a virtual machine built using VirtualBox. The host system is Windows 10 (though I guess it should not matter). The OS is Ubuntu 22.04.3 LTS. I installed MKL (2023.2.0) following these steps: https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-download.html?operatingsystem=linux&distributions=aptpackagemanager I compiled R-devel (2023-08-15 r84957) and then link MKL to it following these steps (with some modifications, as the description seems to be outdated): https://www.intel.com/content/www/us/en/developer/articles/technical/quick-linking-intel-mkl-blas-lapack-to-r.html Ideally, I should compile R with MKL, instead of using the quick linking method. However, I am occupied with something else and could not find an updated guide on how to do this quickly. The guides I located were several years old. Regards, Shu Fai On Thu, Aug 17, 2023 at 8:57 PM Ben Bolker wrote: > > > However, should the numbers > > generated identical if the same seed is used? > >I don't see how using the same seed can overcome floating-point > differences across platforms (compilers etc.) stemming from differences > in an eigen() computation (based on arcane details like use of > registers, compiler optimizations, etc.). set.seed() guarantees that > the normal deviates generated by rnorm() will be identical, but those > random numbers are then multiplied by the eigenvectors derived from > eigen(), at which point the differences will crop up. > >This has been discussed on Twitter: > https://twitter.com/wviechtb/status/1230078883317387264 > > Wolfgang Viechtbauer, 2020-02-18: > > Another interesting reproducibility issue that came up. MASS::mvrnorm() > can give different values despite setting the same seed. The problem: > Results of eigen() (which is used by mvrnorm) can be machine dependent > (help(eigen) does warn about this). > > Interestingly, mvtnorm::rmvnorm() with method="eigen" gives the same > values across all machines I tested this on (and method="svd" give the > same values). With method="chol" I get different values, but again > consistent across machines. > > Ah, mvtnorm::rmvnorm() applies the results from eigen() in a different > way that appears to be less (not?) affected by the indeterminacy in the > eigenvectors. Quite clever. > > > On 2023-08-16 11:10 p.m., Shu Fai Cheung wrote: > > Hi All, > > > > When addressing an error in one of my packages in the CRAN check at CRAN, > > I found something strange which, I believe, is unrelated to my package. > > I am not sure whether it is a bug or a known issue. Therefore, I would > like > > to have advice from experts here. > > > > The error at CRAN check occurred in a test, and only on R-devel with > MKL. No > > issues on all other platforms. The test compares two sets of random > numbers, > > supposed to be identical because generated from the same seed. However, > when > > I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, > linked to > > MKL (2023.2.0), I found that this is not the case in one situation. > > > > I don't know why but somehow only one particular set of means and > > covariance matrix, among many others in the code, led to that problem. >
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
MKL's results can depend on the number of threads running and perhaps other things. They blame it on the non-associativity of floating point arithmetic. This article gives a way to make results repeatable: https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-the-conditional-numerical-reproducibility-cnr.html -Bill On Wed, Aug 16, 2023 at 8:11 PM Shu Fai Cheung wrote: > Hi All, > > When addressing an error in one of my packages in the CRAN check at CRAN, > I found something strange which, I believe, is unrelated to my package. > I am not sure whether it is a bug or a known issue. Therefore, I would like > to have advice from experts here. > > The error at CRAN check occurred in a test, and only on R-devel with MKL. > No > issues on all other platforms. The test compares two sets of random > numbers, > supposed to be identical because generated from the same seed. However, > when > I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked > to > MKL (2023.2.0), I found that this is not the case in one situation. > > I don't know why but somehow only one particular set of means and > covariance matrix, among many others in the code, led to that problem. > Please find > below the code to reproduce the means and the covariance matrix (pardon me > for the long code): > > mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, > 1.6296642535174521, > 2.9045763671121816, -0.19816874840500939, 0.42610841566522556, > 0.30155498531316366, 1.0339601619394503, 3.4125587827873192, > 13.125481598475405, 19.275480386183503, 658.94225353462195, > 1.0997193726987393, > 9.9980286642877214, 6.4947188998602092, -12.952617780813679, > 8.4991882784024799, 106.82209520278377, 0.19623712449219838), class = > c("numeric")) > > sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17, > -2.5269697696885822e-17, > -1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19, > -2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17, > 1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15, > -2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16, > 2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33, > 2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183, > 1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17, > -9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17, > -7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18, > -1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18, > -0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16, > 1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17, > 7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231, > -1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16, > 1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15, > 1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17, > 5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102, > -1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32, > -1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133, > 0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17, > -2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15, > 7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15, > -1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16, > -0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31, > -5.058673986695e-33, 2.0667225433033359e-19, 1.0141735013231455e-18, > 3.2845977347050714e-17, -2.9662734037779067e-19, 9.9595082331331027e-05, > -1.9982466114987023e-18, 2.6420113469249318e-18, 5.2415327077690342e-19, > 3.007533948014542e-18, -5.1128158961673558e-17, 6.7791246559883483e-19, > -3.042327665547861e-15, 1.9523738271941284e-19, -4.0556768902105205e-20, > -1.024372770865438e-17, -0.010638955366526863, 2.5786757042756338e-17, > 1.3546130005558746e-32, 1.7947249868721723e-33, 1.9228405363682382e-19, > -4.1209014001548407e-18, 1.5228291529275058e-17, 2.0443162438349059e-17, > -1.9572693345249383e-18, 0.0012709581028842473, -0.0018515998737074948, > 9.3128024867175073e-20, -5.1895788671618993e-17, 2.7373981615289174e-16, > 1.2812977711597223e-17, -2.2792319486263267e-15, 4.1599503721278813e-19, > -3.7733269771394201e-20, 4.16234419478765e-17, -1.5986158133468129e-16, > -0.016037670900735834, -5.0763064708173244e-33, -1.0176066166236902e-50, > -5.9296704797665404e-18, -8.0698801402861772e-17, 2.9646619457173492e-16, > -2.8879431119718227e-17, 2.9393253663483117e-18, -0.0018515998737074959, > 0.090335687736246548, 5.6849412731758135e-19, 8.8934458836007799e-17, > -4.1390858756690365e-16, 4.120323677410211e-16, 2.8000915545933503e-15, > 2.8094462743052983e-17, 1.1636214841356605e-18, 8.15103
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
But I have never seen a case where, on the same machine and using the same math routines, using the same seed repeatedly generated different values. That is quite bizarre (the issue I was discussing in the thread is quite simple in comparison). I tried to reproduce the issue below, but was unable to do so. > sessionInfo() R version 4.3.1 (2023-06-16) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.04.6 LTS Matrix products: default BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so; LAPACK version 3.7.0 locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=C [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C time zone: Europe/Berlin tzcode source: system (glibc) attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_4.3.1 > > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 > set.seed(1234) > MASS::mvrnorm(n = 5, mu = mu0, Sigma = sigma0)[1, 1:5] [1] 0.4851605 0.5704446 1.6873036 2.7645014 -0.2020908 Best, Wolfgang >-Original Message- >From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker >Sent: Thursday, 17 August, 2023 14:57 >To: r-help@r-project.org >Subject: Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when >the seed is the same? > > > However, should the numbers > > generated identical if the same seed is used? > > I don't see how using the same seed can overcome floating-point >differences across platforms (compilers etc.) stemming from differences >in an eigen() computation (based on arcane details like use of >registers, compiler optimizations, etc.). set.seed() guarantees that >the normal deviates generated by rnorm() will be identical, but those >random numbers are then multiplied by the eigenvectors derived from >eigen(), at which point the differences will crop up. > > This has been discussed on Twitter: >https://twitter.com/wviechtb/status/1230078883317387264 > >Wolfgang Viechtbauer, 2020-02-18: > >Another interesting reproducibility issue that came up. MASS::mvrnorm() >can give different values despite setting the same seed. The problem: >Results of eigen() (which is used by mvrnorm) can be machine dependent >(help(eigen) does warn about this). > >Interestingly, mvtnorm::rmvnorm() with method="eigen" gives the same >values across all machines I tested this on (and method="svd" give the >same values). With method="chol" I get different values, but again >consistent across machines. > >Ah, mvtnorm::rmvnorm() applies the results from eigen() in a different >way that appears to be less (not?) affected by the indeterminacy in the >eigenvectors. Quite clever. > >On 2023-08-16 11:10 p.m., Shu Fai Cheung wrote: >> Hi All, >> >> When addressing an error in one of my packages in the CRAN check at CRAN, >> I found something strange which, I believe, is unrelated to my package. >> I am not sure whether it is a bug or a known issue. Therefore, I would like >> to have advice from experts here. >> >> The error at CRAN check occurred in a test, and only on R-devel with MKL. No >> issues on all other platforms. The test compares two sets of random numbers, >> supposed to be identical because generated from the same seed. However, when >> I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked to >> MKL (2023.2.0), I found that this is not the case in one situation. >> >> I don't know why but somehow only one particular set of means and >> covariance matrix, among many others in the code, led to that problem. >> Please find >> below the code to reproduce the means and the covariance matrix (pardon me >> for the long code): >> >> mu0 <- structure(c(0.522524168
Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
> However, should the numbers > generated identical if the same seed is used? I don't see how using the same seed can overcome floating-point differences across platforms (compilers etc.) stemming from differences in an eigen() computation (based on arcane details like use of registers, compiler optimizations, etc.). set.seed() guarantees that the normal deviates generated by rnorm() will be identical, but those random numbers are then multiplied by the eigenvectors derived from eigen(), at which point the differences will crop up. This has been discussed on Twitter: https://twitter.com/wviechtb/status/1230078883317387264 Wolfgang Viechtbauer, 2020-02-18: Another interesting reproducibility issue that came up. MASS::mvrnorm() can give different values despite setting the same seed. The problem: Results of eigen() (which is used by mvrnorm) can be machine dependent (help(eigen) does warn about this). Interestingly, mvtnorm::rmvnorm() with method="eigen" gives the same values across all machines I tested this on (and method="svd" give the same values). With method="chol" I get different values, but again consistent across machines. Ah, mvtnorm::rmvnorm() applies the results from eigen() in a different way that appears to be less (not?) affected by the indeterminacy in the eigenvectors. Quite clever. On 2023-08-16 11:10 p.m., Shu Fai Cheung wrote: Hi All, When addressing an error in one of my packages in the CRAN check at CRAN, I found something strange which, I believe, is unrelated to my package. I am not sure whether it is a bug or a known issue. Therefore, I would like to have advice from experts here. The error at CRAN check occurred in a test, and only on R-devel with MKL. No issues on all other platforms. The test compares two sets of random numbers, supposed to be identical because generated from the same seed. However, when I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked to MKL (2023.2.0), I found that this is not the case in one situation. I don't know why but somehow only one particular set of means and covariance matrix, among many others in the code, led to that problem. Please find below the code to reproduce the means and the covariance matrix (pardon me for the long code): mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, 1.6296642535174521, 2.9045763671121816, -0.19816874840500939, 0.42610841566522556, 0.30155498531316366, 1.0339601619394503, 3.4125587827873192, 13.125481598475405, 19.275480386183503, 658.94225353462195, 1.0997193726987393, 9.9980286642877214, 6.4947188998602092, -12.952617780813679, 8.4991882784024799, 106.82209520278377, 0.19623712449219838), class = c("numeric")) sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17, -2.5269697696885822e-17, -1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19, -2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17, 1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15, -2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16, 2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33, 2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183, 1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17, -9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17, -7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18, -1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18, -0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16, 1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17, 7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231, -1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16, 1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15, 1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17, 5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102, -1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32, -1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133, 0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17, -2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15, 7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15, -1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16, -0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31, -5.058673986695e-33, 2.0667225433033359e-19, 1.0141735013231455e-18, 3.2845977347050714e-17, -2.9662734037779067e-19, 9.9595082331331027e-05, -1.9982466114987023e-18, 2.6420113469249318e-18, 5.2415327077690342e-19, 3.007533948014542e-18, -5.1128158961673558e-17, 6.7791246559883483e-19, -3.042327665547861e-15, 1.9523738271941284e-19, -4.0556768902105205e-20, -1.024372770865438e-17, -0.010638955366526863, 2.5786757042756338e-17, 1.3546
[R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?
Hi All, When addressing an error in one of my packages in the CRAN check at CRAN, I found something strange which, I believe, is unrelated to my package. I am not sure whether it is a bug or a known issue. Therefore, I would like to have advice from experts here. The error at CRAN check occurred in a test, and only on R-devel with MKL. No issues on all other platforms. The test compares two sets of random numbers, supposed to be identical because generated from the same seed. However, when I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked to MKL (2023.2.0), I found that this is not the case in one situation. I don't know why but somehow only one particular set of means and covariance matrix, among many others in the code, led to that problem. Please find below the code to reproduce the means and the covariance matrix (pardon me for the long code): mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, 1.6296642535174521, 2.9045763671121816, -0.19816874840500939, 0.42610841566522556, 0.30155498531316366, 1.0339601619394503, 3.4125587827873192, 13.125481598475405, 19.275480386183503, 658.94225353462195, 1.0997193726987393, 9.9980286642877214, 6.4947188998602092, -12.952617780813679, 8.4991882784024799, 106.82209520278377, 0.19623712449219838), class = c("numeric")) sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17, -2.5269697696885822e-17, -1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19, -2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17, 1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15, -2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16, 2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33, 2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183, 1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17, -9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17, -7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18, -1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18, -0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16, 1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17, 7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231, -1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16, 1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15, 1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17, 5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102, -1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32, -1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133, 0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17, -2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15, 7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15, -1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16, -0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31, -5.058673986695e-33, 2.0667225433033359e-19, 1.0141735013231455e-18, 3.2845977347050714e-17, -2.9662734037779067e-19, 9.9595082331331027e-05, -1.9982466114987023e-18, 2.6420113469249318e-18, 5.2415327077690342e-19, 3.007533948014542e-18, -5.1128158961673558e-17, 6.7791246559883483e-19, -3.042327665547861e-15, 1.9523738271941284e-19, -4.0556768902105205e-20, -1.024372770865438e-17, -0.010638955366526863, 2.5786757042756338e-17, 1.3546130005558746e-32, 1.7947249868721723e-33, 1.9228405363682382e-19, -4.1209014001548407e-18, 1.5228291529275058e-17, 2.0443162438349059e-17, -1.9572693345249383e-18, 0.0012709581028842473, -0.0018515998737074948, 9.3128024867175073e-20, -5.1895788671618993e-17, 2.7373981615289174e-16, 1.2812977711597223e-17, -2.2792319486263267e-15, 4.1599503721278813e-19, -3.7733269771394201e-20, 4.16234419478765e-17, -1.5986158133468129e-16, -0.016037670900735834, -5.0763064708173244e-33, -1.0176066166236902e-50, -5.9296704797665404e-18, -8.0698801402861772e-17, 2.9646619457173492e-16, -2.8879431119718227e-17, 2.9393253663483117e-18, -0.0018515998737074959, 0.090335687736246548, 5.6849412731758135e-19, 8.8934458836007799e-17, -4.1390858756690365e-16, 4.120323677410211e-16, 2.8000915545933503e-15, 2.8094462743052983e-17, 1.1636214841356605e-18, 8.151036749771e-16, -3.004558574117108e-15, 0.0061666744014873013, -2.5381532354086619e-33, -2.7110175619881585e-34, -3.9720857469692968e-18, -2.2722246209861298e-17, 6.5087631702810744e-18, -5.8833898464959171e-18, 6.0598974659958357e-19, -6.4324667436451943e-19, -4.9865458549184915e-19, 0.010690736164778532, -2.5717220776886191e-17, -2.9932234433250055e-17, -1.7586566364625091e-32, -2.0572361612722435e-15, -4.1554892682395136e-18, 7.7947068522170098e-19, 2.2950757715435305e-16, -6.8563401956907788e-17, 8.3986032618135276e-18, -3.0929447793865389e-45, -9