Two of the machines having the problem are AVX-512 capable (e.g., i7-7820X) but 
another one is an old Samsung Series 5 with an i5-3317U. I guess I will start 
with the folks at Linux Mint.

Tom


Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA



________________________________
From: Sarah Goslee <sarah.gos...@gmail.com>
Sent: Friday, December 3, 2021 11:00 AM
To: Labone, Thomas <lab...@email.sc.edu>
Cc: Bill Dunlap <williamwdun...@gmail.com>; r-help@r-project.org 
<r-help@r-project.org>
Subject: Re: [R] Problem with lm Giving Wrong Results

It might also be a BLAS+processor problem - I got bit pretty hard by
that, with an example here:

https://stat.ethz.ch/pipermail/r-help/2019-July/463477.html

With a key excerpt here:

On Thu, Jul 18, 2019 at 1:59 PM Ivan Krylov <krylov.r00t using gmail.com> wrote:
> Yes, this might be bad. I have heard about OpenBLAS (specifically, the
> matrix product routine) misbehaving on certain AVX-512 capable
> processors, so much that they had to disable some optimizations in
> 0.3.6 [*], which you already have installed. Still, would `env
> OPENBLAS_CORETYPE=Haswell R --vanilla` give a better result?
>

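Before trying a workaround like the one in the excerpt above, it can help to confirm which BLAS/LAPACK libraries the R session has actually loaded. The following is only a minimal sketch: `blas_report` is a hypothetical helper name, not part of R, and the only environment variable it looks at is the `OPENBLAS_CORETYPE` mentioned in the excerpt. It relies on `extSoftVersion()`, `La_library()` and `La_version()`, which are available in R 3.4.0 and later.

```r
# Hypothetical helper (not part of R): collect the linear algebra
# libraries the current session reports it is using.
blas_report <- function() {
  list(
    blas           = unname(extSoftVersion()["BLAS"]),  # BLAS shared library path, may be ""
    lapack         = La_library(),                      # LAPACK shared library path, may be ""
    lapack_version = La_version(),                      # LAPACK version string
    # NA when the variable from the excerpt is not set in this session
    openblas_coretype = Sys.getenv("OPENBLAS_CORETYPE", unset = NA)
  )
}

info <- blas_report()
str(info)
```

Running this before and after switching the system BLAS alternative should show whether the change took effect for R; the paths printed will differ from machine to machine.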
On Fri, Dec 3, 2021 at 10:29 AM Labone, Thomas <lab...@email.sc.edu> wrote:
>
> Thanks for the feedback everyone. If you go to 
> https://github.com/csantill/RPerformanceWBLAS/blob/master/RPerformanceBLAS.md
>  you will find the Linux commands to change the default math library. When I 
> switch the BLAS library from MKL to the system default (see sessionInfo 
> below), everything works as expected. I installed version 2020.0-166-1 of 
> "Intel-MKL" from the Linux Mint Software Manager. I may be coming to a hasty 
> conclusion, but there appears to be something wrong with that package or how 
> it interacts with other system software. Any suggestions on who I should 
> notify about the problem (e.g., Intel, Mint, Ubuntu)?
>
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 20.2
>
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
> LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.2 tools_4.1.2
>
>
>
>
>
>
> ________________________________
> From: Labone, Thomas <lab...@email.sc.edu>
> Sent: Thursday, December 2, 2021 11:53 AM
> To: Bill Dunlap <williamwdun...@gmail.com>
> Cc: r-help@r-project.org <r-help@r-project.org>
> Subject: Re: [R] Problem with lm Giving Wrong Results
>
> > summary(fit)
>
> Call:
> lm(formula = log(k) ~ Z)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -21.241   1.327   1.776   2.245   4.418
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept) -0.03465    0.01916  -1.809   0.0705 .
> Z           -0.24207    0.01916 -12.634   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 1.914 on 9998 degrees of freedom
> Multiple R-squared:  0.01467, Adjusted R-squared:  0.01457
> F-statistic: 148.8 on 1 and 9998 DF,  p-value: < 2.2e-16
>
> > summary(k)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.2735  3.7658  5.9052  7.5113  9.4399 82.9531
> > summary(Z)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> -3.8906 -0.6744  0.0000  0.0000  0.6744  3.8906
> > summary(gm*gsd^Z)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.3767  0.8204  0.9659  0.9947  1.1372  2.4772
> >
>
>
>
>
> ________________________________
> From: Bill Dunlap <williamwdun...@gmail.com>
> Sent: Thursday, December 2, 2021 10:31 AM
> To: Labone, Thomas <lab...@email.sc.edu>
> Cc: r-help@r-project.org <r-help@r-project.org>
> Subject: Re: [R] Problem with lm Giving Wrong Results
>
> On the 'bad' machines, what did you get for
>    summary(fit)
>    summary(k)
>    summary(Z)
>    summary(gm*gsd^Z)
> ?
>
> -Bill
>
> On Thu, Dec 2, 2021 at 6:18 AM Labone, Thomas <lab...@email.sc.edu> wrote:
> In the code below the first and second plots should look pretty much the 
> same, the only difference being that the first has n=1000 points and the 
> second n=10000 points. On two of my Linux machines (info below) the second 
> plot is a horizontal line (incorrect answer from lm), but on my Windows 10 
> machine and a third Linux machine it works as expected. The interesting thing 
> is that the code works as expected for n <= 4095 but fails for n>=4096 (which 
> equals 2^12). Can anyone else reproduce this problem? Any ideas on how to fix 
> it?
>
> set.seed(132)
>
> #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> # This works
> n <- 1000  # OK for n <= 4095
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
> #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> #this does not
> n <- 10000  # fails for n >= 4096 = 2^12
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
>
> #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > sessionInfo() #for two Linux machines having problem
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 20.2
>
> Matrix products: default
> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
>  [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.2  Matrix_1.3-4    tools_4.1.2     expm_0.999-6    grid_4.1.2      lattice_0.20-45
>
> #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > sessionInfo() # for a third Linux machine not having the problem
> R version 4.1.1 (2021-08-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 19.3
>
> Matrix products: default
> BLAS/LAPACK: /opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/lib/intel64_lin/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.1 tools_4.1.1
>
>
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

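For anyone trying to reproduce the report quoted above, the failing case can be reduced to a check that needs no plots: fit the n = 10000 model with lm() and compare its coefficients with the closed-form simple-regression estimates. This is only a sketch under stated assumptions. The names `y`, `reference` and `agree` are introduced here for illustration, the tolerance is arbitrary, and it assumes that cov(), var() and mean() do not go through the same BLAS code path as lm(), which has not been verified in this thread.

```r
# Same data as the failing case in the original post
set.seed(132)
n <- 10000
Z <- qnorm(ppoints(n))
k <- sort(rlnorm(n, log(2131), log(1.61)) / rlnorm(n, log(355), log(1.61)))
y <- log(k)

# Fit as in the original post
fit <- lm(y ~ Z)

# Closed-form estimates for a one-predictor regression:
#   slope = cov(Z, y) / var(Z),  intercept = mean(y) - slope * mean(Z)
slope     <- cov(Z, y) / var(Z)
intercept <- mean(y) - slope * mean(Z)
reference <- c(intercept, slope)

# TRUE when lm() and the closed form agree to within the tolerance
agree <- isTRUE(all.equal(unname(coef(fit)), reference, tolerance = 1e-8))

print(rbind(lm = unname(coef(fit)), closed_form = reference))
print(agree)
```

If the two rows disagree, as the summary(fit) output from the 'bad' machines suggests they would, the problem lies in the fitting path rather than in the data, which is consistent with the BLAS explanation.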


--
Sarah Goslee (she/her)
http://www.numberwright.com/

