Re: [R] Problem with lm Giving Wrong Results

2021-12-03 Thread Labone, Thomas
Two of the machines having the problem are AVX-512 capable (e.g., i7-7820X) but 
another one is an old Samsung Series 5 with an i5-3317U. I guess I will start 
with the folks at Linux Mint.

Tom


Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA




From: Sarah Goslee 
Sent: Friday, December 3, 2021 11:00 AM
To: Labone, Thomas 
Cc: Bill Dunlap ; r-help@r-project.org 

Subject: Re: [R] Problem with lm Giving Wrong Results

It might also be a BLAS+processor problem - I got bit pretty hard by
that, with an example here:

https://stat.ethz.ch/pipermail/r-help/2019-July/463477.html

With a key excerpt here:

On Thu, Jul 18, 2019 at 1:59 PM Ivan Krylov  wrote:
> Yes, this might be bad. I have heard about OpenBLAS (specifically, the
> matrix product routine) misbehaving on certain AVX-512 capable
> processors, so much that they had to disable some optimizations in
> 0.3.6 [*], which you already have installed. Still, would `env
> OPENBLAS_CORETYPE=Haswell R --vanilla` give a better result?
>

On Fri, Dec 3, 2021 at 10:29 AM Labone, Thomas  wrote:
>
> Thanks for the feedback everyone. If you go to 
> https://github.com/csantill/RPerformanceWBLAS/blob/master/RPerformanceBLAS.md 
> you will find the Linux commands to change the default math library. When I 
> switch the BLAS library from MKL to the system default (see sessionInfo 
> below), everything works as expected. I installed version 2020.0-166-1 of 
> "Intel-MKL" from the Linux Mint Software Manager. I may be coming to a hasty 
> conclusion, but there appears to be something wrong with that package or how 
> it interacts with other system software. Any suggestions on who I should 
> notify about the problem (e.g., Intel, Mint, Ubuntu)?
>
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 20.2
>
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
> LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
> LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C  LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.2 tools_4.1.2
>
>
>
> Thomas R. LaBone
> PhD student
> Department of Epidemiology and Biostatistics
> Arnold School of Public Health
> University of South Carolina
> Columbia, South Carolina USA
>
>
>
> ____________________
> From: Labone, Thomas 
> Sent: Thursday, December 2, 2021 11:53 AM
> To: Bill Dunlap 
> Cc: r-help@r-project.org 
> Subject: Re: [R] Problem with lm Giving Wrong Results
>
> > summary(fit)
>
> Call:
> lm(formula = log(k) ~ Z)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -21.241   1.327   1.776   2.245   4.418
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept) -0.03465    0.01916  -1.809   0.0705 .
> Z           -0.24207    0.01916 -12.634   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 1.914 on 9998 degrees of freedom
> Multiple R-squared:  0.01467, Adjusted R-squared:  0.01457
> F-statistic: 148.8 on 1 and 9998 DF,  p-value: < 2.2e-16
>
> > summary(k)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.2735  3.7658  5.9052  7.5113  9.4399 82.9531
> > summary(Z)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> -3.8906 -0.6744  0.0000  0.0000  0.6744  3.8906
> > summary(gm*gsd^Z)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.3767  0.8204  0.9659  0.9947  1.1372  2.4772
> >
>
>
> Thomas R. LaBone
> PhD student
> Department of Epidemiology and Biostatistics
> Arnold School of Public Health
> University of South Carolina
> Columbia, South Carolina USA
>
>
> 
> From: Bill Dunlap 
> Sent: Thursday, December 2, 2021 10:31 AM
> To: Labone, Thomas 
> Cc: r-help@r-project.org 
> Subject: Re: [R] Problem with lm Giving Wrong Results
>
> On the 'bad' machines, what did you get for
>    summary(fit)
>    summary(k)
>    summary(Z)
>    summary(gm*gsd^Z)
> ?
>
> -Bill
>

Re: [R] Problem with lm Giving Wrong Results

2021-12-03 Thread Sarah Goslee
It might also be a BLAS+processor problem - I got bit pretty hard by
that, with an example here:

https://stat.ethz.ch/pipermail/r-help/2019-July/463477.html

With a key excerpt here:

On Thu, Jul 18, 2019 at 1:59 PM Ivan Krylov  wrote:
> Yes, this might be bad. I have heard about OpenBLAS (specifically, the
> matrix product routine) misbehaving on certain AVX-512 capable
> processors, so much that they had to disable some optimizations in
> 0.3.6 [*], which you already have installed. Still, would `env
> OPENBLAS_CORETYPE=Haswell R --vanilla` give a better result?
>
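
The matrix-product theory in the excerpt above can be tested from inside R by comparing the external BLAS against R's own unoptimized loops. This is only a minimal sketch, assuming R >= 3.4.0 (where the `matprod` option exists); the matrix size and the variable names are arbitrary choices, not anything from the original report:

```r
set.seed(1)
A <- matrix(rnorm(5000 * 3), nrow = 5000)  # more than 4096 rows, as in the failing case

old <- options(matprod = "blas")      # force the external BLAS for %*%
p_blas <- t(A) %*% A
options(matprod = "internal")         # use R's internal three-loop product
p_internal <- t(A) %*% A
options(old)                          # restore the previous setting

# TRUE on a healthy BLAS; FALSE would point at the matrix product routine
matprod_agrees <- isTRUE(all.equal(p_blas, p_internal))
matprod_agrees
```

If the two products disagree, the BLAS is the prime suspect; if they agree, the fault may still sit in another BLAS or LAPACK routine.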

On Fri, Dec 3, 2021 at 10:29 AM Labone, Thomas  wrote:
>
> Thanks for the feedback everyone. If you go to 
> https://github.com/csantill/RPerformanceWBLAS/blob/master/RPerformanceBLAS.md 
> you will find the Linux commands to change the default math library. When I 
> switch the BLAS library from MKL to the system default (see sessionInfo 
> below), everything works as expected. I installed version 2020.0-166-1 of 
> "Intel-MKL" from the Linux Mint Software Manager. I may be coming to a hasty 
> conclusion, but there appears to be something wrong with that package or how 
> it interacts with other system software. Any suggestions on who I should 
> notify about the problem (e.g., Intel, Mint, Ubuntu)?
>
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 20.2
>
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
> LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
> LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C  LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.2 tools_4.1.2
>
>
>
> Thomas R. LaBone
> PhD student
> Department of Epidemiology and Biostatistics
> Arnold School of Public Health
> University of South Carolina
> Columbia, South Carolina USA
>
>
>
> ____________________
> From: Labone, Thomas 
> Sent: Thursday, December 2, 2021 11:53 AM
> To: Bill Dunlap 
> Cc: r-help@r-project.org 
> Subject: Re: [R] Problem with lm Giving Wrong Results
>
> > summary(fit)
>
> Call:
> lm(formula = log(k) ~ Z)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -21.241   1.327   1.776   2.245   4.418
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept) -0.03465    0.01916  -1.809   0.0705 .
> Z           -0.24207    0.01916 -12.634   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 1.914 on 9998 degrees of freedom
> Multiple R-squared:  0.01467, Adjusted R-squared:  0.01457
> F-statistic: 148.8 on 1 and 9998 DF,  p-value: < 2.2e-16
>
> > summary(k)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.2735  3.7658  5.9052  7.5113  9.4399 82.9531
> > summary(Z)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> -3.8906 -0.6744  0.0000  0.0000  0.6744  3.8906
> > summary(gm*gsd^Z)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.3767  0.8204  0.9659  0.9947  1.1372  2.4772
> >
>
>
> Thomas R. LaBone
> PhD student
> Department of Epidemiology and Biostatistics
> Arnold School of Public Health
> University of South Carolina
> Columbia, South Carolina USA
>
>
> 
> From: Bill Dunlap 
> Sent: Thursday, December 2, 2021 10:31 AM
> To: Labone, Thomas 
> Cc: r-help@r-project.org 
> Subject: Re: [R] Problem with lm Giving Wrong Results
>
> On the 'bad' machines, what did you get for
>    summary(fit)
>    summary(k)
>    summary(Z)
>    summary(gm*gsd^Z)
> ?
>
> -Bill
>
> On Thu, Dec 2, 2021 at 6:18 AM Labone, Thomas <lab...@email.sc.edu> wrote:
> In the code below the first and second plots should look pretty much the 
> same, the only difference being that the first has n=1000 points and the 
> second n=10000 points. On two of my Linux machines (info below) the second 
> plot is a horizontal line (incorrect answer from lm), but on my Windows 10 
> machine and a third Linux machine it works as expected. The interesting thing 
> is that the code works as expected for n <= 4095 but fails for n>=4096 (which 
> equals 2^12). Can anyone else reproduce this problem? Any ideas on how to fix 
> it?
>
> set.seed(132)
>
>

Re: [R] Problem with lm Giving Wrong Results

2021-12-03 Thread Labone, Thomas
Thanks for the feedback everyone. If you go to 
https://github.com/csantill/RPerformanceWBLAS/blob/master/RPerformanceBLAS.md 
you will find the Linux commands to change the default math library. When I 
switch the BLAS library from MKL to the system default (see sessionInfo below), 
everything works as expected. I installed version 2020.0-166-1 of "Intel-MKL" 
from the Linux Mint Software Manager. I may be coming to a hasty conclusion, 
but there appears to be something wrong with that package or how it interacts 
with other system software. Any suggestions on who I should notify about the 
problem (e.g., Intel, Mint, Ubuntu)?

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.2

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C  LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.2 tools_4.1.2



Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA




From: Labone, Thomas 
Sent: Thursday, December 2, 2021 11:53 AM
To: Bill Dunlap 
Cc: r-help@r-project.org 
Subject: Re: [R] Problem with lm Giving Wrong Results

> summary(fit)

Call:
lm(formula = log(k) ~ Z)

Residuals:
    Min      1Q  Median      3Q     Max
-21.241   1.327   1.776   2.245   4.418

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.03465    0.01916  -1.809   0.0705 .
Z           -0.24207    0.01916 -12.634   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.914 on 9998 degrees of freedom
Multiple R-squared:  0.01467, Adjusted R-squared:  0.01457
F-statistic: 148.8 on 1 and 9998 DF,  p-value: < 2.2e-16

> summary(k)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.2735  3.7658  5.9052  7.5113  9.4399 82.9531
> summary(Z)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-3.8906 -0.6744  0.0000  0.0000  0.6744  3.8906
> summary(gm*gsd^Z)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.3767  0.8204  0.9659  0.9947  1.1372  2.4772
>


Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA



From: Bill Dunlap 
Sent: Thursday, December 2, 2021 10:31 AM
To: Labone, Thomas 
Cc: r-help@r-project.org 
Subject: Re: [R] Problem with lm Giving Wrong Results

On the 'bad' machines, what did you get for
   summary(fit)
   summary(k)
   summary(Z)
   summary(gm*gsd^Z)
?

-Bill

On Thu, Dec 2, 2021 at 6:18 AM Labone, Thomas <lab...@email.sc.edu> wrote:
In the code below the first and second plots should look pretty much the same, 
the only difference being that the first has n=1000 points and the second 
n=10000 points. On two of my Linux machines (info below) the second plot is a 
horizontal line (incorrect answer from lm), but on my Windows 10 machine and a 
third Linux machine it works as expected. The interesting thing is that the 
code works as expected for n <= 4095 but fails for n>=4096 (which equals 2^12). 
Can anyone else reproduce this problem? Any ideas on how to fix it?

set.seed(132)

#~~~
# This works
n <- 1000  # OK <= 4095
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")

#~~~
#this does not
n <- 10000  # fails >= 4096 = 2^12
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")


#~~~
> sessionInfo() #for two Linux machines having problem
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.2

Re: [R] Problem with lm Giving Wrong Results

2021-12-02 Thread Labone, Thomas
> summary(fit)

Call:
lm(formula = log(k) ~ Z)

Residuals:
    Min      1Q  Median      3Q     Max
-21.241   1.327   1.776   2.245   4.418

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.03465    0.01916  -1.809   0.0705 .
Z           -0.24207    0.01916 -12.634   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.914 on 9998 degrees of freedom
Multiple R-squared:  0.01467, Adjusted R-squared:  0.01457
F-statistic: 148.8 on 1 and 9998 DF,  p-value: < 2.2e-16

> summary(k)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.2735  3.7658  5.9052  7.5113  9.4399 82.9531
> summary(Z)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-3.8906 -0.6744  0.0000  0.0000  0.6744  3.8906
> summary(gm*gsd^Z)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.3767  0.8204  0.9659  0.9947  1.1372  2.4772
>
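
For comparison, the coefficients a healthy fit should return can be worked out from the simulation parameters in the original post. Since k is a ratio of two independent lognormals with the same geometric standard deviation, log(k) is normal with mean log(2131) - log(355) and standard deviation sqrt(2) * log(1.61), and regressing sorted log(k) on normal quantiles should give roughly that mean as the intercept and that standard deviation as the slope. A small sketch of the arithmetic (the variable names are mine, not from the post):

```r
gm_num <- 2131   # geometric mean of the numerator lognormal
gm_den <- 355    # geometric mean of the denominator lognormal
gsd_in <- 1.61   # geometric standard deviation shared by both

# Mean and standard deviation of log(k) for a ratio of independent lognormals
expected_intercept <- log(gm_num) - log(gm_den)
expected_slope <- sqrt(2) * log(gsd_in)

round(c(intercept = expected_intercept, slope = expected_slope), 3)
```

This comes to an intercept of about 1.79 and a slope of about 0.67, against the reported -0.035 and -0.242. The median residual of 1.776 is also essentially log(5.9052), the log of the median of k, so the bad fit is close to zero across the bulk of the data.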


Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA



From: Bill Dunlap 
Sent: Thursday, December 2, 2021 10:31 AM
To: Labone, Thomas 
Cc: r-help@r-project.org 
Subject: Re: [R] Problem with lm Giving Wrong Results

On the 'bad' machines, what did you get for
   summary(fit)
   summary(k)
   summary(Z)
   summary(gm*gsd^Z)
?

-Bill

On Thu, Dec 2, 2021 at 6:18 AM Labone, Thomas <lab...@email.sc.edu> wrote:
In the code below the first and second plots should look pretty much the same, 
the only difference being that the first has n=1000 points and the second 
n=10000 points. On two of my Linux machines (info below) the second plot is a 
horizontal line (incorrect answer from lm), but on my Windows 10 machine and a 
third Linux machine it works as expected. The interesting thing is that the 
code works as expected for n <= 4095 but fails for n>=4096 (which equals 2^12). 
Can anyone else reproduce this problem? Any ideas on how to fix it?

set.seed(132)

#~~~
# This works
n <- 1000  # OK <= 4095
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")

#~~~
#this does not
n <- 10000  # fails >= 4096 = 2^12
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")


#~~~
> sessionInfo() #for two Linux machines having problem
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.2

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.2  Matrix_1.3-4    tools_4.1.2     expm_0.999-6    grid_4.1.2      lattice_0.20-45

#~~
> sessionInfo() # for a third Linux machine not having the problem
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.3

Matrix products: default
BLAS/LAPACK: 
/opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/lib/intel64_lin/libmkl_rt.so

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C  LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.1 tools_4.1.1



Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA




Re: [R] Problem with lm Giving Wrong Results

2021-12-02 Thread Duncan Murdoch

On 02/12/2021 5:50 a.m., Labone, Thomas wrote:

In the code below the first and second plots should look pretty much the same, the 
only difference being that the first has n=1000 points and the second n=10000 points. 
On two of my Linux machines (info below) the second plot is a horizontal line 
(incorrect answer from lm), but on my Windows 10 machine and a third Linux machine it 
works as expected. The interesting thing is that the code works as expected for n 
<= 4095 but fails for n>=4096 (which equals 2^12). Can anyone else reproduce 
this problem? Any ideas on how to fix it?


The likely location of this problem is the BLAS.  You are using

BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so

on the bad machines, and

BLAS/LAPACK: /opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/lib/intel64_lin/libmkl_rt.so

on the good one, according to the sessionInfo you posted. I don't have 
any experience dealing with such things, but that's where I'd look for a 
fix.
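
A quick way to check whether the linear algebra behind lm() is at fault is to compare its coefficients with the closed-form simple-regression estimates, which need only mean(), var() and cov(). This is a sketch rather than a definitive test: the helper name `check_lm` and the tolerance are my own choices, and it assumes (as far as I know) that those summary functions do not go through the external BLAS/LAPACK.

```r
# Compare lm() against closed-form estimates for the data from the original post
check_lm <- function(n) {
  set.seed(132)
  Z <- qnorm(ppoints(n))
  k <- sort(rlnorm(n, log(2131), log(1.61)) / rlnorm(n, log(355), log(1.61)))
  y <- log(k)

  # Closed-form least squares for one predictor
  slope <- cov(Z, y) / var(Z)
  intercept <- mean(y) - slope * mean(Z)

  fitted <- unname(coef(lm(y ~ Z)))
  isTRUE(all.equal(fitted, c(intercept, slope), tolerance = 1e-8))
}

# Sizes on both sides of the reported 4095/4096 boundary
sizes <- c(1000, 4095, 4096, 10000)
agree <- vapply(sizes, check_lm, logical(1))
names(agree) <- sizes
agree
```

On a healthy installation every entry should be TRUE; a FALSE appearing only from 4096 upward would match the behaviour described in the original post.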


Duncan Murdoch




set.seed(132)

#~~~
# This works
n <- 1000  # OK <= 4095
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")

#~~~
#this does not
n <- 10000  # fails >= 4096 = 2^12
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")


#~~~

sessionInfo() #for two Linux machines having problem

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.2

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.2  Matrix_1.3-4    tools_4.1.2     expm_0.999-6    grid_4.1.2      lattice_0.20-45

#~~

sessionInfo() # for a third Linux machine not having the problem

R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.3

Matrix products: default
BLAS/LAPACK: 
/opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/lib/intel64_lin/libmkl_rt.so

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8
  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C  LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.1 tools_4.1.1



Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Re: [R] Problem with lm Giving Wrong Results

2021-12-02 Thread Bill Dunlap
On the 'bad' machines, what did you get for
   summary(fit)
   summary(k)
   summary(Z)
   summary(gm*gsd^Z)
?

-Bill

On Thu, Dec 2, 2021 at 6:18 AM Labone, Thomas  wrote:

> In the code below the first and second plots should look pretty much the
> same, the only difference being that the first has n=1000 points and the
> second n=10000 points. On two of my Linux machines (info below) the second
> plot is a horizontal line (incorrect answer from lm), but on my Windows 10
> machine and a third Linux machine it works as expected. The interesting
> thing is that the code works as expected for n <= 4095 but fails for
> n>=4096 (which equals 2^12). Can anyone else reproduce this problem? Any
> ideas on how to fix it?
>
> set.seed(132)
>
>
> #~~~
> # This works
> n <- 1000  # OK <= 4095
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
>
> #~~~
> #this does not
> n <- 10000  # fails >= 4096 = 2^12
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
>
>
> #~~~
> > sessionInfo() #for two Linux machines having problem
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 20.2
>
> Matrix products: default
> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
>  [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.2  Matrix_1.3-4    tools_4.1.2     expm_0.999-6    grid_4.1.2      lattice_0.20-45
>
> #~~
> > sessionInfo() # for a third Linux machine not having the problem
> R version 4.1.1 (2021-08-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 19.3
>
> Matrix products: default
> BLAS/LAPACK:
> /opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/lib/intel64_lin/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>  LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
> LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C  LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
> LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.1 tools_4.1.1
>
>
>
> Thomas R. LaBone
> PhD student
> Department of Epidemiology and Biostatistics
> Arnold School of Public Health
> University of South Carolina
> Columbia, South Carolina USA
>
>
>



Re: [R] Problem with lm Giving Wrong Results

2021-12-02 Thread Ivan Krylov
On Thu, 2 Dec 2021 14:34:42 +
"Labone, Thomas"  wrote:

> Can someone point me to the procedure for switching from the Intel
> Math Library back to the standard math library so that I can see if
> the problem is associated with using MKL?

Depends on how you have installed it. You mentioned using Linux Mint
(see also: the R-SIG-Debian mailing list), so try using
`update-alternatives` to set the value for all alternatives with "blas"
or "lapack" in their names. See `ls /etc/alternatives/ | grep -i -e blas
-e lapack` for the list.
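
The same listing can be done from inside R, along with a check of which libraries the running session actually loaded, which helps confirm that a switch made with `update-alternatives` took effect. A minimal sketch, assuming R >= 3.4.0 and a Debian-style `/etc/alternatives` directory; the variable names are illustrative:

```r
# Which BLAS and LAPACK is this R session using?
blas_info <- list(
  blas = extSoftVersion()[["BLAS"]],   # path of the BLAS shared library, if known
  lapack_library = La_library(),       # path of the LAPACK shared library
  lapack_version = La_version()
)
blas_info

# R equivalent of: ls /etc/alternatives/ | grep -i -e blas -e lapack
alt_dir <- "/etc/alternatives"
alternatives <- if (dir.exists(alt_dir)) {
  list.files(alt_dir, pattern = "blas|lapack", ignore.case = TRUE)
} else {
  character(0)   # not a Debian-style system
}
alternatives
```

R must be restarted after changing an alternative, since the libraries are loaded at startup.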

-- 
Best regards,
Ivan



Re: [R] Problem with lm Giving Wrong Results

2021-12-02 Thread Labone, Thomas
Thanks.

Can someone point me to the procedure for switching from the Intel Math Library 
back to the standard math library so that I can see if the problem is 
associated with using MKL?


Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA



From: J C Nash 
Sent: Thursday, December 2, 2021 9:31 AM
To: Labone, Thomas ; r-help@r-project.org 

Subject: Re: [R] Problem with lm Giving Wrong Results

I get two similar graphs.

https://web.ncf.ca/nashjc/jfiles/Rplot-Labone-4095.pdf
https://web.ncf.ca/nashjc/jfiles/RplotLabone10K.pdf

Context:

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.2

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C               LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8    LC_PAPER=en_CA.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
 [1] compiler_4.1.2  fastmap_1.1.0   htmltools_0.5.2 tools_4.1.2     yaml_2.2.1      rmarkdown_2.11  knitr_1.36
 [8] xfun_0.28       digest_0.6.28   rlang_0.4.12    evaluate_0.14

Hope this helps,

JN


On 2021-12-02 5:50 a.m., Labone, Thomas wrote:
> #~~~
> # This works
> n <- 1000  # OK <= 4095
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
> #~~~
> #this does not
> n <- 10000  # fails >= 4096 = 2^12
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
>
> #~~~



Re: [R] Problem with lm Giving Wrong Results

2021-12-02 Thread J C Nash

I get two similar graphs.

https://web.ncf.ca/nashjc/jfiles/Rplot-Labone-4095.pdf
https://web.ncf.ca/nashjc/jfiles/RplotLabone10K.pdf

Context:

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.2

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C               LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8    LC_PAPER=en_CA.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
 [1] compiler_4.1.2  fastmap_1.1.0   htmltools_0.5.2 tools_4.1.2     yaml_2.2.1      rmarkdown_2.11  knitr_1.36
 [8] xfun_0.28       digest_0.6.28   rlang_0.4.12    evaluate_0.14

Hope this helps,

JN


On 2021-12-02 5:50 a.m., Labone, Thomas wrote:

#~~~
# This works
n <- 1000  # OK <= 4095
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")

#~~~
#this does not
n <- 10000  # fails >= 4096 = 2^12
Z <- qnorm(ppoints(n))

k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))

quantile(k,probs=c(0.025,0.5,0.975))
summary(k)

fit <- lm(log(k) ~ Z)
summary(fit)

gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd

plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")


#~~~




Re: [R] Problem with lm Giving Wrong Results

2021-12-02 Thread Eric Berger
Hi Thomas,
I could not reproduce your problem. Both examples worked fine for me.

Here is my setup:

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] R.matlab_3.6.2

loaded via a namespace (and not attached):
[1] compiler_4.1.2    tools_4.1.2       R.methodsS3_1.8.1 R.utils_2.11.0    R.oo_1.24.0

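When comparing setups like the ones in this thread, the lines that matter most are the BLAS and LAPACK entries. The following is a small sketch, not taken from any of the posts, of how to print just those from a running session; the name blas_info is made up for the example. It uses extSoftVersion(), La_library() and La_version(), which are available in R 3.4.0 and later.

```r
# Collect only the linear-algebra library details for this R session.
blas_info <- c(
  BLAS           = extSoftVersion()[["BLAS"]],  # path to the BLAS in use (may be "" on some builds)
  LAPACK         = La_library(),                # path to the LAPACK in use
  LAPACK_version = La_version()                 # LAPACK version string
)
print(blas_info)
```

Posting this output together with the result of the failing example makes it easier to see whether the machines that fail share a library such as libmkl_rt.so.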

On Thu, Dec 2, 2021 at 4:18 PM Labone, Thomas  wrote:
>
> In the code below the first and second plots should look pretty much the 
> same, the only difference being that the first has n=1000 points and the 
> second n=10000 points. On two of my Linux machines (info below) the second 
> plot is a horizontal line (incorrect answer from lm), but on my Windows 10 
> machine and a third Linux machine it works as expected. The interesting thing 
> is that the code works as expected for n <= 4095 but fails for n >= 4096 (which 
> equals 2^12). Can anyone else reproduce this problem? Any ideas on how to fix 
> it?
>
> set.seed(132)
>
> #~~~
> # This works
> n <- 1000 # OK <= 4095
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
> #~~~
> #this does not
> n <- 10000 # fails >= 4096 = 2^12
> Z <- qnorm(ppoints(n))
>
> k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
>
> quantile(k,probs=c(0.025,0.5,0.975))
> summary(k)
>
> fit <- lm(log(k) ~ Z)
> summary(fit)
>
> gm <- exp(coef(fit)[1])
> gsd <- exp(coef(fit)[2])
> gm
> gsd
>
> plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
> lines(Z,gm*gsd^Z,col="red")
>
>
> #~~~
> > sessionInfo() #for two Linux machines having problem
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 20.2
>
> Matrix products: default
> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
>  [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.2  Matrix_1.3-4    tools_4.1.2     expm_0.999-6    grid_4.1.2      lattice_0.20-45
>
> #~~
> > sessionInfo() # for a third Linux machine not having the problem
> R version 4.1.1 (2021-08-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 19.3
>
> Matrix products: default
> BLAS/LAPACK: /opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/lib/intel64_lin/libmkl_rt.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.1 tools_4.1.1
>
>
>
> Thomas R. LaBone
> PhD student
> Department of Epidemiology and Biostatistics
> Arnold School of Public Health
> University of South Carolina
> Columbia, South Carolina USA
>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see