Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-09 Thread Kevin Egan
Hi Stephen,

Thanks. I’m now trying R 3.6.3 on the HPC; I was able to run a few tests
remotely and got reproducible results. The batch jobs have not yet run,
but I’m hoping they will give reproducible results when they do.

Thanks,

Kevin

On Sun, Aug 9, 2020 at 08:42 stephen sefick  wrote:

> Hi Kevin,
>
> I think Abby has suggested something similar to what I think the problem
> is related to - environment setup.
>
> Some possible solutions:
> The renv and packrat packages are a way to version your packages to help
> with reproducibility. Anaconda might be a solution for the R version and
> package version problem, if installed on your hpc. Docker could work as
> well (maybe the best option if installed). There are other workarounds, but
> I would have to know how your particular hpc/compute environment is set up
> to comment further.
>
> Brass tacks:
> I think you need to ensure all your package versions (R and add-on
> packages) are the same.
>
> Fwiw,
>
> Stephen
>
> On Sun, Aug 9, 2020, 08:26 Kevin Egan  wrote:
>
>> Hi Stephen,
>>
>> I believe I am using Renv, but on my remote computer I am running batch
>> files.
>>
>> Thanks,
>>
>> Kevin
>>
>> On 8 Aug 2020, at 18:18, stephen sefick  wrote:
>>
>> Caveat, I have only skimmed this email thread, so please forgive me if I
>> have missed something.
>>
>> Are you able to use Renv, packrat, docker, or anaconda? Your compute
>> environments are very different.
>> Kindest regards,
>>
>> Stephen Sefick
>>
>> On Sat, Aug 8, 2020, 19:05 Abby Spurdle  wrote:
>>
>>> Hi Kevin,
>>>
>>> Intuitively, the first step would be to ensure that all versions of R,
>>> and all the R packages, are the same.
>>>
>>> However, you mention HPC.
>>> And the glmnet package imports the foreach package, which appears
>>> (after a quick glance) to support multi-core and parallel computing.
>>>
>>> If your code uses parallel computing (?), you may need to look at how
>>> random numbers, and related results, are handled...
>>>
>>>
>>> On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan  wrote:
>>> >
>>> > I posted this question:
>>> >
>>> > I am currently using R , RStudio , and a remote computer (using an R
>>> script) to run the same code. I start by using set.seed(123) in all three
>>> versions of the code, then using glmnet to assess a matrix. Ultimately, I
>>> am having trouble reproducing the results between my local and the remote
>>> computer's results. I am using R version 4.0.2 locally, and R version 3.6.0
>>> remote.
>>> >
>>> > After running several tests, I'm wondering if there is a difference
>>> between the two versions in R which may lead to slightly different
>>> coefficients. If anyone has any insight I would appreciate it.
>>> >
>>> > Thanks.
>>> >
>>> > and found that there were slight differences between using rnorm with
>>> R-4.0.2 and R-3.6.0 but did not find any differences for runif between both
>>> systems. In my original code, I am using rnorm and was wondering if this
>>> may be the reason I am finding slight differences in coefficients for
>>> glmnet and lars testing between using my local computer (R-4.0.2) and my
>>> remote computer (R-3.6.0). I am running my code locally on a MacOSX and
>>> remote on what I believe is an HPC.
>>> >
>>> > Thanks.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-09 Thread Duncan Murdoch

On 09/08/2020 8:33 a.m., Kevin Egan wrote:

> Hi Abby,
>
> After running a few tests on my local and remote versions of R, this seems
> to be the most plausible answer to the problem. I put set.seed(123)
> several times within my code and produced the same results but would rather
> not have to do that if possible.


You should look at the doRNG package, which addresses exactly this 
problem.  See its vignette, vignette("doRNG", package="doRNG").
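
As a rough illustration of what the doRNG suggestion looks like in practice, here is a minimal sketch. It assumes the foreach, doParallel and doRNG packages are installed; the two-worker cluster and the loop body are arbitrary choices for the example, not anything taken from Kevin's code.

```r
# Assumes foreach, doParallel and doRNG are installed.
library(foreach)
library(doParallel)
library(doRNG)

# A small local cluster; the worker count is arbitrary for this sketch.
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# %dorng% replaces %dopar% and gives each iteration its own RNG stream,
# derived from the seed set just before the loop.
set.seed(123)
a <- foreach(i = 1:4, .combine = c) %dorng% rnorm(1)

set.seed(123)
b <- foreach(i = 1:4, .combine = c) %dorng% rnorm(1)

parallel::stopCluster(cl)

# The two runs should match, with a single set.seed() call per loop.
identical(a, b)
```

With this pattern a single set.seed() call before each parallel loop should be enough, rather than reseeding in several places.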


Duncan Murdoch



> On Sat, Aug 8, 2020 at 6:05 PM Abby Spurdle wrote:
>
>> Hi Kevin,
>>
>> Intuitively, the first step would be to ensure that all versions of R,
>> and all the R packages, are the same.
>>
>> However, you mention HPC.
>> And the glmnet package imports the foreach package, which appears
>> (after a quick glance) to support multi-core and parallel computing.
>>
>> If your code uses parallel computing (?), you may need to look at how
>> random numbers, and related results, are handled...
>>
>> On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan wrote:
>> >
>> > I posted this question:
>> >
>> > I am currently using R , RStudio , and a remote computer (using an R
>> > script) to run the same code. I start by using set.seed(123) in all three
>> > versions of the code, then using glmnet to assess a matrix. Ultimately, I
>> > am having trouble reproducing the results between my local and the remote
>> > computer's results. I am using R version 4.0.2 locally, and R version 3.6.0
>> > remote.
>> >
>> > After running several tests, I'm wondering if there is a difference
>> > between the two versions in R which may lead to slightly different
>> > coefficients. If anyone has any insight I would appreciate it.
>> >
>> > Thanks.
>> >
>> > and found that there were slight differences between using rnorm with
>> > R-4.0.2 and R-3.6.0 but did not find any differences for runif between both
>> > systems. In my original code, I am using rnorm and was wondering if this
>> > may be the reason I am finding slight differences in coefficients for
>> > glmnet and lars testing between using my local computer (R-4.0.2) and my
>> > remote computer (R-3.6.0). I am running my code locally on a MacOSX and
>> > remote on what I believe is an HPC.
>> >
>> > Thanks.


Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-09 Thread stephen sefick
Hi Kevin,

I think Abby has suggested something similar to what I think the problem is
related to - environment setup.

Some possible solutions:
The renv and packrat packages are a way to version your packages to help
with reproducibility. Anaconda might be a solution for the R version and
package version problem, if installed on your hpc. Docker could work as
well (maybe the best option if installed). There are other workarounds, but
I would have to know how your particular hpc/compute environment is set up
to comment further.

Brass tacks:
I think you need to ensure all your package versions (R and add-on
packages) are the same.
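
One way to check this on each machine, without assuming any particular tool is available, is to compare installed versions against a list of pinned ones. The sketch below uses base R only; `check_versions` is a hypothetical helper, and the versions in the usage comment are placeholders, not values taken from this thread. With renv, `renv::snapshot()` and `renv::restore()` play the corresponding roles through the renv.lock file.

```r
# Hypothetical helper: TRUE for each package whose installed version
# equals the pinned one, FALSE if it differs or is not installed.
check_versions <- function(pins) {
  vapply(names(pins), function(pkg) {
    installed <- tryCatch(utils::packageVersion(pkg),
                          error = function(e) NULL)
    # Compare as version objects so "4.0-2" and "4.0.2" are treated alike.
    !is.null(installed) && installed == package_version(pins[[pkg]])
  }, logical(1))
}

# The running R version is worth recording alongside the package pins.
print(getRversion())

# Example call with a package that ships with R, so it runs anywhere.
print(check_versions(c(stats = as.character(getRversion()))))

# For the analysis packages, substitute the real pins (placeholders here):
# check_versions(c(glmnet = "x.y-z", lars = "x.y"))
```

Running the same call locally and in the batch job should show quickly whether the two environments agree.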

Fwiw,

Stephen

On Sun, Aug 9, 2020, 08:26 Kevin Egan  wrote:

> Hi Stephen,
>
> I believe I am using Renv, but on my remote computer I am running batch
> files.
>
> Thanks,
>
> Kevin
>
> On 8 Aug 2020, at 18:18, stephen sefick  wrote:
>
> Caveat, I have only skimmed this email thread, so please forgive me if I
> have missed something.
>
> Are you able to use Renv, packrat, docker, or anaconda? Your compute
> environments are very different.
> Kindest regards,
>
> Stephen Sefick
>
> On Sat, Aug 8, 2020, 19:05 Abby Spurdle  wrote:
>
>> Hi Kevin,
>>
>> Intuitively, the first step would be to ensure that all versions of R,
>> and all the R packages, are the same.
>>
>> However, you mention HPC.
>> And the glmnet package imports the foreach package, which appears
>> (after a quick glance) to support multi-core and parallel computing.
>>
>> If your code uses parallel computing (?), you may need to look at how
>> random numbers, and related results, are handled...
>>
>>
>> On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan  wrote:
>> >
>> > I posted this question:
>> >
>> > I am currently using R , RStudio , and a remote computer (using an R
>> script) to run the same code. I start by using set.seed(123) in all three
>> versions of the code, then using glmnet to assess a matrix. Ultimately, I
>> am having trouble reproducing the results between my local and the remote
>> computer's results. I am using R version 4.0.2 locally, and R version 3.6.0
>> remote.
>> >
>> > After running several tests, I'm wondering if there is a difference
>> between the two versions in R which may lead to slightly different
>> coefficients. If anyone has any insight I would appreciate it.
>> >
>> > Thanks.
>> >
>> > and found that there were slight differences between using rnorm with
>> R-4.0.2 and R-3.6.0 but did not find any differences for runif between both
>> systems. In my original code, I am using rnorm and was wondering if this
>> may be the reason I am finding slight differences in coefficients for
>> glmnet and lars testing between using my local computer (R-4.0.2) and my
>> remote computer (R-3.6.0). I am running my code locally on a MacOSX and
>> remote on what I believe is an HPC.
>> >
>> > Thanks.


Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-09 Thread Kevin Egan
Hi Abby,

After running a few tests on my local and remote versions of R, this seems
to be the most plausible answer to the problem. I put set.seed(123)
several times within my code and produced the same results but would rather
not have to do that if possible.


On Sat, Aug 8, 2020 at 6:05 PM Abby Spurdle  wrote:

> Hi Kevin,
>
> Intuitively, the first step would be to ensure that all versions of R,
> and all the R packages, are the same.
>
> However, you mention HPC.
> And the glmnet package imports the foreach package, which appears
> (after a quick glance) to support multi-core and parallel computing.
>
> If your code uses parallel computing (?), you may need to look at how
> random numbers, and related results, are handled...
>
>
> On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan  wrote:
> >
> > I posted this question:
> >
> > I am currently using R , RStudio , and a remote computer (using an R
> script) to run the same code. I start by using set.seed(123) in all three
> versions of the code, then using glmnet to assess a matrix. Ultimately, I
> am having trouble reproducing the results between my local and the remote
> computer's results. I am using R version 4.0.2 locally, and R version 3.6.0
> remote.
> >
> > After running several tests, I'm wondering if there is a difference
> between the two versions in R which may lead to slightly different
> coefficients. If anyone has any insight I would appreciate it.
> >
> > Thanks.
> >
> > and found that there were slight differences between using rnorm with
> R-4.0.2 and R-3.6.0 but did not find any differences for runif between both
> systems. In my original code, I am using rnorm and was wondering if this
> may be the reason I am finding slight differences in coefficients for
> glmnet and lars testing between using my local computer (R-4.0.2) and my
> remote computer (R-3.6.0). I am running my code locally on a MacOSX and
> remote on what I believe is an HPC.
> >
> > Thanks.


Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-09 Thread Kevin Egan
Local:
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] crayon_1.3.4     dplyr_1.0.0      R6_2.4.1         lifecycle_0.2.0  magrittr_1.5     pillar_1.4.3
 [7] rlang_0.4.7      rstudioapi_0.11  vctrs_0.3.1      generics_0.0.2   ellipsis_0.3.0   tools_4.0.2
[13] glue_1.4.1       purrr_0.3.4      yaml_2.2.1       compiler_4.0.2   pkgconfig_2.0.3  tidyselect_1.1.0
[19] tibble_3.0.1


Remote:
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /ddn/apps/Cluster-Apps/intel/2019.5/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

locale:
[1] C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

loaded via a namespace (and not attached):
[1] compiler_3.6.3

> On 8 Aug 2020, at 08:17, Jeff Newmiller  wrote:
> 
> Compare the sessionInfo outputs for the different environments.
> 
> On August 7, 2020 1:24:55 PM PDT, Kevin Egan  wrote:
>> I posted this question:
>> 
>> I am currently using R , RStudio , and a remote computer (using an R
>> script) to run the same code. I start by using set.seed(123) in all
>> three versions of the code, then using glmnet to assess a matrix.
>> Ultimately, I am having trouble reproducing the results between my
>> local and the remote computer's results. I am using R version 4.0.2
>> locally, and R version 3.6.0 remote.
>> 
>> After running several tests, I'm wondering if there is a difference
>> between the two versions in R which may lead to slightly different
>> coefficients. If anyone has any insight I would appreciate it.
>> 
>> Thanks.
>> 
>> and found that there were slight differences between using rnorm with
>> R-4.0.2 and R-3.6.0 but did not find any differences for runif between
>> both systems. In my original code, I am using rnorm and was wondering
>> if this may be the reason I am finding slight differences in
>> coefficients for glmnet and lars testing between using my local
>> computer (R-4.0.2) and my remote computer (R-3.6.0). I am running my
>> code locally on a MacOSX and remote on what I believe is an HPC.
>> 
>> Thanks.
> 
> -- 
> Sent from my phone. Please excuse my brevity.




Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-08 Thread stephen sefick
Caveat, I have only skimmed this email thread, so please forgive me if I
have missed something.

Are you able to use Renv, packrat, docker, or anaconda? Your compute
environments are very different.
Kindest regards,

Stephen Sefick

On Sat, Aug 8, 2020, 19:05 Abby Spurdle  wrote:

> Hi Kevin,
>
> Intuitively, the first step would be to ensure that all versions of R,
> and all the R packages, are the same.
>
> However, you mention HPC.
> And the glmnet package imports the foreach package, which appears
> (after a quick glance) to support multi-core and parallel computing.
>
> If your code uses parallel computing (?), you may need to look at how
> random numbers, and related results, are handled...
>
>
> On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan  wrote:
> >
> > I posted this question:
> >
> > I am currently using R , RStudio , and a remote computer (using an R
> script) to run the same code. I start by using set.seed(123) in all three
> versions of the code, then using glmnet to assess a matrix. Ultimately, I
> am having trouble reproducing the results between my local and the remote
> computer's results. I am using R version 4.0.2 locally, and R version 3.6.0
> remote.
> >
> > After running several tests, I'm wondering if there is a difference
> between the two versions in R which may lead to slightly different
> coefficients. If anyone has any insight I would appreciate it.
> >
> > Thanks.
> >
> > and found that there were slight differences between using rnorm with
> R-4.0.2 and R-3.6.0 but did not find any differences for runif between both
> systems. In my original code, I am using rnorm and was wondering if this
> may be the reason I am finding slight differences in coefficients for
> glmnet and lars testing between using my local computer (R-4.0.2) and my
> remote computer (R-3.6.0). I am running my code locally on a MacOSX and
> remote on what I believe is an HPC.
> >
> > Thanks.


Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-08 Thread Duncan Murdoch

On 08/08/2020 9:34 a.m., Marc Schwartz via R-help wrote:

> Hi,
>
> I was initially going to think that the change in the RNG might be the source,
> however, that change was made in 3.6.0 and would have applied to runif() and
> sample():
>
> "sample.kind can be "Rounding" or "Rejection", or partial matches to these. The
> former was the default in versions prior to 3.6.0: it made sample noticeably
> non-uniform on large populations, and should only be used for reproduction of
> old results. See PR#17494 for a discussion."



That still may be an issue.  If a user saves a workspace in an old 
version and reloads it in a newer version, I believe they get the old 
version of the RNG.


You need to check that the output of RNGkind() matches in all machines 
to know that they're using the same RNGs.
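
A minimal sketch of that check, using base R only. The three settings pinned here are, as far as I know, the defaults in R 3.6.0 and later; the `sample.kind` argument does not exist in earlier versions, so this assumes both machines run at least 3.6.0.

```r
# Report the generators in use; run this on every machine and compare.
print(RNGkind())

# Pin all three settings explicitly rather than relying on defaults or on
# whatever a restored workspace may have left behind.
RNGkind(kind = "Mersenne-Twister",
        normal.kind = "Inversion",
        sample.kind = "Rejection")

# With the generators pinned, the same seed should give the same draws.
set.seed(123)
x <- rnorm(3)

set.seed(123)
y <- rnorm(3)

identical(x, y)
```

Putting the RNGkind() call at the top of the script, before set.seed(), makes the batch run independent of any saved workspace that gets loaded at startup.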


Duncan Murdoch


> Three other possibilities:
>
> 1. Read news() for your local 4.0.2 installation, as there are some changes
> that were made, including some changes to round() that could be applicable here.
>
> 2. Check to see if the version of glmnet is the same on both machines. There
> have been changes to that package that might be relevant here and you might
> read the README and NEWS files for the package on CRAN to see if there is any
> relevant information there.
>
> 3. There is always a chance that different hardware and OS versions could lead
> to issues, especially out to a number of decimal places that could alter
> results. If you or via an Admin, have the ability to update the remote machine
> (both R and installed packages), that can help to reduce the number of
> variables at play here.
>
> Regards,
>
> Marc Schwartz
>
>> On Aug 7, 2020, at 4:24 PM, Kevin Egan wrote:
>>
>> I posted this question:
>>
>> I am currently using R , RStudio , and a remote computer (using an R script) to
>> run the same code. I start by using set.seed(123) in all three versions of the
>> code, then using glmnet to assess a matrix. Ultimately, I am having trouble
>> reproducing the results between my local and the remote computer's results. I
>> am using R version 4.0.2 locally, and R version 3.6.0 remote.
>>
>> After running several tests, I'm wondering if there is a difference between the
>> two versions in R which may lead to slightly different coefficients. If anyone
>> has any insight I would appreciate it.
>>
>> Thanks.
>>
>> and found that there were slight differences between using rnorm with R-4.0.2
>> and R-3.6.0 but did not find any differences for runif between both systems. In
>> my original code, I am using rnorm and was wondering if this may be the reason
>> I am finding slight differences in coefficients for glmnet and lars testing
>> between using my local computer (R-4.0.2) and my remote computer (R-3.6.0). I
>> am running my code locally on a MacOSX and remote on what I believe is an HPC.
>>
>> Thanks.




Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-08 Thread Abby Spurdle
Hi Kevin,

Intuitively, the first step would be to ensure that all versions of R,
and all the R packages, are the same.

However, you mention HPC.
And the glmnet package imports the foreach package, which appears
(after a quick glance) to support multi-core and parallel computing.

If your code uses parallel computing (?), you may need to look at how
random numbers, and related results, are handled...
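
To make the point about parallel random numbers concrete, here is a small sketch using only the parallel package that ships with R. It is not taken from Kevin's code; `run_once` is a hypothetical helper and the two-worker cluster is an arbitrary choice. Calling set.seed() in the main session does not seed the workers, so the workers are given reproducible streams with clusterSetRNGStream().

```r
library(parallel)

# Hypothetical helper: draw one normal variate per task on a fresh cluster
# whose workers have been seeded from `seed`.
run_once <- function(seed) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))
  # Gives each worker its own L'Ecuyer-CMRG stream derived from `seed`.
  clusterSetRNGStream(cl, iseed = seed)
  unlist(parLapply(cl, 1:4, function(i) rnorm(1)))
}

# Same seed, same cluster size: the draws should match.
identical(run_once(123), run_once(123))
```

The result depends on the number of workers as well as the seed, so the cluster size has to match between machines for this to reproduce.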


On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan  wrote:
>
> I posted this question:
>
> I am currently using R , RStudio , and a remote computer (using an R script) 
> to run the same code. I start by using set.seed(123) in all three versions of 
> the code, then using glmnet to assess a matrix. Ultimately, I am having 
> trouble reproducing the results between my local and the remote computer's 
> results. I am using R version 4.0.2 locally, and R version 3.6.0 remote.
>
> After running several tests, I'm wondering if there is a difference between 
> the two versions in R which may lead to slightly different coefficients. If 
> anyone has any insight I would appreciate it.
>
> Thanks.
>
> and found that there were slight differences between using rnorm with R-4.0.2 
> and R-3.6.0 but did not find any differences for runif between both systems. 
> In my original code, I am using rnorm and was wondering if this may be the 
> reason I am finding slight differences in coefficients for glmnet and lars 
> testing between using my local computer (R-4.0.2) and my remote computer 
> (R-3.6.0). I am running my code locally on a MacOSX and remote on what I 
> believe is an HPC.
>
> Thanks.


Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-08 Thread Jeff Newmiller
You did not load the corresponding packages in both environments.

Also, please post in plain-text format, per the Posting Guide mentioned in the 
footer of every post.

On August 8, 2020 7:15:16 AM PDT, Kevin Egan  wrote:
>Local:
>R version 4.0.2 (2020-06-22)
>Platform: x86_64-apple-darwin17.0 (64-bit)
>Running under: macOS Catalina 10.15.6
>
>Matrix products: default
>BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
>LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
>
>locale:
>[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>
>attached base packages:
>[1] stats     graphics  grDevices utils     datasets  methods   base
>
>loaded via a namespace (and not attached):
> [1] crayon_1.3.4     dplyr_1.0.0      R6_2.4.1         lifecycle_0.2.0  magrittr_1.5     pillar_1.4.3
> [7] rlang_0.4.7      rstudioapi_0.11  vctrs_0.3.1      generics_0.0.2   ellipsis_0.3.0   tools_4.0.2
>[13] glue_1.4.1       purrr_0.3.4      yaml_2.2.1       compiler_4.0.2   pkgconfig_2.0.3  tidyselect_1.1.0
>[19] tibble_3.0.1
>
>
>Remote:
>> sessionInfo()
>R version 3.6.3 (2020-02-29)
>Platform: x86_64-pc-linux-gnu (64-bit)
>Running under: CentOS Linux 7 (Core)
>
>Matrix products: default
>BLAS/LAPACK:
>/ddn/apps/Cluster-Apps/intel/2019.5/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so
>
>locale:
>[1] C
>
>attached base packages:
>[1] stats graphics  grDevices utils datasets  methods   base   
> 
>
>loaded via a namespace (and not attached):
>[1] compiler_3.6.3
>
>> On 8 Aug 2020, at 08:17, Jeff Newmiller wrote:
>> 
>> Compare the sessionInfo outputs for the different environments.
>> 
>> On August 7, 2020 1:24:55 PM PDT, Kevin Egan wrote:
>>> I posted this question:
>>> 
>>> I am currently using R , RStudio , and a remote computer (using an R
>>> script) to run the same code. I start by using set.seed(123) in all
>>> three versions of the code, then using glmnet to assess a matrix.
>>> Ultimately, I am having trouble reproducing the results between my
>>> local and the remote computer's results. I am using R version 4.0.2
>>> locally, and R version 3.6.0 remote.
>>> 
>>> After running several tests, I'm wondering if there is a difference
>>> between the two versions in R which may lead to slightly different
>>> coefficients. If anyone has any insight I would appreciate it.
>>> 
>>> Thanks.
>>> 
>>> and found that there were slight differences between using rnorm with
>>> R-4.0.2 and R-3.6.0 but did not find any differences for runif between
>>> both systems. In my original code, I am using rnorm and was wondering
>>> if this may be the reason I am finding slight differences in
>>> coefficients for glmnet and lars testing between using my local
>>> computer (R-4.0.2) and my remote computer (R-3.6.0). I am running my
>>> code locally on a MacOSX and remote on what I believe is an HPC.
>>> 
>>> Thanks.
>> 
>> -- 
>> Sent from my phone. Please excuse my brevity.

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-08 Thread Marc Schwartz via R-help
Hi,

My first thought was that the change in the RNG might be the source; 
however, that change was made in 3.6.0 and would have applied to runif() and 
sample():

"sample.kind can be "Rounding" or "Rejection", or partial matches to these. The 
former was the default in versions prior to 3.6.0: it made sample noticeably 
non-uniform on large populations, and should only be used for reproduction of 
old results. See PR#17494 for a discussion."
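
For anyone who wants to see which functions the `sample.kind` setting touches, the following base-R sketch reseeds under each setting and compares the draws. `draw` and `same_under_both` are hypothetical helper names. My understanding of the documentation is that the setting governs sample() only, so the runif() and rnorm() comparisons are included as a check rather than as a claim.

```r
# Reseed under the given sample.kind, then evaluate f().
draw <- function(kind, f) {
  # "Rounding" triggers a warning about the non-uniform sampler; silence it.
  suppressWarnings(set.seed(123, sample.kind = kind))
  f()
}

# TRUE if f() gives the same result under both settings.
same_under_both <- function(f) {
  identical(draw("Rounding", f), draw("Rejection", f))
}

print(same_under_both(function() sample(10, 3)))
print(same_under_both(function() runif(3)))
print(same_under_both(function() rnorm(3)))

# Leave the session on the current default.
RNGkind(sample.kind = "Rejection")
```

Each comparison reseeds first, so a difference reflects the sampler itself and not how many random numbers an earlier call consumed.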

Three other possibilities:

1. Read news() for your local 4.0.2 installation, as there are some changes 
that were made, including some changes to round() that could be applicable here.

2. Check to see if the version of glmnet is the same on both machines. There 
have been changes to that package that might be relevant here and you might 
read the README and NEWS files for the package on CRAN to see if there is any 
relevant information there.

3. There is always a chance that different hardware and OS versions could lead 
to issues, especially out to a number of decimal places that could alter 
results. If you, or an admin on your behalf, are able to update the remote machine 
(both R and installed packages), that can help to reduce the number of 
variables at play here.
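
On the third point, it can help to measure how far apart the two sets of coefficients are before hunting for a cause. The sketch below uses made-up numbers, not values from this thread, to contrast an exact comparison with a tolerance-based one; the tolerance of 1e-8 is an arbitrary choice.

```r
# Made-up coefficient vectors standing in for the local and remote fits.
coef_local  <- c(0.1234567891, -1.5, 0)
coef_remote <- coef_local + c(1e-12, -2e-12, 0)

# Exact comparison: any difference in the last bits counts.
print(identical(coef_local, coef_remote))

# Comparison up to a relative tolerance.
print(isTRUE(all.equal(coef_local, coef_remote, tolerance = 1e-8)))

# Size of the largest discrepancy.
print(max(abs(coef_local - coef_remote)))
```

Differences near machine precision point towards BLAS, hardware or compiler effects; differences in the first few digits suggest different random draws or package versions.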

Regards,

Marc Schwartz


> On Aug 7, 2020, at 4:24 PM, Kevin Egan  wrote:
> 
> I posted this question:
> 
> I am currently using R , RStudio , and a remote computer (using an R script) 
> to run the same code. I start by using set.seed(123) in all three versions of 
> the code, then using glmnet to assess a matrix. Ultimately, I am having 
> trouble reproducing the results between my local and the remote computer's 
> results. I am using R version 4.0.2 locally, and R version 3.6.0 remote.
> 
> After running several tests, I'm wondering if there is a difference between 
> the two versions in R which may lead to slightly different coefficients. If 
> anyone has any insight I would appreciate it.
> 
> Thanks.
> 
> and found that there were slight differences between using rnorm with R-4.0.2 
> and R-3.6.0 but did not find any differences for runif between both systems. 
> In my original code, I am using rnorm and was wondering if this may be the 
> reason I am finding slight differences in coefficients for glmnet and lars 
> testing between using my local computer (R-4.0.2) and my remote computer 
> (R-3.6.0). I am running my code locally on a MacOSX and remote on what I 
> believe is an HPC.
> 
> Thanks.



Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-08 Thread Jeff Newmiller
Compare the sessionInfo outputs for the different environments.
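
A small sketch of one way to do that comparison without reading two printouts side by side. `session_summary` and `differing_fields` are hypothetical helpers built on base R; load the packages the analysis uses (glmnet, lars) before calling `session_summary()`, otherwise their versions will not appear.

```r
# Collect the parts of sessionInfo() most likely to affect numerical results.
session_summary <- function() {
  info <- sessionInfo()
  pkgs <- c(info$otherPkgs, info$loadedOnly)
  versions <- vapply(pkgs, function(p) as.character(p$Version), character(1))
  list(
    r_version = info$R.version$version.string,
    platform  = info$platform,
    blas      = info$BLAS,
    lapack    = info$LAPACK,
    rng       = RNGkind(),
    packages  = versions[order(names(versions))]
  )
}

# Names of the fields that differ between two summaries.
differing_fields <- function(a, b) {
  fields <- union(names(a), names(b))
  same <- vapply(fields, function(f) identical(a[[f]], b[[f]]), logical(1))
  fields[!same]
}

# On each machine, save the summary to a file (a temporary path here).
path <- tempfile(fileext = ".rds")
saveRDS(session_summary(), path)

# After copying both files to one place, read them and compare, for example:
# differing_fields(readRDS("local.rds"), readRDS("remote.rds"))
print(differing_fields(readRDS(path), session_summary()))
```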

On August 7, 2020 1:24:55 PM PDT, Kevin Egan  wrote:
>I posted this question:
>
>I am currently using R , RStudio , and a remote computer (using an R
>script) to run the same code. I start by using set.seed(123) in all
>three versions of the code, then using glmnet to assess a matrix.
>Ultimately, I am having trouble reproducing the results between my
>local and the remote computer's results. I am using R version 4.0.2
>locally, and R version 3.6.0 remote.
>
>After running several tests, I'm wondering if there is a difference
>between the two versions in R which may lead to slightly different
>coefficients. If anyone has any insight I would appreciate it.
>
>Thanks.
>
>and found that there were slight differences between using rnorm with
>R-4.0.2 and R-3.6.0 but did not find any differences for runif between
>both systems. In my original code, I am using rnorm and was wondering
>if this may be the reason I am finding slight differences in
>coefficients for glmnet and lars testing between using my local
>computer (R-4.0.2) and my remote computer (R-3.6.0). I am running my
>code locally on a MacOSX and remote on what I believe is an HPC.
>
>Thanks.

-- 
Sent from my phone. Please excuse my brevity.
