Re: [Rd] parallel PSOCK connection latency is greater on Linux?

2020-11-01 Thread Simon Urbanek
It looks like R sockets on Linux could do with TCP_NODELAY -- without (status 
quo):

Unit: microseconds
                   expr      min       lq     mean  median       uq      max neval
 clusterEvalQ(cl, iris) 1449.997 43991.99 43975.21 43997.1 44001.91 48027.83  1000

exactly the same machine + R but with TCP_NODELAY enabled in R_SockConnect():

Unit: microseconds
                   expr     min     lq     mean  median      uq      max neval
 clusterEvalQ(cl, iris) 156.125 166.41 180.8806 170.247 174.298 5322.234  1000

Cheers,
Simon
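
To put the two runs side by side, here is a short calculation on the medians reported above (the variable names are illustrative, not from the thread). The median of roughly 44 ms without TCP_NODELAY is in the range one might expect if Nagle's algorithm were interacting with delayed ACKs, though the message itself does not spell out that diagnosis.

```r
# Median round-trip times in microseconds, copied from the two runs above
median_default <- 43997.1   # status quo
median_nodelay <- 170.247   # TCP_NODELAY enabled in R_SockConnect()

# How many times faster the median call is with TCP_NODELAY
speedup <- median_default / median_nodelay
round(speedup)

# The status-quo median expressed in milliseconds
median_default / 1000
```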


> On 2/11/2020, at 3:39 AM, Jeff  wrote:
> 
> I'm exploring latency overhead of parallel PSOCK workers and noticed that 
> serializing/unserializing data back to the main R session is significantly 
> slower on Linux than it is on Windows/MacOS with similar hardware. Is there a 
> reason for this difference and is there a way to avoid the apparent 
> additional Linux overhead?
> 
> I attempted to isolate the behavior with a test that simply returns an 
> existing object from the worker back to the main R session.
> 
> library(parallel)
> library(microbenchmark)
> gcinfo(TRUE)
> cl <- makeCluster(1)
> (x <- microbenchmark(clusterEvalQ(cl, iris), times = 1000, unit = "us"))
> plot(x$time, ylab = "microseconds")
> head(x$time, n = 10)
> 
> On Windows/MacOS, the test runs in 300-500 microseconds depending on 
> hardware. A few of the 1000 runs are an order of magnitude slower but this 
> can probably be attributed to garbage collection on the worker.
> 
> On Linux, the first 5 or so executions run at comparable speeds but all 
> subsequent executions are two orders of magnitude slower (~40 milliseconds).
> 
> I see this behavior across various platforms and hardware combinations:
> 
> Ubuntu 18.04 (Intel Xeon Platinum 8259CL)
> Linux Mint 19.3 (AMD Ryzen 7 1800X)
> Linux Mint 20 (AMD Ryzen 7 3700X)
> Windows 10 (AMD Ryzen 7 4800H)
> MacOS 10.15.7 (Intel Core i7-8850H)
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Duncan Murdoch

On 01/11/2020 2:57 p.m., Dirk Eddelbuettel wrote:
> 
> The closest to a canonical reference for a static vignette is the basic blog
> post by Mark at
> 
>   https://www.markvanderloo.eu/yaRb/2019/01/11/add-a-static-pdf-vignette-to-an-r-package/
> 
> which I follow in a number of packages.
> 
> Back to the original point by Alexandre: No, I do _not_ think we can do
> without a double copy of the _pre-made_ pdf ("input") and the _resulting_ pdf
> ("output").
> 
> That bugs me a little too but I take it as a given as static / pre-made
> vignettes are non-standard (given lack of any mention in WRE, and the pretty
> obvious violation of the "spirit of the law" of vignette which is after all
> made to run code, not to avoid it). Yet uses for static vignettes are pretty
> valid and here we are with another clear as mud situation.

In many cases such files aren't vignettes.

By definition, packages should contain plain text source code for 
vignettes.  They can contain other PDF files in inst/doc, but if you 
don't include the plain text source, those aren't vignettes.


An exception would be a package that contains the source code but 
doesn't want to require CRAN or other users to run it, because it's too 
time-consuming, or needs obscure resources.  The CRAN policy discusses this.


Duncan Murdoch



[R-pkg-devel] URLencode at DESCRIPTION file and citation()

2020-11-01 Thread Amanda Rehbein
Dear R-Devs,

I was wondering if someone could please help me with two errors when
submitting a package to CRAN.

1) In the DESCRIPTION file I added some DOIs that contain angle brackets "<>"
inside the enclosing "<>" (e.g. 2.0.CO;2>).
I tried running URLencode() on the string in R and pasting the output into
the DESCRIPTION file, like this:
. But when I load,
build, check, etc., and open ?raytracing, the DOI does not appear correctly.
Everything after the first "%" disappears and, obviously, the DOI no longer
works.


2) I have a CITATION file in the sub-directory inst. Its content is like
this:

citHeader("To cite raytracing in publications use one of below:")

citEntry(entry = "manual",
   title  = "Atmospheric Rossby waves identification and tracking with a raytracing numerical model",
   author = personList(c(person("Amanda", "Rehbein"),
                         person("Tercio", "Ambrizzi"),
                         person("Sergio", "Ibarra-Espinosa"),
                         person("Livia M. M.", "Dutra"))),
   year   = "2020",
   url    = "https://github.com/salvatirehbein/raytracing",

   textVersion  =
   paste0("Rehbein, A., Ambrizzi, T., Ibarra-Espinosa, S., Dutra, L. M. M.: ",
          "Atmospheric Rossby waves identification and tracking with a raytracing ",
          "numerical model ", packageVersion("raytracing"), ". ",
          "https://github.com/salvatirehbein/raytracing, 2020.")
)

It passes all checks and gives a-okay citation when I type
citation("raytracing"). However, it fails in the CRAN tests with the
following error message.

Reading CITATION file fails with
 there is no package called 'raytracing'
   when package is not installed.

I will be very thankful for any help with this! Best,
Amanda
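
One possible way around the "there is no package called 'raytracing'" failure, offered as a sketch and not as the answer given on the list: a CITATION file is evaluated with an object named meta that holds the package's DESCRIPTION fields, so the version can be read from meta$Version without calling packageVersion() on a package that may not be installed yet. The sketch below uses bibentry() in place of citEntry(); the temporary file and the Version value handed to utils::readCitationFile() are placeholders made up for illustration.

```r
# A CITATION file that takes the version from `meta` instead of
# packageVersion(), so it can be parsed without the package installed.
citation_lines <- c(
  'citHeader("To cite raytracing in publications use:")',
  '',
  'bibentry(bibtype = "Manual",',
  '  title  = "Atmospheric Rossby waves identification and tracking with a raytracing numerical model",',
  '  author = c(person("Amanda", "Rehbein"),',
  '             person("Tercio", "Ambrizzi"),',
  '             person("Sergio", "Ibarra-Espinosa"),',
  '             person("Livia M. M.", "Dutra")),',
  '  year   = "2020",',
  '  note   = paste("R package version", meta$Version),',
  '  url    = "https://github.com/salvatirehbein/raytracing")'
)
citation_file <- tempfile("CITATION")   # stand-in for inst/CITATION
writeLines(citation_lines, citation_file)

# Parse the file, supplying DESCRIPTION-like metadata by hand.
# Version "0.1.0" is a placeholder, not the real package version.
cit <- utils::readCitationFile(citation_file, meta = list(Version = "0.1.0"))
print(cit, style = "text")
```

Because nothing in the file refers to an installed copy of the package, the same file can be read both before and after installation.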


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Spencer Graves
	  CRAN policies say, "neither data nor documentation should exceed 5MB 
(which covers several books). A CRAN package is not an appropriate way 
to distribute course notes, and authors will be asked to trim their 
documentation to a maximum of 5MB."[1]



	  I post R Markdown vignettes as companions to Wikiversity articles. 
For example, the Wikiversity article on "Forecasting nuclear 
proliferation" is a tech report on the indicated topic with two R 
Markdown vignettes as part of an appendix.[2]



	  Wikiversity is similar to Wikipedia but supports teaching materials 
and original research, which are forbidden on Wikipedia.  Both are 
projects of the Wikimedia Foundation and have very similar rules and 
management.  For both, almost anybody can change almost anything.  What 
stays tends to be written from a neutral point of view citing credible 
sources.  If you don't do that, your work may be speedily deleted or 
reverted.  Shi et al (2017) "The wisdom of polarized crowds" did a 
content analysis of all edits to English Wikipedia articles relating to 
politics, social issues and science from its start to December 1, 2016. 
This included almost 233,000 articles representing approximately 5 
percent of the English Wikipedia.  They found that the best articles had 
a large number of editors with very diverse views.  They said that 95% 
of articles could benefit from greater conflict;  the conflict became 
counterproductive in only about 5% of articles.[3]



  Spencer Graves


[1]
https://cran.r-project.org/web/packages/policies.html


[2]
https://en.wikiversity.org/wiki/Forecasting_nuclear_proliferation#Appendix._Companion_R_Markdown_vignettes


[3]
https://en.wikipedia.org/wiki/Reliability_of_Wikipedia#Articles_on_contentious_issue


On 2020-11-01 13:35, Ben Bolker wrote:
> I take Duncan's point but would second the motion to have WRE clarify
> how static vignettes are supposed to work; it's a topic I am repeatedly
> confused about despite being an experienced package maintainer. If
> knowledgeable outsiders compiled a documentation patch would it be
> likely to be considered ...??
>
> On 11/1/20 2:29 PM, Duncan Murdoch wrote:
> > On 01/11/2020 1:02 p.m., Alexandre Courtiol wrote:
> > > Noted Duncan and TRUE...
> > >
> > > I cannot do more immediately unfortunately, that is always the issue
> > > of asking a last minute panic attack question before teaching a
> > > course involving the package...
> > > I do have /doc in my .Rbuildignore for reasons I can no longer
> > > remember... I will dig and create a MRE/reprex.
> > > The students will download heavy packages, but they probably won't
> > > notice.
> > >
> > > *Apologies*
> > >
> > > In the meantime, perhaps my question was clear enough to get clarity on:
> > > 1) whether having vignettes twice in folders inst/doc and vignettes is
> > > normal or not when vignettes are static.
> > > 2) where could anyone find a complete documentation on R vignettes
> > > since it is a recurring issue in this list and elsewhere.
> >
> > The Writing R Extensions manual describes vignette support in R, but R
> > allows contributed packages (like knitr, rmarkdown, R.rsp) to handle
> > vignettes.  WRE explains enough to write such a package, but it's up
> > to their authors to document how to use them, so "complete
> > documentation" is spread out all over the place.  As with any
> > documentation, there are probably errors and omissions.
> >
> > Duncan Murdoch



Re: [Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Dirk Eddelbuettel


The closest to a canonical reference for a static vignette is the basic blog
post by Mark at

 
https://www.markvanderloo.eu/yaRb/2019/01/11/add-a-static-pdf-vignette-to-an-r-package/

which I follow in a number of packages.

Back to the original point by Alexandre: No, I do _not_ think we can do
without a double copy of the _pre-made_ pdf ("input") and the _resulting_ pdf
("output").

That bugs me a little too but I take it as a given as static / pre-made
vignettes are non-standard (given lack of any mention in WRE, and the pretty
obvious violation of the "spirit of the law" of vignette which is after all
made to run code, not to avoid it). Yet uses for static vignettes are pretty
valid and here we are with another clear as mud situation.

Dirk
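
For readers who have not seen the setup the blog post describes: the static-vignette approach pairs each pre-made PDF in vignettes/ with a small plain-text companion file, <name>.pdf.asis, which carries the vignette metadata and names the R.rsp::asis engine. A minimal sketch, assuming that engine is the one in use (the thread names R.rsp but not the engine); write_asis() and the example title are hypothetical and not part of R.rsp.

```r
# Write the companion "<pdf>.asis" file that sits next to a pre-made PDF
# in vignettes/. write_asis() is a hypothetical helper, not part of R.rsp.
write_asis <- function(pdf, title) {
  asis <- paste0(pdf, ".asis")
  writeLines(c(
    sprintf("%%\\VignetteIndexEntry{%s}", title),
    "%\\VignetteEngine{R.rsp::asis}"
  ), asis)
  invisible(asis)
}

# Example with a made-up file name, written to a temporary directory.
asis_file <- write_asis(file.path(tempdir(), "intro.pdf"), "Introduction")
readLines(asis_file)
```

The PDF is the "input" Dirk refers to; at build time the engine copies it unchanged to produce the "output" in inst/doc, which is where the second copy comes from.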

-- 
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Ben Bolker
  I take Duncan's point but would second the motion to have WRE clarify 
how static vignettes are supposed to work; it's a topic I am repeatedly 
confused about despite being an experienced package maintainer. If 
knowledgeable outsiders compiled a documentation patch would it be 
likely to be considered ...??


On 11/1/20 2:29 PM, Duncan Murdoch wrote:
> On 01/11/2020 1:02 p.m., Alexandre Courtiol wrote:
> > Noted Duncan and TRUE...
> >
> > I cannot do more immediately unfortunately, that is always the issue
> > of asking a last minute panic attack question before teaching a course
> > involving the package...
> > I do have /doc in my .Rbuildignore for reasons I can no longer
> > remember... I will dig and create a MRE/reprex.
> > The students will download heavy packages, but they probably won't
> > notice.
> >
> > *Apologies*
> >
> > In the meantime, perhaps my question was clear enough to get clarity on:
> > 1) whether having vignettes twice in folders inst/doc and vignettes is
> > normal or not when vignettes are static.
> > 2) where could anyone find a complete documentation on R vignettes
> > since it is a recurring issue in this list and elsewhere.
>
> The Writing R Extensions manual describes vignette support in R, but R
> allows contributed packages (like knitr, rmarkdown, R.rsp) to handle
> vignettes.  WRE explains enough to write such a package, but it's up to
> their authors to document how to use them, so "complete documentation"
> is spread out all over the place.  As with any documentation, there are
> probably errors and omissions.
>
> Duncan Murdoch



Re: [Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Duncan Murdoch

On 01/11/2020 1:02 p.m., Alexandre Courtiol wrote:
> Noted Duncan and TRUE...
>
> I cannot do more immediately unfortunately, that is always the issue of
> asking a last minute panic attack question before teaching a course
> involving the package...
> I do have /doc in my .Rbuildignore for reasons I can no longer
> remember... I will dig and create a MRE/reprex.
>
> The students will download heavy packages, but they probably won't notice.
> *Apologies*
>
> In the meantime, perhaps my question was clear enough to get clarity on:
> 1) whether having vignettes twice in folders inst/doc and vignettes is
> normal or not when vignettes are static.
> 2) where could anyone find a complete documentation on R vignettes since
> it is a recurring issue in this list and elsewhere.


The Writing R Extensions manual describes vignette support in R, but R 
allows contributed packages (like knitr, rmarkdown, R.rsp) to handle 
vignettes.  WRE explains enough to write such a package, but it's up to 
their authors to document how to use them, so "complete documentation" 
is spread out all over the place.  As with any documentation, there are 
probably errors and omissions.


Duncan Murdoch



Re: [Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Alexandre Courtiol
Noted Duncan and TRUE...

I cannot do more immediately unfortunately, that is always the issue of
asking a last minute panic attack question before teaching a course
involving the package...
I do have /doc in my .Rbuildignore for reasons I can no longer remember...
I will dig and create a MRE/reprex.
The students will download heavy packages, but they probably won't notice.
*Apologies*

In the meantime, perhaps my question was clear enough to get clarity on:
1) whether having vignettes twice in folders inst/doc and vignettes is
normal or not when vignettes are static.
2) where could anyone find a complete documentation on R vignettes since it
is a recurring issue in this list and elsewhere.

Many thanks
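
On the .Rbuildignore remark above: entries in that file are Perl-style regular expressions matched case-insensitively against paths relative to the package root, so an unanchored pattern can exclude more than intended. The sketch below only illustrates the matching rule; the exact pattern in Alexandre's file is not given in the thread, and both patterns, the paths, and is_ignored() are hypothetical.

```r
# Paths relative to the package root (made up for illustration)
paths <- c("doc", "inst/doc", "inst/doc/intro.pdf", "vignettes/intro.pdf.asis")

# Mimic the matching rule: Perl-style regex, case-insensitive.
# is_ignored() is a hypothetical helper, not an R function.
is_ignored <- function(pattern, paths) {
  grepl(pattern, paths, perl = TRUE, ignore.case = TRUE)
}

# An anchored pattern only hits a top-level "doc" directory
setNames(is_ignored("^doc$", paths), paths)

# A pattern like "/doc" also hits inst/doc and everything inside it
setNames(is_ignored("/doc", paths), paths)
```

If the second kind of pattern were present, inst/doc would be dropped from the tarball at build time, which would be one candidate explanation worth ruling out when putting together the reproducible example.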

On Sun, 1 Nov 2020 at 18:19, Duncan Murdoch wrote:

> You are doing a lot of things that are non-standard, so I doubt if
> anyone is going to be able to help you without access to a simple
> reproducible example of a package that does what you do.  Try to cut out
> as much as you can to make it minimal.  For example,
> devtools::document() (indeed, most of your code) is probably irrelevant
> to your problem with vignettes, but things like your .Rbuildignore file
> are not.
>
> Duncan Murdoch
>
> On 01/11/2020 11:22 a.m., Alexandre Courtiol wrote:
> > Dear all,
> >
> > I am struggling with an issue related to static vignettes: they work, but
> > only when present in double in the tarball -- in the folder inst/doc and
> > vignettes; see below for details.
> >
> > Details:
> >
> > I am pre-compiling heavy vignettes thanks to the vignette builder R.rsp.
> > So basically, I have PDF files which I want the package to use as
> > Vignettes.
> >
> > For this, I have the following in my Description file:
> > VignetteBuilder: R.rsp
> >
> > I am organising the vignette by hand using a Makefile (because this is
> > the only way that has proven 100% reliable to me, across a variety of
> > situations).
> >
> > In my Makefile, I have something like:
> >
> > build: clean
> >    mkdir -p inst/doc
> >    mkdir vignettes
> >    -cp sources_vignettes/*/*.pdf* vignettes
> >    Rscript -e "tools::compactPDF(paths = 'vignettes', gs_quality = 'printer')"
> >    cp vignettes/*.pdf* inst/doc
> >    Rscript -e "devtools::document()"
> >    mkdir inst/extdata/sources_vignettes
> >    cp sources_vignettes/*/*.Rnw inst/extdata/sources_vignettes
> >    Rscript -e "devtools::build(vignettes = FALSE)"
> >
> > That works fine, the vignettes show up using browseVignettes() after
> > installing the package the normal way.
> >
> > However, after building, the tar.gz contains each pdf corresponding to a
> > vignette twice: once in vignettes and once in inst/doc (which is obvious,
> > when reading the Makefile).
> >
> > From the reading of "Writing R Extensions" and other material, I cannot
> > tell if that is a must or not, but I hope it is not since I wish to avoid
> > that (my pdfs are large even once compressed).
> >
> > My problem is that when I delete either inst/doc or vignettes just before
> > calling the last command of the Makefile (Rscript -e
> > "devtools::build(vignettes = FALSE)"), then browseVignettes() does not
> > find the vignettes after a normal installation.
> >
> > If anyone knows of some _complete_ documentation about the ever
> > troublesome topic of vignettes building in R, I would be very grateful
> > too...
> >
> > Many thanks!
> >
> > Alex
>
>

-- 
Alexandre Courtiol

http://sites.google.com/site/alexandrecourtiol/home

*"Science is the belief in the ignorance of experts"*, R. Feynman



Re: [Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Duncan Murdoch
You are doing a lot of things that are non-standard, so I doubt if 
anyone is going to be able to help you without access to a simple 
reproducible example of a package that does what you do.  Try to cut out 
as much as you can to make it minimal.  For example, 
devtools::document() (indeed, most of your code) is probably irrelevant 
to your problem with vignettes, but things like your .Rbuildignore file 
are not.


Duncan Murdoch

On 01/11/2020 11:22 a.m., Alexandre Courtiol wrote:
> Dear all,
>
> I am struggling with an issue related to static vignettes: they work, but
> only when present in double in the tarball -- in the folder inst/doc and
> vignettes; see below for details.
>
> Details:
>
> I am pre-compiling heavy vignettes thanks to the vignette builder R.rsp.
> So basically, I have PDF files which I want the package to use as Vignettes.
>
> For this, I have the following in my Description file:
> VignetteBuilder: R.rsp
>
> I am organising the vignette by hand using a Makefile (because this is the
> only way that has proven 100% reliable to me, across a variety of
> situations).
>
> In my Makefile, I have something like:
>
> build: clean
>    mkdir -p inst/doc
>    mkdir vignettes
>    -cp sources_vignettes/*/*.pdf* vignettes
>    Rscript -e "tools::compactPDF(paths = 'vignettes', gs_quality = 'printer')"
>    cp vignettes/*.pdf* inst/doc
>    Rscript -e "devtools::document()"
>    mkdir inst/extdata/sources_vignettes
>    cp sources_vignettes/*/*.Rnw inst/extdata/sources_vignettes
>    Rscript -e "devtools::build(vignettes = FALSE)"
>
> That works fine, the vignettes show up using browseVignettes() after
> installing the package the normal way.
>
> However, after building, the tar.gz contains each pdf corresponding to a
> vignette twice: once in vignettes and once in inst/doc (which is obvious,
> when reading the Makefile).
>
> From the reading of "Writing R Extensions" and other material, I cannot
> tell if that is a must or not, but I hope it is not since I wish to avoid
> that (my pdfs are large even once compressed).
>
> My problem is that when I delete either inst/doc or vignettes just before
> calling the last command of the Makefile (Rscript -e
> "devtools::build(vignettes = FALSE)"), then browseVignettes() does not find
> the vignettes after a normal installation.
>
> If anyone knows of some _complete_ documentation about the ever troublesome
> topic of vignettes building in R, I would be very grateful too...
>
> Many thanks!
>
> Alex





[Rd] vignettes present in 2 folders or won't work

2020-11-01 Thread Alexandre Courtiol
Dear all,

I am struggling with an issue related to static vignettes: they work, but
only when present in double in the tarball -- in the folder inst/doc and
vignettes; see below for details.

Details:

I am pre-compiling heavy vignettes thanks to the vignette builder R.rsp.
So basically, I have PDF files which I want the package to use as Vignettes.

For this, I have the following in my Description file:
VignetteBuilder: R.rsp

I am organising the vignette by hand using a Makefile (because this is the
only way that has proven 100% reliable to me, across a variety of
situations).

In my Makefile, I have something like:

build: clean
  mkdir -p inst/doc
  mkdir vignettes
  -cp sources_vignettes/*/*.pdf* vignettes
  Rscript -e "tools::compactPDF(paths = 'vignettes', gs_quality = 'printer')"
  cp vignettes/*.pdf* inst/doc
  Rscript -e "devtools::document()"
  mkdir inst/extdata/sources_vignettes
  cp sources_vignettes/*/*.Rnw inst/extdata/sources_vignettes
  Rscript -e "devtools::build(vignettes = FALSE)"

That works fine, the vignettes show up using browseVignettes() after
installing the package the normal way.

However, after building, the tar.gz contains each pdf corresponding to a
vignette twice: once in vignettes and once in inst/doc (which is obvious,
when reading the Makefile).

From the reading of "Writing R Extensions" and other material, I cannot
tell if that is a must or not, but I hope it is not since I wish to avoid
that (my pdfs are large even once compressed).

My problem is that when I delete either inst/doc or vignettes just before
calling the last command of the Makefile (Rscript -e
"devtools::build(vignettes = FALSE)"), then browseVignettes() does not find
the vignettes after a normal installation.

If anyone knows of some _complete_ documentation about the ever troublesome
topic of vignettes building in R, I would be very grateful too...

Many thanks!

Alex
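
The duplication described above can be confirmed directly on a built tarball. A minimal sketch in base R; duplicated_vignette_pdfs() is a hypothetical helper and the tarball name in the usage comment is a placeholder.

```r
# List PDFs that a built package tarball carries in both vignettes/ and
# inst/doc/. duplicated_vignette_pdfs() is a hypothetical helper.
duplicated_vignette_pdfs <- function(tarball) {
  # Paths of all entries in the archive, without extracting anything
  files <- untar(tarball, list = TRUE)
  in_vignettes <- grep("/vignettes/[^/]+\\.pdf$", files, value = TRUE)
  in_inst_doc  <- grep("/inst/doc/[^/]+\\.pdf$", files, value = TRUE)
  # File names that occur in both directories
  intersect(basename(in_vignettes), basename(in_inst_doc))
}

# Usage, with a placeholder file name:
# duplicated_vignette_pdfs("mypkg_0.1.0.tar.gz")
```

Running this before and after removing one of the two directories shows exactly which copies the build step keeps.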

-- 
Alexandre Courtiol

http://sites.google.com/site/alexandrecourtiol/home

*"Science is the belief in the ignorance of experts"*, R. Feynman



[Rd] parallel PSOCK connection latency is greater on Linux?

2020-11-01 Thread Jeff
I'm exploring latency overhead of parallel PSOCK workers and noticed 
that serializing/unserializing data back to the main R session is 
significantly slower on Linux than it is on Windows/MacOS with similar 
hardware. Is there a reason for this difference and is there a way to 
avoid the apparent additional Linux overhead?


I attempted to isolate the behavior with a test that simply returns an 
existing object from the worker back to the main R session.


library(parallel)
library(microbenchmark)
gcinfo(TRUE)
cl <- makeCluster(1)
(x <- microbenchmark(clusterEvalQ(cl, iris), times = 1000, unit = "us"))
plot(x$time, ylab = "microseconds")
head(x$time, n = 10)

On Windows/MacOS, the test runs in 300-500 microseconds depending on 
hardware. A few of the 1000 runs are an order of magnitude slower but 
this can probably be attributed to garbage collection on the worker.


On Linux, the first 5 or so executions run at comparable speeds but all 
subsequent executions are two orders of magnitude slower (~40 
milliseconds).


I see this behavior across various platforms and hardware combinations:

Ubuntu 18.04 (Intel Xeon Platinum 8259CL)
Linux Mint 19.3 (AMD Ryzen 7 1800X)
Linux Mint 20 (AMD Ryzen 7 3700X)
Windows 10 (AMD Ryzen 7 4800H)
MacOS 10.15.7 (Intel Core i7-8850H)
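
The "first few calls fast, then slow" pattern can also be checked with base R only, without microbenchmark. This is a rough sketch and not the benchmark used above: time_roundtrips() is a hypothetical helper, and Sys.time() is a coarser clock than microbenchmark's, so the numbers are only indicative.

```r
library(parallel)

# Time n round trips to a PSOCK worker, one value per call, in microseconds.
# time_roundtrips() is a hypothetical helper written for this illustration.
time_roundtrips <- function(cl, n = 50L) {
  vapply(seq_len(n), function(i) {
    t0 <- Sys.time()
    clusterEvalQ(cl, iris)
    as.numeric(difftime(Sys.time(), t0, units = "secs")) * 1e6
  }, numeric(1))
}

cl <- makeCluster(1)
us <- time_roundtrips(cl, n = 50L)
stopCluster(cl)

# Compare the first five calls with the remaining ones
c(first5 = median(us[1:5]), rest = median(us[-(1:5)]))
```

On an affected Linux machine one would expect the second median to be far larger than the first; on Windows or macOS the two should be of similar size.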
