[R-pkg-devel] callr and CRAN policy on the max number of cores
Hi, I am exploring the use of callr in a package. I know that packages that do parallel computing should not use more than two cores in examples, tests, etc. when being checked on CRAN. I believe I can use callr in tests and examples. How should I use callr in compliance with CRAN policy? Because even a single call to callr starts a new R session, should I limit the use of callr such that, at any one time, only one background R session created by callr is active? Regards, Shu Fai __ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
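The "one background session at a time" idea in the question can be sketched as follows. This is only an illustration, not a statement of what CRAN requires: it assumes the callr package is installed, and `run_sequentially` is a hypothetical helper name, not part of callr.

```r
# Minimal sketch: run tasks in background R sessions one at a time.
# Assumes the callr package is installed; `run_sequentially` is a
# hypothetical helper name, not part of callr.
run_sequentially <- function(funs) {
  stopifnot(requireNamespace("callr", quietly = TRUE))
  # callr::r() blocks until the child session has finished, so at most one
  # child session exists at any time (two R processes in total).
  lapply(funs, function(f) callr::r(f))
}

if (requireNamespace("callr", quietly = TRUE)) {
  res <- run_sequentially(list(function() 1 + 1, function() Sys.getpid()))
  print(res[[1]])
}
```

Because each call blocks, the main session is mostly idle while the child works. Whether a child session itself starts further processes or threads depends on the code it runs.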
Re: [R-pkg-devel] PkgA imports PkgB, and PkgB suggests PkgA?
Many thanks for the clarification, which is very clear! The case of testthat is a very good example, as many packages suggest it. Regards, Shu Fai On Sat, Nov 4, 2023 at 4:41 PM Iñaki Ucar wrote: > > > > On Sat, Nov 4, 2023 at 5:43, Shu Fai Cheung wrote: >> >> Hi All, >> >> I vaguely recall that, on CRAN, if PkgA imports PkgB, then PkgB cannot >> import PkgA. (Please correct me if I am wrong.) >> >> How about this? >> >> PkgA imports PkgB (because PkgA has some helper functions for using PkgB) >> PkgB suggests PkgA (because some vignettes or examples in PkgB use >> those helpers from PkgA) > > > Or some tests are based on PkgA, or... > >> Is this allowed on CRAN? > > > Yes, it is. For example: testthat imports a bunch of packages to do its thing, > and those packages suggest testthat because their test suite is based on it. > > Cycles of hard dependencies (Depends, Imports) are not allowed for obvious > reasons. But packages should install and work without soft dependencies, so > there's no problem there. > > Iñaki
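Since a package has to work without its soft dependencies, code in PkgB that relies on a suggested package is usually guarded. A minimal sketch, where `use_suggested` is a hypothetical helper and "PkgA" is just the placeholder name from the thread:

```r
# Hypothetical guard for code in PkgB that needs a suggested package.
# Returns TRUE if the package can be loaded, FALSE (with a message) if not.
use_suggested <- function(pkg = "PkgA") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; skipping this part.")
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

# "stats" ships with R, so this branch runs; a missing package would skip it.
if (use_suggested("stats")) {
  print(stats::median(c(1, 3, 5)))
}
```

The same pattern can wrap examples, vignette chunks, or tests in PkgB that call the helpers from PkgA.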
[R-pkg-devel] PkgA imports PkgB, and PkgB suggests PkgA?
Hi All, I vaguely recall that, on CRAN, if PkgA imports PkgB, then PkgB cannot import PkgA. (Please correct me if I am wrong.) How about this? PkgA imports PkgB (because PkgA has some helper functions for using PkgB) PkgB suggests PkgA (because some vignettes or examples in PkgB use those helpers from PkgA) Is this allowed on CRAN? Regards, Shu Fai
Re: [R-pkg-devel] Suppressing long-running vignette code in CRAN submission
Please pardon me if I suggest something unrelated below. Many experts have made suggestions that I would also like to consider, because I have a similar issue with some packages. This is an approach I found for R Markdown vignettes: https://www.kloppenborg.ca/2021/06/long-running-vignettes/ It is similar to some of the suggestions. The vignette is rendered locally. It uses the trick that, if we render the vignette by calling knitr::knit() directly, the extension of the source file does not matter. The output, although it has the extension ".Rmd", actually contains the results of the code, in chunks starting with "```r", not "```{r}". When this pre-built .Rmd file is built again, it is simply converted to an HTML file, with no need to rerun the code. The method uses a different extension for the source Rmd file (".orig" in the post) to make sure the "real" source files are ignored when building the vignettes. Perhaps this is also a feasible solution for long-running vignettes? Regards, Shu Fai On Wed, Oct 18, 2023 at 6:51 AM John Fox wrote: > > Dear John, > > Unless I'm mistaken, the *installation* time of the package isn't really > at issue. If a user installs a package from a tarball provided by CRAN, > the vignettes aren't normally rebuilt. > > Best, > John > > On 2023-10-17 6:30 p.m., John Harrold wrote: > > I ask myself the question: Who is the vignette for? It does serve two > > purposes. One is testing but primarily it's for the users to learn how to > > use a package. I think the testing is secondary, and if it slows down > > installation or general usability I'd sacrifice the testing. If it's that > > important, then the tests can be added explicitly in tests/. > > > > On Tue, Oct 17, 2023 at 3:04 PM Dirk Eddelbuettel wrote: > > > >> > >> On 18 October 2023 at 08:51, Simon Urbanek wrote: > >> | John, > >> | > >> | the short answer is it won't work (it defeats the purpose of vignettes). > >> > >> Not exactly. 
Everything is under our (i.e. package author) control, and > >> when > >> we want to replace 'computed' values with cached values we can. > >> > >> All this is somewhat of a charade. "Of course" we want vignettes to run > >> tests. But then we don't want to fall over random missing .sty files or > >> fonts > >> (macOS machines have been less forgiving than others), not to mention > >> compile > >> time. > >> > >> So for simplicity I often pre-make pdf vignettes that get included in other > >> latex code as source. Works great, never fails, CRAN never complained -- > >> which is somewhat contrary to your statement. > >> > >> It is effectively the same with tests. We all want maximum test surfaces. > >> But > >> when tests fail, or when they run too long, or [insert many other reasons > >> here] so many packages run tests conditionally. Such is life. > >> > >> Dirk > >> > >> > >> | However, this sounds like a purely hypothetical question - CRAN policies > >> allow long-running vignettes if they declared. > >> | > >> | Cheers, > >> | Simon > >> | > >> | > >> | > On 18/10/2023, at 3:02 AM, John Fox wrote: > >> | > > >> | > Hello Dirk, > >> | > > >> | > Thank you (and Kevin and John) for addressing my questions. > >> | > > >> | > No one directly answered my first question, however, which was whether > >> the approach that I suggested would work. I guess that the implication is > >> that it won't, but it would be nice to confirm that before I try something > >> else, specifically using R.rsp. > >> | > > >> | > Best, > >> | > John > >> | > > >> | > On 2023-10-17 4:02 a.m., Dirk Eddelbuettel wrote: > >> | >> Caution: External email. > >> | >> On 16 October 2023 at 10:42, Kevin R Coombes wrote: > >> | >> | Produce a PDF file yourself, then use the "as.is" feature of the > >> R.rsp > >> | >> | package. > >> | >> For completeness, that approach also works directly with Sweave. 
> >> Described in > >> | >> a blog post by Mark van der Loo in 2019, and used in a number of > >> packages > >> | >> including a few of mine. > >> | >> That said, I also used the approach described by John Harrold and > >> cached > >> | >> results myself. > >> | >> Dirk > >> | >> -- > >> | >> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org > >> | >> __ > >> | >> R-package-devel@r-project.org mailing list > >> | >> https://stat.ethz.ch/mailman/listinfo/r-package-devel > >> | > > >> | > __ > >> | > R-package-devel@r-project.org mailing list > >> | > https://stat.ethz.ch/mailman/listinfo/r-package-devel > >> | > > >> | > >> | __ > >> | R-package-devel@r-project.org mailing list > >> | https://stat.ethz.ch/mailman/listinfo/r-package-devel > >> > >> -- > >> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org > >> > >> __ > >> R-package-devel@r-project.org mailing list > >>
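The pre-rendering approach from the linked post can be sketched roughly as below. The file name is a placeholder and `precompute_vignette` is a hypothetical helper. The sketch assumes knitr is installed and that it is run by hand from the package root, not during R CMD check.

```r
# Hypothetical helper: knit the "real" source (e.g. mypkg.Rmd.orig) into a
# plain .Rmd that already contains the results, so a later vignette build
# only converts it and does not rerun the long computations.
precompute_vignette <- function(src = "vignettes/mypkg.Rmd.orig",
                                out = sub("\\.orig$", "", src)) {
  stopifnot(file.exists(src), requireNamespace("knitr", quietly = TRUE))
  knitr::knit(input = src, output = out)
}

# Typical manual use before building the package (placeholder path):
# precompute_vignette("vignettes/mypkg.Rmd.orig")
```

Figure paths and the working directory may need extra care when the vignette produces plots. The sketch does not handle that.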
Re: [R-pkg-devel] Checking the number of cores used
Thanks for the suggestion. I will use rhub and a virtual machine to check again. I read some previous posts and I thought I needed to check the times to see if there is any unintended use of parallel processing, e.g., CPU time > 2 x elapsed time. May I ask a few questions on this part? This is from the "-Ex.Rout" file: > base::cat("Time elapsed: ", proc.time() - base::get("ptime", pos = 'CheckExEnv'),"\n") Time elapsed: 14.94 0.27 15.26 NA NA First, is this something I can ignore, unless the CPU time is substantially larger than the elapsed time? This is the total time, but the parallel processing may be triggered in only one of the many examples. Second, are the "-Ex.timings" files, which show per-example timings, only available on some platforms? I could get one locally in Windows, and can find one in the Winbuilder output. However, I could not find them on the rhub platforms I tried, not even on the Windows platform. I suppose adding "--as-cran" will also add "--timings"? Third, the "testthat.Rout" file only shows the total time: > proc.time() user system elapsed 13.530 0.291 29.492 I believe the user time is not useful as we can use two processes in testthat. How can we detect the use of more than two cores in the tests? Last, how can we detect the use of more than two cores in vignettes? I checked and couldn't find similar timing information on vignettes. Sorry for asking so many questions. I would like to have a reliable way to detect hidden use of parallel processing so that I can prevent the problem from happening. I have some idea which imported package is causing the problem, but I have used it before without problems. Therefore, I would like to see if I overlooked anything. Regards, Shu Fai On Tue, Sep 19, 2023 at 5:02 PM Uwe Ligges wrote: > > > > On 18.09.2023 16:10, Shu Fai Cheung wrote: > > Hi All, > > > > I know we should not use more than 2 cores in tests, vignettes, etc. I > > encountered and solved this issue before. 
However, I still committed > > this mistake in a new package and would like to find out where the cause > > is. > > > > I have a package that already has parallel processing disabled by > > default and I did not enable parallel processing in the examples and > > tests (except for one test, which is always skipped by skip()). > > However, I was told that somewhere in the package more than 2 cores > > are used. > > > > I checked several times and even added a temporary `stop()` to "trap" > > parallel processing but still could not find where the source of the > > problem is. > > > > I checked the timing in the log in R CMD check results from winbuilder > > but everything seems OK. The user time and elapsed time are similar > > for all the examples. > > > > Is there any quick way to check where things go wrong regarding the > > number of cores? It is not easy to find the source of the problems > > when there are many examples and tests. > > If it is OK on winbuilder but not on Linux, then likely something makes > use of multithreading. > > Best, > Uwe Ligges > > > > > Regards, > > Shu Fai > > > > __ > > R-package-devel@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-package-devel
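The "CPU time > 2 x elapsed time" rule of thumb mentioned in this message can be computed directly from a `proc.time()` or `system.time()` result. A minimal sketch, where `cpu_to_elapsed` is a hypothetical helper and the two vectors reuse the timings quoted in the message:

```r
# Ratio of total CPU time (self + child processes) to elapsed time.
# Child times are NA on some platforms, hence na.rm = TRUE.
cpu_to_elapsed <- function(tm) {
  tm <- unclass(tm)
  cpu_fields <- c("user.self", "sys.self", "user.child", "sys.child")
  sum(tm[cpu_fields], na.rm = TRUE) / tm[["elapsed"]]
}

# Timings quoted in the message: "-Ex.Rout" total, then "testthat.Rout".
# For the latter, the printed user/system totals are stored in the *.self
# fields, which leaves the sum unchanged.
ex <- c(user.self = 14.94, sys.self = 0.27, elapsed = 15.26,
        user.child = NA, sys.child = NA)
tt <- c(user.self = 13.530, sys.self = 0.291, elapsed = 29.492,
        user.child = NA, sys.child = NA)
print(cpu_to_elapsed(ex))  # just under 1
print(cpu_to_elapsed(tt))  # well below 1

# The same helper can wrap a single suspect call in an example or test.
tm <- system.time(for (i in 1:1e5) sqrt(i))
print(cpu_to_elapsed(tm))
```

Both quoted totals are below 1, so neither flags a problem. As the message notes, a total can hide a single example or test with a high ratio, so timing individual calls is more informative.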
Re: [R-pkg-devel] Checking the number of cores used
Thanks a lot. I don't have a physical Linux box and so I need to use rhub. But I don't know why there are no "-Ex.timings" files in the check. E.g., check(platform = "debian-gcc-release", show_status = FALSE, check_args = "--as-cran") I can only see these files in the artifacts: 00check.log (2023-09-20 08:00, 3.4K); 00install.out (2023-09-20 08:00, 702); Rdlatex.log (2023-09-20 08:00, 30K); modelbpp-Ex.Rout (2023-09-20 08:00, 28K); modelbpp-manual.log (2023-09-20 08:00, 19K). The case is the same for this, no "-Ex.timings" files: check(platform = "windows-x86_64-release", show_status = FALSE, check_args = "--as-cran") Although I do not think I need to, I tried adding "--timings" but still do not see the "-Ex.timings" files. However, if I run the check locally in Windows 10 using R CMD check with --as-cran, I can find the "-Ex.timings" files. I can find the total time at the end of "-Ex.Rout" but I think this is not what I need: > base::cat("Time elapsed: ", proc.time() - base::get("ptime", pos = 'CheckExEnv'),"\n") Time elapsed: 10.963 0.161 13.589 0.302 0.081 Regards, Shu Fai On Tue, Sep 19, 2023 at 5:59 PM Duncan Murdoch wrote: > > On 18/09/2023 10:10 a.m., Shu Fai Cheung wrote: > > Hi All, > > > > I know we should not use more than 2 cores in tests, vignettes, etc. I > > encountered and solved this issue before. However, I still committed > > this mistake in a new package and would like to find out where the cause > > is. > > > > I have a package that already has parallel processing disabled by > > default and I did not enable parallel processing in the examples and > > tests (except for one test, which is always skipped by skip()). > > However, I was told that somewhere in the package more than 2 cores > > are used. > > > > I checked several times and even added a temporary `stop()` to "trap" > > parallel processing but still could not find where the source of the > > problem is. > > > > I checked the timing in the log in R CMD check results from winbuilder > > but everything seems OK. 
The user time and elapsed time are similar > > for all the examples. > > > > Is there any quick way to check where things go wrong regarding the > > number of cores? It is not easy to find the source of the problems > > when there are many examples and tests. > > If you run R CMD check at the command line, it will produce a > directory *.Rcheck containing a number of files. One of those files > will be *-Ex.timings, which will give the individual timings of each of > the examples in your package. Maybe you can recognize from those which > of the examples are problematic ones, and add `proc.time()` calls to the > example to figure out which line(s) cause the issue. > > I don't remember whether winbuilder keeps the timings file when it runs > a check. > > Duncan Murdoch
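Once a "*-Ex.timings" file is available, scanning it for suspicious examples could look like the sketch below. The file layout is an assumption here (tab-separated, with a header row naming the columns `name`, `user`, `system`, `elapsed`), `flag_examples` is a hypothetical helper, and the toy file contains made-up numbers purely to show the call.

```r
# Assumed layout of *-Ex.timings: tab-separated with a header row
# "name user system elapsed" (an assumption, check against a real file).
flag_examples <- function(path, threshold = 2) {
  tm <- read.delim(path, header = TRUE, stringsAsFactors = FALSE)
  ok <- !is.na(tm$elapsed) & tm$elapsed > 0
  # CPU-to-elapsed ratio per example; NA where elapsed is zero or missing.
  tm$ratio <- ifelse(ok, (tm$user + tm$system) / tm$elapsed, NA_real_)
  tm[!is.na(tm$ratio) & tm$ratio > threshold, , drop = FALSE]
}

# Toy file with made-up numbers, only to demonstrate the helper.
f <- tempfile(fileext = "-Ex.timings")
writeLines(c("name\tuser\tsystem\telapsed",
             "fast_example\t1.00\t0.05\t1.10",
             "greedy_example\t6.00\t0.30\t2.00"), f)
print(flag_examples(f))
```

Examples flagged this way are candidates for the extra `proc.time()` calls suggested above, to narrow the problem down to individual lines.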
[R-pkg-devel] Checking the number of cores used
Hi All, I know we should not use more than 2 cores in tests, vignettes, etc. I encountered and solved this issue before. However, I still made this mistake in a new package and would like to find out where the cause is. I have a package that already has parallel processing disabled by default and I did not enable parallel processing in the examples and tests (except for one test, which is always skipped by skip()). However, I was told that somewhere in the package more than 2 cores are used. I checked several times and even added a temporary `stop()` to "trap" parallel processing but still could not find the source of the problem. I checked the timing in the log in the R CMD check results from winbuilder but everything seems OK. The user time and elapsed time are similar for all the examples. Is there any quick way to check where things go wrong regarding the number of cores? It is not easy to find the source of the problem when there are many examples and tests. Regards, Shu Fai