[R-pkg-devel] callr and CRAN policy on the max number of cores

2024-05-22 Thread Shu Fai Cheung
Hi,

I am exploring the use of callr in a package. I know that for packages
that do parallel computing, we should not use more than two cores in
examples, tests, etc. when being checked on CRAN.

I believe I can use callr in tests and examples. How should I use
callr in compliance with CRAN policy? Because even just one call to
callr will start a new R session, should I limit the use of callr such
that at any one time, only one background R session created by callr
is active?
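
To make the question concrete, here is a minimal sketch of the "one
session at a time" pattern. It assumes the callr package is installed
and relies on callr::r() waiting for its child R session to finish
before returning, so the tasks run one after another. The task
functions are made up for illustration. Whether this pattern satisfies
the CRAN policy is the open question, not something the sketch settles.

```r
# Sketch only: run tasks in child R sessions, one at a time.
# callr::r() waits for the child session to exit before returning,
# so at most one child session is alive at any moment.
results <- NULL
if (requireNamespace("callr", quietly = TRUE)) {
  # Hypothetical, self-contained tasks (no references to outer variables).
  tasks <- list(
    function() sum(1:10),
    function() mean(c(2, 4, 6))
  )
  results <- lapply(tasks, function(task) callr::r(task))
  print(results)
}
```

By contrast, callr::r_bg() returns immediately, so several background
sessions started that way can overlap; that is the case the question
is concerned with.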

Regards,
Shu Fai

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] PkgA imports PkgB, and PkgB suggests PkgA?

2023-11-04 Thread Shu Fai Cheung
Many many thanks for the clarification, which is very clear! The case
of testthat is a very good example, as many packages suggest it.

Regards,
Shu Fai


On Sat, Nov 4, 2023 at 4:41 PM Iñaki Ucar  wrote:
>
>
>
> On Sat, Nov 4, 2023 at 5:43, Shu Fai Cheung wrote:
>>
>> Hi All,
>>
>> I vaguely recall that, on CRAN, if PkgA imports PkgB, then PkgB cannot
>> import PkgA. (Please correct me if I am wrong.)
>>
>> How about this?
>>
>> PkgA imports PkgB (because PkgA has some helper functions for using PkgB)
>> PkgB suggests PkgA (because some vignettes or examples in PkgB use
>> those helpers from PkgA)
>
>
> Or some tests are based on PkgA, or...
>
>> Is this allowed on CRAN?
>
>
> Yes, it is. For example: testthat imports a bunch of packages to do its thing,
> and those packages suggest testthat because their test suite is based on it.
>
> Cycles of hard dependencies (Depends, Imports) are not allowed for obvious 
> reasons. But packages should install and work without soft dependencies, so 
> there's no problem there.
>
> Iñaki
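
A sketch of what "work without soft dependencies" can look like in
PkgB's code, assuming PkgA is listed only in Suggests. "PkgA" is the
placeholder name from this thread and some_helper() is a hypothetical
function name; neither refers to a real package or function.

```r
# Sketch only: guard the use of a suggested package.
# "PkgA" and some_helper() are hypothetical placeholders.
run_with_helper <- function(pkg = "PkgA") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, " is not installed; skipping the helper-based step.")
    return(invisible(NULL))
  }
  # Reached only when the suggested package is available.
  helper <- getExportedValue(pkg, "some_helper")
  helper()
}

run_with_helper()
```

With a guard like this, PkgB still installs and runs when PkgA is
absent, which is why the Imports/Suggests pair does not form a hard
cycle.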



[R-pkg-devel] PkgA imports PkgB, and PkgB suggests PkgA?

2023-11-03 Thread Shu Fai Cheung
Hi All,

I vaguely recall that, on CRAN, if PkgA imports PkgB, then PkgB cannot
import PkgA. (Please correct me if I am wrong.)

How about this?

PkgA imports PkgB (because PkgA has some helper functions for using PkgB)
PkgB suggests PkgA (because some vignettes or examples in PkgB use
those helpers from PkgA)

Is this allowed on CRAN?

Regards,
Shu Fai



Re: [R-pkg-devel] Suppressing long-running vignette code in CRAN submission

2023-10-17 Thread Shu Fai Cheung
Please pardon me if I suggest something unrelated below. Many experts
have made suggestions that I would also like to consider because I
also have a similar issue with some packages.

This is an approach I found, for Rmarkdown vignettes:

https://www.kloppenborg.ca/2021/06/long-running-vignettes/

This is similar to some of the suggestions. The vignette is rendered
locally. It uses the trick that, if we render the vignette by calling
knitr::knit() directly, the extension of the source file does not
matter. The output, although it has the extension ".Rmd", actually
contains the results of the code, in chunks starting with "```r", not
"```{r}".

When this pre-built .Rmd file is built again, it is simply converted
to an HTML file, with no need to rerun the code.

The method uses an extension for the source Rmd file (".orig" in the
post) to make sure the "real" source files are ignored when building
the vignettes.
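
A minimal sketch of that pre-building step, assuming knitr is
installed. The file names and the one-line chunk are hypothetical and
live in a temporary directory; in a package the two files would sit in
vignettes/ instead.

```r
# Sketch only: pre-build a vignette from a ".Rmd.orig" source.
# File names and chunk content are hypothetical; requires knitr.
prebuilt <- NULL
if (requireNamespace("knitr", quietly = TRUE)) {
  fence <- strrep("`", 3)
  src <- file.path(tempdir(), "demo.Rmd.orig")  # the "real" source
  out <- file.path(tempdir(), "demo.Rmd")       # the file that would be shipped
  writeLines(c("# Demo", "", paste0(fence, "{r}"), "1 + 1", fence), src)
  # Runs the chunk once, locally, and writes code plus results into `out`.
  knitr::knit(src, output = out, quiet = TRUE)
  prebuilt <- readLines(out)
  cat(prebuilt, sep = "\n")
}
```

The resulting demo.Rmd contains plain fenced blocks rather than
executable chunks, so building it again only converts the text.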

Perhaps this is also a feasible solution for long running vignettes?

Regards,
Shu Fai

On Wed, Oct 18, 2023 at 6:51 AM John Fox  wrote:
>
> Dear John,
>
> Unless I'm mistaken, the *installation* time of the package isn't really
> at issue. If a user installs a package from a tarball provided by CRAN,
> the vignettes aren't normally rebuilt.
>
> Best,
>   John
>
> On 2023-10-17 6:30 p.m., John Harrold wrote:
> >
> > I ask myself the question: Who is the vignette for?  It does serve two
> > purposes. One is testing but primarily it's for the users to learn how to
> > use a package. I think the testing is secondary, and if it slows down
> > installation or general usability I'd sacrifice the testing. If it's that
> > important, then the tests can be added explicitly in tests/.
> >
> > On Tue, Oct 17, 2023 at 3:04 PM Dirk Eddelbuettel  wrote:
> >
> >>
> >> On 18 October 2023 at 08:51, Simon Urbanek wrote:
> >> | John,
> >> |
> >> | the short answer is it won't work (it defeats the purpose of vignettes).
> >>
> >> Not exactly. Everything is under our (i.e. package author) control, and
> >> when we want to replace 'computed' values with cached values we can.
> >>
> >> All this is somewhat of a charade. "Of course" we want vignettes to run
> >> tests. But then we don't want to fall over random missing .sty files or
> >> fonts (macOS machines have been less forgiving than others), not to
> >> mention compile time.
> >>
> >> So for simplicity I often pre-make pdf vignettes that get included in other
> >> latex code as source. Works great, never fails, CRAN never complained --
> >> which is somewhat contrary to your statement.
> >>
> >> It is effectively the same with tests. We all want maximum test surfaces.
> >> But when tests fail, or when they run too long, or [insert many other
> >> reasons here] so many packages run tests conditionally.  Such is life.
> >>
> >> Dirk
> >>
> >>
> >> | However, this sounds like a purely hypothetical question - CRAN policies
> >> | allow long-running vignettes if they are declared.
> >> |
> >> | Cheers,
> >> | Simon
> >> |
> >> |
> >> | > On 18/10/2023, at 3:02 AM, John Fox  wrote:
> >> | >
> >> | > Hello Dirk,
> >> | >
> >> | > Thank you (and Kevin and John) for addressing my questions.
> >> | >
> >> | > No one directly answered my first question, however, which was whether
> >> | > the approach that I suggested would work. I guess that the implication is
> >> | > that it won't, but it would be nice to confirm that before I try something
> >> | > else, specifically using R.rsp.
> >> | >
> >> | > Best,
> >> | > John
> >> | >
> >> | > On 2023-10-17 4:02 a.m., Dirk Eddelbuettel wrote:
> >> | >> On 16 October 2023 at 10:42, Kevin R Coombes wrote:
> >> | >> | Produce a PDF file yourself, then use the "as.is" feature of the
> >> | >> | R.rsp package.
> >> | >>
> >> | >> For completeness, that approach also works directly with Sweave.
> >> | >> Described in a blog post by Mark van der Loo in 2019, and used in a
> >> | >> number of packages including a few of mine.
> >> | >>
> >> | >> That said, I also used the approach described by John Harrold and
> >> | >> cached results myself.
> >> | >> Dirk
> >> | >> --
> >> | >> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> >>
> >> --
> >> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

Re: [R-pkg-devel] Checking the number of cores used

2023-09-20 Thread Shu Fai Cheung
Thanks for the suggestion. I will use rhub and a virtual machine to check again.

I read some previous posts and I thought I need to check the times to
see if there is any unintended usage of parallel processing, e.g., CPU
time > 2 x elapsed time. May I ask a few questions on this part?

This is from the "-Ex.Rout" file:

> base::cat("Time elapsed: ", proc.time() - base::get("ptime", pos = 'CheckExEnv'),"\n")
Time elapsed:  14.94 0.27 15.26 NA NA

First, is this something I can ignore, unless the CPU time is
substantially larger than elapsed time? This is the total time but the
parallel process may be triggered in only one of the many examples.
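
As a rough illustration of that comparison, here is a sketch that
computes the ratio of CPU time to elapsed time from a timing vector in
proc.time() order. The helper name cpu_to_elapsed() is made up, and the
factor of 2 is only the rule of thumb mentioned above, not necessarily
the threshold the check itself applies.

```r
# Sketch only: ratio of CPU time to elapsed time for a timing vector
# in proc.time() order: user, system, elapsed, user.child, sys.child.
cpu_to_elapsed <- function(times) {
  cpu <- sum(times[c(1, 2, 4, 5)], na.rm = TRUE)  # child times may be NA
  cpu / times[3]
}

# Totals quoted in this message.
examples <- c(14.94, 0.27, 15.26, NA, NA)
ratio <- cpu_to_elapsed(examples)
ratio > 2  # the rule of thumb mentioned above
```

For these totals the ratio is below 1, but a total can hide a single
example that briefly used many cores, which is the concern raised in
the first question.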

Second, are the files "-Ex.timings", showing per-example timing, only
available on some platforms? I could get it locally in Windows, and
can find it in Winbuilder output. However, I could not find them in
the rhub platforms I tried, not even in the Windows platform. I
suppose adding "--as-cran" will also add "--timings"?

Third, the "testthat.Rout" file only shows the total time:

> proc.time()
   user  system elapsed
 13.530   0.291  29.492

I believe the user time is not useful as we can use two processes in
testthat. How can we detect the use of more than two cores in the
tests?

Last, how can we detect the use of more than two cores in vignettes? I
checked and couldn't find similar timing information on vignettes.

Sorry for asking so many questions. I would like to have a reliable
way to detect hidden use of parallel processing such that I can
prevent the problem from happening. I have some idea of which imported
package is causing the problem, but I have used it before without
problems. Therefore, I would like to see if I overlooked anything.

Regards,
Shu Fai

On Tue, Sep 19, 2023 at 5:02 PM Uwe Ligges wrote:
>
>
>
> On 18.09.2023 16:10, Shu Fai Cheung wrote:
> > Hi All,
> >
> > I know we should not use more than 2 cores in tests, vignettes, etc. I
> > encountered and solved this issue before. However, I still committed
> > this mistake in a new package and would like to find out what the cause
> > is.
> >
> > I have a package that already has parallel processing disabled by
> > default and I did not enable parallel processing in the examples and
> > tests (except for one test, which is always skipped by skip()).
> > However, I was told that somewhere in the package more than 2 cores
> > are used.
> >
> > I checked several times and even added a temporary `stop()` to "trap"
> > parallel processing but still could not find where the source of the
> > problem is.
> >
> > I checked the timing in the log in R CMD check results from winbuilder
> > but everything seems OK. The user time and elapsed time are similar
> > for all the examples.
> >
> > Is there any quick way to check where things go wrong regarding the
> > number of cores? It is not easy to find the source of the problems
> > when there are many examples and tests.
>
> If it is OK on winbuilder but not on Linux, then likely something makes
> use of multithreading.
>
> Best,
> Uwe Ligges
>
>
>
> > Regards,
> > Shu Fai
> >



Re: [R-pkg-devel] Checking the number of cores used

2023-09-20 Thread Shu Fai Cheung
Thanks a lot. I don't have a physical Linux box and so I need to use
rhub. But I don't know why there are no "-Ex.timings" files in the
check. E.g.,

check(platform = "debian-gcc-release", show_status = FALSE,
check_args = "--as-cran")

I can only see these files in artifacts:

[ ] 00check.log          2023-09-20 08:00  3.4K
[ ] 00install.out        2023-09-20 08:00  702
[ ] Rdlatex.log          2023-09-20 08:00  30K
[ ] modelbpp-Ex.Rout     2023-09-20 08:00  28K
[ ] modelbpp-manual.log  2023-09-20 08:00  19K

The case is the same for this, no "-Ex.timings" files:

check(platform = "windows-x86_64-release", show_status = FALSE,
check_args = "--as-cran")

Although I don't think I need to, I tried adding "--timings" but still do
not see the "-Ex.timings" files.

However, if I run the check locally in Windows 10 using R CMD check
with --as-cran, I can find the "-Ex.timings" files.

I can find the total time at the end of "-Ex.Rout" but I think this is
not what I need:

> base::cat("Time elapsed: ", proc.time() - base::get("ptime", pos = 'CheckExEnv'),"\n")
Time elapsed:  10.963 0.161 13.589 0.302 0.081

Regards,
Shu Fai


On Tue, Sep 19, 2023 at 5:59 PM Duncan Murdoch  wrote:
>
> On 18/09/2023 10:10 a.m., Shu Fai Cheung wrote:
> > Hi All,
> >
> > I know we should not use more than 2 cores in tests, vignettes, etc. I
> > encountered and solved this issue before. However, I still committed
> > this mistake in a new package and would like to find out what the cause
> > is.
> >
> > I have a package that already has parallel processing disabled by
> > default and I did not enable parallel processing in the examples and
> > tests (except for one test, which is always skipped by skip()).
> > However, I was told that somewhere in the package more than 2 cores
> > are used.
> >
> > I checked several times and even added a temporary `stop()` to "trap"
> > parallel processing but still could not find where the source of the
> > problem is.
> >
> > I checked the timing in the log in R CMD check results from winbuilder
> > but everything seems OK. The user time and elapsed time are similar
> > for all the examples.
> >
> > Is there any quick way to check where things go wrong regarding the
> > number of cores? It is not easy to find the source of the problems
> > when there are many examples and tests.
>
> If you run R CMD check  at the command line, it will produce a
> directory *.Rcheck containing a number of files.  One of those files
> will be *-Ex.timings, which will give the individual timings of each of
> the examples in your package.  Maybe you can recognize from those which
> of the examples are problematic ones, and add `proc.time()` calls to the
> example to figure out which line(s) cause the issue.
>
> I don't remember whether winbuilder keeps the timings file when it runs
> a check.
>
> Duncan Murdoch
>
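
A sketch of how the per-example timings could be scanned once such a
file is available. The demo file written here is made-up data, and the
tab-separated layout with columns name, user, system and elapsed is an
assumption about the "-Ex.timings" format; in practice the path would
point into the *.Rcheck directory instead.

```r
# Sketch only: flag examples whose CPU time is well above elapsed time.
# The file written here is made-up demo data in the tab-separated layout
# (name, user, system, elapsed) that "-Ex.timings" files are assumed to use.
timings_file <- tempfile(fileext = "-Ex.timings")
writeLines(c(
  "name\tuser\tsystem\telapsed",
  "fast_example\t0.10\t0.01\t0.12",
  "suspect_example\t3.80\t0.20\t1.00"
), timings_file)

timings <- read.delim(timings_file)
timings$ratio <- (timings$user + timings$system) / timings$elapsed
# Factor of 2 is a rule of thumb, not a documented threshold.
suspects <- timings$name[timings$ratio > 2]
print(suspects)
```

Any example flagged this way is a candidate for the proc.time() calls
suggested above, to narrow down the line that starts extra processes
or threads.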



[R-pkg-devel] Checking the number of cores used

2023-09-18 Thread Shu Fai Cheung
Hi All,

I know we should not use more than 2 cores in tests, vignettes, etc. I
encountered and solved this issue before. However, I still committed
this mistake in a new package and would like to find out what the cause
is.

I have a package that already has parallel processing disabled by
default and I did not enable parallel processing in the examples and
tests (except for one test, which is always skipped by skip()).
However, I was told that somewhere in the package more than 2 cores
are used.

I checked several times and even added a temporary `stop()` to "trap"
parallel processing but still could not find where the source of the
problem is.

I checked the timing in the log in R CMD check results from winbuilder
but everything seems OK. The user time and elapsed time are similar
for all the examples.

Is there any quick way to check where things go wrong regarding the
number of cores? It is not easy to find the source of the problems
when there are many examples and tests.

Regards,
Shu Fai
