Re: [Bioc-devel] Methods to speed up R CMD Check

Hervé Pagès Mon, 22 Mar 2021 09:57:08 -0700

Hi Alan,

It looks like what is slowing everything down significantly is the approach you've taken to look up the ExperimentHub resources that you control by name every time you need to access them. E.g.:


Look up by name:

  > system.time(tt_alzh <- ewceData::tt_alzh())
  snapshotDate(): 2021-03-22
  see ?ewceData and browseVignettes('ewceData') for documentation
  loading from cache
     user  system elapsed
    2.496   0.024   9.460

Direct access (with eh being an ExperimentHub object, e.g. eh <- ExperimentHub()):

  > system.time(tt_alzh <- eh[["EH5373"]])
  see ?ewceData and browseVignettes('ewceData') for documentation
  loading from cache
     user  system elapsed
    1.195   0.012   2.060

ewceData::tt_alzh() is just one of the 18 utility functions defined in ewceData that perform this lookup over and over again in the vignette and man page. This lookup is expensive and not needed since the ExperimentHub IDs that were assigned to your resources are fixed and known in advance.

Note however that it's a good idea not to expose these IDs to the end user (they might change at some point if you need to update these resources on ExperimentHub), so it's actually recommended to look up by name in user-visible code.
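
One way to keep by-name lookup in user-visible code while paying its cost only once per R session is to memoise the result. The following is a minimal sketch in base R, not something ewceData currently does: slow_lookup() is a hypothetical stand-in for an expensive call such as ewceData::tt_alzh(), and get_resource() and .resource_cache are names made up for illustration.

```r
# Session-level cache: an environment mapping resource names to loaded objects.
.resource_cache <- new.env(parent = emptyenv())

# Hypothetical stand-in for an expensive by-name lookup such as
# ewceData::tt_alzh(); it counts how many times it is actually called.
lookup_calls <- 0L
slow_lookup <- function(name) {
  lookup_calls <<- lookup_calls + 1L
  Sys.sleep(0.1)  # simulate the lookup latency
  paste("data for", name)
}

# Return the cached object if present; otherwise load it once and store it.
get_resource <- function(name, loader = slow_lookup) {
  if (!exists(name, envir = .resource_cache, inherits = FALSE)) {
    assign(name, loader(name), envir = .resource_cache)
  }
  get(name, envir = .resource_cache, inherits = FALSE)
}

first  <- get_resource("tt_alzh")  # performs the lookup
second <- get_resource("tt_alzh")  # served from the cache, no second lookup
```

Note that such a cache only lives as long as the R session. R CMD check runs examples, tests and vignettes in separate R processes, so each of them would still pay for the first lookup of every resource.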

Another easy improvement is to drop the dependency on ExperimentHubData. This will reduce the number of deps (direct and indirect) from 130 to 94. There are likely other deps that you could try to get rid of.


Hope this helps,
H.

On 3/22/21 2:38 AM, Murphy, Alan E wrote:

Hi all,

I am working on the development of [EWCE](https://github.com/NathanSkene/EWCE) 
but have hit an issue with R CMD check's runtime. I have been informed this 
test needs to be completed in 15 minutes but mine is currently running in ~24 
minutes and I am looking for methods to speed this up. The main culprits for 
the runtime issue are:

checking examples (5m 49.8s)
Running ‘testthat.R’ [308s/469s] (7m 49.1s)
checking for unstated dependencies in vignettes (7m 49.4s)
checking re-building of vignette outputs (5m 12s)

With the exception of using smaller datasets, which I will consider myself, are 
there known ways of speeding these up? EWCE derives data from an ExperimentHub 
package [ewceData](https://github.com/neurogenomics/ewceData) for its examples, 
tests and vignette. This is run repeatedly and I have noted it takes a 
significant amount of time to load a dataset. Is there any way of caching the 
datasets for all the checks or more generally of speeding this up?

I have heard of the use of [long 
tests](http://bioconductor.org/developers/how-to/long-tests/) which aren't run 
daily by Bioconductor but are these still checked in R CMD Check? Is there any 
other way to exclude my tests from the R CMD Check given they aren't a 
necessity from Bioconductor?

Does checking for unstated dependencies in vignettes have a long runtime based 
on the number of package dependencies? If I just export specific functions from 
packages will this check time reduce?

Lastly, is there any way to get an exception to the 15 minute maximum? I may be 
ill-informed, but isn't the max time for packages on Bioconductor's daily check 
40 minutes, which my code in its current state would complete within?

Kind regards,
Alan.




_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


--
Hervé Pagès

Bioconductor Core Team
hpages.on.git...@gmail.com
