Hi guys,

On 07/08/2014 05:29 AM, Michael Lawrence wrote:
This is why I tell people not to use require(). But what's with needing to
load IRanges to subset an Rle? Is that temporary?

Very temporary. The source code of the "extractROWS" and "replaceROWS"
methods for Rle objects actually contains the following comment:

  ## FIXME: Right now, the subscript 'i' is turned into an IRanges
  ## object so we need stuff that lives in the IRanges package for this
  ## to work. This is ugly/hacky and needs to be fixed (thru a redesign
  ## of this method).
  if (!suppressWarnings(require(IRanges, quietly=TRUE)))
    stop(...)
  ...

I introduced this hack last week when I moved the Rle code from IRanges
to S4Vectors. It's temporary. The 2 methods need to be refactored which
I'm planning to do this week.

Cheers,
H.



Limiting imports is unlikely to reduce loading time. It may actually
increase it. There are good reasons for it though.



On Tue, Jul 8, 2014 at 5:21 AM, Martin Morgan <mtmor...@fhcrc.org> wrote:

Hi Leonardo --


On 07/07/2014 03:27 PM, Leonardo Collado Torres wrote:

Hello BioC-devel list,

I am currently confused on a namespace issue which I haven't been able
to solve. To reproduce this, I made the simplest example I thought of.


Step 1: make some toy data and save it on your desktop

library(IRanges)
DF <- DataFrame(x = Rle(0, 10), y = Rle(1, 10))
save(DF, file="~/Desktop/DF.Rdata")

Step 2: install the toy package on R 3.1.x

library(devtools)
install_github("lcolladotor/fooPkg")
# Note that it passes R CMD check

Step 3: on a new R session run

example("foo", "fooPkg")
# Change the location of DF.Rdata if necessary


You will see that when running the example, the session information is
printed listing:

other attached packages:
[1] fooPkg_0.0.1

loaded via a namespace (and not attached):
[1] BiocGenerics_0.11.3 IRanges_1.99.17     parallel_3.1.0
S4Vectors_0.1.0     stats4_3.1.0        tools_3.1.0


Then the message for loading IRanges is shown, which is something I
was not expecting, and the following session info then shows:

other attached packages:
[1] IRanges_1.99.17     S4Vectors_0.1.0     BiocGenerics_0.11.3
fooPkg_0.0.1

loaded via a namespace (and not attached):
[1] stats4_3.1.0 tools_3.1.0

Meaning that IRanges, S4Vectors and BiocGenerics all went from "loaded
via a namespace" to "other attached packages".
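
For anyone following along, the two states can be told apart programmatically. This is a minimal base R sketch, not part of fooPkg: pkg_status() is a hypothetical helper name, and stats4 is used only because it ships with R and is normally not attached.

```r
# Hypothetical helper: report whether a package is attached (on the search
# path), only loaded via a namespace, or not loaded at all.
pkg_status <- function(pkg) {
  if (paste0("package:", pkg) %in% search()) {
    "attached"
  } else if (pkg %in% loadedNamespaces()) {
    "loaded via a namespace"
  } else {
    "not loaded"
  }
}

# Load a namespace without attaching it, then inspect the state.
invisible(loadNamespace("stats4"))
status_base   <- pkg_status("base")    # base is always attached
status_stats4 <- pkg_status("stats4")
```

Calling pkg_status("IRanges") before and after the foo() call should show the same transition that the two sessionInfo() listings report.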



All the fooPkg::foo() is doing is using a mapply() to go through a
DataFrame and a list of indices to subset the data as shown at
https://github.com/lcolladotor/fooPkg/blob/master/R/foo.R#L26 That is:

res <- mapply(function(x, y) { x[y] }, DF, index)

I thus thought that the only thing I would need to specify on the
namespace is to import the '[' IRanges method.
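
The subsetting pattern itself can be sketched in base R without any Bioconductor package. Here a plain data.frame stands in for the DataFrame of Rle columns, and the contents of 'index' are made up for illustration, so this shows only the shape of the mapply() call, not the IRanges/S4Vectors behaviour:

```r
# Stand-in for the toy DataFrame: plain numeric columns instead of Rle.
df <- data.frame(x = rep(0, 10), y = rep(1, 10))

# Hypothetical list of indices, one element per column.
index <- list(x = 1:3, y = c(2L, 4L))

# Walk the columns and the index list in parallel, subsetting each column.
# SIMPLIFY = FALSE keeps the result as a list even if the lengths match.
res <- mapply(function(x, y) { x[y] }, df, index, SIMPLIFY = FALSE)
```

With Rle columns the x[y] step dispatches to the S4 '[' method, which is where extractROWS comes into play.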

Checking with BiocCheck and codetoolsBioC suggests importing the
method for mapply() from BiocGenerics. Doing so doesn't affect things
and R still loads IRanges on that mapply() call. Importing the '['
method from S4Vectors doesn't help either. Most intriguing, importing
the whole S4Vectors, BiocGenerics and IRanges still doesn't change the
fact that IRanges is loaded when evaluating the same line of code
shown above.

Any clues on what I am missing or doing wrong?


This comes from S4Vectors::extractROWS

selectMethod(extractROWS, c("Rle", "integer"))
Method Definition:

function (x, i)
{
     if (!suppressWarnings(require(IRanges, quietly = TRUE)))
         stop("Couldn't load the IRanges package. You need to install ",
             "the IRanges\n  package in order to subset an Rle object.")

...

which moves the IRanges package from loaded to attached. Maybe that should
be 'suppressPackageStartupMessages' or if (!"IRanges" %in%
loadedNamespaces()) and functions referenced by IRanges:::...
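
A minimal sketch of that idea, assuming requireNamespace() is an acceptable substitute for require(). It is not the actual S4Vectors fix. The base package 'tools' stands in for IRanges so the snippet runs anywhere, and file_ext_no_attach() is a hypothetical name:

```r
# Hypothetical helper: use a function from another package without
# attaching that package to the search path.
file_ext_no_attach <- function(path) {
  # requireNamespace() loads the namespace but, unlike require(),
  # does not attach the package.
  if (!requireNamespace("tools", quietly = TRUE))
    stop("Couldn't load the 'tools' package.")
  # Refer to the function with an explicit namespace prefix.
  tools::file_ext(path)
}

attached_before <- search()
ext <- file_ext_no_attach("DF.Rdata")
attached_after <- search()
```

The search path is the same before and after the call, whereas a require() in the same place would have added the package to it.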






In my use case, I'm trying to keep the namespace as small as possible
(to minimize loading time) because it's for a tiny package that has a
single function. This tiny package is then loaded on a
BiocParallel::bplapply() call using BiocParallel::SnowParam() which
performs much better than BiocParallel::MulticoreParam() in terms of
keeping the memory under control.


probably it is not desirable to move packages from loaded to attached, but
I don't think this influences performance in a meaningful way?

Martin






Thank you for your help!
Leo

Leonardo Collado Torres, PhD student
Department of Biostatistics
Johns Hopkins University
Bloomberg School of Public Health
Website: http://www.biostat.jhsph.edu/~lcollado/
Blog: http://lcolladotor.github.io/











Full output from running the example:




  example("foo", "fooPkg")


foo> ## Initial info
foo> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] fooPkg_0.0.1

loaded via a namespace (and not attached):
[1] BiocGenerics_0.11.3 IRanges_1.99.17     parallel_3.1.0
S4Vectors_0.1.0     stats4_3.1.0        tools_3.1.0

foo> ## Load data
foo> load("~/Desktop/DF.Rdata")

foo> ## Run function
foo> result <- foo(DF)
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] fooPkg_0.0.1

loaded via a namespace (and not attached):
[1] BiocGenerics_0.11.3 IRanges_1.99.17     parallel_3.1.0
S4Vectors_0.1.0     stats4_3.1.0        tools_3.1.0
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

      clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
      parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

      xtabs

The following objects are masked from ‘package:base’:

      anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
do.call, duplicated, eval, evalq, Filter, Find, get,
      intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
      rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table,
tapply, union, unique, unlist

R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets
methods   base

other attached packages:
[1] IRanges_1.99.17     S4Vectors_0.1.0     BiocGenerics_0.11.3
fooPkg_0.0.1

loaded via a namespace (and not attached):
[1] stats4_3.1.0 tools_3.1.0





The same thing happens with the following setup:

R version 3.1.1 RC (2014-07-07 r66083)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
   [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices datasets  utils     methods
[8] base

other attached packages:
[1] IRanges_1.99.17     S4Vectors_0.1.0     BiocGenerics_0.11.3
[4] fooPkg_0.0.1        colorout_1.0-2

loaded via a namespace (and not attached):
[1] stats4_3.1.1 tools_3.1.1

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
