There are a few other changes in there too, but profiling did identify low-hanging fruit like the sapplys. From there I have found a long list of refactoring opportunities that offer ~2% improvements. These may not be worth the risk of reversions, however. I'll be putting together a patch proposal with the easy changes in the next few days.
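For readers who have not made this kind of change before, here is a minimal sketch of an sapply-to-vapply swap. It is an illustration only, not code from the patch or from the methods package; the list `slots` and the variable names are made up.

```r
# Hypothetical input: a named list standing in for some per-class metadata.
slots <- list(a = 1:3, b = letters, c = numeric(0))

# sapply() builds a list of results first and then tries to simplify it.
n_sapply <- sapply(slots, length)

# vapply() is told the result type and length up front (one integer per
# element), so it can fill a preallocated vector and errors on a mismatch.
n_vapply <- vapply(slots, length, integer(1))

# Both give the same named integer vector here.
stopifnot(identical(n_sapply, n_vapply))

# The return type is also stable for empty input:
# sapply() falls back to an empty list, vapply() still returns integer(0).
str(sapply(list(), length))
str(vapply(list(), length, integer(1)))
```

Besides skipping the simplification step, the fixed return type means callers do not need to guard against getting a list back when the input happens to be empty.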
Regards,
Pete

____________________
Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com

On Fri, Jan 2, 2015 at 1:58 AM, Michael Lawrence <lawrence.mich...@gene.com> wrote:

> Pete Haverty is the one working on this. He has almost cut loading time in
> half by just changing some sapply and lapply calls to vapply calls. Most
> likely because allocating all of those list elements is expensive, and
> Martin's memory parameters also help with that.
>
> On Wed, Dec 31, 2014 at 10:30 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>
> > Hi Gordon,
> >
> > My guess is that it has to do with how many symbols get exported.
> > For example on my machine, doing library(limma) in a fresh session
> > takes 0.261s and triggers export of 292 symbols (as reported by
> > ls(..., all.names=TRUE)). Doing library(GenomicRanges) in a fresh
> > session takes 2.724s and triggers export of 1581 symbols (counting
> > the symbols exported by all the packages that get loaded).
> >
> > Michael it's great to hear that somebody is working on speeding up
> > the code in charge of this.
> >
> > Happy New Year everybody!
> > H.
> >
> > On 12/31/2014 06:07 PM, Gordon K Smyth wrote:
> >
> >> Hi Michael,
> >>
> >> What aspect of the methods package causes the slowness?
> >>
> >> There are many packages (limma for one) that depend on methods but load
> >> quickly.
> >>
> >> Regards
> >> Gordon
> >>
> >>> Date: Wed, 31 Dec 2014 09:17:01 -0800
> >>> From: Michael Lawrence <lawrence.mich...@gene.com>
> >>> To: Peng Yu <pengyu...@gmail.com>
> >>> Cc: Bioconductor Package Maintainer <maintai...@bioconductor.org>,
> >>>     "bioc-devel@r-project.org" <bioc-devel@r-project.org>
> >>> Subject: Re: [Bioc-devel] [devteam-bioc] Use Imports instead of
> >>>     Depends in the DESCRIPTION files of bioconductor packages.
> >>>
> >>> The slowness is due to the methods package. We're working on it.
> >>>
> >>> Michael
> >>>
> >>> On Wed, Dec 31, 2014 at 8:47 AM, Peng Yu <pengyu...@gmail.com> wrote:
> >>>
> >>>> On Wed, Dec 31, 2014 at 9:41 AM, Martin Morgan <mtmor...@fredhutch.org> wrote:
> >>>>
> >>>>> On 12/24/2014 07:31 PM, Maintainer wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Many bioconductor packages Depends on other packages but not Imports
> >>>>>> other packages. (e.g., IRanges Depends on BiocGenerics.) Imports is
> >>>>>> usually preferred to Depends.
> >>>>>>
> >>>>>> http://stackoverflow.com/questions/8637993/better-explanation-of-when-to-use-imports-depends
> >>>>>> http://obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/
> >>>>>>
> >>>>>> Could the unnecessary Depends be forced to be replaced by Imports?
> >>>>>> This should improve the package load time significantly.
> >>>>>
> >>>>> R package symbols and other objects are collated at build time into a
> >>>>> 'name space'. When used,
> >>>>>
> >>>>> - Import: loads the name space from disk.
> >>>>> - Depends: loads the name space from disk, and attaches it to the
> >>>>>   search() path.
> >>>>>
> >>>>> Attaching is very inexpensive compared to loading, so there is no speed
> >>>>> improvement gained by Import'ing instead of Depend'ing.
> >>>>
> >>>> Yes. For example, changing Depends to Imports does not improve the
> >>>> package load time much.
> >>>>
> >>>> But loading a package in 4 sec seems to be too long.
> >>>>
> >>>> system.time(suppressPackageStartupMessages(library(MBASED)))
> >>>>    user  system elapsed
> >>>>   4.404   0.100   4.553
> >>>>
> >>>> For example, it only takes 10% of the time to load ggplot2. It seems
> >>>> that many bioconductor packages have similar problems.
> >>>>
> >>>> system.time(suppressPackageStartupMessages(library(ggplot2)))
> >>>>    user  system elapsed
> >>>>   0.394   0.036   0.460
> >>>>
> >>>>> The main reason to Depend: on a package is because the symbols defined by
> >>>>> the package are needed by the end-user. Import'ing a package is appropriate
> >>>>> when the package provides functionality only relevant to the package author.
> >>>>
> >>>> What causes the load time to be too long? Is it because exporting too
> >>>> many functions from all dependent packages to the global namespace?
> >>>>
> >>>>> There are likely to be specific packages that mis-use Depends; packages such
> >>>>> as IRanges, GenomicRanges, etc use Depends: as intended, to provide
> >>>>> functions that are useful to the end user.
> >>>>>
> >>>>> Maintainers are certainly encouraged to think carefully about adding
> >>>>> packages providing functionality irrelevant to the end-user to the Depends:
> >>>>> field. The codetoolsBioC package (available from svn, see
> >>>>> http://bioconductor.org/developers/how-to/source-control/) provides some
> >>>>> mostly reliable hints to package authors about correctly formulating a
> >>>>> NAMESPACE file to facilitate using Imports: instead of Depends:.
> >>>>>
> >>>>> General questions about Bioconductor packages should be addressed to the
> >>>>> support forum https://support.bioconductor.org.
> >>>>>
> >>>>> Questions about Bioconductor development (such as this) should be addressed
> >>>>> to the bioc-devel mailing list (subscription required)
> >>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel.
> >>>>>
> >>>>> I have cc'd the bioc-devel mailing list; I hope that is ok.
> >>>>
> >>>> --
> >>>> Regards,
> >>>> Peng
_______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel