Re: [Bioc-devel] [devteam-bioc] Use Imports instead of Depends in the DESCRIPTION files of bioconductor packages.
On Wed, Dec 31, 2014 at 9:41 AM, Martin Morgan mtmor...@fredhutch.org wrote:
> On 12/24/2014 07:31 PM, Maintainer wrote:
>> Hi,
>>
>> Many Bioconductor packages list other packages under Depends rather
>> than Imports (e.g., IRanges Depends on BiocGenerics). Imports is
>> usually preferred to Depends.
>>
>> http://stackoverflow.com/questions/8637993/better-explanation-of-when-to-use-imports-depends
>> http://obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/
>>
>> Could the unnecessary Depends be forced to be replaced by Imports?
>> This should improve the package load time significantly.
>
> R package symbols and other objects are collated at build time into a
> 'name space'. When used,
>
> - Import: loads the name space from disk.
> - Depends: loads the name space from disk, and attaches it to the
>   search() path.
>
> Attaching is very inexpensive compared to loading, so there is no
> speed improvement gained by Import'ing instead of Depend'ing.

Yes. For example, changing Depends to Imports does not improve the package load time much. But taking 4 seconds to load a package seems too long.

    system.time(suppressPackageStartupMessages(library(MBASED)))
       user  system elapsed
      4.404   0.100   4.553

By comparison, loading ggplot2 takes only about 10% of that time. It seems that many Bioconductor packages have similar problems.

    system.time(suppressPackageStartupMessages(library(ggplot2)))
       user  system elapsed
      0.394   0.036   0.460

> The main reason to Depend: on a package is that the symbols defined
> by the package are needed by the end-user. Import'ing a package is
> appropriate when the package provides functionality only relevant to
> the package author.

What causes the load time to be too long? Is it because too many functions are exported from all dependent packages to the global namespace?

> There are likely to be specific packages that mis-use Depends;
> packages such as IRanges, GenomicRanges, etc. use Depends: as
> intended, to provide functions that are useful to the end user.
>
> Maintainers are certainly encouraged to think carefully about adding
> packages providing functionality irrelevant to the end-user to the
> Depends: field.
>
> The codetoolsBioC package (available from svn, see
> http://bioconductor.org/developers/how-to/source-control/) provides
> some mostly reliable hints to package authors about correctly
> formulating a NAMESPACE file to facilitate using Imports: instead of
> Depends:.
>
> General questions about Bioconductor packages should be addressed to
> the support forum https://support.bioconductor.org. Questions about
> Bioconductor development (such as this) should be addressed to the
> bioc-devel mailing list (subscription required)
> https://stat.ethz.ch/mailman/listinfo/bioc-devel.
>
> I have cc'd the bioc-devel mailing list; I hope that is ok.

--
Regards,
Peng

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
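The load-versus-attach distinction described in this message can be checked directly in R. What follows is a minimal sketch rather than anything posted in the thread: it uses splines (a package that ships with R, chosen here only as a stand-in for a real dependency) and the base functions loadNamespace(), library(), loadedNamespaces() and search(). The timings are illustrative and will vary by machine, and the load step is near zero if the namespace was already loaded.

```r
pkg <- "splines"  # stand-in package that ships with R; an assumption for illustration
pkg_entry <- paste0("package:", pkg)

# What an Imports: entry triggers: the namespace is loaded from disk,
# but nothing is added to the search() path.
load_time <- system.time(loadNamespace(pkg))[["elapsed"]]
loaded_only <- (pkg %in% loadedNamespaces()) && !(pkg_entry %in% search())

# What a Depends: entry adds on top: the (already loaded) namespace
# is attached to the search() path.
attach_time <- system.time(library(pkg, character.only = TRUE))[["elapsed"]]
attached <- pkg_entry %in% search()

cat("load:", load_time, "s; attach:", attach_time, "s\n")
cat("loaded without attaching:", loaded_only, "; attached afterwards:", attached, "\n")
```

Run in a fresh session, the first step leaves search() unchanged and the second adds a single entry to it, which is consistent with the point that attaching is the cheap part.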
Re: [Bioc-devel] [devteam-bioc] Use Imports instead of Depends in the DESCRIPTION files of bioconductor packages.
The slowness is due to the methods package. We're working on it.

Michael

On Wed, Dec 31, 2014 at 8:47 AM, Peng Yu pengyu...@gmail.com wrote:
> What causes the load time to be too long?
[Bioc-devel] BiocCheck Error 'duplicate label install'?
Greetings,

Apologies if this is a double post; I was overzealous with my initial submission and didn't wait for my membership to be confirmed.

I have 'finished' creating a package I would like to submit to Bioconductor. As per the instructions on the Bioconductor website, I am trying to get the package to pass the BiocCheck function, but I keep getting the following error:

    Error in parse_block(g[-1], g[1], params.src) : duplicate label 'install'

The package also passes R CMD check with no errors or warnings. If someone could kindly explain what this error means, I would sincerely appreciate it.

Regards,

--
Dr. Charles Determan, PhD
Integrated Biosciences
Re: [Bioc-devel] BiocCheck Error 'duplicate label install'?
----- Original Message -----
> From: Charles Determan Jr deter...@umn.edu
> To: bioc-devel@r-project.org
> Sent: Wednesday, December 31, 2014 11:41:09 AM
> Subject: [Bioc-devel] BiocCheck Error 'duplicate label install'?
>
> As per the instructions on the bioconductor website I am trying to
> get the package to pass the BiocCheck function but I keep getting the
> following error:
>
> Error in parse_block(g[-1], g[1], params.src) : duplicate label 'install'

It could be a bug in BiocCheck. Can you send me your package off-list? Then I'll be able to reproduce the problem and figure out what's going on.

Thanks,
Dan
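The thread does not establish the cause of this error. One thing a package author can rule out locally is a repeated chunk label in a vignette, since the message text matches the one knitr raises when two code chunks share a label. The sketch below is an assumption-laden illustration, not BiocCheck's own logic: find_duplicate_labels() is a hypothetical helper, it only understands Sweave-style chunk headers of the form <<label, options>>=, and the vignette lines are made up.

```r
# Hypothetical helper: report chunk labels that occur more than once
# in Sweave-style (.Rnw) source lines.
find_duplicate_labels <- function(lines) {
  headers <- grep("^<<.*>>=", lines, value = TRUE)
  opts <- sub("^<<(.*)>>=.*$", "\\1", headers)   # text between << and >>=
  label <- trimws(sub(",.*$", "", opts))          # first comma-separated field
  # Drop unlabeled chunks and chunks whose first field is an option (name=value)
  label <- label[nzchar(label) & !grepl("=", label, fixed = TRUE)]
  unique(label[duplicated(label)])
}

# Made-up vignette content with the label 'install' used twice
vignette_lines <- c(
  "<<install, eval=FALSE>>=",
  "x <- 1",
  "@",
  "<<install>>=",
  "y <- 2",
  "@",
  "<<plot>>=",
  "plot(1)",
  "@"
)

dups <- find_duplicate_labels(vignette_lines)
print(dups)
```

For a real vignette, the same helper could be applied to readLines() of the file. If it reports nothing, the labels in the source are unique and the problem lies elsewhere, for instance in how the checking tool processes the vignette.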
[Bioc-devel] Use Imports instead of Depends in the DESCRIPTION files of bioconductor packages
Hi Michael,

What aspect of the methods package causes the slowness? There are many packages (limma for one) that depend on methods but load quickly.

Regards
Gordon

> Date: Wed, 31 Dec 2014 09:17:01 -0800
> From: Michael Lawrence lawrence.mich...@gene.com
> To: Peng Yu pengyu...@gmail.com
> Cc: Bioconductor Package Maintainer maintai...@bioconductor.org,
>     bioc-devel@r-project.org bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] [devteam-bioc] Use Imports instead of
>     Depends in the DESCRIPTION files of bioconductor packages.
>
> The slowness is due to the methods package. We're working on it.
>
> Michael
Re: [Bioc-devel] Use Imports instead of Depends in the DESCRIPTION files of bioconductor packages
Hi Gordon,

My guess is that it has to do with how many symbols get exported. For example, on my machine, doing library(limma) in a fresh session takes 0.261s and triggers export of 292 symbols (as reported by ls(..., all.names=TRUE)). Doing library(GenomicRanges) in a fresh session takes 2.724s and triggers export of 1581 symbols (counting the symbols exported by all the packages that get loaded).

Michael, it's great to hear that somebody is working on speeding up the code in charge of this.

Happy New Year everybody!

H.

On 12/31/2014 06:07 PM, Gordon K Smyth wrote:
> Hi Michael,
>
> What aspect of the methods package causes the slowness? There are
> many packages (limma for one) that depend on methods but load
> quickly.
>
> Regards
> Gordon
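The kind of measurement described in this message (elapsed time of library() alongside a count of symbols from ls(..., all.names=TRUE)) can be approximated as follows. This is a sketch under stated assumptions, not the script used in the thread: count_attached_symbols() is a hypothetical helper, it counts only packages that end up attached to search() (not namespaces that are loaded without being attached), and stats4 is used as a stand-in because it ships with R and uses S4; substituting limma or GenomicRanges requires those packages to be installed.

```r
# Hypothetical helper: number of symbols visible in each attached package
count_attached_symbols <- function() {
  pkgs <- grep("^package:", search(), value = TRUE)
  vapply(pkgs, function(p) length(ls(p, all.names = TRUE)), integer(1))
}

before <- count_attached_symbols()

# Time the attach step, as in the system.time() calls earlier in the thread
elapsed <- system.time(
  suppressPackageStartupMessages(library(stats4))
)[["elapsed"]]

after <- count_attached_symbols()

# Packages that were newly attached by the library() call, and their symbols
new_pkgs <- setdiff(names(after), names(before))
added <- sum(after[new_pkgs])

cat("elapsed:", elapsed, "s; newly attached packages:", length(new_pkgs),
    "; symbols added:", added, "\n")
```

The numbers are only meaningful in a fresh session, since packages attached earlier are excluded from the difference and already-loaded namespaces make the timing look faster than a cold start.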