Have you seen QuasR (and/or gmapR, Rsubread, etc.)? One can run Bowtie, GSNAP, etc. from R.

This certainly makes it easier for me to remember how I did some ChIP-seq, BS-seq, or RNA-seq processing a year ago when it turns out I need to add a sample or samples and carry on with an existing analysis pipeline.

On Wed, Mar 6, 2013 at 10:17 AM, Cook, Malcolm <m...@stowers.org> wrote:

> Thanks David, I've looked into them both a bit, and I don't think they
> provide an approach for R (or Perl, for that matter) library management,
> which is the wicket I'm trying to get less sticky now.
>
> They could be useful to manage the various installations of versions of R
> and analysis files (we're talking a lot of NextGenSequencing, so, bowtie,
> tophat, and friends) quite nicely, similarly in service of an approach to
> enabling reproducible results.
>
> Thanks for your thoughts, and, if you know of others similar to
> dotkit/modules I'd be keen to hear of them.
>
> ~Malcolm
>
>
> .-----Original Message-----
> .From: Lapointe, David [mailto:david.lapoi...@umassmed.edu]
> .Sent: Wednesday, March 06, 2013 7:46 AM
> .To: Cook, Malcolm; 'Paul Gilbert'
> .Cc: 'r-devel@r-project.org'; 'bioconduc...@r-project.org'; 'r-discuss...@listserv.stowers.org'
> .Subject: RE: [BioC] [Rd] enabling reproducible research & R package management & install.package.version & BiocLite
> .
> .There are utilities (e.g. dotkit and modules) which facilitate version
> .management, basically creating PATH and env setups on the fly, if
> .you are comfortable keeping all that around.
> .
> .David
> .
> .-----Original Message-----
> .From: bioconductor-boun...@r-project.org [mailto:bioconductor-boun...@r-project.org] On Behalf Of Cook, Malcolm
> .Sent: Tuesday, March 05, 2013 6:08 PM
> .To: 'Paul Gilbert'
> .Cc: 'r-devel@r-project.org'; 'bioconduc...@r-project.org'; 'r-discuss...@listserv.stowers.org'
> .Subject: Re: [BioC] [Rd] enabling reproducible research & R package management & install.package.version & BiocLite
> .
> .Paul,
> .
> .I think your balanced and reasoned approach addresses all my current
> .concerns. Nice! I will likely adopt your methods. Let me
> .ruminate. Thanks for this.
> .
> .~ Malcolm
> .
> . .-----Original Message-----
> . .From: Paul Gilbert [mailto:pgilbert...@gmail.com]
> . .Sent: Tuesday, March 05, 2013 4:34 PM
> . .To: Cook, Malcolm
> . .Cc: 'r-devel@r-project.org'; 'bioconduc...@r-project.org'; 'r-discuss...@listserv.stowers.org'
> . .Subject: Re: [Rd] [BioC] enabling reproducible research & R package management & install.package.version & BiocLite
> . .
> . .(More on the original question further below.)
> . .
> . .On 13-03-05 09:48 AM, Cook, Malcolm wrote:
> . .> All,
> . .>
> . .> What got me started on this line of inquiry was my attempt at
> . .> balancing the advantages of performing a periodic (daily or weekly)
> . .> update to the 'release' version of locally installed R/Bioconductor
> . .> packages on our institute-wide installation of R with the
> . .> disadvantages of potentially changing the result of an analyst's
> . .> workflow in mid-project.
> . .
> . .I have implemented a strategy to try to address this as follows:
> . .
> . .1/ Install a new version of R when it is released, and packages in the
> . .R version's site-library with package versions as available at the time
> . .the R version is installed. Only upgrade these package versions in the
> . .case they are severely broken.
> . .
> . .2/ Install the same packages in site-library-fresh and upgrade these
> . .package versions on a regular basis (e.g. daily).
> . .
> . .3/ When a new version of R is released, freeze but do not remove the old
> . .R version, at least not for a fairly long time, and freeze
> . .site-library-fresh for the old version. Begin with the new version as in
> . .1/ and 2/. The old version remains available, so "reverting" is trivial.
> . .
> . .
> . .The analysts are then responsible for choosing the R version they use,
> . .and the library they use.
> . .This means they do not have to change R and package versions
> . .mid-project, but they can if they wish. I think the above two
> . .libraries will cover most cases, but it is possible that a few projects
> . .will need their own special library with a combination of package
> . .versions. In this case the user could create their own library, or you
> . .might prefer some more official mechanism.
> . .
> . .The idea of the above strategy is to provide the stability one might
> . .want for an ongoing project, and the possibility of an upgraded package
> . .if necessary, but not encourage analysts to remain indefinitely with old
> . .versions (by, say, putting new packages in an old R version library).
> . .
> . .This strategy has been implemented in a set of make files in the project
> . .RoboAdmin available at http://automater.r-forge.r-project.org/. It can
> . .be done entirely automatically with a cron job. Constructive comments
> . .are always appreciated.
> . .
> . .(IT departments sometimes think that there should be only one version of
> . .everything available, which they test and approve. So the initial
> . .reaction to this approach could be negative. I think they have not
> . .really thought about the advantages. They usually cannot test/approve an
> . .upgrade without user input, and timing is often extremely complicated
> . .because of ongoing user needs. This strategy is simply shifting
> . .responsibility and timing to the users, or user departments, that can
> . .actually do the testing and approving.)
> . .
> . .Regarding NFS mounts, it is relatively robust. There can be occasional
> . .problems, especially for users that have a habit of keeping an R session
> . .open for days at a time and using site-library-fresh packages. In my
> . .experience this did not happen often enough to worry about a "blackout
> . .period".
> . .
> . .Regarding the original question, I would like to think it could be
> . .possible to keep enough information to reproduce the exact environment,
> . .but I think for potentially sensitive numerical problems that is
> . .optimistic. As others have pointed out, results can depend not only on R
> . .and package versions, configuration, OS versions, and library and
> . .compiler versions, but also on the underlying hardware. You might have
> . .some hope using something like an Amazon core instance. (BTW, this
> . .problem is not specific to R.)
> . .
> . .It is true that restricting to a fixed computing environment at your
> . .institution may ease things somewhat, but if you occasionally upgrade
> . .hardware or the OS then you will probably lose reproducibility.
> . .
> . .An alternative that I recommend is that you produce a set of tests that
> . .confirm the results of any important project. These can be conveniently
> . .put in the tests/ directory of an R package, which is then maintained
> . .locally, not on CRAN, and built/tested whenever a new R and packages are
> . .installed. (Tools for this are also available at the above indicated web
> . .site.) This approach means that you continue to reproduce the old
> . .results, or if not, discover differences/problems in the old or new
> . .version of R and/or packages that may be important to you. I have been
> . .successfully using a variant of this since about 1993, using R and
> . .package tests/ since they became available.
> . .
> . .Paul
> . .
> . .>
> . .> I just got the "green light" to institute such periodic updates, which
> . .> I have been arguing are in our collective best interest. In return,
> . .> I promised my best effort to provide a means for preserving or
> . .> reverting to a working R library configuration.
> . .>
> . .> Please note that the reproducibility I am most eager to provide is
> . .> limited to reproducibility within the computing environment of our
> . .> institute, which perhaps takes away some of the dragon's nests,
> . .> though certainly not all.
> . .>
> . .> There are technical issues of updating package installations on an
> . .> NFS mount that might have files/libraries open on it from running R
> . .> sessions. I am interested in learning of approaches for
> . .> minimizing/eliminating exposure to these issues as well. The
> . .> first/best approach seems to be to institute a 'black out' period
> . .> when users should expect the installed library to change. Perhaps
> . .> there are improvements to this?
> . .>
> . .> Best,
> . .>
> . .> Malcolm
> . .>
> . .>
> . .> .-----Original Message-----
> . .> .From: Mike Marchywka [mailto:marchy...@hotmail.com]
> . .> .Sent: Tuesday, March 05, 2013 5:24 AM
> . .> .To: amac...@virginia.edu; Cook, Malcolm
> . .> .Cc: r-devel@r-project.org; bioconduc...@r-project.org; r-discuss...@listserv.stowers.org
> . .> .Subject: RE: [Rd] [BioC] enabling reproducible research & R package management & install.package.version & BiocLite
> . .> .
> . .> .I hate to ask what got this thread started, but it sounds like someone
> . .> .was counting on exact numeric reproducibility, or was there a bug in a
> . .> .specific release? In actual fact, the best way to determine
> . .> .reproducibility is to run the code in a variety of packages.
> . .> .Alternatively, you can do everything in java and not assume that
> . .> .calculations commute or associate as the code is modified, but it seems
> . .> .pointless. Sensitivity determination would seem to lead to more
> . .> .reproducible results than trying to keep a specific set of code quirks.
> . .> .
> . .> .I also seem to recall that the FPU may have random lower-order bits in
> . .> .some cases, so the same code/data give different results. Always assume
> . .> .FP is stochastic and plan on analyzing the "noise."
> . .> .
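To make the floating-point point above concrete, here is a minimal sketch, written in Python purely for illustration (nothing in it is specific to R, and the helper names `left_to_right_sum` and `same_within_noise` are made up for this example). It shows that regrouping or reordering the same additions can change the result, and that comparing up to a tolerance is one way to treat the low-order bits as noise rather than as a failure to reproduce.

```python
import math


def left_to_right_sum(values):
    """Add the values strictly in the order given, as a plain loop would."""
    total = 0.0
    for value in values:
        total += value
    return total


# Same three numbers, two groupings: the results differ in the last bit.
grouped_left = (0.1 + 0.2) + 0.3
grouped_right = 0.1 + (0.2 + 0.3)

# Same three numbers, two orderings: here the difference is not small,
# because 1.0 is rounded away when it is added to 1e16 first.
naive = left_to_right_sum([1e16, 1.0, -1e16])
naive_reordered = left_to_right_sum([1e16, -1e16, 1.0])

# math.fsum tracks the rounding error and returns the correctly rounded sum.
exact = math.fsum([1e16, 1.0, -1e16])


def same_within_noise(a, b, rel_tol=1e-9):
    """Compare two results up to a relative tolerance, not bit-for-bit."""
    return math.isclose(a, b, rel_tol=rel_tol)


print(grouped_left == grouped_right)
print(same_within_noise(grouped_left, grouped_right))
print(naive, naive_reordered, exact)
```

The tolerance of 1e-9 is an arbitrary choice for the sketch; a real project would pick it from a sensitivity analysis of the computation in question.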
> . .> .----------------------------------------
> . .> .> From: amac...@virginia.edu
> . .> .> Date: Mon, 4 Mar 2013 16:28:48 -0500
> . .> .> To: m...@stowers.org
> . .> .> CC: r-devel@r-project.org; bioconduc...@r-project.org; r-discuss...@listserv.stowers.org
> . .> .> Subject: Re: [Rd] [BioC] enabling reproducible research & R package management & install.package.version & BiocLite
> . .> .>
> . .> .> On Mon, Mar 4, 2013 at 4:13 PM, Cook, Malcolm <m...@stowers.org> wrote:
> . .> .>
> . .> .> > * where do the dragons lurk
> . .> .> >
> . .> .>
> . .> .> webs of interconnected dynamically loaded libraries, identical versions
> . .> .> of R compiled with different BLAS/LAPACK options, etc. Go with the VM
> . .> .> if you really, truly, want this level of exact reproducibility.
> . .> .>
> . .> .> An alternative (and arguably more useful) strategy would be to cache
> . .> .> results of each computational step, and report when results differ upon
> . .> .> re-execution with identical inputs; if you cache sessionInfo along with
> . .> .> each result, you can identify which package(s) changed, and begin to
> . .> .> hunt down why the change occurred (possibly for the better); couple
> . .> .> this with the concept of keeping both code *and* results in version
> . .> .> control, then you can move forward with a (re)analysis without being
> . .> .> crippled by out-of-date software.
> . .> .>
> . .> .> -Aaron
> . .> .>
> . .> .> --
> . .> .> Aaron J. Mackey, PhD
> . .> .> Assistant Professor
> . .> .> Center for Public Health Genomics
> . .> .> University of Virginia
> . .> .> amac...@virginia.edu
> . .> .> http://www.cphg.virginia.edu/mackey
> . .> .>
> . .> .> ______________________________________________
> . .> .> R-devel@r-project.org mailing list
> . .> .> https://stat.ethz.ch/mailman/listinfo/r-devel
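The caching strategy Aaron describes above can be sketched roughly as follows. This is an illustration in Python, not anyone's actual implementation: `environment_record`, `input_key`, and `run_cached` are hypothetical names, the in-memory dict stands in for a cache that would really live on disk under version control, and `environment_record` is only a stand-in for what `sessionInfo()` would capture in R.

```python
import hashlib
import json
import platform
import sys


def environment_record():
    """Stand-in for R's sessionInfo(): interpreter and platform details."""
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
    }


def input_key(step_name, inputs):
    """Hash the step name and its (JSON-serializable) inputs into a cache key."""
    payload = json.dumps({"step": step_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(cache, step_name, func, inputs, env=None):
    """Run func(**inputs) and compare with the cached result for the same inputs.

    Returns (result, report). report is None unless a previous result for
    identical inputs exists and differs; in that case it lists the old and
    new results and which environment fields changed in between.
    """
    env = environment_record() if env is None else env
    key = input_key(step_name, inputs)
    result = func(**inputs)
    previous = cache.get(key)
    report = None
    if previous is not None and previous["result"] != result:
        all_fields = set(env) | set(previous["env"])
        changed = sorted(
            field for field in all_fields
            if env.get(field) != previous["env"].get(field)
        )
        report = {
            "step": step_name,
            "old": previous["result"],
            "new": result,
            "env_changed": changed,
        }
    # Store the latest result together with the environment that produced it.
    cache[key] = {"result": result, "env": env}
    return result, report


# Usage: the same step and inputs, run under two (made-up) package versions.
def mean(xs):
    return sum(xs) / len(xs)


def mean_after_upgrade(xs):
    # Pretend a package upgrade silently changed the denominator.
    return sum(xs) / (len(xs) - 1)


cache = {}
_, first = run_cached(cache, "mean", mean, {"xs": [1.0, 2.0, 3.0]},
                      env={"somepkg": "1.0"})
_, second = run_cached(cache, "mean", mean_after_upgrade, {"xs": [1.0, 2.0, 3.0]},
                       env={"somepkg": "1.1"})
print(first)
print(second)
```

The sketch compares results with exact equality to keep it short; for numerical output a tolerance-based comparison would likely be the better choice.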
> .
> ._______________________________________________
> .Bioconductor mailing list
> .bioconduc...@r-project.org
> .https://stat.ethz.ch/mailman/listinfo/bioconductor
> .Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
*A model is a lie that helps you see the truth.*

Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel