Hi,

See the latest software builds for BioC 2.13:

  http://bioconductor.org/checkResults/2.13/bioc-20140405/

The number of packages that needed to be installed on the build
system in order to build and check the 750 BioC software packages
is displayed in the right-most column of the top table:

  1510 on zin1 (Linux)
  1486 on moscato1 (Windows)
  1500 on perceval (Mac)

If you click on these numbers, you get the full list of packages
with their versions.

Once you've subtracted the 750 software packages plus the number of data
annotation and data experiment packages (a few hundred more) from
these numbers, you get the number of CRAN packages that
BioC 2.13 depends on. That is not many, really (only a very small fraction
of the 5400 CRAN packages).
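
To make the subtraction concrete, here is a small sketch in R. The number of
data annotation and data experiment packages is only given above as "a few
hundred", so the 300 used below is a placeholder and not a real figure; the
function and variable names are made up for illustration.

```r
# Packages installed on each build machine (from the build report above).
totals <- c(zin1 = 1510, moscato1 = 1486, perceval = 1500)

# Estimate of CRAN dependencies: installed total minus BioC software
# packages minus BioC data (annotation + experiment) packages.
cran_dependency_count <- function(total_installed, n_software, n_data) {
  total_installed - n_software - n_data
}

n_data_placeholder <- 300  # ASSUMPTION: stands in for "a few hundred"

cran_counts <- cran_dependency_count(totals,
                                     n_software = 750,
                                     n_data = n_data_placeholder)

# Share of the roughly 5400 CRAN packages that BioC 2.13 would depend on.
cran_fraction <- cran_counts / 5400
```

Replacing the placeholder with the real count from the package lists linked
in the build report would give the actual estimate.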

If we hosted only this small subset of CRAN packages under

  http://bioconductor.org/packages/2.13/cran

next to the other 4 frozen repos

  http://bioconductor.org/packages/2.13/bioc
  http://bioconductor.org/packages/2.13/data/annotation
  http://bioconductor.org/packages/2.13/data/experiment
  http://bioconductor.org/packages/2.13/extra

and modified biocLite() to point to

   http://bioconductor.org/packages/2.13/cran

instead of

  http://cran.fhcrc.org

then anybody who has R 3.0.3 could *easily* install and run
BioC 2.13 now or 5 years from now.
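
As a rough sketch of what this could look like on the user side, without
touching biocLite() itself: the /cran repository is only the proposal made
above and is not known to exist, the data/experiment path is an assumption
about the repository layout, and install_frozen() is a hypothetical helper
name.

```r
bioc_version <- "2.13"
base_url <- paste0("http://bioconductor.org/packages/", bioc_version)

# The frozen BioC repos plus the proposed (hypothetical) CRAN subset.
frozen_repos <- c(
  BioCsoft  = paste0(base_url, "/bioc"),
  BioCann   = paste0(base_url, "/data/annotation"),
  BioCexp   = paste0(base_url, "/data/experiment"),
  BioCextra = paste0(base_url, "/extra"),
  CRAN      = paste0(base_url, "/cran")  # ASSUMPTION: proposed, may not exist
)

# Hypothetical helper: install only from the frozen repos, so that a CRAN
# package updated after the release cannot be picked up.
install_frozen <- function(pkgs, repos = frozen_repos, ...) {
  install.packages(pkgs, repos = repos, ...)
}

# Not run here (needs network access and the proposed repo):
# install_frozen("limma")
```

Passing the repos explicitly, rather than changing the session-wide "repos"
option, keeps the frozen setup from leaking into unrelated installs.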

Cheers,
H.


On 04/24/2014 08:09 AM, Steve Lianoglou wrote:
Hi all,

Just saw this tangentially related link to "packrat", which seems to be
something analogous to a virtualenv (of sorts) for R by the RStudio folks,
and which I thought might be useful.

It actually doesn't solve anybody's problem here, but as I said ...
tangential :-)

http://rstudio.github.io/packrat/


On Thursday, April 24, 2014, Wolfgang Huber <whu...@embl.de> wrote:

Hi Kasper

you are right, I had misunderstood the problem.
In that case I agree with Martin that the problem resolves into components
that are either intractable, already addressed by deprecation policies, or
not very important.
Sorry for the noise.

         Wolfgang

On 24 Apr 2014, at 15:18, Kasper Daniel Hansen <
kasperdanielhan...@gmail.com> wrote:

Wolfgang,

Alejandro did not have a problem with the current release, but with the
most recent prior release.  His issue is precisely because it is no longer
the current (stable) release.

Kasper


On Thu, Apr 24, 2014 at 3:05 PM, Wolfgang Huber <whu...@embl.de> wrote:
Hi Martin
to come back to the original trigger for this thread: it was not a
concern about reproducibility, but the fact that a Bioc package in the
current release stopped working because a CRAN package has changed in the
meantime.
What's the most practical solution to this specific problem?
         Best wishes
         Wolfgang




On 23 Apr 2014, at 19:41, Martin Morgan <mtmor...@fhcrc.org> wrote:

On 04/22/2014 09:47 AM, Kasper Daniel Hansen wrote:
I think we should have a CRAN snapshot (or a subset of CRAN used in Bioc)
inside each Bioc release; I don't know how hard that is to manage from a
technical point of view.

I followed this thread with some interest.

It would be surprisingly challenging to update even a 2.13 package --
the build machines have moved on to other tasks, unconstrained by the
unique system dependencies needed for 2.13 builds.

The idea of a 'forever' repository snapshot seems possible, but would
the snapshot be at the beginning of the release and hence miss the few but
important bug fixes introduced during the release, or at the end of the
release, which might be after the time required for the purposes of
replication? Either way it is certain that the peanut butter would land
face down for one's particular need. Also, the need for the user to satisfy
system dependencies becomes increasingly challenging, even with a binary
repository. I don't think a central 'Bioc' solution would really address
the problem of reproducibility.

It is not that 'hard' for an individual group to create a snapshot of
Bioc and CRAN, using rsync

  http://www.bioconductor.org/about/mirrors/mirror-how-to/
  http://cran.r-project.org/mirror-howto.html

and to use install.packages() or even biocLite to access these (see
?setRepositories). This would again require that the system dependencies
for these packages are satisfied in some kind of frozen fashion.
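
A minimal sketch of the install.packages() side, assuming the rsync'ed
mirrors already sit on local disk. The root path and the directory layout
under it are made up for illustration, and snapshot_repos() is a
hypothetical helper, not part of R or Bioconductor.

```r
# Build file:// repository URLs for a local snapshot.
# ASSUMPTION: CRAN was mirrored to <root>/cran and Bioc to
# <root>/bioc/packages/<version>/; adjust to the layout actually used.
snapshot_repos <- function(root, bioc_version = "2.13") {
  bioc <- paste0("file://", root, "/bioc/packages/", bioc_version)
  c(
    BioCsoft = paste0(bioc, "/bioc"),
    BioCann  = paste0(bioc, "/data/annotation"),
    BioCexp  = paste0(bioc, "/data/experiment"),
    CRAN     = paste0("file://", root, "/cran")
  )
}

repos <- snapshot_repos("/srv/snapshot")  # hypothetical path

# Not run here (needs the mirrored files to exist):
# install.packages("somePackage", repos = repos)
```

This covers only the R packages; the system dependencies still have to be
frozen by other means.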

A more robust possibility is of course a virtual machine, such as the
AMI (or a customized version) we provide

  http://www.bioconductor.org/help/bioconductor-cloud-ami/#ami_ids

although these have only a subset of packages installed by default.

The CRAN thread referenced earlier included this post

  https://stat.ethz.ch/pipermail/r-devel/2014-March/068605.html

which I think makes an important distinction between exact replication
and scientific reproducibility; it is the latter that must be the most
interesting, and the former that we somehow seem to stumble over. The
thread also mentions best practices -- version control

  http://bioconductor.org/developers/how-to/source-control/

disciplined approach to deprecation

  http://bioconductor.org/developers/how-to/deprecation/

package versioning

  http://bioconductor.org/developers/how-to/version-numbering/

and the Bioc-style approach to release that we as developers can act
on to enhance reproducibility. What other best pract




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
