On 17/06/16 18:54, Ricardo Wurmus wrote:
Hi Guix,

Bioconductor makes me sad.  Bioconductor is a repository of R packages
for bioinformatics.  They have a bit of a weird release model.  There
are releases of all of Bioconductor (current version is 3.3), but within
releases R packages can be updated.  The packages are supposed to be
compatible with other packages in the same Bioconductor release.

For Bioconductor, synchronisation appears to be important, but is also
elusive.  You usually want to have all R packages from the same
Bioconductor release, but since a Bioconductor release is a fluid thing
and individual packages get updated all the time you probably want to
have the latest at all times.

Unfortunately, Bioconductor doesn’t have an archive of previous releases
of R packages.  They only keep the latest version of any particular R
package at a time.  All of Bioconductor is also kept in SVN and there
are git mirrors of the SVN repository.

Our Bioconductor importer (guix import cran -a bioconductor) fetches
DESCRIPTION files of individual R packages from SVN.  I found that the
tarballs offered for download are not always in sync with what is
offered on SVN, so the importer sometimes fails as it tries to fetch a
tarball version that doesn’t exist.
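
A rough sketch of that mismatch, in Python rather than the importer's
own code: read the Version field from a DESCRIPTION file and derive the
tarball name that would be requested.  The NAME_VERSION.tar.gz pattern
is the usual R source package convention; the package name "foo", the
helper names, and the set of published tarballs are made up for
illustration.

```python
def parse_description(text):
    """Parse 'Key: value' fields; indented lines continue the previous field."""
    fields = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and key is not None:
            # Continuation line: append to the field we are currently in.
            fields[key] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = value.strip()
    return fields


def tarball_name(fields):
    """File name of the source tarball implied by the DESCRIPTION fields."""
    return f"{fields['Package']}_{fields['Version']}.tar.gz"


def in_sync(fields, published):
    """True if the tarball implied by DESCRIPTION is among the published ones."""
    return tarball_name(fields) in published


# Hypothetical example: SVN already says 1.2.3, the download area still has 1.2.2.
description = "Package: foo\nVersion: 1.2.3\nDescription: A long\n  text\n"
fields = parse_description(description)
print(tarball_name(fields), in_sync(fields, {"foo_1.2.2.tar.gz"}))
```

When in_sync is false, fetching the tarball fails even though the
DESCRIPTION file was read correctly.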

The lack of an archive is also a problem for reproducibility.  You
simply cannot download an archive for an obsolete package version.

This makes me wonder if we shouldn’t ignore the tarballs and fetch
directly from SVN or the git mirror.  I would like to make this a little
more reliable, so that people can reproduce the state of Bioconductor at
a particular point in time if they have a manifest and a git hash of the
Guix repository.

Releases of individual packages are not tagged in the Bioconductor SVN
repository, however.  Do we still have to append the SVN revision to the
version strings of every Bioconductor package?  An increase in the SVN
revision does not necessarily mean that an individual package has been
updated.

So the problem is that the SVN revision number gets bumped each time any Bioconductor package is updated? I still think we should append it. To me, a version increase indicates a possible change of source code upstream, not a guarantee of one.
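
To make that concrete, here is a minimal Python sketch of how I picture it. This is not Guix code: the "-r" suffix format, the Snapshot type, and the function names are assumptions for illustration only. The one real thing it leans on is that SVN tracks a per-path "Last Changed Rev" separately from the repository-wide revision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """State of one package at one point in time (hypothetical type)."""
    version: str                # Version field from DESCRIPTION
    repo_revision: int          # repository-wide SVN revision
    last_changed_revision: int  # revision in which this package last changed


def version_string(snapshot):
    # Assumed format: upstream version plus the repository-wide revision.
    return f"{snapshot.version}-r{snapshot.repo_revision}"


def possibly_changed(old, new):
    # A different version string means the source *may* have changed.
    return version_string(old) != version_string(new)


def actually_changed(old, new):
    # The package's own last-changed revision is the stricter signal.
    return old.last_changed_revision != new.last_changed_revision


old = Snapshot("1.2.3", repo_revision=100, last_changed_revision=90)
new = Snapshot("1.2.3", repo_revision=105, last_changed_revision=90)
print(version_string(new), possibly_changed(old, new), actually_changed(old, new))
```

Here the version string moves even though the package itself did not change, which is fine by me as long as we read the string as "possibly changed".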

What do you intend 'refresh' to report?

Overall, FWIW, I'm supportive of moving to SVN for Bioconductor, because otherwise too much of the onus is on us to keep all tarballs in perpetuity, or to make sure they remain available somewhere.

ben
