Hi Martin,
to come back to the original trigger for this thread: it was not a concern about
reproducibility, but the fact that a Bioc package in the current release
stopped working because a CRAN package had changed in the meantime.
What’s the most practical solution to this specific problem?
Best wishes
Wolfgang
On 23 Apr 2014, at 19:41, Martin Morgan <[email protected]> wrote:
> On 04/22/2014 09:47 AM, Kasper Daniel Hansen wrote:
>> I think we should have a CRAN snapshot (or a subset of CRAN used in Bioc)
>> inside each Bioc release; I don't know how hard that is to manage from a
>> technical point of view.
>
> I followed this thread with some interest.
>
> It would be surprisingly challenging to update even a 2.13 package -- the
> build machines have moved on to other tasks, unconstrained by the unique
> system dependencies needed for 2.13 builds.
>
> The idea of a 'forever' repository snapshot seems possible, but would the
> snapshot be at the beginning of the release and hence miss the few but
> important bug fixes introduced during the release, or at the end of the
> release, which might be after the time required for the purposes of
> replication? Either way it is certain that the peanut butter would land face
> down for one's particular need. Also, the need for the user to satisfy system
> dependencies becomes increasingly challenging, even with a binary repository.
> I don't think a central 'Bioc' solution would really address the problem of
> reproducibility.
>
> It is not that 'hard' for an individual group to create a snapshot of Bioc
> and CRAN, using rsync
>
> http://www.bioconductor.org/about/mirrors/mirror-how-to/
> http://cran.r-project.org/mirror-howto.html
>
> and to use install.packages() or even biocLite to access these (see
> ?setRepositories). This would again require that the system dependencies for
> these packages are satisfied in some kind of frozen fashion.
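>
> A minimal sketch of the second step, assuming the rsync'ed mirrors already
> exist on local disk. The directory paths and the package name somePackage are
> hypothetical placeholders, and 2.13 is used only as an example release; the
> repos option and install.packages() are standard R.
>
> ```r
> # Hypothetical local paths: wherever the rsync'ed mirrors were written.
> cran_snapshot <- "file:///srv/mirrors/cran-snapshot"
> bioc_snapshot <- "file:///srv/mirrors/bioconductor/packages/2.13/bioc"
>
> # Point this R session at the frozen mirrors instead of the live repositories.
> options(repos = c(CRAN = cran_snapshot, BioCsoft = bioc_snapshot))
>
> # Dependencies now resolve against the snapshot, not current CRAN.
> install.packages("somePackage")  # placeholder package name
> ```
>
> Setting the option per session (or in a project-level .Rprofile) keeps the
> frozen repositories from leaking into unrelated work.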
>
> A more robust possibility is of course a virtual machine, such as the AMI (or
> a customized version) we provide
>
> http://www.bioconductor.org/help/bioconductor-cloud-ami/#ami_ids
>
> although these have only a subset of packages installed by default.
>
> The CRAN thread referenced earlier included this post
>
> https://stat.ethz.ch/pipermail/r-devel/2014-March/068605.html
>
> which I think makes an important distinction between exact replication and
> scientific reproducibility; it is the latter that must be the most
> interesting, and the former that we somehow seem to stumble over. The thread
> also mentions best practices -- version control
>
> http://bioconductor.org/developers/how-to/source-control/
>
> disciplined approach to deprecation
>
> http://bioconductor.org/developers/how-to/deprecation/
>
> package versioning
>
> http://bioconductor.org/developers/how-to/version-numbering/
>
> and the Bioc-style approach to release that we as developers can act on to
> enhance reproducibility. What other best practices can we more forcefully /
> conveniently adopt within the project?
>
> Martin
>
>>
>> Best,
>> Kasper
>>
>>
>> On Tue, Apr 22, 2014 at 6:06 PM, Julian Gehring
>> <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> For most problems discussed here, it seems that having a fixed version of
>>> a package is sufficient rather than a specific version. If the idea of a
>>> snapshot with each bioc release would work (which still means one version
>>> per package), so would requiring that version within the package (one would
>>> just need to agree which version this is).
>>>
>>> Best wishes
>>>
>>> Julian
>>>
>>>
>>>> What if two Bioc packages require different versions of the ‘same’ CRAN
>>>> package?
>>>> AfaIu, the infrastructure is not designed to deal with multiple versions
>>>> of a package.
>>>>
>>>> Nor would I as a user expect to have less-than-the-most recent versions
>>>> of CRAN packages in my library just because some other package says so…
>>>>
>>>> Just to throw in another, and probably silly suggestion: the Bioconductor
>>>> repository could keep ‘snapshots’ of CRAN packages compatible with each
>>>> release, but they would have to be name-mangled in some way. The potential
>>>> for confusion is enormous.
>>>>
>>>
>>> _______________________________________________
>>> [email protected] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>
>>
>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>