Re: Reproducibility
On Fri, 30 Apr 2010, Antonio Paiva wrote:
> http://www.vistrails.org/index.php/Downloads. I remember that the
> software keeps track of the libraries, OS, and CPU that the code is
> using to get the results.
>
> Best,
> António Rafael C. Paiva
> Post-doctoral fellow
> SCI Institute, University of Utah
> Salt Lake City, UT

Ha -- also the land of SCIRun pipes and toys:
http://www.sci.utah.edu/cibc/software.html

Since we are sharing links, here is, IMHO, another very relevant approach to assuring reproducibility within the research itself:

http://neuralensemble.org/trac/sumatra/wiki

Sumatra is a tool for managing and tracking projects based on numerical simulation or analysis, with the aim of supporting reproducible research. It can be thought of as an automated electronic lab notebook for simulation/analysis projects.

Yaroslav Halchenko
(yoh@|www.)onerussian.com
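The "electronic lab notebook" idea -- recording the environment alongside every result -- can be approximated even without a dedicated tool. A minimal sketch of that idea in Python (this is not Sumatra's or VisTrails' actual API; the function names and log file are purely illustrative):

    # Rough sketch of provenance capture, the kind of thing tools like
    # Sumatra or VisTrails automate; names and file layout are made up here.
    import json
    import platform
    import subprocess
    import sys
    import time


    def capture_environment():
        """Record OS, CPU, Python, and installed Debian package versions."""
        pkgs = subprocess.check_output(
            ["dpkg-query", "-W", "-f=${Package} ${Version}\\n"])
        return {
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "os": platform.platform(),
            "cpu": platform.processor(),
            "python": sys.version,
            "packages": pkgs.decode().splitlines(),
        }


    def run_and_record(analysis, params, logfile="labnotebook.json"):
        """Run one analysis step and append its result plus provenance."""
        entry = {"params": params,
                 "result": analysis(**params),
                 "environment": capture_environment()}
        with open(logfile, "a") as fh:
            fh.write(json.dumps(entry) + "\n")
        return entry["result"]

Routing an analysis through something like run_and_record leaves enough of a trail to know at least which library versions produced a given number.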
Re: Reproducibility
On Fri, 30 Apr 2010, Johan Grönqvist wrote:
> > That is why we have backports.org and neuro.debian.net that offer at
> > least the latest and greatest for 'stable'. But this is still not
> > enough.
> To me (IMHO) that feels like _the_ solution, when combined with the
> debian snapshot service.

Exactly that -- snapshots! But not combined with anything: alternatives are not a solution, since they might be harder to control, IMHO. Consider instead the snapshot.debian.org approach -- if the research system was kept up to a specific date, you can later redeploy exactly the same environment, with consistent versioning, with ease -- probably even simply within a chroot created with debootstrap, in a matter of minutes (limited mostly by the speed of the mirror). The only thing left to take care of would be exactly the confusing part -- alternatives (and possibly a custom system configuration, if it was of any relevance).

N.B. Note for our neuro.debian.net -- we probably should set up such a snapshot service ;-)

The "alternatives" (or "modules" in some other research environments/systems) solution is indeed appealing for deploying heterogeneous systems which aim to satisfy a variety of researchers/projects at once (for example, a university-wide high-performance cluster), if those groups indeed require some custom software not available natively as part of the OS. But I think it just complicates reproducibility -- a complete chroot/virtual machine sounds more appealing if the environment ever needs to be reincarnated.

Yaroslav Halchenko
(yoh@|www.)onerussian.com
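To make that concrete, here is a rough sketch of the kind of record one would need in order to redeploy from snapshot.debian.org later. The archive URL layout follows snapshot.debian.org's documented scheme, but the suite name, file names, and the debootstrap invocation in the comment are only illustrative assumptions:

    # Sketch only: record enough to rebuild the environment from
    # snapshot.debian.org later. The 'lenny' suite and file names are
    # assumptions for illustration.
    import subprocess
    import time

    SNAPSHOT = "http://snapshot.debian.org/archive/debian/%s/"


    def freeze(manifest="packages.txt", sources="sources.list.snapshot",
               suite="lenny"):
        """Dump installed package versions and a matching snapshot source."""
        stamp = time.strftime("%Y%m%dT000000Z", time.gmtime())
        pkgs = subprocess.check_output(
            ["dpkg-query", "-W", "-f=${Package} ${Version}\\n"])
        with open(manifest, "w") as fh:
            fh.write(pkgs.decode())
        with open(sources, "w") as fh:
            fh.write("deb %s %s main\n" % (SNAPSHOT % stamp, suite))
        # Later, roughly the same environment could be recreated in a
        # chroot with something like:
        #   debootstrap lenny /srv/chroot-20100430 \
        #       http://snapshot.debian.org/archive/debian/20100430T000000Z/


    if __name__ == "__main__":
        freeze()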
Re: Reproducibility
2010-04-30 16:29, Michael Hanke wrote:
> Usually we have some version in stable and some people will use it.
> [...]
> In Debian we have the universal operating system that incorporates all
> software and 'stable' is a snapshot of everything at the time of release
> -- and this is not what scientists want. That is why we have backports.org
> and neuro.debian.net that offer at least the latest and greatest for
> 'stable'. But this is still not enough.

This is why I like the approach of GoboLinux, at least in theory. As I understand the basic idea of GoboLinux, every package follows a system like the Debian alternatives system, where the alternatives are the different versions of that package. Upgrading therefore does not have to remove old versions, but can just install the new version and update the symlink; removing the old package can then be a separate procedure.

To me (IMHO) that feels like _the_ solution, when combined with the Debian snapshot service. I imagine maintenance would not be a problem. I would be happy to have only the same level of support as we have now, but with the eternal _availability_ of all packages. Perhaps other users have other requirements.

Of course this also requires more from the alternatives-handling software, in order to handle versioned dependencies when switching the alternatives system between different versions, but I expect that to be doable using tools similar to those that exist now.

Regards

Johan
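A toy illustration of that versioned-alternatives idea -- nothing GoboLinux-specific, just the symlink mechanics, with an invented directory layout:

    # Toy version switching via symlinks, in the spirit of GoboLinux or
    # Debian alternatives: versions are installed side by side and a
    # 'current' link points at the active one. Paths are illustrative.
    import os


    def activate(prefix, package, version):
        """Point <prefix>/<package>/current at the requested version."""
        target = os.path.join(prefix, package, version)
        if not os.path.isdir(target):
            raise ValueError("version %s of %s is not installed"
                             % (version, package))
        link = os.path.join(prefix, package, "current")
        tmp = link + ".new"
        os.symlink(version, tmp)   # relative link inside the package dir
        os.rename(tmp, link)       # atomic switch; old versions stay put


    # Example: activate("/opt/science", "fsl", "4.1.4")

Upgrading then never destroys an old version; rolling back is just pointing the link at a different directory.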
Re: Reproducibility
Those of you interested in reproducibility might be interested in VisTrails. There is a start-up commercializing the software, but most of it is free and development is open source, available from http://www.vistrails.org/index.php/Downloads. I remember that the software keeps track of the libraries, OS, and CPU that the code is using to get the results.

Best,

António Rafael C. Paiva
Post-doctoral fellow
SCI Institute, University of Utah
Salt Lake City, UT

On Fri, Apr 30, 2010 at 8:51 AM, Brett Viren wrote:
> Teemu Ikonen writes:
> > Does anyone here have good ideas on how to ensure reproducibility in
> > the long term?
>
> Regression testing, as mentioned, or running some fixed analysis and
> statistically comparing the results to past runs.
[...]
Re: Reproducibility
On Fri, 2010-04-30 at 14:18 +0200, Andreas Tille wrote:
> I can confirm that this is actually the reason why at Sanger Institute
> (even if there are three DDs working) plain Debian (and specifically the
> Debian Med packages) is not used.

FYI, I uploaded a new version of the Med packages on Monday which I believe passes all of the tests (at least André Espaze got them all to pass when he tried the package a while ago).

-Adam

--
GPG fingerprint: D54D 1AEE B11C CE9B A02B C5DD 526F 01E8 564E E4B6
Engineering consulting with open source tools
http://www.opennovation.com/
Re: Reproducibility
Teemu Ikonen writes:
> Does anyone here have good ideas on how to ensure reproducibility in
> the long term?

Regression testing, as mentioned, or running some fixed analysis and statistically comparing the results to past runs.

We worry about reproducibility in my field of particle physics. We run on many different Linux and Mac platforms and strive for statistical consistency (see below), not identical consistency. I don't recall there ever being an issue with different versions of, say, Debian system libraries. Any inconsistencies we have found have been due to version shear in different copies of our own codes.

[Aside: I have seen gross differences between Debian and RH-derived platforms. In a past experiment I was the only collaborator working on Debian and almost everyone else was using Scientific Linux (an RHEL derivative). I kept getting bitten by our code crashing on me. It seems, for some reason, my compilations tended to put garbage in uninitialized pointers where on SL they tended to get NULL. So, I was the lucky one to find and fix a lot of programming mistakes. This could have just been a fluke; I have no explanation for it.]

> The only thing that comes to my mind is to run all
> important calculations in a virtual machine image which is then signed
> and stored in case the results need verification. But, maybe there are
> other options?

We have found that running the exact same code and the same Debian OS on differing CPUs will lead to differing results. They differ because the IEEE FP "standard" isn't implemented exactly the same on all CPUs. The results will differ in only the least significant digits. But, if you use simulations that consume random numbers and compare them against FP values, this can lead to more gross divergences. However, with a large enough sample the results are all statistically consistent.

I don't know how that translates when using virtual machines on different host CPUs, but if you care about bit-for-bit identity, this FP "standard" may percolate up through the VM and ruin that. Anyway, in the end, all CPUs give the "wrong" results since FP calculations are not infinitely precise, so striving for bit-for-bit consistency is kind of a pipe dream.

-Brett.
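The "statistically consistent rather than bit-for-bit identical" check Brett describes is easy to express in code. A minimal sketch -- the tolerances, the two-sample KS test, and the result file names are illustrative choices, not anything a particular collaboration actually uses:

    # Compare a new run against stored reference results: statistical
    # consistency rather than bit-for-bit equality. Tolerances, the KS
    # test, and the file names are illustrative only.
    import numpy as np
    from scipy import stats


    def bitwise_identical(new, ref):
        return np.array_equal(new, ref)   # rarely holds across CPUs


    def numerically_consistent(new, ref, rtol=1e-10, atol=1e-12):
        """Agree to within expected FP rounding differences."""
        return np.allclose(new, ref, rtol=rtol, atol=atol)


    def statistically_consistent(new_sample, ref_sample, alpha=0.01):
        """For Monte Carlo outputs: drawn from the same distribution?"""
        _stat, pvalue = stats.ks_2samp(new_sample, ref_sample)
        return pvalue > alpha


    if __name__ == "__main__":
        ref = np.loadtxt("reference_results.txt")   # hypothetical stored run
        new = np.loadtxt("current_results.txt")
        print("bitwise:", bitwise_identical(new, ref))
        print("numeric:", numerically_consistent(new, ref))
        print("statistical:", statistically_consistent(new, ref))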
Re: Reproducibility
On Fri, Apr 30, 2010 at 03:23:42PM +0200, Andreas Tille wrote:
> On Fri, Apr 30, 2010 at 09:30:16AM -0300, David Bremner wrote:
> > > Yes, that's the problem.
> >
> > For stable releases though, we have the time, and we can (I suspect) get
> > the compute cycles to run heavy regression tests. Would that be a
> > worthwhile project?
>
> Well, it is not me who raised this problem, so I do not feel really
> able to give a definite answer. But as I understood them, the scientists
> at Sanger do not really care about stable Debian. They care about a
> really specific version of a specific piece of software. Perhaps the
> version in stable is too old -- or it might even be too new (if they
> want to reproduce old results). I really doubt that these people, who
> are used to sticking to such a version, will care about Debian's
> regression tests if they have the chance to simply install their own
> version.

Right. Usually we have some version in stable and some people will use it. In general, however, people want the stable 'operating system' and _in addition_ a multitude of versions of their critical applications. In Debian we have the universal operating system that incorporates all software, and 'stable' is a snapshot of everything at the time of release -- and this is not what scientists want. That is why we have backports.org and neuro.debian.net that offer at least the latest and greatest for 'stable'. But this is still not enough.

Ideally, I would keep maintaining each and every package version for an indefinite period of time -- that should make everybody happy, but I'm clearly not going to do this unless my day gets an additional 24 hours ;-)

BUT: I believe people would be a lot happier to keep upgrading to the latest versions IF there were a standardized, upstream-supported method to perform some reasonable tests. This is a topic that Yarik stripped from his talk, but it is still very interesting to talk about: we, as Debian, could do a lot to help upstream projects deploy their software in a more sane way, by offering concrete guidelines and facilities to do what is necessary to ensure proper behavior.

IMHO, sticking to old versions is a reality in the science community -- but it is a problem that should be solved, not supported.

Michael

--
GPG key: 1024D/3144BE0F
Michael Hanke
http://mih.voxindeserto.de
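One hypothetical shape such a standardized, upstream-supported test method could take is a post-install self-test entry point that reruns a quick analysis against reference values shipped with the package. Everything below -- the package name, the reference file, the tolerance -- is invented for illustration:

    # Hypothetical post-install self-test for a scientific package,
    # runnable by a user as:  python -m mypkg.selftest
    # Package name, reference file, and tolerance are invented here.
    import json
    import unittest

    import numpy as np


    def toy_analysis(data):
        """Stand-in for the package's real pipeline."""
        return {"mean": float(np.mean(data)), "std": float(np.std(data))}


    class SelfTest(unittest.TestCase):
        def test_reference_dataset(self):
            # Reference inputs/outputs shipped with the package, produced
            # once on a known-good system.
            with open("reference.json") as fh:
                ref = json.load(fh)
            result = toy_analysis(np.asarray(ref["input"]))
            for key, expected in ref["output"].items():
                self.assertAlmostEqual(result[key], expected, places=6)


    if __name__ == "__main__":
        unittest.main()

If upstream shipped something in this spirit, a user (or a distribution QA service) could run it after any library upgrade instead of pinning versions forever.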
Re: Reproducibility
On Fri, Apr 30, 2010 at 09:30:16AM -0300, David Bremner wrote:
> > Yes, that's the problem.
>
> For stable releases though, we have the time, and we can (I suspect) get
> the compute cycles to run heavy regression tests. Would that be a
> worthwhile project?

Well, it is not me who raised this problem, so I do not feel really able to give a definite answer. But as I understood them, the scientists at Sanger do not really care about stable Debian. They care about a really specific version of a specific piece of software. Perhaps the version in stable is too old -- or it might even be too new (if they want to reproduce old results). I really doubt that these people, who are used to sticking to such a version, will care about Debian's regression tests if they have the chance to simply install their own version.

So for the application at Sanger I mentioned, it is probably wasted energy / manpower. For other cases it might make sense, yes.

Kind regards

Andreas.

--
http://fam-tille.de
Re: Reproducibility
On Fri, Apr 30, 2010 at 07:07:21AM -0400, Michael Hanke wrote:
> > This nice abstract inspired me to think about reproducibility of
> > program runs. If one runs e.g. Debian unstable, the OS code which can
> > potentially affect the results of calculations can change almost
> > daily. Reproducing results later can be close to impossible unless
> > the versions of all the related libraries etc. are written down somewhere.
>
> This is not just a potential problem -- we have seen it happen already.
> Part of the problem is that in Debian we prefer dynamic linking to
> up-to-date shared libs from separate packages -- instead of statically
> linking to ancient versions with known behavior (for good reasons,
> of course).

I can confirm that this is actually the reason why at the Sanger Institute (even though there are three DDs working there) plain Debian (and specifically the Debian Med packages) is not used. The requirement of the scientists is to stick to a very specific version of the packages (not necessarily those which are part of a stable Debian release), and some labs use different versions than other labs.

> IMHO, better than relying on a snapshot of the OS and a particular
> software state to get constant results, projects should have
> comprehensive regression tests that ensure proper behavior.

In theory this is probably right, but in practice it needs extra manpower, which I doubt will be spent on problems like this.

> The problem, however, is
> that we cannot run them during package build time, since they tend to
> require large datasets and run for many hours. Therefore users need to
> do that, but nobody does it.

Yes, that's the problem.

Kind regards

Andreas.

--
http://fam-tille.de
Re: Reproducibility
Hi,

On Fri, Apr 30, 2010 at 10:01:23AM +0200, Teemu Ikonen wrote:
> On Fri, Apr 30, 2010 at 2:08 AM, Michael Hanke wrote:
> > Debian: The ultimate platform for neuroimaging research
> [...]
> > However, it is hard to blame the respective developers, because the
> > sheer number of existing combinations of operating systems, hardware,
> > and library versions makes it almost impossible to verify that a
> > particular software is working as intended. Restricting the
> > ``supported'' runtime environment is one approach of making
> > verification efforts feasible.
>
> Dear list,
>
> This nice abstract inspired me to think about reproducibility of
> program runs. If one runs e.g. Debian unstable, the OS code which can
> potentially affect the results of calculations can change almost
> daily. Reproducing results later can be close to impossible unless
> the versions of all the related libraries etc. are written down somewhere.

This is not just a potential problem -- we have seen it happen already. Part of the problem is that in Debian we prefer dynamic linking to up-to-date shared libs from separate packages -- instead of statically linking to ancient versions with known behavior (for good reasons, of course).

> Does anyone here have good ideas on how to ensure reproducibility in
> the long term? The only thing that comes to my mind is to run all
> important calculations in a virtual machine image which is then signed
> and stored in case the results need verification. But maybe there are
> other options?

IMHO, better than relying on a snapshot of the OS and a particular software state to get constant results, projects should have comprehensive regression tests that ensure proper behavior. The problem, however, is that we cannot run them during package build time, since they tend to require large datasets and run for many hours. Therefore users need to do that, but nobody does it.

Michael

--
GPG key: 1024D/3144BE0F
Michael Hanke
http://mih.voxindeserto.de
Reproducibility
On Fri, Apr 30, 2010 at 2:08 AM, Michael Hanke wrote:
> Debian: The ultimate platform for neuroimaging research
[...]
> However, it is hard to blame the respective developers, because the
> sheer number of existing combinations of operating systems, hardware,
> and library versions makes it almost impossible to verify that a
> particular software is working as intended. Restricting the
> ``supported'' runtime environment is one approach of making
> verification efforts feasible.

Dear list,

This nice abstract inspired me to think about reproducibility of program runs. If one runs e.g. Debian unstable, the OS code which can potentially affect the results of calculations can change almost daily. Reproducing results later can be close to impossible unless the versions of all the related libraries etc. are written down somewhere.

Does anyone here have good ideas on how to ensure reproducibility in the long term? The only thing that comes to my mind is to run all important calculations in a virtual machine image which is then signed and stored in case the results need verification. But maybe there are other options?

Best,

Teemu
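The "signed and stored" part of that suggestion is mechanically simple. A minimal sketch of recording a fingerprint of the VM image next to the results so they can be verified later -- the file names are placeholders, and a detached GPG signature of the manifest would normally be made on top of this:

    # Record a fingerprint of the analysis VM image next to the results,
    # so the image used for a publication can be verified later.
    # File names are placeholders; a GPG signature of the manifest would
    # normally be added on top of this.
    import hashlib
    import json
    import time


    def sha256sum(path, chunk=1 << 20):
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            for block in iter(lambda: fh.read(chunk), b""):
                digest.update(block)
        return digest.hexdigest()


    def write_manifest(image="analysis-vm.img", results="results.tar.gz",
                       manifest="manifest.json"):
        record = {
            "created": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "vm_image": {"file": image, "sha256": sha256sum(image)},
            "results": {"file": results, "sha256": sha256sum(results)},
        }
        with open(manifest, "w") as fh:
            json.dump(record, fh, indent=2)


    if __name__ == "__main__":
        write_manifest()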