Hi Jin,

You can use a post-link script in conda.

Here is an example: https://github.com/bioconda/bioconda-recipes/blob/master/recipes/picrust2/post-link.sh

This way the data can be fetched during tool installation.

See more information here: https://docs.conda.io/projects/conda-build/en/latest/resources/link-scripts.html
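
A minimal sketch of such a post-link script is shown below. It is not taken from the picrust2 recipe; the URL, the checksum, the file name and the target directory are hypothetical placeholders, and it assumes that `curl` and `sha256sum` are available at install time (for example as run dependencies). The file would sit next to `meta.yaml` in the recipe as `post-link.sh`. The checksum check is there to address the reproducibility concern raised further down in this thread.

```shell
#!/bin/bash
# post-link.sh: sketch of a conda post-link script that downloads a data file.
# DATA_URL and DATA_SHA256 are hypothetical placeholders, not real values.
set -eu

DATA_URL="https://example.org/mytool_data.tar.gz"   # placeholder, e.g. a Zenodo file link
DATA_SHA256="replace-with-the-real-sha256-checksum" # placeholder

# Return 0 if the SHA-256 digest of file $1 equals the expected hex digest $2.
verify_sha256() {
    local file="$1" expected="$2" actual
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Download URL $1 to path $2 and check the result against digest $3.
fetch_data() {
    local url="$1" dest="$2" expected="$3"
    mkdir -p "$(dirname "$dest")"
    curl -fsSL "$url" -o "$dest" || return 1
    if ! verify_sha256 "$dest" "$expected"; then
        rm -f "$dest"   # do not keep a corrupted or changed file
        return 1
    fi
}

# conda sets PREFIX and PKG_NAME when it runs link scripts; do nothing otherwise.
if [ -n "${PREFIX:-}" ]; then
    dest="$PREFIX/share/${PKG_NAME:-mytool}/mytool_data.tar.gz"
    if ! fetch_data "$DATA_URL" "$dest" "$DATA_SHA256"; then
        # Link scripts should report problems via .messages.txt, not stdout.
        echo "Download of $DATA_URL failed or checksum mismatch." >> "$PREFIX/.messages.txt"
        exit 1
    fi
fi
```

Pinning the checksum means the installation fails loudly if the remote file ever changes, rather than silently using different data.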

Ciao,
Bjoern

On 24.07.19 at 18:43, Jin Li wrote:
Hi Brad,

Thank you for your quick reply. I can put the data file on Zenodo so
that I will have a permanent location for it.

As for re-computing the data file locally, that could take several days
to run, so it would be quite inefficient. I would like the data file to
be downloaded automatically when the package is installed. Is there a
convention for doing that? Thank you.

Best regards,
Jin

On Wed, Jul 24, 2019 at 11:31 AM Langhorst, Brad <langho...@neb.com> wrote:

Hi:

I’d be concerned about that file changing or disappearing and causing
irreproducibility.
If the URL pointed to a permanent location (e.g. NCBI or Zenodo), maybe it’s OK.

Could it be re-computed locally if necessary (like a genome index)?

Maybe others know of examples where this is done.


Brad

On Jul 24, 2019, at 12:24 PM, Jin Li <lijin....@gmail.com> wrote:

Hi all,

I am not sure if this mailing list is a good place to ask a bioconda
question. Sorry to bother you if not. I want to ask how to include a
large data file when publishing a bioconda package. Our program depends
on a pre-computed data file, which is too large to be included in the
source code package. The data file can be accessed via a public URL.
Can I put the download command in `build.sh` when publishing a
bioconda package? If not, is there a convention for handling large
data files that a package depends on? Thank you.

Best regards,
Jin
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  %(web_page_url)s

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/


Bradley W. Langhorst, Ph.D.
Development Group Leader
New England Biolabs


