Re: [galaxy-dev] Number of outputs = number of inputs

2013-07-30 Thread Peter Cock
On Tue, Jul 30, 2013 at 4:29 PM, Shafer, Christina wrote:
> My question is related to this post in the archive. As recommended, I
> referred to
> http://wiki.galaxyproject.org/Admin/Tools/Multiple%20Output%20Files#Number_of_Output_datasets_cannot_be_determined_until_tool_run
> for some guidance on how to achieve multiple output files, specifically one
> set of six files per number of times the  tag is called. The example
> provided isn't that informative, however, as it doesn't show what the
> example_tool.sh command script is doing so I can make my perl wrapper script
> do the same thing. If I have six files total, do I only specify one in the
>  tag and then have the remaining 5 named according to the scheme in
> the perl wrapper? Or is it Galaxy that does the naming of the remaining
> files automatically? Is there a perl equivalent to the "$__new_file_path__"
> variable that I can use, or is this the literal resolved file path (e.g.
> "/opt/galaxy/... etc)?
>
> Right now, upon execution of my tool, if a user has two entries (so number
> of times  has executed is twice), my perl wrapper is called twice
> (good), but on the second call, Galaxy uses the same output filenames as the
> first run such that the first set of output files are all overwritten by the
> second execution (bad). How do I make sure that each  iteration
> results in a unique set of output files?

The normal expectation is that the Galaxy wrapper calls the tool only
ONCE, with a more complex command line that includes the multiple file
names from the repeat.

If I recall correctly from your last email, you are using a trick with
semicolons to embed multiple shell commands in the  tag, one
for each repetition of the  tag.

I suggest you rework your Perl script so that it is called once and
does all the work.
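
A minimal sketch of the "called once" design, written in Python rather
than Perl purely for illustration. The names plan_outputs and run_once,
the setN_ prefix, and the six partN.txt suffixes are all hypothetical;
the only point is that a single invocation receives every input from the
repeat and derives a distinct set of output names for each one, so no set
overwrites another.

```python
import os


def plan_outputs(input_paths, out_dir, suffixes):
    """Pair each input with its own list of output paths, numbered by position."""
    plan = []
    for index, in_path in enumerate(input_paths, start=1):
        # The per-input prefix (set1_, set2_, ...) is what keeps the sets apart.
        outs = [os.path.join(out_dir, "set%d_%s" % (index, suffix)) for suffix in suffixes]
        plan.append((in_path, outs))
    return plan


def run_once(input_paths, out_dir):
    """Handle every input in one call; six placeholder outputs per input."""
    suffixes = ["part%d.txt" % n for n in range(1, 7)]  # hypothetical output names
    written = []
    for in_path, outs in plan_outputs(input_paths, out_dir, suffixes):
        for out_path in outs:
            with open(out_path, "w") as handle:
                # Placeholder content; the real tool's work would go here.
                handle.write("derived from %s\n" % in_path)
            written.append(out_path)
    return written
```

With two entries in the repeat this writes twelve files, set1_part1.txt
through set2_part6.txt, from one call.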

Peter
___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Number of outputs = number of inputs

2013-07-30 Thread Shafer, Christina
My question is related to this post in the archive. As recommended, I referred to 
http://wiki.galaxyproject.org/Admin/Tools/Multiple%20Output%20Files#Number_of_Output_datasets_cannot_be_determined_until_tool_run
 for some guidance on how to achieve multiple output files, specifically one 
set of six files per number of times the  tag is called. The example 
provided isn't that informative, however, as it doesn't show what the 
example_tool.sh command script is doing so I can make my perl wrapper script do 
the same thing. If I have six files total, do I only specify one in the 
 tag and then have the remaining 5 named according to the scheme in 
the perl wrapper? Or is it Galaxy that does the naming of the remaining files 
automatically? Is there a perl equivalent to the "$__new_file_path__" variable 
that I can use, or is this the literal resolved file path (e.g. 
"/opt/galaxy/... etc)?

Right now, upon execution of my tool, if a user has two entries (so number of 
times  has executed is twice), my perl wrapper is called twice (good), 
but on the second call, Galaxy uses the same output filenames as the first run 
such that the first set of output files are all overwritten by the second 
execution (bad). How do I make sure that each  iteration results in a 
unique set of output files?

Thanks for your help!

Christina Shafer, Ph.D
Regenerative Biology Laboratory
Morgridge Institute for Research

Re: [galaxy-dev] Number of outputs = number of inputs

2012-10-17 Thread John Chilton
Like most days, JJ very politely pointed out this morning that I was
wrong. You can have variable numbers of outputs at runtime; see the
last section ("Number of Output datasets cannot be determined until
tool run") of this page:

http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files

Sorry about that.
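
For readers who cannot see what the wiki's example script does, here is a
rough sketch in Python of how such a script might name its extra files. It
assumes, as best I recall from that wiki section, that additional outputs
are picked up when written into the directory Galaxy substitutes for
$__new_file_path__ (a literal path by the time the script runs) with names
of the form primary_<id>_<designation>_visible_<ext>, where <id> is the id
of the one output declared in the tool XML. Please check the pattern
against the page itself; the function name and arguments below are
hypothetical.

```python
import os


def discovered_output_path(new_file_path, first_output_id, designation, ext):
    """Build a path in the assumed 'primary_<id>_<designation>_visible_<ext>' form."""
    if "_" in designation:
        # Underscores separate the fields of the name, so keep them out of the label.
        raise ValueError("designation must not contain underscores")
    name = "primary_%d_%s_visible_%s" % (first_output_id, designation, ext)
    return os.path.join(new_file_path, name)
```

Under that assumption, the script would write its first result to the
declared output path and every further result to a path built this way,
using a different designation for each file.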

-John

On Tue, Oct 16, 2012 at 8:48 AM, John Chilton  wrote:
> I don't believe this is possible in Galaxy right now. Are the outputs
> independent or is information from all inputs used to produce all
> outputs? If they are independent, you can create a workflow containing
> just your tool with 1 input and 1 output and use the batch workflow
> mode to run it on multiple files and get multiple outputs. This is not
> a beautiful solution but it gets the job done in some cases.
>
> Another thing to look at might be the discussion we are having on the
> thread "pass more information on a dataset merge". We have a fork (it's
> all work from Jorrit Boekel) of Galaxy that creates composite
> datatypes for each explicitly defined type that can hold collections
> of a single type.
>
> https://bitbucket.org/galaxyp/galaxy-central-homogeneous-composite-datatypes/compare
>
> This would hopefully let you declare that you can accept a collection
> of whatever your input type is and produce a collection of whatever
> your output is. Lots of downsides to this approach - not fully
> implemented, and not included in Galaxy proper, your outputs would be
> wrapped up in a composite datatype so they wouldn't be easily
> processable by downstream tools. It would be good to have additional
> people hacking on it though :)
>
> -John
>
> 
> John Chilton
> Senior Software Developer
> University of Minnesota Supercomputing Institute
> Office: 612-625-0917
> Cell: 612-226-9223
> Bitbucket: https://bitbucket.org/jmchilton
> Github: https://github.com/jmchilton
> Web: http://jmchilton.net
>
> On Tue, Oct 16, 2012 at 7:13 AM, Sascha Kastens wrote:
>> Hi all!
>>
>> I have a tool which takes one or more input files. For each input file one
>> output is created, i.e. 1 input file -> 1 output file, 2 input files ->
>> 2 output files, etc.
>>
>> What is the best way to handle this? I used the directions for handling
>> multiple output files where the ’Number of Output datasets cannot be
>> determined until tool run’, which in my opinion is a bit inappropriate.
>> BTW: The input files are added via the -Tag, so maybe there is a similar
>> thing for outputs?
>>
>> Thanks in advance!
>>
>> Cheers,
>> Sascha


Re: [galaxy-dev] Number of outputs = number of inputs

2012-10-16 Thread Alex.Khassapov
I tried the galaxy-central-homogeneous-composite-datatypes fork, and it works 
great. I have a similar problem, where the number of output files varies. It 
seems that your approach might work for output files as well (not only input). 
I'm currently trying to work out how to implement it; any help is appreciated.

Alex

-Original Message-
From: galaxy-dev-boun...@lists.bx.psu.edu 
[mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of John Chilton
Sent: Wednesday, 17 October 2012 12:49 AM
To: Sascha Kastens
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Number of outputs = number of inputs

I don't believe this is possible in Galaxy right now. Are the outputs 
independent or is information from all inputs used to produce all outputs? If 
they are independent, you can create a workflow containing just your tool with 
1 input and 1 output and use the batch workflow mode to run it on multiple 
files and get multiple outputs. This is not a beautiful solution but it gets 
the job done in some cases.

Another thing to look at might be the discussion we are having on the thread 
"pass more information on a dataset merge". We have a fork (it's all work from 
Jorrit Boekel) of Galaxy that creates composite datatypes for each explicitly 
defined type that can hold collections of a single type.

https://bitbucket.org/galaxyp/galaxy-central-homogeneous-composite-datatypes/compare

This would hopefully let you declare that you can accept a collection of 
whatever your input type is and produce a collection of whatever your output 
is. Lots of downsides to this approach - not fully implemented, and not 
included in Galaxy proper, your outputs would be wrapped up in a composite 
datatype so they wouldn't be easily processable by downstream tools. It would 
be good to have additional people hacking on it though :)

-John


John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Bitbucket: https://bitbucket.org/jmchilton
Github: https://github.com/jmchilton
Web: http://jmchilton.net

On Tue, Oct 16, 2012 at 7:13 AM, Sascha Kastens wrote:
> Hi all!
>
> I have a tool which takes one or more input files. For each input file one
> output is created, i.e. 1 input file -> 1 output file, 2 input files ->
> 2 output files, etc.
>
> What is the best way to handle this? I used the directions for handling
> multiple output files where the 'Number of Output datasets cannot be
> determined until tool run', which in my opinion is a bit inappropriate.
> BTW: The input files are added via the -Tag, so maybe there is a similar
> thing for outputs?
>
> Thanks in advance!
>
> Cheers,
> Sascha


Re: [galaxy-dev] Number of outputs = number of inputs

2012-10-16 Thread John Chilton
I don't believe this is possible in Galaxy right now. Are the outputs
independent or is information from all inputs used to produce all
outputs? If they are independent, you can create a workflow containing
just your tool with 1 input and 1 output and use the batch workflow
mode to run it on multiple files and get multiple outputs. This is not
a beautiful solution but it gets the job done in some cases.

Another thing to look at might be the discussion we are having on the
thread "pass more information on a dataset merge". We have a fork (it's
all work from Jorrit Boekel) of Galaxy that creates composite
datatypes for each explicitly defined type that can hold collections
of a single type.

https://bitbucket.org/galaxyp/galaxy-central-homogeneous-composite-datatypes/compare

This would hopefully let you declare that you can accept a collection
of whatever your input type is and produce a collection of whatever
your output is. Lots of downsides to this approach - not fully
implemented, and not included in Galaxy proper, your outputs would be
wrapped up in a composite datatype so they wouldn't be easily
processable by downstream tools. It would be good to have additional
people hacking on it though :)

-John


John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Bitbucket: https://bitbucket.org/jmchilton
Github: https://github.com/jmchilton
Web: http://jmchilton.net

On Tue, Oct 16, 2012 at 7:13 AM, Sascha Kastens wrote:
> Hi all!
>
> I have a tool which takes one or more input files. For each input file one
> output is created, i.e. 1 input file -> 1 output file, 2 input files ->
> 2 output files, etc.
>
> What is the best way to handle this? I used the directions for handling
> multiple output files where the ’Number of Output datasets cannot be
> determined until tool run’, which in my opinion is a bit inappropriate.
> BTW: The input files are added via the -Tag, so maybe there is a similar
> thing for outputs?
>
> Thanks in advance!
>
> Cheers,
> Sascha