Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-07-18 Thread Geert Vandeweyer

Hi,

In case you'd be interested: we use a script that interlaces paired-end 
data.  We run it outside Galaxy on whole groups of samples, organized in 
directories, at once. We then import the interlaced data into Galaxy, 
which enables the batch workflow. The first step of the workflow 
deinterlaces the data files.
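
For readers who want to see the idea, here is a minimal sketch of interlacing and deinterlacing, not the script linked below. It assumes plain four-line FASTQ records and the same read order in both files; the function names are made up for this illustration.

```python
import itertools
from typing import Iterable, Iterator, List, Tuple


def read_fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four lines (trailing newlines removed)."""
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield record
            record = []
    if record:
        raise ValueError("truncated FASTQ record")


def interlace(forward: Iterable[str], reverse: Iterable[str]) -> Iterator[str]:
    """Alternate records: forward read 1, reverse read 1, forward read 2, ..."""
    fwd = read_fastq_records(forward)
    rev = read_fastq_records(reverse)
    for f, r in itertools.zip_longest(fwd, rev):
        if f is None or r is None:
            raise ValueError("forward and reverse inputs differ in length")
        yield from f
        yield from r


def deinterlace(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split interlaced lines back into (forward_lines, reverse_lines)."""
    records = list(read_fastq_records(lines))
    if len(records) % 2:
        raise ValueError("odd number of records in interlaced input")
    forward: List[str] = []
    reverse: List[str] = []
    for index, record in enumerate(records):
        # Even-numbered records are forward reads, odd-numbered are reverse.
        (forward if index % 2 == 0 else reverse).extend(record)
    return forward, reverse
```

Because the interlaced file is a single dataset, it fits the one-input-per-run restriction of the batch feature; the deinterlacing step restores the two read sets inside the workflow.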


The script is available here: 
http://geertvandeweyer.zymichost.com/index.php?page=read&id=27


Best Regards,

Geert



On 07/18/2012 02:38 PM, Sascha Kastens wrote:


Hi Dev-Team,

are you planning or maybe working on an update that makes it possible
to run workflows in batch mode with paired-end data?

Cheers,

Sascha



___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

   http://lists.bx.psu.edu/



--

Geert Vandeweyer, Ph.D.
Department of Medical Genetics
University of Antwerp
Prins Boudewijnlaan 43
2650 Edegem
Belgium
Tel: +32 (0)3 275 97 56
E-mail: geert.vandewe...@ua.ac.be
http://ua.ac.be/cognitivegenetics
http://www.linkedin.com/pub/geert-vandeweyer/26/457/726


Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-07-18 Thread Sascha Kastens
Hi Dev-Team,

are you planning or maybe working on an update that makes it possible
to run workflows in batch mode with paired-end data?

Cheers,

Sascha


Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-07-05 Thread Thon Deboer
But this only works if you have a single dataset (such as a BAM file) for each 
workflow run.
If you have pairs of files (such as paired-end FASTQ files, not an uncommon 
case nowadays :) ) you need to resort to the API, since batch processing from 
the UI does not support paired-end data in Galaxy (yet?). You can run the 
workflow one pair at a time, but you have to choose the FASTQ pairs yourself.

I have written a fairly generic execution engine that I can share. It uses a 
config file to describe the files you need from the library as simple key:value 
pairs, and it can run a paired-end workflow on hundreds of FASTQ files. It's a 
little hacky and requires your FASTQ files to have consistent naming for the 
forward and reverse reads (_R1.fastq & _R2.fastq), but other than that it seems 
to do the job...
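
The naming requirement can be illustrated with a small sketch. This is not the execution engine itself, only one possible way to match forward and reverse files by the _R1.fastq / _R2.fastq suffixes; the function name and the return shape are assumptions made for this example.

```python
from typing import Iterable, List, Tuple

FORWARD_SUFFIX = "_R1.fastq"
REVERSE_SUFFIX = "_R2.fastq"


def pair_fastq_names(names: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (pairs, unpaired) for a collection of file names.

    pairs is a list of (forward_name, reverse_name) tuples sorted by sample
    prefix; unpaired lists every name that has no partner or no known suffix.
    """
    forward = {}
    reverse = {}
    unpaired = []
    for name in names:
        if name.endswith(FORWARD_SUFFIX):
            forward[name[: -len(FORWARD_SUFFIX)]] = name
        elif name.endswith(REVERSE_SUFFIX):
            reverse[name[: -len(REVERSE_SUFFIX)]] = name
        else:
            unpaired.append(name)

    pairs = []
    for sample in sorted(forward.keys() | reverse.keys()):
        if sample in forward and sample in reverse:
            pairs.append((forward[sample], reverse[sample]))
        else:
            # Exactly one of the two mates is present for this sample.
            unpaired.append(forward.get(sample) or reverse[sample])
    return pairs, unpaired
```

Reporting the unpaired names separately makes it easy to spot a missing mate before any workflow is submitted.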

There is, however, a nasty bug in the API: it removes the files from your 
history if you use them through the API (I will post something on that later), 
but it seems to work fine for data in the libraries...

Thon
Regards,

Thon de Boer, Ph.D.
Bioinformatics Guru
+1-650-799-6839
thondeb...@me.com




On Jul 4, 2012, at 12:19 AM, Bernd Jagla wrote:

> Dannon Baker writes:
>
>> Hi Dave,
>>
>> Yes, galaxy's standard run-workflow dialog has a feature where you can
>> select multiple datasets as input for a single "Input Dataset" step.  To do
>> this, click the icon referenced by the tooltip in the screenshot below to
>> select multiple files.  All parameters remain static between executions
>> except for the single input dataset that gets modified for each run, and
>> that only one input dataset can be set to multiple files in this fashion.
>>
>> -Dannon
>
> Dannon,
>
> what if I don't have this icon??? How can I enable this? Where is this
> documented?
>
> Thanks,
>
> Bernd

Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-07-04 Thread Dannon Baker
In your workflows, are you using "Input Dataset" steps?  Galaxy uses these 
steps to know how to map datasets to do special things like this.  If you're 
not currently using them, just open the workflow editor and add input dataset 
steps (it's at the very bottom of the tool list) connected to the tool inputs 
at the highest level of the workflow, and you'll see the multiple dataset 
flagging when you go to run it next time.

-Dannon

On Jul 4, 2012, at 3:19 AM, Bernd Jagla wrote:

> Dannon Baker writes:
>
>> Hi Dave,
>>
>> Yes, galaxy's standard run-workflow dialog has a feature where you can
>> select multiple datasets as input for a single "Input Dataset" step.  To do
>> this, click the icon referenced by the tooltip in the screenshot below to
>> select multiple files.  All parameters remain static between executions
>> except for the single input dataset that gets modified for each run, and
>> that only one input dataset can be set to multiple files in this fashion.
>>
>> -Dannon
>
> Dannon,
>
> what if I don't have this icon??? How can I enable this? Where is this
> documented?
>
> Thanks,
>
> Bernd


Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-07-04 Thread Bernd Jagla
Dannon Baker writes:

> Hi Dave,
>
> Yes, galaxy's standard run-workflow dialog has a feature where you can
> select multiple datasets as input for a single "Input Dataset" step.  To do
> this, click the icon referenced by the tooltip in the screenshot below to
> select multiple files.  All parameters remain static between executions
> except for the single input dataset that gets modified for each run, and
> that only one input dataset can be set to multiple files in this fashion.
>
> -Dannon

Dannon,

what if I don't have this icon??? How can I enable this? Where is this 
documented?

Thanks,

Bernd




Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-02-07 Thread Dannon Baker
Thanks for the suggestion, I like that!  I'll make the change shortly.

-Dannon

On Feb 7, 2012, at 8:03 AM, Louise-Amélie Schmitt wrote:

> Hello Dannon
> 
> Could it be possible to have the input dataset's display name appended to the 
> new history's name instead of plain numbers when the "Send results in a new 
> history" option is checked?
> 
> This new feature is indeed very useful (thanks a million for it) but the 
> numbered suffixes make it hard to track what new history belongs to which 
> dataset.
> 
> Thanks,
> L-A
> 
> 
> On 06/02/2012 at 23:00, Dannon Baker wrote:
>> This method only works for single inputs at the moment, though eventually 
>> it'd be nice to allow pairing.  Another option for you would be to use the 
>> workflows API, with which you can definitely specify multiple inputs.  See 
>> workflow_execute.py in the scripts/api folder of your galaxy installation 
>> for one method of doing this.
>> 
>> -Dannon
>> 
>> 
>> On Feb 6, 2012, at 4:53 PM, Dave Lin wrote:
>> 
>>> Thank you Dannon. That is helpful.
>>> 
>>> 
>>> What if I need to specify multiple inputs per run (i.e. .csfasta + .qual 
>>> file)?
>>> 
>>> -Dave
>>> 
>>> On Mon, Feb 6, 2012 at 1:27 PM, Dannon Baker  wrote:
>>> Hi Dave,
>>> 
>>> Yes, galaxy's standard run-workflow dialog has a feature where you can 
>>> select multiple datasets as input for a single "Input Dataset" step.  To do 
>>> this, click the icon referenced by the tooltip in the screenshot below to 
>>> select multiple files.  All parameters remain static between executions 
>>> except for the single input dataset that gets modified for each run, and 
>>> that only one input dataset can be set to multiple files in this fashion.
>>> 
>>> -Dannon
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Feb 6, 2012, at 4:18 PM, Dave Lin wrote:
>>> 
>>>> Hi All,
>>>>
>>>> I'm looking to batch process 40 large data sets with the same galaxy
>>>> workflow.
>>>>
>>>> This obviously can be done in a brute-force manual manner.
>>>>
>>>> However, is there a better way to schedule/invoke these jobs in batch
>>>>
>>>> 1) from the UI with a plugin
>>>> 2) command-line
>>>> 3) web-service
>>>>
>>>> Thanks in advance for any pointers.
>>>> Dave


Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-02-07 Thread Louise-Amélie Schmitt

Hello Dannon

Could it be possible to have the input dataset's display name appended 
to the new history's name instead of plain numbers when the "Send 
results in a new history" option is checked?


This new feature is indeed very useful (thanks a million for it) but 
the numbered suffixes make it hard to track what new history belongs to 
which dataset.


Thanks,
L-A


On 06/02/2012 at 23:00, Dannon Baker wrote:

This method only works for single inputs at the moment, though eventually it'd 
be nice to allow pairing.  Another option for you would be to use the workflows 
API, with which you can definitely specify multiple inputs.  See 
workflow_execute.py in the scripts/api folder of your galaxy installation for 
one method of doing this.

-Dannon


On Feb 6, 2012, at 4:53 PM, Dave Lin wrote:


Thank you Dannon. That is helpful.


What if I need to specify multiple inputs per run (i.e. .csfasta + .qual file)?

-Dave

On Mon, Feb 6, 2012 at 1:27 PM, Dannon Baker  wrote:
Hi Dave,

Yes, galaxy's standard run-workflow dialog has a feature where you can select multiple 
datasets as input for a single "Input Dataset" step.  To do this, click the 
icon referenced by the tooltip in the screenshot below to select multiple files.  All 
parameters remain static between executions except for the single input dataset that gets 
modified for each run, and that only one input dataset can be set to multiple files in 
this fashion.

-Dannon







On Feb 6, 2012, at 4:18 PM, Dave Lin wrote:


Hi All,

I'm looking to batch process 40 large data sets with the same galaxy workflow.

This obviously can be done in a brute-force manual manner.

However, is there a better way to schedule/invoke these jobs in batch

1) from the UI with a plugin
2) command-line
3) web-service

Thanks in advance for any pointers.
Dave



Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-02-06 Thread Dannon Baker
This method only works for single inputs at the moment, though eventually it'd 
be nice to allow pairing.  Another option for you would be to use the workflows 
API, with which you can definitely specify multiple inputs.  See 
workflow_execute.py in the scripts/api folder of your galaxy installation for 
one method of doing this.
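
As a rough sketch of what "multiple inputs" means at the API level, the snippet below builds one request body per sample. It assumes the body shape that workflow_execute.py appears to use (the keys workflow_id, history and ds_map, with each ds_map entry holding a src and an id); check the script in your own installation before relying on it. The function names, step ids, dataset ids and the default src value are placeholders, and nothing is sent to a server here.

```python
import json
from typing import Dict, List, Tuple


def build_workflow_payload(workflow_id: str, history_name: str,
                           inputs: Dict[str, Tuple[str, str]]) -> dict:
    """Build one request body.

    inputs maps a workflow input step id to (src, dataset_id). The key names
    below are an assumption based on workflow_execute.py, not a verified API
    reference.
    """
    ds_map = {
        str(step_id): {"src": src, "id": dataset_id}
        for step_id, (src, dataset_id) in inputs.items()
    }
    return {"workflow_id": workflow_id, "history": history_name, "ds_map": ds_map}


def payloads_for_samples(workflow_id: str, first_step: str, second_step: str,
                         samples: Dict[str, Tuple[str, str]],
                         src: str = "ld") -> List[dict]:
    """Create one payload per sample, each with two input datasets.

    samples maps a sample name to (first_dataset_id, second_dataset_id),
    for example a .csfasta dataset and its .qual dataset.
    """
    payloads = []
    for sample, (first_id, second_id) in sorted(samples.items()):
        inputs = {first_step: (src, first_id), second_step: (src, second_id)}
        # Use the sample name as the name of the history that receives results.
        payloads.append(build_workflow_payload(workflow_id, sample, inputs))
    return payloads


if __name__ == "__main__":
    demo = payloads_for_samples(
        "WORKFLOW_ID", "10", "11",
        {"sample_A": ("DATASET_A1", "DATASET_A2")},
    )
    print(json.dumps(demo, indent=2))
```

Each payload would then be submitted to the workflows API the same way workflow_execute.py submits its single request, which turns the 40-dataset case into a simple loop.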

-Dannon


On Feb 6, 2012, at 4:53 PM, Dave Lin wrote:

> Thank you Dannon. That is helpful.
> 
>  
> What if I need to specify multiple inputs per run (i.e. .csfasta + .qual 
> file)?
> 
> -Dave
> 
> On Mon, Feb 6, 2012 at 1:27 PM, Dannon Baker  wrote:
> Hi Dave,
> 
> Yes, galaxy's standard run-workflow dialog has a feature where you can select 
> multiple datasets as input for a single "Input Dataset" step.  To do this, 
> click the icon referenced by the tooltip in the screenshot below to select 
> multiple files.  All parameters remain static between executions except for 
> the single input dataset that gets modified for each run, and that only one 
> input dataset can be set to multiple files in this fashion.
> 
> -Dannon
> 
> 
> 
> 
> 
> 
> 
> On Feb 6, 2012, at 4:18 PM, Dave Lin wrote:
> 
>> Hi All,
>> 
>> I'm looking to batch process 40 large data sets with the same galaxy 
>> workflow.
>> 
>> This obviously can be done in a brute-force manual manner.  
>> 
>> However, is there a better way to schedule/invoke these jobs in batch 
>> 
>> 1) from the UI with a plugin
>> 2) command-line
>> 3) web-service
>> 
>> Thanks in advance for any pointers.
>> Dave
>> 


Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-02-06 Thread Dave Lin
Thank you Dannon. That is helpful.

What if I need to specify multiple inputs per run (e.g. a .csfasta file plus
a .qual file)?

-Dave

On Mon, Feb 6, 2012 at 1:27 PM, Dannon Baker  wrote:

> Hi Dave,
>
> Yes, galaxy's standard run-workflow dialog has a feature where you can
> select multiple datasets as input for a single "Input Dataset" step.  To do
> this, click the icon referenced by the tooltip in the screenshot below to
> select multiple files.  All parameters remain static between executions
> except for the single input dataset that gets modified for each run, and
> that only one input dataset can be set to multiple files in this fashion.
>
> -Dannon
>
>
>
>
>
>
> On Feb 6, 2012, at 4:18 PM, Dave Lin wrote:
>
> Hi All,
>
> I'm looking to batch process 40 large data sets with the same galaxy
> workflow.
>
> This obviously can be done in a brute-force manual manner.
>
> However, is there a better way to schedule/invoke these jobs in batch
>
> 1) from the UI with a plugin
> 2) command-line
> 3) web-service
>
> Thanks in advance for any pointers.
> Dave
>

Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-02-06 Thread Dannon Baker
Hi Dave,

Yes, galaxy's standard run-workflow dialog has a feature where you can
select multiple datasets as input for a single "Input Dataset" step.  To do
this, click the icon referenced by the tooltip in the screenshot below to
select multiple files.  All parameters remain static between executions
except for the single input dataset, which changes for each run.  Note that
only one input dataset can be set to multiple files in this fashion.

-Dannon

On Feb 6, 2012, at 4:18 PM, Dave Lin wrote:

> Hi All,
>
> I'm looking to batch process 40 large data sets with the same galaxy
> workflow.
>
> This obviously can be done in a brute-force manual manner.
>
> However, is there a better way to schedule/invoke these jobs in batch
>
> 1) from the UI with a plugin
> 2) command-line
> 3) web-service
>
> Thanks in advance for any pointers.
> Dave

[galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch

2012-02-06 Thread Dave Lin
Hi All,

I'm looking to batch process 40 large data sets with the same galaxy
workflow.

This obviously can be done in a brute-force manual manner.

However, is there a better way to schedule/invoke these jobs in batch

1) from the UI with a plugin
2) command-line
3) web-service

Thanks in advance for any pointers.
Dave