Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Hi,

In case you're interested: we use a script that interlaces paired-end data. We run it outside Galaxy, on groups of samples organized in directories, all at once. We then import the interlaced data into Galaxy, which makes batch workflows possible. The first step of the workflow deinterlaces the data files.

The script is available here: http://geertvandeweyer.zymichost.com/index.php?page=read&id=27

Best regards,
Geert

On 07/18/2012 02:38 PM, Sascha Kastens wrote:
> Hi Dev-Team,
>
> are you planning or maybe working on an update which enables the
> possibility to run workflows in batch mode with paired end data?
>
> Cheers,
> Sascha

--
Geert Vandeweyer, Ph.D.
Department of Medical Genetics
University of Antwerp
Prins Boudewijnlaan 43
2650 Edegem
Belgium
Tel: +32 (0)3 275 97 56
E-mail: geert.vandewe...@ua.ac.be
http://ua.ac.be/cognitivegenetics
http://www.linkedin.com/pub/geert-vandeweyer/26/457/726

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
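For readers who want to see the idea rather than the linked script: below is a minimal Python sketch of interlacing and deinterlacing, not Geert's actual script. It assumes plain four-line FASTQ records (no wrapped sequence lines) and that both files list the reads in the same order; the function names are made up for illustration.

```python
from itertools import islice, zip_longest


def read_fastq_records(lines):
    """Yield FASTQ records as lists of four lines (assumes unwrapped records)."""
    iterator = iter(lines)
    while True:
        record = list(islice(iterator, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interlace(forward_lines, reverse_lines):
    """Yield lines alternating one forward record, then its reverse mate."""
    pairs = zip_longest(read_fastq_records(forward_lines),
                        read_fastq_records(reverse_lines))
    for forward, reverse in pairs:
        if forward is None or reverse is None:
            raise ValueError("forward and reverse files differ in read count")
        yield from forward
        yield from reverse


def deinterlace(interlaced_lines):
    """Split interlaced lines back into (forward_lines, reverse_lines)."""
    forward, reverse = [], []
    for index, record in enumerate(read_fastq_records(interlaced_lines)):
        (forward if index % 2 == 0 else reverse).extend(record)
    if len(forward) != len(reverse):
        raise ValueError("odd number of records in interlaced input")
    return forward, reverse
```

Because the functions work on any iterable of lines, they can be fed open file handles directly, so a whole directory of samples can be interlaced in a loop before import.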
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Hi Dev-Team,

are you planning or maybe working on an update which enables the possibility to run workflows in batch mode with paired-end data?

Cheers,
Sascha
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
But this only works if you have a single dataset (such as a BAM file) for each workflow run. If you have pairs of files (such as paired-end FASTQ files, not an uncommon workflow nowadays :) ), you need to resort to the API, since batch processing from the UI does not support paired-end sequencing in Galaxy (yet?). You can run the workflow one pair at a time, but you have to choose the FASTQ pairs yourself.

I have written a fairly generic execution engine that I can share. It uses a config file to describe the files you need from the library in simple key:value pairs, and it can run a paired-end workflow on hundreds of FASTQ files. It's a little hacky and requires your FASTQ files to have consistent naming for the forward and reverse reads (_R1.fastq and _R2.fastq), but other than that it seems to do the job.

There is, however, a nasty bug in the API: it removes files from your history if you use them through the API (I will post something on that later). It seems to work fine for data in the libraries.

Regards,
Thon

Thon de Boer, Ph.D.
Bioinformatics Guru
+1-650-799-6839
thondeb...@me.com

On Jul 4, 2012, at 12:19 AM, Bernd Jagla wrote:
> Dannon,
>
> what if I don't have this icon??? How can I enable this? Where is this
> documented?
>
> Thanks,
>
> Bernd
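The naming requirement can be illustrated with a small Python sketch. This is not Thon's engine, which is not shown in this thread; it only demonstrates one way to pair forward and reverse reads by the _R1.fastq / _R2.fastq suffixes mentioned above. The function name and the error handling are assumptions made for the example.

```python
def pair_fastq_names(names, fwd_suffix="_R1.fastq", rev_suffix="_R2.fastq"):
    """Group file names into (sample, forward, reverse) tuples by suffix.

    Names that match neither suffix are ignored; a sample with only one
    mate raises ValueError so that a batch never starts half-paired.
    """
    forward, reverse = {}, {}
    for name in names:
        if name.endswith(fwd_suffix):
            forward[name[:-len(fwd_suffix)]] = name
        elif name.endswith(rev_suffix):
            reverse[name[:-len(rev_suffix)]] = name

    # Samples present on one side only (symmetric difference of the keys).
    unpaired = sorted(set(forward) ^ set(reverse))
    if unpaired:
        raise ValueError("samples missing a mate: " + ", ".join(unpaired))

    return [(sample, forward[sample], reverse[sample])
            for sample in sorted(forward)]
```

Each resulting tuple would then drive one workflow invocation, with the two file names resolved to library dataset IDs before submission.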
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
In your workflows, are you using "Input Dataset" steps? Galaxy uses these steps to know how to map datasets to do special things like this. If you're not currently using them, just open the workflow editor and add Input Dataset steps (they're at the very bottom of the tool list), connected to the tool inputs at the highest level of the workflow. You'll then see the multiple-dataset icon the next time you go to run the workflow.

-Dannon

On Jul 4, 2012, at 3:19 AM, Bernd Jagla wrote:
> Dannon,
>
> what if I don't have this icon??? How can I enable this? Where is this
> documented?
>
> Thanks,
>
> Bernd
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Dannon Baker writes:
> Hi Dave,
>
> Yes, galaxy's standard run-workflow dialog has a feature where you can
> select multiple datasets as input for a single "Input Dataset" step. To do
> this, click the icon referenced by the tooltip in the screenshot below to
> select multiple files. All parameters remain static between executions
> except for the single input dataset that gets modified for each run, and
> that only one input dataset can be set to multiple files in this fashion.
>
> -Dannon

Dannon,

what if I don't have this icon??? How can I enable this? Where is this documented?

Thanks,

Bernd
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Thanks for the suggestion, I like that! I'll make the change shortly.

-Dannon

On Feb 7, 2012, at 8:03 AM, Louise-Amélie Schmitt wrote:
> Hello Dannon
>
> Could it be possible to have the input dataset's display name appended to
> the new history's name instead of plain numbers when the "Send results in a
> new history" option is checked?
>
> This new feature is indeed very useful (thanks a million for it) but the
> numbered suffixes make it hard to track what new history belongs to which
> dataset.
>
> Thanks,
> L-A
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Hello Dannon,

Would it be possible to have the input dataset's display name appended to the new history's name, instead of plain numbers, when the "Send results in a new history" option is checked?

This new feature is indeed very useful (thanks a million for it), but the numbered suffixes make it hard to track which new history belongs to which dataset.

Thanks,
L-A

On 06/02/2012 23:00, Dannon Baker wrote:
> This method only works for single inputs at the moment, though eventually
> it'd be nice to allow pairing. Another option for you would be to use the
> workflows API, with which you can definitely specify multiple inputs. See
> workflow_execute.py in the scripts/api folder of your galaxy installation
> for one method of doing this.
>
> -Dannon
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
This method only works for single inputs at the moment, though eventually it'd be nice to allow pairing. Another option for you would be to use the workflows API, with which you can definitely specify multiple inputs. See workflow_execute.py in the scripts/api folder of your Galaxy installation for one method of doing this.

-Dannon

On Feb 6, 2012, at 4:53 PM, Dave Lin wrote:
> Thank you Dannon. That is helpful.
>
> What if I need to specify multiple inputs per run (i.e. .csfasta + .qual
> file)?
>
> -Dave
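As a rough illustration of what "multiple inputs" means at the API level, here is a Python sketch that only builds the request bodies and sends nothing. It assumes the payload shape used by the workflow_execute.py script of that era (a workflow_id, a history name, and a ds_map from input step to dataset); treat those field names, the "ld" source value and every ID below as assumptions to check against the script in your own installation. The helper names are hypothetical.

```python
import json


def build_payload(workflow_id, history_name, step_inputs, src="ld"):
    """Build one workflow-run request body.

    step_inputs maps an input step ID to a dataset ID. The keys
    'workflow_id', 'history' and 'ds_map' are assumed from
    scripts/api/workflow_execute.py; verify them locally.
    """
    ds_map = {str(step): {"src": src, "id": dataset_id}
              for step, dataset_id in step_inputs.items()}
    return {"workflow_id": workflow_id,
            "history": history_name,
            "ds_map": ds_map}


def batch_payloads(workflow_id, samples, first_step, second_step):
    """One payload per sample; samples are (name, first_id, second_id) tuples."""
    return [build_payload(workflow_id, name,
                          {first_step: first_id, second_step: second_id})
            for name, first_id, second_id in samples]


if __name__ == "__main__":
    # Placeholder IDs only; real ones come from your Galaxy instance.
    samples = [("sample_A", "csfasta_id_A", "qual_id_A"),
               ("sample_B", "csfasta_id_B", "qual_id_B")]
    for payload in batch_payloads("workflow_id_here", samples, "10", "11"):
        print(json.dumps(payload, sort_keys=True))
```

Naming each history after its sample keeps the results traceable when many runs are submitted in a row.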
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Thank you Dannon. That is helpful.

What if I need to specify multiple inputs per run (e.g. a .csfasta file plus a .qual file)?

-Dave

On Mon, Feb 6, 2012 at 1:27 PM, Dannon Baker wrote:
> Hi Dave,
>
> Yes, galaxy's standard run-workflow dialog has a feature where you can
> select multiple datasets as input for a single "Input Dataset" step. To do
> this, click the icon referenced by the tooltip in the screenshot below to
> select multiple files. All parameters remain static between executions
> except for the single input dataset that gets modified for each run, and
> that only one input dataset can be set to multiple files in this fashion.
>
> -Dannon
Re: [galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Hi Dave,

Yes, Galaxy's standard run-workflow dialog has a feature where you can select multiple datasets as input for a single "Input Dataset" step. To do this, click the icon referenced by the tooltip in the screenshot below to select multiple files. All parameters remain static between executions except for the single input dataset, which changes for each run. Note that only one input dataset can be set to multiple files in this fashion.

-Dannon

On Feb 6, 2012, at 4:18 PM, Dave Lin wrote:
> Hi All,
>
> I'm looking to batch process 40 large data sets with the same galaxy
> workflow.
>
> This obviously can be done in a brute-force manual manner.
>
> However, is there a better way to schedule/invoke these jobs in batch
>
> 1) from the UI with a plugin
> 2) command-line
> 3) web-service
>
> Thanks in advance for any pointers.
> Dave
[galaxy-dev] Looking for recommendations: How to run galaxy workflows in batch
Hi All,

I'm looking to batch process 40 large data sets with the same Galaxy workflow.

This obviously can be done in a brute-force manual manner.

However, is there a better way to schedule/invoke these jobs in batch

1) from the UI with a plugin
2) command-line
3) web-service

Thanks in advance for any pointers.
Dave