Re: [galaxy-dev] Improved DRMAAJobRunner (Peter van Heusden)
n the bug report –

> >>> ustacks parameters selected:
> >>>   Sample ID: 1
> >>>   Min depth of coverage to create a stack: 3
> >>>   Max distance allowed between stacks: 2
> >>>   Max distance allowed to align secondary reads: 4
> >>>   Max number of stacks allowed per de novo locus: 3
> >>>   Deleveraging algorithm: disabled
> >>>   Removal algorithm: enabled
> >>>   Model type: SNP
> >>>   Alpha significance level for model: 0.05
> >>>   Gapped alignments: disabled
> >>> Parsing stacks_inputs/ZU-6.fq
> >>> Loading RAD-Tags...done
> >>> Loaded 933120 RAD-Tags.
> >>>   Inserted 256463 elements into the RAD-Tags hash map.
> >>>   0 reads contained uncalled nucleotides that were modified.
> >>> 65509 initial stacks were populated; 190954 stacks were set aside as secondary reads.
> >>> Initial coverage mean: 10.9276; Std Dev: 25.3864; Max: 4650
> >>> Deleveraging trigger: 36; Removal trigger: 62
> >>> Calculating distance for removing repetitive stacks.
> >>>   Distance allowed between stacks: 1; searching with a k-mer length of 35 (36 k-mers per read); 1 k-mer hits required.
> >>> Removing repetitive stacks.
> >>>   Removed 993 stacks.
> >>>   64677 stacks remain for merging.
> >>> Post-Repeat Removal, coverage depth Mean: 10.2655; Std Dev: 9.03472; Max: 62
> >>> Calculating distance between stacks...
> >>>   Distance allowed between stacks: 2; searching with a k-mer length of 23 (48 k-mers per read); 2 k-mer hits required.
> >>> Merging stacks, maximum allowed distance: 2 nucleotide(s)
> >>>   64677 stacks merged into 57327 loci; deleveraged 0 loci; blacklisted 189 loci.
> >>> After merging, coverage depth Mean: 11.0747; Std Dev: 9.82823; Max: 91
> >>> Merging remainder radtags
> >>>   217265 remainder sequences left to merge.
> >>>   Distance allowed between stacks: 4; searching with a k-mer length of 13 (58 k-mers per read); 6 k-mer hits required.
> >>>   Matched 26211 remainder reads; unable to match 191054 remainder reads.
> >>> After remainders merged, coverage depth Mean: 11.5347; Std Dev: 9.95581; Max: 93
> >>> Calling final consensus sequences, invoking SNP-calling model...
> >>> Number of utilized reads: 742066
> >>> Writing loci, SNPs, and alleles to 'stacks_outputs/'...
> >>>   Refetching sequencing IDs from stacks_inputs/ZU-6.fq... read 933120 sequence IDs.
> >>> done.
> >>> ustacks is done.
> >>>
> >>> I am not sure what is causing the bad interpreter: No such file or
> >>> directory. The tools are correctly installed along with its
> >>> requisite conda packages.
> >>>
> >>> Any help would be greatly appreciated.
> >>>
> >>> Best Regards,
> >>>
> >>> David
> >>>
> >>> ___
> >>> Please keep all replies on the list by using "reply all"
> >>> in your mail client. To manage your subscriptions to this
> >>> and other Galaxy lists, please use the interface at:
> >>>   https://lists.galaxyproject.org/
Re: [galaxy-dev] Improved DRMAAJobRunner (Peter van Heusden)

--
Message: 3
Date: Sat, 08 Jul 2017 11:56:50 +0200
From: Matthias Bernt
To: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] Improved DRMAAJobRunner (Peter van Heusden)
Message-ID:
Content-Type: text/plain; charset="utf-8"

Hi Peter,

> Code looks interesting. This is for Univa Grid Engine?

In its current state it's tailored to the Univa Grid Engine. But the drmaa library code that is based on job_info() and wait() should run on all grid engines.
For the command-line-based code, changes might be necessary for the different grid engines. (Currently there is one small bug: wait() is called twice for finished jobs, once in the repeated polling and once in the final check. The second call won't work, which causes a fallback to qacct. I will fix this soon.)
I guess that the output of qstat and qacct will be different. But I think this can be configured; one just needs a way to find out which grid engine is running.

> In terms of the upload jobs, are those not designed to be run as 'local'
> jobs and not with the 'real user' setting?

Sounds reasonable, but this is not what is happening on my installation of Galaxy. Any idea where I could start looking for the problem?

Best,
Matthias

--
Message: 4
Date: Sat, 08 Jul 2017 13:52:39 +
From: Peter van Heusden
To: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] Improved DRMAAJobRunner (Peter van Heusden)
Message-ID:
Content-Type: text/plain; charset="utf-8"

On Sat, 8 Jul 2017 at 11:56 Matthias Bernt wrote:

> Hi Peter,
>
> > Code looks interesting. This is for Univa Grid Engine?
>
> In its current state it's tailored to the Univa Grid Engine. But the drmaa
> library code that is based on job_info() and wait() should run on all grid
> engines.
> For the command-line-based code, changes might be necessary for the
> different grid engines. (Currently there is one small bug: wait() is called
> twice for finished jobs, once in the repeated polling and once in the final
> check. The second call won't work, which causes a fallback to qacct. I will
> fix this soon.)
> I guess that the output of qstat and qacct will be different.
Re: [galaxy-dev] Improved DRMAAJobRunner (Peter van Heusden)
On Sat, 8 Jul 2017 at 11:56 Matthias Bernt wrote:

> Hi Peter,
>
> > Code looks interesting. This is for Univa Grid Engine?
>
> In its current state it's tailored to the Univa Grid Engine. But the drmaa
> library code that is based on job_info() and wait() should run on all grid
> engines.
> For the command-line-based code, changes might be necessary for the
> different grid engines. (Currently there is one small bug: wait() is called
> twice for finished jobs, once in the repeated polling and once in the final
> check. The second call won't work, which causes a fallback to qacct. I will
> fix this soon.)
> I guess that the output of qstat and qacct will be different. But I think
> this can be configured; one just needs a way to find out which grid engine
> is running.
>
> > In terms of the upload jobs, are those not designed to be run as 'local'
> > jobs and not with the 'real user' setting?
>
> Sounds reasonable, but this is not what is happening on my installation of
> Galaxy. Any idea where I could start looking for the problem?

I'm not an expert on this, but what does your job_conf.xml look like?

Peter

> Best,
> Matthias
Re: [galaxy-dev] Improved DRMAAJobRunner (Peter van Heusden)
Hi Peter,

> Code looks interesting. This is for Univa Grid Engine?

In its current state it's tailored to the Univa Grid Engine. But the drmaa library code that is based on job_info() and wait() should run on all grid engines.
For the command-line-based code, changes might be necessary for the different grid engines. (Currently there is one small bug: wait() is called twice for finished jobs, once in the repeated polling and once in the final check. The second call won't work, which causes a fallback to qacct. I will fix this soon.)
I guess that the output of qstat and qacct will be different. But I think this can be configured; one just needs a way to find out which grid engine is running.

> In terms of the upload jobs, are those not designed to be run as 'local' jobs
> and not with the 'real user' setting?

Sounds reasonable, but this is not what is happening on my installation of Galaxy. Any idea where I could start looking for the problem?

Best,
Matthias
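One way to avoid a second wait() call on an already finished job is to remember the first result per job. The following is only a minimal sketch of that idea, not the code from the pull request: FinishedJobCache and the two callables passed to it are hypothetical names, standing in for a wrapper around the drmaa library's wait() and a wrapper around the qacct fallback.

```python
from typing import Callable, Dict


class FinishedJobCache:
    """Cache the accounting record of each finished job (hypothetical helper)."""

    def __init__(self, wait: Callable[[str], dict], fallback: Callable[[str], dict]):
        self._wait = wait          # assumed: wraps the DRMAA wait() call
        self._fallback = fallback  # assumed: wraps a qacct-based lookup
        self._results: Dict[str, dict] = {}

    def result(self, job_id: str) -> dict:
        # Return the stored record if this job was already reaped once.
        if job_id in self._results:
            return self._results[job_id]
        try:
            info = self._wait(job_id)
        except Exception:
            # Only reached if the first wait() itself fails.
            info = self._fallback(job_id)
        self._results[job_id] = info
        return info
```

With such a cache, both the repeated polling and the final check could call result(job_id), so the fallback would be used only when the first wait() genuinely fails.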
Re: [galaxy-dev] Improved DRMAAJobRunner
Code looks interesting. This is for Univa Grid Engine?

In terms of the upload jobs, are those not designed to be run as 'local' jobs and not with the 'real user' setting?

On Fri, 7 Jul 2017 at 17:32 Matthias Bernt wrote:

> Dear list,
>
> I have implemented a new DRMAA job runner that can detect runtime and
> memory violations. This is done by using the drmaa library functions
> job_info + wait or, as a fallback, the command-line tools qstat + qacct.
> This has the additional advantage that the runner also works in setups
> that use the external runner scripts to start jobs as the real user (the
> original DRMAAJobRunner cannot query these jobs at all).
>
> You may have a look here:
>
> https://github.com/galaxyproject/galaxy/pull/4275
>
> I have successfully tested both settings: either the galaxy user or the
> real user submits the jobs. Resubmission in case of memory and time
> violations also seems to work.
>
> It would be great to get some comments.
>
> One problem that I encountered is that upload jobs do not work in the
> real user setting (for both the original and the new runner): the
> permissions of the uploaded file in
> /gpfs1/data/galaxy_server/galaxy-dev/database/tmp/ are not changed. Any
> idea what needs to be changed to get this running?
>
> Best,
> Matthias
>
> --
>
> ---
> Matthias Bernt
> Bioinformatics Service
> Molekulare Systembiologie (MOLSYB)
> Helmholtz-Zentrum für Umweltforschung GmbH - UFZ/
> Helmholtz Centre for Environmental Research GmbH - UFZ
> Permoserstraße 15, 04318 Leipzig, Germany
> Phone +49 341 235 482296,
> m.be...@ufz.de, www.ufz.de
>
> Sitz der Gesellschaft/Registered Office: Leipzig
> Registergericht/Registration Office: Amtsgericht Leipzig
> Handelsregister Nr./Trade Register Nr.: B 4703
> Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board:
> MinDirig Wilfried Kraus
> Wissenschaftlicher Geschäftsführer/Scientific Managing Director:
> Prof. Dr. Dr. h.c. Georg Teutsch
> Administrative Geschäftsführerin/Administrative Managing Director:
> Prof. Dr. Heike Graßmann
> ---
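To illustrate the qacct fallback mentioned in the announcement, here is a rough sketch of how "key value" accounting output could be turned into a dictionary and checked against a runtime limit. It assumes the whitespace-separated layout that Grid Engine's qacct -j typically prints and a plain numeric ru_wallclock field; other grid engines or versions may format this differently. The function names are hypothetical and not taken from the pull request.

```python
def parse_accounting(text: str) -> dict:
    """Parse 'key   value' lines; separator or malformed lines are skipped."""
    record = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            continue  # e.g. a '=====' separator line or an empty line
        record[parts[0]] = parts[1].strip()
    return record


def exceeded_runtime(record: dict, limit_seconds: float) -> bool:
    """Assumes ru_wallclock holds the wall-clock time in seconds."""
    return float(record.get("ru_wallclock", 0)) > limit_seconds
```

A memory check could be built the same way from the maxvmem field, although its unit suffix would need to be parsed first.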