[galaxy-dev] metadata in parallelization

2015-04-17 Thread Roberto Alonso CIPF
Hello,

I am writing some code to enable parallelization for some tool wrappers.
First, I did it for a simple bwa wrapper, and now I am modifying
toolshed.g2.bx.psu.edu/repos/devteam/bwa/c71dd035971e/bwa/bwa-mem.xml to
check whether the code also works with this wrapper. I wrote the code that
I think is necessary to merge the BAM files, and I added the
parallelism tag to the config file:

<tool id="bwa_mem" name="BWA-MEM" version="0.1">

  <macros>
    <import>bwa_macros.xml</import>
  </macros>

  <requirements>
    <requirement type="package" version="0.7.10.039ea20639">bwa</requirement>
    <requirement type="package" version="1.1">samtools</requirement>
  </requirements>
  <description>- map medium and long reads (&gt; 100 bp) against reference genome</description>
  <parallelism method="multi" split_size="3" shared_inputs="ref_file"
               split_mode="number_of_parts" merge_outputs="bam_output"
               split_inputs="fastq_input1,fastq_input2"></parallelism>

  <command>
...
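
As an illustration of what split_mode=number_of_parts with split_size=3 is
meant to do to each FASTQ input, here is a minimal sketch. It is not Galaxy's
splitter code: the function name split_fastq_records and the chunking rule
are assumptions made only for this example.

```python
from typing import List


def split_fastq_records(lines: List[str], number_of_parts: int) -> List[List[str]]:
    """Split FASTQ lines into at most number_of_parts chunks of whole records.

    Hypothetical helper for illustration only; Galaxy's real splitting logic
    is not reproduced here.
    """
    if number_of_parts < 1:
        raise ValueError("number_of_parts must be at least 1")
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per record")

    # Group the flat list of lines into 4-line records.
    records = [lines[i:i + 4] for i in range(0, len(lines), 4)]

    # Ceiling division, so we never produce more than number_of_parts chunks.
    per_part = max(1, -(-len(records) // number_of_parts))

    parts = []
    for start in range(0, len(records), per_part):
        chunk = records[start:start + per_part]
        parts.append([line for record in chunk for line in record])
    return parts
```

Applying the same function with the same number_of_parts to fastq_input1 and
fastq_input2 keeps mate pairs in matching chunks, as long as both files list
their reads in the same order.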

Everything works well, and the resulting BAM is the same with and without
parallelization, but the Galaxy log shows an error regarding metadata.
It says something like this:

galaxy.jobs.splitters.multi DEBUG 2015-04-17 09:54:58,335 merge finished:
/home/ralonso/galaxy/database/files/000/dataset_198.dat
galaxy.jobs.runners.tasks DEBUG 2015-04-17 09:54:58,473 executing external
set_meta script for job 200: python
/home/ralonso/galaxy/database/tmp/set_metadata_E5fGIE.py
/home/ralonso/galaxy/database/tmp/tmpHS8Byo
/home/ralonso/galaxy/database/job_working_directory/000/200/galaxy.json
/home/ralonso/galaxy/database/tmp/metadata_in_HistoryDatasetAssociation_198_yOGiQG,/home/ralonso/galaxy/database/tmp/metadata_kwds_HistoryDatasetAssociation_198_nAsQoq,/home/ralonso/galaxy/database/tmp/metadata_out_HistoryDatasetAssociation_198_I_cLs4,/home/ralonso/galaxy/database/tmp/metadata_results_HistoryDatasetAssociation_198_qhjzoV,/home/ralonso/galaxy/database/files/000/dataset_198.dat,/home/ralonso/galaxy/database/tmp/metadata_override_HistoryDatasetAssociation_198_ScKLqH
Traceback (most recent call last):
  File "/home/ralonso/galaxy/database/tmp/set_metadata_E5fGIE.py", line 1, in <module>
    from galaxy_ext.metadata.set_metadata import set_metadata; set_metadata()
ImportError: No module named galaxy_ext.metadata.set_metadata
galaxy.jobs.runners.tasks DEBUG 2015-04-17 09:54:58,624 execution of
external set_meta finished for job 200
galaxy.datatypes.metadata DEBUG 2015-04-17 09:54:58,714 setting metadata
externally failed for HistoryDatasetAssociation 198: External set_meta()
not called

Without parallelization there is no problem, because Galaxy does not
execute this part of the code. I see that Galaxy has to do something with
the metadata attributes, but what is it trying to do? Is there any way to
solve this?
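
One check that might help narrow this down is whether a fresh python process
can import the module named in the traceback, with and without an extra
directory on PYTHONPATH. The sketch below is only a diagnostic idea, not a
fix: the helper name can_import is made up, and the directory passed to it
(for example Galaxy's lib directory) is an assumption about where galaxy_ext
lives.

```python
import os
import subprocess
import sys


def can_import(module_name, extra_path=None):
    """Return True if a child python process can import module_name.

    extra_path, if given, is prepended to PYTHONPATH for the child only.
    Hypothetical diagnostic helper; it does not change Galaxy's behaviour.
    """
    env = dict(os.environ)
    if extra_path:
        existing = env.get("PYTHONPATH")
        env["PYTHONPATH"] = (
            extra_path if not existing else extra_path + os.pathsep + existing
        )
    # Run the import in a separate interpreter, discarding its output,
    # and report success through the exit status.
    with open(os.devnull, "w") as devnull:
        status = subprocess.call(
            [sys.executable, "-c", "import " + module_name],
            env=env, stdout=devnull, stderr=devnull
        )
    return status == 0
```

For example, comparing can_import("galaxy_ext.metadata.set_metadata") with
can_import("galaxy_ext.metadata.set_metadata", "/home/ralonso/galaxy/lib")
would show whether the import only fails when that directory (assumed here)
is missing from the path.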

Thank you very much

Regards,


-- 
Roberto Alonso
Functional Genomics Unit
Bioinformatics and Genomics Department
Prince Felipe Research Center (CIPF)
C./Eduardo Primo Yúfera (Científic), nº 3
(junto Oceanografico)
46012 Valencia, Spain
Tel: +34 963289680 Ext. 1021
Fax: +34 963289574
E-Mail: ralo...@cipf.es
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
