Hi Trevor,
The species names used in the attribute names are historical: this is
how they were originally created, and since taking over the creation of
the marts we have continued to use the old format. Another user has
reported this previously, and it is on my "To Do" list, as it would make
sense to make the naming more consistent. The names are stable from
release to release, and you can find the configuration for a dataset
using, for example:
http://www.biomart.org/biomart/martservice?type=configuration&dataset=oanatinus_gene_ensembl
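If you want to read the names programmatically, a minimal Python sketch along these lines may help. It builds the URL above and pulls the attribute names out of the returned XML. It assumes that each attribute appears in the configuration as an `AttributeDescription` element carrying an `internalName` attribute; the function names are made up for illustration, so please check them against the XML you actually get back.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

MARTSERVICE = "http://www.biomart.org/biomart/martservice"


def configuration_url(dataset):
    """Build the martservice URL that returns a dataset's configuration."""
    query = urllib.parse.urlencode({"type": "configuration", "dataset": dataset})
    return MARTSERVICE + "?" + query


def attribute_names(config_xml):
    """Return the internalName of every AttributeDescription element.

    Assumption: attributes are listed as
    <AttributeDescription internalName="..."/> in the configuration XML.
    """
    root = ET.fromstring(config_xml)
    return [
        element.get("internalName")
        for element in root.iter("AttributeDescription")
        if element.get("internalName")
    ]


def fetch_attribute_names(dataset):
    """Download the configuration for a dataset and list its attribute names."""
    with urllib.request.urlopen(configuration_url(dataset)) as response:
        return attribute_names(response.read())
```

Calling `fetch_attribute_names("oanatinus_gene_ensembl")` would then return the attribute names for that dataset, provided the service is reachable and the assumption about the XML holds.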
I hope that helps.
Regards
Rhoda
On 22 Jan 2010, at 13:44, trevor paterson (RI) wrote:
I am using the Ensembl/BioMart web services (accessed by Java SOAP
calls bound to the mart schema by JAXB), but I am having difficulty
resolving species names for attributes when exploring orthologies.
There seem to be a number of different ways of representing a
species identifier.
So we have:
ptroglodytes_gene_ensembl DATASET
with_ptroglodytes_homolog FILTER
But
chimp_orthology_type, chimp_ensembl_gene ATTRIBUTES
And this is even more cryptic for C. elegans:
celegans_gene_ensembl DATASET
with_celegans_homolog FILTER
But
elegans_orthology_type ATTRIBUTE
The species identifiers in the orthology attributes ‘seem to be’ the
lower case of the common name – but why not ‘chimpanzee’ rather
than ‘chimp’ then?
Is there any sort of config file accessible that lists these – or do
I have to guess them? And are they stable over time?
You can manually scrape the attribute names out of an attribute
query on the datasets – but that isn’t fun.
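To illustrate the kind of scraping I mean, here is a rough Python sketch (Python rather than Java only for brevity). It assumes the attribute query returns plain text with one attribute per line, tab-separated, with the attribute name in the first column; the function name and the `_orthology_type` suffix rule are just my guesses from the examples above.

```python
def orthology_species_labels(attribute_listing, suffix="_orthology_type"):
    """Collect the species labels used as prefixes of orthology attributes.

    Assumption: attribute_listing is tab-separated text, one attribute per
    line, with the attribute name in the first column.
    """
    labels = []
    for line in attribute_listing.splitlines():
        # The attribute name is whatever precedes the first tab.
        name = line.split("\t", 1)[0].strip()
        if name.endswith(suffix) and len(name) > len(suffix):
            labels.append(name[: -len(suffix)])
    return labels
```

This only recovers whatever labels the mart happens to use; it does not map them back to the dataset names.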
I know the Perl API registry deals with aliases, but for the sake of
non-Perl users at least, would the adoption of a common, unique
‘magic string’ to refer to each species in each mart table/query
not simplify things? ;)
Trevor Paterson PhD
Bioinformatics
The Roslin Institute
Edinburgh University
Scotland EH25 9PS
phone +44 (0)131 5274197
http://www.roslin.ed.ac.uk
http://www.resspecies.org
http://www.thearkdb.org
http://www.comparagrid.org
Rhoda Kinsella Ph.D.
Ensembl Bioinformatician,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.