Dear all,

I have a set of accession numbers and I want to retrieve the organism that each sequence is associated with, i.e. the content of the OS line in an EMBL file. I don't need the taxonomic ID, and I don't need to start traversing taxonomy trees.

I want to do this by accessing remote databases (via SRS, as configured in my emboss.defaults file), rather than by indexing databases locally. So the output I want would be a text mapping like:
    accession : species

where species is taken from the OS line of a database entry.

The closest I've come using EMBOSS is to get the GFF output file containing feature information, using a command along the lines of:

    seqret -feature embl:XXXX -oufo2 myfeat.txt

(embl is a database I can search using SRS, as configured in my emboss.defaults file.)

The first non-hashed line in the file myfeat.txt contains the term organism="Whateverus thingus", so I could parse that out. However, this file still contains a lot of extra (unwanted) information and requires parsing.

Does anyone know if I'm missing something obvious in EMBOSS that I could use for this?

(I have tried the BioPerl route to get this info from the NCBI, and apart from being unwieldy, I'm managing to get the wrong organism returned for the type of identifier I have. No, I haven't spent time tracking down the problem; frankly, I'd rather resolve it using EMBOSS and/or SRS calls.)

If there isn't anything that will do the job in EMBOSS at the moment, is there any chance I can put in a development request for an extra flag for seqret, or an extra utility tool that might accomplish this task?

cheers,
Bela

*************************
Dr. Bela Tiwari
Lead Bioinformatician
NERC Environmental Bioinformatics Centre
http://nebc.nerc.ac.uk
tel: 01491 69 2705

Centre for Ecology and Hydrology
Maclean Bldg, Benson Lane
Crowmarsh Gifford
Wallingford, England OX10 8BB
*************************
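
P.S. To make the desired output concrete, here is a minimal Python sketch of the "accession : species" mapping. It assumes the full EMBL-format entry text is already in hand (how it is fetched is deliberately not shown), and it is only an illustration of the parsing involved, not an existing EMBOSS feature. The function name organisms_by_accession, the placeholder accession ZZ999999 and the sample entry are all hypothetical.

```python
def organisms_by_accession(embl_text):
    """Map each entry's primary accession to the content of its OS line(s).

    Assumes standard EMBL flat-file layout: a two-letter line code,
    three spaces, then the value; entries are terminated by '//'.
    """
    mapping = {}
    accession = None
    organism_parts = []
    for line in embl_text.splitlines():
        code = line[:2]
        value = line[5:].strip()
        if code == "AC" and accession is None:
            # Take the first accession on the first AC line as the primary one.
            accession = value.split(";")[0].strip()
        elif code == "OS":
            # OS content may span several lines; collect them all.
            organism_parts.append(value)
        elif code == "//":
            # End of entry: store the result and reset for the next entry.
            if accession is not None:
                mapping[accession] = " ".join(organism_parts)
            accession = None
            organism_parts = []
    return mapping


# Hypothetical, heavily trimmed entry used purely for illustration.
SAMPLE = """\
ID   ZZ999999; SV 1; linear; genomic DNA; STD; UNC; 10 BP.
XX
AC   ZZ999999;
XX
OS   Whateverus thingus
XX
//
"""

for acc, species in organisms_by_accession(SAMPLE).items():
    print(f"{acc} : {species}")
```

Entries with more than one OS line have their contents joined with a space here, which is a simplification.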