Re: [EMBOSS] problem with eprimer32
> Hi folks,
>
> in the current EMBOSS there are 2 wrappers for eprimer:
>
>   eprimer3   Pick PCR primers and hybridization oligos
>   eprimer32  Pick PCR primers and hybridization oligos
>
> eprimer3 is for primer3 version 2.3.x (works for me with 2.3.6) and eprimer32 is for the older version 2.2.3. So if you want to use the new primer3 version (2.3.5), the symbolic link is not needed. Just put primer3_core in the path and use eprimer3.

FWIW you can set the path to the primer3_core version to use by setting the corresponding environment variables in the EMBOSS configuration. For example in my 'emboss.defaults' I have:

  ENV EMBOSS_PRIMER3_CORE /ebi/extserv/bin/primer3-1.1.4/primer3_core
  ENV EMBOSS_PRIMER32_CORE /ebi/extserv/bin/primer3-2.2.3/primer3_core

This method may make it somewhat easier to keep track of which versions are being used than using the symlink and PATH options.

All the best,

Hamish

> Kind regards,
> David.
>
> -----Original Message-----
> From: emboss-boun...@lists.open-bio.org [mailto:emboss-boun...@lists.open-bio.org] On behalf of Guy Bottu
> Sent: 13 February 2014 18:17
> To: emboss@lists.open-bio.org
> Subject: [EMBOSS] problem with eprimer32
>
> Dear all,
>
> I installed EMBOSS version 6.6.0.0 on my computer and tried to make eprimer32 work. I installed the latest version of Primer3 (version 2.3.5) and I put a logical link in the bin directory of EMBOSS (primer32_core -> .../primer3_core). When I try to run it, I get:
>
>   Pick PCR primers and hybridization oligos
>   Whitehead primer3_core program output file [emboss_001.eprimer32]:
>   Error: thermodynamic approach chosen, but path to thermodynamic parameters not specified
>
> What could be missing? The EMBOSS eprimer32 manual does not say anything beyond the need to have eprimer3_core in the path.
>
> Regards,
> Guy Bottu

--
Mr Hamish McWilliam, Web Production,
European Bioinformatics Institute (EMBL-EBI),
European Molecular Biology Laboratory,
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD
United Kingdom
URL: http://www.ebi.ac.uk/
___
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
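As a rough illustration of the configuration approach described in the eprimer32 thread above, the following Python sketch renders the two ENV lines for 'emboss.defaults' and warns when a configured path is not an executable file. The variable names are the ones quoted in the thread; the install paths and both helper functions are hypothetical and only serve the example.

```python
import os


def render_env_lines(core_paths):
    """Render one 'ENV NAME PATH' line per entry, sorted by variable name."""
    return "\n".join(
        f"ENV {name} {path}" for name, path in sorted(core_paths.items())
    )


def missing_cores(core_paths):
    """Return the variable names whose path is not an executable file."""
    return [
        name
        for name, path in sorted(core_paths.items())
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    ]


# Hypothetical install locations, for illustration only.
core_paths = {
    "EMBOSS_PRIMER3_CORE": "/opt/primer3-1.1.4/primer3_core",
    "EMBOSS_PRIMER32_CORE": "/opt/primer3-2.2.3/primer3_core",
}

print(render_env_lines(core_paths))
for name in missing_cores(core_paths):
    print(f"warning: {name} does not point at an executable file")
```

Checking the paths before writing the configuration should make a wrong or stale primer3_core location visible early, rather than at the first eprimer3 or eprimer32 run.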
Re: [EMBOSS] accessing remote databases
Hi Richard,

[Pushing the thread back over to the list so other interested parties can participate]

> I was wondering if you could provide a new emboss.default file that defines access to uniprot and swissprot (or at least the code to insert in my current file). Does the latest version of EMBOSS come with this by default? I am running 6.3.1 with the Ubuntu 12.04 OS.

Data server support was added in EMBOSS 6.4.0, along with the associated default server definitions. These include a number of servers which provide access to UniProtKB or UniProtKB/SwissProt. Newer EMBOSS versions are available in more recent Ubuntu versions, and the next Ubuntu LTS will likely provide EMBOSS 6.6.0.

For older versions of EMBOSS without server support, you can find a set of EMBOSS database definitions for databases available via dbfetch at:
http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/emboss4.databases

And for really old versions (i.e. pre EMBOSS 4.0.0), which do not support the 'dbfetch' access method, you could use the 'url' based equivalents described at:
http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/emboss1.databases

These dbfetch pages provide definitions for all the sequence databases available via dbfetch (see http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/dbfetch.databases) including EMBL-Bank, EMBLCDS, UniParc, UniProtKB, and the UniRef databases.

For direct access to UniProtKB data from http://www.uniprot.org/ you can use something like:

  # UniProtKB
  DB uniprot [
    type: P
    comment: "UniProtKB (UniProt.org)"
    method: url
    format: swiss
    url: "http://www.uniprot.org/uniprot/%s.txt"
    fields: "id acc"
  ]

to get basic entry name and accession based entry look-up. For more complex search options see the UniProt.org Web Service documentation: http://www.uniprot.org/faq/28

> In the old setup I had a script (not written by myself!) that allowed me to pull taxonomies from uniprot/swissprot accessions.

Depending on how the script was implemented and exactly what it did, one of the options detailed above may provide a suitable data source replacement. Alternatively many of the public SRS servers provide UniProtKB, so you could just switch to one of them.

All the best,

Hamish

> Thanks!
> Richard Rothery
>
> On 14-02-13 07:46 AM, Hamish McWilliam wrote:
>> Hi Iddo,
>>
>>> Since the SRS server at EBI was retired, I am looking for other remote databases to access via EMBOSS. The DKFZ server seems to do a mostly good job (although slow from where I'm at): http://www.dkfz.de/menu/cgi-bin/srs7.1.3.1/wgetz
>>>
>>> However, I was wondering how to access genbank via EMBOSS (through any protocol), what would be the entry in .embossrc?
>>>
>>> Also, are there SRS servers I can use in N. America that would hopefully be faster?
>>
>> For details of public SRS servers, see the Public SRS Installations at:
>> http://bioblog.instem.com/download/srs-parser-and-software-downloads/public-srs-installations/
>>
>> Current versions of EMBOSS come with a number of data sources configured which are accessed via the data server support. You can see details of the configured servers using the showserver command:
>> http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/showserver.html
>>
>> And you can access entries via these services by using the slightly extended USA which specifies the server as well as the database, for example:
>>
>>   entret -stdout -auto dbfetch:embl:L12345
>>
>> to get EMBL-Bank data from dbfetch, or to get the same entry from NCBI Entrez:
>>
>>   entret -stdout -auto entrez:nucleotide:L12345
>>
>> Since NCBI's GenBank is part of the INSDC (http://www.insdc.org/), the data in GenBank is also available in ENA EMBL-Bank and DDBJ. So you could use the existing server definitions containing EMBL-Bank or DDBJ.
>>
>> Alternatively you can define your own (see http://emboss.open-bio.org/html/adm/ch04s01.html) to access GenBank via NCBI's E-Utilities (http://eutils.ncbi.nlm.nih.gov/), for example:
>>
>>   # NCBI GenBank+RefSeq via NCBI Entrez
>>   DB nucleotide [
>>     type: nucleotide
>>     method: entrez
>>     format: genbank
>>   ]
>>
>> Since NCBI have also recently released command-line clients for their E-Utilities Web Services (http://www.ncbi.nlm.nih.gov/news/02-06-2014-entrez-direct-released/) another option would be to use these directly or wrap them as EMBOSS database definitions for your commonest queries.
>>
>> All the best,
>>
>> Hamish
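To make the extended USA form from the thread above concrete (server, then database, then entry identifier), here is a small Python sketch that builds the USA string and the matching entret argument list. The server and database names and the accession are the ones used in the thread; the two helper functions are hypothetical names introduced for this example, and nothing is executed against a real server.

```python
def make_usa(server, database, entry_id):
    """Build an extended USA of the form server:database:entry."""
    return f"{server}:{database}:{entry_id}"


def entret_command(usa):
    """Return the entret argument list used in the thread for a given USA."""
    return ["entret", "-stdout", "-auto", usa]


# The same accession fetched from two of the servers mentioned in the thread.
for server, database in [("dbfetch", "embl"), ("entrez", "nucleotide")]:
    usa = make_usa(server, database, "L12345")
    print(" ".join(entret_command(usa)))
```

On a machine with EMBOSS 6.4.0 or later installed, the list returned by entret_command could be handed to subprocess.run; that step is left out here so the sketch runs without EMBOSS or network access.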
[EMBOSS] pepinfo and ambiguous amino-acids
Hi folks,

When trying to use pepinfo on sequences containing peptide ambiguity codes (e.g. BJXZ) the program gives an error. For example running pepinfo on UniProtKB:Q9DY04 gives:

  Error: At position 83 in seq, couldn't find key X

This is a reference to these amino acid codes not having corresponding entries in the data files used by pepinfo:

- Eaa_properties.dat
- Eaa_hydropathy.dat

It would be good if this was handled more gracefully, either by providing average values in the data files, or by handling these cases with a warning and an appropriate null value.

FWIW the same issue comes up when handling sequences with unusual amino acids (i.e. 'O' and 'U'), although there should be actual figures for these.

Does anyone already have fixed data files that could be used instead?

All the best,

Hamish
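Until the data files are fixed, one possible workaround for the pepinfo problem above is to screen sequences before running it. The Python sketch below reports the 1-based positions of residue codes outside the 20 standard amino acids. It assumes, without having checked the data files, that only those 20 codes have entries in Eaa_properties.dat and Eaa_hydropathy.dat; the function name and the example peptide are made up for illustration.

```python
# Assumption: the pepinfo data files only cover the 20 standard residues.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def unsupported_positions(sequence):
    """Return (1-based position, residue) pairs outside the standard codes."""
    return [
        (position, residue)
        for position, residue in enumerate(sequence.upper(), start=1)
        if residue not in STANDARD_RESIDUES
    ]


# Made-up peptide containing an ambiguity code (X) and selenocysteine (U).
for position, residue in unsupported_positions("MKTXLLVAUG"):
    print(f"position {position}: residue {residue} may not be handled")
```

Sequences that return an empty list should be safe to pass on under that assumption; the others can be skipped or inspected by hand.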
Re: [EMBOSS] ftp server down?
> Just FYI: For the past few hours, the emboss ftp server seems to be down. Also, are there any mirrors?

I am aware of two mirrors for the EMBOSS distribution:

* EMBL-EBI: ftp://ftp.ebi.ac.uk/pub/software/unix/EMBOSS/
* IUBio Archive:
  - http://iubio.bio.indiana.edu/soft/molbio/emboss/
  - ftp://iubio.bio.indiana.edu/molbio/emboss/

In my experience the main EMBOSS FTP server (ftp://emboss.open-bio.org/) is temperamental and often has issues with connectivity, so you are probably better off using a mirror if possible.

If you are only looking for the software, and can handle the slight lag for packaging, then you might want to use prepackaged versions for your system:

* RPM based systems:
  - http://rpm.pbone.net/index.php3?stat=3&search=EMBOSS
  - http://rpmfind.net/linux/rpm2html/search.php?query=emboss
* Debian (.deb) based systems:
  - http://packages.debian.org/search?keywords=emboss

All the best,

Hamish
Re: [EMBOSS] Escaping query terms in a USA
Hi David,

> it seems the index is OK, just the database query code cannot handle the : which has special meanings in USAs. So as a workaround you can replace the : by a *.
>
>   entret -stdout -auto 'imgthla-key:A*02*364'
>
> will return the entry HLA08011. But be aware that by this you actually generate a wildcard query, so the * matches any single character at that position.

Unfortunately that is not going to work for this case since the HLA alleles use a somewhat nested nomenclature, for example:

  a*01:01:02
  a*01:02
  a*02:01:02
  a*02:101:02

However a little experimentation indicates that EMBOSS supports the single character wild-card '?', so something like:

  $ entret -stdout -auto 'imgthla-key:A?01?02'

appears to do what I want in most cases. That said, it would be better to have a way to escape the special characters (i.e. '*', ':' and '?') in the search term when an exact match is required (as in this case).

Thanks,

Hamish

> Kind regards,
> David.
>
> -----Original Message-----
> From: emboss-boun...@lists.open-bio.org [mailto:emboss-boun...@lists.open-bio.org] On behalf of Hamish McWilliam
> Sent: 23 August 2013 11:25
> To: emboss@lists.open-bio.org
> Subject: [EMBOSS] Escaping query terms in a USA
>
> Hi folks,
>
> In the IMGT/HLA database (http://www.ebi.ac.uk/ipd/imgt/hla/) the keywords field in the EMBL-Bank format flat-file contains allele names like:
>
>   A*02:364
>
> While I can build an index containing the keywords, it does not appear to be possible to search the index with the allele names. For example:
>
>   $ entret -stdout -auto 'imgthla-key:Allele'
>
> works as expected, but:
>
>   $ entret -stdout -auto 'imgthla-key:A*02:364'
>
> just gives errors:
>
>   Error: Failed to open filename 'imgthla-key'
>   Error: Unable to read sequence 'imgthla-key:A*02:364'
>   Died: entret terminated: Bad value for '-sequence' with -auto defined
>
> I am guessing that the problem is the '*' and ':' characters in the term... so is there some way to escape these or are the terms in the index mangled in some way?
>
> All the best,
>
> Hamish
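The '?' workaround from the thread above can be sketched in a few lines of Python: each '*' or ':' in the allele name is replaced by the single character wild-card before the USA is built. The helper names are hypothetical. Python's fnmatch module is used only as a stand-in to illustrate the difference between '?' and '*' patterns; it is an assumption that EMBOSS's own wild-card matching behaves the same way on these names.

```python
import fnmatch

# Characters reported in the thread as breaking the query term.
SPECIAL_CHARACTERS = "*:"


def to_wildcard_term(term):
    """Replace each special character with the single character wild-card."""
    return "".join("?" if ch in SPECIAL_CHARACTERS else ch for ch in term)


def to_usa(db_field, term):
    """Build a 'database-field:term' USA with special characters masked."""
    return f"{db_field}:{to_wildcard_term(term)}"


print(to_usa("imgthla-key", "A*01:02"))  # imgthla-key:A?01?02

# Stand-in comparison using shell-style matching from the standard library.
alleles = ["A*01:01:02", "A*01:02", "A*02:01:02", "A*02:101:02"]
print(fnmatch.filter(alleles, "A?01?02"))  # only the exact-length name
print(fnmatch.filter(alleles, "A*01*02"))  # '*' also catches nested names
```

Because '?' stands for any one character, a term masked this way can still match names that differ only in the punctuation positions, which is why an escape mechanism would be the cleaner fix.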
Re: [EMBOSS] EMBOSS 6.1.0 release now available
Hi Peter,

>> Any thought on implementing some of the algorithms using CUDA when possible on GPUs? This could speed up some programs significantly.

Given that our server systems do not have particularly powerful GPUs, but do have multiple CPU cores: threading, and possibly the use of on-core vectorization (see http://en.wikipedia.org/wiki/SIMD), seem like more generally applicable methods for improving performance in our case.

One interesting option for Intel platforms is the Intel Compiler (icc), which will vectorize some code constructions as a platform specific optimization. Unfortunately we are running a mixture of AMD and Intel systems of various vintages, so this option is going to require a lot of testing to check it works and gives us any benefits.

> Yes indeed. At BOSC/ISMB last month we were discussing closer collaborations with the other Open Bio Foundation projects. One of these is BioManyCores which is aiming at OpenCL programming but is currently concentrating on CUDA. When our new workstations are delivered we will be looking into CUDA.

Given that OpenCL supports both GPU and CPU vectorization, and CUDA is Nvidia GPU specific, it may be worth waiting for OpenCL to be adopted. MacOS X Snow Leopard is only a couple of months away after all ;-)

> Which applications would you most like to speed up? (current EMBOSS programs, and suggestions for new ones)

At our end the bottlenecks are mainly the indexing (dbi* dbx*) and reformatting tools (seqret).

All the best,

Hamish
Re: [EMBOSS] EMBOSS database setup
Hi Alan,

> The appended definitions are simple ones that may be useful if you only want a few sequences at a time. If sites upgrade to SRS8 then alter accordingly.
>
> Alan
>
> DB embl [
>   type: N
>   method: srswww
>   format: embl
>   release: EBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   comment: "EMBL from the EBI"
> ]
>
> DB em [
>   type: N
>   method: srswww
>   format: embl
>   release: EBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   dbalias: EMBL
>   comment: "EMBL from the EBI"
> ]
>
> DB swissprot [
>   type: P
>   method: srswww
>   format: swiss
>   release: EBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   comment: "SWISSPROT from the EBI"
> ]
>
> DB sw [
>   type: P
>   method: srswww
>   format: swiss
>   release: EBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   dbalias: SWISSPROT
>   comment: "SWISSPROT from the EBI"
> ]
>
> DB uniprot [
>   type: P
>   method: srswww
>   format: swiss
>   release: EBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   comment: "UNIPROT from the EBI"
> ]
>
> DB uni [
>   type: P
>   method: srswww
>   format: swiss
>   release: EBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   dbalias: UNIPROT
>   comment: "UNIPROT from the EBI"
> ]
>
> DB pir [
>   type: P
>   method: srswww
>   format: nbrf
>   release: EBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   comment: "PIR from the EBI"
> ]
>
> DB genbank [
>   type: N
>   method: srswww
>   format: genbank
>   release: NCBI
>   url: "http://www.infobiogen.fr/srs7bin/cgi-bin/wgetz"
>   comment: "GenBank from Infobiogen"
> ]
>
> DB gb [
>   type: N
>   method: srswww
>   format: genbank
>   release: NCBI
>   url: "http://www.infobiogen.fr/srs7bin/cgi-bin/wgetz"
>   dbalias: GENBANK
>   comment: "GenBank from Infobiogen"
> ]
>
> DB refseq [
>   type: N
>   method: srswww
>   format: genbank
>   release: NCBI
>   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
>   comment: "REFSEQ from EBI"
> ]

For the EBI's SRS server please use:

  http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz

as the URL. This should allow for continued support when the server is upgraded.

Also note that the Infobiogen SRS service is no longer available. For other SRS sites carrying GenBank please see the Public SRS Server List (http://downloads.biowisdomsrs.com/publicsrs.html).

Hamish
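For anyone with many such definitions, the two changes suggested in the thread above (switching to the version-independent EBI URL and finding entries that still point at Infobiogen) could be applied with a short script. This is only a sketch: the helper names and the regular expression are assumptions made for the example, and the sample text is a cut-down version of the quoted definitions rather than a complete emboss.default file.

```python
import re

OLD_EBI_URL = "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
NEW_EBI_URL = "http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz"

# Matches 'DB name [ ... ]' blocks; assumes no ']' appears inside a block.
DB_BLOCK = re.compile(r"DB\s+(\S+)\s*\[(.*?)\]", re.DOTALL)


def update_srs_urls(config_text):
    """Point EBI SRS definitions at the version-independent URL."""
    return config_text.replace(OLD_EBI_URL, NEW_EBI_URL)


def databases_using(config_text, host):
    """Return the names of DB blocks whose body mentions the given host."""
    return [name for name, body in DB_BLOCK.findall(config_text) if host in body]


sample = (
    'DB embl [ type: N method: srswww format: embl\n'
    '  url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" ]\n'
    'DB genbank [ type: N method: srswww format: genbank\n'
    '  url: "http://www.infobiogen.fr/srs7bin/cgi-bin/wgetz" ]\n'
)

print(update_srs_urls(sample))
print("still using Infobiogen:", databases_using(sample, "www.infobiogen.fr"))
```

The entries reported by databases_using would then need a new URL chosen by hand from the public SRS server list.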