Re: [EMBOSS] problem with eprimer32

2014-02-17 Thread Hamish McWilliam
Hi folks,

 in the current EMBOSS there are 2 wrappers for eprimer:
 
 eprimer3 Pick PCR primers and hybridization oligos
 eprimer32 Pick PCR primers and hybridization oligos
 
 
 eprimer3 is for primer3  version 2.3.x (works for me with 2.3.6) 
 and
 eprimer32 is for the older version 2.2.3
 
 So if you want to use the new primer3 version (2.3.5), the symbolic link is 
 not needed.
 Just put primer3_core in the path and use eprimer3.

FWIW you can set the path to the primer3_core version to use by setting
the corresponding environment variables in the EMBOSS configuration. For
example in my 'emboss.defaults' I have:

ENV EMBOSS_PRIMER3_CORE /ebi/extserv/bin/primer3-1.1.4/primer3_core
ENV EMBOSS_PRIMER32_CORE /ebi/extserv/bin/primer3-2.2.3/primer3_core

This method may make it somewhat easier to keep track of which versions
are being used than using the symlink and PATH options.
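
As an illustration of keeping both versions explicit, here is a minimal Python sketch that renders ENV lines like the ones above from a mapping. The helper name render_env_lines is hypothetical, and the paths are simply the ones quoted above, so adjust them to your own installation.

```python
def render_env_lines(core_paths):
    """Render emboss.defaults ENV lines from a {variable: path} mapping."""
    # Dictionaries keep insertion order, so lines appear as listed below.
    return "\n".join(f"ENV {name} {path}" for name, path in core_paths.items())


# Paths copied from the example above; replace them with your own locations.
PRIMER3_PATHS = {
    "EMBOSS_PRIMER3_CORE": "/ebi/extserv/bin/primer3-1.1.4/primer3_core",
    "EMBOSS_PRIMER32_CORE": "/ebi/extserv/bin/primer3-2.2.3/primer3_core",
}

if __name__ == "__main__":
    print(render_env_lines(PRIMER3_PATHS))
```

The output can be pasted into 'emboss.defaults', which keeps the choice of primer3_core binary for each wrapper in one place.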

All the best,

Hamish

 
 Kind regards,
 David.
 
 -----Original Message-----
 From: emboss-boun...@lists.open-bio.org 
 [mailto:emboss-boun...@lists.open-bio.org] On Behalf Of Guy Bottu
 Sent: 13 February 2014 18:17
 To: emboss@lists.open-bio.org
 Subject: [EMBOSS] problem with eprimer32
 
 Dear all,
 
 I installed EMBOSS version 6.6.0.0 on my computer and tried to make 
 eprimer32 work. I installed the latest version of Primer3 (version 2.3.5) 
 and I put a symbolic link in the bin directory of EMBOSS (primer32_core 
 -> .../primer3_core). When I try to run it, I get:
 
 Pick PCR primers and hybridization oligos
 Whitehead primer3_core program output file [emboss_001.eprimer32]:
 Error: thermodynamic approach chosen, but path to thermodynamic 
 parameters not specified
 
 What could be missing ? The EMBOSS eprimer32 manual does not say 
 anything beyond the need to have eprimer3_core in the path.
 
 Regards,
 Guy Bottu


-- 

Mr Hamish McWilliam,
Web Production,
European Bioinformatics Institute (EMBL-EBI),
European Molecular Biology Laboratory,
Wellcome Trust Genome Campus,
Hinxton, Cambridge, CB10 1SD
United Kingdom

URL: http://www.ebi.ac.uk/


___
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss


Re: [EMBOSS] accessing remote databases

2014-02-13 Thread Hamish McWilliam
Hi Richard,

[Pushing the thread back over to the list so other interested parties
can participate]

 I was wondering if you could provide a new emboss.default file that defines 
 access to uniprot and swissprot (or at least the code to insert in my 
 current file). Does the latest version of EMBOSS come with this by 
 default? I am running 6.3.1 on Ubuntu 12.04.

Data server support was added in EMBOSS 6.4.0, along with the associated
default server definitions. These include a number of servers which
provide access to UniProtKB or UniProtKB/SwissProt. Newer EMBOSS
versions are available in more recent Ubuntu versions, and the next
Ubuntu LTS will likely provide EMBOSS 6.6.0.

For older versions of EMBOSS without server support, you can find a set
of EMBOSS database definitions for databases available via dbfetch at:

http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/emboss4.databases

And for really old versions (i.e. pre EMBOSS 4.0.0), which do not
support the 'dbfetch' access method, you could use the 'url' based
equivalents described at:

http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/emboss1.databases

These dbfetch pages provide definitions for all the sequence databases
available via dbfetch (see
http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/dbfetch.databases) including
EMBL-Bank, EMBLCDS, UniParc, UniProtKB, and the UniRef databases.

For direct access to UniProtKB data from http://www.uniprot.org/ you can
use something like:

# UniProtKB
DB uniprot [
  type: P
  comment: "UniProtKB (UniProt.org)"
  method: url
  format: swiss
  url: "http://www.uniprot.org/uniprot/%s.txt"
  fields: "id acc"
]

to get basic entry name and accession based entry look-up.
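
To make the role of the '%s' placeholder concrete, here is a small Python sketch of the substitution. How EMBOSS performs it internally is not shown here; the sketch just uses Python's own % formatting, and the function name entry_url is hypothetical.

```python
# URL pattern from the definition above; %s marks where the identifier goes.
URL_TEMPLATE = "http://www.uniprot.org/uniprot/%s.txt"


def entry_url(identifier, template=URL_TEMPLATE):
    """Substitute an entry name or accession for the %s placeholder."""
    return template % identifier


if __name__ == "__main__":
    print(entry_url("Q9DY04"))
```

So a request for uniprot:Q9DY04 would correspond to fetching the flat-file entry at the printed URL.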

For more complex search options see the UniProt.org Web Service
documentation: http://www.uniprot.org/faq/28

 In the old setup I had a script (not written by myself!) that allowed me 
 to pull taxonomies from uniprot/swissprot accessions.

Depending on how the script was implemented and exactly what it did, one
of the options detailed above may provide a suitable data source
replacement. Alternatively many of the public SRS servers provide
UniProtKB, so you could just switch to one of them.

All the best,

Hamish

 Thanks!
 
 Richard Rothery
 
 On 14-02-13 07:46 AM, Hamish McWilliam wrote:
 Hi Iddo,

 Since the SRS server at EBI was retired, I am looking for other remote
 databases to access via EMBOSS. The DKFZ server seems to do a mostly good
 job (although slow from where I'm at):

 http://www.dkfz.de/menu/cgi-bin/srs7.1.3.1/wgetz

 However, I was wondering how to access genbank via EMBOSS (through any
 protocol); what would be the entry in .embossrc?

 Also, are there SRS servers I can use in N. America that would hopefully be
 faster?
 For details of public SRS servers, see the Public SRS Installations at:

 http://bioblog.instem.com/download/srs-parser-and-software-downloads/public-srs-installations/

 Current versions of EMBOSS come with a number of data sources configured
 which are accessed via the data server support. You can see details of
 the configured servers using the showserver command:

 http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/showserver.html

 And you can access entries via these services by using the slightly
 extended USA which specifies the server as well as the database, for
 example:

 entret -stdout -auto dbfetch:embl:L12345

 to get EMBL-Bank data from dbfetch, or to get the same entry from NCBI
 Entrez:

 entret -stdout -auto entrez:nucleotide:L12345

 Since NCBI's GenBank is part of the INSDC (http://www.insdc.org/), the
 data in GenBank is also available in ENA EMBL-Bank and DDBJ. So you
 could use the existing server definitions containing EMBL-Bank or DDBJ.
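
The extended USA form used above (server, then database, then entry) can be sketched in Python as follows. The helper names are hypothetical; the command-line options are the ones shown in the examples above.

```python
def server_usa(server, database, entry):
    """Build a USA of the form server:database:entry."""
    return f"{server}:{database}:{entry}"


def entret_command(usa):
    """Return the argument list for the entret calls shown above."""
    return ["entret", "-stdout", "-auto", usa]


if __name__ == "__main__":
    for server, database in (("dbfetch", "embl"), ("entrez", "nucleotide")):
        print(" ".join(entret_command(server_usa(server, database, "L12345"))))
```

On a machine with EMBOSS installed, the returned list could be passed to subprocess.run to fetch the entry.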

 Alternatively you can define your own (see
 http://emboss.open-bio.org/html/adm/ch04s01.html) to access GenBank via
 NCBI's E-Utilities (http://eutils.ncbi.nlm.nih.gov/), for example:

 # NCBI GenBank+RefSeq via NCBI Entrez
 DB nucleotide [
 type: nucleotide
 method: entrez
 format: genbank
 ]
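
If you maintain several such definitions, a sketch like the following can generate them from a mapping. The function name render_db_definition is hypothetical, and wrapping values that contain spaces in double quotes is an assumption based on the usual emboss.defaults conventions.

```python
def render_db_definition(name, attributes, comment=None):
    """Render a DB block for emboss.defaults from an attribute mapping."""
    lines = []
    if comment:
        lines.append(f"# {comment}")
    lines.append(f"DB {name} [")
    for key, value in attributes.items():
        # Assumption: values containing whitespace need double quotes.
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        lines.append(f"  {key}: {value}")
    lines.append("]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_db_definition(
        "nucleotide",
        {"type": "nucleotide", "method": "entrez", "format": "genbank"},
        comment="NCBI GenBank+RefSeq via NCBI Entrez",
    ))
```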

 Since NCBI have also recently released command-line clients for their
 E-Utilities Web Services
 (http://www.ncbi.nlm.nih.gov/news/02-06-2014-entrez-direct-released/)
 another option would be to use these directly or wrap them as EMBOSS
 database definitions for your commonest queries.

 All the best,

 Hamish
 
 




[EMBOSS] pepinfo and ambiguous amino-acids

2014-01-06 Thread Hamish McWilliam
Hi folks,

When trying to use pepinfo on sequences containing peptide ambiguity
codes (e.g. BJXZ) the program gives an error. For example running
pepinfo on UniProtKB:Q9DY04 gives:

Error: At position 83 in seq, couldn't find key X

This is a reference to these amino acid codes not having corresponding
entries in the data files used by pepinfo:

- Eaa_properties.dat
- Eaa_hydropathy.dat

It would be good if this was handled more gracefully, either by
providing average values in the data files, or by handling these cases
with a warning and an appropriate null value.
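
To illustrate the two options, here is a Python sketch (not pepinfo's actual code) that averages over the residues an ambiguity code stands for, and falls back to a warning plus a null value otherwise. The values are the Kyte-Doolittle hydropathy scale, used purely as example data; they are not taken from Eaa_hydropathy.dat.

```python
import warnings

# Kyte-Doolittle hydropathy values, for illustration only.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Ambiguity codes and the residues each one can represent.
AMBIGUITY = {"B": "DN", "Z": "EQ", "J": "IL", "X": "".join(HYDROPATHY)}


def hydropathy(residue):
    """Return a value, an average for ambiguity codes, or None with a warning."""
    code = residue.upper()
    if code in HYDROPATHY:
        return HYDROPATHY[code]
    if code in AMBIGUITY:
        members = AMBIGUITY[code]
        return sum(HYDROPATHY[m] for m in members) / len(members)
    warnings.warn(f"no hydropathy value for residue {code!r}")
    return None


if __name__ == "__main__":
    for residue in "AJXO":
        print(residue, hydropathy(residue))
```

With this approach a sequence containing 'X' yields the scale's mean instead of an error, while residues with no data at all (such as 'O' here) are reported and skipped.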

FWIW the same issue arises when handling sequences with unusual amino
acids (i.e. 'O' and 'U'), although there should be actual figures for these.

Does anyone already have fixed data files that could be used instead?

All the best,

Hamish


Re: [EMBOSS] ftp server down?

2013-09-10 Thread Hamish McWilliam

Just FYI: For the past few hours, the emboss ftp server seems to be down.

Also, are there any mirrors?


I am aware of two mirrors for the EMBOSS distribution:

* EMBL-EBI: ftp://ftp.ebi.ac.uk/pub/software/unix/EMBOSS/
* IUBio Archive:
- http://iubio.bio.indiana.edu/soft/molbio/emboss/
- ftp://iubio.bio.indiana.edu/molbio/emboss/

In my experience the main EMBOSS FTP server (ftp://emboss.open-bio.org/) 
is temperamental and often has issues with connectivity, so you are 
probably better off using a mirror if possible.


If you are only looking for the software, and can handle the slight lag 
for packaging, then you might want to use prepackaged versions for your 
system:


* RPM based systems:
- http://rpm.pbone.net/index.php3?stat=3&search=EMBOSS
- http://rpmfind.net/linux/rpm2html/search.php?query=emboss
* Debian (.deb) based systems:
- http://packages.debian.org/search?keywords=emboss

All the best,

Hamish


Re: [EMBOSS] Escaping query terms in a USA

2013-08-23 Thread Hamish McWilliam

Hi David,


it seems the index is OK; it is just that the database query code cannot
handle the ':', which has a special meaning in USAs. So as a workaround you
can replace the ':' with a '*'.

entret -stdout -auto 'imgthla-key:A*02*364'

will return the entry HLA08011.

But be aware that by this you actually generate a wildcard query, so
the * matches any single character at that position.


Unfortunately that is not going to work for this case since the HLA 
alleles use a somewhat nested nomenclature, for example:


a*01:01:02
a*01:02
a*02:01:02
a*02:101:02

However a little experimentation indicates that EMBOSS supports the 
single character wild-card '?', so something like:


$ entret -stdout -auto 'imgthla-key:A?01?02'

appears to do what I want in most cases.

That said, it would be better to have a way to escape the special 
characters (i.e. '*', ':' and '?') in the search term when an exact 
match is required (as in this case).
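
In the absence of an escape mechanism, the '?' workaround can be scripted. The Python sketch below uses hypothetical helper names: it swaps each special character for '?' and then filters candidate keywords for an exact match. It assumes you can obtain the keywords of the returned entries yourself, and that matching should be case-insensitive.

```python
import re


def wildcard_term(term):
    """Replace the USA special characters '*', ':' and '?' with '?'."""
    return re.sub(r"[*:?]", "?", term)


def exact_matches(term, candidates):
    """Keep only the candidates equal to the term, ignoring case."""
    wanted = term.lower()
    return [c for c in candidates if c.lower() == wanted]


if __name__ == "__main__":
    allele = "A*01:02"
    print(f"entret -stdout -auto 'imgthla-key:{wildcard_term(allele)}'")
    print(exact_matches(allele, ["a*01:02", "a*01x02"]))
```

The second step matters because '?' matches any single character, so the wildcard query can return entries whose keyword differs from the intended allele name at those positions.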


Thanks,

Hamish



Kind regards, David.

-----Original Message-----
From: emboss-boun...@lists.open-bio.org
[mailto:emboss-boun...@lists.open-bio.org] On Behalf Of Hamish McWilliam
Sent: 23 August 2013 11:25
To: emboss@lists.open-bio.org
Subject: [EMBOSS] Escaping query terms in a USA

Hi folks,

In the IMGT/HLA database (http://www.ebi.ac.uk/ipd/imgt/hla/) the
keywords field in the EMBL-Bank format flat-file contains allele
names like:

A*02:364

While I can build an index containing the keywords, it does not
appear to be possible to search the index with the allele names. For
example:

$ entret -stdout -auto 'imgthla-key:Allele'

works as expected, but:

$ entret -stdout -auto 'imgthla-key:A*02:364'

just gives errors:

Error: Failed to open filename 'imgthla-key'
Error: Unable to read sequence 'imgthla-key:A*02:364'
Died: entret terminated: Bad value for '-sequence' with -auto defined

I am guessing that the problem is the '*' and ':' characters in the
term... so is there some way to escape these, or are the terms in the
index mangled in some way?

All the best,

Hamish






Re: [EMBOSS] EMBOSS 6.1.0 release now available

2009-07-17 Thread Hamish McWilliam

Hi Peter,

Any thought on implementing some of the algorithms using CUDA when 
possible on GPUs?  This could speed up some programs significantly.


Given that our server systems do not have particularly powerful GPUs, 
but do have multiple CPU cores, threading and possibly on-core 
vectorization (see http://en.wikipedia.org/wiki/SIMD) seem like 
more generally applicable methods for improving performance in our case.


One interesting option for Intel platforms is the Intel Compiler (icc), 
which will vectorize some code constructions as a platform specific 
optimization. Unfortunately we are running a mixture of AMD and Intel 
systems of various vintages, so this option is going to require a lot of 
testing to check it works and gives us any benefits.



Yes indeed.

At BOSC/ISMB last month we were discussing closer collaborations with 
the other Open Bio Foundation projects.


One of these is BioManyCores which is aiming at OpenCL programming but 
is currently concentrating on CUDA.


When our new workstations are delivered we will be looking into CUDA.


Given that OpenCL supports both GPU and CPU vectorization, and CUDA is 
Nvidia GPU specific, it may be worth waiting for OpenCL to be adopted. 
MacOS X Snow Leopard is only a couple of months away after all ;-)


Which applications would you most like to speed up? (current EMBOSS 
programs, and suggestions for new ones)


At our end the bottlenecks are mainly the indexing (dbi* and dbx*) and 
reformatting tools (seqret).


All the best,

Hamish


Re: [EMBOSS] EMBOSS database setup

2006-12-03 Thread Hamish McWilliam
Hi Alan,

 The appended definitions are simple ones that may be
 useful if you only want a few sequences at a time.
 If sites upgrade to SRS8 then alter accordingly.
 
 Alan
 
 DB embl [  type: N method: srswww format: embl release: EBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   comment: "EMBL from the EBI" ]
 
 DB em [  type: N method: srswww format: embl release: EBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   dbalias: EMBL
   comment: "EMBL from the EBI" ]
 
 DB swissprot [  type: P method: srswww format: swiss release: EBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   comment: "SWISSPROT from the EBI" ]
 
 DB sw [  type: P method: srswww format: swiss release: EBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   dbalias: SWISSPROT
   comment: "SWISSPROT from the EBI" ]
 
 DB uniprot [  type: P method: srswww format: swiss release: EBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   comment: "UNIPROT from the EBI" ]
 
 DB uni [  type: P method: srswww format: swiss release: EBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   dbalias: UNIPROT
   comment: "UNIPROT from the EBI" ]
 
 DB pir [  type: P method: srswww format: nbrf release: EBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   comment: "PIR from the EBI" ]
 
 DB genbank [  type: N method: srswww format: genbank release: NCBI
   url: "http://www.infobiogen.fr/srs7bin/cgi-bin/wgetz"
   comment: "GenBank from Infobiogen" ]
 
 DB gb [  type: N method: srswww format: genbank release: NCBI
   url: "http://www.infobiogen.fr/srs7bin/cgi-bin/wgetz"
   dbalias: GENBANK
   comment: "GenBank from Infobiogen" ]
 
 DB refseq [  type: N method: srswww format: genbank release: NCBI
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   comment: "REFSEQ from EBI" ]

For the EBI's SRS server please use:

   http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz

as the URL. This should allow for continued support when the server is 
upgraded.

Also note that the Infobiogen SRS service is no longer available. For 
other SRS sites carrying GenBank please see the Public SRS Server List 
(http://downloads.biowisdomsrs.com/publicsrs.html).

Hamish
