Re: [EMBOSS] problem with eprimer32
In the meantime I managed to solve the problem myself, the hard way, by looking into the source code of Primer3 itself. It seems that the current version needs thermodynamic parameters and searches for the files that contain them either in the working directory or in /opt, unless you set a parameter to tell it otherwise. I changed the source code of the EMBOSS program eprimer32 so as to add :
  eprimer32_send_stringC(stream, "PRIMER_THERMODYNAMIC_PARAMETERS_PATH", "/OPT/primer3-2.3.5/src/primer3_config/");
I could of course have copied the primer3_config directory to /opt instead, but I preferred to do it this way. Regards, Guy Bottu --- This e-mail message contains no viruses or malware because avast! Antivirus protection is active. http://www.avast.com ___ EMBOSS mailing list EMBOSS@lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss
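The lookup behaviour described in the message above ("first location that exists wins") can be sketched in Python. This is only an illustration: the function name `find_thermo_path` and the candidate directories are hypothetical, and this is not how primer3_core or eprimer32 implement the search.

```python
import os


def find_thermo_path(candidates):
    """Return the first existing directory from candidates, with a trailing separator.

    Returns None when no candidate exists, which corresponds to the situation
    where the thermodynamic parameter files cannot be found.
    """
    for path in candidates:
        if os.path.isdir(path):
            # Normalise to a trailing separator, as in the path quoted in the message.
            return os.path.join(path, "")
    return None


if __name__ == "__main__":
    # Hypothetical candidate locations; adjust them to the local installation.
    print(find_thermo_path(["./primer3_config", "/opt/primer3_config"]))
```

The returned value is what one would pass as PRIMER_THERMODYNAMIC_PARAMETERS_PATH, assuming the directory layout of the local Primer3 installation matches.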
[EMBOSS] problem with eprimer32
Dear all, I installed EMBOSS version 6.6.0.0 on my computer and tried to make eprimer32 work. I installed the latest version of Primer3 (version 2.3.5) and I put a symbolic link in the bin directory of EMBOSS (primer32_core -> .../primer3_core). When I try to run it, I get :
  Pick PCR primers and hybridization oligos
  Whitehead primer3_core program output file [emboss_001.eprimer32]:
  Error: thermodynamic approach chosen, but path to thermodynamic parameters not specified
What could be missing ? The EMBOSS eprimer32 manual does not say anything beyond the need to have eprimer3_core in the path. Regards, Guy Bottu
[EMBOSS] What happened with the mixed sequence database access method ?
Dear developers, dear all, I installed EMBOSS version 6.6.0.0 on my computer and tried to install the SwissProt database. I put the following in emboss.default :
  DB sw [
    type: P
    comment: 'SwissProt'
    methodquery: emboss
    formatquery: swiss
    methodall: direct
    formatall: fasta
    directory: /host/4UBUNTU/sequences/swissprot
    file: uniprot_sprot.fasta
    fields: 'id acc key des org'
  ]
When I try to access sequences I get :
  entret 'sw:*' stdout
  Retrieve sequence entries from flatfile databases and files
  Error: No access method for database 'sw'
  Error: sequence access method '' not found
  Error: Unable to read sequence 'sw:*'
  Died: entret terminated: Bad value for '-sequence' and no prompt
  entret sw:papa1_carpa stdout
  Retrieve sequence entries from flatfile databases and files
  Error: No access method for database 'sw'
  Error: sequence access method '' not found
  Error: Unable to read sequence 'sw:papa1_carpa'
  Died: entret terminated: Bad value for '-sequence' and no prompt
When I do not try to have 2 different access methods for SwissProt, it works fine. Yet I remember from the days when I worked for the Belgian EMBnet Node that it was possible to have 2 different access methods for the same databank (e.g. SRS or app for query and direct for all). Did you decide to give up the possibility of mixed access methods or is there a bug ? Regards, Guy Bottu
Re: [EMBOSS] Help to build a motif for fuzzpro
Dear Aengus, I have an idea of how to do it. You must of course complete the motif/pattern as much as possible, because with just {DE}(4) you will find far too much unless you search only a few sequences. You must run the program fuzzpro with parameter -rformat=listfile. You will obtain an EMBOSS list file as output. You can then run fuzzpro again with list::xxx as sequence input (xxx being the name of your file) and @yyy as pattern input, where yyy contains : > basic_1 <[HKR]. > basic_2 basic_3 basic_4 Indeed, the input will then contain only sequence fragments matching the pattern, and the basic amino acid should hence be in one of the first 4 positions. Regards, Guy Bottu, ex-collaborator of the Belgian EMBnet Node
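The test applied in the second pass (a basic amino acid in one of the first 4 positions of each fragment) can be illustrated outside EMBOSS. The following Python sketch is not a replacement for fuzzpro; the helper names are hypothetical and the fragments are assumed to be plain one-letter protein strings.

```python
BASIC = set("HKR")  # the basic residues named in the pattern [HKR]


def has_basic_in_first_four(fragment):
    """Return True if H, K or R occurs in one of the first four positions."""
    return any(residue in BASIC for residue in fragment.upper()[:4])


def filter_fragments(fragments):
    """Keep only the fragments that pass the second-pass test."""
    return [fragment for fragment in fragments if has_basic_in_first_four(fragment)]


if __name__ == "__main__":
    print(filter_fragments(["HAAAA", "AAAAH", "aark"]))
```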
Re: [EMBOSS] Antwort: Standard Output Flag
david.ba...@bayer.com wrote: The default graphics output is -graph x11 and goes to the X11 terminal. But as far as I know it is not possible to send the graphics to standard output. Well, it is actually possible : if you write -graph tekt you will get output for Tektronix terminals. If your terminal does not switch automatically from text mode to graphical mode you will get a lot of bizarre characters on your screen. This will of course not be useful :-) Cheers, Guy Bottu
Re: [EMBOSS] getting files in GCG format with annotation
Dear Richard, I agree with Peter that it is not obvious what the GCG simple sequence format is still useful for. To give the sequence as input to other software you can use seqret with whatever sequence format is needed, to just read the annotation you can use entret, and to give the features as input to other software you can use seqret with parameter -feature (GCG used the GCG RSF format for this, but it did not become popular outside GCG/SeqLab). I can maybe add that a widely used format for features is GFF, and you can do :
  seqret -feature somedb:someid outfile.seq -osformat gff -oufo somegfffile
You will obtain a file somegfffile in GFF format (with just the features, not the sequence). There is a lot of software that can use it. Regards, Guy Bottu, U.L.B.
[EMBOSS] wrappers4EMBOSS 2.5.0 released
Dear users of wrappers4EMBOSS, version 2.5.0 of wrappers4EMBOSS has just been released. It has been adapted to support EMBOSS version 6.3. It was necessary to modify some ACD files, since from EMBOSS 6.3.0 on, parameters whose minimum or maximum allowed value depends on other parameters require special treatment. Regards, Guy Bottu, wEMBOSS development team
[EMBOSS] question about XML manual
Dear friends, It has been a while since the EMBOSS staff started working on a new EMBOSS manual using the XMLmind software (and I myself contributed a part about wEMBOSS). Excuse me if I am asking a stupid question, but I fail to find where the manual can be downloaded or consulted on-line. Regards, Guy Bottu
Re: [EMBOSS] Convert scf to fasta
Peter Rice wrote: Any other formats we can usefully cover? The ZTR format (compressed version of SCF) designed by the Staden development team. http://en.wikipedia.org/wiki/Staden_Package http://www.be.embnet.org/stadenhelp/manual/formats_unix_toc.html But how many people are still using it ? Maybe not worth the effort. Guy Bottu
[EMBOSS] final version of wEMBOSS out !
Dear users of wEMBOSS, We just released version 2.2 of wEMBOSS. Its code has been completely cleaned and debugged and it is (at least for some while) the final version. If you already have wEMBOSS, you should definitely upgrade. You can download it from http://wemboss.sourceforge.net/ Note that if your previous installation of wEMBOSS is version 2.0 or higher, you can do the installation very easily by recovering the yourAnswers file from the previous installation and doing :
  perl install.pl < yourAnswers
Regards, Marc Colet and Guy Bottu, the wEMBOSS development team
Re: [EMBOSS] Rebase installation nightmare
Yes, this is an old pain. It is difficult to find good EMBOSS documentation. This should improve when John Ison puts the new manual on-line... 4. So I check whether there is a directory /usr/local/share/EMBOSS/data/REBASE. There is, with a file entitled "dummyfile". Is the directory REBASE writable for the UNIX user who runs the program rebaseextract ? Guy Bottu
[EMBOSS] wrappers4EMBOSS 2.3.0 released
Dear users of wrappers4EMBOSS, This mail concerns you if you are using or intend to use wrappers4EMBOSS with one of the following : EMBOSS 6.1.0, MRS 4, PhyML 3, CLUSTAL 2, InterProScan 4.5, EBI fastA access through Web Services. You might be interested in upgrading for one of the following reasons :
- We support all EMBOSS versions from 3.0.0 to 6.1.0 (it was necessary to take account of the fact that MYEMBOSS can use "source" as well as "src" as directory name and that EMBOSS 6.1.0 requires parameter names that are unique in the first 6 characters).
- We support MRS version 4 as well as version 3.
- We have abandoned support for PhyML version 2 in favour of version 3. The wrapper for ModelGenerator has been modified accordingly in order to automatically start PhyML with a model generated by ModelGenerator, no longer using the script generated by ModelGenerator itself (it is for version 2) but instead a Perl script that parses the ModelGenerator output. The user can choose whether to use the model selected according to the Akaike, modified Akaike or Bayesian information criterion.
- We support the new optional features introduced in CLUSTAL version 2 (using UPGMA instead of NJ, not using sequence weights, improving the alignment by iterative re-alignment).
- The module for InterProScan works with version 4.5 and has HAMAP in its menu.
- The list of databank names in ebi_fasta has been adapted to the current situation on the server.
Guy Bottu, wEMBOSS development team
Re: [EMBOSS] Tree building
Peter Rice wrote: This can be done by adding some new output formats to the existing phylipnew embassy applications. Phylip truncates id names for its own formats. We can extend them, and only truncate for strict phylip format outputs. It is interesting to note the following : standard PHYLIP format reserves the first 10 characters for the name and has the sequence start at position 11. But some programmers, like the author of PhyML, use a format they call PHYLIP, which however allows a name of any length but then demands that there be a space between the name and the sequence. I have already had trouble when I used a standard PHYLIP file made by PHYLIP or EMBOSS, with a name of 10 characters, as input to PhyML. Guy Bottu
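The difference between the two conventions described above can be made concrete with a small Python sketch. It only formats one name/sequence line in each style; the function names are hypothetical, and real PHYLIP files also have a header line and possibly interleaved blocks, which are left out here.

```python
def strict_phylip_line(name, sequence):
    """Strict PHYLIP: the name occupies exactly 10 characters (padded or
    truncated) and the sequence starts at position 11."""
    return f"{name[:10]:<10}{sequence}"


def relaxed_phylip_line(name, sequence):
    """Relaxed variant as described for PhyML: a name of any length,
    followed by a space, followed by the sequence."""
    if " " in name:
        raise ValueError("the name must not contain a space in the relaxed variant")
    return f"{name} {sequence}"


if __name__ == "__main__":
    print(strict_phylip_line("ABCDEFGHIJ", "ACGT"))
    print(relaxed_phylip_line("ABCDEFGHIJ", "ACGT"))
```

With a name of exactly 10 characters the strict line contains no space between name and sequence, which is the case reported above to cause trouble when such a file is given to PhyML.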
[EMBOSS] announcing wEMBOSS version 2
Dear users of wEMBOSS, wEMBOSS has reached version 2. It has not changed much from the user's point of view but has changed substantially from the developer's and manager's point of view. In order to make further development easier, the code has been refreshed and is now maintained as a project under Eclipse with the Epic plug-in. The development version is called wEMBOSSDEV, and from it releases for distribution named wEMBOSSDIST-x.x.x are regularly drawn. The new version no longer installs its Perl libraries in the Perl system libraries. There was indeed no reason to do this, since the wEMBOSS libraries are not used by any other program. Storing them in a separate location makes it much easier to get rid of old versions and makes sure they cannot interfere with newer versions. We have stopped distributing wrappers4EMBOSS inside wEMBOSS, since there was little to be gained by doing so. The 2 packages are now simply presented as separate tar archives for download. For the end user the main difference for now is that the included applets (Jalview for viewing multiple sequence alignments and ATV for viewing phylogenetic trees) have been upgraded to the latest version. Also, with the new version of wEMBOSS the Web browser windows open in predetermined positions, so that they are better spread over the computer screen. Guy Bottu, wEMBOSS development team P.S. For those who already downloaded version 2.1 : it turned out to contain a number of bugs. These have been fixed in version 2.1.1 of 23 June, which you should definitely download.
Re: [EMBOSS] EMBOSS Funding and New developments
Giovanni Marco Dall'Olio wrote: Another need that people have asked me several times (I am administrator of a web forum on bioinformatics) is to have a standard and official web interface to the emboss tools, since the list at http://emboss.sourceforge.net/interfaces/#web is outdated and google doesn't return clear results. Now that we are at it, could the hyperlink not be changed ? It is http://wemboss.sourceforge.net and not anymore http://www.wemboss.org Guy Bottu
Re: [EMBOSS] New to EMBOSS
John Scott wrote: Planning on deploying emboss, but can't seem to find how much disk space usage I should plan for. Can someone let me know what I can expect to see on a system with about a hundred users? Thanks. Dear John, That is a difficult question. On our computer EMBOSS itself (without the sequence databases but with the motif databases) takes 250 Mbyte. The space your users need depends very much on whether they are just casually analyzing a few sequences of interest or doing high-throughput analysis of complete genomes. On our computer 49 Gbyte is used at the moment (about 250 users, but not all are very active). Regards, Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] Command line on www-interface
Grzegorz Machnik wrote: If I utilize any of www- based EMBOSS interface, may I use any of the command line qualifiers, i.e. -sreverse2 ? It will be useful for me. Till now I use it successfully in UNIX- based software. There is no such options to choose it, even in "advanced options" section. Dear Grzegorz, You raise a good question here. GUIs for EMBOSS usually have a script that converts ACD to HTML, XML or whatever language is used to describe the program pages. The "Associated" parameters however do not show up in the ACD files and need some ad hoc programmed module in the ACD parser. The GUIs differ a lot in what they offer. I think Jemboss is quite complete, at least for sequences. And wEMBOSS does at least support -sbegin1 -send1 -sreverse1 -sbegin2 -send2 -sreverse2. Regards, Guy Bottu, Belgian EMBnet Node and wEMBOSS development team
[EMBOSS] MSE blues
Dear friends, I had not used MSE for a long time. I recently discovered that with the most recent versions of some terminal emulation software, like TeraTerm-SSH, MSE does not function properly : it is not possible to delete bases/amino acids by pressing the "Delete" key of the PC, presumably because the character is not relayed properly to the distant computer running EMBOSS. Could the EMBOSS development team mend this by extending the list of characters recognized as a "delete" ? Or do you consider that it is not worth spending effort on MSE ? Regards, Guy Bottu, BEN
Re: [EMBOSS] Problem with showdb and seqret
Mohd Afiq Hazlami wrote: But when it comes to 'showdb' tutorial, the program returns this: Alas, the manual and tutorial are out-of-date. This is an unfortunate chronic disease of EMBOSS, even if for the rest EMBOSS is great. Before you can access 'public' sequences using the : syntax you must first define databases in the file /share/EMBOSS/emboss.default There is a file emboss.default.template with examples. To get a quick try using the EMBOSS tutorial data you could put the following in emboss.default :
  DB test_embl [
    type: N
    comment: 'testset for EMBL'
    format: embl
    method: emblcd
    dir: /opt/sw/EMBOSS6/EMBOSS-6.0.1/test/embl
  ]
Do replace /opt/sw/EMBOSS6/EMBOSS-6.0.1/test/embl by /test/embl You will note that this directory does contain sequences and indexes. You can then do
  showdb
  entret test_embl:Z5
You can read more about it at http://emboss.sourceforge.net/docs/themes/Databases.html A question for the EMBOSS development team : what is the status of the new EMBOSS manual ? Regards, Guy Bottu, Belgian EMBnet Node
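For sites that define many databases, the DB block shown in the message above can be generated rather than typed by hand. The following Python sketch simply renders a dictionary in that layout; the function name `render_db` is hypothetical, the attribute names and values are the ones from the message, and no validation of attributes is attempted.

```python
def render_db(name, attributes):
    """Render one emboss.default DB definition from a mapping of
    attribute names to already-quoted values."""
    lines = [f"DB {name} ["]
    for key, value in attributes.items():
        lines.append(f"  {key}: {value}")
    lines.append("]")
    return "\n".join(lines)


if __name__ == "__main__":
    # Values copied from the message above; the directory must be
    # adapted to the local installation.
    print(render_db("test_embl", {
        "type": "N",
        "comment": "'testset for EMBL'",
        "format": "embl",
        "method": "emblcd",
        "dir": "/opt/sw/EMBOSS6/EMBOSS-6.0.1/test/embl",
    }))
```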
[EMBOSS] wEMBOSS multiple login problem fixed
Dear all, As those of you who use wEMBOSS will have noticed, with recent versions of Web browsers a problem had appeared : you had to type in your username/password for each frame separately (up to 4 times). This problem has been fixed by a change in the login procedure, so that you again have to log in just once. We have released a new subversion 1.8.1 with the fix and some further minor changes, see http://sourceforge.net/project/shownotes.php?release_id=662615&group_id=170030 http://sourceforge.net/project/shownotes.php?release_id=662578&group_id=170030 Regards, Guy Bottu, Belgian EMBnet Node - wEMBOSS development team
Re: [EMBOSS] How to find protein sequences in a given genome using CDS information
Dear Nermin, You can do that with coderet. At the command line you can do :
  coderet AAA -cdsoutseq=XXX -translationoutseq=YYY -outfile=/dev/null -mrnaoutseq=/dev/null -restoutseq=/dev/null
where AAA contains one or more sequences in EMBL or GenBank format. You will get the coding sequences in XXX and the protein sequences in YYY. How to do it in a GUI or dataflow system should be easy to figure out. Regards, Guy Bottu, Belgian EMBnet Node
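When the command above has to be run for many input files, it can help to build it programmatically. The following Python sketch only assembles the argument list, using exactly the qualifiers quoted in the message; the function name `coderet_command` is hypothetical, and actually running the command (for instance with `subprocess.run`) assumes that coderet is installed and in the path.

```python
import shlex


def coderet_command(infile, cds_out, protein_out, discard="/dev/null"):
    """Build the coderet argument list from the message above.

    The coding sequences go to cds_out, the translations to protein_out,
    and the outputs that are not wanted are sent to discard.
    """
    return [
        "coderet", infile,
        f"-cdsoutseq={cds_out}",
        f"-translationoutseq={protein_out}",
        f"-outfile={discard}",
        f"-mrnaoutseq={discard}",
        f"-restoutseq={discard}",
    ]


if __name__ == "__main__":
    # AAA, XXX and YYY are the placeholder file names used in the message.
    print(shlex.join(coderet_command("AAA", "XXX", "YYY")))
```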
Re: [EMBOSS] Pairwise matching
Dear Henrikki, You could use the bl2seq program from the BLAST suite. It always tries both strands of the second sequence. If you need to have it under EMBOSS, you could use the wrappers4EMBOSS suite (http://wemboss.sourceforge.net). Note by the way that stretcher makes a global alignment ; for a local alignment you would have to use matcher. Regards, Guy Bottu, Belgian EMBnet Node
[EMBOSS] wrappers4EMBOSS 2.2.0 released
Dear users of wrappers4EMBOSS, This message only concerns you if you are using wrappers4EMBOSS with one of the following : EMBOSS version 6, a local installation of MRS version 3, fastA, ps_scan.pl for searching PROSITE, InterProScan version 4.4, a Web Service access to WU-BLAST and fastA at the EBI. You might have to upgrade for the following reasons :
- The MYEMBOSS-6.0.0 package does not contain test programs. As a consequence the installation script of wrappers4EMBOSS failed on a "fresh" installation and had to be re-run. This bug has been fixed.
- The script mrsget.pl (used to query a local MRS 3 installation) sometimes gave erroneous results. This bug has been fixed (with help of MRS developer Maarten Hekkelman).
- The programs fastasearch and fastapid from the fastA module have been modified so that the entire description line is now included in the output (truncated description lines can be terribly uninformative...). Bug fixes in both the "wrapper" and the fastA suite itself (by "Bill" Pearson after submission of a bug report) make it possible (again) to switch off statistics and obtain just a list of the n best "hits" (in case you might need that). We have also made the installation of the old fastA version 2 suite for the purpose of pairwise comparison optional.
- A novelty is that PROSITE pattern matches can be validated using a second search against a mini-profile. The "wrapper" for ps_scan has been adapted to run with the latest version of ps_scan.pl and the evaluator.dat file that is distributed with PROSITE. This validation is performed by default but can be switched off. This new way of handling PROSITE has also been introduced in InterProScan version 4.4 and the "wrapper" has been modified accordingly.
- The lists of available databases in the ebi_blast and ebi_fasta programs have been updated. We can mention that the HGVBASE at the EBI is now at last searchable by WU-BLAST.
Regards, Guy Bottu, Belgian EMBnet Node - wEMBOSS development team
[EMBOSS] Is someone still using EMBOSS version 3 or 4 ?
Dear all, The current version of the wrappers4EMBOSS suite uses EMBOSS version 3.0.0 subroutine names and is therefore compatible with all EMBOSS versions from 3.0.0 up to 6.0.1. I am however considering upgrading the names to those of version 5.0.0, if for no other reason then certainly to get rid of the annoying long lists of compiler warning messages. Therefore my question : is anyone still using EMBOSS version 3 or 4 and has some reason for not being willing to upgrade between now and the first months of 2009 ? Regards, Guy Bottu, Belgian EMBnet Node - wEMBOSS development team
[EMBOSS] question about delimiter in infoseq
Dear Peter, dear Alan, The ACD file of the program infoseq contains a parameter "delimiter" with default value "|". My attention was drawn to it because it causes an error when you try to run the program infoseq under Staden spin. But is this parameter used at all ? I failed to obtain an output file that contains the "|". Guy Bottu, BEN
Re: [EMBOSS] EMBOSS remote server list?
Ashika Umanga Umagiliya wrote: I checked the URL http://www.infobiogen.fr and it says the service is down (or something like that ,cuz i dont speak french ). Yes, Infobiogen has been closed down because its funding was cut. You can find a list of SRS servers at http://downloads.lionbio.co.uk/publicsrs.html Guy Bottu
Re: [EMBOSS] EMBOSS remote server list?
Ashika Umanga Umagiliya wrote: It would be much appreciated if anyone can give me some working remote DB Urls to use with entret. You could try these :
  DB ncbi_nuc [
    type: N
    method: entrez
    format: genbank
    comment: 'nonredundant nuc. acid db at NCBI (by GI number)'
  ]
  DB ncbi_prot [
    type: P
    method: entrez
    format: genbank
    comment: 'nonredundant protein db at NCBI (by GI number)'
  ]
  DB cmbi_sw [
    type: P
    comment: 'SwissProt at CMBI'
    method: mrs3
    dbalias: sprot
    format: swiss
    url: 'http://mrs.cmbi.ru.nl/mrs-web/plain.do'
  ]
(you can only use the entry name as identifier, e.g. cmbi_sw:papa1_carpa ; you can figure out what databases besides "sprot" are available at http://mrs.cmbi.ru.nl/mrs-web/status.do?method=databanks) Hope this helps, Guy Bottu, Belgian EMBnet Node
[EMBOSS] Who is still using EMBOSS 3 ?
Dear all, I am considering upgrading the function names in locally developed programs to the names in EMBOSS version 4, in order to get rid of those annoying compiler warnings. This would make the wrappers4EMBOSS suite we distribute backward incompatible with EMBOSS version 3. Hence, is anyone still using it ? Regards, Guy Bottu, BEN
Re: [EMBOSS] SeaView
Giovanni Marco Dall'Olio wrote: I thought this discussion was so interesting, that I have opened a section in wikipedia to discuss all sequence alignment editor software, starting from those that you have described here. Nice initiative ! A remark : the latest version of SeaView I have seen supports clustalw and muscle for re-alignment, not t-coffee (although you can re-configure SeaView to put any alignment software in). Some other software that could be mentioned :
- Genedoc (very useful for preparing alignments for publication) http://www.nrbsc.org/gfx/genedoc
- SeqPup http://iubio.bio.indiana.edu/soft/molbio/seqpup/java/
- MSE (text-mode, also embossified version available as Embassy package)
Regards, Guy Bottu, BEN
[EMBOSS] wEMBOSS 1.8.0 and wrappers4EMBOSS 2.2.0 released
Dear users of wEMBOSS, We have the pleasure to announce that the new version of wEMBOSS has been released. It contains important bug and feature fixes :
- wEMBOSS allows you to type in your Email address, so that the program is started "in the background" and you are warned by Email when it has finished. The idea was that you could then start other programs or close your Web browser. At some point however a bug appeared that made the program crash on many systems when you tried that. This bug has been fixed.
- wEMBOSS did not work under Mac OSX unless a small change was made in the code (in the SETUID part). To relieve wEMBOSS site managers from the burden of having to "hack" the code themselves, we now distribute separate versions for Mac OSX and for other UNIX flavours.
- The "Search for programs" function retrieved a huge number of irrelevant program names. This number has been drastically reduced by no longer searching in the "See also" section of the program manuals.
- The on-line manuals in HTML format are installed at different locations, depending on the EMBOSS version and on whether it is a standard EMBOSS or an EMBASSY program. wEMBOSS can now always find the manual, wherever it is installed by default.
For a complete list of the changes you can look at the Changelog : http://sourceforge.net/project/shownotes.php?group_id=170030&release_id=615018
wEMBOSS 1.8.0 is released together with the new version 2.1.0 of wrappers4EMBOSS. Besides a lot of fixes, refreshments and enhancements it addresses the following important issues :
- Those who installed the EBI Web Services module will have noticed that it stopped functioning. The reason is that the EBI upgraded its SOAP server without backwards compatibility. The new version of the wrapper contains upgraded versions of the clients.
- Something some of you were asking for : a tool to use a remote MRS server as sequence access mechanism under EMBOSS (using Web Services).
- The PHYLIP suite (available under EMBOSS as an EMBASSY package) handles the parsimony and distance methods quite well, but is very weak for Maximum Likelihood (only 1 model, slow on big datasets). To address this deficiency we have added wrappers for PhyML and ModelGenerator.
For a complete list of the changes you can look at the Changelog : http://sourceforge.net/project/shownotes.php?group_id=170030&release_id=615013
Regards, Belgian EMBnet Node - wEMBOSS development team, Guy Bottu
Re: [EMBOSS] SeaView
Staffa, Nick (NIH/NIEHS) wrote: I would really appreciate any evaluation of the editor SEAVIEW by anyone reading this. Do you think SeaView is as good as SeqLab ? Dear Nick, Well, at the BEN site I installed SeaView as a replacement when we lost GCG+SeqLab. SeqLab does multiple sequence editing + graphical display of features + allows you to execute programs on sequences (or ranges). SeaView is basically just an editor (although it can call clustal or muscle to re-align portions of the alignment, and make a dotplot). What I find a great loss is that in SeqLab you can select a block with the mouse and then delete it with one click, while with SeaView you need to press the key as many times as there are columns to delete. But if you have nothing else, SeaView is certainly not bad. It has the advantage of being freeware and of having versions for several platforms. You can install it and test it yourself. It is in principle easy to install (although there might be problems with prerequisite libraries). You can obtain it from ftp://pbil.univ-lyon1.fr/pub/mol_phylogeny/seaview (and in some LINUX distributions like Gentoo it is actually available). If you run into trouble I am willing to give some advice. Regards, Guy Bottu, Belgian EMBnet Node
[EMBOSS] The location of the HTML manuals
Dear Peter and Alan, We are currently working on a new version of wEMBOSS+wrappers4EMBOSS and on this occasion we were again troubled by the location of the manuals in HTML format. Indeed, they once were all in one directory, but now they are partly in .../share/EMBOSS/doc/html/emboss/apps and partly in .../share/EMBOSS/doc/html/embassy/xxx. I actually wonder what problem you tried to solve by doing that. It certainly creates a problem for developers of Web interfaces, because popping up a manual by generating a hyperlink is no longer easy ; when all the manuals are in the same directory the interface program must just generate a hyperlink yyy/xxx.html, where yyy is always the same thing and xxx is the name of the program. At the BEN site it works because I manually copied all the manuals to one location. This however does not make life easy for those who download+install wEMBOSS(+wrappers4EMBOSS). Regards, Guy Bottu, BEN
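What a Web interface has to do with the split layout described above can be sketched in Python: look in emboss/apps first and then in every embassy subdirectory. This is only an illustration under the directory layout named in the message; the function name `find_manual` is hypothetical and this is not the code used by wEMBOSS.

```python
import glob
import os


def find_manual(doc_root, program):
    """Return the path of program.html below doc_root, or None.

    doc_root stands for the .../share/EMBOSS/doc/html directory. Standard
    EMBOSS manuals are looked up in emboss/apps, EMBASSY manuals in any
    subdirectory of embassy.
    """
    candidates = [os.path.join(doc_root, "emboss", "apps", f"{program}.html")]
    candidates += sorted(
        glob.glob(os.path.join(doc_root, "embassy", "*", f"{program}.html"))
    )
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

With all manuals in one directory the whole function would reduce to a single `os.path.join`, which is the simplification asked for in the message.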
Re: [EMBOSS] Dot Plot
Staffa, Nick (NIH/NIEHS) wrote: After a day of futzing around trying to create the point file that GCG's Compare makes, with an whole page of Perl, I attacked the figure file it makes. If one doesnt care about axes, then 2 or 3 commands set up the window and one adds his points in the original units a half page of perl. Figure is a great program. It's so old that surely they could open source it. That has been discussed a lot these days. Since they have abandoned the idea of making cash from it, why could they not make the very useful SeqLab or even all their software OpenSource ? But anyway, there is the GCGfigure program that runs on old MacOS and has always been freeware. Guy
Re: [EMBOSS] Dot Plot
Staffa, Nick (NIH/NIEHS) wrote: I am doing a genomic survey of sorts. I generate a lot of X,Y coordinates. I would like to make a plot of these. If you still have a copy of GCG lingering around : GCG has separate programs to compute the XY-coordinates and to actually draw the plot (dotplot). You will of course need to write a Perl script to put the coordinates in the format used by dotplot. Another possibility I can think of is the tools created by the UCSC. If you can get your coordinates in the right lav-like format (tricky, because they are not individual dots but rather straight line segments), you could use the LAJ Java application (interactive) or the mkdotplot program from the PipMaker server (creates a PostScript file). You can look at http://oryx.ulb.ac.be/embosshelp/blastz.html#output to get a snapshot of what the input ada.blastz and the output ada.blastz.dotplot.pdf look like. Regards, Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] How to use toggle properly ?
Peter Rice wrote: Sebastien Moretti wrote: Just one further point - the user can still change the value of symbcons by putting -symbcons=value on the commandline so you need to use the -cons toggle (or boolean) value to test whether to use the value in your program. How can I do this ? > Simply test the value of cons in your application, and ignore the value of symbcons if cons is false. As Peter points out, the standard: "$(cons)" in the ACD only prevents the program from prompting for symbcons when run interactively in a text terminal, or from presenting the symbcons box in a GUI (provided that the GUI handles the ACD correctly). But the program can still obtain a value for symbcons. So, inside the C code of the program we need :
  if (ajAcdGetBool("cons"))
  {
      myconsensussymbolstring = ajAcdGetString("symbcons");
      /* here some code that computes the consensus */
  }
  else
  {
      /* optionally something to do if you do not want a consensus */
  }
If you use a toggle instead of a boolean it would be ajAcdGetToggle. Guy Bottu
Re: [EMBOSS] How to use toggle properly ?
Sebastien Moretti wrote:
> I have an option (string symbcons) in my application that must be used only if another option is chosen (boolean cons).
Dear Sebastien, I have some experience with writing EMBOSS programs and I think I can help you.
EMBOSS parses the ACD files from top to bottom, so you must write the object "boolean: cons" first and "string: symbcons" later. To make sure that the program at the command line will only prompt for symbcons when cons is set to y, you must include in the object "boolean: cons" an attribute standard: "Y" and in the object "string: symbcons" an attribute standard: "$(cons)". Or you put additional instead of standard.
Note that in case the user chooses n for cons, symbcons will not be prompted for and will be set to its default value. You must hence make sure that the C code of the program does something intelligent with that.
Note that the objects of type boolean and toggle are handled the same by EMBOSS. The distinction was introduced for the sake of GUI developers.
Note also that there exist "expert" parameters that are neither standard nor optional. They are never prompted for at the command line. To allow GUIs to hide them the attribute "needed" was invented, but it depends on the GUI whether it is supported.
Regards, Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] needle -nobrief
Dear Marc, I indeed do not see the explanation in the manual, but I did some testing and I understand that :
- Identity and Similarity are computed by dividing by the total length of the alignment
- Longest_Identity and Longest_Similarity are computed by dividing by the number of aligned positions (= total length of the alignment - number of indels)
- Shortest_Identity and Shortest_Similarity are computed by dividing by the length of the longest sequence
Regards, Guy Bottu, Belgian EMBnet Node
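The three denominators listed in this message can be sketched in a few lines of Python. This is only an illustration of the arithmetic as described above, not code taken from needle; the function name identity_percentages and the "-" gap symbol are assumptions, and only identity is computed (similarity would additionally need a scoring matrix).

```python
def identity_percentages(aligned_a: str, aligned_b: str, gap: str = "-") -> dict:
    """Return the three identity percentages for one pairwise alignment.

    Both arguments are rows of the same alignment, so they must have the
    same length and use `gap` at indel positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")

    columns = list(zip(aligned_a.upper(), aligned_b.upper()))
    identical = sum(1 for a, b in columns if a == b and a != gap)

    # Denominator 1: every column of the alignment, gaps included.
    alignment_length = len(columns)
    # Denominator 2: columns where both sequences have a residue.
    aligned_positions = sum(1 for a, b in columns if a != gap and b != gap)
    # Denominator 3: ungapped length of the longer of the two sequences.
    longest_sequence = max(
        sum(1 for a, _ in columns if a != gap),
        sum(1 for _, b in columns if b != gap),
    )

    def percent(count: int, total: int) -> float:
        return 100.0 * count / total if total else 0.0

    return {
        "Identity": percent(identical, alignment_length),
        "Longest_Identity": percent(identical, aligned_positions),
        "Shortest_Identity": percent(identical, longest_sequence),
    }


if __name__ == "__main__":
    for name, value in identity_percentages("ACGT-A", "AC-TTA").items():
        print(f"{name}: {value:.1f}%")
```

The same alignment gives three different figures, which is why it matters which denominator a report uses.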
Re: [EMBOSS] Fasta
Staffa, Nick (NIH/NIEHS) wrote:
> ?There's no fasta in EMBOSS? We have the latest EMBOSS and we now got Jemboss to run. I don't see fasta.
GCG had a GCGized version of fastA but there never was a fastA in EMBOSS. You can however install the native fastA from U. Virginia and put on top of it wrappers4EMBOSS (http://wemboss.sourceforge.net/). I do not know how easy or difficult it is to make it run properly under Jemboss ; if you try, let us all know.
Regards, Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] A question about CON entries
Peter Rice wrote:
> When reading a CON entry we need a database to use to read the true sequence and features. If we are reading from a database we can add the information in the database definition. How do we define a default to resolve EMBL CON entries? Can we handle EMBL release and EMBL updates?
There are a number of practical issues :
- an entry with "join" information can come from a databank as well as from a file.
- EMBL and GenBank CON entries refer to segments in the same databank, but RefSeq refers to GenBank.
- a sequence presented to EMBOSS can be of CON or ANN type but already have a re-assembled sequence (depending on where it comes from).
- each site has its own DB entries in emboss.default, so code that explicitly says "search in embl" might not work.
So, IMHO :
- We need code for two cases : embl format (for EMBL,...) and GenBank format (for GenBank, RefSeq,...). The software must look whether there are CO, respectively CONTIG, lines in the entry ; looking for CON in the ID line is not good.
- for databank sequences : the DB entry in emboss.default should have a parameter that indicates in which databank to search for the segments. If a site has RefSeq and EMBL but no GenBank, then RefSeq could still use sequence information from EMBL. If there is no parameter in the DB entry EMBOSS could, for embl or genbank format entries, search by default in the same databank or simply not try the assembly (what do you think is the best ?).
- for "personal" sequences from files : this is more tricky. Maybe an associated or advanced parameter that says that if the input sequence is of "join" type it must use a databank or file to retrieve the sequences. E.g. -sjoin=xxx or -join=xxx. If xxx is a databank the segments can be retrieved using the standard method defined in emboss.default and if xxx is a file it can be searched sequentially.
There are still some issues :
- the program entret is for retrieving entries as they are rather than for processing sequence information. Should entret also try the assembly or not ?
- feature information is another matter. Some entries have no or very poor feature information, but there are entries that have features that are different from those of the segment entries (this is certainly so for the ANN entries in EMBL and for RefSeq). How should we handle this ?
Guy Bottu, BEN
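The point about testing for CO or CONTIG lines rather than for "CON" in the ID line can be illustrated with a small Python sketch. It is not EMBOSS code; the function name has_join_information and the two format labels are assumptions, and it relies only on the flat-file convention that EMBL line codes sit at the start of the line (CO) and that GenBank uses a CONTIG keyword in the first column.

```python
# Line prefixes that carry the "join" information in each flat-file format.
JOIN_LINE_PREFIX = {
    "embl": "CO ",        # EMBL: two-letter line code followed by spaces
    "genbank": "CONTIG",  # GenBank / RefSeq: keyword in the first column
}


def has_join_information(entry_text: str, entry_format: str) -> bool:
    """Return True if the entry contains CO (EMBL) or CONTIG (GenBank) lines.

    The ID/LOCUS line is deliberately not inspected, so an entry that is
    labelled CON but already carries no join lines is reported as False.
    """
    try:
        prefix = JOIN_LINE_PREFIX[entry_format.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {entry_format!r}") from None
    return any(line.startswith(prefix) for line in entry_text.splitlines())


if __name__ == "__main__":
    # Hypothetical miniature entry, for illustration only.
    entry = (
        "ID   XX000001; SV 1; linear; genomic DNA; CON; HUM; 100 BP.\n"
        "CO   join(AB000001.1:1..50,AB000002.1:1..50)\n"
        "//\n"
    )
    print(has_join_information(entry, "embl"))
```

A real implementation would still have to decide where to fetch the segments from, which is the open question raised in the message.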
[EMBOSS] A question about CON entries
Dear friends, As you know, the databanks EMBL, GenBank and RefSeq contain entries that do not contain the sequence itself, but instead "join" information about how to compose the sequence using ranges from other sequences. When the databank has been installed under EMBOSS the sequence is not available, since EMBOSS will give a "Warning: Sequence has zero length". Do you think that EMBOSS should be able to handle such cases ? And in the meantime, do you know about tools that can, given as input the "CON" sequence and a databank with sequences, assemble the sequence on-the-fly ?
Guy Bottu, BEN
Re: [EMBOSS] extracting "promoter" regions
Shrish Tiwari wrote:
> Is there a way using EMBOSS programs to extract 2kb region around the transcription start site. When I use extractfeat -type mrna -before 1000 -after 1000 I get 2kb regions around every splice site apart from what I need. Any suggestions?
Try adding the parameter -join
Regards, Guy Bottu, Belgian
[EMBOSS] extractalign
Dear all, I just noticed that EMBOSS version 5 contains a program extractalign, which extracts ranges from a multiple sequence alignment. This is certainly an interesting tool. The program is however not accompanied by an on-line manual and it is not mentioned in the Changelog. Any comment from the developers ?
Happy Christmas to you all, Guy Bottu, BEN
Re: [EMBOSS] restrict -limit
Aengus Stewart wrote:
> I seem to be having trouble with restrict not picking up -limit or am I not
> using it correctly?
restrict by default searches only for prototype enzymes ; if you want to see all enzymes you must explicitly set -nolimit. I however notice that at our site too the file .../share/EMBOSS/data/embossre.equ does not contain entries for BssKI and BseBI, while it should. Maybe there is a bug in the program rebaseextract or some subtle typo in the files from Rebase. Could the EMBOSS team figure it out ?
Guy Bottu, Belgian EMBnet Node
[EMBOSS] wrappers4EMBOSS 2.0.0 released
Dear colleagues, You might have noticed that the wEMBOSS SourceForge site had not been updated for quite a while. The reason is the tragic illness and death of our site manager and fellow developer Martin Sarachu from the Argentinean EMBnet Node (see http://www.embnet.org/files/shared/EMBnetNews/embnet_news13_3-high.pdf). The maintenance of the wEMBOSS project has now resumed and we can announce the release of a new version 2.0.0 of wrappers4EMBOSS. A new version of wEMBOSS (with version 2 of wrappers4EMBOSS included) is to be expected in early 2008.
An important change in wrappers4EMBOSS is that it now uses the EMBASSY module MYEMBOSS rather than EMBOSS itself for compiling and installing. This has two advantages : it goes much faster (which can be convenient if you need to redo it several times) and it avoids overwriting customizations you might have performed directly in the installed version of EMBOSS rather than in the source (addressing an issue raised by some users). The drawback is that the manual in HTML format is now installed in .../share/EMBOSS/doc/html/embassy/myseq. A future version of wEMBOSS will fully address the problem of manual location ; for the moment users of wEMBOSS and some other GUIs might have to copy the manual entries "manually" to another location.
There are two new modules : an interface to a local installation of the InterProScan server and an interface to a local installation of Maarten Hekkelman's MRS software for indexing databanks. The MRS module contains a program mrsindexsearch, which is much like the SRS based indexsearch. It also provides a tool for using a local MRS as sequence databank access mechanism for EMBOSS and a command line scripting tool.
For a complete list of changes, you can consult the Changelog http://sourceforge.net/project/shownotes.php?group_id=170030&release_id=557931
As far as we know, wrappers4EMBOSS 2.0.0 and wEMBOSS 1.7.1 work well with all versions of EMBOSS from 3.0.0 to 5.0.0. If you however notice a bug, let us know. Also, since we do not have an SRS server anymore, we do not have the expertise to continue supporting SRS beyond version 8.1 ; if you have a version 8.2 and notice a need for change in our SRS module, suggestions are welcome.
Regards, Belgian EMBnet Node - wEMBOSS development team, Guy Bottu
Re: [EMBOSS] Seqret and searching a database with entries in a file
Sudeep Mehrotra wrote:
> I have a database and I have a file which consists of list of protein
> IDs. I want use seqret to search each entry (in the given file) in
> the given database and output the search into another file.
Dear Sudeep, If you can, using some script, transform your file into the format :
    xxx:AC3355
    xxx:CG6754
    xxx:AV6754
with xxx the name of the databank (you might have to use bare accession numbers rather than version numbers), then it is easy, just run
    seqret list::File
If you want the original entries rather than the entries in fastA format, use entret instead of seqret.
Guy Bottu, Belgian EMBnet Node
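The "some script" mentioned above could look like the following Python sketch. The function name to_emboss_list and the strip_version switch are assumptions made for illustration; the only thing taken from the message is the target format, one databank:identifier pair per line.

```python
def to_emboss_list(identifiers, database: str, strip_version: bool = True) -> str:
    """Turn bare identifiers into the 'database:identifier' lines of a list file.

    Blank lines are skipped. With strip_version=True a trailing '.N'
    version suffix is removed, leaving the bare accession number.
    """
    lines = []
    for raw in identifiers:
        identifier = raw.strip()
        if not identifier:
            continue
        if strip_version:
            identifier = identifier.split(".")[0]
        lines.append(f"{database}:{identifier}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 'xxx' stands for the databank name, as in the message above.
    print(to_emboss_list(["AC3355", "CG6754", "AV6754.2"], "xxx"), end="")
```

Writing the returned text to a file named File then allows the seqret list::File call shown in the message.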
[EMBOSS] Question about acidify
Dear Peter, dear Alan, dear all, I remember that there was talk of implementing a tool called acidify that would allow for the easy integration of software under EMBOSS (with the help of an ACD file but without an elaborate EMBOSS "wrapper" program). Can someone tell me how far this has gone ? I ask this question because my colleagues of the SIMDAT project have expressed their interest.
Guy Bottu, BEN
Re: [EMBOSS] [emboss-dev] MRS 3 as EMBOSS sequence access mechanism
Peter Rice wrote:
> How does MRS3 search for other fields (accession, etc.)
Alas, with MRS2 you could write a single URL to perform a query and then retrieve the result as plain text. With MRS3 this has become impossible, because there are now separate programs. Could EMBOSS eventually first ask for a query.do, parse the reply and then ask for a plain.do ? Or you could consider using the Web services instead. I have no experience with the Web services under MRS3 ; I will in the coming weeks try to get informed.
> Where is the MRS server documentation??? I found no documentation of the URL
> syntax or query fields at the CMBI server.
There is no documentation for the moment. Well, let's not blame dear Maarten, he is alone in his corner with a big development project... You can however find a list of available databanks, with for each databank the available field shortnames. It is possible to write a URL like
    /mrs-3/query.do?db=uniprot&query=id:papa?_carpa|ac:papa?_carpa
Guy
[EMBOSS] MRS 3 as EMBOSS sequence access mechanism
Dear all, The MRS servers are shifting from version 2 to version 3. A result is that the EMBOSS sequence access mechanism "mrs" does not work anymore. The reason is that the URL to be generated by the software is no longer something like
    http://mrs.cmbi.kun.nl/mrs/cgi-bin/mrs.cgi?db=sprot%2Btrembl&query=id:papa1_carpa&format=entry&exp=1&save_to=text/plain
but
    http://mrs.cmbi.kun.nl/mrs-3/plain.do?db=uniprot&id=papa1_carpa
It is not possible to search several databanks at the same time, but MRS 3 has composite indexes for e.g. uniprot (=sprot+trembl) or embl (=nonredundant release + updates). It is also impossible to make a query with wild cards. A supplementary problem is that the MRS WWW server wants to set a cookie.
Do you think the next release of EMBOSS could support MRS 3 (maybe, in the tradition of backward compatibility, have both "mrs" and "mrs3" access methods)? And are some of you really using MRS WWW as sequence access mechanism for EMBOSS ?
Regards, Guy Bottu, Belgian EMBnet Node
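To make the difference between the two URL styles concrete, here is a small Python sketch that rebuilds both example URLs from their parts. It only reproduces the two URLs quoted in the message; the function names are assumptions, the parameter names are copied from those URLs, and nothing here is verified against a running MRS server.

```python
from urllib.parse import urlencode

MRS_HOST = "http://mrs.cmbi.kun.nl"  # host taken from the example URLs


def mrs2_entry_url(databanks, entry_id: str) -> str:
    """MRS 2 style: one CGI call, several databanks joined with '+'."""
    query = {
        "db": "+".join(databanks),
        "query": f"id:{entry_id}",
        "format": "entry",
        "exp": "1",
        "save_to": "text/plain",
    }
    # ':' and '/' are left unescaped so the result matches the quoted URL.
    return f"{MRS_HOST}/mrs/cgi-bin/mrs.cgi?" + urlencode(query, safe=":/")


def mrs3_entry_url(databank: str, entry_id: str) -> str:
    """MRS 3 style: plain.do takes a single databank and a bare id."""
    return f"{MRS_HOST}/mrs-3/plain.do?" + urlencode({"db": databank, "id": entry_id})


if __name__ == "__main__":
    print(mrs2_entry_url(["sprot", "trembl"], "papa1_carpa"))
    print(mrs3_entry_url("uniprot", "papa1_carpa"))
```

The MRS 3 function takes one databank name only, which mirrors the remark that several databanks can no longer be searched in one request.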
[EMBOSS] suggestion for improving restrict
Dear users and developers of EMBOSS, One of our users has a suggestion for improving restrict. He needs a list with the lengths of the restriction fragments, in the same order as they appear on the plasmid. Do you think this is an interesting addition ? (Note also how I suggested he get around it using Excel).
Regards, Guy Bottu, BEN
- Forwarded message from Xavier Danthinne <[EMAIL PROTECTED]> -
From: "Xavier Danthinne" <[EMAIL PROTECTED]>
To: "BEN administration" <[EMAIL PROTECTED]>
Subject: Re: EMBOSS
Date: Fri, 8 Jun 2007 10:16:10 -0600
Hello Guy, I read the exercise #6 that you mentioned, and the third paragraph ("Now suppose that...") is exactly what I was looking for. I understand that the solution is to save the data from "restrict" as tab-delimited, so we can import them into excel and calculate in a new column the difference between one site and the next one. It is not such a big deal to do that each time, but if the program restrict could do it for us, that would be great. This is what computers are for, isn't it? Thanks for your help, and have a good weekend.
Xavier Danthinne, Ph.D., O.D.260 Inc
- Original Message -
From: "BEN administration" <[EMAIL PROTECTED]>
To: "Xavier Danthinne" <[EMAIL PROTECTED]>
Sent: Friday, June 08, 2007 3:25 AM
Subject: Re: EMBOSS
>On Wed, Jun 06, 2007 at 11:52:21PM -0600, Xavier Danthinne wrote:
>>I still like using EMBOSS. I have a suggestion regarding the program
>>"restrict". If the program could list restriction fragments with their
>>size in the order by which they constitute a piece of DNA such as a
>>plasmid, this would be great. I work with large cosmids, and it is
>>sometimes difficult to figure out where a specific fragment is located
>>among others. Having this feature (like we had in GCG) would help.
>
>Well, you can export the output in MS Excel format and then quite easily
>compute the list you want. The exercise 6 from
>ftp://ftp.be.embnet.org/pub/BEN_Tutorials/unix_perl/ex-UNIX.doc
>gives you an example of how to.
>If you use wEMBOSS rather than the command line the parameters to set are
>"Comma separated enzyme list" ... "Allow circular DNA?" y "Sort output
>alphabetically?" y (or n, dependent on your needs) "Report format"
>tab-delimited table format.
>Does this help ?
>
>Regards,
>Guy Bottu
- End forwarded message -
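The spreadsheet step described in this thread (difference between one site and the next) amounts to the following Python sketch. It is not part of restrict; the function name fragment_lengths is an assumption, and cut positions are taken to mean "cut after base N", which may differ from how a given report counts.

```python
def fragment_lengths(cut_positions, sequence_length: int, circular: bool = True):
    """Return fragment lengths in the order they occur along the molecule.

    cut_positions are 1-based positions after which the DNA is cut.
    For circular DNA the last fragment wraps around the origin.
    """
    cuts = sorted(set(cut_positions))
    if any(cut < 1 or cut > sequence_length for cut in cuts):
        raise ValueError("cut position outside the sequence")
    if not cuts:
        return [sequence_length]

    # Distance from each cut to the next one.
    fragments = [right - left for left, right in zip(cuts, cuts[1:])]
    if circular:
        # One extra fragment spans the end and the start of the sequence.
        fragments.append(sequence_length - cuts[-1] + cuts[0])
    else:
        fragments = [cuts[0]] + fragments + [sequence_length - cuts[-1]]
    return fragments


if __name__ == "__main__":
    # Hypothetical cut sites on a hypothetical 1000 bp plasmid.
    print(fragment_lengths([100, 400, 900], 1000))
```

For a circular molecule the lengths always add up to the sequence length, which is a convenient sanity check on an imported cut-site table.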
Re: [EMBOSS] Is vectorstrip gapless by design or is it a bug ?
By the way, have you considered using stssearch instead to search for the primers at the ends of the sequence ?
Guy Bottu, Belgian EMBnet Node
[EMBOSS] [EMAIL PROTECTED]: Re: question about display double-strande
- Forwarded message from Duleep Samuel <[EMAIL PROTECTED]> -
Date: Sat, 27 Jan 2007 10:25:55 +0530
From: "Duleep Samuel" <[EMAIL PROTECTED]>
To: "Guy Bottu" <[EMAIL PROTECTED]>
Subject: Re: [EMBOSS] question about display double-stranded DNA
will be useful please add if possible, regards Samuel
On 1/25/07, Guy Bottu <[EMAIL PROTECTED]> wrote:
>On Thu, Jan 25, 2007 at 02:23:00PM, Peter Rice wrote:
>> I am looking at remap changes at the moment, I will see what I can do.
>
>Could you consider an option to reject restriction enzymes that cut within
>a certain range (or ranges). This feature existed in GCG and is really
>something we would like to have (back). Allows e.g. to select enzymes that
>cut around the gene you want to clone, but not inside.
>
>Guy Bottu,
>BEN
- End forwarded message -
Re: [EMBOSS] question about display double-stranded DNA
On Thu, Jan 25, 2007 at 02:23:00PM, Peter Rice wrote:
> I am looking at remap changes at the moment, I will see what I can do.
Could you consider an option to reject restriction enzymes that cut within a certain range (or ranges) ? This feature existed in GCG and is really something we would like to have (back). It allows you e.g. to select enzymes that cut around the gene you want to clone, but not inside.
Guy Bottu, BEN
Re: [EMBOSS] question using 'matpatmotifs'
On Thu, Jan 18, 2007 at 09:14:20AM -0500, Jean Mao wrote:
> I used 'swissprot:hair_drome' as input sequence and run 'matpatmotifs' in
> EMBOSS and got 0 hits. However, when I used the same input sequence on
> interproscan, the result
> (http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-20070118-14025926)
> show that it contains basic Helix-loop-helix motif which is
> ID PS50888 in prosite database. Is this a bug or did I do something wrong? I
> also run the same sequence against the 'motifs' program in GCG package.
> Again no hit was found.
The reason is that GCG motifs and EMBOSS patmatmotifs search only the PROSITE entries of type "pattern", while PS50888 is of type "matrix". If you want to search the complete PROSITE (patterns+matrices+rules), you can download the ps_scan script from ftp://ftp.expasy.org/databases/prosite/tools/ps_scan/sources and the pftools package from ftp://ftp.isrec.isb-sib.ch/pub/sib-isrec/pftools/pft2.3 You can run this under EMBOSS with the wrappers4EMBOSS package (http://wemboss.sourceforge.net/).
Hope this helps, Guy Bottu, BEN
Re: [EMBOSS] Bug in 'remap' program?
On Wed, Jan 17, 2007 at 12:07:31PM -0500, Mao, Jean (NIH/CIT) [E] wrote:
> How about BmgT120I, based on the 'redata' program, it has isoschizomers, but
> none was listed in my output.
> UnbI has isoschizomers also and has NO commercial provider listed.
You have indeed pinpointed a bug or misfeature. The problem might be that the prototype enzyme is AsuI, but AsuI has no commercial providers. It is easier to see this in our MRS server than using redata : http://bendisk.ulb.ac.be/mrs/cgi-bin/mrs.cgi?db=rebase&query=BmgT120I So, several isoschizomers of AsuI are displayed in the output instead of just one enzyme. Could Alan Bleasby comment on this ?
Guy Bottu, BEN
Re: [EMBOSS] Bug in 'remap' program?
Dear Jean, The program remap by default only outputs one representative member (the prototype) of a series of isoschizomers and it only considers enzymes that have a commercial provider. If you want to see all enzymes you must run remap with parameters -nolimit -nocommercial.
Regards, Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] Simple configuration of databases
On Thu, Nov 23, 2006 at 09:29:15AM +0300, Maxim Jankov wrote:
> Fortunately, I don't want to download all these gigabytes, and better
> configure emboss to work with all that db's online. But the bad thing is
> that I'm completely stuck in all that manuals and just begging for help of
> someone who done configuration like that and can provide me with ready or
> easy-to-understand config files.
Dear Maxim, I can provide you just like that with some remote databank definitions we use at the BEN site. Put in your file .../share/EMBOSS/emboss.default :
    DB NCBI_NUC [
        type: N
        method: entrez
        format: genbank
        comment: 'nonredundant nuc. acid db at NCBI (by GI number)'
    ]
    DB NCBI_PROT [
        type: P
        method: entrez
        format: genbank
        comment: 'nonredundant protein db at NCBI (by GI number)'
    ]
    DB EBI_EMBL [
        type: N
        methodquery: external
        format: embl
        comment: 'EMBL at EBI'
        app: '/opt/sw/EBIWS/WSDbfetchClient.pl fetchData embl:%s'
    ]
    DB EBI_EMBLSVA [
        type: N
        methodquery: external
        format: embl
        comment: 'EMBL Sequence Version Archive at EBI (by SV)'
        app: '/opt/sw/EBIWS/WSDbfetchClient.pl fetchData emblsva:%s'
    ]
    DB EBI_EMBLCDS [
        type: N
        methodquery: external
        format: embl
        comment: 'EMBL Coding Sequences at EBI (by ProteinID)'
        app: '/opt/sw/EBIWS/WSDbfetchClient.pl fetchData emblcds:%s'
    ]
Note : EMBOSS does have a databank access method "dbfetch", but I never managed to get it working (can someone who did tell me how ?). Instead I use the script WSDbfetchClient.pl (see attachment). To make it work you will need to install in your Perl the module SOAP-Lite (version 0.60 and not higher !). You will have to adapt the "app" parameter here above and the "shebang line" of the script.
Regards, Guy Bottu, Belgian EMBnet Node
[EMBOSS] about restriction mapping
Dear EMBOSS team, While giving a course yesterday I was reminded of the following thing : under GCG the restriction mapping programs have a parameter that lets you specify sequence ranges where enzymes are not allowed to cut. This can be very useful, e.g. when you want to select restriction enzymes with which to cut out a gene to be cloned (and which of course should not cut inside the gene). Would it not be nice if the EMBOSS programs had the same functionality ? Does anybody else also feel a need for this feature ?
Guy Bottu, BEN
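What such a parameter would have to do can be sketched in Python as a filter over cut sites. This is only an illustration of the requested behaviour, not an existing EMBOSS option; the function name enzymes_sparing_ranges is an assumption and the cut positions in the example are made up.

```python
def enzymes_sparing_ranges(cut_sites: dict, protected_ranges) -> list:
    """Return enzymes that cut somewhere, but never inside a protected range.

    cut_sites maps an enzyme name to its list of cut positions;
    protected_ranges is a list of inclusive (start, end) pairs, for
    example the coordinates of the gene to be cloned.
    """
    def is_protected(position: int) -> bool:
        return any(start <= position <= end for start, end in protected_ranges)

    return sorted(
        enzyme
        for enzyme, positions in cut_sites.items()
        # Non-cutters are dropped too: they cannot excise anything.
        if positions and not any(is_protected(p) for p in positions)
    )


if __name__ == "__main__":
    # Made-up cut positions; the gene is assumed to span 500..2000.
    sites = {"EcoRI": [120, 2500], "BamHI": [800], "HindIII": [50, 3000], "NotI": []}
    print(enzymes_sparing_ranges(sites, [(500, 2000)]))
```

A further refinement, not shown here, would be to require at least one cut on each side of the protected range.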
Re: [EMBOSS] using emboss programs from spin]
On Thu, Nov 02, 2006 at 09:00:20AM +0100, Miguel Ortiz Lombardia wrote:
> Here, a number of users are quite afraid of the command line ;-{ so they
> would appreciate a graphic interface to EMBOSS. We tried Jemboss first,
> but got a number of problems that we couldn't solve (essentially with
> some results text files being corrupted). Then, I met with this
> interface from the Staden package, and found it very useful, to a
> certain extent.
Dear Miguel, If you want a GUI for EMBOSS another possibility is wEMBOSS, a Web interface with the user data permanently stored on the server side. A version that fully supports EMBOSS 4 is just out (see http://wemboss.sourceforge.net).
However, I consider that Staden spin remains useful. When I personally have to do some bioinformatic analysis with EMBOSS, I usually work at the command line. But for programs that produce an XY-graph I use spin, because it has this nice feature of displaying in one window a zoomable graph and in another window the sequence, allowing you to see which detail of the graph corresponds to which range of the sequence.
Regards, Guy Bottu, BEN
Re: [EMBOSS] Emboss-explorer error
On Thu, Nov 02, 2006 at 10:49:42AM -0500, Sumit Middha wrote:
> I have setup emboss-explorer and it works fine for most of the tools.
> However, fuzznuc, fuzztran seem to give an error (unknown datatype
> pattern).
That is because in EMBOSS 4 the ACD files for the "fuzzies" have a new datatype called "pattern" ; also, the option "number of allowed mismatches" is now an attribute of "pattern" rather than a separate ACD parameter. Maybe you can send a message to Luke and ask him to mend this. Note by the way that the new version of the interface wEMBOSS, which can handle all the EMBOSS 4 intricacies, is just out.
Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] How to apply the einverted and etandom to a fasta file
On Sun, Oct 29, 2006 at 11:39:35AM -0600, yun zheng wrote:
> I am a new user of emboss. I am trying to find repeat sequences in a
> nucleotide sequence file that have many sequences.
>
> Can anybody tell me how to use einverted and etandem to analyze all the
> sequences in a fasta file?
einverted searches for palindromes rather than repeats. It operates without problem on a fastA multiple sequence file. The reason that the output file is empty is probably that it did not find any good palindrome. Maybe you can try experimenting with the parameters.
etandem operates on only one sequence at a time. You can see this because if you do etandem -help you see that it takes as input an object of type "sequence" rather than "seqall". If you want to treat many sequences at once, you will need to put them in separate files. If necessary you can run seqret -ossingle on your file. You can under the Tc shell (tcsh) (provided your files are all called something.fasta) do :
    foreach FASTAFILE (`ls *.fasta`)
        etandem $FASTAFILE -minrepeat=10 -maxrepeat=10 -threshold=20 -auto
    end
The problem is that etandem only works well if you provide an appropriate value for minrepeat/maxrepeat/threshold. You can use equicktandem to get an idea (look in the 4th column of the output for a repeat size). Working on all sequences in one run will of course only go well if they all contain repeats of similar size and quality.
I hope this helps. Guy Bottu, Belgian EMBnet Node
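For those not using tcsh, the loop from this message can be approximated in Python as below. It is a sketch under the same assumptions as the message (files named something.fasta, one sequence per file, etandem on the PATH); the function names are invented for illustration and the qualifiers are exactly those of the tcsh example. By default it only builds the commands and runs nothing.

```python
import subprocess
from pathlib import Path


def etandem_command(fasta_file, repeat_size: int = 10, threshold: int = 20) -> list:
    """Build the etandem command line used in the tcsh loop for one file."""
    return [
        "etandem",
        str(fasta_file),
        f"-minrepeat={repeat_size}",
        f"-maxrepeat={repeat_size}",
        f"-threshold={threshold}",
        "-auto",
    ]


def etandem_on_directory(directory=".", dry_run: bool = True) -> list:
    """Build (and, with dry_run=False, run) etandem for every *.fasta file."""
    commands = [etandem_command(path) for path in sorted(Path(directory).glob("*.fasta"))]
    if not dry_run:
        for command in commands:
            # check=True stops at the first file on which etandem fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in etandem_on_directory("."):
        print(" ".join(command))
```

Keeping the repeat size and threshold as arguments makes it easy to plug in the values suggested by equicktandem.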
Re: [EMBOSS] case sensitive identifiers
On Fri, Sep 29, 2006 at 11:27:51AM +0100, Peter Rice wrote:
> So, there will be 2 new (and for the first time boolean) attributes for
> databases. To use them, you will need:
>
> caseidmatch: "Y"
> hasaccession: "N"
The "hasaccession" attribute is certainly useful for search methods like SRS and MRS that have the notion of searching in separate indexes. By default searching both "id" and "ac" is the thing to do, but there are databanks where there is no "ac" indexed and there are databanks, like EMBL or IMGTHLA, where the "id" and the "ac" are always identical, so that searching only the "id" gains time without losing functionality.
As for the case problem, I think we agree that the best is to always hand the sequence name as such (case as typed by the user) to the search method and, in case the search method itself is not case sensitive but the databank is, let EMBOSS if 'hasaccession: "Y"' parse the retrieved sequences and accept only those that match. This will work fine for SRS (and of course for the method "direct", where EMBOSS does all the work), but it will not work for MRS, since the current version of MRS does not allow case-different index words.
Guy
Re: [EMBOSS] case sensitive identifiers
On Fri, Sep 29, 2006 at 09:28:22AM +0100, [EMAIL PROTECTED] wrote:
> For the PDB case, really only the end of the ID is case-sensitive. Do you
> think the database should be case-sensitive for the whole ID, or does it
> make sense to check for a pattern as the case-sensitive part?
I think that trying to define which part of the ID is case-sensitive is making it just too complicated. Let's have it completely case-sensitive or not at all.
> EMBOSS will initially read only one sequence for a seqall ... it does not
> read in all the sequences and look for duplicates so we have to decide in
> the emboss.defaults DB definition how to check a single ID (no way to read
> them all and check for duplicates).
Trying to check for duplicates is again too complicated. I understand that if a databank or a multiple sequence file has duplicates a "sequence" will retrieve the first and a "seqset" or "seqall" will retrieve them all. Well, let it be that way. It is the responsibility of the database manager/user to make sure there are no duplicates.
Guy
[EMBOSS] case sensitive identifiers
On Thu, Sep 28, 2006 at 03:32:36PM +0100, Peter Rice wrote:
> For EMBOSS well, we could play with the way databases work. Not all
> access methods allow case sensitive searching, but we could fetch all
> entries and try to reject those that do not match. This would need
> something in the EMBOSS id. We already allow modifiers after the id to
> set sequence ranges pdbprot:1fbt_a[1:20] or we could add a qualifier
> -scasesensitive for all sequence inputs.
For the moment our emboss.default contains :
    DB pdbprot [
        type: P
        format: fasta
        comment: 'protein sequences from PDB'
        methodquery: app
        app: "/nfsben/srs/bin/linux73/getz -e '[pdbprot-id:%s]'"
        methodall: direct
        dir: /nfsben/srs/data/blast/dbfb/pdb
        file: pdb
    ]
and seqret pdbprot:1ml5_s yields :
    >1ml5_S 30S RIBOSOMAL PROTEIN S16
    MVKIRLARFGSKHNPHYPHYRIVVTDARRKRDGKYIEKIGYYDPRKTTPDWLKVDVERAR
    YWLSVGAQPTDTARRLLRQAGVFRQEAREGA
    >1ml5_s 50S RIBOSOMAL PROTEIN L22
    MEAKAIARYVRISPRKVRLVVDLIRGKSLEEARNILRYTNKRGAYFVAKVLESAAANAVN
    NHDMLEDRLYVKAAYVDEGPALKRVLPRARGRADIIKKRTSHITVILGEKHGK
So, your idea of fetching all entries and then parsing them would work for SRS. I however think that instead of an associated parameter -scasesensitive it would be better to have in the emboss.default syntax for DB entries an optional parameter case:. You should be able to handle the situation where it is appropriate to pass an id to a case sensitive search method and the situation where it is appropriate to parse the output of a case-insensitive search method. This can best be decided for each databank at EMBOSS site configuration time, rather than at sequence retrieval time. What do you think ?
Regards, Guy Bottu
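The "fetch all entries, then reject those that do not match" idea discussed here can be sketched in Python for fastA output such as the example above. This is an illustration only, not how EMBOSS implements it; the function names parse_fasta and keep_exact_case are assumptions, and the identifier is taken to be the first word after ">".

```python
def parse_fasta(text: str) -> list:
    """Parse fastA text into (identifier, header, sequence) tuples."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            identifier = header.split()[0] if header else ""
            records.append([identifier, header, []])
        elif records and line.strip():
            records[-1][2].append(line.strip())
    return [(identifier, header, "".join(parts)) for identifier, header, parts in records]


def keep_exact_case(text: str, wanted_id: str) -> list:
    """Keep only the records whose identifier matches wanted_id exactly.

    Intended for the output of a case-insensitive search method, which
    may return both 1ml5_S and 1ml5_s for a query on 1ml5_s.
    """
    return [record for record in parse_fasta(text) if record[0] == wanted_id]


if __name__ == "__main__":
    # Sequences shortened for the example.
    retrieved = (
        ">1ml5_S 30S RIBOSOMAL PROTEIN S16\nMVKIRLARFG\n"
        ">1ml5_s 50S RIBOSOMAL PROTEIN L22\nMEAKAIARYV\n"
    )
    for identifier, header, sequence in keep_exact_case(retrieved, "1ml5_s"):
        print(header, sequence)
```

Whether this filter should be applied at all is exactly what a per-databank setting in emboss.default would decide.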
[EMBOSS] case sensitive identifiers
Dear colleagues,

Thure Etzold, the developer of SRS, once said "You cannot imagine anything that crazy or there is at least one database manager who really does it". While trying to put databanks with the protein and nucleic acid sequences extracted from the PDB into our MRS server, I ran into the following problem: some entries have identifiers that differ only by case. E.g. there is a 1fnt_A and a 1fnt_a.

Now, most bioinformatics software is not case-sensitive. I understand that MRS stores indices so that they can be displayed in their original case, but they can only be searched case-insensitively; it automatically modifies redundant indices, e.g. 1fnt_a is stored as 1fnt_a_12835. This is however not ideal.

Should MRS be adapted so that it can handle case-sensitive indices? This would not solve everything, since other software like EMBOSS or GCG is also case-insensitive.

My idea is to let the MRS parser store 1fnt_aLC (LC means lowercase) as the identifier. A user can then search for the sequence he needs in MRS and, in EMBOSS (if the EMBOSS installation uses MRS as its databank access mechanism), ask for the sequence pdbprot:1fnt_alc. This would of course also work with 1fnt_a_12835, but it avoids the use of a meaningless and irreproducible number.

Any comments?

Regards,
Guy Bottu, Belgian EMBnet Node
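The renaming scheme proposed in the message above can be sketched in a few lines of Python. This is only an illustration, not MRS parser code: the function name disambiguate_ids is hypothetical, and the sketch assumes that the "LC" suffix is added only when two identifiers actually collide case-insensitively and that the chain label is the part after the last underscore.

```python
from collections import defaultdict


def disambiguate_ids(ids, suffix="LC"):
    """Map each identifier to a name that stays unique when case is ignored.

    Assumption: only the lowercase-chain member of a colliding pair is
    renamed, by appending the suffix (1fnt_a -> 1fnt_aLC).
    """
    groups = defaultdict(set)
    for ident in ids:
        groups[ident.lower()].add(ident)

    renamed = {}
    for ident in ids:
        _, sep, chain = ident.rpartition("_")
        collides = len(groups[ident.lower()]) > 1
        if collides and sep and chain.islower():
            renamed[ident] = ident + suffix
        else:
            renamed[ident] = ident
    return renamed


if __name__ == "__main__":
    for old, new in disambiguate_ids(["1fnt_A", "1fnt_a", "1abc_b"]).items():
        print(old, "->", new)
```

Unlike a serial number such as 1fnt_a_12835, the resulting name depends only on the identifier itself, so it stays the same every time the databank is rebuilt.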
Re: [EMBOSS] iep program for multiple protein sequences
On Thu, Sep 07, 2006 at 06:27:22PM -0400, Tao Song wrote:
> I wonder can the iep program that calculates the isoelectric point of
> a protein be used for a protein database?

Yes, iep can take multiple proteins as input. It will create an output file with the titration curves of the different proteins one after the other. It does not make one graph for all the proteins together, which would not make much biological sense anyway.

> When asked to input protein sequence I gave 'tsw' instead of
> 'tsw:laci_ecoli' I got an error that said 'sequence must be protein sequence
> without BZ U X or *: found bad character Z'. Does iep can only take one
> protein sequence as input file?

The problem here is that iep only takes protein sequences without ambiguity symbols as input. The reason is no doubt that it is unclear whether B should be handled as an Asp (negative) or an Asn (neutral), etc.

Guy Bottu, Belgian EMBnet Node
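One way to work around the restriction described above is to screen a set of sequences before passing them to iep. The Python sketch below is only an illustration under stated assumptions: the function name split_for_iep is hypothetical, the forbidden characters are taken from the error message quoted in the message (B, Z, U, X and *), and the two example sequences are made up.

```python
# Characters named in the iep error message quoted above.
FORBIDDEN = set("BZUX*")


def split_for_iep(records):
    """Separate (name, sequence) pairs into usable and rejected entries.

    Rejected entries are returned with the sorted list of offending
    characters, so that the user can see why each one was dropped.
    """
    accepted, rejected = [], []
    for name, seq in records:
        bad = sorted(set(seq.upper()) & FORBIDDEN)
        if bad:
            rejected.append((name, bad))
        else:
            accepted.append((name, seq))
    return accepted, rejected


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    demo = [("clean_1", "MKVLAAGIW"), ("ambig_1", "MKZLAAB")]
    ok, dropped = split_for_iep(demo)
    print("usable:", ok)
    print("rejected:", dropped)
```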
Re: [EMBOSS] cutg database
On Fri, Jul 28, 2006 at 02:47:11PM -0500, Andres Pinzon wrote:
> On 7/28/06, Andres Pinzon <[EMAIL PROTECTED]> wrote:
> > Is it necessary to dowwnload all gb* archives from
> > ftp://ftp.kazusa.or.jp/pub/codon/current/ in order to have the cutg
> > database running?
> It seems that one only need the CUTG.151.tar.gz (158Mb) file under the
> compressed/ directory. ;-)
> I have not ran the "cutgextract" command yet, i'll see.

Dear Andres,

It is actually the *.codon files that EMBOSS needs; downloading and extracting CUTG.151.tar.gz will do the job.

There is however a problem: cutgextract puts the codon usage files in .../share/EMBOSS/data/CODONS. Under wEMBOSS this produces a selector with tens of thousands of entries, which delays the transfer of the Web page and is hard to use. I do not know how EMBOSS explorer behaves. The EMBOSS development team is considering putting the CUTG files in a separate directory CUTG, but they have not yet done so.

At the BEN site I have modified cutgextract so that it creates files with the extension .cutg instead of .cut; wEMBOSS only displays the *.cut files in the selector, and the .cutg files can still be accessed by typing in their name or by retrieving them with embossdata.

Hope this information helps,
Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] emboss explorer installation
On Wed, Jul 26, 2006 at 06:37:25PM -0500, Andres Pinzon wrote:
> I just installed emboss 4.0 and wemboss 1.6.0, everythings fine ;-)

I'm afraid you did not take a good look, because wEMBOSS 1.6.0 does not handle a few things that changed between EMBOSS 3.0 and 4.0:
- the program pages cannot display the length of the sequence, because the output format of infoseq changed
- the "fuzzies" do not work because of the new object "pattern"
- wEMBOSS cannot find the on-line manuals because their location changed (see also below)

This will be fixed in wEMBOSS 1.7.0, which will be out at the end of August or the beginning of September.

> Now im trying to install emboss-explorer (it used to be a
> straightforward install) but some things went wrong, for instance, it
> never finds the emboss manuals, as far as i can see from the install
> file it tries to find'em in:
> $EMBOSS_PREFIX/share/EMBOSS/doc/programs/html
> But actually they are in:
> $EMBOSS_PREFIX/share/EMBOSS/doc/html/emboss/apps

Try the following:

  cd $EMBOSS_PREFIX/share/EMBOSS/doc/programs
  ln -s $EMBOSS_PREFIX/share/EMBOSS/doc/html/emboss/apps html

(or make a directory html and just copy the files)

Under wEMBOSS it works, and I see no reason why it should not work under EMBOSS explorer. The only problem is that if you also have "Embassadirs", navigating from one manual page to another causes trouble, because the Embassadirs are supposed to have their pages in different directories.

Guy Bottu, Belgian EMBnet Node
[EMBOSS] the dbfetch and mrs access methods
Dear colleagues,

EMBOSS version 4 has two new database access methods: "mrs" and "dbfetch". Unfortunately, there seems to be no documentation anywhere that explains how to configure them in emboss.defaults, and I did not succeed. Has anyone already successfully installed EMBOSS 4 and used these methods?

Sincerely,
Guy Bottu, BEN
Re: [EMBOSS] Web Interfaces to Emboss Tools
Dear Andy,

Well, you might consider using wEMBOSS, developed by the Belgian and Argentinian EMBnet Nodes (http://wemboss.sourceforge.net/). It works with data stored on the server. The data are in a classic UNIX directory, so you can easily run software on them in a terminal session or use ftp for bulk data transfers, should you want to do that. wEMBOSS has facilities to upload/download data and manage them in different "projects".

On the same site you will also find wrappers4EMBOSS, which allows you to integrate under the EMBOSS/wEMBOSS interface a number of programs that are very useful but lacking in EMBOSS (BLAST, fastA, MUSCLE, a program to search the complete PROSITE rather than just the patterns, ...).

As for the recovery of data from SeqWeb, I have no experience with that. The related problem of how to use data generated by GCG under EMBOSS is simpler: EMBOSS accepts GCG sequence and GCG MSF files, as well as GCG codon usage tables. wEMBOSS includes a facility to convert GCG List Files into EMBOSS List Files.

Sincerely,
Guy Bottu, Belgian EMBnet Node
Re: [EMBOSS] Embossdata -reject option
On Wed, Apr 12, 2006 at 02:39:00PM +0200, Marc Logghe wrote:
> I am intrigued by the -reject option of embossdata.
> According to the doc:
> "This specifies the names of the sub-directories of the EMBOSS data
> directory that should be ignored when displaying data directories.
> Choose from selection list of values 3, 5, 6".
> I was not able to find out what this list of values corresponds to.

Indeed tricky to find out what this means :-; You can look in the file .../share/EMBOSS/acd/embossdata.acd:

  selection: reject [
    default: "3, 5, 6"
    minimum: "1"
    maximum: "6"
    values: "None, AAINDEX, CVS, CODONS, PRINTS, PROSITE, REBASE"
    delimiter: ","
    header: "Directories to ignore"
    information: "Select directories"
    help: "This specifies the names of the sub-directories of the EMBOSS data directory that should be ignored when displaying data directories."
    button: "Y"
  ]

So, by default CVS, PRINTS and PROSITE are rejected.

> I hoped to get a list to select from when embossdata was run with the
> -options parameter, but this did not happen.

That is because -reject is an "advanced" parameter, not an "optional"/"additional" one. It is indeed impossible to get a selection list displayed at the command line, although many GUIs like wEMBOSS will show it.

> Actually I was trying to find a way to obtain more or less the oposite
> of '-reject', e.g. what if you only want the content of the CODONS
> directory ?

This does not work; there is no way to reject the files in the base data directory. The best you can do is to add on the command line

  -reject=2,3,5,6,7

or

  -reject=AAINDEX,CVS,PRINTS,PROSITE,REBASE

What you can do however is:

  ls $EMBOSS_DATA/CODONS

Hope this helps,
Guy Bottu, Belgian EMBnet Node
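The correspondence between the numbers and the directory names in the ACD definition above can be spelled out with a short Python sketch. It is only an illustration, not how EMBOSS itself parses selections: the function name selected_names is hypothetical, and the 1-based numbering is an assumption that happens to agree with the statement above that "3, 5, 6" stands for CVS, PRINTS and PROSITE.

```python
# "values" and "delimiter" as given in embossdata.acd (quoted above).
VALUES = "None, AAINDEX, CVS, CODONS, PRINTS, PROSITE, REBASE"
DELIMITER = ","


def selected_names(selection, values=VALUES, delimiter=DELIMITER):
    """Translate a selection such as "3, 5, 6" into directory names.

    Numbers are read as 1-based positions in the value list (assumption);
    anything that is not a number is passed through unchanged, so a
    selection given by name also works.
    """
    names = [v.strip() for v in values.split(delimiter)]
    picked = []
    for token in selection.split(delimiter):
        token = token.strip()
        if token.isdigit():
            picked.append(names[int(token) - 1])
        else:
            picked.append(token)
    return picked


if __name__ == "__main__":
    print(selected_names("3, 5, 6"))
    print(selected_names("2,3,5,6,7"))
```

Under this reading, the two -reject forms suggested above are equivalent: both leave only CODONS and the files in the base data directory.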
[EMBOSS] A note about fastA format(s)
Dear friends,

We are using EMBOSS version 3.0. One of my colleagues tried to use a multiple sequence file in fastA format, where each comment line starts with a string containing multiple pipe signs. A USA of type fasta::file:xx|yy|zz|uu|ss did not work. After some trial and error I found that putting "pearson" instead of "fasta" helped. This is strange, since according to the on-line manual at http://emboss.sourceforge.net/docs/themes/SequenceFormats.html "fasta" and "pearson" are synonyms. Here it seems that "fasta" is instead treated the same as "ncbi".

Comments?

Guy Bottu, BEN
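Why the choice of format name could matter for an identifier full of pipe signs can be illustrated with a small Python sketch. Neither function reproduces the actual EMBOSS parsers: pearson_id simply takes the first word of the header, and ncbi_style_id is a hypothetical rule (use the last pipe-separated field) chosen only to show how a parser that splits on "|" ends up with a different identifier than the one given in the USA.

```python
def pearson_id(header):
    """Take everything up to the first whitespace as the identifier."""
    parts = header.lstrip(">").split(None, 1)
    return parts[0] if parts else ""


def ncbi_style_id(header):
    """Hypothetical NCBI-style rule: split the first word on '|' and
    keep the last non-empty field as the identifier."""
    token = pearson_id(header)
    fields = [f for f in token.split("|") if f]
    return fields[-1] if fields else token


if __name__ == "__main__":
    header = ">xx|yy|zz|uu|ss some description"
    wanted = "xx|yy|zz|uu|ss"
    print("pearson match:", pearson_id(header) == wanted)
    print("ncbi-style match:", ncbi_style_id(header) == wanted)
```

With a first-word rule the full string xx|yy|zz|uu|ss is the identifier and the lookup succeeds; with any rule that breaks the string at the pipes it is not, which would be consistent with the behaviour reported above.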