Hi,

Mauve concatenates chromosomes (contigs) prior to alignment.
How can the concatenated coordinates in a Mauve alignment be mapped back to the original individual sequences? A final output of the alignment in Clustal (.aln) format with the original coordinates and chromosome names would be great. Any idea how I can solve this problem?

With best regards,
Martin

--
Dr. Martin Münsterkötter
MIPS - Institute of Bioinformatics and Systems Biology
Helmholtz Zentrum München
German Research Center for Environmental Health (GmbH)
Ingolstädter Landstr. 1
85764 Neuherberg
Germany
http://www.helmholtz-muenchen.de/mips
Phone: +49-89-3187-3579
Fax: +49-89-3187-3585

________________________________________
From: Guy Plunkett III [[email protected]]
Sent: Friday, 7 September 2012 17:59
To: [email protected]
Subject: Re: [Mauve-users] problem with input files in mauve

I'm replying to Alina offlist, but I thought I'd pass on some more general information to all.

(1) One issue she was having resulted from FASTA files that did not have the sequence wrapped, i.e., each sequence in the file consisted of an identifier line and a single sequence line, regardless of sequence length. While this is technically a valid FASTA file, Mauve doesn't deal well with such files. I don't know the maximum line length Mauve can deal with, and I usually wrap sequences at something in the 60 - 80 residue range. Doing that with a file Alina sent me did the trick.

So how can this be accomplished? Some folks will be proficient enough that they will just do it via command line text manipulation (regex, grep, sed, awk, etc.). Some folks will have a favorite text editor that readily allows such manipulations (BBEdit on Mac, for example). But the simplest solution for many folks will be a web service running Don Gilbert's venerable Readseq. Two such servers are <http://www.ebi.ac.uk/cgi-bin/readseq.cgi> and <http://iubio.bio.indiana.edu/cgi-bin/readseq.cgi>.
Just upload your file with the Choose File button, select Pearson|Fasta|fa as the output format under Options, and click on Submit. A file called "readseq.cgi" is downloaded to your computer. The file is wrapped at 60 residues/line, and works with Mauve as is (although you might want to rename it).

(2) The second issue was a draft genome in many contigs, where she received the data as a directory full of individual files. For Mauve to deal with those sequences as a single genome, you need to convert the multiple single-sequence files to a single multiple-sequence file. The only way I know to do this is via command line, but it is straightforward, and will work for both FASTA and GenBank formats.

Assuming you have all the .gbk sequences for a genome in the directory "genomes", launch the Terminal (Mac) or Command Prompt (Windows) and navigate to the directory that contains the "genomes" directory. Then type the command to merge all the files in the genomes directory into a single new file called all.gbk (or whatever name you want to use; works for any file extension).

On a Mac, type
    cat genomes/*.gbk > all.gbk
On Windows, type
    copy /a genomes\*.gbk all.gbk
(the /a is required to ensure that the resulting file is plain text)

Hope folks find this useful.

Cheers,
Guy

Dr. Guy Plunkett III
Senior Scientist, Genome Center of Wisconsin
Senior Scientist, DNASTAR
<http://www.genome.wisc.edu/information/gplunkett.html>
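For anyone who would rather script points (1) and (2) than use a web service or the shell, here is a minimal Python sketch that does both at once for FASTA input: it reads every file matching a pattern, writes all records to one multi-sequence file, and wraps each sequence at 60 residues per line. The function names read_fasta and merge_and_wrap are made up for this example, the default width of 60 simply follows the range mentioned above, and the script handles FASTA only (GenBank files would still need the cat/copy approach).

```python
import glob
import os


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, joining wrapped lines."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_and_wrap(pattern, out_path, width=60):
    """Write all records matched by `pattern` to `out_path`, wrapped at `width`.

    Files are processed in sorted name order. Returns the number of records.
    """
    # Resolve the input list first and drop the output file if it matches.
    out_abs = os.path.abspath(out_path)
    paths = [p for p in sorted(glob.glob(pattern))
             if os.path.abspath(p) != out_abs]
    count = 0
    with open(out_path, "w") as out:
        for path in paths:
            for header, seq in read_fasta(path):
                out.write(">" + header + "\n")
                for start in range(0, len(seq), width):
                    out.write(seq[start:start + width] + "\n")
                count += 1
    return count
```

A call such as merge_and_wrap("genomes/*.fasta", "all.fasta") would then produce a single wrapped file. Note that the contig order in the output follows the sorted file names, which may or may not be the order you want Mauve to see.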
_______________________________________________
Mauve-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mauve-users
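On the question at the top of this thread, one possible approach is to rebuild the contig boundaries from the input file and translate positions yourself. The Python sketch below is only that: it assumes the contigs are joined end to end, in file order, with no spacer between them, and that alignment positions are 1-based in the concatenated sequence. Both assumptions should be checked against your own Mauve output before relying on it. The function names contig_offsets and to_contig_coordinate are hypothetical, and the sketch does not produce Clustal output; it only converts positions.

```python
from bisect import bisect_right


def contig_offsets(contigs):
    """Build lookup tables from (name, length) pairs given in file order.

    Returns (names, starts, total), where starts[i] is the 0-based offset
    of contig i in the concatenation (assumes no spacer between contigs).
    """
    names, starts, total = [], [], 0
    for name, length in contigs:
        names.append(name)
        starts.append(total)
        total += length
    return names, starts, total


def to_contig_coordinate(position, names, starts, total):
    """Map a 1-based concatenated position to (contig name, 1-based local position)."""
    if not 1 <= position <= total:
        raise ValueError("position %d is outside 1..%d" % (position, total))
    # Find the last contig whose start offset is <= the 0-based position.
    index = bisect_right(starts, position - 1) - 1
    return names[index], position - starts[index]


# Example with made-up contig names and lengths.
names, starts, total = contig_offsets([("chr1", 100), ("chr2", 50), ("chr3", 25)])
print(to_contig_coordinate(101, names, starts, total))  # ('chr2', 1)
```

An aligned interval whose start and end map to different contigs spans a contig boundary and would have to be split at that boundary before being reported in per-contig coordinates.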
