Hi folks,

First off, many thanks for your earlier help in figuring out soft masking (https://lists.soe.ucsc.edu/pipermail/genome/2010-November/024129.html). It all worked out just fine. Now I have two (related) follow-up questions.

PROJECT: I'm conducting some scans for selection on a mess of sea urchins. One sea urchin (the reference) has an assembled genome. The others are 454 sequences. I'd like to generate .chain files so that I can use liftOver to collect specified orthologous regions from the whole set of species.

QUESTION 1: The first question is purely technical. The how-to page on whole-genome alignments (http://genomewiki.ucsc.edu/index.php/Whole_genome_alignment_howto) tells me that lastz has replaced blastz. This is fine; however, there seems to be one significant difference between the two programs. While blastz (or the wrapper Blastz) can produce .lav files when there are multiple sequences in the target file, lastz cannot (or does not seem to be able to). How do y'all get around this limitation? Should I simply break apart the genome so that each (reasonably sized) scaffold from the target genome has its own file? Or do you use a different output format that can handle multiple sequences in the target?

QUESTION 2: For two of my sea urchin species, I've been given .gtf-formatted files that match each 454 read to a location within the target genome. Is there a way to use this existing map to generate chain files? I haven't found anything nearly as convenient as chain files for pulling out orthologous sequences from multiple species.

Many thanks,
David
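
P.S. To make the "one file per scaffold" option in Question 1 concrete, here is a minimal sketch of the splitting step I have in mind. It is only an illustration: the function name `split_fasta`, the file names, and the `.fa` extension are all made up for this example, and it assumes that the first word of each FASTA header is unique and safe to use as a file name. It says nothing about how lastz itself should be run on the resulting files.

```python
import os
import tempfile


def split_fasta(fasta_path, out_dir):
    """Write each record of a multi-sequence FASTA file to its own file.

    Hypothetical helper: files are named after the first whitespace-delimited
    word of each header line. Returns the output paths in input order.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None
    with open(fasta_path) as fh:
        for line in fh:
            if line.startswith(">"):
                # A new record starts: close the previous output file, if any.
                if out is not None:
                    out.close()
                tokens = line[1:].split()
                # Fall back to a numbered name if the header is empty.
                name = tokens[0] if tokens else "record%d" % (len(paths) + 1)
                # Assumption: replacing path separators is enough to make a valid name.
                name = name.replace(os.sep, "_")
                path = os.path.join(out_dir, name + ".fa")
                out = open(path, "w")
                paths.append(path)
            # Lines before the first header (if any) are skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return paths


if __name__ == "__main__":
    # Tiny made-up input, just to show the call.
    work = tempfile.mkdtemp()
    genome = os.path.join(work, "genome.fa")
    with open(genome, "w") as f:
        f.write(">scaffold1 example\nACGTACGT\n>scaffold2\nTTTTGGGG\n")
    for p in split_fasta(genome, os.path.join(work, "scaffolds")):
        print(p)
```

With tens of thousands of scaffolds this would create a very large number of small files, which is part of why I am asking whether there is a better way.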
