Hello Yuan,

For questions about the output of the tool from the Miller lab, it would probably be best to contact them yourself. I suspect that the extra genomes are only removed and the alignment is not regenerated, but you need to confirm this with them. They would provide reasons for the methodology they use. A quick Google search brought up these links:

http://www.bx.psu.edu/miller_lab/
http://globin.bx.psu.edu/html/contact.html
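To illustrate what "removing genomes without regenerating the alignment" would mean in practice, here is a minimal Python sketch. It is not the maf_order code, and whether maf_order behaves this way still needs to be confirmed with the Miller lab. The function name project_maf_block and the example block (assembly names, coordinates, sequences) are made up for illustration. The sketch keeps only the "s" rows for the requested species and then drops columns that contain nothing but gaps in the remaining rows. It assumes all rows in a block have the same text length.

```python
def project_maf_block(lines, keep):
    """Keep only 's' rows whose species is in `keep`, then drop all-gap columns.

    Hypothetical helper for illustration; not part of any UCSC or Miller lab tool.
    """
    rows = []
    for line in lines:
        fields = line.split()
        # MAF sequence rows have 7 fields: s src start size strand srcSize text
        if len(fields) == 7 and fields[0] == "s":
            species = fields[1].split(".")[0]  # "hg18.chr1" -> "hg18"
            if species in keep:
                rows.append(fields)
    if not rows:
        return []
    width = len(rows[0][6])
    # Retain a column if at least one remaining species has a base there.
    keep_cols = [i for i in range(width) if any(r[6][i] != "-" for r in rows)]
    out = []
    for r in rows:
        text = "".join(r[6][i] for i in keep_cols)
        out.append(" ".join(r[:6] + [text]))
    return out


# Made-up three-species block; the third column exists only because of panTro2.
block = [
    "a score=100.0",
    "s hg18.chr1    100 6 + 1000 AC-GTTA",
    "s panTro2.chr1 150 7 + 1500 ACTGTTA",
    "s mm8.chr4     200 5 + 2000 AC-GT-A",
]

for row in project_maf_block(block, {"hg18", "mm8"}):
    print(row)
# s hg18.chr1 100 6 + 1000 ACGTTA
# s mm8.chr4 200 5 + 2000 ACGT-A
```

In this sketch the remaining rows stay aligned to each other exactly as they were in the larger alignment; nothing is re-aligned, only the now-empty column is removed.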
For the second question, it is a bit unclear what you are attempting to do. If you wish to align upstream sequence based on the gene bounds called per species, then upstream5000.fa would be a fine place to retrieve sequence data. Identify the gene of interest in each genome (try a search by gene name) and use the "DNA" tool, or use the Table Browser to download the sequence data (from UCSC Genes or another track), or extract it by RefSeq ID from the upstream file.

But if you wish to align sequence based on a common location in the MAF comparative alignment, then you will need to extract sequence from the other species based on the human (base genome) locations you are interested in. The genomic coordinates for each aligned species are contained in the MAF data. Sequence can be extracted from the Table Browser in fasta format for certain data (conserved elements) based on the base genome (human) coordinates. Or it can be extracted per species using coordinates you pull from the MAF alignment.

For alignments, Blastz does do multiple alignments. Blat does 1-1 alignments. On the UCSC Browser website, Blat can be run against any of the current genomic sequences. For other types of alignment targets, a local install would be required. If you are examining 1-1 genomic comparisons, be sure to note the Chain and Net tracks and consider using the liftOver tool as a quick way to map data.

If this does not address all of your questions, please write back with more detail and we can try to help more,

Jennifer Jackson
UCSC Genome Bioinformatics Group

Yuan Hao wrote:
> Dear UCSC genome user,
>
> I want to get multiple alignment for human, mouse, rat, cow and
> chicken. I found there is a multiple alignments of 16 vertebrate
> genomes with Human on UCSC.
> Because there are more genomes than I
> want, so I try to use the "maf_order" program from Miller's lab to
> exclude genomes I don't want, but after reading all the documents and
> codes, it looks like maf_order just deleted unwanted genomes without
> re-doing the multiple alignment. Could someone help me to clarify that
> whether it is still a real alignment after maf_order?
>
> Another question is I downloaded upstream5000.fa from UCSC for all
> above 5 species. I would like to do multiple alignments of these
> upstream sequences. Apart from all-to-all alignment, is there other
> way I could do the multiple alignment? I found upstream5000.fa uses
> Refseq id (NM_XXXXX), which is different across species. How could I
> extract upstream sequences from different species corresponding to the
> same gene so that I only need to align 5 upstream sequences each time?
>
> I appreciate very much for your reply! Thank you very much in advance!
>
> Yuan
> -------------------------------------------
> Yuan Hao
> PhD student
> Conway Institute
> University College Dublin
> Belfield, Dublin 4, Ireland
> E-mail: [email protected]
> -------------------------------------------
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
