Hello,

BLAT and BLAST are very different algorithms, and the parameters used with each can change the results dramatically. Short query sequences also bring their own set of complications with either program. BLAST will be computationally expensive if full chromosomes are used, as it was designed to compare genes to genes and to estimate whether a match is significant (the E-value, which depends on the alignment score, the scoring matrix used, and the size of the search space). BLAT was designed for alignment against full chromosomes, so this is not an issue.
Comparing against contigs vs chromosomes will only matter if you have an alignment that spans over two contigs (and any overlap is not sufficient to compensate). With a short probe, this seems unlikely to be the case, but it is something to be aware of.

Here is some info about the difference between the two programs. On the same page and in the BLAT documentation are parameter discussions (including how to create "BLAST like" results).

http://genome.ucsc.edu/FAQ/FAQblat#blat1

We hope this helps,
Jennifer

------------------------------------------------
Jennifer Jackson
UCSC Genome Bioinformatics Group

----- "Adrian Johnson" <[email protected]> wrote:

> From: "Adrian Johnson" <[email protected]>
> To: [email protected]
> Sent: Monday, December 7, 2009 7:04:53 PM GMT -08:00 US/Canada Pacific
> Subject: [Genome] mapping probe sequence
>
> Dear UCSC GenomeBrowser staff:
>
> I wanted to map probe sequences from U133plus2 chip to human genome.
>
> I downloaded all chromosomes in FASTA format from ncbi FTP site:
>
> For example:
> hs_ref_GRCh37_chrX.fa.gz
>
> I used BLAST to map the probe sequence to fasta file mentioned above.
> I used strict conditions that I wanted only 100% identical hit.
>
> One such probe hit:
> probe:HG-U133_Plus_2:208763_s_at:318:797 Homo sapiens chromosome
> X genomic contig, GRCh37 reference primary assembly 25 25 1
> 30253299 30253323
>
> when I try to find chrX:30253299-30253323, the position does not map
> to the location it should have pointed.
> the desired position is :
>
> 000000001 gggagtattgactggtcccttacct 000000025
> <<<<<<<<< ||||||||||||||||||||||||| <<<<<<<<<
> 106957015 gggagtattgactggtcccttacct 106956991
>
> My question:
> Although NCBI and UCSC are same reference genomes, why am I having
> different position. Is this because I am searching against a contig
> as opposed to a full chromosome.
>
> Please help me.
> thanks
> Adrian
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
