Hello Luvina,

Thank you very much. I followed the link you provided but could not find 32-bit BLAT binaries. The linux.i386 directory contains only a liftOver file.
I still observe some strange differences between what is reported in the sim4 and psl formats for the same set of parameters. In my previous email I provided two test files, target.fa and query.fa. If you execute the following:

    blat target.fa -t=dna query.fa -q=dna -dots=1 -out=psl test.psl
    blat target.fa -t=dna query.fa -q=dna -dots=1 -out=sim4 test.sim4

the outputs test.psl and test.sim4 do not seem to contain the same number of hits. Surprisingly, .sim4 has more hits than .psl, although .psl is the generic format. I am unable to explain how this is possible. For now I am using the sim4 format, although it is difficult to parse.

Thank you in advance.

Mbandi
------------------------------------
Universiteit van wes kaapland

> Hi Mbandi
>
> Thank you for contacting the mailing list, and yes, this is the correct
> place to ask your question. One of our engineers suggests you use psl
> since it is the native blat format, and not to use fastmap for ests
> which may have introns in them. In addition, you may download our latest
> version of BLAT which contains a few bugfixes and may be useful for your
> purposes. The latest BLAT is available in compiled form in our
> downloads:
> http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads. We also
> suggest you use pslReps and pslCDnaFilter for filtering psl results.
>
> I hope this information is useful and answers your question. Please
> contact us again at [email protected] if you have any further questions.
>
> ---
> Luvina Guruvadoo
> UCSC Genome Bioinformatics Group
>
>
> On 6/14/2012 1:23 PM, Mbandi S.K wrote:
>> Dear ALL;
>>
>> Firstly, I'm happy to join this mailing list. I do not know if this
>> group is the right place for my question. Kindly bear with me if my
>> question is trivial or has been dealt with already. I have recently
>> settled on BLAT v. 34 for a portion of my project to screen for ESTs
>> (cDNA) that align well to my reference sequence.
>> However, I find it hard to understand the effects of -minIdentity and
>> -fastMap on the output.
>>
>> I also noticed that just changing the output format affects what is
>> reported in the output file. More ESTs are reported in sim4 format than
>> in psl format. I want to write a parser to calculate coverage, identity
>> etc. in order to build a filtering matrix. Attached here are two test
>> files: query.fa and target.fa. I'm aware -fastMap is for DNA-DNA, but
>> just for test purposes, I ran:
>>
>>     blat target.fa -t=dna query.fa -q=dna -out=psl -minIdentity=100 -fastMap -dots=1 test.psl
>>
>> and
>>
>>     blat target.fa -t=dna query.fa -q=dna -out=psl -fastMap -dots=1 test.psl
>>
>> However, in the first instance I do not find the hits I expected, even
>> though the default -minIdentity is 90, which is less stringent than 100.
>> When -out=sim4 is used, the hits are totally different. query.fa
>> contains mutated and unmodified versions of seq1 from the target.fa file.
>>
>> Has anyone experienced strange results like this? Which output is
>> better, from experience? I would appreciate clarity in this regard.
>>
>> Many thanks,
>>
>> Mbandi S.K
>>
>>
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
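
For the parser mentioned in the original message, a minimal Python sketch of reading psl rows could look like the following. It assumes the standard 21-column, tab-separated psl layout and skips header lines. The function names and the identity and coverage definitions here are illustrative assumptions: identity is taken as (matches + repMatches) over aligned bases, ignoring gaps, and coverage is the aligned span of the query over its length. Neither is the score that pslReps or pslCDnaFilter computes.

```python
from collections import namedtuple

# Hypothetical record type for this sketch; not part of BLAT.
PslHit = namedtuple("PslHit", "q_name t_name identity q_coverage")


def parse_psl_line(line):
    """Parse one psl row; return None for header or malformed lines."""
    fields = line.rstrip("\n").split("\t")
    # Data rows have 21 tab-separated columns and start with the match count.
    if len(fields) < 21 or not fields[0].isdigit():
        return None
    matches = int(fields[0])
    mismatches = int(fields[1])
    rep_matches = int(fields[2])
    q_name = fields[9]
    q_size = int(fields[10])
    q_start = int(fields[11])
    q_end = int(fields[12])
    t_name = fields[13]
    # Simplified identity: matching bases over aligned bases, gaps ignored.
    aligned = matches + rep_matches + mismatches
    identity = (matches + rep_matches) / aligned if aligned else 0.0
    # Query coverage: fraction of the query spanned by the alignment.
    q_coverage = (q_end - q_start) / q_size if q_size else 0.0
    return PslHit(q_name, t_name, identity, q_coverage)


def read_psl(lines):
    """Collect all parseable hits from an iterable of psl lines."""
    hits = []
    for line in lines:
        hit = parse_psl_line(line)
        if hit is not None:
            hits.append(hit)
    return hits
```

Passing an open file handle for test.psl to read_psl gives a list of hits, and len(set(h.q_name for h in hits)) gives the number of distinct queries reported, which can then be compared against the queries seen in the sim4 output.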
