On Tue, Apr 27, 2010 at 9:00 PM, Galt Barber <[email protected]> wrote:
>
> Hi, Peng!
>
> As the FAQ points out
> http://genome.ucsc.edu/FAQ/FAQblat.html
>
> "A note on filtering output: increasing the -minScore parameter value beyond
> one-half of the query size has no further effect. Therefore, use either the
> pslReps or pslCDnaFilter program available in the Genome Browser source
> code to filter for the size, score, coverage, or quality desired. For
> information on obtaining the source code, see our FAQ on source code
> licensing and downloads. "
>
> This seems to have been an odd restriction
> which was removed at the urging of users,
> however, the change came only in 2008:
>
> blat/version.doc
> 1.72 (galt 09-Dec-08): (in blat version 34x3)
> Fixed -minScore, filter was not working when over half query-size.
> v197_branch: 1.72.0.2
>
> revision 1.72
> date: 2008/12/09 08:11:46; author: galt; state: Exp; lines: +1 -0
> fixing minScore
> ----------------------------
>
> galt
> Tue Dec 9 08:11:46 2008 +0000
> fixing minScore
> diff --git src/jkOwnLib/gfBlatLib.c src/jkOwnLib/gfBlatLib.c
> --- src/jkOwnLib/gfBlatLib.c
> +++ src/jkOwnLib/gfBlatLib.c
> @@ -18,7 +18,7 @@
>
>
> static void saveAlignments(char *chromName, int chromSize, int chromOffset,
>     struct ssBundle *bun, struct hash *t3Hash,
>     boolean qIsRc, boolean tIsRc,
>     enum ffStringency stringency, int minMatch, struct gfOutput *out)
> /* Save significant alignments to file in .psl format. */
> {
> struct dnaSeq *tSeq = bun->genoSeq, *qSeq = bun->qSeq;
> struct ssFfItem *ffi;
> -if (minMatch > qSeq->size/2) minMatch = qSeq->size/2;
> -if (minMatch < 1) minMatch = 1;
> for (ffi = bun->ffList; ffi != NULL; ffi = ffi->next)
>     {
>     struct ffAli *ff = ffi->ff;
>     struct trans3 *t3List = NULL;
>     int score;
>     if (t3Hash != NULL)
>         t3List = hashMustFindVal(t3Hash, tSeq->name);
>     score = scoreAli(ff, bun->isProt, stringency, tSeq, t3List);
>     if (score >= minMatch)
>         {
>         out->out(chromName, chromSize, chromOffset, ff, tSeq, t3Hash, qSeq,
>             qIsRc, tIsRc, stringency, minMatch, out);
>         }
>     }
> }
>
> See the two lines leading with "-" ?
> They were deleted. They seemed to be
> unneeded and causing unexpected behavior
> to users.
>
> Unfortunately, Jim Kent's official release
> seems to date back to 2007, but you could
> get the source and compile it.
>
> Any blat version after 34x3 should have the fix.
>
> With the newer version, the cutoff works more
> as you would expect. And for your example
> of a 25bp stretch of dna with one mismatch,
> your score would be +24 for the matches and
> -1 for the 1 mismatch, thus score=24-1==23.
>
> And thus if you use minScore of 23 or lower
> you can see the output psl record.
> -minScore=23
>
> As we mentioned before,
> you can just set minScore to zero and
> then filter the psl output
> with other tools afterwards.
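
To make the arithmetic above concrete, here is a minimal Python sketch of the "set minScore to zero, then filter afterwards" approach. It is only an illustration under stated assumptions: the helper names simple_score and filter_psl are hypothetical, the score is the simplified "matches minus mismatches" from the example (BLAT's actual gap penalty is not spelled out in this thread, so it is left as an optional argument), and the input is assumed to be headerless PSL (-noHead) with matches in column 1 and mismatches in column 2.

```python
def simple_score(matches, mismatches, gap_penalty=0):
    """Simplified score: matches minus mismatches minus a gap penalty.

    The gap penalty is an assumption left to the caller; the thread only
    describes it as "some sort of gap penalty".
    """
    return matches - mismatches - gap_penalty


def filter_psl(lines, min_score):
    """Yield headerless PSL lines whose simplified score is >= min_score.

    Assumes the standard 21-column PSL layout (column 1 = matches,
    column 2 = mismatches). Lines that do not look like PSL records
    are skipped. Gaps are ignored here; the example record has none.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches, mismatches = int(fields[0]), int(fields[1])
        if simple_score(matches, mismatches) >= min_score:
            yield line


# The record from this thread: 24 matches, 1 mismatch, no gaps.
record = ("24 1 0 0 0 0 0 0 + test_sequence 25 0 25 "
          "database_chr1 25 0 25 1 25, 0, 0,")

print(simple_score(24, 1))                   # 23
print(len(list(filter_psl([record], 23))))   # 1: kept at a cutoff of 23
print(len(list(filter_psl([record], 25))))   # 0: dropped at a cutoff of 25
```

For real filtering by size, coverage, or quality, the pslReps and pslCDnaFilter programs named in the FAQ are the tools to use; the sketch only mirrors the 24 - 1 = 23 example.
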
Hi,

Since setting minScore to zero would probably be more common than the
other cases, I think it makes sense to change its default value to 0
rather than the arbitrary value of 30 that it has right now. Do you agree?

> -Galt
>
> On 4/27/2010 3:35 PM, Peng Yu wrote:
>>
>> Hi Galt,
>>
>> Here is the command that I use. You mentioned "Generally people don't
>> much bother with using BLAT's own commandline options for minScore,
>> etc." But I want to understand what minScore is and when it can be
>> ignored. Would you please let me know?
>>
>>
>> $ blat -t=dna -q=dna -stepSize=5 -minScore=25 -maxGap=0 -noHead \
>>     database.fasta \
>>     query.fasta \
>>     query.psl
>> $ cat query.fasta
>> >test_sequence
>> cttgcaccggaaagtctgctccaga
>> $ cat database.fasta
>> >database_chr1
>> ctagcaccggaaagtctgctccaga
>> $ cat query.psl
>> 24 1 0 0 0 0 0 0 +
>> test_sequence 25 0 25 database_chr1 25 0 25
>> 1 25, 0, 0,
>>
>>
>> On Mon, Apr 26, 2010 at 4:30 PM, Jennifer Jackson<[email protected]>
>> wrote:
>>>
>>> Hello Peng,
>>>
>>> Very sorry, your reply went to the genome mailing list only, not to your
>>> email address as well. Our apologies.
>>>
>>> Here is the posting:
>>> https://lists.soe.ucsc.edu/pipermail/genome/2010-April/022012.html
>>>
>>> Jennifer
>>>
>>> ---------------------------------
>>> Jennifer Jackson
>>> UCSC Genome Informatics Group
>>> http://genome.ucsc.edu/
>>>
>>> On 4/24/10 12:09 PM, Peng Yu wrote:
>>>>
>>>> Could somebody answer the following question for me?
>>>>
>>>> On Wed, Apr 21, 2010 at 2:48 PM, Peng Yu<[email protected]> wrote:
>>>>>
>>>>> I'm wondering what "some sort of gap penalty" refers to. Also, when I
>>>>> query a 25bp sequence using the default, BLAT still gives a result. By
>>>>> definition, a 25bp sequence should have a score of at most 25, which is
>>>>> less than 30. Why does the query still return the result?
>>>>>
>>>>> -minScore=N sets minimum score. This is the matches minus the
>>>>> mismatches minus some sort of gap penalty. Default is 30
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Peng

--
Regards,
Peng
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
