Hi,

I have the following two sequences. The query differs from the
database by one nucleotide at position 13 (t in the query, a in the
database).
$ cat query.fasta
>test_sequence
cttgcaccggaatgtctgctccaga
$ cat database.fasta
>database_chr1
cttgcaccggaaagtctgctccaga


Then I run blat with the following commands.

blat -t=dna -q=dna -stepSize=2 -minScore=25 -maxGap=1 -out=pslx database.fasta query.fasta query2.pslx
blat -t=dna -q=dna -stepSize=3 -minScore=25 -maxGap=1 -out=pslx database.fasta query.fasta query3.pslx

The resulting files are shown below. I understand that stepSize is
the offset between the K-mers in the database, but I still don't
understand why stepSize has to be less than or equal to 2 to detect
this query in the database. Could you help me understand it?

$ cat query2.pslx
psLayout version 3

match	mis- 	rep. 	N's	Q gap	Q gap	T gap	T gap	strand	Q        	Q   	Q    	Q  	T        	T   	T    	T  	block	blockSizes 	qStarts	 tStarts
     	match	match	   	count	bases	count	bases	      	name     	size	start	end	name     	size	start	end	count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
24	1	0	0	0	0	0	0	+	test_sequence	25	0	25	database_chr1	25	0	25	1	25,	0,	0,	cttgcaccggaatgtctgctccaga,	cttgcaccggaaagtctgctccaga,
$ cat query3.pslx
psLayout version 3

match	mis- 	rep. 	N's	Q gap	Q gap	T gap	T gap	strand	Q        	Q   	Q    	Q  	T        	T   	T    	T  	block	blockSizes 	qStarts	 tStarts
     	match	match	   	count	bases	count	bases	      	name     	size	start	end	name     	size	start	end	count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
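
To make my picture of stepSize concrete, here is a small Python sketch
of how I think the database tiles are laid out. It is only my guess at
the seeding step, not blat's actual code. I am assuming the defaults
tileSize=11 and minMatch=2 for DNA, that tiles are indexed starting at
offset 0, and that a hit means an exact tile match at the same offset
in the query. The function name exact_tile_hits is made up for
illustration.

```python
# Sketch only: assumes database tiles of length tile_size are indexed
# starting at offset 0 and advancing by step_size, and that a seed hit
# is an exact match between a database tile and the query at the same
# offset (same diagonal). This is not blat's real implementation.

QUERY = "cttgcaccggaatgtctgctccaga"
DATABASE = "cttgcaccggaaagtctgctccaga"


def exact_tile_hits(query, database, step_size, tile_size=11):
    """Return start offsets of indexed database tiles that match the query exactly."""
    hits = []
    last_start = len(database) - tile_size
    for start in range(0, last_start + 1, step_size):
        tile = database[start:start + tile_size]
        if query[start:start + tile_size] == tile:
            hits.append(start)
    return hits


for step in (1, 2, 3):
    print(step, exact_tile_hits(QUERY, DATABASE, step))
# 1 [0, 1, 13, 14]
# 2 [0, 14]
# 3 [0]
```

If this picture is right, stepSize=3 leaves only one exact tile hit
(every other indexed tile overlaps the mismatch at position 13), which
would fall short of minMatch=2, while stepSize=2 still gives two hits.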


-- 
Regards,
Peng