Hi,
I have the following two sequences. Both are 25 nt long, and the query
differs from the database by a single mismatch at position 13 (t in the
query, a in the database).
$ cat query.fasta
>test_sequence
cttgcaccggaatgtctgctccaga
$ cat database.fasta
>database_chr1
cttgcaccggaaagtctgctccaga
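As a quick sanity check on where the two sequences differ, here is a
minimal Python sketch. It only compares the two strings shown above
position by position; nothing in it is BLAT-specific, and the variable
names are my own.

```python
# The two sequences exactly as they appear in query.fasta and database.fasta.
query = "cttgcaccggaatgtctgctccaga"
database = "cttgcaccggaaagtctgctccaga"

# Collect (1-based position, query base, database base) wherever they differ.
diffs = [(i + 1, q, d)
         for i, (q, d) in enumerate(zip(query, database))
         if q != d]

print(len(query), len(database), diffs)
# prints: 25 25 [(13, 't', 'a')]
```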
Then I run blat with the following commands.
$ blat -t=dna -q=dna -stepSize=2 -minScore=25 -maxGap=1 -out=pslx database.fasta query.fasta query2.pslx
$ blat -t=dna -q=dna -stepSize=3 -minScore=25 -maxGap=1 -out=pslx database.fasta query.fasta query3.pslx
The resulting files are shown below. With stepSize=2 the alignment is
reported; with stepSize=3 the output contains only the header. I
understand that stepSize is the offset between the K-mers in the
database, but I still don't understand why stepSize has to be less
than or equal to 2 for this query to be detected in the database.
Could you help me understand it?
$ cat query2.pslx
psLayout version 3
match  mis-   rep.   N's  Q gap  Q gap  T gap  T gap  strand  Q     Q     Q      Q    T     T     T      T    block  blockSizes  qStarts  tStarts
       match  match       count  bases  count  bases          name  size  start  end  name  size  start  end  count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
24 1 0 0 0 0 0 0 + test_sequence 25 0 25 database_chr1 25 0 25 1 25, 0, 0, cttgcaccggaatgtctgctccaga, cttgcaccggaaagtctgctccaga,
$ cat query3.pslx
psLayout version 3
match  mis-   rep.   N's  Q gap  Q gap  T gap  T gap  strand  Q     Q     Q      Q    T     T     T      T    block  blockSizes  qStarts  tStarts
       match  match       count  bases  count  bases          name  size  start  end  name  size  start  end  count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
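One way to explore the stepSize effect is the rough Python sketch
below. It is a simplified model, not BLAT's actual algorithm: it
assumes the default DNA tileSize of 11 and the default nucleotide
minMatch of 2 (neither is set in the commands above), cuts the
database into tiles at every stepSize offset, and counts how many of
those tiles occur exactly in the query. The function name
matching_tile_starts is hypothetical.

```python
TILE_SIZE = 11  # assumption: blat's default -tileSize for DNA
MIN_MATCH = 2   # assumption: blat's default -minMatch for nucleotide

query = "cttgcaccggaatgtctgctccaga"
database = "cttgcaccggaaagtctgctccaga"


def matching_tile_starts(db, qry, step, tile=TILE_SIZE):
    """Return 0-based start offsets of database tiles found verbatim in the query."""
    starts = range(0, len(db) - tile + 1, step)
    return [s for s in starts if db[s:s + tile] in qry]


for step in (2, 3):
    hits = matching_tile_starts(database, query, step)
    print(f"stepSize={step}: matching tiles start at {hits}, "
          f"reaches minMatch={MIN_MATCH}: {len(hits) >= MIN_MATCH}")
```

Under these assumptions, stepSize=2 leaves two exact tiles (starting
at offsets 0 and 14, on either side of the mismatch), while stepSize=3
leaves only the tile at offset 0, because the tile starting at offset
12 still covers the mismatch. If the model holds, that would be one
tile short of minMatch, which would be consistent with the empty
query3.pslx, but I am not sure this is the real reason.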
--
Regards,
Peng