On Aug 30, 2011, at 4:42 PM, wang peter wrote:
Hi everyone,
I am still confused by the parameters of the function trimLRPatterns.
If I set
Lpattern <- "AAAAAAAATTCTGCT"
Rpattern <- "GATCGGATTTTTTTT"
subject <- DNAString("TTCTGCTTGACGTGATCGGA")
trimLRPatterns(Lpattern = Lpattern, subject = subject)
# the results are
13-letter "DNAString" instance
seq: TGACGTGATCGGA
trimLRPatterns(Rpattern = Rpattern, subject = subject)
# the results are
13-letter "DNAString" instance
seq: TTCTGCTTGACGT
I think with.Rindels = FALSE and with.Lindels = FALSE here,
yet AAAAAAAA and TTTTTTTT look like insertions.
To reduce any possible confusion, let's just take this case:
Lpattern <- "AAAAAAAATTCTGCT"
subject <- "TTCTGCT"
> trimLRPatterns(Lpattern = Lpattern, subject = subject)
[1] ""
# and just to be clear about the defaults
> trimLRPatterns(Lpattern = Lpattern, subject = subject,
                 max.Lmismatch=0, with.Lindels=FALSE)
[1] ""
Since there are no A's in subject, and max.Lmismatch=0, I think you
are saying that the substring "AAAAAAAA" of Lpattern appears to match
freely, as if it were being treated as an indel, without any penalty.
That is not what is happening.
The function expands max.Lmismatch=0 to
max.Lmismatch = rep(0, nchar(Lpattern))
that is, a limit of zero mismatches for each suffix of the Lpattern.
So *all* suffixes of the Lpattern are candidates for trimming at
the beginning of the subject, so long as they match exactly, and the
longest wins. By the suffixes of the Lpattern I mean, in order,
substr(Lpattern, i, nchar(Lpattern)), i = 1:nchar(Lpattern)
The first one to match is "TTCTGCT" (i = 9), which actually equals
the subject. This is why the function returns "". It has nothing
to do with indels.
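To make the suffix search concrete, here is a minimal base-R sketch of the exact-match case only. It is not the Biostrings implementation; trim_left_exact is a hypothetical helper I made up for illustration, and it assumes plain character strings and a limit of zero mismatches for every suffix.

```r
# Base-R sketch (no Biostrings): emulate exact-match left trimming by
# trying every suffix of Lpattern, longest first.
# trim_left_exact is a hypothetical helper, not part of any package.
trim_left_exact <- function(Lpattern, subject) {
  n <- nchar(Lpattern)
  for (i in seq_len(n)) {
    suffix <- substr(Lpattern, i, n)
    # Does the subject begin with this suffix?
    if (startsWith(subject, suffix)) {
      # Drop the matched prefix from the subject.
      return(substr(subject, nchar(suffix) + 1, nchar(subject)))
    }
  }
  subject  # no suffix matched, so nothing is trimmed
}

trim_left_exact("AAAAAAAATTCTGCT", "TTCTGCT")
# [1] ""
trim_left_exact("AAAAAAAATTCTGCT", "TTCTGCTTGACGTGATCGGA")
# [1] "TGACGTGATCGGA"
```

The loop stops at i = 9 in both calls, because every longer suffix starts with an A while the subject starts with a T.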
Maybe a better example for indels is:
> subject = "TTTACGT"
> Lpattern = "TTTAACGT" # pattern has an extra 'A'
> trimLRPatterns(Lpattern = Lpattern, subject = subject,
                 max.Lmismatch=3)
[1] "TTTACGT"
# without indels, the extra A puts the rest of the pattern out of
# register with the subject, so 4 mismatches must be allowed
> trimLRPatterns(Lpattern = Lpattern, subject = subject,
                 max.Lmismatch=4)
[1] ""
> trimLRPatterns(Lpattern = Lpattern, subject = subject,
                 max.Lmismatch=0,
                 with.Lindels=TRUE)
[1] "TTACGT"
# with indels, 1 "edit" is enough, to remove the extra A
> trimLRPatterns(Lpattern = Lpattern, subject = subject,
                 max.Lmismatch=1,
                 with.Lindels=TRUE)
[1] ""
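The 4 versus 1 above can be checked by hand with base R alone. This is only a rough sketch: count_mismatches is a hypothetical helper, and it assumes that pattern positions falling past the end of the subject count as mismatches. adist() from the utils package gives the Levenshtein edit distance, used here as a stand-in for the with.Lindels=TRUE case.

```r
# Hypothetical helper: lay the pattern over the start of the subject and
# count mismatching positions; positions past the end of the subject are
# assumed to count as mismatches.
count_mismatches <- function(pattern, subject) {
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  s <- s[seq_along(p)]        # indexing past the end yields NA
  sum(is.na(s) | p != s)
}

count_mismatches("TTTAACGT", "TTTACGT")
# [1] 4

# Edit distance (insertions, deletions, substitutions) between the two
as.integer(utils::adist("TTTAACGT", "TTTACGT"))
# [1] 1
```

Positions 5 to 7 disagree because of the shift, and position 8 of the pattern has no counterpart in the 7-letter subject, which gives the 4 mismatches; deleting the extra A is the single edit.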
Let me know if I didn't get your point.
thank you
_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing