On Aug 30, 2011, at 4:42 PM, wang peter wrote:
Hi everyone,
    I am still confused by the parameters of the function trimLRPatterns.
If I set
library(Biostrings)
Lpattern <- "AAAAAAAATTCTGCT"
Rpattern <- "GATCGGATTTTTTTT"
subject <- DNAString("TTCTGCTTGACGTGATCGGA")
trimLRPatterns(Lpattern = Lpattern, subject = subject)
the results are

13-letter "DNAString" instance
seq: TGACGTGATCGGA

trimLRPatterns(Rpattern = Rpattern, subject = subject)

the results are
13-letter "DNAString" instance
seq: TTCTGCTTGACGT

I think the defaults are with.Rindels = FALSE and with.Lindels = FALSE,
yet AAAAAAAA and TTTTTTTT seem to be treated as insertions.

To reduce any possible confusion, let's just take this case:

        Lpattern <- "AAAAAAAATTCTGCT"
        subject <-          "TTCTGCT"

> trimLRPatterns(Lpattern = Lpattern, subject = subject)
[1] ""

# and just to be clear about the defaults
> trimLRPatterns(Lpattern = Lpattern, subject = subject,
        max.Lmismatch=0, with.Lindels=FALSE)
[1] ""

Since there are no A's in subject, and max.Lmismatch=0, I think you
are saying that the substring "AAAAAAAA" of Lpattern appears to match
freely, as if it were being treated as an indel, without any penalty.
That is not what is happening.

The function interprets max.Lmismatch=0 as

        max.Lmismatch = rep(0, nchar(Lpattern))

So, *all* suffixes of the Lpattern are candidates for trimming at
the beginning of the subject, so long as they exact-match, and the
longest wins.  By the suffixes of the Lpattern I mean, in order,

        substr(Lpattern, i, nchar(Lpattern)), i = 1:nchar(Lpattern)

The first one to match is "TTCTGCT" (i = 9), which actually equals
the subject.  This is why the function returns "".  It has nothing
to do with indels.
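
For illustration, the suffix rule described above can be sketched in base R,
without Biostrings.  The helper name trim_left_exact is hypothetical, and this
is only a sketch of the exact-match case (max.Lmismatch=0, no indels), not the
package's actual implementation:

```r
# Hypothetical helper (not part of Biostrings): trim the longest suffix of
# Lpattern that exactly matches the beginning of subject.
trim_left_exact <- function(Lpattern, subject) {
    nLp <- nchar(Lpattern)
    for (i in seq_len(nLp)) {               # i = 1 is the longest suffix
        suffix <- substr(Lpattern, i, nLp)
        if (startsWith(subject, suffix))    # exact match at start of subject?
            return(substr(subject, nchar(suffix) + 1L, nchar(subject)))
    }
    subject                                 # no suffix matched: nothing trimmed
}

trim_left_exact("AAAAAAAATTCTGCT", "TTCTGCT")               # ""
trim_left_exact("AAAAAAAATTCTGCT", "TTCTGCTTGACGTGATCGGA")  # "TGACGTGATCGGA"
```

The loop stops at the first suffix that matches, which is the longest one
because the suffixes are tried from longest to shortest.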

Maybe a better example for indels is:

>  subject = "TTTACGT"
> Lpattern = "TTTAACGT"            # pattern has an extra 'A'

> trimLRPatterns(Lpattern = Lpattern, subject = subject, max.Lmismatch=3)
[1] "TTTACGT"

# need to allow for 4 errors because of the extra A
> trimLRPatterns(Lpattern = Lpattern, subject = subject, max.Lmismatch=4)
[1] ""

> trimLRPatterns(Lpattern = Lpattern, subject = subject, max.Lmismatch=0,
        with.Lindels=TRUE)
[1] "TTACGT"

# need to allow for 1 "edit", to remove the extra A
> trimLRPatterns(Lpattern = Lpattern, subject = subject, max.Lmismatch=1,
        with.Lindels=TRUE)
[1] ""


Let me know if I didn't get your point.

thank you

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
