Gabor Grothendieck <ggrothendieck <at> gmail.com> writes:

>
> Try this:
>
> > findall("aba", "ababacababab")
> [1] 1 3 7 9
> > gregexpr("a(?=ba)", "ababacababab", perl = TRUE)
> [[1]]
> [1] 1 3 7 9
> attr(,"match.length")
> [1] 1 1 1 1
>
> > findall("a.a", "ababacababab")
> [1] 1 3 5 7 9
> > gregexpr("a(?=.a)", "ababacababab", perl = TRUE)
> [[1]]
> [1] 1 3 5 7 9
> attr(,"match.length")
> [1] 1 1 1 1 1
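For comparison, a minimal sketch of why the lookahead matters in the examples above: without it, gregexpr() consumes each match and resumes searching after it, so overlapping occurrences are skipped. The variable names txt, plain, and overlapping are only illustrative and do not appear in the original posts.

```r
txt <- "ababacababab"

# Plain pattern: each match consumes all three characters,
# so a second "aba" that overlaps the first one is never seen.
plain <- as.integer(gregexpr("aba", txt, perl = TRUE)[[1]])

# Lookahead: only the leading "a" is consumed; "ba" is checked
# but left in place, so the next search can start inside it.
overlapping <- as.integer(gregexpr("a(?=ba)", txt, perl = TRUE)[[1]])

print(plain)        # 1 7
print(overlapping)  # 1 3 7 9
```

as.integer() is used here only to drop the match.length and related attributes so that the positions print as a bare vector.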

Thanks. Somehow I did not realize that the expression inside "?=..." can
itself be a regular expression.

My original problem was to find all three-character matches where the
first and the last character are the same. With findall() it works like this:

    findall("(.).\\1", "ababacababab")
    # [1] 1 2 3 5 7 8 9 10

I am still not able to reproduce this with lookahead. Attempts with

    gregexpr("(.)?=.\\1", "ababacababab", perl = TRUE)

do not work, as the lookahead expression apparently does not know about
the group captured before it.

Regards
Hans Werner

Correction: I meant the '\G' metacharacter in Perl, not a modifier.

> On Sun, Dec 20, 2009 at 7:22 AM, Hans W Borchers
> <hwborchers <at> googlemail.com> wrote:
> > Gabor Grothendieck <ggrothendieck <at> gmail.com> writes:
> >
> > [Sorry; Gmane forces me to delete "more quoted text".]
> >
> > ----
> > findall <- function(apat, atxt) {
> >     stopifnot(length(apat) == 1, length(atxt) == 1)
> >     pos <- c()  # positions of matches
> >     i <- 1; n <- nchar(atxt)
> >     found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
> >     while (found > 0) {
> >         pos <- c(pos, i + found - 1)
> >         i <- i + found
> >         found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
> >     }
> >     return(pos)
> > }
> > ----
> >

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
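
A possible lookahead version of the three-character problem, offered as a sketch rather than a definitive answer. In the attempted pattern "(.)?=.\\1" the lookahead is not wrapped in parentheses, so "?" is read as a quantifier on the group and "=" as a literal character; since the text contains no "=", nothing matches. With the lookahead written as "(?=...)", PCRE does seem to accept a backreference to the earlier group inside it. The names txt, starts, and triples are illustrative only.

```r
txt <- "ababacababab"

# Capture one character, then assert (without consuming) that
# any character followed by the same captured character comes next.
m <- gregexpr("(.)(?=.\\1)", txt, perl = TRUE)[[1]]
starts <- as.integer(m)

# Each match is only one character long, so the three-character
# windows have to be cut out from the start positions explicitly.
triples <- substring(txt, starts, starts + 2)

print(starts)   # 1 2 3 5 7 8 9 10
print(triples)  # "aba" "bab" "aba" "aca" "aba" "bab" "aba" "bab"
```

The start positions agree with the findall("(.).\\1", ...) result quoted above. Note that match.length is 1 for every hit, because the lookahead part is zero-width.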