Try this:

> x <- gregexpr("a+(b+)", "abcdaabbcaaacaaab")
> justA <- gregexpr("a+", "abcdaabbcaaacaaab")
> # find matches in 'x' for 'justA'
> indx <- which(justA[[1]] %in% x[[1]])
> # now determine where 'b' starts
> justA[[1]][indx] + attr(justA[[1]], 'match.length')[indx]
[1] 2 7 17

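The trick above relies on the group (b+) starting right where the run of a's ends. As a possible alternative, and assuming R 2.14.0 or later (newer than this thread), gregexpr() called with perl = TRUE attaches "capture.start" and "capture.length" attributes to each match object, so the group offsets can be read off directly. A minimal sketch; the variable names s, m, b_start and b_len are only illustrative:

```r
# Assumption: R >= 2.14.0, where perl = TRUE adds "capture.start" and
# "capture.length" matrices (one row per match, one column per group).
s <- "abcdaabbcaaacaaab"
m <- gregexpr("a+(b+)", s, perl = TRUE)[[1]]

# Start offsets and lengths of the first (and only) group, one per match
b_start <- attr(m, "capture.start")[, 1]
b_len   <- attr(m, "capture.length")[, 1]

b_start
b_len
```

This should give the same offsets 2, 7 and 17 as the two-pass approach, and it also works when the group does not directly follow the rest of the match, since nothing has to be added up by hand.
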
On Mon, Sep 27, 2010 at 11:48 AM, Titus von der Malsburg <malsb...@gmail.com> wrote:
> Dear list!
>
>> gregexpr("a+(b+)", "abcdaabbc")
> [[1]]
> [1] 1 5
> attr(,"match.length")
> [1] 2 4
>
> What I want is the offsets of the matches for the group (b+), i.e. 2
> and 7, not the offsets of the complete matches.  Is there a way in R
> to get that?
>
> I know about gsubfn and strapply, but they only give me the strings
> matched by groups, not their offsets.
>
> I could write something myself that first takes the above matches
> ("ab" and "aabb") and then searches again using only the group (b+).
> For this to work, I'd have to parse the regular expression and search
> several times (> 2, for nested groups) instead of just once.  But I'm
> sure there is a better way to do this.
>
> Thanks for any suggestion!
>
> Titus
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?