Here is a very slight further simplification, i.e. we can drop the final {1,}:

> grep("^(?!(.)\\1{1,}$).*(.)\\2$", vec, perl = TRUE)
[1] 2 3 5

On Sun, Nov 30, 2008 at 3:26 PM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Try this:
>
>> vec <- c("aaaa", "baaa", "bbaa", "bbba", "baamm", "aa")
>> grep("^(?!(.)\\1{1,}$).*(.)\\2{1,}$", vec, perl = TRUE)
> [1] 2 3 5
>
> The (?...) succeeds only if the string is not all the same
> character and since that consumes no characters it
> restarts at the beginning to match anything followed
> by repeated characters to the end.
>
> On Sun, Nov 30, 2008 at 2:33 PM, Stefan Th. Gries <[EMAIL PROTECTED]> wrote:
>> Hi all
>>
>> I have the following regular expression problem: I want to find
>> complete elements of a vector that end in a repeated character but
>> where the repetition doesn't make up the whole word. That is, for the
>> vector vec:
>>
>> vec<-c("aaaa", "baaa", "bbaa", "bbba", "baamm", "aa")
>>
>> I would like to get
>> "baaa"
>> "bbaa"
>> "baamm"
>>
>> From tools where negative lookbehind can involve variable lengths, one
>> would think this would work:
>>
>> grep("(?<!(?:\\1|^))(.)\\1{1,}$", vec, perl=T)
>>
>> But then R doesn't like it that much ... I also know I can get it like this:
>>
>> whole.word.rep <- grep("^(.)\\1{1,}$", vec, perl=T) # 1 6
>> rep.at.end <- grep("(.)\\1{1,}$", vec, perl=T) # 1 2 3 5 6
>> setdiff(rep.at.end, whole.word.rep) # 2 3 5
>>
>> But is there a one-line grep thingy to do this?
>>
>> Thx for any pointers,
>> STG
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
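As a rough check of the reasoning in this thread, the sketch below runs the two-step setdiff approach from the original question next to both one-line patterns on the same vec. The variable names two.step, with.quant and without.quant are made up for this illustration. The idea being tested: the group inside the negative lookahead still counts as group 1, so the trailing pair is matched with \\2, and the final {1,} should be redundant because the leading .* can absorb any extra repeats, leaving only the last two characters to be compared.

```r
vec <- c("aaaa", "baaa", "bbaa", "bbba", "baamm", "aa")

# Two-step approach from the original question:
# indices ending in a repeat, minus indices that are one repeated character.
whole.word.rep <- grep("^(.)\\1{1,}$", vec, perl = TRUE)
rep.at.end     <- grep("(.)\\1{1,}$", vec, perl = TRUE)
two.step       <- setdiff(rep.at.end, whole.word.rep)

# One-line versions: the negative lookahead rejects strings made of a single
# repeated character; the rest requires the last two characters to be equal.
with.quant    <- grep("^(?!(.)\\1{1,}$).*(.)\\2{1,}$", vec, perl = TRUE)
without.quant <- grep("^(?!(.)\\1{1,}$).*(.)\\2$", vec, perl = TRUE)

# All three approaches should select the same elements of this vec.
stopifnot(identical(two.step, with.quant),
          identical(with.quant, without.quant))

print(vec[without.quant])
```

This only confirms agreement on the example vector; it is not a proof for arbitrary input (for instance, . does not match a newline by default).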