Hi Mark,

Mark Leeds <marklee...@gmail.com> writes:

> Hi All: I have a regular expression problem. If a character string ends
> with "rhofixed" or "norhofixed", I want that part of the string to be
> removed. If it doesn't end with either of those two endings, then the
> result should be the same as the original. Below doesn't work for the
> second case. I know why but not how to fix it. I looked at Friedl's book
> and I bet it's in there somewhere but I can't find it. Thanks.
>
> s <- c("lngimbintrhofixed","lngimbnointnorhofixed","test")
>
> result <- sub("^(.*)([n.*|r.*].*)$","\\1",s)
>
>  print(result)
> [1] "lngimbint"     "lngimbnointno" "test"

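The square brackets in your pattern form a character class, so
'[n.*|r.*]' matches a single 'n', 'r', '.', '*' or '|' rather than
choosing between two alternatives.  The initial .* is greedy by default,
so it matches everything before the last 'n' or 'r'.  As there is always
an 'r' in 'rho', your 'no' gets eaten by the first group.  You can make
a quantifier non-greedy by appending '?' to it.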
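
To see the difference between the two kinds of quantifier, here is a
small sketch that applies the same pattern twice to your second string,
once with a greedy and once with a non-greedy first group.  The variable
names 'greedy' and 'lazy' are just made up for illustration.

```r
s <- "lngimbnointnorhofixed"

# Greedy: the first group takes as much as it can, so the optional
# "no" in the second group is left empty and stays in the result.
greedy <- sub("^(.*)((no)?rhofixed)$", "\\1", s)

# Non-greedy: the first group takes as little as it can, so the
# optional "no" is matched by the second group and removed as well.
lazy <- sub("^(.*?)((no)?rhofixed)$", "\\1", s)

print(greedy)
print(lazy)
```

Only the '?' after the first '.*' differs between the two calls.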

I would do

> s <- c("lngimbintrhofixed","lngimbnointnorhofixed","test")
> result <- sub("^(.*?)((no)?rhofixed)$","\\1",s)
> result
[1] "lngimbint"   "lngimbnoint" "test"
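
If you don't need the capture group, a possible alternative (just a
sketch, not tested beyond these three strings) is to anchor the
optional 'no' plus 'rhofixed' at the end of the string and replace the
match with nothing.  Strings without that ending are left untouched
because the pattern simply does not match.

```r
s <- c("lngimbintrhofixed", "lngimbnointnorhofixed", "test")

# Strip a trailing "rhofixed" or "norhofixed"; the "$" anchor ensures
# that only the end of the string is considered.
result <- sub("(no)?rhofixed$", "", s)
print(result)
```

This avoids the non-greedy quantifier altogether, since nothing has to
match the part of the string in front of the ending.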

Cheers,

Loris

-- 
This signature is currently under construction.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
