Re: [R] unexpected behaviour of sub() / usage of regexp

2011-12-09 Thread Prof Brian Ripley
ve it. Thanks a lot
Jannis
- Original Message -
From: Sarah Goslee
To: Duncan Murdoch
Cc: Jannis; "r-help@r-project.org"
Sent: 15:37 Friday, 9 December 2011
Subject: Re: [R] unexpected behaviour of sub() / usage of regexp
But I do get the incorrect result on R 2.14.0

Re: [R] unexpected behaviour of sub() / usage of regexp

2011-12-09 Thread Jannis
Subject: Re: [R] unexpected behaviour of sub() / usage of regexp
But I do get the incorrect result on R 2.14.0 on linux:
> sub('[[:digit:]]{1,2}', '', '9ewww')
[1] "www"
And also:
> sub('[[:digit:]]{1,2}', '', '9ewww')
[1] "

Re: [R] unexpected behaviour of sub() / usage of regexp

2011-12-09 Thread Sarah Goslee
But I do get the incorrect result on R 2.14.0 on linux:
> sub('[[:digit:]]{1,2}', '', '9ewww')
[1] "www"
And also:
> sub('[[:digit:]]{1,2}', '', '9ewww')
[1] "www"
> sub('[[:digit:]]{1,2}', '', 'ewww9')
[1] "ww9"
> sub('\\d{1,2}', '', 'ewww9')
[1] "ww9"
But:
> sub('\\d', '', 'ewww9')
[1] "ewww"

Re: [R] unexpected behaviour of sub() / usage of regexp

2011-12-09 Thread Prof Brian Ripley
This is AFAICS an instance of bug PR#14408: it seems that in UTF-8 locales the grammar generated by the TRE engine for repetitions is in odd cases buggy. And as the author has vanished, our hopes of his fixing it are slim. Try perl=TRUE. On 09/12/2011 14:20, Jannis wrote: Dear R users,
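For readers following the thread, here is a minimal sketch of the perl=TRUE workaround suggested above. It reruns the two inputs reported in the thread with the pattern matched by PCRE instead of the default TRE engine. The variable names (x, fixed_posix, fixed_pcre) are illustrative and not from the original posts; whether the default engine misbehaves will depend on the R version and locale, so only the perl=TRUE results are shown.

```r
# Inputs taken from the calls reported in this thread.
x <- c("9ewww", "ewww9")

# perl = TRUE makes sub() use PCRE rather than the default TRE engine.
# The POSIX class inside brackets is accepted by PCRE as well.
fixed_posix <- sub("[[:digit:]]{1,2}", "", x, perl = TRUE)

# The same substitution written with the \d shorthand
# (the backslash is doubled inside an R string literal).
fixed_pcre <- sub("\\d{1,2}", "", x, perl = TRUE)

print(fixed_posix)  # [1] "ewww" "ewww"
print(fixed_pcre)   # [1] "ewww" "ewww"
```

In both calls only the one or two leading or trailing digits are removed, which is the result the original poster expected from the documentation.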

Re: [R] unexpected behaviour of sub() / usage of regexp

2011-12-09 Thread Duncan Murdoch
On 09/12/2011 9:20 AM, Jannis wrote:
Dear R users, the way I understand the documentation of sub() and regexp the following code:
sub('[[:digit:]]{1,2}', '', '9ewww')
... should yield:
'ewww'
It returns, however:
'www'
Why is this the case? My code should just substitute 1 (minimum)