Re: [Rd] invert argument in grep

2006-11-12 Thread Gabor Grothendieck
invert= would be consistent with the fact that egrep (-v), sed/vi (v) and
awk (!~) all have special facilities as indicated to handle such
negation/inversion.


On 11/12/06, Romain Francois <[EMAIL PROTECTED]> wrote:
> Duncan Murdoch wrote:
> > On 11/10/2006 12:52 PM, Romain Francois wrote:
> >> Duncan Murdoch wrote:
> >>> On 11/9/2006 5:14 AM, Romain Francois wrote:
> >>>> Hello,
> >>>>
> >>>> What about an `invert` argument in grep, to return elements that
> >>>> are *not* matching a regular expression :
> >>>>
> >>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
> >>>>
> >>>> would essentially return the same as :
> >>>>
> >>>> R> colors() [ - grep("pink", colors()) ]
> >>>>
> >>>>
> >>>> I'm attaching the files that I modified (against today's tarball)
> >>>> for that purpose.
> >>>
> >>> I think a more generally useful change would be to be able to return
> >>> a logical vector with TRUE for a match and FALSE for a non-match, so
> >>> a simple !grep(...) does the inversion.  (This is motivated by the
> >>> recent R-help discussion of the fact that x[-selection] doesn't
> >>> always invert the selection when it's a vector of indices.)
> >>>
> >>> A way to do that without expanding the argument list would be to allow
> >>>
> >>> value="logical"
> >>>
> >>> as well as value=TRUE and value=FALSE.
> >>>
> >>> This would make boolean operations easy, e.g.
> >>>
> >>> colors()[grep("dark", colors(), value="logical")
> >>>   & !grep("blue", colors(), value="logical")]
> >>>
> >>> to select the colors that contain "dark" but not "blue". (In this
> >>> case the RE to select that subset is rather simple because "dark"
> >>> always precedes "blue", but if that wasn't true, it would be a lot
> >>> messier.)
> >>>
> >>> Duncan Murdoch
> >> Hi,
> >>
> >> It sounds like a nice thing to have. I would still prefer to type :
> >>
> >> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
> >> value = TRUE )
> >
> > That's good for intersecting two searches, but not for other boolean
> > combinations.
> >
> > My main point was that inversion isn't the only boolean operation you
> > may want, but R has perfectly good powerful boolean operators, so
> > installing a limited subset of boolean algebra into grep() is probably
> > the wrong approach.
>
> Hi,
>
> Yes, good point. I agree with you that the value = "logical" is probably
> worth having to take advantage of these logical operators.
>
> But what about all the functions that call grep and pass
> arguments through the ellipsis? With this invert argument, we could do :
>
> R> history(pattern = "grid\\..*\\(", invert = TRUE)
>
> BTW, why not use ... in ls? Someone might want to use perl
> regexes with ls or, to get back to this thread, issue commands like :
>
> R> ls("package:grid", pattern = "^grid\\.|Grob$", invert = TRUE)
>  [1] "absolute.size"       "applyEdit"           "applyEdits"
>  [4] "arcCurvature"        "arrow"               "childNames"
>  [7] "convertHeight"       "convertNative"       "convertUnit"
> [10] "convertWidth"        "convertX"            "convertY"
> [13] "current.transform"   "current.viewport"    "current.vpPath"
> [16] "current.vpTree"      "dataViewport"        "downViewport"
> [19] "draw.details"        "drawDetails"         "editDetails"
> [22] "engine.display.list" "gEdit"               "gEditList"
> [25] "get.gpar"            "getNames"            "gList"
> [28] "gpar"                "gPath"               "grob"
> [31] "grobHeight"          "grobName"            "grobWidth"
> [34] "grobX"               "grobY"               "gTree"
> [37] "heightDetails"       "is.unit"             "layout.heights"
> [40] "layoutRegion"        "layout.torture"      "layout.widths"
> [43] "plotViewport"        "pop.viewport"        "popViewport"
> [46] "postDrawDetails"     "preDrawDetails"      "push.viewport"
> [49] "pushViewport"        "seekViewport"        "setChildren"
> [52] "stringHeight"        "stringWidth"         "unit"
> [55] "unit.c"              "unit.length"         "unit.pmax"
> [58] "unit.pmin"           "unit.rep"            "upViewport"
> [61] "validDetails"        "viewport"            "viewport.layout"
> [64] "viewport.transform"  "vpList"              "vpPath"
> [67] "vpStack"             "vpTree"              "widthDetails"
> [70] "xDetails"            "yDetails"
>
> Then, what about ... in apropos ?
>
> Regards,
>
> Romain
>
>
> >>
> >>
> >> What about a way to pass more than one regular expression then be
> >> able to call :
> >>
> >> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE,
> >> FALSE) )
> >
> > Again, it covers & and !, but it misses other boolean operators.
> >
> >> I usually use that kind of shortcuts that are easy to remember.
> >>
> >> vgrep <- function(...) grep(..., value = TRUE)
> >> igrep <- function(...) grep(..., invert = TRUE)
> >> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
> >
> > If you're willing to write these, 

Re: [Rd] invert argument in grep

2006-11-12 Thread Romain Francois
Duncan Murdoch wrote:
> On 11/10/2006 12:52 PM, Romain Francois wrote:
>> Duncan Murdoch wrote:
>>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>>> Hello,
>>>>
>>>> What about an `invert` argument in grep, to return elements that
>>>> are *not* matching a regular expression :
>>>>
>>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>>
>>>> would essentially return the same as :
>>>>
>>>> R> colors() [ - grep("pink", colors()) ]
>>>>
>>>>
>>>> I'm attaching the files that I modified (against today's tarball)
>>>> for that purpose.
>>>
>>> I think a more generally useful change would be to be able to return 
>>> a logical vector with TRUE for a match and FALSE for a non-match, so 
>>> a simple !grep(...) does the inversion.  (This is motivated by the 
>>> recent R-help discussion of the fact that x[-selection] doesn't 
>>> always invert the selection when it's a vector of indices.)
>>>
>>> A way to do that without expanding the argument list would be to allow
>>>
>>> value="logical"
>>>
>>> as well as value=TRUE and value=FALSE.
>>>
>>> This would make boolean operations easy, e.g.
>>>
>>> colors()[grep("dark", colors(), value="logical")
>>>   & !grep("blue", colors(), value="logical")]
>>>
>>> to select the colors that contain "dark" but not "blue". (In this 
>>> case the RE to select that subset is rather simple because "dark" 
>>> always precedes "blue", but if that wasn't true, it would be a lot 
>>> messier.)
>>>
>>> Duncan Murdoch
>> Hi,
>>
>> It sounds like a nice thing to have. I would still prefer to type :
>>
>> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE), 
>> value = TRUE )  
>
> That's good for intersecting two searches, but not for other boolean 
> combinations.
>
> My main point was that inversion isn't the only boolean operation you 
> may want, but R has perfectly good powerful boolean operators, so 
> installing a limited subset of boolean algebra into grep() is probably 
> the wrong approach.

Hi,

Yes, good point. I agree with you that the value = "logical" is probably 
worth having to take advantage of these logical operators.

But what about all the functions that call grep and pass
arguments through the ellipsis? With this invert argument, we could do :

R> history(pattern = "grid\\..*\\(", invert = TRUE)

BTW, why not use ... in ls? Someone might want to use perl
regexes with ls or, to get back to this thread, issue commands like :

R> ls("package:grid", pattern = "^grid\\.|Grob$", invert = TRUE)
 [1] "absolute.size"       "applyEdit"           "applyEdits"
 [4] "arcCurvature"        "arrow"               "childNames"
 [7] "convertHeight"       "convertNative"       "convertUnit"
[10] "convertWidth"        "convertX"            "convertY"
[13] "current.transform"   "current.viewport"    "current.vpPath"
[16] "current.vpTree"      "dataViewport"        "downViewport"
[19] "draw.details"        "drawDetails"         "editDetails"
[22] "engine.display.list" "gEdit"               "gEditList"
[25] "get.gpar"            "getNames"            "gList"
[28] "gpar"                "gPath"               "grob"
[31] "grobHeight"          "grobName"            "grobWidth"
[34] "grobX"               "grobY"               "gTree"
[37] "heightDetails"       "is.unit"             "layout.heights"
[40] "layoutRegion"        "layout.torture"      "layout.widths"
[43] "plotViewport"        "pop.viewport"        "popViewport"
[46] "postDrawDetails"     "preDrawDetails"      "push.viewport"
[49] "pushViewport"        "seekViewport"        "setChildren"
[52] "stringHeight"        "stringWidth"         "unit"
[55] "unit.c"              "unit.length"         "unit.pmax"
[58] "unit.pmin"           "unit.rep"            "upViewport"
[61] "validDetails"        "viewport"            "viewport.layout"
[64] "viewport.transform"  "vpList"              "vpPath"
[67] "vpStack"             "vpTree"              "widthDetails"
[70] "xDetails"            "yDetails"

Then, what about ... in apropos ?
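
The idea of forwarding extra arguments to grep through the ellipsis can be sketched with a small wrapper. This is only an illustration: `ls_grep` is a hypothetical helper, not part of R, and it assumes a version of R whose grep() accepts `invert` (current versions do).

```r
# Hypothetical wrapper: list the objects in an environment and filter the
# names with grep(), forwarding any extra arguments (invert, perl, ...).
ls_grep <- function(pattern, envir = parent.frame(), ...) {
    grep(pattern, ls(envir = envir), value = TRUE, ...)
}

# Made-up environment to try it on.
e <- new.env()
assign("grid.rect", 1, envir = e)
assign("rectGrob", 2, envir = e)
assign("unit", 3, envir = e)

kept    <- ls_grep("^grid\\.|Grob$", e)                 # names that match
dropped <- ls_grep("^grid\\.|Grob$", e, invert = TRUE)  # names that do not
```

Because the wrapper never names `invert` or `perl` itself, it picks up whatever grep() supports without further changes.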

Regards,

Romain


>>
>>
>> What about a way to pass more than one regular expression then be 
>> able to call :
>>
>> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE,
>> FALSE) )
>
> Again, it covers & and !, but it misses other boolean operators.
>
>> I usually use that kind of shortcuts that are easy to remember.
>>
>> vgrep <- function(...) grep(..., value = TRUE)
>> igrep <- function(...) grep(..., invert = TRUE)
>> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
>
> If you're willing to write these, then it's easy to write igrep 
> without an invert arg to grep:
>
> igrep <- function(pat, x, ...)
>     setdiff(1:length(x), grep(pat, x, value = FALSE, ...))
>
> ivgrep would also be easy, except for the weird semantics of 
> value=TRUE pointed out by Brian:  but it could still be written with a 
> little bit of care.
>
> Duncan Murdoch
>
>>
>> What about things like the arguments `after` and `before` in unix 
>> grep. T

Re: [Rd] invert argument in grep

2006-11-10 Thread Duncan Murdoch
On 11/10/2006 12:52 PM, Romain Francois wrote:
> Duncan Murdoch wrote:
>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>> Hello,
>>>
>>> What about an `invert` argument in grep, to return elements that are 
>>> *not* matching a regular expression :
>>>
>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>
>>> would essentially return the same as :
>>>
>>> R> colors() [ - grep("pink", colors()) ]
>>>
>>>
>>> I'm attaching the files that I modified (against today's tarball) for 
>>> that purpose.
>>
>> I think a more generally useful change would be to be able to return a 
>> logical vector with TRUE for a match and FALSE for a non-match, so a 
>> simple !grep(...) does the inversion.  (This is motivated by the 
>> recent R-help discussion of the fact that x[-selection] doesn't always 
>> invert the selection when it's a vector of indices.)
>>
>> A way to do that without expanding the argument list would be to allow
>>
>> value="logical"
>>
>> as well as value=TRUE and value=FALSE.
>>
>> This would make boolean operations easy, e.g.
>>
>> colors()[grep("dark", colors(), value="logical")
>>   & !grep("blue", colors(), value="logical")]
>>
>> to select the colors that contain "dark" but not "blue". (In this case 
>> the RE to select that subset is rather simple because "dark" always 
>> precedes "blue", but if that wasn't true, it would be a lot messier.)
>>
>> Duncan Murdoch
> Hi,
> 
> It sounds like a nice thing to have. I would still prefer to type :
> 
> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE), 
> value = TRUE )  

That's good for intersecting two searches, but not for other boolean 
combinations.

My main point was that inversion isn't the only boolean operation you 
may want, but R has perfectly good powerful boolean operators, so 
installing a limited subset of boolean algebra into grep() is probably 
the wrong approach.
> 
> 
> What about a way to pass more than one regular expression then be able 
> to call :
> 
> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE, FALSE) )

Again, it covers & and !, but it misses other boolean operators.

> I usually use that kind of shortcuts that are easy to remember.
> 
> vgrep <- function(...) grep(..., value = TRUE)
> igrep <- function(...) grep(..., invert = TRUE)
> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)

If you're willing to write these, then it's easy to write igrep without 
an invert arg to grep:

igrep <- function(pat, x, ...)
    setdiff(1:length(x), grep(pat, x, value = FALSE, ...))

ivgrep would also be easy, except for the weird semantics of value=TRUE 
pointed out by Brian:  but it could still be written with a little bit 
of care.
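
A minimal sketch of both helpers without an invert argument to grep, under two assumptions that are not from the thread: `seq_along(x)` replaces `1:length(x)` so that an empty `x` gives an empty result (1:0 would yield c(1, 0)), and `ivgrep` simply subsets `x`, so it keeps the class of `x` rather than reproducing the coercion that value=TRUE performs.

```r
# Indices of the elements of x that do NOT match pat.
igrep <- function(pat, x, ...) {
    setdiff(seq_along(x), grep(pat, x, value = FALSE, ...))
}

# Values of the non-matching elements, obtained by ordinary subsetting.
# Note: unlike grep(value = TRUE), this does not coerce x to character.
ivgrep <- function(pat, x, ...) {
    x[igrep(pat, x, ...)]
}

x <- c("pink", "red", "hotpink", "blue")   # made-up data
idx  <- igrep("pink", x)
vals <- ivgrep("pink", x)
none <- igrep("pink", character(0))
```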

Duncan Murdoch

> 
> What about things like the arguments `after` and `before` in unix grep. 
> That could be used when grepping inside a function :
> 
> R> grep("plot\\.", body(plot.default) , value= TRUE)
> [1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd) 
> plot.window(...)"
> [2] "plot.new()"
> [3] "plot.xy(xy, type, ...)"
> 
> 
> when this could be useful  (possibly).
> 
> R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
> R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink(); 
> system( paste( "grep -A2 plot\\. ", tmp) )
> localWindow <- function(..., col, bg, pch, cex, lty, lwd) 
> plot.window(...)
> localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
> xlabel <- if (!missing(x))
> --
> plot.new()
> localWindow(xlim, ylim, log, asp, ...)
> panel.first
> plot.xy(xy, type, ...)
> panel.last
> if (axes) {
> --
> if (frame.plot)
> localBox(...)
> if (ann)
> 
> 
> BTW, if I call :
> 
> R> grep("plot\\.", plot.default)
> Error in as.character(x) : cannot coerce to vector
> 
> What about adding that line at the beginning of grep, or something else 
> to be able to do as.character on a function ?
> 
> if(is.function(x)) x <- body(x)
> 
> 
> Cheers,
> 
> Romain
>>>
>>> Cheers,
>>>
>>> Romain
> 
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] invert argument in grep

2006-11-10 Thread Romain Francois
Duncan Murdoch wrote:
> On 11/9/2006 5:14 AM, Romain Francois wrote:
>> Hello,
>>
>> What about an `invert` argument in grep, to return elements that are 
>> *not* matching a regular expression :
>>
>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>
>> would essentially return the same as :
>>
>> R> colors() [ - grep("pink", colors()) ]
>>
>>
>> I'm attaching the files that I modified (against today's tarball) for 
>> that purpose.
>
> I think a more generally useful change would be to be able to return a 
> logical vector with TRUE for a match and FALSE for a non-match, so a 
> simple !grep(...) does the inversion.  (This is motivated by the 
> recent R-help discussion of the fact that x[-selection] doesn't always 
> invert the selection when it's a vector of indices.)
>
> A way to do that without expanding the argument list would be to allow
>
> value="logical"
>
> as well as value=TRUE and value=FALSE.
>
> This would make boolean operations easy, e.g.
>
> colors()[grep("dark", colors(), value="logical")
>   & !grep("blue", colors(), value="logical")]
>
> to select the colors that contain "dark" but not "blue". (In this case 
> the RE to select that subset is rather simple because "dark" always 
> precedes "blue", but if that wasn't true, it would be a lot messier.)
>
> Duncan Murdoch
Hi,

It sounds like a nice thing to have. I would still prefer to type :

R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE), 
value = TRUE )  


What about a way to pass more than one regular expression then be able 
to call :

R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE, FALSE) )


I usually use this kind of shortcut, which is easy to remember.

vgrep <- function(...) grep(..., value = TRUE)
igrep <- function(...) grep(..., invert = TRUE)
ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)

What about things like the arguments `after` and `before` in unix grep. 
That could be used when grepping inside a function :

R> grep("plot\\.", body(plot.default) , value= TRUE)
[1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd) 
plot.window(...)"
[2] "plot.new()"
[3] "plot.xy(xy, type, ...)"


when this could be useful  (possibly).

R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink(); 
system( paste( "grep -A2 plot\\. ", tmp) )
localWindow <- function(..., col, bg, pch, cex, lty, lwd) 
plot.window(...)
localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
xlabel <- if (!missing(x))
--
plot.new()
localWindow(xlim, ylim, log, asp, ...)
panel.first
plot.xy(xy, type, ...)
panel.last
if (axes) {
--
if (frame.plot)
localBox(...)
if (ann)


BTW, if I call :

R> grep("plot\\.", plot.default)
Error in as.character(x) : cannot coerce to vector

What about adding that line at the beginning of grep, or something else 
to be able to do as.character on a function ?

if(is.function(x)) x <- body(x)
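
The `after` idea can be emulated in plain R. The sketch below is one possible reading, not a proposal for grep itself: `grep_after` is a hypothetical name, and it uses `deparse(body(x))` to get one string per source line, which differs slightly from the `body(x)` coercion suggested above.

```r
# Hypothetical helper: return matching lines plus 'after' lines of
# trailing context, similar in spirit to 'grep -A' on the command line.
grep_after <- function(pattern, x, after = 0L, ...) {
    # Accept a function by turning its body into a vector of source lines.
    if (is.function(x)) x <- deparse(body(x))
    hits <- grep(pattern, x, ...)
    # Expand each hit i to i, i+1, ..., i+after, clipped to the input length.
    idx <- as.integer(unlist(lapply(hits, function(i)
        seq(i, min(i + after, length(x))))))
    x[sort(unique(idx))]
}

lines <- c("a", "b", "c", "d")             # made-up input
one_hit  <- grep_after("b", lines, after = 1)
overlap  <- grep_after("[bc]", lines, after = 1)
no_match <- grep_after("z", lines, after = 2)
```

Called on a function, for example `grep_after("plot\\.", plot.default, after = 2)`, it returns the deparsed lines that match and the two lines following each match, with overlapping context merged rather than separated by "--".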


Cheers,

Romain
>>
>> Cheers,
>>
>> Romain


-- 
*mangosolutions*
/data analysis that delivers/

Tel   +44 1249 467 467
Fax   +44 1249 467 468



Re: [Rd] invert argument in grep

2006-11-10 Thread Duncan Murdoch
On 11/10/2006 6:28 AM, Prof Brian Ripley wrote:
> On Fri, 10 Nov 2006, Duncan Murdoch wrote:
> 
>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>> Hello,
>>>
>>> What about an `invert` argument in grep, to return elements that are
>>> *not* matching a regular expression :
>>>
>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>
>>> would essentially return the same as :
>>>
>>> R> colors() [ - grep("pink", colors()) ]
> 
> Note that grep("pat", x, value = TRUE) is not the same as x[grep("pat", x)],
> as the help page carefully points out.  (I think it would be better 
> if it were.)
> 
>>> I'm attaching the files that I modified (against today's tarball) for
>>> that purpose.
> 
> (BTW, sending whole files makes it difficult to see the changes and even 
> harder to merge them; please use diffs.  From a quick look the changes 
> were very incomplete, as the internal functions were changed and there 
> were no changed C files.)
> 
>> I think a more generally useful change would be to be able to return a
>> logical vector with TRUE for a match and FALSE for a non-match, so a
>> simple !grep(...) does the inversion.  (This is motivated by the recent
>> R-help discussion of the fact that x[-selection] doesn't always invert
>> the selection when it's a vector of indices.)
> 
> I don't think that is pertinent here, as the indices are always a vector 
> of positive integers.  

The issue is that the vector might be empty, in which case 
arithmetically negating it has no effect.  Negating a vector of integer 
indices is not a good way to invert a selection, while logical negation 
of a logical vector is fine.
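
To make the empty-selection problem concrete, here is a small sketch; the vector `x` and the pattern are made up for illustration.

```r
x <- c("pink", "red", "blue")

# A pattern that matches nothing gives an empty index vector.
hits <- grep("green", x)

# Arithmetically negating an empty vector leaves it empty, so this
# selects nothing instead of everything.
by_negation <- x[-hits]

# Logical negation of a logical vector inverts the selection correctly.
is_hit <- seq_along(x) %in% hits
by_logical <- x[!is_hit]
```

Here `by_negation` has length zero, while `by_logical` contains all three elements, which is what inverting an empty selection should give.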

> 
>> A way to do that without expanding the argument list would be to allow
>>
>> value="logical"
>>
>> as well as value=TRUE and value=FALSE.
>>
>> This would make boolean operations easy, e.g.
>>
>> colors()[grep("dark", colors(), value="logical")
>>   & !grep("blue", colors(), value="logical")]
>>
>> to select the colors that contain "dark" but not "blue". (In this case
>> the RE to select that subset is rather simple because "dark" always
>> precedes "blue", but if that wasn't true, it would be a lot messier.)
> 
> That might be worthwhile, but it is relatively simple to change positive 
> integer indices to logical ones and v.v.
> 
> My personal take is that having 'value=TRUE' was already a complication 
> not worth having, and implementing it at C level was an efficiency tweak 
> not worth the maintenance effort (and also means that '[' methods are not 
> dispatched).

This makes it sound as though it would be worthwhile to redo the 
implementation of value=TRUE as something equivalent to x[grep("pat", 
x)] by putting this case into the R code.  This would simplify the C 
code and make the interface a little less quirky.  (I'm not sure how 
much code this would break because of the loss of coercion to character.)

The value="logical" implementation could also be done in R, not C.

The advantage of putting it into grep() rather than leaving it for the 
user to change later is that grep() has a copy of x in hand, so a user 
of grep() will not have to save length(x) to use in the conversion to 
logical.
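
For comparison, the conversion a user would otherwise do by hand might look like the following sketch. `index_to_logical` is a hypothetical helper; the point is only that it needs the length of the searched vector as a second input.

```r
# Hypothetical helper: turn grep()'s integer indices into a logical vector.
# 'n' must be the length of the vector that was searched.
index_to_logical <- function(idx, n) {
    out <- logical(n)      # all FALSE
    out[idx] <- TRUE       # mark the matching positions
    out
}

x <- c("darkblue", "darkred", "blue")      # made-up data
m <- index_to_logical(grep("dark", x), length(x))

# Going back from logical to integer indices is just which().
back <- which(m)
```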

Duncan Murdoch



Re: [Rd] invert argument in grep

2006-11-10 Thread Prof Brian Ripley
On Fri, 10 Nov 2006, Duncan Murdoch wrote:

> On 11/9/2006 5:14 AM, Romain Francois wrote:
>> Hello,
>>
>> What about an `invert` argument in grep, to return elements that are
>> *not* matching a regular expression :
>>
>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>
>> would essentially return the same as :
>>
>> R> colors() [ - grep("pink", colors()) ]

Note that grep("pat", x, value = TRUE) is not the same as x[grep("pat", x)],
as the help page carefully points out.  (I think it would be better 
if it were.)
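
One way to see the difference is with a factor, sketched below on made-up data and assuming the coercion behaviour of current R: value = TRUE coerces to character, while subsetting by index keeps the class of `x`.

```r
x <- factor(c("pink", "red", "hotpink"))

# value = TRUE works on as.character(x) and returns a character vector.
by_value <- grep("pink", x, value = TRUE)

# Subsetting with the indices goes through '[' and keeps the factor.
by_index <- x[grep("pink", x)]
```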

>>
>> I'm attaching the files that I modified (against today's tarball) for
>> that purpose.

(BTW, sending whole files makes it difficult to see the changes and even 
harder to merge them; please use diffs.  From a quick look the changes 
were very incomplete, as the internal functions were changed and there 
were no changed C files.)

> I think a more generally useful change would be to be able to return a
> logical vector with TRUE for a match and FALSE for a non-match, so a
> simple !grep(...) does the inversion.  (This is motivated by the recent
> R-help discussion of the fact that x[-selection] doesn't always invert
> the selection when it's a vector of indices.)

I don't think that is pertinent here, as the indices are always a vector 
of positive integers.

> A way to do that without expanding the argument list would be to allow
>
> value="logical"
>
> as well as value=TRUE and value=FALSE.
>
> This would make boolean operations easy, e.g.
>
> colors()[grep("dark", colors(), value="logical")
>   & !grep("blue", colors(), value="logical")]
>
> to select the colors that contain "dark" but not "blue". (In this case
> the RE to select that subset is rather simple because "dark" always
> precedes "blue", but if that wasn't true, it would be a lot messier.)

That might be worthwhile, but it is relatively simple to change positive 
integer indices to logical ones and v.v.

My personal take is that having 'value=TRUE' was already a complication 
not worth having, and implementing it at C level was an efficiency tweak 
not worth the maintenance effort (and also means that '[' methods are not 
dispatched).

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [Rd] invert argument in grep

2006-11-10 Thread Duncan Murdoch
On 11/9/2006 5:14 AM, Romain Francois wrote:
> Hello,
> 
> What about an `invert` argument in grep, to return elements that are 
> *not* matching a regular expression :
> 
> R> grep("pink", colors(), invert = TRUE, value = TRUE)
> 
> would essentially return the same as :
> 
> R> colors() [ - grep("pink", colors()) ]
> 
> 
> I'm attaching the files that I modified (against today's tarball) for 
> that purpose.

I think a more generally useful change would be to be able to return a 
logical vector with TRUE for a match and FALSE for a non-match, so a 
simple !grep(...) does the inversion.  (This is motivated by the recent 
R-help discussion of the fact that x[-selection] doesn't always invert 
the selection when it's a vector of indices.)

A way to do that without expanding the argument list would be to allow

value="logical"

as well as value=TRUE and value=FALSE.

This would make boolean operations easy, e.g.

colors()[grep("dark", colors(), value="logical")
   & !grep("blue", colors(), value="logical")]

to select the colors that contain "dark" but not "blue". (In this case 
the RE to select that subset is rather simple because "dark" always 
precedes "blue", but if that wasn't true, it would be a lot messier.)
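
The same kind of boolean combination can be sketched with grepl(), which current versions of R provide and which returns a logical vector; it did not exist when this was written, so it stands in here for the proposed value="logical". The vector `cols` is made up so that "dark" does not always precede "blue".

```r
cols <- c("darkblue", "darkred", "blue", "bluedark", "red")

has_dark <- grepl("dark", cols)
has_blue <- grepl("blue", cols)

# Ordinary logical operators cover every combination, not just inversion.
dark_not_blue <- cols[has_dark & !has_blue]
dark_or_blue  <- cols[has_dark | has_blue]
exactly_one   <- cols[xor(has_dark, has_blue)]
```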

Duncan Murdoch
> 
> Cheers,
> 
> Romain

[Rd] invert argument in grep

2006-11-09 Thread Romain Francois

Hello,

What about an `invert` argument in grep, to return elements that are 
*not* matching a regular expression :


R> grep("pink", colors(), invert = TRUE, value = TRUE)

would essentially return the same as :

R> colors() [ - grep("pink", colors()) ]


I'm attaching the files that I modified (against today's tarball) for 
that purpose.


Cheers,

Romain

--
*mangosolutions*
/data analysis that delivers/

Tel   +44 1249 467 467
Fax   +44 1249 467 468

grep <-
function(pattern, x, ignore.case = FALSE, extended = TRUE, perl = FALSE,
         value = FALSE, fixed = FALSE, useBytes = FALSE, invert = FALSE)
{
    pattern <- as.character(pattern)
    ## when value = TRUE we return names
    if(!is.character(x)) x <- structure(as.character(x), names=names(x))
    ## behaves like == for NA pattern
    if (is.na(pattern)) {
        if(value)
            return(structure(rep.int(as.character(NA), length(x)),
                             names = names(x)))
        else
            return(rep.int(NA, length(x)))
    }

    if(perl)
        .Internal(grep.perl(pattern, x, ignore.case, value, useBytes, invert))
    else
        .Internal(grep(pattern, x, ignore.case, extended, value, fixed,
                       useBytes, invert))
}

sub <-
function(pattern, replacement, x, ignore.case = FALSE, extended = TRUE,
         perl = FALSE, fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    replacement <- as.character(replacement)
    if(!is.character(x)) x <- as.character(x)
    if (is.na(pattern))
        return(rep.int(as.character(NA), length(x)))

    if(perl)
        .Internal(sub.perl(pattern, replacement, x, ignore.case, useBytes))
    else
        .Internal(sub(pattern, replacement, x, ignore.case,
                      extended, fixed, useBytes))
}

gsub <-
function(pattern, replacement, x, ignore.case = FALSE, extended = TRUE,
         perl = FALSE, fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    replacement <- as.character(replacement)
    if(!is.character(x)) x <- as.character(x)
    if (is.na(pattern))
        return(rep.int(as.character(NA), length(x)))

    if(perl)
        .Internal(gsub.perl(pattern, replacement, x, ignore.case, useBytes))
    else
        .Internal(gsub(pattern, replacement, x, ignore.case,
                       extended, fixed, useBytes))
}

regexpr <-
function(pattern, text, extended = TRUE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    text <- as.character(text)
    if(perl)
        .Internal(regexpr.perl(pattern, text, useBytes))
    else
        .Internal(regexpr(pattern, text, extended, fixed, useBytes))
}

gregexpr <-
function(pattern, text, extended = TRUE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    text <- as.character(text)
    if(perl)
        .Internal(gregexpr.perl(pattern, text, useBytes))
    else
        .Internal(gregexpr(pattern, text, extended, fixed, useBytes))
}

agrep <-
function(pattern, x, ignore.case = FALSE, value = FALSE,
         max.distance = 0.1)
{
    pattern <- as.character(pattern)
    if(!is.character(x)) x <- as.character(x)
    ## behaves like == for NA pattern
    if (is.na(pattern)){
        if (value)
            return(structure(rep.int(as.character(NA), length(x)),
                             names = names(x)))
        else
            return(rep.int(NA, length(x)))
    }

    if(!is.character(pattern)
       || (length(pattern) < 1)
       || ((n <- nchar(pattern)) == 0))
        stop("'pattern' must be a non-empty character string")

    if(!is.list(max.distance)) {
        if(!is.numeric(max.distance) || (max.distance < 0))
            stop("'max.distance' must be non-negative")
        if(max.distance < 1)            # transform percentages
            max.distance <- ceiling(n * max.distance)
        max.insertions <- max.deletions <- max.substitutions <-
            max.distance
    } else {
        ## partial matching
        table <- c("all", "deletions", "insertions", "substitutions")
        ind <- pmatch(names(max.distance), table)
        if(any(is.na(ind)))
            warning("unknown match distance components ignored")
        max.distance <- max.distance[!is.na(ind)]
        names(max.distance) <- table[ind]
        ## sanity checks
        comps <- unlist(max.distance)
        if(!all(is.numeric(comps)) || any(comps < 0))
            stop("'max.distance' components must be non-negative")
        ## extract restrictions
        if(is.null(max.distance$all))
            max.distance$all <- 0.1
        max.insertions <- max.deletions <- max.substitutions <-
            max.distance$all
        if(!is.null(max.distance$deletions))
            max.deletions <- max.distance$deletions
        if(!is.null(max.distance$insertions))
            max.insertions <- max.distance$insertions
        if(!is.null(max.distance$substitutions))
            max.substitutions <- max.distance$su