Re: [Rd] invert argument in grep
invert= would be consistent with the fact that egrep (-v), sed/vi (v)
and awk (!~) all have special facilities, as indicated, to handle such
negation/inversion.

On 11/12/06, Romain Francois <[EMAIL PROTECTED]> wrote:
> Duncan Murdoch wrote:
> > On 11/10/2006 12:52 PM, Romain Francois wrote:
> >> Duncan Murdoch wrote:
> >>> On 11/9/2006 5:14 AM, Romain Francois wrote:
> >>>> Hello,
> >>>>
> >>>> What about an `invert` argument in grep, to return elements that
> >>>> are *not* matching a regular expression :
> >>>>
> >>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
> >>>>
> >>>> would essentially return the same as :
> >>>>
> >>>> R> colors() [ - grep("pink", colors()) ]
> >>>>
> >>>>
> >>>> I'm attaching the files that I modified (against today's tarball)
> >>>> for that purpose.
> >>>
> >>> I think a more generally useful change would be to be able to return
> >>> a logical vector with TRUE for a match and FALSE for a non-match, so
> >>> a simple !grep(...) does the inversion. (This is motivated by the
> >>> recent R-help discussion of the fact that x[-selection] doesn't
> >>> always invert the selection when it's a vector of indices.)
> >>>
> >>> A way to do that without expanding the argument list would be to allow
> >>>
> >>> value="logical"
> >>>
> >>> as well as value=TRUE and value=FALSE.
> >>>
> >>> This would make boolean operations easy, e.g.
> >>>
> >>> colors()[grep("dark", colors(), value="logical")
> >>>          & !grep("blue", colors(), value="logical")]
> >>>
> >>> to select the colors that contain "dark" but not "blue". (In this
> >>> case the RE to select that subset is rather simple because "dark"
> >>> always precedes "blue", but if that wasn't true, it would be a lot
> >>> messier.)
> >>>
> >>> Duncan Murdoch
> >> Hi,
> >>
> >> It sounds like a nice thing to have. I would still prefer to type :
> >>
> >> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
> >> value = TRUE )
> >
> > That's good for intersecting two searches, but not for other boolean
> > combinations.
Re: [Rd] invert argument in grep
Duncan Murdoch wrote:
> On 11/10/2006 12:52 PM, Romain Francois wrote:
>> Duncan Murdoch wrote:
>>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>>> Hello,
>>>>
>>>> What about an `invert` argument in grep, to return elements that
>>>> are *not* matching a regular expression :
>>>>
>>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>>
>>>> would essentially return the same as :
>>>>
>>>> R> colors() [ - grep("pink", colors()) ]
>>>>
>>>> I'm attaching the files that I modified (against today's tarball)
>>>> for that purpose.
>>>
>>> I think a more generally useful change would be to be able to return
>>> a logical vector with TRUE for a match and FALSE for a non-match, so
>>> a simple !grep(...) does the inversion. (This is motivated by the
>>> recent R-help discussion of the fact that x[-selection] doesn't
>>> always invert the selection when it's a vector of indices.)
>>>
>>> A way to do that without expanding the argument list would be to allow
>>>
>>> value="logical"
>>>
>>> as well as value=TRUE and value=FALSE.
>>>
>>> This would make boolean operations easy, e.g.
>>>
>>> colors()[grep("dark", colors(), value="logical")
>>>          & !grep("blue", colors(), value="logical")]
>>>
>>> to select the colors that contain "dark" but not "blue". (In this
>>> case the RE to select that subset is rather simple because "dark"
>>> always precedes "blue", but if that wasn't true, it would be a lot
>>> messier.)
>>>
>>> Duncan Murdoch
>> Hi,
>>
>> It sounds like a nice thing to have. I would still prefer to type :
>>
>> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
>> value = TRUE )
>
> That's good for intersecting two searches, but not for other boolean
> combinations.
>
> My main point was that inversion isn't the only boolean operation you
> may want, but R has perfectly good powerful boolean operators, so
> installing a limited subset of boolean algebra into grep() is probably
> the wrong approach.

Hi,

Yes, good point. I agree with you that value = "logical" is probably
worth having, to take advantage of these logical operators.

But what about all the functions that call grep and pass arguments
through the ellipsis? With this invert argument, we could do :

R> history(pattern = "grid\\..*\\(", invert = TRUE)

BTW, why not use ... in ls, in case someone would like to use perl
regexes with ls or, to get back to this thread, issue commands like :

R> ls("package:grid", pattern = "^grid\\.|Grob$", invert = TRUE)
 [1] "absolute.size"       "applyEdit"        "applyEdits"
 [4] "arcCurvature"        "arrow"            "childNames"
 [7] "convertHeight"       "convertNative"    "convertUnit"
[10] "convertWidth"        "convertX"         "convertY"
[13] "current.transform"   "current.viewport" "current.vpPath"
[16] "current.vpTree"      "dataViewport"     "downViewport"
[19] "draw.details"        "drawDetails"      "editDetails"
[22] "engine.display.list" "gEdit"            "gEditList"
[25] "get.gpar"            "getNames"         "gList"
[28] "gpar"                "gPath"            "grob"
[31] "grobHeight"          "grobName"         "grobWidth"
[34] "grobX"               "grobY"            "gTree"
[37] "heightDetails"       "is.unit"          "layout.heights"
[40] "layoutRegion"        "layout.torture"   "layout.widths"
[43] "plotViewport"        "pop.viewport"     "popViewport"
[46] "postDrawDetails"     "preDrawDetails"   "push.viewport"
[49] "pushViewport"        "seekViewport"     "setChildren"
[52] "stringHeight"        "stringWidth"      "unit"
[55] "unit.c"              "unit.length"      "unit.pmax"
[58] "unit.pmin"           "unit.rep"         "upViewport"
[61] "validDetails"        "viewport"         "viewport.layout"
[64] "viewport.transform"  "vpList"           "vpPath"
[67] "vpStack"             "vpTree"           "widthDetails"
[70] "xDetails"            "yDetails"

Then, what about ... in apropos ?

Regards,

Romain

>>
>>
>> What about a way to pass more than one regular expression then be
>> able to call :
>>
>> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE,
>> FALSE)
>
> Again, it covers & and !, but it misses other boolean operators.
>
>> I usually use that kind of shortcuts that are easy to remember.
>>
>> vgrep <- function(...) grep(..., value = TRUE)
>> igrep <- function(...) grep(..., invert = TRUE)
>> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
>
> If you're willing to write these, then it's easy to write igrep
> without an invert arg to grep:
>
> igrep <- function(pat, x, ...)
>     setdiff(1:length(x), grep(pat, x, value = FALSE, ...))
>
> ivgrep would also be easy, except for the weird semantics of
> value=TRUE pointed out by Brian: but it could still be written with a
> little bit of care.
>
> Duncan Murdoch
>
>>
>> What about things like the arguments `after` and `before` in unix
>> grep. T
Re: [Rd] invert argument in grep
On 11/10/2006 12:52 PM, Romain Francois wrote:
> Duncan Murdoch wrote:
>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>> Hello,
>>>
>>> What about an `invert` argument in grep, to return elements that are
>>> *not* matching a regular expression :
>>>
>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>
>>> would essentially return the same as :
>>>
>>> R> colors() [ - grep("pink", colors()) ]
>>>
>>>
>>> I'm attaching the files that I modified (against today's tarball) for
>>> that purpose.
>>
>> I think a more generally useful change would be to be able to return a
>> logical vector with TRUE for a match and FALSE for a non-match, so a
>> simple !grep(...) does the inversion. (This is motivated by the
>> recent R-help discussion of the fact that x[-selection] doesn't always
>> invert the selection when it's a vector of indices.)
>>
>> A way to do that without expanding the argument list would be to allow
>>
>> value="logical"
>>
>> as well as value=TRUE and value=FALSE.
>>
>> This would make boolean operations easy, e.g.
>>
>> colors()[grep("dark", colors(), value="logical")
>>          & !grep("blue", colors(), value="logical")]
>>
>> to select the colors that contain "dark" but not "blue". (In this case
>> the RE to select that subset is rather simple because "dark" always
>> precedes "blue", but if that wasn't true, it would be a lot messier.)
>>
>> Duncan Murdoch
> Hi,
>
> It sounds like a nice thing to have. I would still prefer to type :
>
> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
> value = TRUE )

That's good for intersecting two searches, but not for other boolean
combinations.

My main point was that inversion isn't the only boolean operation you
may want, but R has perfectly good powerful boolean operators, so
installing a limited subset of boolean algebra into grep() is probably
the wrong approach.

>
>
> What about a way to pass more than one regular expression then be able
> to call :
>
> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE, FALSE)

Again, it covers & and !, but it misses other boolean operators.

> I usually use that kind of shortcuts that are easy to remember.
>
> vgrep <- function(...) grep(..., value = TRUE)
> igrep <- function(...) grep(..., invert = TRUE)
> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)

If you're willing to write these, then it's easy to write igrep without
an invert arg to grep:

igrep <- function(pat, x, ...)
    setdiff(1:length(x), grep(pat, x, value = FALSE, ...))

ivgrep would also be easy, except for the weird semantics of value=TRUE
pointed out by Brian: but it could still be written with a little bit of
care.

Duncan Murdoch

>
> What about things like the arguments `after` and `before` in unix grep.
> That could be used when grepping inside a function :
>
> R> grep("plot\\.", body(plot.default) , value= TRUE)
> [1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> plot.window(...)"
> [2] "plot.new()"
> [3] "plot.xy(xy, type, ...)"
>
>
> when this could be useful (possibly).
>
> R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
> R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink();
> system( paste( "grep -A2 plot\\. ", tmp) )
> localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> plot.window(...)
> localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
> xlabel <- if (!missing(x))
> --
> plot.new()
> localWindow(xlim, ylim, log, asp, ...)
> panel.first
> plot.xy(xy, type, ...)
> panel.last
> if (axes) {
> --
> if (frame.plot)
> localBox(...)
> if (ann)
>
>
> BTW, if I call :
>
> R> grep("plot\\.", plot.default)
> Error in as.character(x) : cannot coerce to vector
>
> What about adding that line at the beginning of grep, or something else
> to be able to do as.character on a function ?
>
> if(is.function(x)) x <- body(x)
>
>
> Cheers,
>
> Romain
>>>
>>> Cheers,
>>>
>>> Romain
>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
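The wrappers discussed in this message can be sketched as follows, assuming only base R. The names `igrep` and `ivgrep` follow the thread; using `seq_along(x)` in place of `1:length(x)` is a substitution made here so that a zero-length `x` gives an empty result rather than indices from `c(1, 0)`. This `ivgrep` simply subsets `x`, so it keeps the class of `x` and does not try to reproduce the coercion that `value = TRUE` performs.

```r
# Indices of the elements of x that do NOT match pat.
igrep <- function(pat, x, ...) {
    setdiff(seq_along(x), grep(pat, x, value = FALSE, ...))
}

# Elements of x that do NOT match pat, obtained by plain subsetting.
ivgrep <- function(pat, x, ...) {
    x[igrep(pat, x, ...)]
}

x <- c("pink", "hotpink", "blue", "darkblue")
igrep("pink", x)              # 3 4
ivgrep("pink", x)             # "blue" "darkblue"
igrep("pink", character(0))   # integer(0)
ivgrep("zzz", x)              # nothing matches, so all of x comes back
```

Because the inversion is done with a set difference on positions, it stays correct when nothing matches, which is the case where `x[-grep(pat, x)]` goes wrong.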
Re: [Rd] invert argument in grep
Duncan Murdoch wrote:
> On 11/9/2006 5:14 AM, Romain Francois wrote:
>> Hello,
>>
>> What about an `invert` argument in grep, to return elements that are
>> *not* matching a regular expression :
>>
>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>
>> would essentially return the same as :
>>
>> R> colors() [ - grep("pink", colors()) ]
>>
>>
>> I'm attaching the files that I modified (against today's tarball) for
>> that purpose.
>
> I think a more generally useful change would be to be able to return a
> logical vector with TRUE for a match and FALSE for a non-match, so a
> simple !grep(...) does the inversion. (This is motivated by the
> recent R-help discussion of the fact that x[-selection] doesn't always
> invert the selection when it's a vector of indices.)
>
> A way to do that without expanding the argument list would be to allow
>
> value="logical"
>
> as well as value=TRUE and value=FALSE.
>
> This would make boolean operations easy, e.g.
>
> colors()[grep("dark", colors(), value="logical")
>          & !grep("blue", colors(), value="logical")]
>
> to select the colors that contain "dark" but not "blue". (In this case
> the RE to select that subset is rather simple because "dark" always
> precedes "blue", but if that wasn't true, it would be a lot messier.)
>
> Duncan Murdoch

Hi,

It sounds like a nice thing to have. I would still prefer to type :

R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
          value = TRUE )

What about a way to pass more than one regular expression, to then be
able to call :

R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE, FALSE) )

I usually use this kind of shortcut, which is easy to remember.

vgrep <- function(...) grep(..., value = TRUE)
igrep <- function(...) grep(..., invert = TRUE)
ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)

What about things like the arguments `after` and `before` in unix grep?
That could be used when grepping inside a function :

R> grep("plot\\.", body(plot.default) , value= TRUE)
[1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd) plot.window(...)"
[2] "plot.new()"
[3] "plot.xy(xy, type, ...)"

when something like this could (possibly) be more useful :

R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink();
   system( paste( "grep -A2 plot\\. ", tmp) )
localWindow <- function(..., col, bg, pch, cex, lty, lwd) plot.window(...)
localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
xlabel <- if (!missing(x))
--
plot.new()
localWindow(xlim, ylim, log, asp, ...)
panel.first
plot.xy(xy, type, ...)
panel.last
if (axes) {
--
if (frame.plot)
localBox(...)
if (ann)

BTW, if I call :

R> grep("plot\\.", plot.default)
Error in as.character(x) : cannot coerce to vector

What about adding this line at the beginning of grep, or something else
to be able to do as.character on a function ?

if(is.function(x)) x <- body(x)

Cheers,

Romain

>>
>> Cheers,
>>
>> Romain

-- 
*mangosolutions*
/data analysis that delivers/

Tel +44 1249 467 467
Fax +44 1249 467 468
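The `after`/`before` idea from this message can be approximated in plain R without shelling out to unix grep. The sketch below is only an illustration: `grep_context` and its `before`/`after` arguments are hypothetical names, not part of base R, and turning a function into lines with `deparse(body(x))` is one possible choice among several.

```r
# Hypothetical helper: return the matching lines plus `before`/`after`
# lines of context, similar in spirit to `grep -B` / `grep -A`.
grep_context <- function(pattern, x, before = 0L, after = 0L, ...) {
    # A function is turned into one string per line of its deparsed body.
    if (is.function(x)) x <- deparse(body(x))
    hits <- grep(pattern, x, ...)
    if (length(hits) == 0L) return(character(0))
    # Expand every hit into its window of line numbers.
    idx <- unlist(lapply(hits, function(i) seq(i - before, i + after)))
    # Drop out-of-range and repeated line numbers, keep the original order.
    idx <- sort(unique(idx[idx >= 1L & idx <= length(x)]))
    x[idx]
}

f <- function(x) {
    a <- x + 1
    plot.new()
    b <- a * 2
    b
}

# The line calling plot.new() and the line that follows it.
grep_context("plot\\.", f, after = 1)
```

Unlike `grep -A`, this version merges overlapping windows silently and prints no `--` separators between groups; adding those would need the hit positions to be kept alongside the lines.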
Re: [Rd] invert argument in grep
On 11/10/2006 6:28 AM, Prof Brian Ripley wrote:
> On Fri, 10 Nov 2006, Duncan Murdoch wrote:
>
>> On 11/9/2006 5:14 AM, Romain Francois wrote:
>>> Hello,
>>>
>>> What about an `invert` argument in grep, to return elements that are
>>> *not* matching a regular expression :
>>>
>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>>
>>> would essentially return the same as :
>>>
>>> R> colors() [ - grep("pink", colors()) ]
>
> Note that grep("pat", x, value = TRUE) is not the same as x[grep("pat", x)],
> as the help page carefully points out. (I think it would be better
> if it were.)
>
>>> I'm attaching the files that I modified (against today's tarball) for
>>> that purpose.
>
> (BTW, sending whole files makes it difficult to see the changes and even
> harder to merge them; please use diffs. From a quick look the changes
> were very incomplete, as the internal functions were changed and there
> were no changed C files.)
>
>> I think a more generally useful change would be to be able to return a
>> logical vector with TRUE for a match and FALSE for a non-match, so a
>> simple !grep(...) does the inversion. (This is motivated by the recent
>> R-help discussion of the fact that x[-selection] doesn't always invert
>> the selection when it's a vector of indices.)
>
> I don't think that is pertinent here, as the indices are always a vector
> of positive integers.

The issue is that the vector might be empty, in which case arithmetically
negating it has no effect: x[-selection] then selects nothing instead of
everything. Negating a vector of integer indices is not a good way to
invert a selection, while logical negation of a logical vector is fine.

>
>> A way to do that without expanding the argument list would be to allow
>>
>> value="logical"
>>
>> as well as value=TRUE and value=FALSE.
>>
>> This would make boolean operations easy, e.g.
>>
>> colors()[grep("dark", colors(), value="logical")
>>          & !grep("blue", colors(), value="logical")]
>>
>> to select the colors that contain "dark" but not "blue". (In this case
>> the RE to select that subset is rather simple because "dark" always
>> precedes "blue", but if that wasn't true, it would be a lot messier.)
>
> That might be worthwhile, but it is relatively simple to change positive
> integer indices to logical ones and v.v.
>
> My personal take is that having 'value=TRUE' was already a complication
> not worth having, and implementing it at C level was an efficiency tweak
> not worth the maintenance effort (and also means that '[' methods are not
> dispatched).

This makes it sound as though it would be worthwhile to redo the
implementation of value=TRUE as something equivalent to x[grep("pat", x)]
by putting this case into the R code. This would simplify the C code and
make the interface a little less quirky. (I'm not sure how much code this
would break because of the loss of coercion to character.)

The value="logical" implementation could also be done in R, not C. The
advantage of putting it into grep() rather than leaving it for the user
to change later is that grep() has a copy of x in hand, so a user of
grep() will not have to save length(x) to use in the conversion to
logical.

Duncan Murdoch
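The empty-selection problem described in this message is easy to reproduce. The sketch below uses a small made-up vector instead of `colors()`, and relies on `grepl()` and on the `invert` argument of `grep()`, both of which exist in current versions of R but did not at the time of this thread.

```r
x <- c("pink", "hotpink", "blue", "darkblue")

idx <- grep("green", x)     # no element matches: integer(0)
x[-idx]                     # character(0), not the whole of x
x[!grepl("green", x)]       # all four elements, as intended

# A boolean combination in the style of the "dark but not blue" example:
x[grepl("dark", x) & !grepl("pink", x)]        # "darkblue"

# Plain inversion through grep's own argument:
grep("pink", x, invert = TRUE, value = TRUE)   # "blue" "darkblue"
```

Negating `integer(0)` gives `integer(0)` again, so the integer form silently selects nothing, while the logical form always has one entry per element of `x` and inverts correctly.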
Re: [Rd] invert argument in grep
On Fri, 10 Nov 2006, Duncan Murdoch wrote:

> On 11/9/2006 5:14 AM, Romain Francois wrote:
>> Hello,
>>
>> What about an `invert` argument in grep, to return elements that are
>> *not* matching a regular expression :
>>
>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>>
>> would essentially return the same as :
>>
>> R> colors() [ - grep("pink", colors()) ]

Note that grep("pat", x, value = TRUE) is not the same as x[grep("pat", x)],
as the help page carefully points out. (I think it would be better
if it were.)

>>
>> I'm attaching the files that I modified (against today's tarball) for
>> that purpose.

(BTW, sending whole files makes it difficult to see the changes and even
harder to merge them; please use diffs. From a quick look the changes
were very incomplete, as the internal functions were changed and there
were no changed C files.)

> I think a more generally useful change would be to be able to return a
> logical vector with TRUE for a match and FALSE for a non-match, so a
> simple !grep(...) does the inversion. (This is motivated by the recent
> R-help discussion of the fact that x[-selection] doesn't always invert
> the selection when it's a vector of indices.)

I don't think that is pertinent here, as the indices are always a vector
of positive integers.

> A way to do that without expanding the argument list would be to allow
>
> value="logical"
>
> as well as value=TRUE and value=FALSE.
>
> This would make boolean operations easy, e.g.
>
> colors()[grep("dark", colors(), value="logical")
>          & !grep("blue", colors(), value="logical")]
>
> to select the colors that contain "dark" but not "blue". (In this case
> the RE to select that subset is rather simple because "dark" always
> precedes "blue", but if that wasn't true, it would be a lot messier.)

That might be worthwhile, but it is relatively simple to change positive
integer indices to logical ones and v.v.

My personal take is that having 'value=TRUE' was already a complication
not worth having, and implementing it at C level was an efficiency tweak
not worth the maintenance effort (and also means that '[' methods are not
dispatched).

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
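Two points from this message can be illustrated with a short sketch, assuming only base R and made-up data: converting between positive integer indices and a logical vector, and the difference between `value = TRUE` and subsetting with `[` when `x` is not a character vector (a factor is used here as one convenient example).

```r
x <- c("pink", "blue", "hotpink")

idx <- grep("pink", x)           # positive integer indices: 1 3
lgl <- seq_along(x) %in% idx     # index -> logical; needs the length of x
which(lgl)                       # logical -> index: 1 3

# value = TRUE coerces to character; subsetting goes through '['.
f  <- factor(c("pink", "blue", "hotpink"))
v1 <- grep("pink", f, value = TRUE)   # a character vector
v2 <- f[grep("pink", f)]              # still a factor
class(v1)
class(v2)
```

The index-to-logical step is the one that needs `length(x)` (here via `seq_along(x)`), which is the information the caller has to keep around if `grep()` does not return a logical vector itself.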
Re: [Rd] invert argument in grep
On 11/9/2006 5:14 AM, Romain Francois wrote:
> Hello,
>
> What about an `invert` argument in grep, to return elements that are
> *not* matching a regular expression :
>
> R> grep("pink", colors(), invert = TRUE, value = TRUE)
>
> would essentially return the same as :
>
> R> colors() [ - grep("pink", colors()) ]
>
>
> I'm attaching the files that I modified (against today's tarball) for
> that purpose.

I think a more generally useful change would be to be able to return a
logical vector with TRUE for a match and FALSE for a non-match, so a
simple !grep(...) does the inversion. (This is motivated by the recent
R-help discussion of the fact that x[-selection] doesn't always invert
the selection when it's a vector of indices.)

A way to do that without expanding the argument list would be to allow

value="logical"

as well as value=TRUE and value=FALSE.

This would make boolean operations easy, e.g.

colors()[grep("dark", colors(), value="logical")
         & !grep("blue", colors(), value="logical")]

to select the colors that contain "dark" but not "blue". (In this case
the RE to select that subset is rather simple because "dark" always
precedes "blue", but if that wasn't true, it would be a lot messier.)

Duncan Murdoch

> Cheers,
>
> Romain
[Rd] invert argument in grep
Hello,

What about an `invert` argument in grep, to return elements that are
*not* matching a regular expression :

R> grep("pink", colors(), invert = TRUE, value = TRUE)

would essentially return the same as :

R> colors() [ - grep("pink", colors()) ]

I'm attaching the files that I modified (against today's tarball) for
that purpose.

Cheers,

Romain

-- 
*mangosolutions*
/data analysis that delivers/

Tel +44 1249 467 467
Fax +44 1249 467 468

grep <-
function(pattern, x, ignore.case = FALSE, extended = TRUE, perl = FALSE,
         value = FALSE, fixed = FALSE, useBytes = FALSE, invert = FALSE)
{
    pattern <- as.character(pattern)
    ## when value = TRUE we return names
    if(!is.character(x)) x <- structure(as.character(x), names=names(x))
    ## behaves like == for NA pattern
    if (is.na(pattern)) {
        if(value)
            return(structure(rep.int(as.character(NA), length(x)),
                             names = names(x)))
        else
            return(rep.int(NA, length(x)))
    }

    if(perl)
        .Internal(grep.perl(pattern, x, ignore.case, value, useBytes, invert))
    else
        .Internal(grep(pattern, x, ignore.case, extended, value, fixed,
                       useBytes, invert))
}

sub <-
function(pattern, replacement, x, ignore.case = FALSE, extended = TRUE,
         perl = FALSE, fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    replacement <- as.character(replacement)
    if(!is.character(x)) x <- as.character(x)
    if (is.na(pattern))
        return(rep.int(as.character(NA), length(x)))

    if(perl)
        .Internal(sub.perl(pattern, replacement, x, ignore.case, useBytes))
    else
        .Internal(sub(pattern, replacement, x, ignore.case,
                      extended, fixed, useBytes))
}

gsub <-
function(pattern, replacement, x, ignore.case = FALSE, extended = TRUE,
         perl = FALSE, fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    replacement <- as.character(replacement)
    if(!is.character(x)) x <- as.character(x)
    if (is.na(pattern))
        return(rep.int(as.character(NA), length(x)))

    if(perl)
        .Internal(gsub.perl(pattern, replacement, x, ignore.case, useBytes))
    else
        .Internal(gsub(pattern, replacement, x, ignore.case,
                       extended, fixed, useBytes))
}

regexpr <-
function(pattern, text, extended = TRUE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    text <- as.character(text)
    if(perl)
        .Internal(regexpr.perl(pattern, text, useBytes))
    else
        .Internal(regexpr(pattern, text, extended, fixed, useBytes))
}

gregexpr <-
function(pattern, text, extended = TRUE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)
{
    pattern <- as.character(pattern)
    text <- as.character(text)
    if(perl)
        .Internal(gregexpr.perl(pattern, text, useBytes))
    else
        .Internal(gregexpr(pattern, text, extended, fixed, useBytes))
}

agrep <-
function(pattern, x, ignore.case = FALSE, value = FALSE,
         max.distance = 0.1)
{
    pattern <- as.character(pattern)
    if(!is.character(x)) x <- as.character(x)
    ## behaves like == for NA pattern
    if (is.na(pattern)){
        if (value)
            return(structure(rep.int(as.character(NA), length(x)),
                             names = names(x)))
        else
            return(rep.int(NA, length(x)))
    }

    if(!is.character(pattern)
       || (length(pattern) < 1)
       || ((n <- nchar(pattern)) == 0))
        stop("'pattern' must be a non-empty character string")

    if(!is.list(max.distance)) {
        if(!is.numeric(max.distance) || (max.distance < 0))
            stop("'max.distance' must be non-negative")
        if(max.distance < 1) # transform percentages
            max.distance <- ceiling(n * max.distance)
        max.insertions <- max.deletions <- max.substitutions <-
            max.distance
    } else {
        ## partial matching
        table <- c("all", "deletions", "insertions", "substitutions")
        ind <- pmatch(names(max.distance), table)
        if(any(is.na(ind)))
            warning("unknown match distance components ignored")
        max.distance <- max.distance[!is.na(ind)]
        names(max.distance) <- table[ind]
        ## sanity checks
        comps <- unlist(max.distance)
        if(!all(is.numeric(comps)) || any(comps < 0))
            stop("'max.distance' components must be non-negative")
        ## extract restrictions
        if(is.null(max.distance$all))
            max.distance$all <- 0.1
        max.insertions <- max.deletions <- max.substitutions <-
            max.distance$all
        if(!is.null(max.distance$deletions))
            max.deletions <- max.distance$deletions
        if(!is.null(max.distance$insertions))
            max.insertions <- max.distance$insertions
        if(!is.null(max.distance$substitutions))
            max.substitutions <- max.distance$su