Re: [R] Using grep() to subset lines of text

2008-12-02 Thread Uwe Ligges



ppaarrkk wrote:

I have two vectors, a and b; b contains the lines of a text file. I want to
find the lines of b that begin with any of the elements of a. The following
code only returns a match for the first value in a, but I want matches for
both. Any ideas, please?


a = c(2,3)

b = NULL
b[1] = "aaa 2 aaa"
b[2] = "2 aaa"
b[3] = "3 aaa"
b[4] = "aaa 3 aaa"

grep(paste("^", a, sep=""), b)




grep(paste("^", a, collapse = "|", sep = ""), b)

Uwe Ligges
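
For readers following along, here is a minimal sketch of what collapse
changes in the call above. Without collapse, paste() returns one pattern per
element of a, and grep() uses only the first of them; with collapse = "|"
the pieces are joined into a single alternation. The variable names
(separate, combined, hits) are illustrative only and do not come from the
thread.

```r
a <- c(2, 3)
b <- c("aaa 2 aaa", "2 aaa", "3 aaa", "aaa 3 aaa")

# One pattern per element of a: grep() would only use the first one.
separate <- paste("^", a, sep = "")
print(separate)   # "^2" "^3"

# A single pattern in which either alternative may match at line start.
combined <- paste("^", a, collapse = "|", sep = "")
print(combined)   # "^2|^3"

# Indices of the lines of b that start with 2 or 3.
hits <- grep(combined, b)
print(hits)       # 2 3
```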

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using grep() to subset lines of text

2008-11-29 Thread ppaarrkk

I have two vectors, a and b; b contains the lines of a text file. I want to
find the lines of b that begin with any of the elements of a. The following
code only returns a match for the first value in a, but I want matches for
both. Any ideas, please?


a = c(2,3)

b = NULL
b[1] = "aaa 2 aaa"
b[2] = "2 aaa"
b[3] = "3 aaa"
b[4] = "aaa 3 aaa"

grep(paste("^", a, sep=""), b)

-- 
View this message in context: 
http://www.nabble.com/Using-grep%28%29-to-subset-lines-of-text-tp20746365p20746365.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using grep() to subset lines of text

2008-11-29 Thread Gabor Grothendieck
Try this:

> a <- 2:3
> b <- c("aaa 2 aaa", "2 aaa", "3 aaa", "aaa 3 aaa")
>
> re <- paste("^(", paste(a, collapse = "|"), ")", sep = "")
> re
[1] "^(2|3)"
> grep(re, b, value = TRUE)
[1] "2 aaa" "3 aaa"
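
One reason the grouped form can be handy: anything placed outside the
parentheses applies to every alternative. The sketch below is an assumption
about a likely follow-up need, not something asked in the thread. It adds a
trailing "( |$)" so that a line starting with "23" is not treated as
starting with the element 2. The extra test line "23 aaa" and the variable
names are made up for illustration.

```r
a <- 2:3
b <- c("aaa 2 aaa", "2 aaa", "3 aaa", "aaa 3 aaa", "23 aaa")

alternatives <- paste(a, collapse = "|")

# The anchor and the trailing "space or end of line" condition sit outside
# the group, so they apply to both 2 and 3.
grouped <- paste("^(", alternatives, ")( |$)", sep = "")
print(grouped)   # "^(2|3)( |$)"

# value = TRUE returns the matching lines rather than their indices.
matches <- grep(grouped, b, value = TRUE)
print(matches)   # "2 aaa" "3 aaa"
```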




Re: [R] Using grep() to subset lines of text

2008-11-29 Thread Stavros Macrakis
Hmm, this brings up an interesting question.  What if the string I'm looking
for contains characters that are special in regular expressions?  For
example, grep( paste( "^", "(ab)", sep = "" ), c("ab","(ab)") ) = c(1),
not c(2).

I couldn't find an equivalent to Emacs's regexp-quote, which would let me
write regexp.quote("(ab)") = "\\(ab\\)".  The syntax of regular expressions
is complicated enough that this is not trivial. Is there perhaps a CRAN
package with regular expression utilities?

-s
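
To make the problem concrete, here is a rough sketch of what such a quoting
function might look like in base R. regexp.quote is a hypothetical helper
named after the wish in the message above; it is not an existing R
function. It assumes that backslash-escaping the usual metacharacters is
enough for the patterns at hand, and it uses perl = TRUE so that the
bracket expression is parsed by PCRE rules.

```r
# Hypothetical helper (not part of base R): put a backslash in front of
# each regular-expression metacharacter in s.
regexp.quote <- function(s) {
  gsub("([][{}()+*^$|\\\\?.])", "\\\\\\1", s, perl = TRUE)
}

x <- c("ab", "(ab)")

# Unquoted: the parentheses form a group, so "^(ab)" matches "ab".
unquoted <- grep(paste("^", "(ab)", sep = ""), x)
print(unquoted)   # 1

# Quoted: the parentheses are literal, so only "(ab)" matches.
quoted <- grep(paste("^", regexp.quote("(ab)"), sep = ""), x)
print(quoted)     # 2
```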





Re: [R] Using grep() to subset lines of text

2008-11-29 Thread Gabor Grothendieck
grep has a fixed = TRUE argument if you want to ignore all regexp's.
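
A short sketch of what fixed = TRUE does and does not do, using made-up
test strings. With fixed = TRUE the whole pattern is a literal string, so
parentheses need no escaping, but "^" becomes literal as well and can no
longer anchor the match. The last line uses startsWith(), which is in base
R from version 3.3.0 on and so postdates this thread; it is shown only as
one possible way to do a literal prefix test.

```r
x <- c("ab", "(ab)", "x (ab)")

# Literal search: "(" and ")" are ordinary characters here.
anywhere <- grep("(ab)", x, fixed = TRUE)
print(anywhere)   # 2 3

# "^" is literal too, so this looks for the four characters "^(ab".
anchored <- grep("^(ab)", x, fixed = TRUE)
print(anchored)   # integer(0)

# Literal prefix test without any regular expression (R >= 3.3.0).
prefix <- which(startsWith(x, "(ab)"))
print(prefix)     # 2
```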






Re: [R] Using grep() to subset lines of text

2008-11-29 Thread Stavros Macrakis
But I don't want to ignore all regexp's -- I want to build a regexp which
contains string components which are parameters.

-s

 




Re: [R] Using grep() to subset lines of text

2008-11-29 Thread Gabor Grothendieck
Try this. For each character x in s, if x is punctuation it is replaced
with \\x, otherwise with [x]:

library(gsubfn)
gsubfn('.', ~ if (any(grep("[[:punct:]]", x))) paste0('\\', x) else
    paste0('[', x, ']'), s)

See http://gsubfn.googlecode.com
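
The same per-character rule can also be sketched without gsubfn. The
function below is an illustrative base-R version, not code from the gsubfn
package: quote_chars is a made-up name, and it assumes s is a single
string.

```r
# Illustrative helper: escape punctuation with a backslash and wrap every
# other character in a one-character bracket expression.
quote_chars <- function(s) {
  chars <- strsplit(s, "")[[1]]
  is_punct <- grepl("[[:punct:]]", chars)
  pieces <- ifelse(is_punct,
                   paste("\\", chars, sep = ""),
                   paste("[", chars, "]", sep = ""))
  paste(pieces, collapse = "")
}

quoted <- quote_chars("(ab)")
print(quoted)   # "\\([a][b]\\)"

# The quoted string can be embedded in a larger pattern.
hit <- grep(paste("^", quoted, sep = ""), c("ab", "(ab)"))
print(hit)      # 2
```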


 


