".*" is greedy... you might want the regex "number[^0-9]*([0-9]{4})" to avoid getting
1999 from "I want the number 2000, not the number 1999."
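
A minimal sketch of the difference, using only base R (regexec and regmatches) rather than gsubfn. The variable names are illustrative assumptions, not from the thread:

```r
# Test string containing two four-digit numbers after the word "number"
mytext <- "I want the number 2000, not the number 1999."

# Greedy ".*" consumes as much as it can, so the capture group ends up
# on the LAST four-digit run that follows the first "number".
m_greedy <- regmatches(mytext, regexec("number.*([0-9]{4})", mytext))
greedy <- m_greedy[[1]][2]

# "[^0-9]*" cannot skip over digits, so the capture group is the FIRST
# run of four digits after "number".
m_bounded <- regmatches(mytext, regexec("number[^0-9]*([0-9]{4})", mytext))
bounded <- m_bounded[[1]][2]

print(c(greedy = greedy, bounded = bounded))
```

Element 1 of each regmatches result is the whole match and element 2 is the parenthesized group, which is why the sketch indexes with [2].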
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnew...@dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

Gabor Grothendieck <ggrothendi...@gmail.com> wrote:

On Thu, Aug 25, 2011 at 9:51 PM, Lorenzo Cattarino
<l.cattar...@uq.edu.au> wrote:
> Apologies for confusion. What I meant was the following:
>
> mytext <- "I want the number 2000, not the number two thousand"
>
> and the problem is to select "2000" as the first four digits after the word 
> "number". The position of 2000 in the string might change.
>
> thanks
> Lorenzo
>

strapply in gsubfn searches mytext for the indicated regular
expression and passes the back referenced portion (i.e. the portion of
mytext matching the parenthesized portion of the regular expression)
to the as.numeric function whose output is returned.

library(gsubfn)
strapply(mytext, "number.*([0-9]{4})", as.numeric, simplify = TRUE) # 2000
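
If gsubfn is not available, roughly the same idea can be sketched in base R. The helper name extract_group below is hypothetical (it is not part of gsubfn or base R), and it only handles the first capture group:

```r
# Hypothetical helper: for each string in x, pull out the first
# parenthesized group of `pattern` and pass it through `fun`.
# Strings with no match yield NA.
extract_group <- function(x, pattern, fun = as.numeric) {
  matches <- regmatches(x, regexec(pattern, x))
  vapply(matches, function(m) {
    # m[1] is the full match, m[2] the first capture group
    if (length(m) >= 2) fun(m[2]) else NA_real_
  }, numeric(1))
}

mytext <- "I want the number 2000, not the number two thousand"
result <- extract_group(mytext, "number.*([0-9]{4})")
print(result)
```

Because the helper is vectorized over x, it can be applied to a whole character vector in one call; unmatched elements come back as NA instead of raising an error.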

See http://gsubfn.googlecode.com for more info.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

_____________________________________________

R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

