In the example below, a straight application of strsplit() is probably the simplest solution. In a more general case where it may be desirable to match patterns, a combination of sub() or gsub() with strsplit() might do the trick:
> x <- "Best-K Gene 11340 211952_at RANBP5 Noc= 3 - 2 LL= -963.669 -965.35"
> patt <- "Best-K Gene \\d+ (\\w+) (\\w+) Noc= \\d - (\\d) LL= (.*)"
> unlist(strsplit(gsub(patt,"\\1,\\2,\\3",x,perl=TRUE),","))
[1] "211952_at" "RANBP5" "2"

Alternatively, you may want to take a look at the gsubfn package - it is quite useful. Still learning to use it myself...

> library(gsubfn)
> unlist(strapply(x,patt,function(x1,x2,x3) c(x1,x2,x3),backref=-3,perl=TRUE))
[1] "211952_at" "RANBP5" "2"

----- Original Message ----
From: Simon Blomberg <[EMAIL PROTECTED]>
To: Edward Wijaya <[EMAIL PROTECTED]>
Cc: r-help@r-project.org
Sent: Thursday, July 31, 2008 11:48:23 PM
Subject: Re: [R] Extract Element of String with R's Regex

How about:

unlist(strsplit(x, split=" "))[c(4:5,9)]

That perl script looks like a good reason to avoid perl.

Simon.

On Fri, 2008-08-01 at 15:13 +0900, Edward Wijaya wrote:
> Hi,
>
> I have this string, from which I want to extract some of its elements:
>
> > x <- "Best-K Gene 11340 211952_at RANBP5 Noc= 3 - 2 LL= -963.669 -965.35"
>
> yielding this array
>
> [1] "211952_at" "RANBP5" "2"
>
> In Perl we would do it this way:
>
> __BEGIN__
> my @needed =();
> my $str = "Best-K Gene 11340 211952_at RANBP5 Noc= 3 - 2 LL= -963.669 -965.35";
> $str =~ /Best-K Gene \d+ (\w+) (\w+) Noc= \d - (\d) LL= (.*)/;
> push @needed, ($1,$2,$3);
> __END___
>
> How can we achieve this with R?
>
> - E.W.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072 Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1. I will NOT analyse your data for you.
2. Your deadline is your problem.

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
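
For the pattern-matching route discussed at the top of this thread, here is a rough sketch of a third option that stays in base R. It assumes regexec() and regmatches() are available, which is true of current R releases but is not something anyone in this thread used, so treat it as an illustration rather than the recommended way. The variable names m and needed are made up for the example; x and patt are the ones from the thread.

```r
# Sketch only: pull the capture groups out directly with base R,
# instead of rewriting the string with gsub() and splitting it again.
x <- "Best-K Gene 11340 211952_at RANBP5 Noc= 3 - 2 LL= -963.669 -965.35"
patt <- "Best-K Gene \\d+ (\\w+) (\\w+) Noc= \\d - (\\d) LL= (.*)"

# regexec() records the position of the full match and of each group;
# regmatches() turns those positions into the matched substrings.
m <- regmatches(x, regexec(patt, x))[[1]]

# An empty result means the pattern did not match at all.
if (length(m) == 0) stop("pattern did not match")

# Element 1 is the whole match, elements 2..5 are the four groups.
# Keep the first three groups, as the Perl snippet does with $1, $2, $3.
needed <- m[2:4]
print(needed)
```

This mirrors the Perl version fairly closely: one match, then pick the groups by position. It also avoids choosing a separator character that might itself occur in the captured text, which the gsub() plus strsplit() combination has to worry about.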