> -----Original Message-----
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan Greenberg
> Sent: Thursday, October 22, 2009 7:35 PM
> To: r-help
> Subject: [R] splitting a vector of strings...
> 
> Quick question -- if I have a vector of strings that I'd like 
> to split 
> into two new vectors based on a substring that is inside of 
> each string, 
> what is the most efficient way to do this?  The substring 
> that I want to 
> split on is multiple characters, if that matters, and it is 
> contained in 
> every element of the character vector.

strsplit and sub can both be used for this.  If you know
each string will be split into exactly 2 parts, then 2 calls to sub
with slightly different patterns will do it.  strsplit requires
less fiddling with the pattern and is handier when the number
of parts is variable or large, but it returns a list with one
character vector per input string, so its output often needs to
be rearranged for convenient use.
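
To make the two approaches concrete, here is a minimal sketch on a toy vector. The names s, left, right, parts, left2 and right2 are made up for illustration, and it assumes the separator occurs exactly once in each string. The fixed = TRUE argument is an addition of this sketch: it asks strsplit to treat the separator as a literal string rather than a regular expression.

```r
# Toy input: every element contains the separator "qaz" exactly once (assumption).
s <- c("X1qazY1", "X22qazY22", "X333qazY333")

# Approach 1: two calls to sub(), one per side of the separator.
left  <- sub("qaz.*", "", s)   # drop the separator and everything after it
right <- sub(".*qaz", "", s)   # drop everything up to and including the separator

# Approach 2: strsplit() returns a list with one character vector per input
# string, so the pieces have to be pulled out to get two parallel vectors.
parts  <- strsplit(s, "qaz", fixed = TRUE)  # literal separator, not a regex
left2  <- vapply(parts, `[`, character(1), 1)
right2 <- vapply(parts, `[`, character(1), 2)

# Both approaches should agree on this input.
stopifnot(identical(left, left2), identical(right, right2))
```

If the separator could occur more than once per string, the two sub patterns would no longer describe the same split, so the single-occurrence assumption matters here.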

E.g., I made 100,000 strings, each with 'qaz' in the middle, with
  x<-paste("X",sample(1e5),sep="")
  y<-sub("X","Y",x)
  xy<-paste(x,y,sep="qaz")
and split them by the 'qaz' in two ways:
  system.time(ret1<-list(x=sub("qaz.*","",xy),y=sub(".*qaz","",xy)))
  # user  system elapsed 
  # 0.22    0.00    0.21 
 
  system.time({
    tmp<-strsplit(xy,"qaz")
    ret2<-list(x=unlist(lapply(tmp,`[`,1)),y=unlist(lapply(tmp,`[`,2)))
  })
  # user  system elapsed 
  # 2.42    0.00    2.20 
  identical(ret1,ret2)
  #[1] TRUE
  identical(ret1$x,x) && identical(ret1$y,y)
  #[1] TRUE
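
One possible way to rearrange the strsplit output without the two lapply passes is to reshape it into a 2-column matrix. This is only a sketch and was not timed as part of the comparison above; the names m and ret3 are made up for illustration, and it assumes every string splits into exactly two non-empty parts (checked with stopifnot below). It uses a small stand-in vector rather than the 100,000 strings.

```r
# Small stand-in for the 100,000-string vector built above.
x  <- paste("X", 1:5, sep = "")
y  <- sub("X", "Y", x)
xy <- paste(x, y, sep = "qaz")

# Split on the literal separator; fixed = TRUE avoids regex matching.
tmp <- strsplit(xy, "qaz", fixed = TRUE)

# Assumption: every element split into exactly two parts, so unlist(tmp)
# has 2 * length(xy) entries in x, y, x, y, ... order.
stopifnot(all(lengths(tmp) == 2))
m <- matrix(unlist(tmp), ncol = 2, byrow = TRUE)
ret3 <- list(x = m[, 1], y = m[, 2])

# The columns should reproduce the original pieces.
stopifnot(identical(ret3$x, x), identical(ret3$y, y))
```

The length check is there because a string that ends with the separator yields only one piece from strsplit, which would silently misalign the matrix.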

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> 
> --j
> 
> -- 
> 
> Jonathan A. Greenberg, PhD
> Postdoctoral Scholar
> Center for Spatial Technologies and Remote Sensing (CSTARS)
> University of California, Davis
> One Shields Avenue
> The Barn, Room 250N
> Davis, CA 95616
> Phone: 415-763-5476
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
