How about:

x <- "1 2 -5, 3- 6 4 8 5-7 10"; x

library(gsubfn)
strapply( x, '(([0-9]+) *- *([0-9]+))|([0-9]+)',
    function(one,two,three,four) {
        # group four is non-empty only when a lone integer matched
        if( nchar(four) > 0 ) return( as.numeric(four) )
        # otherwise groups two and three hold the two ends of a range
        return( seq( from=as.numeric(two), to=as.numeric(three) ) )
    } )[[1]]

If x is a vector of strings and you remove the [[1]], then you will get a list with each element corresponding to a string in x (unlisting will give a single vector).

This could easily be extended to handle floating point numbers instead of just integers, and even negative numbers (as long as you have a clear rule to distinguish a negative sign from the range separator).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
> Sent: Friday, August 20, 2010 2:55 PM
> To: r-help@r-project.org
> Subject: [R] Regex exercise
>
> For regular expression aficionados, I'd like a cleverer solution to
> the following problem (my solution works just fine for my needs; I'm
> just trying to improve my regex skills):
>
> Given the string (entered, say, at a readline prompt):
>
> "1 2 -5, 3- 6 4 8 5-7 10" ## only integers will be entered
>
> parse it to produce the numeric vector:
>
> c(1, 2, 3, 4, 5, 3, 4, 5, 6, 4, 8, 5, 6, 7, 10)
>
> Note that "-" in the expression is used to indicate a range of values
> instead of ":"
>
> Here's my UNclever solution:
>
> First convert more than one space to a single space and then replace
> "<any spaces>-<any spaces>" by ":" by:
>
> > x1 <- gsub(" *- *",":",gsub(" +"," ",resp)) #giving
> > x1
> [1] "1 2:5, 3:6 4 8 5:7 10" ## Note that the comma remains
>
> Next convert the single string into a character vector via strsplit by
> splitting on anything but ":" or a digit:
>
> > x2 <- strsplit(x1,split="[^:[:digit:]]+")[[1]] #giving
> > x2
> [1] "1" "2:5" "3:6" "4" "8" "5:7" "10"
>
> Finally, parse() the vector, eval() each element, and unlist() the
> resulting list of numeric vectors:
>
> > unlist(lapply(parse(text=x2),eval)) #giving, as desired,
> [1] 1 2 3 4 5 3 4 5 6 4 8 5 6 7 10
>
> This seems far too clumsy and circumlocutory not to have a more
> elegant solution from a true regex expert.
>
> (Special note to Thomas Lumley: This seems one of the few instances
> where eval(parse(..)) may actually be appropriate.)
>
> Cheers to all,
>
> Bert
>
> --
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
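
For comparison with the strapply() call and the eval(parse()) approach in the thread, here is a minimal base-R sketch of the same idea that needs neither gsubfn nor eval(). It is not from the thread: the function name expand_ranges is hypothetical, and it assumes the input contains only non-negative integers with "-" as the range separator, as in the original problem statement.

```r
# Hypothetical helper (not from the thread): expand "a-b" ranges in a
# string of non-negative integers, using base R only.
expand_ranges <- function(s) {
  # Same alternation as the strapply() pattern: a range "a - b"
  # (optional spaces around the dash) or a lone integer.
  pattern <- "[0-9]+ *- *[0-9]+|[0-9]+"
  tokens <- regmatches(s, gregexpr(pattern, s))[[1]]

  pieces <- lapply(tokens, function(tok) {
    # Splitting on the dash yields two numbers for a range, one otherwise.
    ends <- as.numeric(strsplit(tok, " *- *")[[1]])
    if (length(ends) == 2) seq(from = ends[1], to = ends[2]) else ends
  })
  unlist(pieces)
}

x <- "1 2 -5, 3- 6 4 8 5-7 10"
expand_ranges(x)
# [1]  1  2  3  4  5  3  4  5  6  4  8  5  6  7 10
```

Because each token is converted with as.numeric() rather than evaluated, nothing typed at the prompt is ever run as R code, which is the usual objection to eval(parse()).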