On Wed, Jul 14, 2010 at 2:21 PM, karena <dr.jz...@gmail.com> wrote:
>
> Hi,
>
> I have a data.frame as following:
> var1 var2
> 1    ab_c_(ok)
> 2    okf789(db)_c
> 3    jojfiod(90).gt
> 4    "ij"_(78)__op
> 5    (iojfodjfo)_ab
>
> what I want is to create a new variable called "var3". the value of var3 is
> the content in the Parentheses. so var3 would be:
> var3
> ok
> db
> 90
> 78
> iojfodjfo
>

Here are several alternatives. The gsub solution matches everything up to
the ( as well as everything after the ) and replaces each match with the
empty string. The strsplit solution splits each string into three fields
(everything before the (, everything within the (), and everything after
the )) and then picks off the second. The strapply solution matches
everything from ( to ) and returns what lies between them.

The code below works whether DF$var2 is a factor or character, but if you
know it's character you can drop the as.character in #2 and #3.

# 1
gsub(".*[(]|[)].*", "", DF$var2)

# 2
sapply(strsplit(as.character(DF$var2), "[()]"), "[", 2)

# 3
library(gsubfn)
strapply(as.character(DF$var2), "[(](.*)[)]", simplify = TRUE)
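
For anyone who wants to try this end to end, here is a minimal sketch using base R only. The DF below is my reconstruction of the data from the question (an assumption, including stringsAsFactors = FALSE), and the sub() line is a base-R stand-in for the capture-group idea behind #3, not the gsubfn call itself.

```r
# Reconstruction of the poster's data frame (assumed from the question text).
DF <- data.frame(
  var1 = 1:5,
  var2 = c("ab_c_(ok)", "okf789(db)_c", "jojfiod(90).gt",
           "\"ij\"_(78)__op", "(iojfodjfo)_ab"),
  stringsAsFactors = FALSE
)

# 1: delete everything through "(" and everything from ")" onward
v1 <- gsub(".*[(]|[)].*", "", DF$var2)

# 2: split on either parenthesis and keep the second piece
v2 <- sapply(strsplit(as.character(DF$var2), "[()]"), "[", 2)

# Base-R stand-in for the capture-group idea in #3 (not the gsubfn call):
# match the whole string and keep only what the parenthesized group captured
v3 <- sub(".*[(](.*)[)].*", "\\1", DF$var2)

# Store the result as the new column
DF$var3 <- v1
print(DF)
```

All of these assume exactly one parenthesized group per string. With more than one group the approaches no longer agree (for example, #1 keeps the last group while #2 keeps the first), so check against your own data.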