Thanks, I was looking at that package and reading your mails in the archive. I think my tiny mind got twisted in the regexp.
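A minimal sketch of one way the sub() approach can appear not to work, assuming the pattern captured the group but left out the surrounding .* (the shortened test string and the names unchanged and digits are only for illustration):

```r
# Shortened, illustrative version of the HTML fragment from the thread.
test <- "</tr><tr><th>88958</th><th>Abcdsef</th>"

# Without .* on either side, sub() replaces the five digits with
# themselves, so the whole string comes back unchanged.
unchanged <- sub("(\\d{5})", "\\1", test)
identical(unchanged, test)

# With .* on both sides the pattern matches the entire string, so the
# replacement "\\1" leaves only the captured digits.
digits <- sub(".*(\\d{5}).*", "\\1", test)
digits
```

Because the leading .* is greedy, this form returns the last run of five digits if the string contains more than one.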
On Wed, May 5, 2010 at 2:35 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
> Here are two ways to extract 5 digits.
>
> In the first one \\1 refers to the portion matched between the
> parentheses in the regular expression.
>
> In the second one strapply is like apply where the object to be worked
> on is the first argument (array for apply, string for strapply) the
> second modifies it (which dimension for apply, regular expression for
> strapply) and the last is a function which acts on each value
> (typically each row or column for apply and each match for strapply).
> In this case we use c as our function to just return all the results.
> They are returned in a list with one component per string but here
> test is just a single string so we get a list one long and we ask for
> the contents of the first component using [[1]].
>
> # 1 - sub
> sub(".*(\\d{5}).*", "\\1", test)
>
> # 2 - strapply - see http://gsubfn.googlecode.com
> library(gsubfn)
> strapply(test, "\\d{5}", c)[[1]]
>
> On Wed, May 5, 2010 at 5:13 PM, steven mosher <mosherste...@gmail.com> wrote:
> > Given a text like
> >
> > I want to be able to extract a matched regular expression from a piece of
> > text.
> >
> > this apparently works, but is pretty ugly
> > # some html
> > > test<-"</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
> > # a pattern to extract 5 digits
> > > pattern<-"[0-9]{5}"
> > # regexpr returns a start point[1] and an attribute "match.length"
> > attr(,"match.length")
> > # get the substring from the start point to the stop point.. where stop =
> > start +length-1
> > > answer<-substr(test,regexpr(pattern,test)[1],regexpr(pattern,test)[1]+attr(regexpr(pattern,test),"match.length")-1)
> > > answer
> > [1] "88958"
> >
> > I tried using sub(pattern, replacement, x) with a regexp that captured the
> > group. I'd found an example of this in the mails
> > but it didn't seem to work.