> On 9 Oct 2017, at 17:02, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
> 
> I have a file containing "words" like
> 
> 
> a
> 
> a/b
> 
> a/b/c
> 
> where there may be multiple words on a line (separated by spaces).  The a, b, 
> and c strings can contain non-space, non-slash characters. I'd like to use 
> gsub() to extract the c strings (which should be empty if there are none).
> 
> A real example is
> 
> "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"
> 
> which I'd like to transform to
> 
> " 587 587 587 587"
> 
> Another real example is
> 
> "f 1067 28680 24462"
> 
> which should transform to "   ".
> 
> I've tried a few different regexps, but am unable to find a way to say 
> "transform words by deleting everything up to and including the 2nd slash" 
> when there might be zero, one or two slashes.  Any suggestions?
> 

I think you might need something like this:

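s <- "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"
l <- strsplit(s, " ")[[1]]   # split the line into words
# two fields, each followed by a slash, then capture the third field
pat <- "[[:alnum:]]*/[[:alnum:]]*/([[:alnum:]]*)"
# keep the captured field for words that match, "" for all others
paste(ifelse(grepl(pat, l), gsub(pat, "\\1", l), ""), collapse = " ")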
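
If the a, b and c strings can really contain any non-space, non-slash characters (as the question says), [[:alnum:]] may be too narrow. A possible variant is sketched below: it uses [^/ ] instead and anchors the pattern to the whole word. The function name extract_third is just an illustrative assumption, not something from base R or from this thread.

```r
# Sketch only: extract the third slash-separated field of each word.
# extract_third() is a hypothetical helper name chosen for illustration.
extract_third <- function(line) {
  # Split the line into words on single spaces.
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  # Anchored pattern: two fields of non-space, non-slash characters,
  # each followed by a slash, then capture the third field.
  pat <- "^[^/ ]*/[^/ ]*/([^/ ]*)$"
  # Words with two slashes keep the captured field; all others become "".
  third <- ifelse(grepl(pat, words), sub(pat, "\\1", words), "")
  paste(third, collapse = " ")
}

extract_third("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587")
# " 587 587 587 587"
extract_third("f 1067 28680 24462")
# "   "
```

Because the pattern is anchored with ^ and $, a word either matches as a whole or is replaced by the empty string, so no stray leading characters are left behind.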

-pd

> Duncan Murdoch
> 
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com
