On 09/10/2017 11:23 AM, Ulrik Stervbo wrote:
Hi Duncan,

Why not split on "/" and take the correct elements? It is not as elegant as a regex, but it could do the trick.

Thanks for the suggestion. The real data has many thousands of lines like the two examples (the files they came from had about 5000 and 60000 lines respectively), so I thought splitting would be too slow, since it involves nested strsplit() calls. In fact it's not so bad, so I might go with it. Here's a stab at it:

# 'lines' holds the lines to be split, e.g. the lines starting with "f" in
# http://sci.esa.int/science-e/www/object/doc.cfm?fobjectid=54726

# Split each line into space-separated words.
l2 <- strsplit(lines, " ")
# For each word, keep the third "/"-separated field, or "" if there is none.
l3 <- lapply(l2, function(x) {
        y <- strsplit(x, "/")
        sapply(y, function(z) if (length(z) == 3) z[3] else "")
      })
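
To get from the per-word results back to one string per input line, the pieces can be pasted together with spaces. The following is a minimal sketch of that step, assuming the same nested strsplit() approach; the helper name extract_third() is made up for illustration, and the two sample lines are the examples from the original question.

```r
# Hypothetical helper: keep the third "/"-separated field of each word,
# or "" when a word has fewer than three fields, then rejoin with spaces.
extract_third <- function(lines) {
  words <- strsplit(lines, " ", fixed = TRUE)
  vapply(words, function(x) {
    parts <- strsplit(x, "/", fixed = TRUE)
    third <- vapply(parts, function(z) if (length(z) == 3) z[3] else "",
                    character(1))
    paste(third, collapse = " ")
  }, character(1))
}

sample_lines <- c("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587",
                  "f 1067 28680 24462")
result <- extract_third(sample_lines)
```

Using vapply() instead of sapply() is a design choice here, not part of the original code: it guarantees a character vector even when a line contains no words.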

Duncan


Best,
Ulrik

On Mon, 9 Oct 2017 at 17:03 Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

    I have a file containing "words" like


    a

    a/b

    a/b/c

    where there may be multiple words on a line (separated by spaces).  The
    a, b, and c strings can contain non-space, non-slash characters. I'd
    like to use gsub() to extract the c strings (which should be empty if
    there are none).

    A real example is

    "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"

    which I'd like to transform to

    " 587 587 587 587"

    Another real example is

    "f 1067 28680 24462"

    which should transform to "   ".

    I've tried a few different regular expressions, but am unable to find a way to say
    "transform words by deleting everything up to and including the 2nd
    slash" when there might be zero, one or two slashes.  Any suggestions?

    Duncan Murdoch
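
One possible single-pass answer to the question above, offered as a sketch rather than as what the thread settled on: match each whole word, capture only its third field, and substitute the capture for the word. It assumes every word has at most three slash-separated fields, and the name third_field_pattern is made up for illustration.

```r
# Match a whole word: a first field, an optional "/second", an optional "/",
# and whatever follows as the captured third field (possibly empty).
third_field_pattern <- "[^ /]+(?:/[^ /]*)?/?([^ /]*)"

x <- c("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587",
       "f 1067 28680 24462")
# perl = TRUE so that the non-capturing group (?: ... ) is handled by PCRE.
out <- gsub(third_field_pattern, "\\1", x, perl = TRUE)
```

Because the whole word is consumed by each match and the capture group can match the empty string, words with zero or one slash are replaced by nothing, while the spaces between words are left untouched.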

    ______________________________________________
    R-help@r-project.org <mailto:R-help@r-project.org> mailing list --
    To UNSUBSCRIBE and more, see
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide
    http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.

