Re: [R] gsub: replacing a.*a if no occurence of b in .*

Charilaos Skiadas Sat, 24 Feb 2007 08:28:05 -0800

All these methods do assume that you don't have nested <tag>'s, like so:

<tag><tag>foo</tag>useful stuff</tag>some garbage</tag>

For that you would really need a true parser. So I would double-check  
to make sure this doesn't happen.
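
One quick way to check, as a rough sketch only: walk through the opening and closing tags in order and see whether the nesting depth ever goes above 1. The helper name has_nested_tag is made up for illustration, and it assumes the tag is literally <tag> with no attributes.

```r
# Hypothetical helper (not from any package): TRUE if a <tag> is opened
# while another <tag> is still open.
has_nested_tag <- function(x) {
  # Pull out every <tag> and </tag> in order of appearance.
  tokens <- regmatches(x, gregexpr("</?tag>", x))[[1]]
  # +1 for each opening tag, -1 for each closing tag.
  depth <- cumsum(ifelse(tokens == "<tag>", 1, -1))
  any(depth > 1)
}

nested <- "<tag><tag>foo</tag>useful stuff</tag>some garbage</tag>"
flat   <- "<tag>useful stuff</tag>some garbage</tag>"

print(has_nested_tag(nested))
print(has_nested_tag(flat))
```

A stray extra </tag> only pushes the depth below zero, so it is not flagged; only a genuinely nested opening tag is.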

Do you have any control over where those XML files are generated
though? It sounds to me like it might be easier to fix the utility
generating those XML files, since it clearly is doing something wrong.

On Feb 24, 2007, at 11:07 AM, Gabor Grothendieck wrote:

> I assume <tag> is known.
>
> This removes any occurrence of </tag>.*</tag> where .* does not
> contain <tag> or </tag>.
>
> The regular expression, re, matches </tag>, then does a non-greedy
> match (?U) for anything followed by </tag> but uses a zero
> width lookahead subexpression (?=...) for the second </tag>
> so that it can be matched again.  gsubfn in package
> gsubfn is like the usual gsub except that instead of
> replacing the match with a string it passes the match
> to function f and then replaces the match with the output
> of f.  See the gsubfn home page:
>   http://code.google.com/p/gsubfn/
> and vignette.
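
For what it's worth, the same idea can also be sketched in base R with gsub and perl = TRUE. This is not the gsubfn code from the quoted message, just a variant under the same assumptions: the tag is literally <tag>, and the "does not contain <tag> or </tag>" condition is moved into the pattern itself with a negative lookahead.

```r
# Sketch only: a base-R variant of the idea described above.
x <- "<tag>useful stuff</tag>some garbage</tag><tag>more</tag>"

# Match </tag>, then as few characters as possible, none of which may
# begin a <tag> or </tag>, stopping just before the next </tag>.
# The (?=...) lookahead leaves that second </tag> in place.
re <- "</tag>(?:(?!</?tag>).)*?(?=</tag>)"

cleaned <- gsub(re, "", x, perl = TRUE)
print(cleaned)
```

Note that in PCRE the dot does not match a newline by default, so if the garbage can span several lines the pattern would probably need the (?s) flag in front.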

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
