Re: [R] Remove superscripts from HTML objects

2012-04-13 Thread Chris Stubben
Sorry if I was not clear.  I wanted to remove the superscripts using xpath
queries if possible.  For example this will get p nodes with superscripts,
but how do I remove the superscripts if there are many matching nodes and
different superscripts?

xpathSApply(doc, "//p[sup]", xmlValue) 
[1] "Cata"


Chris
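
[Editorial sketch, not part of the original post.] One way to approach the question above is to handle the superscripts inside the parsed tree rather than in the raw string. The example assumes the input was two <p> elements with a <sup> inside the first (the tags were lost from the archived message, so the string below is an assumption), and uses getNodeSet() and removeNodes() from the XML package. It is a minimal sketch, not necessarily what the list eventually recommended.

```r
library(XML)

# Assumed input: the markup below is a guess at the original string
h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
doc <- htmlParse(h, asText = TRUE)

# Option 1: take only the text nodes that are direct children of <p>,
# which skips anything nested inside <sup>
direct_text <- xpathSApply(doc, "//p/text()", xmlValue)

# Option 2: remove every <sup> node from the tree, whatever it contains,
# then read the paragraph values as usual
invisible(removeNodes(getNodeSet(doc, "//sup")))
cleaned <- xpathSApply(doc, "//p", xmlValue)

print(direct_text)
print(cleaned)
```

Option 1 returns one element per text node, so a paragraph with text on both sides of a superscript comes back in several pieces. Option 2 modifies doc in place, which may or may not be acceptable depending on what else the document is used for.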

--
View this message in context: 
http://r.789695.n4.nabble.com/Remove-superscripts-from-HTML-objects-tp4550738p4555370.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove superscripts from HTML objects

2012-04-13 Thread S Ellison
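> h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
> sub("<sup>.*</sup>","",h)

Probably safer to do

gsub("<sup>.*?</sup>","",h)

to avoid one match swallowing multiple superscripts.

e.g.
h2 <- "<p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>a</sup></p><p>Raccoon</p>"
sub("<sup>.*</sup>","",h2)   # drops everything between the first <sup> and the last </sup>
gsub("<sup>.*?</sup>","",h2) # drops each <sup>xxx</sup>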
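
[Editorial sketch, not part of the original post.] The difference between the greedy and non-greedy patterns can be checked with base R alone. The string h2 below is a hypothetical input with two superscripts, and the patterns are assumed forms of the ones discussed above, since the archived message lost its markup. perl = TRUE is passed only to make the non-greedy quantifier explicit.

```r
# Hypothetical input with two superscripts (not the original string)
h2 <- "<p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>b</sup></p>"

# Greedy: .* runs from the first <sup> to the last </sup>,
# so everything in between is removed as well
greedy <- sub("<sup>.*</sup>", "", h2)

# Non-greedy: .*? stops at the nearest </sup>,
# and gsub repeats the replacement for every match
lazy <- gsub("<sup>.*?</sup>", "", h2, perl = TRUE)

print(greedy)  # "<p>Cat</p>"
print(lazy)    # "<p>Cat</p><p>Dog</p><p>Mouse</p>"
```

Regular expressions of this kind only work while the <sup> elements are simple and not nested; for anything more complicated, working on the parsed document is likely to be more robust.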




Re: [R] Remove superscripts from HTML objects

2012-04-12 Thread mlell08
Hi,

h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
sub("<sup>.*</sup>","",h)

see http://en.wikibooks.org/wiki/R_Programming/Text_Processing for more
information.

Regards!



[R] Remove superscripts from HTML objects

2012-04-11 Thread Chris Stubben
Is there some way to remove superscripts from objects returned by
html/xmlParse (XML package)?

h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
doc <- htmlParse(h)
xpathSApply(doc, "//p", xmlValue)
[1] "Cata" "Dog"

I could probably remove the <sup> tags from the "h" object above, but I'd
rather just work with the results from htmlParse if possible (and not use
readLines to load raw HTML first).

Thanks,
Chris Stubben
 


