Re: [R] Remove superscripts from HTML objects
Sorry if I was not clear. I wanted to remove the superscripts using XPath queries if possible. For example, this will get the p nodes that contain superscripts, but how do I remove the superscripts if there are many matching nodes and different superscripts?

    xpathSApply(doc, "//p[sup]", xmlValue)
    [1] "Cata"

Chris

--
View this message in context: http://r.789695.n4.nabble.com/Remove-superscripts-from-HTML-objects-tp4550738p4555370.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
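One possible way to do this with XPath alone is to delete the sup nodes from the parsed tree and then read the text of the p nodes. The following is only a sketch: it assumes the input was two p elements with a single sup inside the first one (inferred from the output shown in the thread), and it uses removeNodes() from the XML package, which modifies doc in place.

```r
library(XML)

# Assumed input, reconstructed from the "Cata" "Dog" output in the thread
h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
doc <- htmlParse(h, asText = TRUE)

# Text before any change: the superscript "a" is glued onto "Cat"
before <- xpathSApply(doc, "//p", xmlValue)

# Select every <sup> that is a child of a <p> and remove it from the tree.
# This changes doc itself, so re-parse if the original tree is still needed.
invisible(xpathApply(doc, "//p/sup", removeNodes))

# Text after removal: no superscripts, however many nodes matched
cleaned <- xpathSApply(doc, "//p", xmlValue)
print(cleaned)
```

Because the XPath expression selects all matching sup nodes at once, the same two lines should cover many p nodes and different superscript contents.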
Re: [R] Remove superscripts from HTML objects
> h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
> sub("<sup>.*</sup>","",h)

Probably safer to use gsub("<sup>.*?</sup>","",h), so that one greedy match does not swallow everything between multiple superscripts. For example:

    h2 <- "<p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>a</sup></p><p>Raccoon</p>"
    sub("<sup>.*</sup>","",h2)    # drops everything between the first <sup> and the last </sup>
    gsub("<sup>.*?</sup>","",h2)  # drops each <sup>xxx</sup>
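To make the difference between the two calls concrete, here is a small base R sketch. The input string and both patterns are assumptions (the tags in the archived message did not survive), and perl = TRUE is added only to make the non-greedy quantifier explicit.

```r
# Assumed input: four paragraphs, two of which carry a superscript
h2 <- "<p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>a</sup></p><p>Raccoon</p>"

# Greedy: ".*" runs from the first <sup> to the LAST </sup>,
# so the Dog and Mouse paragraphs are lost as well
greedy <- sub("<sup>.*</sup>", "", h2, perl = TRUE)

# Non-greedy: ".*?" stops at the nearest </sup>, and gsub repeats
# the replacement for every superscript in the string
lazy <- gsub("<sup>.*?</sup>", "", h2, perl = TRUE)

print(greedy)  # "<p>Cat</p><p>Raccoon</p>"
print(lazy)    # "<p>Cat</p><p>Dog</p><p>Mouse</p><p>Raccoon</p>"
```

This works on the raw HTML string only, so it does not help once the document has already been parsed with htmlParse.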
Re: [R] Remove superscripts from HTML objects
Hi,

    h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
    sub("<sup>.*</sup>","",h)

See http://en.wikibooks.org/wiki/R_Programming/Text_Processing for more information.

Regards!
[R] Remove superscripts from HTML objects
Is there some way to remove superscripts from objects returned by htmlParse/xmlParse (XML package)?

    h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
    doc <- htmlParse(h)
    xpathSApply(doc, "//p", xmlValue)
    [1] "Cata" "Dog"

I could probably remove the tags from the "h" object above, but I'd rather just work with the results from htmlParse if possible (and not use readLines to load raw HTML first).

Thanks,
Chris Stubben