Re: [R] Remove superscripts from HTML objects

2012-04-13 Thread Chris Stubben
Sorry if I was not clear. I wanted to remove the superscripts using XPath queries, if possible. For example, this will get p nodes with superscripts, but how do I remove the superscripts if there are many matching nodes and different superscripts?

xpathSApply(doc, "//p[sup]", xmlValue)
[1] "Cata"
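One way to do this inside the parsed document, offered as a sketch and not as an answer taken from the thread, is to select every sup node with XPath and detach it with removeNodes() from the XML package before extracting the text. The input string below is an assumed two-paragraph example, since the markup of the original "h" did not survive in the archive.

```r
library(XML)  # the package used in the thread

# Assumed example input: two paragraphs, the first with a superscript.
h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
doc <- htmlParse(h, asText = TRUE)

# Select every <sup> element in the tree and detach it from its parent.
# This modifies 'doc' in place.
removeNodes(getNodeSet(doc, "//sup"))

# xmlValue now only sees the text that remains in each <p>.
res <- xpathSApply(doc, "//p", xmlValue)
print(res)
```

Because the nodes are removed from the document itself, re-parse the source if the superscripts are needed again later. The same query handles any number of p nodes and any superscript text, since it matches on the tag, not on its content.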

Re: [R] Remove superscripts from HTML objects

2012-04-13 Thread S Ellison
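> h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
> sub("<sup>.*</sup>","",h)

It is probably safer to use gsub with a pattern that stops at the first closing tag, to avoid removing everything between multiple superscripts:

gsub("<sup>.*?</sup>","",h)

For example:

h2 <- "<p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>a</sup></p><p>Raccoon</p>"
sub("<sup>.*</sup>","",h2)    # drops everything between the first <sup> and the last </sup>
gsub("<sup>.*?</sup>","",h2)  # drops each <sup>xxx</sup>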
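The difference between the two calls can be checked with a small base-R sketch. The tagged string is an assumption (the archive stripped the original markup), and perl = TRUE is added here only to make the lazy quantifier explicit.

```r
# Assumed input: four paragraphs, two of them with a superscript.
h2 <- "<p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>a</sup></p><p>Raccoon</p>"

# Greedy: ".*" runs from the first <sup> to the LAST </sup>,
# so the Dog and Mouse paragraphs in between are lost as well.
greedy <- sub("<sup>.*</sup>", "", h2)

# Lazy: ".*?" stops at the nearest </sup>; gsub repeats this
# for every superscript in the string.
lazy <- gsub("<sup>.*?</sup>", "", h2, perl = TRUE)

print(greedy)
print(lazy)
```

A negated character class such as "<sup>[^<]*</sup>" would behave like the lazy version for superscripts that contain no nested tags.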

Re: [R] Remove superscripts from HTML objects

2012-04-12 Thread mlell08
Hi,

h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
sub("<sup>.*</sup>","",h)

See http://en.wikibooks.org/wiki/R_Programming/Text_Processing for more information.

Regards!

[R] Remove superscripts from HTML objects

2012-04-11 Thread Chris Stubben
Is there some way to remove superscripts from objects returned by html/xmlParse (XML package)?

h <- "<p>Cat<sup>a</sup></p><p>Dog</p>"
doc <- htmlParse(h)
xpathSApply(doc, "//p", xmlValue)
[1] "Cata" "Dog"

I could probably remove the tags from the "h" object above, but I'd rather just work with the results from htmlPa