I'm using the function htmlParse() in the XML package, and I need a
little help with error handling while parsing an HTML page. So far I
can use either the default approach:

# error = xmlErrorCumulator(), by default
library(XML)
doc = htmlParse("http://www.public.iastate.edu/~pdixon/stat500/")
# the error message is:
# htmlParseStartTag: invalid element name

or the tryCatch() approach:

# error = NULL, errors to be caught by tryCatch()
tryCatch({
    doc = htmlParse("http://www.public.iastate.edu/~pdixon/stat500/",
        error = NULL)
}, XMLError = function(e) {
    cat("There was an error in the XML at line", e$line, "column",
        e$col, "\n", e$message, "\n")
})
# the verbose error message is:
# There was an error in the XML at line 90 column 2
# htmlParseStartTag: invalid element name

I would like to get the verbose error messages without actually stopping
the parsing process. The first approach does not return detailed error
messages (no line or column), while the second one stops the program.
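One direction that might give both is a rough sketch like the following: pass a custom function as the error argument, so that each message is recorded instead of raised. It assumes the handler is called with the same fields that xmlStructuredStop() receives (msg, code, domain, line, col, level, filename). The names collect_error, errors and bad_html are made up for illustration, and the malformed HTML string stands in for the real page so the example runs without a network connection.

```r
library(XML)

# Hypothetical collector: stores every parser message instead of stopping.
errors <- list()
collect_error <- function(msg, code, domain, line, col, level, filename, ...) {
    # Assumption: the parser may make a final call with a zero-length msg
    # when no document is produced; ignore that call here.
    if (length(msg) == 0L) return(invisible(NULL))
    errors[[length(errors) + 1L]] <<- list(
        line = if (missing(line)) NA else line,
        col = if (missing(col)) NA else col,
        message = sub("\\s+$", "", msg))
    invisible(NULL)
}

# Deliberately malformed HTML, parsed from a string (no network needed).
bad_html <- "<html><body><p>ok</p><<oops></body></html>"
doc <- htmlParse(bad_html, asText = TRUE, error = collect_error)

# Report whatever was collected, after parsing has finished.
for (e in errors) {
    cat("line", e$line, "column", e$col, ":", e$message, "\n")
}
```

Because the handler never calls stop(), the parser should carry on and doc should still be returned. Whether every message reaches the handler with line and column filled in is something I have not confirmed.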

Thanks!

Regards,
Yihui
--
Yihui Xie <xieyi...@gmail.com>
Phone: 515-294-6609 Web: http://yihui.name
Department of Statistics, Iowa State University
3211 Snedecor Hall, Ames, IA