Hi all,
I am working on HTML parsing via the generic AutoDetectParser class.
I need to keep some specific attributes (id and class) on <table>
elements in order to detect which tables are meaningful for my app.
So, as far as I understand for now, I have to:
- extend HtmlHandler with MyHtmlHandler
- in MyHtmlHandler, override public void startElement(...) with something
like this:
    if (bodyLevel == 0 && discardLevel == 0) {
        if ("TABLE".equals(name)) {
            // keep only the id and class attributes of the table
            AttributesImpl attributes = new AttributesImpl();
            String id = atts.getValue("id");
            // "class" is a reserved word in Java, so use another name
            String cssClass = atts.getValue("class");
            if (id != null) {
                attributes.addAttribute("", "id", "id", "CDATA", id);
            }
            if (cssClass != null) {
                attributes.addAttribute("", "class", "class", "CDATA", cssClass);
            }
            xhtml.startElement("http://www.w3.org/1999/xhtml", "table", "table",
                    attributes);
        } else {
            // any element other than a table
            super.startElement(...)
        }
    } else {
        // any other bodyLevel and discardLevel
        super.startElement(...)
    }
- and finally, pass MyHtmlHandler to the parse() method via the ParseContext.
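
To make the attribute-filtering step concrete, here is a minimal sketch of it in isolation, using only the SAX classes shipped with the JDK and nothing from Tika. The names TableAttributeFilter and keepIdAndClass are hypothetical and only for illustration; how this would plug into the handler is exactly what I am unsure about.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper, not part of Tika: isolates the "keep only id and class" step.
final class TableAttributeFilter {

    /** Returns a new attribute list holding only "id" and "class", when present. */
    static AttributesImpl keepIdAndClass(Attributes atts) {
        AttributesImpl kept = new AttributesImpl();
        for (String attrName : new String[] {"id", "class"}) {
            String value = atts.getValue(attrName); // lookup by qualified name
            if (value != null) {
                kept.addAttribute("", attrName, attrName, "CDATA", value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Simulate the attributes of <table id="prices" class="data" border="1">
        AttributesImpl in = new AttributesImpl();
        in.addAttribute("", "id", "id", "CDATA", "prices");
        in.addAttribute("", "class", "class", "CDATA", "data");
        in.addAttribute("", "border", "border", "CDATA", "1");

        AttributesImpl out = keepIdAndClass(in);
        System.out.println(out.getLength());        // prints 2
        System.out.println(out.getValue("id"));     // prints prices
        System.out.println(out.getValue("border")); // prints null
    }
}
```

Inside startElement, the result of keepIdAndClass(atts) would take the place of the hand-built attributes object.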
*****
* Is this the right way to do such a thing?
* How can I use the ParseContext to pass MyHtmlHandler? I can't find any
example of it...
Any comments would be much appreciated.
Have a good day