When I set http.content.limit to "-1" so that no content is truncated,
the root node in the HTMLParser has zero children. In the following
code snippet, "int len" is 0 whenever http.content.limit=-1.

Am I missing something?


Code snippet from HTMLParser:
-----------------------------------------------

  processNode(Node node) {
      /* Recursively traverse the child nodes */
      NodeList children = node.getChildNodes();
      if (children != null) {
          /* len is 0 for the root node when http.content.limit=-1 */
          int len = children.getLength();
          for (int i = 0; i < len; i++) {
              processNode(children.item(i));
          }
      }

      /* Some other code */
  }
-------------------------------------------------------------------------
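
To check whether the traversal itself is at fault, the same recursion can be run against a plain DOM outside the crawler. The sketch below is not Nutch code: the class name DomWalkSketch and its helpers are hypothetical, and it uses the JDK's built-in XML parser rather than whatever HTML parser the plugin is configured with. It only shows that this loop reports zero when the root node was handed an empty tree, which would suggest the problem lies in what was fetched or parsed rather than in the loop.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-alone harness; not part of Nutch.
class DomWalkSketch {

    /** Counts all descendants of the node (the node itself is not counted). */
    static int countDescendants(Node node) {
        int count = 0;
        NodeList children = node.getChildNodes();
        if (children != null) {
            int len = children.getLength();
            for (int i = 0; i < len; i++) {
                count += 1 + countDescendants(children.item(i));
            }
        }
        return count;
    }

    /** Creates a JDK DOM builder, wrapping checked exceptions. */
    static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses well-formed XML text into a DOM document. */
    static Document parse(String xml) {
        try {
            return newBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A tree built from real content has children under the root.
        Node populated = parse("<html><body><p>hi</p></body></html>");
        // An empty fragment stands in for a root that received no content.
        Node empty = newBuilder().newDocument().createDocumentFragment();
        System.out.println("populated: " + countDescendants(populated));
        System.out.println("empty: " + countDescendants(empty));
    }
}
```

If the empty case here matches what the parser sees, the next thing to look at would be the length of the raw content passed to the parser when the limit is -1.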
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
