Hi all,

I have a problem with the parser when I try to crawl 2000 sites with a depth
of 3.
I use Nutch version 0.81, and my setup worked well with other sites, but this
list gave me this error:

2007-06-06 13:49:27,997 WARN  mapred.LocalJobRunner - job_qsjobz
java.lang.StackOverflowError
        at org.apache.xerces.dom.ParentNode.getLength(Unknown Source)
        at org.apache.nutch.parse.html.DOMContentUtils.getOutlinks(DOMContentUtils.java:305)
        at org.apache.nutch.parse.html.DOMContentUtils.getOutlinks(DOMContentUtils.java:347)
        at org.apache.nutch.parse.html.DOMContentUtils.getOutlinks(DOMContentUtils.java:347)
        at org.apache.nutch.parse.html.DOMContentUtils.getOutlinks(DOMContentUtils.java:347)
        at org.apache.nutch.parse.html.DOMContentUtils.getOutlinks(DOMContentUtils.java:347)
        at org.apache.nutch.parse.html.DOMContentUtils.getOutlinks(DOMContentUtils.java:347)

I cut the trace short because it is very long.
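
The same getOutlinks frame repeating at line 347 suggests that the method
calls itself once per level of DOM nesting, so a very deeply nested page may
exhaust the thread stack. To illustrate the pattern outside Nutch, here is a
minimal sketch of a link walk that uses an explicit stack instead of
recursion. This is not Nutch's code: the class IterativeLinkWalk and the
methods collectHrefs and buildDeepDocument are hypothetical names made up for
this example, and it only assumes the standard org.w3c.dom API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical example class, not part of Nutch.
public class IterativeLinkWalk {

    // Collects the href values of <a> elements without recursion, so the
    // nesting depth costs heap memory rather than call-stack frames.
    static List<String> collectHrefs(Node root) {
        List<String> hrefs = new ArrayList<>();
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (node.getNodeType() == Node.ELEMENT_NODE
                    && "a".equalsIgnoreCase(node.getNodeName())) {
                String href = ((Element) node).getAttribute("href");
                if (!href.isEmpty()) {
                    hrefs.add(href);
                }
            }
            // Push children last-to-first so they are popped in document order.
            for (Node child = node.getLastChild(); child != null;
                    child = child.getPreviousSibling()) {
                pending.push(child);
            }
        }
        return hrefs;
    }

    // Builds a test document: `depth` nested <div> elements around one link.
    // The tree is assembled from the innermost node outwards.
    static Document buildDeepDocument(int depth) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element link = doc.createElement("a");
        link.setAttribute("href", "http://example.com/");
        Node inner = link;
        for (int i = 0; i < depth; i++) {
            Element outer = doc.createElement("div");
            outer.appendChild(inner);
            inner = outer;
        }
        doc.appendChild(inner);
        return doc;
    }

    public static void main(String[] args) {
        Document doc = buildDeepDocument(20000);
        System.out.println(collectHrefs(doc));
    }
}
```

Whether this kind of change is suitable for DOMContentUtils itself is
something I cannot tell from the trace alone; the sketch only shows that the
traversal does not need one stack frame per nesting level.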

Could someone help me, please? I don't think there is already an answer in
the forum or in JIRA.
Thank you very much for your help.
-- 
View this message in context: 
http://www.nabble.com/stackoverflow-error-tf3879034.html#a10992519
Sent from the Nutch - User mailing list archive at Nabble.com.


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
