Thank you for the info.
The OOM exception in your previous email indicates that your system is
running out of Java heap memory.  Either the code instantiates too many
objects at once, or there is a memory leak in the source code.
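
As a quick check of how much heap the JVM actually has to work with, here
is a minimal sketch. The class name HeapReport and its method names are made
up for illustration and are not part of Nutch or Hadoop; it only uses the
standard java.lang.Runtime calls.

```java
// Hypothetical helper (not part of Nutch or Hadoop): reports JVM heap usage.
class HeapReport {
    /** Upper bound on the heap the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently allocated and not free, in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line summary in whole megabytes. */
    static String describe() {
        long mb = 1024L * 1024L;
        return String.format("heap used: %d MB of max %d MB",
                usedHeapBytes() / mb, maxHeapBytes() / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported maximum is small, starting the JVM with a larger -Xmx value
(for example, java -Xmx512m HeapReport) should raise it. If usage keeps
growing toward the maximum no matter how large it is, a leak is the more
likely cause.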

Hope this helps!
Cheers!

Adam Shuy, President
ePacific Web Design & Hosting
Professional Web/Software developer
TEL: 408-272-6946
www.epacificweb.com

-----Original Message-----
From: Kai_testing Middleton [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 16, 2007 8:43 AM
To: nutch-dev@lucene.apache.org
Subject: Re: OOM error during parsing with nekohtml

You could try looking at these two discussions:
http://www.mail-archive.com/nutch-dev@lucene.apache.org/msg06571.html
http://www.mail-archive.com/nutch-dev@lucene.apache.org/msg06571.html

--Kai

----- Original Message ----
From: Tsengtan A Shuy <[EMAIL PROTECTED]>
To: nutch-dev@lucene.apache.org; [EMAIL PROTECTED]
Sent: Monday, July 16, 2007 3:45:59 AM
Subject: RE: OOM error during parsing with nekohtml

I successfully ran the whole-web crawl on my new Ubuntu OS, and I am
ready to fix the bug.  I need someone to point me to the latest
source code and the bug assignment.

Thank you in advance!! 

Adam Shuy, President
ePacific Web Design & Hosting
Professional Web/Software developer
TEL: 408-272-6946
www.epacificweb.com
-----Original Message-----
From: Shailendra Mudgal [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 16, 2007 3:05 AM
To: [EMAIL PROTECTED]; nutch-dev@lucene.apache.org
Subject: OOM error during parsing with nekohtml

Hi All,

We are getting an OOM exception while processing
http://www.fotofinity.com/cgi-bin/homepages.cgi . We have already applied
the Nutch-497 patch to our source code, but the error actually occurs in
the parse method.
Does anybody have any idea what causes this?  Here is the complete stack trace:

java.lang.OutOfMemoryError: Java heap space
    at java.lang.String.toUpperCase(String.java:2637)
    at java.lang.String.toUpperCase(String.java:2660)
    at org.cyberneko.html.filters.NamespaceBinder.bindNamespaces(NamespaceBinder.java:443)
    at org.cyberneko.html.filters.NamespaceBinder.startElement(NamespaceBinder.java:252)
    at org.cyberneko.html.HTMLTagBalancer.callStartElement(HTMLTagBalancer.java:1009)
    at org.cyberneko.html.HTMLTagBalancer.startElement(HTMLTagBalancer.java:639)
    at org.cyberneko.html.HTMLTagBalancer.startElement(HTMLTagBalancer.java:646)
    at org.cyberneko.html.HTMLScanner$ContentScanner.scanStartElement(HTMLScanner.java:2343)
    at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:1820)
    at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:789)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:478)
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:431)
    at org.cyberneko.html.parsers.DOMFragmentParser.parse(DOMFragmentParser.java:164)
    at org.apache.nutch.parse.html.HtmlParser.parseNeko(HtmlParser.java:265)
    at org.apache.nutch.parse.html.HtmlParser.parse(HtmlParser.java:229)
    at org.apache.nutch.parse.html.HtmlParser.getParse(HtmlParser.java:168)
    at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:84)
    at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:75)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)


Regards,
Shailendra








       