Just tried indexing a smaller segment of 300k URLs... and the memory just keeps going up and up, but it does NOT hit the physical memory limit. Sounds like a "memory leak"? I thought Java did the garbage collection automatically?
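Answering my own question: the collector only frees unreachable objects, and the JVM never grows past its -Xmx cap anyway, so the process plateaus well below physical RAM and then throws OutOfMemoryError once live data fills the configured heap. What looks like a leak is usually just the heap filling up to its ceiling. A minimal sketch to print what the JVM is actually allowed to use (plain Java, nothing Nutch-specific assumed):

    // HeapCheck.java -- prints the JVM heap limits; run with e.g.
    //   java -Xmx1000m HeapCheck
    public class HeapCheck {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            long mb = 1024 * 1024;
            // maxMemory() is the -Xmx ceiling the GC can never push past
            System.out.println("max heap (-Xmx): " + rt.maxMemory() / mb + " MB");
            System.out.println("committed heap : " + rt.totalMemory() / mb + " MB");
            System.out.println("used heap      : "
                    + (rt.totalMemory() - rt.freeMemory()) / mb + " MB");
        }
    }

If memory stops climbing right around that "max heap" number, you are hitting the cap, not a leak.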
2009/7/16 MilleBii <[email protected]>

> I have more details on my error now. What can I do about it? I have 4 GB
> of memory, but it is not fully used (I think).
> I use cygwin/windows/local filesystem.
>
> java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:498)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>
> ---------- Forwarded message ----------
> From: MilleBii <[email protected]>
> Date: 2009/7/15
> Subject: Error when using language-identifier plugin?
> To: [email protected]
>
> I decided to add the language-identifier plugin, but I get the following
> error when I start indexing my crawldb. It is not very explicit. If I
> remove the plugin, indexing works just fine. I also tried on a smaller
> crawl database that I use for testing, and it works fine too.
> Any idea where to look?
>
> 2009-07-15 16:19:54,875 WARN  mapred.LocalJobRunner - job_local_0001
> 2009-07-15 16:19:54,891 FATAL indexer.Indexer - Indexer: java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>         at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
>         at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
>
> --
> -MilleBii-
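The allocation fails in MapTask$MapOutputBuffer.<init>, which grabs the whole map-side sort buffer (io.sort.mb, 100 MB by default) up front; an OOM right there means the heap given to the local job runner is smaller than that buffer plus overhead, not that the 4 GB of RAM is exhausted. Two things to try; both are sketches assuming a stock bin/nutch script and the classic Hadoop property names:

    # bin/nutch reads NUTCH_HEAPSIZE (in MB) to build the -Xmx flag for the
    # whole local job; the script's default is only 1000m. Re-run the index
    # command in the same shell afterwards.
    export NUTCH_HEAPSIZE=2000

or shrink the sort buffer in conf/hadoop-site.xml:

    <!-- map-side sort buffer, in MB; the Hadoop default is 100 -->
    <property>
      <name>io.sort.mb</name>
      <value>50</value>
    </property>

--
-MilleBii-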
