Hi all,

I am getting the JVM error below during a recrawl, specifically during
execution of:

$NUTCH_HOME/bin/nutch mergesegs crawl/MERGEDsegments crawl/segments/*
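
Side note: if I'm reading the Nutch 1.0 SegmentMerger usage right, mergesegs also takes a -slice option that caps how many URLs go into each merged output segment. I haven't tried it yet, and the 50000 below is just a guess, but something like this might keep the merge's memory footprint down:

$NUTCH_HOME/bin/nutch mergesegs crawl/MERGEDsegments crawl/segments/* -slice 50000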

I am running on a single machine:
Linux 2.6.24-23-xen  x86_64
4G RAM
java-6-sun
nutch-1.0
JAVA_HEAP_MAX=-Xmx1000m 

I am about to raise my heap max to -Xmx2000m.
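
For reference, this is how I plan to bump it, assuming the stock bin/nutch script (which turns NUTCH_HEAPSIZE, in MB, into JAVA_HEAP_MAX):

# assumes the stock bin/nutch, where NUTCH_HEAPSIZE (MB) becomes -Xmx${NUTCH_HEAPSIZE}m
export NUTCH_HEAPSIZE=2000
$NUTCH_HOME/bin/nutch mergesegs crawl/MERGEDsegments crawl/segments/*

Since this job runs under the LocalJobRunner, my understanding is that everything executes inside that one client JVM, so this heap is the only one that matters here (mapred.child.java.opts shouldn't come into play the way it would on a real cluster).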

I haven't encountered this before when running with the above specs, so I am not
sure what could have changed.
Any suggestions would be greatly appreciated.

Thanks.


> 2009-10-11 14:29:56,752 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
> 2009-10-11 14:30:15,801 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
> 2009-10-11 14:31:19,197 INFO [org.apache.hadoop.mapred.TaskRunner] - Communication exception: java.lang.OutOfMemoryError: Java heap space
>       at java.util.ResourceBundle$Control.getCandidateLocales(ResourceBundle.java:2220)
>       at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1229)
>       at java.util.ResourceBundle.getBundle(ResourceBundle.java:715)
>       at org.apache.hadoop.mapred.Counters$Group.getResourceBundle(Counters.java:218)
>       at org.apache.hadoop.mapred.Counters$Group.<init>(Counters.java:202)
>       at org.apache.hadoop.mapred.Counters.getGroup(Counters.java:410)
>       at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:491)
>       at org.apache.hadoop.mapred.Counters.sum(Counters.java:506)
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:222)
>       at org.apache.hadoop.mapred.Task$1.run(Task.java:418)
>       at java.lang.Thread.run(Thread.java:619)
>
> 2009-10-11 14:31:22,197 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
> 2009-10-11 14:31:25,197 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
> 2009-10-11 14:31:40,002 WARN [org.apache.hadoop.mapred.LocalJobRunner] - job_local_0001
> java.lang.OutOfMemoryError: Java heap space
>       at java.util.concurrent.locks.ReentrantLock.<init>(ReentrantLock.java:234)
>       at java.util.concurrent.ConcurrentHashMap$Segment.<init>(ConcurrentHashMap.java:289)
>       at java.util.concurrent.ConcurrentHashMap.<init>(ConcurrentHashMap.java:613)
>       at java.util.concurrent.ConcurrentHashMap.<init>(ConcurrentHashMap.java:652)
>       at org.apache.hadoop.io.AbstractMapWritable.<init>(AbstractMapWritable.java:49)
>       at org.apache.hadoop.io.MapWritable.<init>(MapWritable.java:42)
>       at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:260)
>       at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:54)
>       at org.apache.nutch.metadata.MetaWrapper.readFields(MetaWrapper.java:101)
>       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>       at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:940)
>       at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:880)
>       at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:237)
>       at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:233)
>       at org.apache.nutch.segment.SegmentMerger.reduce(SegmentMerger.java:377)
>       at org.apache.nutch.segment.SegmentMerger.reduce(SegmentMerger.java:113)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
