Hi all,

I am getting the JVM error below during a recrawl, specifically during the execution of:

    $NUTCH_HOME/bin/nutch mergesegs crawl/MERGEDsegments crawl/segments/*

I am running on a single machine:

    Linux 2.6.24-23-xen x86_64
    4 GB RAM
    java-6-sun
    nutch-1.0
    JAVA_HEAP_MAX=-Xmx1000m

I am about to raise the heap max to -Xmx2000m. I haven't encountered this before when running with the specs above, so I am not sure what could have changed. Any suggestions would be greatly appreciated. Thanks.

> 2009-10-11 14:29:56,752 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce
> reduce
> 2009-10-11 14:30:15,801 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce
> reduce
> 2009-10-11 14:31:19,197 INFO [org.apache.hadoop.mapred.TaskRunner] - Communication exception: java.lang.OutOfMemoryError: Java heap space
>         at java.util.ResourceBundle$Control.getCandidateLocales(ResourceBundle.java:2220)
>         at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1229)
>         at java.util.ResourceBundle.getBundle(ResourceBundle.java:715)
>         at org.apache.hadoop.mapred.Counters$Group.getResourceBundle(Counters.java:218)
>         at org.apache.hadoop.mapred.Counters$Group.<init>(Counters.java:202)
>         at org.apache.hadoop.mapred.Counters.getGroup(Counters.java:410)
>         at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:491)
>         at org.apache.hadoop.mapred.Counters.sum(Counters.java:506)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:222)
>         at org.apache.hadoop.mapred.Task$1.run(Task.java:418)
>         at java.lang.Thread.run(Thread.java:619)
>
> 2009-10-11 14:31:22,197 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce
> reduce
> 2009-10-11 14:31:25,197 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce
> reduce
> 2009-10-11 14:31:40,002 WARN [org.apache.hadoop.mapred.LocalJobRunner] - job_local_0001
> java.lang.OutOfMemoryError: Java heap space
>         at java.util.concurrent.locks.ReentrantLock.<init>(ReentrantLock.java:234)
>         at java.util.concurrent.ConcurrentHashMap$Segment.<init>(ConcurrentHashMap.java:289)
>         at java.util.concurrent.ConcurrentHashMap.<init>(ConcurrentHashMap.java:613)
>         at java.util.concurrent.ConcurrentHashMap.<init>(ConcurrentHashMap.java:652)
>         at org.apache.hadoop.io.AbstractMapWritable.<init>(AbstractMapWritable.java:49)
>         at org.apache.hadoop.io.MapWritable.<init>(MapWritable.java:42)
>         at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:260)
>         at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:54)
>         at org.apache.nutch.metadata.MetaWrapper.readFields(MetaWrapper.java:101)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>         at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:940)
>         at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:880)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:237)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:233)
>         at org.apache.nutch.segment.SegmentMerger.reduce(SegmentMerger.java:377)
>         at org.apache.nutch.segment.SegmentMerger.reduce(SegmentMerger.java:113)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
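For what it's worth, here is what I was planning to try: a sketch assuming the stock Nutch 1.0 bin/nutch launcher, which (as I read it) builds JAVA_HEAP_MAX from the optional NUTCH_HEAPSIZE environment variable (a value in megabytes), so the heap can be raised without editing the script:

```shell
# Sketch, assuming the stock Nutch 1.0 bin/nutch script, which turns the
# optional NUTCH_HEAPSIZE variable (in MB) into the -Xmx flag it passes to java.
export NUTCH_HEAPSIZE=2000          # double the default -Xmx1000m

# The expansion the launcher performs, shown inline for clarity:
JAVA_HEAP_MAX="-Xmx${NUTCH_HEAPSIZE}m"
echo "$JAVA_HEAP_MAX"               # prints -Xmx2000m

# Then re-run the merge (commented here; requires a Nutch install):
# $NUTCH_HOME/bin/nutch mergesegs crawl/MERGEDsegments crawl/segments/*
```

Since the job runs under LocalJobRunner (a single local JVM), the heap of the launching JVM is the one that matters here, which is why the variable above should affect the SegmentMerger reduce.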