Hey guys,

I am running Hive and trying to join two tables (2.2GB and 136MB) on a
cluster of 9 nodes (replication = 3).

Hadoop version - 0.20.2
Each data node memory - 2GB
HADOOP_HEAPSIZE - 1000MB

Other heap settings are at their defaults. Hive launches 40 map tasks and
every task fails with the same error:

2011-09-19 18:37:17,110 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 300
2011-09-19 18:37:17,223 FATAL org.apache.hadoop.mapred.TaskTracker:
Error running child : java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:781)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)


It looks like I need to tune some of the heap settings so that the
TaskTrackers (TTs) handle memory properly, but I cannot work out which
variables to modify (there are too many related to heap sizes).
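
The one number that stands out to me is io.sort.mb = 300 in the log, since
the error is thrown while MapOutputBuffer is being constructed. Below is a
rough sanity check of that number, not a diagnosis. It assumes the child JVM
heap is still at the Hadoop 0.20 default of -Xmx200m (mapred.child.java.opts),
which I have not verified on this cluster; the helper name sort_buffer_fits
is just for illustration.

```python
# Rough sanity check: can the map-side sort buffer fit in the child JVM heap?
# Assumption: mapred.child.java.opts is at its Hadoop 0.20 default (-Xmx200m),
# because only HADOOP_HEAPSIZE was changed on this cluster.

def sort_buffer_fits(io_sort_mb, child_heap_mb):
    """Return True if an io.sort.mb-sized buffer is smaller than the child heap."""
    return io_sort_mb < child_heap_mb

IO_SORT_MB = 300      # taken from the MapTask log line above
CHILD_HEAP_MB = 200   # assumed default, not read from the cluster config

fits = sort_buffer_fits(IO_SORT_MB, CHILD_HEAP_MB)
print("io.sort.mb=%d child_heap=%dMB fits=%s" % (IO_SORT_MB, CHILD_HEAP_MB, fits))
```

If that assumption holds, the sort buffer alone is larger than the task heap,
so either io.sort.mb or the child heap would have to change. As far as I can
tell, HADOOP_HEAPSIZE applies to the daemons and not to the task JVMs.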

Are there specific settings I should look at?

Thanks,

jS
