To add a little more detail:
The Hive version is 1.2.1
The Spark version is 1.4.1
The Hadoop version is 2.5.1
2015-11-26 20:36 GMT+08:00 Jone Zhang <joyoungzh...@gmail.com>:

> Here is the error message:
>
> java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2245)
>         at java.util.Arrays.copyOf(Arrays.java:2219)
>         at java.util.ArrayList.grow(ArrayList.java:242)
>         at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:216)
>         at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:208)
>         at java.util.ArrayList.add(ArrayList.java:440)
>         at org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:95)
>         at org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:70)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>         at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:216)
>         at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:70)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> And this is the note from SortByShuffler.java:
>
>         // TODO: implement this by accumulating rows with the same key into a list.
>         // Note that this list needs to improved to prevent excessive memory usage, but this
>         // can be done in later phase.
>
> The join SQL runs successfully when I use Hive on MapReduce.
> So how does MapReduce deal with it?
> And is there a plan to improve this to prevent excessive memory usage?
>
> Best wishes!
> Thanks!
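
To make the difference concrete, here is a minimal Java sketch of the two grouping patterns. The class and method names are made up for illustration; this is not the actual Hive or Hadoop source, only an assumption about the pattern the stack trace and the SortByShuffler.java note above describe: buffering all rows for one key in an in-memory list, versus streaming the grouped values one at a time the way a MapReduce reducer's value Iterable does.

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Hypothetical illustration only; names are invented, not the real Hive code.
    public class GroupingSketch {

        // Pattern the stack trace points at: every row sharing the current key is
        // buffered in an in-memory ArrayList before being handed downstream.
        // With a skewed join key (millions of rows for one key) the list keeps
        // growing until the executor throws OutOfMemoryError.
        static List<String> bufferValuesForKey(Iterator<String> rowsForOneKey) {
            List<String> buffered = new ArrayList<>();
            while (rowsForOneKey.hasNext()) {
                buffered.add(rowsForOneKey.next()); // heap grows with the key's row count
            }
            return buffered;
        }

        // What the MapReduce reduce side does conceptually instead: the values for a
        // key arrive as a lazy Iterable backed by the sorted, merged (and spilled to
        // disk) shuffle output, so the consumer sees one record at a time and memory
        // stays bounded regardless of how many rows share the key.
        static long streamValuesForKey(Iterable<String> rowsForOneKey) {
            long count = 0;
            for (String row : rowsForOneKey) { // nothing is retained after each row is used
                count++;
            }
            return count;
        }
    }

The sketch is only meant to show why the buffering approach is sensitive to key skew while the streaming approach is not; it does not claim to describe how either engine is actually implemented.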