An update: this ONLY happens on Spark 1.5. I ran the same program under Spark 1.4 and 1.4.1 and there is no issue (the program was originally developed under Spark 1.4, and I just re-tested it; it works). So the logic and the data are proven fine; the problem is caused by the new version of Spark.
So I want to know: is there any new setting I should configure in Spark 1.5 to make it work?

Regards,

Shuai

From: Shuai Zheng [mailto:szheng.c...@gmail.com]
Sent: Wednesday, November 04, 2015 3:22 PM
To: user@spark.apache.org
Subject: [Spark 1.5]: Exception in thread "broadcast-hash-join-2" java.lang.OutOfMemoryError: Java heap space

Hi All,

I have a program which runs a fairly complex business job (a join) in Spark, and I get the exception below. I am running on Spark 1.5, submitted with the parameters:

spark-submit --deploy-mode client --executor-cores=24 --driver-memory=2G --executor-memory=45G --class .

Some other setup:

sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .set("spark.kryoserializer.buffer.max", "2047m");
sparkConf.set("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
         .set("spark.sql.autoBroadcastJoinThreshold", "104857600");

This is running on an AWS c3.8xlarge instance. I am not sure what parameters I should set given the OutOfMemoryError below.

#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 10181"...
Exception in thread "broadcast-hash-join-2" java.lang.OutOfMemoryError: Java heap space
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
	at org.apache.spark.sql.execution.joins.UnsafeHashedRelation$.apply(HashedRelation.scala:380)
	at org.apache.spark.sql.execution.joins.HashedRelation$.apply(HashedRelation.scala:123)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:95)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:85)
	at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:100)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:85)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:85)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Any hint will be very helpful.

Regards,

Shuai
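For context on the stack trace above: the OOM happens inside BroadcastHashOuterJoin while Spark materializes the hash table for the broadcast side of the join, which must fit entirely in heap. A commonly tried mitigation, offered here only as a sketch and not confirmed as the fix for this particular job, is to lower the broadcast threshold or set it to -1 to disable automatic broadcast joins entirely, so the planner falls back to a shuffle-based join:

// Sketch of a possible workaround (assumption: the broadcast side of the
// join is too large for the heap). Setting the threshold to -1 disables
// automatic broadcast hash joins, forcing a shuffle-based join instead.
sparkConf.set("spark.sql.autoBroadcastJoinThreshold", "-1");

The same setting can also be passed on the command line via spark-submit --conf spark.sql.autoBroadcastJoinThreshold=-1, which avoids a code change while testing whether the broadcast join is the culprit.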