On 16 December 2015 at 06:19, Deenar Toraskar <deenar.toras...@thinkreactive.co.uk> wrote:
> Hi
>
> I had the same problem. There is a query with a lot of small tables (5x),
> all below the broadcast threshold, and Spark is broadcasting all of these
> tables together without checking whether sufficient memory is available.
>
> I got around this issue by reducing
> *spark.sql.autoBroadcastJoinThreshold* to stop broadcasting the bigger
> tables in the query.
>
> This looks like an issue to me. A fix would be to:
> a) ensure that, in addition to the per-table threshold, there is a total
> broadcast size limit per query, so that only data up to that limit is
> broadcast, preventing executors from running out of memory.
>
> Shall I raise a JIRA for this?
>
> Regards
> Deenar
>
>
> On 4 November 2015 at 22:55, Shuai Zheng <szheng.c...@gmail.com> wrote:
>
>> An update: this ONLY happens in Spark 1.5. I tried running it under
>> Spark 1.4 and 1.4.1, and there is no issue (the program was developed
>> under Spark 1.4 originally, and I just re-tested it; it works). So this
>> proves there is no issue with the logic or the data; it is caused by the
>> new version of Spark.
>>
>> So I want to know: is there any new setting I should use in Spark 1.5
>> to make it work?
>>
>> Regards,
>>
>> Shuai
>>
>> *From:* Shuai Zheng [mailto:szheng.c...@gmail.com]
>> *Sent:* Wednesday, November 04, 2015 3:22 PM
>> *To:* user@spark.apache.org
>> *Subject:* [Spark 1.5]: Exception in thread "broadcast-hash-join-2"
>> java.lang.OutOfMemoryError: Java heap space
>>
>> Hi All,
>>
>> I have a program which runs a somewhat complex business workload
>> (joins) in Spark.
>> And I get the exception below.
>>
>> I am running on Spark 1.5, with these parameters:
>>
>> spark-submit --deploy-mode client --executor-cores=24 --driver-memory=2G
>> --executor-memory=45G --class ...
>>
>> Some other setup:
>>
>> sparkConf.set("spark.serializer",
>> "org.apache.spark.serializer.KryoSerializer").set("spark.kryoserializer.buffer.max",
>> "2047m");
>> sparkConf.set("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails
>> -XX:+PrintGCTimeStamps").set("spark.sql.autoBroadcastJoinThreshold",
>> "104857600");
>>
>> This is running on an AWS c3.8xlarge instance. I am not sure what
>> parameters I should set, given the OutOfMemoryError exception below.
>>
>> #
>> # java.lang.OutOfMemoryError: Java heap space
>> # -XX:OnOutOfMemoryError="kill -9 %p"
>> #   Executing /bin/sh -c "kill -9 10181"...
>> Exception in thread "broadcast-hash-join-2" java.lang.OutOfMemoryError:
>> Java heap space
>>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>>     at org.apache.spark.sql.execution.joins.UnsafeHashedRelation$.apply(HashedRelation.scala:380)
>>     at org.apache.spark.sql.execution.joins.HashedRelation$.apply(HashedRelation.scala:123)
>>     at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:95)
>>     at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:85)
>>     at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:100)
>>     at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:85)
>>     at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:85)
>>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Any hint would be very helpful.
>>
>> Regards,
>>
>> Shuai
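Deenar's observation above — each table individually clears the per-table threshold, but nothing caps the total — can be sketched with a back-of-envelope calculation. The 3x in-memory expansion factor below is an assumption for illustration only (deserialized hash relations carry per-row and hash-table overhead that varies with the schema); the 100 MiB threshold and 2G driver heap are taken from the spark-submit line and sparkConf settings in the thread.

```python
# Back-of-envelope check (not Spark code): why a per-table broadcast
# threshold does not bound the total broadcast footprint of a query
# that joins several "small" tables.

def worst_case_broadcast_bytes(num_broadcast_tables, threshold_bytes, expansion_factor=3):
    """Each table can sit just under the threshold, so the worst-case
    total scales linearly with the number of broadcast joins."""
    return num_broadcast_tables * threshold_bytes * expansion_factor

threshold = 104857600          # spark.sql.autoBroadcastJoinThreshold from the post (100 MiB)
driver_heap = 2 * 1024**3      # --driver-memory=2G from the spark-submit line

total = worst_case_broadcast_bytes(5, threshold)   # 5 small tables, as in Deenar's query
print(total / 1024**2, "MiB needed vs", driver_heap / 1024**2, "MiB driver heap")
# → 1500.0 MiB needed vs 2048.0 MiB driver heap
```

Even under this modest assumption, building the hashed relations consumes most of the 2 GiB driver heap before the job's own allocations, which is consistent with the OOM being thrown from the `broadcast-hash-join` thread.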
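Deenar's workaround can also be applied without code changes, as a spark-submit flag. A minimal sketch, reusing the resource flags from the thread; the 10 MiB value, class name, and jar name are illustrative placeholders, and `-1` is the documented value for disabling auto-broadcast joins entirely:

```shell
# Lower the per-table broadcast threshold so the larger "small" tables
# fall back to a shuffle join instead of being broadcast; use
# spark.sql.autoBroadcastJoinThreshold=-1 to disable auto-broadcast.
spark-submit --deploy-mode client \
  --executor-cores 24 \
  --driver-memory 2G \
  --executor-memory 45G \
  --conf spark.sql.autoBroadcastJoinThreshold=10485760 \
  --class com.example.YourJob \
  your-job.jar
```

The trade-off is that joins which would have been broadcast now shuffle both sides, which is slower but bounded in memory.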