[jira] [Commented] (SPARK-23427) spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver

Pratik Dhumal (JIRA) Wed, 21 Feb 2018 12:59:14 -0800

    [ 
https://issues.apache.org/jira/browse/SPARK-23427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372000#comment-16372000
 ]


Pratik Dhumal commented on SPARK-23427:
---------------------------------------

Sorry for very late reply.

I am facing that issue when Autobroadcast value is not -1.

Somehow, I couldn't reproduce same for Autobroadcast = -1, one thing I have 
noticed is, for me it goes through more than double the iteration when 
autobroadcast is -1. But, At certain point it does not iterate, and get stuck 
(with no errors and info message as *ContextCleaner: Cleaned accumulator* 
<numbers>)

Also,

This is the *stack trace* I'm getting. 
{code:java}
// code placeholder
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at 
java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at java.lang.StringBuilder.append(StringBuilder.java:131)
at scala.StringContext.standardInterpolator(StringContext.scala:125)
at scala.StringContext.s(StringContext.scala:95)
at 
org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:220)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:54)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2546)
at 
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2192)
at 
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2199)
at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2227)
at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2226)
at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2559)
at org.apache.spark.sql.Dataset.count(Dataset.scala:2226).............
{code}
 

Hope this helps.

Thank you.

 

> spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver 
> -------------------------------------------------------------------------
>
>                 Key: SPARK-23427
>                 URL: https://issues.apache.org/jira/browse/SPARK-23427
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: SPARK 2.0 version
>            Reporter: Dhiraj
>            Priority: Critical
>
> We are facing issue around value of spark.sql.autoBroadcastJoinThreshold.
> With spark.sql.autoBroadcastJoinThreshold -1 ( disable) we seeing driver 
> memory used flat.
> With any other values 10MB, 5MB, 2 MB, 1MB, 10K, 1K we see driver memory used 
> goes up with rate depending upon the size of the autoBroadcastThreshold and 
> getting OOM exception. The problem is memory used by autoBroadcast is not 
> being free up in the driver.
> Application imports oracle tables as master dataframes which are persisted. 
> Each job applies filter to these tables and then registered them as 
> tempViewTable . Then sql query are using to process data further. At the end 
> all the intermediate dataFrame are unpersisted.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-23427) spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver

Reply via email to