[ https://issues.apache.org/jira/browse/SPARK-23427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372000#comment-16372000 ]
Pratik Dhumal commented on SPARK-23427:
---------------------------------------

Sorry for the very late reply.

I am facing this issue when the spark.sql.autoBroadcastJoinThreshold value is not -1. I couldn't reproduce it with the threshold set to -1. One thing I have noticed is that, with the threshold at -1, my job gets through more than double the number of iterations. But at a certain point it stops iterating and gets stuck (with no errors, only info messages such as *ContextCleaner: Cleaned accumulator* <numbers>).

Also, this is the *stack trace* I'm getting:
{code:java}
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:3332)
	at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
	at java.lang.StringBuilder.append(StringBuilder.java:136)
	at java.lang.StringBuilder.append(StringBuilder.java:131)
	at scala.StringContext.standardInterpolator(StringContext.scala:125)
	at scala.StringContext.s(StringContext.scala:95)
	at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:220)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:54)
	at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2546)
	at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2192)
	at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2199)
	at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2227)
	at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2226)
	at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2559)
	at org.apache.spark.sql.Dataset.count(Dataset.scala:2226)
.............
{code}
Hope this helps. Thank you.

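In the trace above, the heap runs out while `QueryExecution.toString` is appending to a `StringBuilder`, that is, while the plan is being rendered as text for `count()`. As a rough illustration only, the toy Java sketch below shows how the rendered text of a plan can grow when every iteration joins the previous result with itself. `PlanNode` and `PlanGrowthDemo` are hypothetical names invented for this sketch, not Spark classes, and the assumption that the job builds its plan iteratively in this way is a guess about the workload, not a diagnosis.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a query plan node. Hypothetical; NOT a Spark class.
final class PlanNode {
    private final String description;
    private final List<PlanNode> children = new ArrayList<>();

    PlanNode(String description, PlanNode... kids) {
        this.description = description;
        for (PlanNode k : kids) {
            children.add(k);
        }
    }

    // Render the whole tree into one string, one indented line per node.
    void render(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(description).append('\n');
        for (PlanNode c : children) {
            c.render(out, depth + 1);
        }
    }

    // Length of the fully rendered plan text.
    int renderedLength() {
        StringBuilder sb = new StringBuilder();
        render(sb, 0);
        return sb.length();
    }
}

public class PlanGrowthDemo {
    // Build a plan in which every iteration joins the previous result with itself.
    static PlanNode iterate(int iterations) {
        PlanNode plan = new PlanNode("Scan table");
        for (int i = 0; i < iterations; i++) {
            plan = new PlanNode("Join #" + i, plan, plan);
        }
        return plan;
    }

    public static void main(String[] args) {
        // The in-memory tree shares nodes, but the rendered text repeats them,
        // so the string more than doubles with each iteration.
        for (int n = 0; n <= 16; n += 4) {
            System.out.println(n + " iterations -> "
                    + iterate(n).renderedLength() + " chars");
        }
    }
}
```

The point of the sketch is that the object graph stays small (one new node per iteration) while its string form does not, which is one way a `toString` call could exhaust the heap even when the data itself is modest.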
> spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver
> -------------------------------------------------------------------------
>
>                 Key: SPARK-23427
>                 URL: https://issues.apache.org/jira/browse/SPARK-23427
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: SPARK 2.0 version
>            Reporter: Dhiraj
>            Priority: Critical
>
> We are facing an issue with the value of spark.sql.autoBroadcastJoinThreshold.
> With spark.sql.autoBroadcastJoinThreshold set to -1 (disabled), driver memory
> usage stays flat.
> With any other value (10MB, 5MB, 2MB, 1MB, 10K, 1K), driver memory usage grows
> at a rate that depends on the size of the threshold, until we get an OOM
> exception. The problem is that memory used by the auto-broadcast is not being
> freed in the driver.
> The application imports Oracle tables as master DataFrames, which are
> persisted. Each job applies filters to these tables and then registers them as
> temp views. SQL queries are then used to process the data further. At the end,
> all the intermediate DataFrames are unpersisted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)