[ https://issues.apache.org/jira/browse/SPARK-23427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368728#comment-16368728 ]
Kazuaki Ishizaki edited comment on SPARK-23427 at 2/19/18 12:49 AM:
--------------------------------------------------------------------

Thank you. I ran this program several times with a 64GB heap. I saw the following OOM in both cases, `-1` and the default (`10 * 1024 * 1024`). I am still running the program with other heap sizes. Is this the OOM you are seeing? If not, I would appreciate it if you could upload the stack trace from when the OOM occurred.

{code}
[info] org.apache.spark.sql.MyTest *** ABORTED *** (2 hours, 14 minutes, 36 seconds)
[info] java.lang.OutOfMemoryError:
[info] at java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:161)
[info] at java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:155)
[info] at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:125)
[info] at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
[info] at java.lang.StringBuilder.append(StringBuilder.java:136)
[info] at java.lang.StringBuilder.append(StringBuilder.java:131)
[info] at scala.StringContext.standardInterpolator(StringContext.scala:125)
[info] at scala.StringContext.s(StringContext.scala:95)
[info] at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:199)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
[info] at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)
[info] at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
[info] at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
[info] at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:3295)
[info] at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3033)
[info] at org.apache.spark.sql.MyTest$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(MyTest.scala:87)
[info] at org.apache.spark.sql.catalyst.plans.PlanTestBase$class.withSQLConf(PlanTest.scala:176)
[info] at org.apache.spark.sql.MyTest.org$apache$spark$sql$test$SQLTestUtilsBase$$super$withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.test.SQLTestUtilsBase$class.withSQLConf(SQLTestUtils.scala:167)
[info] at org.apache.spark.sql.MyTest.withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply$mcV$sp(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
...
{code}
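For reference, this is roughly how I toggle the setting between the two runs. It is a minimal sketch: the session setup and app name are illustrative, and only `spark.sql.autoBroadcastJoinThreshold` is the actual configuration key under test here.

{code:scala}
import org.apache.spark.sql.SparkSession

// Illustrative session setup; only the autoBroadcastJoinThreshold key
// below is the actual configuration discussed in this issue.
val spark = SparkSession.builder()
  .appName("AutoBroadcastJoinThresholdRepro")
  .master("local[*]")
  .getOrCreate()

// Case 1: disable automatic broadcast joins entirely.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1)

// Case 2: the default threshold, 10 MB.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 10 * 1024 * 1024)
{code}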
> spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver
> -------------------------------------------------------------------------
>
>                 Key: SPARK-23427
>                 URL: https://issues.apache.org/jira/browse/SPARK-23427
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: Spark 2.0
>            Reporter: Dhiraj
>            Priority: Critical
>
> We are facing an issue with the value of spark.sql.autoBroadcastJoinThreshold.
> With spark.sql.autoBroadcastJoinThreshold set to -1 (disabled), we see driver memory usage stay flat.
> With any other value (10MB, 5MB, 2MB, 1MB, 10K, 1K), we see driver memory usage grow at a rate that depends on the size of the autoBroadcastThreshold, eventually getting an OOM exception. The problem is that the memory used by autoBroadcast is not being freed in the driver.
> The application imports Oracle tables as master DataFrames, which are persisted. Each job applies a filter to these tables and then registers them as temp views (tempViewTable). SQL queries are then used to process the data further. At the end, all the intermediate DataFrames are unpersisted.
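For context, the workflow described in the report looks roughly like the following minimal sketch; the JDBC connection options, table name, filter column, and query are all hypothetical placeholders, not taken from the reporter's application.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("BroadcastThresholdWorkflow")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Import an Oracle table as a persisted "master" DataFrame.
// The connection URL and table name are hypothetical.
val master = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/SERVICE")
  .option("dbtable", "MASTER_TABLE")
  .load()
  .persist()

// Each job filters the master DataFrame and registers it as a temp view.
val filtered = master.filter($"status" === "ACTIVE")
filtered.createOrReplaceTempView("tempViewTable")

// Data is processed further with SQL against the temp view; joins at this
// stage are where the broadcast threshold comes into play.
val result = spark.sql("SELECT status, COUNT(*) FROM tempViewTable GROUP BY status")
result.show()

// At the end of the job, intermediate DataFrames are unpersisted.
master.unpersist()
{code}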