[ 
https://issues.apache.org/jira/browse/SPARK-23904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646723#comment-16646723
 ] 

Stanislav Los commented on SPARK-23904:
---------------------------------------

[~igreenfi] [~RBerenguel] we had the same issue and I found an easier solution to 
it (without the need to alter Spark code). See below. I also updated 
Stack Overflow.

We faced the same issue, and the solution is to set the parameter 
"spark.sql.ui.retainedExecutions" to a lower value, for example --conf 
"spark.sql.ui.retainedExecutions=10". 
By default it's 1000.
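
As a rough sketch, the flag above could be assembled into a spark-submit invocation as follows. The helper name build_submit_command and the application file my-app.jar are hypothetical placeholders, not something from our setup.

```python
import shlex


def build_submit_command(app, retained_executions=10):
    """Build a spark-submit argument list that caps retained SQL executions.

    `app` is a placeholder for your own application jar or script.
    """
    return [
        "spark-submit",
        "--conf",
        f"spark.sql.ui.retainedExecutions={retained_executions}",
        app,
    ]


# "my-app.jar" is a hypothetical application name used for illustration.
command = build_submit_command("my-app.jar")

# Print the command as a single shell-safe line.
print(shlex.join(command))
```

The same key can presumably also go into spark-defaults.conf or be set on the session builder; the command-line form is simply the one we used.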


Lowering this setting keeps the number of 
org.apache.spark.sql.execution.ui.SQLExecutionUIData instances low enough.
SQLExecutionUIData holds a reference to physicalPlanDescription, which can get 
very big.
In our case we had to read huge Avro messages with lots of fields from Kafka, 
and each plan description was around 8 MB.

> Big execution plan cause OOM
> ----------------------------
>
>                 Key: SPARK-23904
>                 URL: https://issues.apache.org/jira/browse/SPARK-23904
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: Izek Greenfield
>            Priority: Major
>              Labels: SQL, query
>
> I created a question on 
> [StackOverflow|https://stackoverflow.com/questions/49508683/spark-physicalplandescription-string-is-to-big]
>  
> Spark creates the text representation of the query in any case, even if I don't 
> need it.
> That causes many garbage objects and unneeded GC... 
>  [Gist with code to 
> reproduce|https://gist.github.com/igreenfield/584c3336f03ba7d63e9026774eaf5e23]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
