[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675347#comment-16675347 ]

Eyal Farago commented on SPARK-24437:
-------------------------------------

[~mgaido], I think we agree on most of the details :)

I do think that once the plan has executed, keeping the broadcast around 
is something of a waste, as its data is effectively cached as part of the 
cached relation.

On the other hand, 'losing' the broadcast means losing the ability to recover 
from node failures (Spark relies on the RDD lineage for that), so I guess 
keeping it is the least bad way of caching this relation.

 

BTW, [~dvogelbacher], what's consuming most of the memory in your scenario? Are 
these the broadcast variables or the cached relations? I'd expect the latter to 
be much heavier in most use cases.

> Memory leak in UnsafeHashedRelation
> -----------------------------------
>
>                 Key: SPARK-24437
>                 URL: https://issues.apache.org/jira/browse/SPARK-24437
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: gagan taneja
>            Priority: Critical
>         Attachments: Screen Shot 2018-05-30 at 2.05.40 PM.png, Screen Shot 
> 2018-05-30 at 2.07.22 PM.png, Screen Shot 2018-11-01 at 10.38.30 AM.png
>
>
> There seems to be a memory leak in 
> org.apache.spark.sql.execution.joins.UnsafeHashedRelation.
> We have a long-running instance of STS (Spark Thrift Server).
> With each query execution that requires a broadcast join, an 
> UnsafeHashedRelation is registered for cleanup in ContextCleaner. This 
> reference to the UnsafeHashedRelation is held by some other collection 
> and never becomes eligible for GC, so ContextCleaner cannot clean it up.
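
For context, a hedged sketch of the reported pattern: a long-running server 
issuing many broadcast-join queries, each of which builds a new hashed 
relation. Names and sizes are illustrative only, and `spark` is assumed to be 
an existing SparkSession:

    import org.apache.spark.sql.functions.broadcast

    val dim  = spark.range(10000L).withColumnRenamed("id", "k")
    val fact = spark.range(1000000L).withColumnRenamed("id", "k")

    for (_ <- 1 to 1000) {
      // Each iteration broadcasts `dim`, building a fresh
      // UnsafeHashedRelation that ContextCleaner should later clean up.
      // Per the report, a lingering reference elsewhere keeps these
      // relations from ever becoming eligible for GC.
      fact.join(broadcast(dim), "k").count()
    }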


