[
https://issues.apache.org/jira/browse/SPARK-54354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun reassigned SPARK-54354:
-------------------------------------
Assignee: Hongze Zhang
> Driver / Executor hangs when there's not enough JVM heap memory for broadcast
> hashed relation
> ---------------------------------------------------------------------------------------------
>
> Key: SPARK-54354
> URL: https://issues.apache.org/jira/browse/SPARK-54354
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.1
> Reporter: Hongze Zhang
> Assignee: Hongze Zhang
> Priority: Major
> Labels: pull-request-available
>
> This bug occurs specifically when very large UnsafeHashedRelations are
> created for a broadcast hash join.
>
> The rationale of the bug is as follows:
>
> This code:
> [https://github.com/apache/spark/blob/6cb88c10126bde79076ce5c8d7574cc5c9524746/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala#L142-L150]
> creates a temporary unified memory manager with Long.MaxValue / 2 as the
> heap memory size for the new hashed relation. This essentially creates a
> condition where the actual JVM heap is significantly smaller than the heap
> size the unified memory manager believes it has.
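>
> For reference, the fallback at the linked lines looks roughly like the
> sketch below (a paraphrase of the linked revision, not a verbatim copy;
> the constructor argument roles are as I read them there):
> {code:scala}
> import org.apache.spark.SparkConf
> import org.apache.spark.internal.config.MEMORY_OFFHEAP_ENABLED
> import org.apache.spark.memory.{TaskMemoryManager, UnifiedMemoryManager}
>
> // When no real TaskMemoryManager is available, a throwaway one is built
> // whose bookkeeping heap has nothing to do with the actual JVM heap.
> val mm = Option(taskMemoryManager).getOrElse {
>   new TaskMemoryManager(
>     new UnifiedMemoryManager(
>       new SparkConf().set(MEMORY_OFFHEAP_ENABLED.key, "false"),
>       Long.MaxValue,      // maxHeapMemory
>       Long.MaxValue / 2,  // onHeapStorageRegionSize, leaving execution
>                           // ~Long.MaxValue / 2 of "available" heap
>       1),                 // numCores
>     0)                    // taskAttemptId
> }
> {code}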
> On top of that, TaskMemoryManager retries recursively when the JVM fails
> to back an allocation that the memory manager has already granted:
> [https://github.com/apache/spark/blob/6cb88c10126bde79076ce5c8d7574cc5c9524746/core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java#L402]
> Because the unified memory manager's heap size is far larger than the real
> JVM heap, this retry loop takes a very long time to terminate, so the
> program effectively hangs. In a typical local test, it hung for as long as
> 2 hours.
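>
> The retry in question (the linked code is Java; the sketch below renders
> it in Scala and omits the locking and page-table bookkeeping) behaves
> roughly like this:
> {code:scala}
> // Simplified paraphrase of TaskMemoryManager.allocatePage's retry path.
> // acquireExecutionMemory only updates bookkeeping; the JVM allocation is
> // what actually fails. The failed grant is parked in acquiredButNotUsed
> // and the method recurses, so with ~Long.MaxValue / 2 of bookkeeping
> // headroom each failed round shrinks the pool far too slowly for the
> // recursion to terminate in reasonable time.
> def allocatePage(size: Long, consumer: MemoryConsumer): MemoryBlock = {
>   val acquired = acquireExecutionMemory(size, consumer)
>   try {
>     memoryManager.tungstenMemoryAllocator().allocate(acquired)
>   } catch {
>     case _: OutOfMemoryError =>
>       acquiredButNotUsed += acquired // granted on paper, never allocated
>       allocatePage(size, consumer)   // recursive retry
>   }
> }
> {code}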