[ https://issues.apache.org/jira/browse/SPARK-34523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290631#comment-17290631 ]

Kent Yao commented on SPARK-34523:
----------------------------------


Hi [~dongjoon], thanks for your suggestions. When a problem traces back to the 
JDK, the solution is often simply to upgrade the JDK and be done with it. But I 
guess the hardest part for users is collecting the clues and matching them to 
the underlying JDK bug. A documentation PR is a good choice, and the detailed 
JIRA also helps.

> Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0.  

This statement is Spark-version specific and too brief to draw much attention 
from users.



> JDK-8194653:  Deadlock involving FileSystems.getDefault and 
> System.loadLibrary call
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-34523
>                 URL: https://issues.apache.org/jira/browse/SPARK-34523
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 2.4.7, 3.0.2, 3.1.1
>            Reporter: Kent Yao
>            Priority: Major
>         Attachments: 4303.log, screenshot-1.png, screenshot-2.png
>
>
> h2. Introduction
> This JDK bug can cause a deadlock that hangs concurrent tasks on the same 
> executor forever. For example, in the Spark UI stage tab, you may find that 
> some tasks hang for hours while all the others complete without delay.
>  
> !screenshot-2.png! 
> You may also find that these hanging tasks all belong to the same executor.
> Usually, in this case, the executor log offers nothing helpful either.
> If you print the executor's jstack, or check the thread dump via the Spark UI 
> executor tab, and you find task threads blocked like below, you have very 
> likely hit JDK-8194653.
> !screenshot-1.png! 
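(As a rough illustration of the check described above: the snippet below pulls a thread dump from an executor JVM and filters for the JDK-8194653 signature, i.e. threads BLOCKED in `FileSystems.getDefault` or `System.loadLibrary`. The PID value is a placeholder; find the real one with `jps` on the executor host.)

```shell
# Hypothetical executor PID -- replace with the value shown by `jps`
# on the executor host.
EXEC_PID=12345

# Dump all thread stacks and keep the lines around the JDK-8194653
# signature: task threads blocked in FileSystems.getDefault or
# System.loadLibrary.
jstack "$EXEC_PID" \
  | grep -E -B2 'FileSystems\.getDefault|System\.loadLibrary'
```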
> h2. Solutions
> Here are some options to work around this problem:
> 1. On the cluster manager side, you can upgrade the JDK to a version that 
> contains the fix, per https://bugs.openjdk.java.net/browse/JDK-8194653
> 2. If you are not able to upgrade the JDK for the entire cluster, you can use 
> `spark.executorEnv.JAVA_HOME` to point your apps at a suitable JRE
> 3. Alternatively, turning on `spark.speculation` may let Spark automatically 
> re-run the hanging tasks and bypass the problem



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
