[ https://issues.apache.org/jira/browse/SPARK-34523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290631#comment-17290631 ]
Kent Yao commented on SPARK-34523:
----------------------------------

Hi [~dongjoon], thanks for your suggestions. When the problem lies in the JDK, the solution is often simply to upgrade the JDK and be done with it. But I guess the hardest part for users is collecting the clues and finding the corresponding problem. A documentation PR is a good choice, and the detailed JIRA also helps.

> Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0.

This statement is Spark-version specific and too brief to get much attention from users.

> JDK-8194653: Deadlock involving FileSystems.getDefault and System.loadLibrary call
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-34523
>                 URL: https://issues.apache.org/jira/browse/SPARK-34523
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 2.4.7, 3.0.2, 3.1.1
>            Reporter: Kent Yao
>            Priority: Major
>         Attachments: 4303.log, screenshot-1.png, screenshot-2.png
>
>
> h2. Introduction
> This bug causes a deadlock that hangs concurrent tasks forever on the same executor. For example, in the Spark UI stage tab, you may find that some tasks hang for hours while all the others complete without delay.
> !screenshot-2.png!
> You may also find that these hanging tasks all belong to the same executor. Usually, in this case, the executor log gives you nothing helpful either.
> If you print the executor jstack, or check the thread dump via the Spark UI executor tab, and find a task thread blocked as below, you are very likely hitting JDK-8194653.
> !screenshot-1.png!
> h2. Solutions
> Here are some options to work around this problem:
> 1. On the cluster manager side, update the JDK to a version that contains the fix for https://bugs.openjdk.java.net/browse/JDK-8194653
> 2. If you are not able to update the JDK for the cluster as a whole, use `spark.executorEnv.JAVA_HOME` to specify a suitable JRE for your applications
> 3. Alternatively, turning on `spark.speculation` may let Spark automatically re-run the hanging tasks and bypass the problem (see the configuration sketch below)
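As a rough illustration of workarounds 2 and 3, both settings can be applied per application from the driver code. This is only a sketch, not part of the original report: it assumes an alternative JRE containing the fix is already installed on every worker node, and the `/opt/jdk8u251` path and the app name are made-up placeholders.

{code:scala}
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Assumed placeholders: the JAVA_HOME path and the app name are examples only.
val conf = new SparkConf()
  .setAppName("jdk-8194653-workaround-sketch")
  // Workaround 2: run the executors on a JRE that already contains the JDK-8194653 fix.
  .set("spark.executorEnv.JAVA_HOME", "/opt/jdk8u251")
  // Workaround 3: speculatively re-launch task attempts that run far behind their peers,
  // so a task hung by the deadlock no longer blocks the whole stage.
  .set("spark.speculation", "true")

val spark = SparkSession.builder().config(conf).getOrCreate()
{code}

The same two settings can also be supplied without code changes, e.g. via `--conf spark.executorEnv.JAVA_HOME=/opt/jdk8u251 --conf spark.speculation=true` on spark-submit, or in spark-defaults.conf.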