[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609507#comment-17609507 ]

John Pellman edited comment on SPARK-12216 at 9/26/22 1:42 PM:
---------------------------------------------------------------

Just as another data point, it appears that a variant of this issue also rears 
its head on GNU/Linux (Debian 10, Spark 3.1.2, Scala 2.12.14) if you set your 
temp directory to be on an NFS mount:

{code}
22/09/26 13:19:09 ERROR org.apache.spark.util.ShutdownHookManager: Exception while deleting Spark temp dir: /hadoop/spark/tmp/spark-af087c3d-6abf-40cb-b3c8-b86e38f2f827/repl-60fe6d34-7dfd-4530-bbb6-f1ace7e953b3
java.io.IOException: Failed to delete: /hadoop/spark/tmp/spark-af087c3d-6abf-40cb-b3c8-b86e38f2f827/repl-60fe6d34-7dfd-4530-bbb6-f1ace7e953b3/$line10/.nfs00000000026e00cd00001377
        at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:144)
        at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
        at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
        at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
        at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
        at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
        at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:91)
        at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1141)
        at org.apache.spark.util.ShutdownHookManager$.$anonfun$new$4(ShutdownHookManager.scala:65)
        at org.apache.spark.util.ShutdownHookManager$.$anonfun$new$4$adapted(ShutdownHookManager.scala:62)
        at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
        at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
        at org.apache.spark.util.ShutdownHookManager$.$anonfun$new$2(ShutdownHookManager.scala:62)
        at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
        at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
        at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.Try$.apply(Try.scala:213)
        at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
        at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
{code}

The problem in this case seems to be that {{spark-shell}} attempts a recursive 
unlink while files are still open, which triggers NFS client-side [silly 
renames|http://nfs.sourceforge.net/#faq_d2]. That suggests this overall issue 
is less of a "weird Windows thing" and more a matter of spark-shell not waiting 
for all file handles to be closed before attempting to remove the temp dir. The 
failure is non-deterministic and cannot be reproduced consistently.
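
To make the silly-rename interaction concrete, here is a minimal sketch of the 
mechanism (not from the original report; the directory is just a stand-in for 
Spark's spark-<uuid>/repl-<uuid> temp dirs on the NFS mount):

{code}
import java.io.{File, FileInputStream}
import java.nio.file.{Files, Paths}

// Any directory that lives on the NFS mount (/hadoop/spark/tmp here, as in the log above).
val dir = Files.createTempDirectory(Paths.get("/hadoop/spark/tmp"), "nfs-test-").toFile
val f = new File(dir, "payload.bin")
Files.write(f.toPath, "still open".getBytes)

// Keep a handle open, as the REPL can for generated class files.
val in = new FileInputStream(f)

// Unlinking an open file on an NFS client triggers a "silly rename":
// the file becomes .nfsXXXXXXXX instead of actually disappearing.
f.delete()

// A recursive delete of the parent then finds the .nfs* placeholder, cannot
// remove it, and fails -- the same shape as the IOException in the log above.
dir.listFiles().foreach(c => println(s"${c.getName} deleted=${c.delete()}"))
println(s"${dir.getName} deleted=${dir.delete()}")

in.close()  // only after the last handle closes does the .nfs* file go away
{code}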

The obvious workaround here is to keep temp directories off NFS, but Spark does 
seem to be relying on file-handling behavior that is specific to local (non-NFS) 
Linux volumes rather than performing a sanity check within spark-shell/Scala 
(which might not be a bad idea).
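
To illustrate the kind of sanity check meant here, a rough sketch (the helper 
name and retry policy are made up, not existing Spark code):

{code}
import java.io.File

// Hypothetical helper: retry deletion of a temp dir, tolerating transient
// .nfsXXXX placeholders left behind by NFS silly renames.
def deleteWithNfsGrace(dir: File, maxWaitMs: Long = 10000L, pollMs: Long = 500L): Boolean = {
  def sillyRenamed(d: File): Seq[File] = {
    val children = Option(d.listFiles()).getOrElse(Array.empty[File]).toSeq
    children.filter(_.getName.startsWith(".nfs")) ++
      children.filter(_.isDirectory).flatMap(sillyRenamed)
  }
  // Wait (bounded) for open handles to be released, i.e. for .nfs* files to vanish.
  val deadline = System.currentTimeMillis() + maxWaitMs
  while (sillyRenamed(dir).nonEmpty && System.currentTimeMillis() < deadline) {
    Thread.sleep(pollMs)
  }
  def deleteRecursively(f: File): Boolean = {
    if (f.isDirectory) {
      Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }
  deleteRecursively(dir)
}
{code}

The exact shape doesn't matter; the point is that the shutdown hook could 
distinguish "genuinely undeletable" from "still pinned by an open handle" 
instead of failing immediately.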


> Spark failed to delete temp directory 
> --------------------------------------
>
>                 Key: SPARK-12216
>                 URL: https://issues.apache.org/jira/browse/SPARK-12216
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>         Environment: windows 7 64 bit
> Spark 1.5.2
> Java 1.8.0.65
> PATH includes:
> C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6\bin
> C:\ProgramData\Oracle\Java\javapath
> C:\Users\Stefan\scala\bin
> SYSTEM variables set are:
> JAVA_HOME=C:\Program Files\Java\jre1.8.0_65
> HADOOP_HOME=C:\Users\Stefan\hadoop-2.6.0\bin
> (where the bin\winutils resides)
> both \tmp and \tmp\hive have permissions
> drwxrwxrwx as detected by winutils ls
>            Reporter: stefan
>            Priority: Minor
>
> The mailing list archives have no obvious solution to this:
> scala> :q
> Stopping spark context.
> 15/12/08 16:24:22 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff
> java.io.IOException: Failed to delete: C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff
>         at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884)
>         at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63)
>         at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60)
>         at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
>         at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60)
>         at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234)
>         at scala.util.Try$.apply(Try.scala:161)
>         at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234)
>         at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216)
>         at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)


