[jira] [Commented] (SPARK-12216) Spark failed to delete temp directory

2022-09-26 Thread John Pellman (Jira)


[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609507#comment-17609507 ]

John Pellman commented on SPARK-12216:
--

Just as another data point, it appears that a variant of this issue also rears 
its head on GNU/Linux (Debian 10, Spark 3.1.2, Scala 2.12.14) if you set your 
temp directory to be on an NFS mount:

{code}
22/09/26 13:19:09 ERROR org.apache.spark.util.ShutdownHookManager: Exception while deleting Spark temp dir: /hadoop/spark/tmp/spark-af087c3d-6abf-40cb-b3c8-b86e38f2f827/repl-60fe6d34-7dfd-4530-bbb6-f1ace7e953b3
java.io.IOException: Failed to delete: /hadoop/spark/tmp/spark-af087c3d-6abf-40cb-b3c8-b86e38f2f827/repl-60fe6d34-7dfd-4530-bbb6-f1ace7e953b3/$line10/.nfs026e00cd1377
    at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:144)
    at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
    at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
    at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
    at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
    at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
    at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:91)
    at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1141)
    at org.apache.spark.util.ShutdownHookManager$.$anonfun$new$4(ShutdownHookManager.scala:65)
    at org.apache.spark.util.ShutdownHookManager$.$anonfun$new$4$adapted(ShutdownHookManager.scala:62)
    at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
    at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
    at org.apache.spark.util.ShutdownHookManager$.$anonfun$new$2(ShutdownHookManager.scala:62)
    at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
    at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
    at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
    at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}

The problem in this case seems to be that {{spark-shell}} is attempting a 
recursive unlink while files are still open, which triggers NFS client-side 
[silly renames|http://nfs.sourceforge.net/#faq_d2]. It looks like this overall 
issue might be less of a "weird Windows thing" and more a matter of spark-shell 
not waiting until all file handles are closed before it attempts to remove the 
temp dir. The behavior is non-deterministic and cannot be reproduced 
consistently.
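
To make the silly-rename mechanism concrete, here is a minimal sketch (outside 
of Spark, with a purely illustrative path that is assumed to be an NFS client 
mount) of how an open handle leaves a {{.nfsXXXX}} placeholder that defeats a 
recursive delete:

{code}
import java.io.{File, FileInputStream}

// Illustrative sketch only (hypothetical path): hold a file open on an NFS
// client mount, unlink it, and observe the client's .nfsXXXX placeholder.
object SillyRenameSketch {
  def main(args: Array[String]): Unit = {
    val dir = new File("/mnt/nfs/tmp/silly-rename-demo") // assumed NFS mount
    dir.mkdirs()
    val file = new File(dir, "data.bin")
    file.createNewFile()

    val in = new FileInputStream(file) // keep a handle open
    file.delete()                      // NFS client "silly renames" it to .nfsXXXX

    // The directory is not actually empty, so removing it at this point
    // fails, mirroring the ShutdownHookManager error above.
    println(dir.listFiles().map(_.getName).mkString(", "))
    println(s"dir removed: ${dir.delete()}") // false while the handle is open

    in.close() // once the last handle is closed, the placeholder disappears
  }
}
{code}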

The obvious workaround here is to not put temp directories on NFS, but it does 
seem like you're relying on Linux to block the recursive unlink until all file 
handles are closed, rather than doing a sanity check within spark-shell/Scala 
(which might not be a bad idea).
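
As a rough illustration of what such a check could look like (this is not 
Spark's actual cleanup code; the helper names below are made up), the 
shutdown-time delete could retry briefly so that handles closed late in 
shutdown, and their {{.nfs*}} placeholders, get a chance to go away:

{code}
import java.io.File
import scala.annotation.tailrec

// Hypothetical helper, not part of Spark: retry the recursive unlink a few
// times instead of failing on the first leftover .nfs* placeholder.
object TempDirCleanupSketch {
  def deleteRecursively(f: File): Boolean = {
    Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    f.delete()
  }

  @tailrec
  def deleteWithRetry(dir: File, attempts: Int = 3, pauseMs: Long = 500): Boolean = {
    if (deleteRecursively(dir)) {
      true
    } else if (attempts <= 1) {
      false
    } else {
      Thread.sleep(pauseMs) // give lingering handles time to close
      deleteWithRetry(dir, attempts - 1, pauseMs)
    }
  }
}
{code}

Whether something like that belongs in Spark is debatable, but it would at 
least make the shutdown hook tolerant of the timing issue described above.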

> Spark failed to delete temp directory 
> --
>
> Key: SPARK-12216
> URL: https://issues.apache.org/jira/browse/SPARK-12216
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Shell
> Environment: Windows 7 64-bit
> Spark 1.5.2
> Java 1.8.0_65
> PATH includes:
> C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6\bin
> C:\ProgramData\Oracle\Java\javapath
> C:\Users\Stefan\scala\bin
> SYSTEM variables set are:
> JAVA_HOME=C:\Program Files\Java\jre1.8.0_65
> HADOOP_HOME=C:\Users\Stefan\hadoop-2.6.0\bin
> (where the bin\winutils resides)
> both \tmp and \tmp\hive have permissions
> drwxrwxrwx as detected by winutils ls
>Reporter: stefan
>Priority: Minor
>
> The mailing list archives have no obvious solution to this:
> scala> :q
> Stopping spark context.
> 15/12/08 16:24:2

[jira] [Comment Edited] (SPARK-12216) Spark failed to delete temp directory

2022-09-26 Thread John Pellman (Jira)


[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609507#comment-17609507 ]

John Pellman edited comment on SPARK-12216 at 9/26/22 1:40 PM:
---

The obvious workaround here is to not put temp directories on NFS, but it does 
seem like you're relying on file-handling behavior that is specific to Linux, 
rather than doing a sanity check within spark-shell/Scala (which might not be a 
bad idea).



[jira] [Comment Edited] (SPARK-12216) Spark failed to delete temp directory

2022-09-26 Thread John Pellman (Jira)


[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609507#comment-17609507 ]

John Pellman edited comment on SPARK-12216 at 9/26/22 1:42 PM:
---

The obvious workaround here is to not put temp directories on NFS, but it does 
seem like you're relying on file-handling behavior that is specific to how 
Linux behaves on non-NFS volumes, rather than doing a sanity check within 
spark-shell/Scala (which might not be a bad idea).

