[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940987#comment-15940987 ]
Jouni H edited comment on SPARK-12216 at 3/24/17 9:37 PM:
----------------------------------------------------------

I was able to reproduce this bug on Windows with the latest Spark version: spark-2.1.0-bin-hadoop2.7

The bug occurs for me when I pass --jars to spark-submit AND call saveAsTextFile in the script.

Example scenarios:
* ERROR when I include --jars AND use saveAsTextFile
* Works when I use saveAsTextFile but don't pass any --jars on the command line
* Works when I include --jars on the command line but don't use saveAsTextFile (commented out)

Example command line:
{{spark-submit --jars aws-java-sdk-1.7.4.jar sparkbugtest.py bugtest.txt ./output/test1/}}

The script here doesn't need the jar passed via --jars, but including it on the command line triggers the shutdown bug.

aws-java-sdk-1.7.4.jar can be downloaded from here: https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar

The contents of bugtest.txt don't matter.

Example script:
{noformat}
import sys

from pyspark.sql import SparkSession


def main():
    # Initialize the Spark session.
    spark = SparkSession\
        .builder\
        .appName("SparkParseLogTest")\
        .getOrCreate()

    # Read the input file (argv[1]) as an RDD of plain strings.
    lines = spark.read.text(sys.argv[1]).rdd.map(lambda r: r[0])
    # Write the lines to the output directory (argv[2]).
    lines.saveAsTextFile(sys.argv[2])


if __name__ == "__main__":
    main()
{noformat}

I also use winutils.exe as mentioned here: https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-tips-and-tricks-running-spark-windows.html

After the error was thrown and spark-submit had ended, I looked at the folder that couldn't be deleted. It still contains the .jar file, for example {{C:\Users\Jouni\AppData\Local\Temp\spark-9b68fc91-7ee7-481a-970d-38a6db6f6160\userFiles-948dc876-bced-4778-98a7-90944a7fb155\aws-java-sdk-1.7.4.jar}}

> Spark failed to delete temp directory
> --------------------------------------
>
> Key: SPARK-12216
> URL: https://issues.apache.org/jira/browse/SPARK-12216
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Environment: windows 7 64 bit
> Spark 1.52
> Java 1.8.0.65
> PATH includes:
> C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6\bin
> C:\ProgramData\Oracle\Java\javapath
> C:\Users\Stefan\scala\bin
> SYSTEM variables set are:
> JAVA_HOME=C:\Program Files\Java\jre1.8.0_65
> HADOOP_HOME=C:\Users\Stefan\hadoop-2.6.0\bin
> (where the bin\winutils resides)
> both \tmp and \tmp\hive have permissions
> drwxrwxrwx as detected by winutils ls
> Reporter: stefan
> Priority: Minor
>
> The mailing list archives have no obvious solution to this:
> scala> :q
> Stopping spark context.
> 15/12/08 16:24:22 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff
> java.io.IOException: Failed to delete: C:\Users\Stefan\AppData\Local\Temp\spark-18f2a418-e02f-458b-8325-60642868fdff
>       at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884)
>       at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63)
>       at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60)
>       at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
>       at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60)
>       at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264)
>       at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234)
>       at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234)
>       at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234)
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>       at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234)
>       at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234)
>       at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234)
>       at scala.util.Try$.apply(Try.scala:161)
>       at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234)
>       at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216)
>       at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org