Github user devaraj-kavali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19141#discussion_r138402807

    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -565,7 +565,6 @@ private[spark] class Client(
               distribute(jarsArchive.toURI.getPath,
                 resType = LocalResourceType.ARCHIVE,
                 destName = Some(LOCALIZED_LIB_DIR))
    -          jarsArchive.delete()
    --- End diff --

Thanks @jerryshao for the comment.

> What if your scenario and SPARK-20741's scenario are both encountered? Looks like your approach above cannot be worked.

Can you provide some information on why you think it doesn't work? If we delete spark_libs.zip after the application completes (similar to the staging-dir deletion), the temp files would not stack up until process exit, which solves SPARK-20741, and the archive would also remain available during execution, which addresses the current issue.

> I'm wondering if we can copy or move this spark_libs.zip temp file to another non-temp file and add that file to the dist cache. That non-temp file will not be deleted and can be overwritten during another launching, so we will always have only one copy.

If multiple jobs are submitted or running concurrently, the latest spark_libs.zip would overwrite the existing one, which could fail applications whose copy is still in progress, and it would also be ambiguous which application is responsible for deleting the file.

> Besides, I think we have several workarounds to handle this issue like spark.yarn.jars or spark.yarn.archive, so looks like this corner case is not so necessary to fix (just my thinking, normally people will not use local FS in a real cluster).

I agree, this is a corner case and can be handled with a workaround.
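The deferred-cleanup idea discussed above could be sketched as follows. This is a minimal illustration, not Spark's actual implementation: instead of calling `jarsArchive.delete()` right after `distribute()`, the client would remember the temp archive and delete it when the application finishes, alongside the staging-directory cleanup. The names `rememberLibArchive` and `cleanupLibArchive` are hypothetical.

```scala
import java.io.File
import scala.util.control.NonFatal

// Hypothetical sketch: keep a handle to the temporary spark_libs archive
// after distribute() instead of deleting it immediately, and remove it
// when the application completes (next to the staging-dir cleanup).
// This way the archive stays available while the app runs, but temp
// files do not accumulate until process exit (the SPARK-20741 concern).
object LibArchiveCleanup {
  @volatile private var libArchive: Option[File] = None

  // Called where Client.scala currently calls jarsArchive.delete().
  def rememberLibArchive(f: File): Unit = {
    libArchive = Some(f)
  }

  // Called from the application-completion path, where the staging
  // directory is cleaned up. Returns true if the archive was deleted.
  def cleanupLibArchive(): Boolean = libArchive match {
    case Some(f) =>
      try f.delete() catch { case NonFatal(_) => false }
    case None => false
  }
}
```

Per-application ownership of the handle also avoids the ambiguity raised above about which job should delete a shared non-temp copy.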