Github user devaraj-kavali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19141#discussion_r138402807

    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -565,7 +565,6 @@ private[spark] class Client(
               distribute(jarsArchive.toURI.getPath,
                 resType = LocalResourceType.ARCHIVE,
                 destName = Some(LOCALIZED_LIB_DIR))
    -          jarsArchive.delete()
    --- End diff --

Thanks @jerryshao for the comment.

> What if your scenario and SPARK-20741's scenario are both encountered? Looks like your approach above cannot be worked.

Can you provide some information on why you think it doesn't work? If we delete spark_libs.zip after the application completes (similar to the staging-dir deletion), the temp files would not stack up until process exit, which solves SPARK-20741, and the archive would also remain available during execution, which addresses the current issue.

> I'm wondering if we can copy or move this spark_libs.zip temp file to another non-temp file and add that file to the dist cache. That non-temp file will not be deleted and can be overwritten during another launching, so we will always have only one copy.

If multiple jobs are submitted or running concurrently, the latest spark_libs.zip would overwrite the existing one, which could fail applications whose copy is still in progress, and it would also be ambiguous which application is responsible for deleting the file.

> Besides, I think we have several workarounds to handle this issue like spark.yarn.jars or spark.yarn.archive, so looks like this corner case is not so necessary to fix (just my thinking, normally people will not use local FS in a real cluster).

I agree, this is a corner case and can be handled with a workaround.
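The deferred-cleanup idea discussed above could be sketched as follows. This is a minimal illustration, not Spark's actual implementation: instead of calling `jarsArchive.delete()` right after `distribute()`, the client would remember the temp archive and delete it when the application finishes, alongside the staging-directory cleanup. The names `rememberLibArchive` and `cleanupLibArchive` are hypothetical.

```scala
import java.io.File
import scala.util.control.NonFatal

// Hypothetical sketch: keep a handle to the temporary spark_libs archive
// after distribute() instead of deleting it immediately, and remove it
// when the application completes (next to the staging-dir cleanup).
// This way the archive stays available while the app runs, but temp
// files do not accumulate until process exit (the SPARK-20741 concern).
object LibArchiveCleanup {
  @volatile private var libArchive: Option[File] = None

  // Called where Client.scala currently calls jarsArchive.delete().
  def rememberLibArchive(f: File): Unit = {
    libArchive = Some(f)
  }

  // Called from the application-completion path, where the staging
  // directory is cleaned up. Returns true if the archive was deleted.
  def cleanupLibArchive(): Boolean = libArchive match {
    case Some(f) =>
      try f.delete() catch { case NonFatal(_) => false }
    case None => false
  }
}
```

Per-application ownership of the handle also avoids the ambiguity raised above about which job should delete a shared non-temp copy.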