[GitHub] spark pull request #14887: [SPARK-17321][YARN] YARN shuffle service should u...

tgravescs Thu, 01 Sep 2016 12:18:45 -0700

Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14887#discussion_r77236626
  
    --- Diff: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java ---
    @@ -25,6 +25,8 @@
     import com.google.common.collect.Lists;
     import org.apache.hadoop.conf.Configuration;
     import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.util.DiskChecker;
    --- End diff --
    
    So if you aren't using YARN recovery, do you care about this file at all?
    I.e., in your cluster, if a NodeManager dies, you don't expect it to come
    back up and be able to serve shuffle data?
    If not, perhaps we should make it configurable to not store it in
    LevelDB at all.
    
    I'd have to check when the recovery feature was added, but as long as it
    is in an early enough Hadoop release, we can just make this depend on the
    YARN recovery setting.
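
    A rough sketch of what gating on the recovery setting could look like.
    This is only an illustration, not the actual patch: `RecoveryGate` and
    `shouldPersistExecutorState` are hypothetical names, and a plain `Map`
    stands in for the Hadoop `Configuration` so the snippet runs on its own.
    The property name `yarn.nodemanager.recovery.enabled` is the real
    NodeManager recovery switch, which defaults to false.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: decides whether the shuffle service should persist
// registered-executor state, based on the NodeManager recovery setting.
class RecoveryGate {
    // Real YARN property name; everything else in this class is illustrative.
    static final String NM_RECOVERY_ENABLED = "yarn.nodemanager.recovery.enabled";

    // A Map stands in for org.apache.hadoop.conf.Configuration (assumption).
    static boolean shouldPersistExecutorState(Map<String, String> conf) {
        // Recovery is off unless explicitly enabled, matching YARN's default.
        return Boolean.parseBoolean(conf.getOrDefault(NM_RECOVERY_ENABLED, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(shouldPersistExecutorState(conf));

        conf.put(NM_RECOVERY_ENABLED, "true");
        System.out.println(shouldPersistExecutorState(conf));
    }
}
```

    With something like this, the service would only open the LevelDB file
    when the check returns true and would skip it entirely otherwise.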




