Github user squito commented on the issue: https://github.com/apache/spark/pull/15249

@mridulm on yarn's bad disk detection -- yes, you are right, it is a very rudimentary check for bad disks. It really can't catch everything (and we've seen that in practice). I was just pointing out at least one case where you know some executors will be good and some won't. You certainly still need node-level blacklisting.

On the bigger topic of what to do about the timeouts -- I'm now thinking that we should really treat the legacy "spark.scheduler.executorTaskBlacklistTime" as orthogonal to the new "spark.blacklist.*". The new feature is about dealing w/ resources that are bad for a long period of time (e.g., hardware failure). The old feature was about trying to cope w/ resource contention. I may have been using (abusing) it to deal w/ bad hardware, but that is only b/c it was the only thing there was. Trying to shoe-horn resource contention back into this at the 11th hour might be a mistake.

Perhaps it makes the most sense to just leave the old feature in, beside this one. It'll still be undocumented (and I'll remove the logic that ties the configs together), so it can still wait for a cleaner fix, but existing use cases aren't broken. Maybe that fix is short timeouts for taskset-level blacklisting, or maybe it's something else entirely. When I put the old feature back, I can update names & add comments to make this distinction clear.
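To make the "orthogonal configs" idea concrete, here is a rough sketch of how the two families might sit side by side in `spark-defaults.conf`. The legacy key is the one named in this thread; the `spark.blacklist.*` keys follow the naming scheme discussed around this PR, but the exact keys and defaults here are assumptions, not a statement of the final API:

```
# Sketch only -- key names under spark.blacklist.* are assumed, not final.

# Legacy, undocumented taskset-level blacklist: how long (in ms) a task is
# barred from being rescheduled on an executor where it already failed.
# Short timeouts here are aimed at transient resource contention.
spark.scheduler.executorTaskBlacklistTime   5000

# New feature: long-lived blacklisting aimed at persistently bad resources
# (e.g. hardware failure), configured independently of the legacy key.
spark.blacklist.enabled                     true
```

The point of the sketch is that the two knobs would tune different failure modes (short-lived contention vs. long-lived bad hardware), so neither config needs to be derived from the other once the tying logic is removed.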