GitHub user squito commented on the issue:

    https://github.com/apache/spark/pull/15249
  
    @kayousterhout on the topic of that error message in the config validation 
(sorry, GitHub is being weird about letting me respond directly to your comment):
    
    > I see -- I didn't realize executor-level blacklisting still works. I know 
    > we had a long prior discussion about this, but I'm now wondering if we 
    > should actually allow this setting? Or do you think (similar to the 
    > discussion with Mridul below) that it never makes sense to have only 
    > executor-level blacklisting?
    >
    > Sorry for dragging this on b/c I didn't understand the config!!
    
    Well, the blacklisting will "work" in other settings, meaning executors 
could get blacklisted and tasks won't run there.  But I think a better way to 
think about it is -- what failure modes is Spark robust to?  The major 
motivation I see is making it robust to one bad disk on one bad node (of course 
it should also generalize to `n` bad nodes).  I don't see much motivation for 
using this feature without even getting that level of safety, which is why I 
think the validation makes sense.  But that's not to say you are getting 
*nothing* from the feature without that level of safety.
    
    To make this a little more concrete, consider what we need to tell users to 
get that level of safety with the current configuration (sketched below).  They 
need to set:
    * "spark.scheduler.executorTaskBlacklistTime" > the runtime of an entire 
taskset, so once there is a failure, the task never gets rescheduled on the 
same executor (in practice I tell users to just set it super high, e.g. 1 day, 
just to choose something).
    * "spark.task.maxFailures" > the number of executors that will ever be on 
one node for the lifetime of your app.
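    
    A minimal sketch of what that looks like as a `SparkConf`, assuming the 
blacklist time is in milliseconds and a hypothetical cluster with at most 4 
executors per node over the app's lifetime (the values are placeholders, not 
recommendations):
    
    ```scala
    import org.apache.spark.SparkConf
    
    val conf = new SparkConf()
      // Never reschedule a failed task on the same executor within a taskset's
      // lifetime; one day (in ms) is just an arbitrarily large value.
      .set("spark.scheduler.executorTaskBlacklistTime", (24L * 60 * 60 * 1000).toString)
      // Must exceed the number of executors that will *ever* run on one node;
      // with at most 4 executors per node, 5 is enough.
      .set("spark.task.maxFailures", "5")
    ```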
    
    The second condition is the really crummy part.  It means users have to 
update that setting whenever they reconfigure the executors for their job 
(users commonly play with executor sizes to see how it affects performance, 
e.g. one big executor with all the resources on a node vs. more smaller ones), 
and with dynamic allocation, where executors come and go on a node, this 
becomes basically impossible.
    
    Anyway, the point is: if "spark.blacklist.task.maxTaskAttemptsPerNode" >= 
"spark.task.maxFailures", then you are back in the territory where safety 
depends on how many executors you have per node and how they come and go.  In 
many cases it'll work just fine.  But we *know* there are perfectly natural 
deployments which won't be safe with that configuration, so I think it makes 
sense to fail fast.  As you pointed out a while back, the user is probably 
unlikely to realize they are in this situation unless we tell them loudly.
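    
    To illustrate the kind of check being discussed, here is a rough sketch of 
a fail-fast validation (not the PR's actual code; the defaults used below are 
just assumptions for the sake of the example):
    
    ```scala
    import org.apache.spark.{SparkConf, SparkException}
    
    def validateBlacklistConf(conf: SparkConf): Unit = {
      // Assumed defaults, for illustration only.
      val maxTaskAttemptsPerNode = conf.getInt("spark.blacklist.task.maxTaskAttemptsPerNode", 2)
      val maxTaskFailures = conf.getInt("spark.task.maxFailures", 4)
      if (maxTaskAttemptsPerNode >= maxTaskFailures) {
        throw new SparkException(
          s"spark.blacklist.task.maxTaskAttemptsPerNode ($maxTaskAttemptsPerNode) should be " +
          s"less than spark.task.maxFailures ($maxTaskFailures); otherwise failures on a " +
          "single bad node can abort the taskset before the node is ever blacklisted.")
      }
    }
    ```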
    
    Drawing this line at one bad node is somewhat arbitrary.  Users might 
really want to be safe against 2 bad nodes, in which case they'd need to think 
through the conditions for themselves.  Or maybe they're OK with even less.  
But one bad node also seems like a pretty reasonable expectation.
    
    I could also put the escape-hatch back in ...

