A task set is the set of tasks within one stage. With dynamic allocation, an executor is killed after it has been idle for a period of time (the default is 60s). The problem you mentioned is a bug: the scheduler should not allocate tasks to these to-be-killed executors. I think it is fixed in 1.5.
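For reference, the idle timeout mentioned above is controlled by Spark's dynamic-allocation settings. A minimal sketch of the relevant entries in spark-defaults.conf (the 60s value shown is just the default restated; tune it for your workload):

```properties
# spark-defaults.conf — dynamic allocation sketch
spark.dynamicAllocation.enabled              true
# The external shuffle service is required so shuffle files
# survive executor removal
spark.shuffle.service.enabled                true
# Executors idle longer than this become candidates for removal
# (default is 60s)
spark.dynamicAllocation.executorIdleTimeout  60s
```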
Thanks
Saisai

On Thu, Sep 17, 2015 at 3:31 PM, Robert Saccone <rs.rsacc...@gmail.com> wrote:

> Hello,
>
> We're running some experiments with Spark (v1.4) and have some questions
> about its scheduling behavior. I am hoping someone can answer the
> following questions.
>
> What is a task set? It is mentioned in the Spark logs we get from our
> runs, but we can't seem to find a definition of it, or of how it relates
> to the Spark concepts of jobs, stages, and tasks, in the online
> documentation. This makes it hard to reason about the scheduling behavior.
>
> What is the heuristic used to kill executors when running Spark on YARN
> with dynamic allocation? From the logs, what we observe is that executors
> that have work (task sets) queued to them are being killed and the work
> is being reassigned to other executors. This seems inconsistent with the
> online documentation, which says that executors aren't killed until
> they've been idle for a user-configurable number of seconds.
>
> We're using Fair Scheduler pooling with multiple pools, each with a
> different weight, so is it correct that there are queues in the pools and
> in the executors as well?
>
> We can provide more details on our setup if desired.
>
> Regards,
>
> Rob Saccone
>
> IBM T. J. Watson Center
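For the Fair Scheduler question in the quoted mail: pools and their weights are typically declared in a fairscheduler.xml file referenced via spark.scheduler.allocation.file (with spark.scheduler.mode set to FAIR). A minimal sketch with two pools of different weights; the pool names here are hypothetical, not from this thread:

```xml
<!-- fairscheduler.xml: two pools with different weights (hypothetical names) -->
<allocations>
  <pool name="production">
    <schedulingMode>FAIR</schedulingMode>
    <!-- weight: relative share of resources versus other pools -->
    <weight>3</weight>
    <!-- minShare: minimum number of cores the pool should get -->
    <minShare>2</minShare>
  </pool>
  <pool name="adhoc">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>
```

Each pool holds a queue of waiting task sets; tasks are then offered to executors as resources free up, so the queuing the original mail describes happens at the pool/scheduler level rather than per executor.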