GitHub user mccheah opened a pull request:

    https://github.com/apache/spark/pull/21366

    [SPARK-24248][K8S][WIP] Use the Kubernetes API to populate an event queue 
for scheduling

    ## What changes were proposed in this pull request?
    
    Previously, the scheduler backend maintained state in many places, both reading and writing it. For example, state had to be managed in both the watch and in the executor allocator runnable. Furthermore, one had to keep track of multiple hash tables.
    
    We can do better here by:
    
    1. Consolidating the places where we manage state. Here, we take inspiration from traditional Kubernetes controllers. These controllers tend to implement an event queue that is populated by two sources: a watch connection and a periodic poller. Controllers typically use both mechanisms for redundancy; the watch connection may drop, so the periodic polling serves as a backup. Both sources write pod updates to a single event queue, and a processor then periodically reconciles the current state of pods as reported by the two sources.
    
    2. Storing less specialized in-memory state in general. Previously we were creating hash tables to represent the state of executors. Instead, it's easier to represent state solely by the event queue, which has predictable read/write patterns and serves as a local, up-to-date cache of the cluster's status (see the sketch below).
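
    Below is a minimal sketch of this pattern. All names here (`PodUpdate`, `ExecutorPodEventQueue`, the callbacks) are illustrative stand-ins rather than the actual classes in this patch; the real code works with the fabric8 Kubernetes client's pod model.

    ```scala
    import java.util.concurrent.{Executors, LinkedBlockingQueue, TimeUnit}
    import scala.collection.JavaConverters._

    // Illustrative pod-update event; the real patch works with fabric8 Pod objects.
    case class PodUpdate(podName: String, phase: String)

    class ExecutorPodEventQueue(pollIntervalSeconds: Long, processIntervalSeconds: Long) {
      private val queue = new LinkedBlockingQueue[PodUpdate]()
      private val scheduler = Executors.newScheduledThreadPool(2)

      // Source 1: the watch connection pushes updates as the API server reports them.
      def onWatchEvent(update: PodUpdate): Unit = queue.put(update)

      // Source 2: a periodic poller lists pods as a redundant backup, since the
      // watch connection may drop.
      def startPolling(listPods: () => Seq[PodUpdate]): Unit = {
        scheduler.scheduleWithFixedDelay(new Runnable {
          override def run(): Unit = listPods().foreach(queue.put)
        }, 0L, pollIntervalSeconds, TimeUnit.SECONDS)
      }

      // Processor: periodically drains everything both sources have reported and
      // hands the batch to a reconciling callback, so no separate hash tables of
      // executor state need to be maintained.
      def startProcessing(reconcile: Seq[PodUpdate] => Unit): Unit = {
        scheduler.scheduleWithFixedDelay(new Runnable {
          override def run(): Unit = {
            val batch = new java.util.ArrayList[PodUpdate]()
            queue.drainTo(batch)
            reconcile(batch.asScala.toList)
          }
        }, 0L, processIntervalSeconds, TimeUnit.SECONDS)
      }

      def stop(): Unit = scheduler.shutdownNow()
    }
    ```

    Because both sources feed the same queue, ordering and deduplication concerns live in one place (the reconcile callback) instead of being scattered across a watch handler and an allocator thread.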
    
    ## How was this patch tested?
    
    Integration tests should verify that there are no regressions end to end. Unit tests are to be updated, focusing in particular on different orderings of events, especially cases where events arrive in an unexpected order.
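
    For example, a unit test for unexpected orderings might feed the processor a deletion followed by a stale running update and assert that the late update does not resurrect the executor. The reconciliation rule sketched here (a terminal phase is final) is an assumption for illustration, not necessarily the exact rule this patch implements:

    ```scala
    // Illustrative only: PodUpdate mirrors the sketch above.
    case class PodUpdate(podName: String, phase: String)

    // The watch delivers a stale "Running" update after the poller has already
    // reported the pod deleted.
    val outOfOrder = Seq(
      PodUpdate("exec-1", "Deleted"),
      PodUpdate("exec-1", "Running")
    )

    // Assumed rule: once a pod has been seen in a terminal phase, later (stale)
    // updates for it are ignored.
    val finalPhases = outOfOrder.foldLeft(Map.empty[String, String]) {
      case (state, PodUpdate(name, phase)) =>
        if (state.get(name).contains("Deleted")) state
        else state.updated(name, phase)
    }

    assert(finalPhases("exec-1") == "Deleted")
    ```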
    
    Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/palantir/spark event-queue-driven-scheduling

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21366.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21366
    
----
commit 310263c253a8c4a3748cab5b5a7698e076695cd6
Author: mcheah <mcheah@...>
Date:   2018-05-18T20:39:47Z

    [SPARK-24248][K8S] Use the Kubernetes API to populate an event queue for 
scheduling
    
    Previously, the scheduler backend was maintaining state in many places,
    not only for reading state but also writing to it. For example, state
    had to be managed in both the watch and in the executor allocator
    runnable. Furthermore, one had to keep track of multiple hash tables.
    
    We can do better here by:
    
    (1) Consolidating the places where we manage state. Here, we take
    inspiration from traditional Kubernetes controllers. These controllers
    tend to implement an event queue which is populated by two sources: a
    watch connection, and a periodic poller. Controllers typically use both
    mechanisms for redundancy; the watch connection may drop, so the
    periodic polling serves as a backup. Both sources write pod updates to a
    single event queue, and a processor then periodically reconciles the
    current state of pods as reported by the two sources.
    
    (2) Storing less specialized in-memory state in general. Previously we
    were creating hash tables to represent the state of executors. Instead,
    it's easier to represent state solely by the event queue, which has
    predictable read/write patterns and is more or less just a local
    up-to-date cache of the cluster's status.

----

