Hi Folks,

I'm currently working on a feature for the Aurora scheduler and executor. The 
implementation strategy became controversial on the review board, so I'd like 
to bring it to a wider audience and start a discussion. 
Please feel free to let me know your thoughts. Your help is greatly appreciated!

The high-level goal of this feature is to improve the reliability and 
performance of the Aurora scheduler job updater by relying on health check 
status, rather than the watch_secs timeout, when deciding the update state of 
an individual instance.

Please see the following for more details and background:
Original review request: https://reviews.apache.org/r/51536/
Aurora JIRA ticket: https://issues.apache.org/jira/browse/AURORA-894
Design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#

Note: The "scheduler change summary" part of the design doc is a little 
outdated (this is what the review request is trying to address). I've left some 
comments there to clarify the latest proposed implementation plan for the 
scheduler change.

There are two questions I'm trying to address here:
1. How does the scheduler infer the executor version and stay backward compatible?
2. Where do we determine if health check is enabled?

In short, there are 3 different solutions proposed on the review board.

In the first two approaches, the scheduler relies on a string to determine 
the executor version, and whether health checking is enabled is determined 
solely on the executor side. Both approaches involve communication between the 
executor and the scheduler.
Solution 1:
The vCurrent executor sends a message from its health check thread during the 
RUNNING state transition. The vCurrent updater infers the executor version from 
the presence of this message and skips watch_secs if appropriate.
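
To make the Solution 1 check concrete, here is a minimal sketch in Python. It is 
an illustration only: the real updater lives in the scheduler, and the marker 
string and function name below are hypothetical, not something agreed on in the 
review.

```python
from typing import Optional

# Hypothetical marker; the actual string the executor would send is undecided.
HEALTH_CHECK_MARKER = "health_check_passed"


def should_skip_watch_secs(running_status_message: Optional[str]) -> bool:
    """Return True if the updater may skip the watch_secs wait.

    A vPrev executor sends no marker with its RUNNING update, so the
    updater keeps the existing watch_secs behavior for it.
    """
    if not running_status_message:
        return False
    return HEALTH_CHECK_MARKER in running_status_message


print(should_skip_watch_secs("health_check_passed"))  # vCurrent executor
print(should_skip_watch_secs(None))                   # vPrev executor
```

Note that any unrelated message that happens to contain the marker would also 
match, which is the brittleness raised in the concerns below.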

Solution 2:
Instead of relying on the presence of an arbitrary string in the message, rely 
on the presence of a string like "capabilities:CAPABILITY_1,CAPABILITY_2", 
where CAPABILITY_1 and CAPABILITY_2 (etc.) are constants defined in api.thrift. 
This just formalizes the mechanism and makes it a bit more future proof.
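
A rough sketch of how such a capabilities string could be parsed, in Python. The 
"capabilities:" prefix and comma separator come from the example above; the 
function name and the handling of malformed input are my own assumptions.

```python
from typing import FrozenSet, Optional

CAPABILITIES_PREFIX = "capabilities:"


def parse_capabilities(message: Optional[str]) -> FrozenSet[str]:
    """Extract capability names from a status update message.

    Returns an empty set when the message is missing or does not start
    with the prefix, which is what a vPrev executor would produce.
    """
    if not message or not message.startswith(CAPABILITIES_PREFIX):
        return frozenset()
    raw = message[len(CAPABILITIES_PREFIX):]
    # Drop empty entries so "capabilities:" or a trailing comma is harmless.
    return frozenset(item.strip() for item in raw.split(",") if item.strip())


caps = parse_capabilities("capabilities:CAPABILITY_1,CAPABILITY_2")
print("CAPABILITY_1" in caps)
```

The scheduler would then check for a specific capability constant rather than 
for the mere presence of a message.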

In the third solution, the scheduler infers the executor version from the 
JobUpdateSettings on the scheduler side.
Solution 3:
Add a bit to JobUpdateSettings, 'executorDrivenUpdates'. If it is set, the 
scheduler assumes that a transition from STARTING -> RUNNING means the executor 
is healthy. Concurrently, we release thermos and change HealthCheckConfig to 
say that a task should only go to RUNNING once it is healthy.

Pros and Cons:
The main benefits of Solution 1 are:
1. By using the message in the task status update, we don't have to make any 
schema change, which keeps the design simple.
2. The feature is fully backward compatible. When we roll out the vCurrent 
schedulers and executors, we do not have to instruct users to provide an 
additional field in the Job or Update configs, which could confuse customers 
while vPrev and vCurrent executors coexist in the cluster.

Concerns:
Relying on the presence of a message makes things brittle. Also we do not want 
to expose this message to users.

The benefit of Solution 2 is that it makes the feature more future proof. 
However, if we do not envision a new executor feature in the short term, it is 
not much different from Solution 1.

The benefits of Solution 3 include:
1. We support more than just thermos now (and others rely on custom executors).
2. A lot of things in Aurora treat the executor as opaque. The status update 
message sent by the executor should not be visible to users unless it's an 
error message.

Concerns:
1. In addition to the 'executorDrivenUpdates' bit that identifies the executor 
version, we still need to notify the scheduler whether health checking is 
enabled on the vCurrent executor. If it is not, the scheduler must be able to 
fall back to watch_secs.
2. Users have to provide an additional field in their .aurora config files. 
The feature wouldn't be available unless new clients are rolled out as well.
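
To illustrate concern 1, here is a minimal Python sketch of the fallback the 
scheduler would need. The class and function names are hypothetical, and how 
the scheduler would actually learn whether health checking is enabled is 
exactly the open question; here it is simply passed in as a boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateSettings:
    # Stands in for the proposed 'executorDrivenUpdates' bit.
    executor_driven_updates: bool
    watch_secs: int


def seconds_to_watch_after_running(settings: UpdateSettings,
                                   health_check_enabled: bool) -> int:
    """How long the updater should still watch an instance after RUNNING.

    RUNNING is trusted as "healthy" only when the update opted in AND the
    executor actually runs health checks; otherwise fall back to watch_secs.
    """
    if settings.executor_driven_updates and health_check_enabled:
        return 0
    return settings.watch_secs


settings = UpdateSettings(executor_driven_updates=True, watch_secs=45)
print(seconds_to_watch_after_running(settings, health_check_enabled=True))
print(seconds_to_watch_after_running(settings, health_check_enabled=False))
```

The point of the sketch is that the bit alone is not enough: without the second 
input, a job with no health check would be marked healthy as soon as it reaches 
RUNNING.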

Please let me know if I understand your suggestions correctly and hopefully 
everyone is on the same page!

Thanks,

Kai
