GitHub user navsan opened a pull request:
https://github.com/apache/incubator-quickstep/pull/33
BugFix: Update NumQueuedWorkOrders to fix scheduling
Quickstep scheduling is currently broken (since PR #14). The foreman only
schedules work for one worker, leaving all other workers idle. This PR fixes
that bug.
The Foreman maintains the number of queued work orders for each worker in
the WorkerDirectory. This count was not being incremented when WorkOrders were
dispatched, nor decremented when WorkOrders completed, so every worker always
appeared to have zero queued work orders. The search for LeastLoadedWorker
would therefore always pick the first worker, resulting in serial execution
of the entire query.
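
To illustrate the failure mode described above, here is a minimal sketch. It
is not the Quickstep code: ToyWorkerDirectory, dispatchAll, and the
update_counts flag are hypothetical names made up for this example, and the
tie-breaking rule (first worker with the smallest count wins) is an
assumption about how a least-loaded search typically behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for per-worker bookkeeping (hypothetical, not the
// actual Quickstep WorkerDirectory).
class ToyWorkerDirectory {
 public:
  // Assumes num_workers >= 1.
  explicit ToyWorkerDirectory(std::size_t num_workers)
      : num_queued_(num_workers, 0) {}

  void increment(std::size_t worker) { ++num_queued_[worker]; }

  void decrement(std::size_t worker) {
    assert(num_queued_[worker] > 0);
    --num_queued_[worker];
  }

  // Returns the first worker that has the smallest number of queued orders.
  std::size_t leastLoadedWorker() const {
    std::size_t best = 0;
    for (std::size_t i = 1; i < num_queued_.size(); ++i) {
      if (num_queued_[i] < num_queued_[best]) {
        best = i;
      }
    }
    return best;
  }

 private:
  std::vector<int> num_queued_;
};

// Dispatches num_orders work orders and returns how many each worker received.
// With update_counts == false the counters are never touched, which mimics
// the bug: every worker always looks idle, so worker 0 wins every time.
std::vector<int> dispatchAll(std::size_t num_workers, int num_orders,
                             bool update_counts) {
  ToyWorkerDirectory directory(num_workers);
  std::vector<int> received(num_workers, 0);
  for (int i = 0; i < num_orders; ++i) {
    const std::size_t worker = directory.leastLoadedWorker();
    ++received[worker];
    if (update_counts) {
      directory.increment(worker);
    }
  }
  return received;
}
```

In this sketch, dispatchAll(4, 8, false) sends all 8 work orders to worker 0,
while dispatchAll(4, 8, true) spreads them evenly, 2 per worker.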
In this PR, the Foreman increments the number of queued work orders for a
worker whenever it dispatches a WorkOrder message to that worker.
When a WorkOrderCompletion message arrives, this number must be
decremented. However, the worker's thread ID is not available to the Foreman,
since PR #14 moved the message deserialization into the PolicyEnforcer. So, in
this PR, I've added a pointer to the WorkerDirectory in the PolicyEnforcer. The
PolicyEnforcer decrements the number of queued work orders while processing
WorkOrderCompletion messages.
The WorkerDirectory is not thread-safe, so the decrement should only be
done in the Foreman thread. Since the PolicyEnforcer runs as part of the
Foreman thread, this change should be safe.
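
The ownership arrangement described above can be sketched roughly as follows.
This is an illustration only, not the code in the patch: all Sketch* types and
their members are hypothetical, and it assumes the completion message carries
the ID of the worker that executed the work order. Everything here runs on one
thread, mirroring the point that the directory is only touched from the
Foreman thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical completion message; assumed to identify the worker.
struct SketchCompletionMessage {
  std::size_t worker_id;
};

// Hypothetical, non-thread-safe per-worker counters.
class SketchWorkerDirectory {
 public:
  explicit SketchWorkerDirectory(std::size_t num_workers)
      : num_queued_(num_workers, 0) {}

  void increment(std::size_t worker) { ++num_queued_.at(worker); }

  void decrement(std::size_t worker) {
    assert(num_queued_.at(worker) > 0);
    --num_queued_.at(worker);
  }

  int numQueued(std::size_t worker) const { return num_queued_.at(worker); }

 private:
  std::vector<int> num_queued_;
};

// Decodes completion messages and updates the directory through a
// non-owning pointer handed in by the foreman.
class SketchPolicyEnforcer {
 public:
  explicit SketchPolicyEnforcer(SketchWorkerDirectory *directory)
      : directory_(directory) {}

  void processCompletion(const SketchCompletionMessage &message) {
    directory_->decrement(message.worker_id);
  }

 private:
  SketchWorkerDirectory *directory_;  // Not owned.
};

// Owns the directory, increments on dispatch, and forwards completions to
// the enforcer, which performs the matching decrement.
class SketchForeman {
 public:
  explicit SketchForeman(std::size_t num_workers)
      : directory_(num_workers), enforcer_(&directory_) {}

  void dispatch(std::size_t worker) { directory_.increment(worker); }

  void onCompletion(const SketchCompletionMessage &message) {
    enforcer_.processCompletion(message);
  }

  int numQueued(std::size_t worker) const {
    return directory_.numQueued(worker);
  }

 private:
  // Declared before enforcer_ so it is constructed first.
  SketchWorkerDirectory directory_;
  SketchPolicyEnforcer enforcer_;
};
```

Passing a raw, non-owning pointer keeps the sketch close to the description in
this PR; the foreman object must outlive the enforcer, which holds trivially
here because the foreman owns both.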
Tested on a CloudLab machine (40 workers) with a few example queries.
[A big thanks to @rogersjeffreyl for helping me debug this!]
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-quickstep fix_scheduler
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-quickstep/pull/33.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #33
----
commit 74e49fa7f91ab33e20d488ef9923b285214bc04e
Author: Navneet Potti <[email protected]>
Date: 2016-06-15T02:52:25Z
BugFix: Update NumQueuedWorkOrders to fix scheduling
----