Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Giulio Eulisse

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/
---

Review request for Aurora.


Repository: aurora


Description
---

This allows the aurora client to connect to servers that sit behind a
frontend which expects a cookie to authenticate and authorize users.

A `cookie_jar` option can be specified in `~/.aurora/clusters.json` to
point at the file where the cookie jar is located. The cookie jar, in
MozillaCookieJar format, is used to create the session, so all
subsequent requests will use it.
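For reference, the described mechanism can be sketched as follows. This is illustrative only, not the patched scheduler_client.py code: Python 3's `http.cookiejar` stands in here for the Python 2 `cookielib` module the patch uses, and `make_session` is a hypothetical helper name.

```python
# Sketch: back a requests session with a Mozilla-format cookie jar so every
# request made through the session carries the frontend's auth cookies.
from http.cookiejar import MozillaCookieJar
from os.path import expanduser

import requests


def make_session(cookie_jar_path):
    """Create a requests session whose cookies are loaded from a cookie jar file."""
    jar = MozillaCookieJar(expanduser(cookie_jar_path))
    # ignore_discard/ignore_expires also keep session cookies that would
    # otherwise be dropped when loading the file.
    jar.load(ignore_discard=True, ignore_expires=True)
    session = requests.Session()
    session.cookies = jar  # all subsequent requests send these cookies
    return session
```

Assuming `requests` is installed, a session built this way carries the jar's cookies on every subsequent request, matching the behaviour described above.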


Diffs
-

  src/main/python/apache/aurora/client/api/scheduler_client.py 
cbdb50ae409b70a35a03405f969d02a6145c9c53 

Diff: https://reviews.apache.org/r/51893/diff/


Testing
---

$ cat ~/aurora/clusters.json
[
{
  "name": "build",
  "scheduler_uri": "https://aliaurora.cern.ch",
  "auth_mechanism": "UNAUTHENTICATED",
  "cookie_jar": "~/.aurora-token"
}
]
$ dist/aurora.pex quota get build/root


Thanks,

Giulio Eulisse



Re: Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/#review148913
---



Master (5069f93) is red with this patch.
  ./build-support/jenkins/build.sh

+ popd
/home/jenkins/jenkins-slave/workspace/AuroraBot
+ exec /usr/bin/python2.7 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/virtualenv-15.0.2/virtualenv.py
 --no-download 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv
New python executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python2.7
Also creating executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python
Installing setuptools, pip, wheel...done.
Collecting isort==4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading isort-4.0.0-py2.py3-none-any.whl
Installing collected packages: isort
Successfully installed isort-4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
ERROR: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py
 Imports are incorrectly sorted.
--- 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:before
 2016-09-14 15:51:04.589051
+++ 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:after
  2016-09-14 15:56:43.672696
@@ -12,17 +12,15 @@
 # limitations under the License.
 #
 
+import cookielib
 import functools
+import sys
 import threading
 import time
 import traceback
-import sys
+from os.path import expanduser
 
 import requests
-import cookielib
-
-from os.path import expanduser
-
 from pystachio import Default, Integer, String
 from thrift.protocol import TJSONProtocol
 from thrift.transport import TTransport


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 3:50 p.m., Giulio Eulisse wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51893/
> ---
> 
> (Updated Sept. 14, 2016, 3:50 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and WarnerSM WarnerSM.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This allows the aurora client to connect to servers that sit behind a
> frontend which expects a cookie to authenticate and authorize users.
> 
> A `cookie_jar` option can be specified in `~/.aurora/clusters.json` to
> point at the file where the cookie jar is located. The cookie jar, in
> MozillaCookieJar format, is used to create the session, so all
> subsequent requests will use it.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> cbdb50ae409b70a35a03405f969d02a6145c9c53 
> 
> Diff: https://reviews.apache.org/r/51893/diff/
> 
> 
> Testing
> ---
> 
> $ cat ~/aurora/clusters.json
> [
> {
>   "name": "build",
>   "scheduler_uri": "https://aliaurora.cern.ch",
>   "auth_mechanism": "UNAUTHENTICATED",
>   "cookie_jar": "~/.aurora-token"
> }
> ]
> $ dist/aurora.pex quota get build/root
> 
> 
> Thanks,
> 
> G

Re: Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Giulio Eulisse

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/
---

(Updated Sept. 14, 2016, 4:03 p.m.)


Review request for Aurora, Joshua Cohen and WarnerSM WarnerSM.


Changes
---

@ReviewBot retry


Repository: aurora


Description
---

This allows the aurora client to connect to servers that sit behind a
frontend which expects a cookie to authenticate and authorize users.

A `cookie_jar` option can be specified in `~/.aurora/clusters.json` to
point at the file where the cookie jar is located. The cookie jar, in
MozillaCookieJar format, is used to create the session, so all
subsequent requests will use it.


Diffs (updated)
-

  src/main/python/apache/aurora/client/api/scheduler_client.py 
cbdb50ae409b70a35a03405f969d02a6145c9c53 

Diff: https://reviews.apache.org/r/51893/diff/


Testing
---

$ cat ~/aurora/clusters.json
[
{
  "name": "build",
  "scheduler_uri": "https://aliaurora.cern.ch",
  "auth_mechanism": "UNAUTHENTICATED",
  "cookie_jar": "~/.aurora-token"
}
]
$ dist/aurora.pex quota get build/root


Thanks,

Giulio Eulisse



Re: Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/#review148914
---



Master (5069f93) is red with this patch.
  ./build-support/jenkins/build.sh

virtualenv-15.0.2/virtualenv_embedded/activate_this.py
virtualenv-15.0.2/virtualenv_embedded/deactivate.bat
virtualenv-15.0.2/virtualenv_embedded/distutils-init.py
virtualenv-15.0.2/virtualenv_embedded/distutils.cfg
virtualenv-15.0.2/virtualenv_embedded/python-config
virtualenv-15.0.2/virtualenv_embedded/site.py
virtualenv-15.0.2/virtualenv_support/
virtualenv-15.0.2/virtualenv_support/__init__.py
virtualenv-15.0.2/virtualenv_support/argparse-1.4.0-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/pip-8.1.2-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/setuptools-21.2.1-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/wheel-0.29.0-py2.py3-none-any.whl
+ touch virtualenv-15.0.2/BOOTSTRAPPED
+ popd
/home/jenkins/jenkins-slave/workspace/AuroraBot
+ exec /usr/bin/python2.7 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/virtualenv-15.0.2/virtualenv.py
 --no-download 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv
New python executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python2.7
Also creating executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python
Installing setuptools, pip, wheel...done.
Collecting isort==4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading isort-4.0.0-py2.py3-none-any.whl
Installing collected packages: isort
Successfully installed isort-4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
ERROR: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py
 Imports are incorrectly sorted.
--- 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:before
 2016-09-14 16:03:39.984022
+++ 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:after
  2016-09-14 16:10:06.122743
@@ -21,7 +21,6 @@
 from os.path import expanduser
 
 import requests
-
 from pystachio import Default, Integer, String
 from thrift.protocol import TJSONProtocol
 from thrift.transport import TTransport


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 4:03 p.m., Giulio Eulisse wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51893/
> ---
> 
> (Updated Sept. 14, 2016, 4:03 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and WarnerSM WarnerSM.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This allows the aurora client to connect to servers that sit behind a
> frontend which expects a cookie to authenticate and authorize users.
> 
> A `cookie_jar` option can be specified in `~/.aurora/clusters.json` to
> point at the file where the cookie jar is located. The cookie jar, in
> MozillaCookieJar format, will be used to create the session and
> 

Re: Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Giulio Eulisse

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/
---

(Updated Sept. 14, 2016, 4:17 p.m.)


Review request for Aurora, Joshua Cohen and WarnerSM WarnerSM.


Changes
---

@ReviewBot retry


Repository: aurora


Description
---

This allows the aurora client to connect to servers that sit behind a
frontend which expects a cookie to authenticate and authorize users.

A `cookie_jar` option can be specified in `~/.aurora/clusters.json` to
point at the file where the cookie jar is located. The cookie jar, in
MozillaCookieJar format, is used to create the session, so all
subsequent requests will use it.


Diffs (updated)
-

  src/main/python/apache/aurora/client/api/scheduler_client.py 
cbdb50ae409b70a35a03405f969d02a6145c9c53 

Diff: https://reviews.apache.org/r/51893/diff/


Testing
---

$ cat ~/aurora/clusters.json
[
{
  "name": "build",
  "scheduler_uri": "https://aliaurora.cern.ch",
  "auth_mechanism": "UNAUTHENTICATED",
  "cookie_jar": "~/.aurora-token"
}
]
$ dist/aurora.pex quota get build/root


Thanks,

Giulio Eulisse



Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'aurora'

2016-09-14 Thread Joshua Cohen


> On Sept. 14, 2016, 12:11 a.m., Zameer Manji wrote:
> > I support this change as a developer.
> > 
> > As an operator I am scared.
> > 
> > What happens to an existing cluster if we don't set `framework_name`? Will 
> > it register another framework_id (bad), or will it fail to register 
> > (better)?
> 
> Santhosh Kumar Shanmugham wrote:
> The restarting framework will be treated like a scheduler fail-over.
> 
> Zameer Manji wrote:
> The release notes in this patch says
> > Update default value of command line option `-framework_name` to 
> 'aurora'. Please be aware that
>   depending on your usage of Mesos, this will be a backward incompatible 
> change.
>   
> I'm trying to understand the implications of the backwards 
> incompatibility. Will the scheduler fail to register, or will it register 
> under a new framework_id (and then lose track of previous tasks)?

Santhosh, did you verify this in vagrant with a scheduler that already had 
tasks running? If it is backwards compatible then we can probably adjust the 
release notes?


- Joshua


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review148816
---


On Sept. 14, 2016, 12:18 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 14, 2016, 12:18 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>



Re: Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/#review148920
---



Master (5069f93) is red with this patch.
  ./build-support/jenkins/build.sh

virtualenv-15.0.2/virtualenv_embedded/activate_this.py
virtualenv-15.0.2/virtualenv_embedded/deactivate.bat
virtualenv-15.0.2/virtualenv_embedded/distutils-init.py
virtualenv-15.0.2/virtualenv_embedded/distutils.cfg
virtualenv-15.0.2/virtualenv_embedded/python-config
virtualenv-15.0.2/virtualenv_embedded/site.py
virtualenv-15.0.2/virtualenv_support/
virtualenv-15.0.2/virtualenv_support/__init__.py
virtualenv-15.0.2/virtualenv_support/argparse-1.4.0-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/pip-8.1.2-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/setuptools-21.2.1-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/wheel-0.29.0-py2.py3-none-any.whl
+ touch virtualenv-15.0.2/BOOTSTRAPPED
+ popd
/home/jenkins/jenkins-slave/workspace/AuroraBot
+ exec /usr/bin/python2.7 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/virtualenv-15.0.2/virtualenv.py
 --no-download 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv
New python executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python2.7
Also creating executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python
Installing setuptools, pip, wheel...done.
Collecting isort==4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading isort-4.0.0-py2.py3-none-any.whl
Installing collected packages: isort
Successfully installed isort-4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
ERROR: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py
 Imports are incorrectly sorted.
--- 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:before
 2016-09-14 16:23:07.288021
+++ 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:after
  2016-09-14 16:28:38.948190
@@ -21,7 +21,6 @@
 from os.path import expanduser
 
 import requests
-
 from pystachio import Default, Integer, String
 from thrift.protocol import TJSONProtocol
 from thrift.transport import TTransport


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 4:17 p.m., Giulio Eulisse wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51893/
> ---
> 
> (Updated Sept. 14, 2016, 4:17 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and WarnerSM WarnerSM.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This allows the aurora client to connect to servers that sit behind a
> frontend which expects a cookie to authenticate and authorize users.
> 
> A `cookie_jar` option can be specified in `~/.aurora/clusters.json` to
> point at the file where the cookie jar is located. The cookie jar, in
> MozillaCookieJar format, will be used to create the session and
> 

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 1:29 a.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 110
> > 
> >
> > The documentation implies we are returning a boolean but we are 
> > returning a `Result` object. I think you need to update that. I think what 
> > you are saying is that the `Result` object signals through `isCompleted` 
> > that the work should be repeated, right?
> > 
> > (As an aside, maybe we should rename this to `shouldReschedule` or 
> > similar.)
> > 
> > 
> > If work needs to be repeated once the result has been computed, why 
> > can't the caller just access the `BatchWorker` in the `Work` it submits 
> > and do that itself?
> > 
> > That is why can't we extend `apply` to also accept a `BatchWorker` and 
> > the caller can submit another task within the `Work` if that's required?
> > 
> > Alternatively now that `execute` returns a `CompletableFuture`, the 
> > caller could use one of the many methods to execute code when the future is 
> > complete and have that extra code add more work.

Fixed the javadoc and converted `NoResult` into a static field to simplify 
callsite consumption.

As for the second suggestion, I don't like disseminating `BatchWorker`s or work 
item state maintenance outside the `BatchWorker` itself. That would put the 
burden of maintaining and passing around the state on the callsite and may 
create hard-to-reason-about, hard-to-troubleshoot use cases. Consider 
https://reviews.apache.org/r/51763: we would have to use an explicit wrapper 
object to hold on to local data to be able to re-queue it, as the trigger 
context is gone the moment the `BatchWorker` call is placed. Also, queueing 
recurrent homogeneous events via `CompletionStage` chaining does not strike me 
as the right use of the functional style.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148837
---


On Sept. 14, 2016, 12:16 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 12:16 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first of 3 patches intended to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, and so on. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most, as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered th

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/
---

(Updated Sept. 14, 2016, 4:47 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Javadoc changes and minor refactoring.


Repository: aurora


Description
---

This is the first of 3 patches intended to reduce storage write lock 
contention and as such improve overall system write throughput. It introduces 
the `BatchWorker` and migrates the majority of storage writes due to task 
status change events to use `TaskEventBatchWorker`.

#Problem
Our current storage system writes effectively behave as `SERIALIZABLE` 
transaction isolation level in SQL terms. This means all writes require 
exclusive access to the storage and no two transactions can happen in parallel 
[1]. While it certainly simplifies our implementation, it creates a single 
hotspot where multiple threads are competing for the storage write access. This 
type of contention only worsens as the cluster size grows, more tasks are 
scheduled, more status updates are processed, more subscribers are listening to 
status updates, and so on. Eventually, the scheduler throughput (and especially 
task scheduling) becomes degraded to the extent that certain operations wait 
much longer (4x and more) for the lock acquisition than it takes to process 
their payload when inside the transaction. Some ops (like event processing) are 
generally tolerant of these types of delays. Others - not as much. The task 
scheduling suffers the most, as backing up the scheduling queue directly 
affects the Median Time To Assigned (MTTA).

#Remediation
Given the above, it's natural to assume that reducing the number of write 
transactions should help reduce the lock contention. This patch introduces a 
generic `BatchWorker` service that delivers a "best effort" batching approach 
by redirecting multiple individual write requests into a single FIFO queue 
served non-stop by a single dedicated thread. Every batch shares a single write 
transaction thus reducing the number of potential write lock requests. To 
minimize wait-in-queue time, items are dispatched immediately and the max 
number of items is bounded. There are a few `BatchWorker` instances specialized 
on particular workload types: task event processing, cron scheduling and task 
scheduling. Every instance can be tuned independently (max batch size) and 
provides specialized metrics helping to monitor each workload type perf.
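To make the batching idea concrete, here is a minimal sketch in Python (the actual implementation is Java; the class shape is illustrative only, and the shared write transaction is reduced to a callback that receives a whole batch):

```python
# Sketch: a single dedicated thread drains a FIFO queue and runs everything
# currently queued inside one shared "transaction", with a bounded batch size.
import queue
import threading
from concurrent.futures import Future


class BatchWorker:
    def __init__(self, run_in_transaction, max_batch_size=100):
        self._queue = queue.Queue()
        # run_in_transaction models a single storage write transaction that
        # executes a whole list of work items at once.
        self._run_in_transaction = run_in_transaction
        self._max_batch_size = max_batch_size
        threading.Thread(target=self._loop, daemon=True).start()

    def execute(self, work):
        """Enqueue a no-arg callable; the future completes once its batch ran."""
        future = Future()
        self._queue.put((work, future))
        return future

    def _loop(self):
        while True:
            # Block for the first item, then greedily grab whatever is already
            # queued (up to the cap): items are dispatched immediately, so
            # wait-in-queue time stays minimal.
            batch = [self._queue.get()]
            while len(batch) < self._max_batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            # One shared transaction per batch instead of one per item.
            results = self._run_in_transaction([work for work, _ in batch])
            for (_, future), result in zip(batch, results):
                future.set_result(result)
```

In this model, something like `TaskEventBatchWorker` would simply be one such instance tuned (max batch size, metrics) for task status update writes.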

#Results
The proposed approach has been heavily tested in production and delivered the 
best results. The lock contention latencies went down by 2x to 5x depending 
on the cluster load. A number of other approaches were tried but discarded as 
not performing well, or even performing much worse than the current master:
- Clock-driven batch execution - every batch is dispatched on a time schedule
- Max batch with a deadline - a batch is dispatched when max size is reached OR 
a timeout expires
- Various combinations of the above - some `BatchWorkers` are using 
clock-driven execution while others are using max batch with a deadline
- Completely non-blocking (event-based) completion notification - all call 
sites are notified of item completion via a `BatchWorkCompleted` event

Happy to provide more details on the above if interested.

#Upcoming
The introduction of the `BatchWorker` by itself was not enough to substantially 
improve the MTTA. It, however, paves the way for the next phase of scheduling 
perf improvement - taking more than 1 task from a given `TaskGroup` in a single 
scheduling round (coming soon). That improvement wouldn't have been possible 
without first decreasing the lock contention. 

Note: it wasn't easy to have a clean diff split, so some functionality in 
`BatchWorker` (e.g.: `executeWithReplay`) appears to be unused in the current 
patch but will become obvious in part 2 (coming out shortly).  

[1] - 
https://github.com/apache/aurora/blob/f6ac13b169aaad5aad73ef3cc72873781e30a705/src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java#L540-L555


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/BatchWorker.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/SchedulerModule.java 
4a7ef0b1b90607f68d89fe8e207f42c42a8c56a0 
  src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java 
f07746c2b990c1c2235e99f9e4775fc84f9c27b1 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskThrottler.java 
bbd971a2aa8a96cf79edd879ad60e1bebd933d79 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
3c7cda09ab292d696070ca4d9dfedb1f6f71b0fe 
  
src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java 
594bb6219294dcc77d48dcad14e2a6f9caa0c534 
  src/test/java/org/apache/aurora/scheduler/BatchWorkerTest.j

Re: Review Request 51876: @ReviewBot retry Modify executor state transition logic to rely on health checks (if enabled)

2016-09-14 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review148925
---




src/main/python/apache/aurora/executor/common/health_checker.py (line 136)


The reason I send the message here is that in test_health_checker.py we 
expect the _maybe_update_failure_count() method to send the message that 
changes the healthy state of the health checker. See: 
https://github.com/apache/aurora/blob/master/src/test/python/apache/aurora/executor/common/test_health_checker.py#L451

So currently, the flow in the health checker thread is expected to be:
(1)update failure count
(2)check_expire
(3)send_message
(4)sleep
(5)go to (1)

I agree with Maxim that the following flow for the health checker thread 
would make more sense, but it appears we would have to refactor a number of 
tests:
(1)check expired
(2)send message
(3)update failure count
(4)sleep
(5)go to (1)
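For illustration, the preferred ordering could look roughly like this (hypothetical names; the real checker and callback interfaces in health_checker.py differ, and the required-successes logic is simplified away):

```python
# Sketch of the proposed loop order: (1) check whether initial_interval has
# expired, (2) send a state message, (3) update the failure count, (4) sleep,
# (5) repeat. `checker` returns True when a health check passes; `send_message`
# stands in for the executor's status callback.
import threading
import time


class HealthCheckerLoop(threading.Thread):
    def __init__(self, checker, send_message, initial_interval_secs,
                 max_consecutive_failures, interval_secs=1, clock=time):
        super().__init__(daemon=True)
        self._checker = checker
        self._send = send_message
        self._initial_interval = initial_interval_secs
        self._max_failures = max_consecutive_failures
        self._interval = interval_secs
        self._clock = clock  # injectable clock keeps tests deterministic
        self._failures = 0
        self._running_sent = False

    def run(self):
        start = self._clock.time()
        while True:
            # (1) check whether the initial interval has expired
            expired = self._clock.time() - start >= self._initial_interval
            healthy = self._checker()
            # (2) send a message on a state transition
            if healthy and not self._running_sent:
                self._send('TASK_RUNNING')
                self._running_sent = True
            elif not healthy and expired and self._failures + 1 >= self._max_failures:
                self._send('TASK_FAILED')
                return
            # (3) update the failure count
            self._failures = 0 if healthy else self._failures + 1
            # (4) sleep, then (5) repeat
            self._clock.sleep(self._interval)
```

Injecting the clock (rather than summing elapsed time) is also what would let the tests apply a minor interval offset without flakiness.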



src/main/python/apache/aurora/executor/common/health_checker.py (line 158)


Checking whether the time has expired by summing up the elapsed time is not a 
good strategy. I've tried using a clock to check if the initial_interval 
expires in the same thread. However, that causes some flaky tests when we 
apply a minor offset to the check interval in tests. See: 
https://github.com/apache/aurora/blob/master/src/test/python/apache/aurora/executor/common/test_health_checker.py#L114
I'm thinking about doing the expiration check in a separate thread, and 
memoizing whether the initial_interval has expired.


- Kai Huang


On Sept. 14, 2016, 4:43 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 14, 2016, 4:43 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> The executor needs to start executing user content in STARTING and transition 
> to RUNNING when the required number of successful health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health checking is enabled on the vCurrent executor, the health checker 
> will send a "TASK_RUNNING" message when the required number of successful 
> health checks is reached within initial_interval_secs. Conversely, a 
> "TASK_FAILED" message is sent if the health checker fails to reach the 
> required number of successful health checks within that period, or if the 
> maximum number of failed health checks is reached after initial_interval_secs.
> 
> If health checking is disabled on the vCurrent executor, it will send a 
> "TASK_RUNNING" message to the scheduler after the thermos runner has started. 
> In this scenario, the behavior of the vCurrent executor is the same as that of 
> the vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning of 
> initial_interval_secs.
> 
> [TODOs]
> Currently I have fixed all broken tests caused by my changes. However, more 
> tests need to be added to accommodate the executor change. I will send a 
> follow-up review update when I cover more edge cases. In the meantime, any 
> feedback on the implementation and tests is greatly appreciated.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148918
---


Fix it, then Ship it!




LGTM.

A couple of smaller points below, with the largest being the somewhat blurry 
handling of time-based executions.


src/main/java/org/apache/aurora/scheduler/BatchWorker.java (line 197)


`readyItems` is still a leftover from the previous patch version. It could be 
renamed to just `batch`.



src/main/java/org/apache/aurora/scheduler/BatchWorker.java (line 228)


Do you have any particular use case for this metric? 

We already have `batchesProcessed` and `itemsProcessed` and can compute the 
rate of processed items and batches. This seems more robust to me than just 
looking at `lastBatchSize`.



src/main/java/org/apache/aurora/scheduler/scheduling/TaskThrottler.java (lines 
73 - 87)


This code is exposing a slight design smell in my eye: We are using the 
`DelayedExecutor` to trigger the `batchWorker` when a backoff timer has passed. 
However, the latter already provides an internal backoff mechanism.

I'd rather we have just one defined way to handle these 
time-based backoffs:

* either the BatchWorker does not perform any backoff handling at all. This 
seems to be in line with what Zameer has proposed in his last comment,
* or we make the backoff mechanism a first-class feature of the batchWorker 
so that I can inject `backoffStrategy` and `lastBackoffMsec` to be 
used for the `WorkItem` created in `batchWorker.execute`.


- Stephan Erb


On Sept. 14, 2016, 6:47 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 6:47 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized for particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down by 2x to 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well, or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time s

Re: Review Request 51876: @ReviewBot retry Modify executor state transition logic to rely on health checks (if enabled)

2016-09-14 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review148927
---




src/main/python/apache/aurora/executor/common/health_checker.py (line 137)


It seems to me the order doesn't matter that much. However, I still feel 
that Maxim's proposal keeps the expiration and failure count closer to their 
true definitions, and is less ambiguous. The tests should not dictate how we 
write the source code; sometimes we have to refactor them.

Please feel free to weigh in.


- Kai Huang


On Sept. 14, 2016, 4:43 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 14, 2016, 4:43 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> The executor needs to start executing user content in STARTING and transition 
> to RUNNING when the required number of successful health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health checking is enabled on the vCurrent executor, the health checker 
> will send a "TASK_RUNNING" message when the required number of successful 
> health checks is reached within initial_interval_secs. Conversely, a 
> "TASK_FAILED" message is sent if the health checker fails to reach the 
> required number of successful health checks within that period, or if the 
> maximum number of failed health checks is reached after initial_interval_secs.
> 
> If health checking is disabled on the vCurrent executor, it will send a 
> "TASK_RUNNING" message to the scheduler after the thermos runner has started. 
> In this scenario, the behavior of the vCurrent executor is the same as that of 
> the vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning of 
> initial_interval_secs.
> 
> [TODOs]
> Currently I have fixed all broken tests caused by my changes. However, more 
> tests need to be added to accommodate the executor change. I will send a 
> follow-up review update when I cover more edge cases. In the meantime, any 
> feedback on the implementation and tests is greatly appreciated.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Stephan Erb


> On Sept. 14, 2016, 6:56 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskThrottler.java, 
> > lines 76-90
> > 
> >
> > This code is exposing a slight design smell in my eye: We are using the 
> > `DelayedExecutor` to trigger the `batchWorker` when a backoff timer has 
> > passed. However, the latter already provides an internal backoff mechanism.
> > 
> > I'd rather we have just one defined way to handle these 
> > time-based backoffs:
> > 
> > * either the BatchWorker does not perform any backoff handling at all. 
> > This seems to be in line with what Zameer has proposed in his last comment,
> > * or we make the backoff mechanism a first-class feature of the 
> > batchWorker so that I can inject `backoffStrategy` and 
> > `lastBackoffMsec` to be used for the `WorkItem` created in 
> > `batchWorker.execute`.

The same applies to `startGroup(final TaskGroup group)` in Part 3: a backoff 
mechanism is used to call the batchWorker, which could provide a backoff on its 
own.


- Stephan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148918
---


On Sept. 14, 2016, 6:47 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 6:47 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized for particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down by 2x to 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well, or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a deadline
> - Completely non-blocking (event-based) completion notification - all call 
> sites are notified of item completion via a `BatchWorkCompleted` event
> 
> Happy to provide more details on th

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 4:56 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskThrottler.java, 
> > lines 76-90
> > 
> >
> > This code is exposing a slight design smell in my eye: We are using the 
> > `DelayedExecutor` to trigger the `batchWorker` when a backoff timer has 
> > passed. However, the latter already provides an internal backoff mechanism.
> > 
> > I'd rather we have just one defined way to handle these 
> > time-based backoffs:
> > 
> > * either the BatchWorker does not perform any backoff handling at all. 
> > This seems to be in line with what Zameer has proposed in his last comment,
> > * or we make the backoff mechanism a first-class feature of the 
> > batchWorker so that I can inject `backoffStrategy` and 
> > `lastBackoffMsec` to be used for the `WorkItem` created in 
> > `batchWorker.execute`.
> 
> Stephan Erb wrote:
> The same applies to `startGroup(final TaskGroup group)` in Part 3: a backoff 
> mechanism is used to call the batchWorker, which could provide a backoff on 
> its own.

The `BatchWorker` currently supports only 2 use cases:
1. Simple no-repeat work item execution (`BatchWorker.execute()`)
2. Repeatable work item execution (`BatchWorker.executeWithReplay()`)

I don't see why `BatchWorker` has to use `BackoffStrategy` everywhere or not 
use it at all. There is a clear use case for it in #2 and absolutely no purpose 
in #1.

What you are suggesting adds yet another case of something like 
`BatchWorker.scheduleExecution()`, which goes beyond what's immediately 
necessary to address the above cited goals. I don't mind considering further 
improvements to the `BatchWorker` outside of this patch scope but I'd like to 
keep it concise to avoid feature creep.
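The two supported use cases can be sketched with a simplified model. `BatchWorkerSketch` and its method names are stand-ins chosen for illustration, not the actual Java `BatchWorker` API in the patch; the replay case is where a `BackoffStrategy` would naturally apply.

```python
from collections import deque

class BatchWorkerSketch:
    """Simplified stand-in for the two BatchWorker use cases described above:
    execute() runs a work item once; execute_with_replay() re-enqueues the
    item until it reports completion (where a BackoffStrategy would apply)."""

    def __init__(self):
        self._queue = deque()

    def execute(self, work):
        # Use case 1: simple, no-repeat work item execution.
        self._queue.append(('once', work))

    def execute_with_replay(self, work):
        # Use case 2: repeatable work item; `work` returns True when done.
        self._queue.append(('replay', work))

    def process_batch(self):
        """Drain the current queue as one batch (one shared write transaction)."""
        batch, self._queue = self._queue, deque()
        for kind, work in batch:
            done = work()
            if kind == 'replay' and not done:
                # A real implementation would apply a BackoffStrategy delay
                # before the item becomes eligible again.
                self._queue.append((kind, work))
```

In this model, a one-shot item never reappears, while a replayed item stays in the queue across batches until its callable signals completion.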


> On Sept. 14, 2016, 4:56 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 228
> > 
> >
> > Do you have any particular usecase for this metric? 
> > 
> > We already have `batchesProcessed` and `itemsProcessed` and can compute 
> > the rate of processed items and batches. This seems more robust to me than 
> > just looking at `lastBatchSize`.

It's more of a convenience metric, similar to how `TimedInterceptor` exposes 
both `total_nanos` and `total_events` in addition to `events_per_sec`. If you 
feel strongly about removing the redundancy here, I am OK with dropping it. I 
just thought having a single metric to track the effective batch size might be 
helpful.


> On Sept. 14, 2016, 4:56 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 197
> > 
> >
> > `readyItems` is still a left over from the previous patch version. 
> > Could be renamed to just `batch`.

Thanks. Fixed.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148918
---


On Sept. 14, 2016, 4:47 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 4:47 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling s

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148931
---


Ship it!




Master (5069f93) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 4:47 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 4:47 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized for particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down by 2x to 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well, or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a deadline
> - Completely non-blocking (event-based) completion notification - all call 
> sites are notified of item completion via a `BatchWorkCompleted` event
> 
> Happy to provide more details on the above if interested.
> 
> #Upcoming
> The introduction of the `BatchWorker` by itself was not enough to 
> substantially improve the MTTA. It, however, paves the way for the next phase 
> of scheduling perf improvement - taking more than 1 task from a given 
> `TaskGroup` in a single scheduling round (coming soon). That improvement 
> wouldn't deliver without decreasing the lock contention first. 
> 
> Note: it wasn't easy to have a clean diff split, so some functionality in 
> `BatchWorker` (e.g.: `executeWithReplay`) appears to be unused in the current 
> patch but will become obvious in the part 2 (coming out shortly).  
> 
> [1] - 
> https://github.com/apache/aurora/blob/f6ac13b169aaad5aad73ef3cc72873781e30a705/src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java#L540-L555
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/BatchWorker.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/Schedul

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/
---

(Updated Sept. 14, 2016, 5:25 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Renaming `readyItems` to `batch`.


Repository: aurora


Description
---

This is the first (out of 3) patches intending to reduce storage write lock 
contention and as such improve overall system write throughput. It introduces 
the `BatchWorker` and migrates the majority of storage writes due to task 
status change events to use `TaskEventBatchWorker`.

#Problem
Our current storage system writes effectively behave as `SERIALIZABLE` 
transaction isolation level in SQL terms. This means all writes require 
exclusive access to the storage and no two transactions can happen in parallel 
[1]. While it certainly simplifies our implementation, it creates a single 
hotspot where multiple threads are competing for the storage write access. This 
type of contention only worsens as the cluster size grows, more tasks are 
scheduled, more status updates are processed, more subscribers are listening to 
status updates, etc. Eventually, the scheduler throughput (and especially 
task scheduling) becomes degraded to the extent that certain operations wait 
much longer (4x and more) for the lock acquisition than it takes to process 
their payload when inside the transaction. Some ops (like event processing) are 
generally tolerant of these types of delays. Others - not as much. The task 
scheduling suffers the most as backing up the scheduling queue directly affects 
the Median Time To Assigned (MTTA).

#Remediation
Given the above, it's natural to assume that reducing the number of write 
transactions should help reduce the lock contention. This patch introduces a 
generic `BatchWorker` service that delivers a "best effort" batching approach 
by redirecting multiple individual write requests into a single FIFO queue 
served non-stop by a single dedicated thread. Every batch shares a single write 
transaction thus reducing the number of potential write lock requests. To 
minimize wait-in-queue time, items are dispatched immediately and the max 
number of items is bounded. There are a few `BatchWorker` instances specialized 
for particular workload types: task event processing, cron scheduling, and task 
scheduling. Every instance can be tuned independently (max batch size) and 
provides specialized metrics that help monitor each workload type's performance.
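The batching idea described above can be sketched as follows. This is a minimal, hypothetical Python model (the real `BatchWorker` in this patch is Java and differs in detail): individual writes are queued, and a single dedicated loop drains up to `max_batch_size` items per iteration, running each batch inside one shared "transaction" and signalling each caller on completion.

```python
import queue
import threading

def run_batch_worker(work_queue, transact, max_batch_size, stop):
    """Drain up to max_batch_size (work, done_event) pairs per iteration and
    run them inside a single shared `transact` call (one write transaction
    per batch), setting each item's event so its caller can stop waiting."""
    while not stop.is_set():
        try:
            first = work_queue.get(timeout=0.1)  # block for the first item
        except queue.Empty:
            continue
        batch = [first]
        # Opportunistically take more items, bounded by max_batch_size.
        while len(batch) < max_batch_size:
            try:
                batch.append(work_queue.get_nowait())
            except queue.Empty:
                break
        def run_all():
            for work, done in batch:
                work()
                done.set()  # unblock the caller waiting on this item
        transact(run_all)  # one shared write transaction for the whole batch
```

With N queued items and a batch bound of B, the worker needs at least ceil(N/B) transactions instead of N, which is the lock-contention saving the description refers to.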

#Results
The proposed approach has been heavily tested in production and delivered the 
best results. The lock contention latencies went down by 2x to 5x depending on 
the cluster load. A number of other approaches were tried but discarded as not 
performing well, or even performing much worse than the current master:
- Clock-driven batch execution - every batch is dispatched on a time schedule
- Max batch with a deadline - a batch is dispatched when max size is reached OR 
a timeout expires
- Various combinations of the above - some `BatchWorkers` are using 
clock-driven execution while others are using max batch with a deadline
- Completely non-blocking (event-based) completion notification - all call 
sites are notified of item completion via a `BatchWorkCompleted` event

Happy to provide more details on the above if interested.

#Upcoming
The introduction of the `BatchWorker` by itself was not enough to substantially 
improve the MTTA. It, however, paves the way for the next phase of scheduling 
perf improvement - taking more than 1 task from a given `TaskGroup` in a single 
scheduling round (coming soon). That improvement wouldn't deliver without 
decreasing the lock contention first. 

Note: it wasn't easy to produce a clean diff split, so some functionality in 
`BatchWorker` (e.g. `executeWithReplay`) appears to be unused in the current 
patch; its purpose will become obvious in part 2 (coming out shortly).

[1] - 
https://github.com/apache/aurora/blob/f6ac13b169aaad5aad73ef3cc72873781e30a705/src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java#L540-L555


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/BatchWorker.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/SchedulerModule.java 
4a7ef0b1b90607f68d89fe8e207f42c42a8c56a0 
  src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java 
f07746c2b990c1c2235e99f9e4775fc84f9c27b1 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskThrottler.java 
bbd971a2aa8a96cf79edd879ad60e1bebd933d79 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
3c7cda09ab292d696070ca4d9dfedb1f6f71b0fe 
  
src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java 
594bb6219294dcc77d48dcad14e2a6f9caa0c534 
  src/test/java/org/apache/aurora/scheduler/BatchWorkerTest.java P

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148936
---



Master (5069f93) is red with this patch.
  ./build-support/jenkins/build.sh


:processTestResources
:testClasses
:checkstyleTest
:findbugsJmh
:findbugsMain
:findbugsTest
:licenseJmh UP-TO-DATE
:licenseMain UP-TO-DATE
:licenseTest UP-TO-DATE
:license UP-TO-DATE
:pmdJmh
:pmdMain
:pmdTest
:test

org.apache.aurora.scheduler.BatchWorkerTest > testExecute FAILED
java.lang.AssertionError at BatchWorkerTest.java:64
I0914 17:38:20.013 [ShutdownHook, SchedulerMain:101] Stopping scheduler 
services. 

1041 tests completed, 1 failed, 2 skipped
:test FAILED
:jacocoTestReport
Coverage report generated: 
file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/jacoco/test/html/index.html
:analyzeReport
Instruction coverage is 0.8872853310356982, but must be greater than 0.89
Branch coverage is 0.7994338859684593, but must be greater than 0.835

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> There were failing tests. See the report at: 
> file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/tests/index.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 8 mins 12.185 secs


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 5:25 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 5:25 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
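The best-effort batching described in the paragraph above can be sketched roughly as follows. This is an illustrative Python sketch, not Aurora's Java `BatchWorker`; the `run_in_transaction` callback and `max_batch_size` parameter are stand-ins for the storage write transaction and the per-instance tuning knob.

```python
import queue
import threading

class BatchWorker:
    """Best-effort batching: one FIFO queue drained non-stop by a single
    dedicated thread; every drained batch shares one write transaction."""

    def __init__(self, run_in_transaction, max_batch_size=32):
        self._queue = queue.Queue()
        self._run_in_transaction = run_in_transaction  # wraps a batch in one write lock
        self._max_batch_size = max_batch_size
        threading.Thread(target=self._loop, daemon=True).start()

    def execute(self, work):
        """Enqueue a no-arg callable; returns an Event set once it has run."""
        done = threading.Event()
        self._queue.put((work, done))
        return done

    def _loop(self):
        while True:
            # Block for the first item, then greedily take more up to the cap:
            # items are dispatched immediately, keeping wait-in-queue time low.
            batch = [self._queue.get()]
            while len(batch) < self._max_batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break

            def run_batch():
                for work, done in batch:
                    work()
                    done.set()

            # One lock acquisition (one write transaction) per batch.
            self._run_in_transaction(run_batch)
```

Each specialized instance (task events, cron, task scheduling) would then be a separately constructed worker with its own `max_batch_size` and metrics.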
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down between 2x and 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a d

Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-14 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review148933
---




src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java (lines 
148 - 156)


I probably don't completely understand the logic, so stupid question ahead:

Couldn't we do that cleanup of JobDataMap in the handler where you 
currently do the `killFollowups.add(key)`? Having this spread out over the 
large `doExecute` function doesn't help understandability very much.



src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java (line 
158)


I will probably find this message confusing if I see it in the logs, in 
particular if I have configured the job with KILL_EXISTING.



src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java (line 
225)


The term "followup" is rather generic and difficuelt to understand if seen 
in isolation in the log. You should at least mention it is about killing a cron 
job.


- Stephan Erb


On Sept. 14, 2016, 2:23 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> ---
> 
> (Updated Sept. 14, 2016, 2:23 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the second part of the `BatchWorker` conversion work that moves cron 
> jobs to use non-blocking kill followups and reduces the number of trigger 
> threads. See https://reviews.apache.org/r/51759 for more background on the 
> `BatchWorker`.
> 
> #Problem
> The current implementation of the cron scheduling relies on a large number of 
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
> killing existing tasks according to `KILL_EXISTING` collision policy. This 
> creates large spikes of activities at synchronized intervals as users tend to 
> schedule their cron runs around similar schedules. Moreover, the current 
> implementation re-acquires write locks multiple times to deliver on 
> `KILL_EXISTING` policy. 
> 
> #Remediation
> Trigger level batching is still done in a blocking way but multiple cron 
> triggers may be bundled together to share the same write transaction. Any 
> followups, however, are performed in a non-blocking way by relying on a 
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. 
> In order to still ensure non-concurrent execution of a given job key trigger, 
> a token (job key) is saved within the trigger itself. A concurrent trigger 
> will bail if a kill followup is still in progress (token is set AND no entry 
> in `killFollowups` set exists yet).
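The token guard described in the quoted paragraph can be sketched as follows. This is an illustrative Python sketch; `kill_followups` and the token mirror the names in the review, while the Quartz trigger and `BatchWorker` machinery are elided.

```python
class CronTriggerGuard:
    """Ensures non-concurrent execution of a given job key's trigger:
    a concurrent trigger bails if a kill followup is still in flight."""

    def __init__(self):
        self.kill_followups = set()  # job keys whose kill followup completed
        self.pending_token = {}      # job key -> True while a kill is in flight

    def try_trigger(self, job_key):
        # Bail if the token is set AND no entry in kill_followups exists yet:
        # the previous trigger is still killing existing tasks.
        if self.pending_token.get(job_key) and job_key not in self.kill_followups:
            return False
        self.kill_followups.discard(job_key)
        self.pending_token[job_key] = True
        # ... enqueue the kill non-blockingly, e.g. via
        # BatchWorker.executeWithReplay() (elided) ...
        return True

    def on_kill_complete(self, job_key):
        # BatchWorkCompleted notification: allow the next trigger to proceed.
        self.kill_followups.add(job_key)
```

A second trigger arriving before `on_kill_complete` fires returns `False` and is skipped, which is what allows the trigger thread pool to shrink.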
> 
> #Results
> The above approach allowed reducing the number of cron threads to 10 and 
> likely can be reduced even further. See https://reviews.apache.org/r/51759 
> for the lock contention results.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
> 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
> bc30990d57f444f7d64805ed85c363f1302736d0 
>   config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
> 155d702d68367b247dd066f773c662407f0e3b5b 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
> 5c64ff2994e200b3453603ac5470e8e152cebc55 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 
> 
> Diff: https://reviews.apache.org/r/51763/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-14 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/#review148941
---


Ship it!




Seems to be a straight forward usage of the BatchWorker as implemented in Part 
1.

- Stephan Erb


On Sept. 14, 2016, 2:29 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51765/
> ---
> 
> (Updated Sept. 14, 2016, 2:29 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the final part of the `BatchWorker` conversion work that converts 
> `TaskScheduler`. See https://reviews.apache.org/r/51759 for more background 
> on the `BatchWorker`.
> 
> #Problem
> See https://reviews.apache.org/r/51759
> 
> #Remediation
> Task scheduling is one of the most dominant users of the write lock. It's 
> also one of the heaviest and the most latency-sensitive. As such, the default 
> max batch size is chosen conservatively low (3) and batch items are executed 
> in a blocking way. 
> 
> BTW, attempting to make task scheduling non-blocking resulted in much worse 
> scheduling performance. The way our `DBTaskStore` is wired, all async 
> activities, including `EventBus` are bound to use a single async `Executor`, 
> which is currently limited at 8 threads [1]. Relying on the same `EventBus` 
> to deliver scheduling completion events resulted in slower scheduling perf as 
> those events were backed up behind all other activities, including task 
> status events, reconciliation, etc. Increasing the executor thread pool 
> size to a larger number on the other side, also increased the lock contention 
> defeating the whole purpose of this work.
> 
> #Results
> See https://reviews.apache.org/r/51759 for the lock contention results.
> 
> [1] - https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 9d0d40b82653fb923bed16d06546288a1576c21d 
>   src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
> 11e8033438ad0808e446e41bb26b3fa4c04136c7 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
> c044ebe6f72183a67462bbd8e5be983eb592c3e9 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
> d266f6a25ae2360db2977c43768a19b1f1efe8ff 
>   src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
> c2ceb4e7685a9301f8014a9183e02fbad65bca26 
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
> 95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
>   
> src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java
>  72562e6bd9a9860c834e6a9faa094c28600a8fed 
> 
> Diff: https://reviews.apache.org/r/51765/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Giulio Eulisse

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/
---

(Updated Sept. 14, 2016, 6:19 p.m.)


Review request for Aurora, Joshua Cohen and WarnerSM WarnerSM.


Changes
---

@ReviewBot retry


Repository: aurora


Description
---

This allows the Aurora client to connect to servers sitting behind a
frontend which expects some sort of cookie to authenticate and authorize
users.

A `cookie_jar` option can be specified in `~/.aurora/clusters.json` to
point to the file where the cookie jar is located. The cookie jar, in
MozillaCookieJar format, is used to create the session, so all
subsequent requests will send its cookies.
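As a rough sketch of the idea (not the actual `scheduler_client.py` patch; `make_session` and its parameter are illustrative names), loading such a cookie jar into a `requests` session could look like:

```python
import os
from http.cookiejar import MozillaCookieJar  # cookielib.MozillaCookieJar on Python 2

import requests

def make_session(cookie_jar_path=None):
    """Build a requests session; when a cookie jar file is configured,
    load it (Netscape/Mozilla cookies.txt format) so every subsequent
    request made through the session carries its cookies."""
    session = requests.Session()
    if cookie_jar_path:
        jar = MozillaCookieJar(os.path.expanduser(cookie_jar_path))
        jar.load()  # raises if the file is missing or malformed
        session.cookies = jar
    return session
```

With the `clusters.json` below, the configured path would be `~/.aurora-token`, and every scheduler request going through the session would present the stored cookies to the authenticating frontend.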


Diffs (updated)
-

  src/main/python/apache/aurora/client/api/scheduler_client.py 
cbdb50ae409b70a35a03405f969d02a6145c9c53 

Diff: https://reviews.apache.org/r/51893/diff/


Testing
---

$ cat ~/aurora/clusters.json
[
{
  "name": "build",
  "scheduler_uri": "https://aliaurora.cern.ch";,
  "auth_mechanism": "UNAUTHENTICATED",
  "cookie_jar": "~/.aurora-token"
}
]
$ dist/aurora.pex quota get build/root


Thanks,

Giulio Eulisse



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Zameer Manji


> On Sept. 13, 2016, 6:29 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 110
> > 
> >
> > The documentation implies we are returning a boolean but we are 
> > returning a `Result` object. I think you need to update that. I think what 
> > you are saying is that the `Result` object signals through `isCompleted` 
> > that the work should be repeated right?
> > 
> > (As an aside, maybe we should name this to `shouldReschedule` or 
> > similar).
> > 
> > 
> > If work needs to be repeated once the result has been computed, why 
> > can't the caller just access the `BatchWorker` in the `Work` it submits 
> > and do that itself?
> > 
> > That is why can't we extend `apply` to also accept a `BatchWorker` and 
> > the caller can submit another task within the `Work` if that's required?
> > 
> > Alternatively now that `execute` returns a `CompletableFuture`, the 
> > caller could use one of the many methods to execute code when the future is 
> > complete and have that extra code add more work.
> 
> Maxim Khutornenko wrote:
> Fixed the javadoc and converted `NoResult` into a static field to 
> simplify callsite consumption.
> 
> As for the second suggestion, I don't like disseminating `BatchWorkers` 
> or work item state maintenance outside the `BatchWorker` itself. That would 
> put the burden of maintaining and passing around the state to the callsite 
> and may create hard to reason/troubleshoot use cases. Consider the 
> https://reviews.apache.org/r/51763. We would have to use an explicit wrapper 
> object to hold on to local data to be able to re-queue it again as the 
> trigger context is gone the moment `BatchWorker` call is placed. Also, 
> queueing recurrent homogeneous events via `CompletableStage` chaining does 
> not strike me as the right use of the functional style.

Understood.

I spent some time thinking about an alternative and didn't find one that seemed 
to work well, so I'll accept this design. Most solutions I considered involved 
recursive closures. They would look like `queue work` -> `add call back on the 
future` -> `call back calls back into queuing work`. I think that's pretty 
messy and something that could be touched upon later.


- Zameer


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148837
---


On Sept. 14, 2016, 10:25 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 10:25 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatch

Re: Review Request 51893: Allow cookie based authentication

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51893/#review148949
---



Master (5069f93) is red with this patch.
  ./build-support/jenkins/build.sh

virtualenv-15.0.2/virtualenv_embedded/activate_this.py
virtualenv-15.0.2/virtualenv_embedded/deactivate.bat
virtualenv-15.0.2/virtualenv_embedded/distutils-init.py
virtualenv-15.0.2/virtualenv_embedded/distutils.cfg
virtualenv-15.0.2/virtualenv_embedded/python-config
virtualenv-15.0.2/virtualenv_embedded/site.py
virtualenv-15.0.2/virtualenv_support/
virtualenv-15.0.2/virtualenv_support/__init__.py
virtualenv-15.0.2/virtualenv_support/argparse-1.4.0-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/pip-8.1.2-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/setuptools-21.2.1-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/wheel-0.29.0-py2.py3-none-any.whl
+ touch virtualenv-15.0.2/BOOTSTRAPPED
+ popd
/home/jenkins/jenkins-slave/workspace/AuroraBot
+ exec /usr/bin/python2.7 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/virtualenv-15.0.2/virtualenv.py
 --no-download 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv
New python executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python2.7
Also creating executable in 
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python
Installing setuptools, pip, wheel...done.
Collecting isort==4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading isort-4.0.0-py2.py3-none-any.whl
Installing collected packages: isort
Successfully installed isort-4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
ERROR: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py
 Imports are incorrectly sorted.
--- 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:before
 2016-09-14 18:21:03.193042
+++ 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/aurora/client/api/scheduler_client.py:after
  2016-09-14 18:26:10.685426
@@ -21,7 +21,6 @@
 from os.path import expanduser
 
 import requests
-
 from pystachio import Default, Integer, String
 from thrift.protocol import TJSONProtocol
 from thrift.transport import TTransport


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 6:19 p.m., Giulio Eulisse wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51893/
> ---
> 
> (Updated Sept. 14, 2016, 6:19 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and WarnerSM WarnerSM.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This allows aurora client to connect to servers which are behind a
> frontend which expects some sort of cookie to autheticate and authorize
> users.
> 
> A cookie_jar option can be specified in the `~/.aurora/clusters.json`
> to specify a file where the cookie jar is located. Such a cookie jar,
> in MozillaCookieJar format, will be used to create the session and
> 

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148950
---


Fix it, then Ship it!




This patch LGTM. Just a single concern here and some questions. I will be 
moving on to the subsequent patches shortly.
Please do not commit this until all parts are shipped. Also, please don't 
forget to rebase/update subsequent patches if changes are made here.

Also, does this cover all state change consumers that interact on storage when 
the event is fired or is this a subset? Just wondering if this is supposed to 
apply to all of them or just a small set that you think need this change.


src/main/java/org/apache/aurora/scheduler/BatchWorker.java (line 233)


Instead of having an `Optional` and then later checking 
it's present when we compute the backoff, we could simplify the internal 
implementation by having non repeatable work items default to some sort of 
`NoopBackoffStrategy` instead.



src/main/java/org/apache/aurora/scheduler/SchedulerModule.java (line 69)


Just curious, is this default arbitrary or something that was derived from 
your deployment in production?


- Zameer Manji


On Sept. 14, 2016, 10:25 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 10:25 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down between 2x and 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a deadline
> - Completely non-blocking (event-based) completion notification - all call 
> sites are notified of item completion via a `BatchWorkCompleted` event
> 
> Happy to provide more details on the above if i

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'aurora'

2016-09-14 Thread Santhosh Kumar Shanmugham


> On Sept. 13, 2016, 5:11 p.m., Zameer Manji wrote:
> > I support this change as a developer.
> > 
> > As an operator I am scared.
> > 
> > What happens to an existing cluster if we don't set `framework_name`? Will 
> > it register another framework_id? (bad) or will it fail to register? 
> > (better).
> 
> Santhosh Kumar Shanmugham wrote:
> The restarting framework will be treated like a scheduler fail-over.
> 
> Zameer Manji wrote:
> The release notes in this patch says
> > Update default value of command line option `-framework_name` to 
> 'aurora'. Please be aware that
>   depending on your usage of Mesos, this will be a backward incompatible 
> change.
>   
> I'm trying to understand the implications of the backwards 
> incompatibility. Will the scheduler fail to register or will it register 
> under a new framework_id (and then lose track of previous tasks?)
> 
> Joshua Cohen wrote:
> Santhosh, did you verify this in vagrant with a scheduler that already 
> had tasks running? If it is backwards compatible then we can probably adjust 
> the release notes?

Results from testing in a Vagrant cluster:

Renaming framework from 'TwitterScheduler' to 'Aurora':

The framework re-registers after restart (treated by master as failover) and 
gets the same framework-id and performs task reconciliation thereby restoring 
the tasks.

I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
failover
I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with fd 
28: Transport endpoint is not connected
I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'Aurora' at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora with 
checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
I0914 16:49:44.204677  9812 master.cpp:5447] Performing explicit task state 
reconciliation for 1 tasks of framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083

Rolling back framework name to 'TwitterScheduler' from 'Aurora':

Same here.

I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
failover
I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'TwitterScheduler' at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
TwitterScheduler with checkpointing enabled and capabilities [ 
REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed over
I0914 16:51:49.614977  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.615170  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
I0914 16:52:50.315119  9812 master.cpp:5447] Performing explicit task state 
reconciliation for 1 tasks of framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083

Restarting the scheduler after updating the config to 'TwitterScheduler' from 
'Aurora':

Rename did not take effect. The master re-registered the framework to the same 
id and performed a task reconciliation.

I0914 20:11:49.178103 28171 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-c42cd8cf-09a0-4d81-a947-094c4fac601e@192.168

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'aurora'

2016-09-14 Thread Santhosh Kumar Shanmugham


> On Sept. 13, 2016, 5:11 p.m., Zameer Manji wrote:
> > I support this change as a developer.
> > 
> > As an operator I am scared.
> > 
> > What happens to an existing cluster if we don't set `framework_name`? Will 
> > it register another framework_id? (bad) Or will it fail to register? 
> > (better).
> 
> Santhosh Kumar Shanmugham wrote:
> The restarting framework will be treated like a scheduler fail-over.
> 
> Zameer Manji wrote:
> The release notes in this patch say
> > Update default value of command line option `-framework_name` to 
> 'aurora'. Please be aware that
>   depending on your usage of Mesos, this will be a backward incompatible 
> change.
>   
> I'm trying to understand the implications of the backwards 
> incompatibility. Will the scheduler fail to register, or will it register 
> under a new framework_id (and then lose track of previous tasks)?
> 
> Joshua Cohen wrote:
> Santhosh, did you verify this in vagrant with a scheduler that already 
> had tasks running? If it is backwards compatible then we can probably adjust 
> the release notes?
> 
> Santhosh Kumar Shanmugham wrote:
> Results from testing in Vagrant cluster,
> 
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by master as failover) 
> and gets the same framework-id and performs task reconciliation thereby 
> restoring the tasks.
> 
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket 
> with fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to 
> framework 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:49:44.204677  9812 master.cpp:5447] Performing explicit task 
> state reconciliation for 1 tasks of framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Rolling back framework name to 'TwitterScheduler' from 'Aurora':
> 
> Same here.
> 
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'TwitterScheduler' at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
> TwitterScheduler with checkpointing enabled and capabilities [ 
> REVOCABLE_RESOURCES, GPU_RESOURCES ]
> I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed over
> I0914 16:51:49.614977  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.615170  9813 master.cpp:5709] Sending 1 offers to 
> framework 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> I0914 16:52:50.315119  9812 master.cpp:5447] Performing explicit task 
> state reconciliation for 1 tasks of framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> 
> Restarting the scheduler after updating the config to 'TwitterScheduler' 
> from 'Aurora':

Review Request 51899: Ensure shell health checkers running for tasks running under an isolated filesystem are run within that filesystem.

2016-09-14 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51899/
---

Review request for Aurora, Stephan Erb and Zhitao Li.


Repository: aurora


Description
---

Ensure shell health checkers running for tasks running under an isolated 
filesystem are run within that filesystem.
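For context, the general technique here is to run the health-check shell command with a pre-exec hook that enters the task's isolated filesystem before the check executes. The sketch below illustrates the idea only; `run_health_check` and `task_fs_root` are hypothetical names, not the actual Thermos API, and `os.chroot` requires root privileges.

```python
import os
import subprocess

def run_health_check(cmd, task_fs_root=None):
    """Run a shell health check, optionally inside an isolated filesystem.

    task_fs_root is a hypothetical path to the task's filesystem root;
    when given, the child process chroots into it before exec'ing the
    check command (illustrative only -- chroot needs root privileges).
    """
    preexec = None
    if task_fs_root:
        def preexec():
            os.chroot(task_fs_root)  # enter the isolated filesystem
            os.chdir('/')
    proc = subprocess.Popen(cmd, shell=True, preexec_fn=preexec)
    return proc.wait() == 0  # healthy iff the check exits 0
```

With `task_fs_root=None` this degrades to a plain shell check, which matches the non-isolated case.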


Diffs
-

  src/main/python/apache/aurora/common/health_check/shell.py 
35750823553406a96282545066f1291c20347ffa 
  src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
5211f28e4e6c0efd29d7d79058128adb71ec7da8 
  src/main/python/apache/aurora/executor/common/health_checker.py 
5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
  src/main/python/apache/thermos/common/BUILD 
879b812b6a262d6e13b64e662999dd436f039748 
  src/main/python/apache/thermos/common/process_util.py PRE-CREATION 
  src/main/python/apache/thermos/core/process.py 
2134d4ff05861d4eaee9bc7ea4763e76ce63288c 
  src/test/python/apache/aurora/common/health_check/test_shell.py 
011464cbe1df00f2a56d4690176e7c2d0d3fd535 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
290627f8bc38d31ae123cfd1cdd36e9291c2de18 

Diff: https://reviews.apache.org/r/51899/diff/


Testing
---

./build-support/jenkins/build.sh
e2e tests


Thanks,

Joshua Cohen



Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/
---

(Updated Sept. 14, 2016, 1:58 p.m.)


Review request for Aurora, Joshua Cohen and Maxim Khutornenko.


Changes
---

Removing comment about backward incompatibility from RELEASE_NOTES.


Summary (updated)
-

Change framework_name default value from 'TwitterScheduler' to 'Aurora'


Bugs: AURORA-1688
https://issues.apache.org/jira/browse/AURORA-1688


Repository: aurora


Description (updated)
---

Change framework_name default value from 'TwitterScheduler' to 'Aurora'


Diffs (updated)
-

  RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
  
src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
 8a386bd208956eb0c8c2f48874b0c6fb3af58872 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
97677f24a50963178a123b420d7ac136e4fde3fe 

Diff: https://reviews.apache.org/r/51874/diff/


Testing (updated)
---

./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Testing to verify backward compatibility:

Case 1: Rolling forward does not impact running tasks:
Renaming framework from 'TwitterScheduler' to 'Aurora':

The framework re-registers after restart (treated by master as failover) and 
gets the same framework-id. Running tasks remain unaffected.

Master log:
I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
failover
I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with fd 
28: Transport endpoint is not connected
I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'Aurora' at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora with 
checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083

Scheduler log:
I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
value: "071c44a1-b4d4-4339-a727-03a79f725851-"
, master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
ip: 119646400
port: 5050
pid: "master@192.168.33.7:5050"
hostname: "aurora.local"
version: "1.0.0"
address {
  hostname: "aurora.local"
  ip: "192.168.33.7"
  port: 5050
}

Case 2: Rolling backward does not impact running tasks:
Rolling back framework name from 'Aurora' to 'TwitterScheduler':

The framework re-registers after restart (treated by master as failover) and 
gets the same framework-id. Running tasks remain unaffected.

Master log:
I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
failover
I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'TwitterScheduler' at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
TwitterScheduler with checkpointing enabled and capabilities [ 
REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed over
I0914 16:51:49.614977  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.615170  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083

Scheduler log:
I0914 16:51:50.249 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
value: "071c44a1-b4d4-4339-a727-03a79f725851-"
, master: id: "461b98b8-63e1-40e3-9

Re: Review Request 51899: Ensure shell health checkers running for tasks running under an isolated filesystem are run within that filesystem.

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51899/#review148970
---


Ship it!




Master (5069f93) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 8:49 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51899/
> ---
> 
> (Updated Sept. 14, 2016, 8:49 p.m.)
> 
> 
> Review request for Aurora, Stephan Erb and Zhitao Li.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Ensure shell health checkers running for tasks running under an isolated 
> filesystem are run within that filesystem.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/common/health_check/shell.py 
> 35750823553406a96282545066f1291c20347ffa 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> 5211f28e4e6c0efd29d7d79058128adb71ec7da8 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/thermos/common/BUILD 
> 879b812b6a262d6e13b64e662999dd436f039748 
>   src/main/python/apache/thermos/common/process_util.py PRE-CREATION 
>   src/main/python/apache/thermos/core/process.py 
> 2134d4ff05861d4eaee9bc7ea4763e76ce63288c 
>   src/test/python/apache/aurora/common/health_check/test_shell.py 
> 011464cbe1df00f2a56d4690176e7c2d0d3fd535 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 
> 290627f8bc38d31ae123cfd1cdd36e9291c2de18 
> 
> Diff: https://reviews.apache.org/r/51899/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> e2e tests
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review148971
---


Ship it!




Thanks for the detailed testing! LGTM. It seems like we can rename from 
`TwitterScheduler` to `Aurora` without any risk whatsoever.

- Zameer Manji


On Sept. 14, 2016, 1:58 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 14, 2016, 1:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'Aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> Testing to verify backward compatibility:
> 
> Case 1: Rolling forward does not impact running tasks:
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with 
> fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Scheduler log:
> I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
> value: "071c44a1-b4d4-4339-a727-03a79f725851-"
> , master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
> ip: 119646400
> port: 5050
> pid: "master@192.168.33.7:5050"
> hostname: "aurora.local"
> version: "1.0.0"
> address {
>   hostname: "aurora.local"
>   ip: "192.168.33.7"
>   port: 5050
> }
> 
> Case 2: Rolling backward does not impact running tasks:
> Rolling back framework name from 'Aurora' to 'TwitterScheduler':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'TwitterScheduler' at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
> TwitterScheduler with checkpointing enabled and capabilities [ 
> REVOCABLE_RESOURCES, GPU_RESOURCES ]
> I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed over
> I0914 16:51:49.614977  9813 hi

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review148972
---


Ship it!




Master (5069f93) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 8:58 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 14, 2016, 8:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'Aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> Testing to verify backward compatibility:
> 
> Case 1: Rolling forward does not impact running tasks:
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with 
> fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Scheduler log:
> I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
> value: "071c44a1-b4d4-4339-a727-03a79f725851-"
> , master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
> ip: 119646400
> port: 5050
> pid: "master@192.168.33.7:5050"
> hostname: "aurora.local"
> version: "1.0.0"
> address {
>   hostname: "aurora.local"
>   ip: "192.168.33.7"
>   port: 5050
> }
> 
> Case 2: Rolling backward does not impact running tasks:
> Rolling back framework name from 'Aurora' to 'TwitterScheduler':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'TwitterScheduler' at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
> TwitterScheduler with checkpointing enabled and capabilities [ 
> REVOCABLE_RESOURCES, GPU_RESOURCES ]
> I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed 

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review148973
---


Ship it!




Ship It!

- Joshua Cohen


On Sept. 14, 2016, 8:58 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 14, 2016, 8:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'Aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> Testing to verify backward compatibility:
> 
> Case 1: Rolling forward does not impact running tasks:
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with 
> fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Scheduler log:
> I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
> value: "071c44a1-b4d4-4339-a727-03a79f725851-"
> , master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
> ip: 119646400
> port: 5050
> pid: "master@192.168.33.7:5050"
> hostname: "aurora.local"
> version: "1.0.0"
> address {
>   hostname: "aurora.local"
>   ip: "192.168.33.7"
>   port: 5050
> }
> 
> Case 2: Rolling backward does not impact running tasks:
> Rolling back framework name from 'Aurora' to 'TwitterScheduler':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'TwitterScheduler' at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
> TwitterScheduler with checkpointing enabled and capabilities [ 
> REVOCABLE_RESOURCES, GPU_RESOURCES ]
> I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed over
> I0914 16:51:49.614977  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.615170  9813 master.

Review Request 51900: Add reconciliation_explicit_batch_size scheduler option to the documentation

2016-09-14 Thread Karthik Anantha Padmanabhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51900/
---

Review request for Aurora and Joshua Cohen.


Repository: aurora


Description
---

The reconciliation_explicit_batch_size option was missing from the scheduler 
configuration documentation. Given that this option is referenced in the 
`aurora_admin` command `reconcile_tasks`, it will be good to have it documented.


Diffs
-

  docs/reference/scheduler-configuration.md 
31be71491de00e53ce4764e480ff2728d06a3f1d 

Diff: https://reviews.apache.org/r/51900/diff/


Testing
---

./build-support/jenkins/build.sh


Thanks,

Karthik Anantha Padmanabhan



Re: Review Request 51900: Add reconciliation_explicit_batch_size scheduler option to the documentation

2016-09-14 Thread Karthik Anantha Padmanabhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51900/
---

(Updated Sept. 14, 2016, 9:50 p.m.)


Review request for Aurora and Joshua Cohen.


Repository: aurora


Description (updated)
---

The reconciliation_explicit_batch_size option was missing from the scheduler 
configuration documentation. Given that this option is referenced in the 
`aurora_admin` command `reconcile_tasks`, it will be good to have it documented.


Diffs
-

  docs/reference/scheduler-configuration.md 
31be71491de00e53ce4764e480ff2728d06a3f1d 

Diff: https://reviews.apache.org/r/51900/diff/


Testing
---

./build-support/jenkins/build.sh


Thanks,

Karthik Anantha Padmanabhan



Re: Review Request 51900: Add reconciliation_explicit_batch_size scheduler option to the documentation

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51900/#review148978
---


Ship it!




Master (5069f93) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 9:50 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51900/
> ---
> 
> (Updated Sept. 14, 2016, 9:50 p.m.)
> 
> 
> Review request for Aurora and Joshua Cohen.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> The reconciliation_explicit_batch_size option was missing from the scheduler 
> configuration documentation. Given that this option is referenced in the 
> `aurora_admin` command `reconcile_tasks`, it will be good to have it documented.
> 
> 
> Diffs
> -
> 
>   docs/reference/scheduler-configuration.md 
> 31be71491de00e53ce4764e480ff2728d06a3f1d 
> 
> Diff: https://reviews.apache.org/r/51900/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51876: @ReviewBot retry Modify executor state transition logic to rely on health checks (if enabled)

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review148937
---




src/main/python/apache/aurora/executor/aurora_executor.py (line 125)


I don't think we should include the `Task is healthy.` message for 
non-health-check-enabled tasks, as we have no visibility into the health of 
such tasks.



src/main/python/apache/aurora/executor/aurora_executor.py (lines 133 - 141)


This code is largely a repetition of what already exists in 
health_checker.py. It'd be great to reuse it (say, from task_info.py).



src/main/python/apache/aurora/executor/common/health_checker.py (lines 40 - 43)


Do we really need this enum? A task that is in the middle of the 
`initial_interval_secs` timeout technically is still healthy, right? I feel 
having a boolean `isHealthy` should be enough to cover all possible scenarios 
as `StatusManager` checks for `isHealthy` AND the task status itself to invoke 
a callback.



src/main/python/apache/aurora/executor/common/health_checker.py (line 69)


This comment needs update.



src/main/python/apache/aurora/executor/common/health_checker.py (line 103)


`...if is_healthy else...`



src/main/python/apache/aurora/executor/common/health_checker.py (lines 136 - 
150)


If you get rid of the `HealthCheckStatus` enum I think all we need is this:
```
if self.initial_in_progress:
  if self.current_consecutive_health_checks >= self.min_consecutive_health_checks:
    self.healthy = True
else:
  self.healthy = self.current_consecutive_failures <= self.max_consecutive_failures
  self.reason = reason


@property
def initial_in_progress(self):
  if not self._expired:
    if self.elapsed_time <= self.initial_interval:
      return True
    else:
      self._expired = True
  return not self._expired
```
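
For what it's worth, the sketch above can be made runnable in isolation. The
class below is an illustration only: the `HealthCheckObserver` name, the
`on_check` driver method, and the thresholds are assumptions for this sketch,
not the real health_checker.py API.

```python
class HealthCheckObserver(object):
  """Runnable illustration of the healthy/initial-interval logic sketched above."""

  def __init__(self, initial_interval, min_consecutive_successes,
               max_consecutive_failures):
    self.initial_interval = initial_interval
    self.min_consecutive_successes = min_consecutive_successes
    self.max_consecutive_failures = max_consecutive_failures
    self.current_consecutive_successes = 0
    self.current_consecutive_failures = 0
    self.elapsed_time = 0
    self.healthy = False
    self.reason = None
    self._expired = False

  @property
  def initial_in_progress(self):
    # Latches to False once the initial interval has elapsed.
    if not self._expired and self.elapsed_time > self.initial_interval:
      self._expired = True
    return not self._expired

  def on_check(self, success, elapsed_time, reason=None):
    self.elapsed_time = elapsed_time
    if success:
      self.current_consecutive_successes += 1
      self.current_consecutive_failures = 0
    else:
      self.current_consecutive_failures += 1
      self.current_consecutive_successes = 0
    if self.initial_in_progress:
      if self.current_consecutive_successes >= self.min_consecutive_successes:
        self.healthy = True
        self._expired = True  # pre-terminate the initial interval
    else:
      self.healthy = (self.current_consecutive_failures
                      <= self.max_consecutive_failures)
      self.reason = reason

obs = HealthCheckObserver(initial_interval=15,
                          min_consecutive_successes=2,
                          max_consecutive_failures=1)
obs.on_check(True, elapsed_time=5)    # first success: not yet healthy
first = obs.healthy
obs.on_check(True, elapsed_time=10)   # second consecutive success: healthy
second = obs.healthy
obs.on_check(False, elapsed_time=20)  # one failure: still within tolerance
obs.on_check(False, elapsed_time=25, reason='timeout')  # second: unhealthy
```

Note how reaching the required successes pre-terminates the initial interval,
which is the point discussed later in this thread.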



src/main/python/apache/aurora/executor/common/health_checker.py (line 149)


This should be log.info instead.



src/main/python/apache/aurora/executor/common/status_checker.py (line 104)


Not obvious to me: what's the purpose of this condition here?



src/main/python/apache/aurora/executor/status_manager.py (line 60)


How about extending the `StatusManager` to take two callbacks instead? The 
old callback would remain a default one and the new would be an optional one to 
exclusively process 'TASK_RUNNING'. Since it's now status aware, it would make 
sense to consolidate status management here and avoid the if-else switch 
upstream.

Also, I'd use `mesos_pb2.TASK_RUNNING` instead of relying on string const.
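
A minimal sketch of what that two-callback shape might look like. The names
`default_callback`/`running_callback` and the string task states are stand-ins
for illustration; the real code would use the `mesos_pb2` constants.

```python
# String stand-ins for the mesos_pb2 task-state constants.
TASK_RUNNING = 'TASK_RUNNING'
TASK_FAILED = 'TASK_FAILED'

class StatusManager(object):
  """Dispatches TASK_RUNNING to a dedicated callback and everything else to the
  default one, so the if-else switch lives here instead of upstream."""

  def __init__(self, default_callback, running_callback=None):
    self._default_callback = default_callback
    # The TASK_RUNNING handler is optional; fall back to the default callback.
    self._running_callback = running_callback or default_callback

  def handle_status(self, status):
    if status == TASK_RUNNING:
      self._running_callback(status)
    else:
      self._default_callback(status)

seen = []
manager = StatusManager(
    default_callback=lambda s: seen.append(('default', s)),
    running_callback=lambda s: seen.append(('running', s)))
manager.handle_status(TASK_RUNNING)
manager.handle_status(TASK_FAILED)
```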


- Maxim Khutornenko


On Sept. 14, 2016, 4:43 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 14, 2016, 4:43 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when the required number of successful health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on the vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when the required number of successful health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message will be sent if the health checker fails to reach the 
> required number of health checks within that period, or the maximum number of 
> failed health checks is reached after the initial_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will send 
> "TASK_RUNNING" message to schedule

Re: Review Request 51876: @ReviewBot retry Modify executor state transition logic to rely on health checks (if enabled)

2016-09-14 Thread Kai Huang


> On Sept. 14, 2016, 10:03 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/executor/common/status_checker.py, line 104
> > 
> >
> > Not obvious to me: what's the purpose of this condition here?

Oh, I thought HealthChecker is not the last link in the ChainedStatusChecker. 
If it's the last one, then this condition is not necessary.


> On Sept. 14, 2016, 10:03 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/executor/status_manager.py, line 60
> > 
> >
> > How about extending the `StatusManager` to take two callbacks instead? 
> > The old callback would remain a default one and the new would be an 
> > optional one to exclusively process 'TASK_RUNNING'. Since it's now status 
> > aware, it would make sense to consolidate status management here and avoid 
> > the if-else switch upstream.
> > 
> > Also, I'd use `mesos_pb2.TASK_RUNNING` instead of relying on string 
> > const.

Agreed. I had a similar thought initially; now I'm more assured.


> On Sept. 14, 2016, 10:03 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/executor/common/health_checker.py, lines 40-43
> > 
> >
> > Do we really need this enum? A task that is in the middle of the 
> > `initial_interval_secs` timeout technically is still healthy, right? I feel 
> > having a boolean `isHealthy` should be enough to cover all possible 
> > scenarios as `StatusManager` checks for `isHealthy` AND the task status 
> > itself to invoke a callback.

How do we distinguish an in-progress health check from a successful health 
check (completed within initial_interval) if isHealthy is True in both cases? 
Should we add a second variable to indicate that?


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review148937
---


On Sept. 14, 2016, 4:43 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 14, 2016, 4:43 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when the required number of successful health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on the vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when the required number of successful health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message will be sent if the health checker fails to reach the 
> required number of health checks within that period, or the maximum number of 
> failed health checks is reached after the initial_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will send 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning of 
> initial_interval_secs.
> 
> [TODOs]
> Currently I fixed all broken tests caused by my changes. However, more tests 
> need to be added in order to accommodate the executor change. I will send a 
> follow-up review update when I cover more edge cases. But any feedback on the 
> implementation and tests is greatly appreciated.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > This patch LGTM. Just a single concern here and some questions. I will be 
> > moving on to the subsequent patches shortly.
> > Please do not commit this until all parts are shipped. Also, please don't 
> > forget to rebase/update subsequent patches if changes are made here.
> > 
> > Also, does this cover all state change consumers that interact on storage 
> > when the event is fired or is this a subset? Just wondering if this is 
> > supposed to apply to all of them or just a small set that you think need 
> > this change.

> Also, does this cover all state change consumers that interact on storage 
> when the event is fired or is this a subset

This covers all known high-frequency consumers that require a write 
transaction. The list is derived by running a stack-dumping version of the 
`CallOrderEnforcingStorage.write()` in a live cluster for a while and analysing 
top offenders. It may not be fully comprehensive though, so if you find any 
other candidates feel free to report them.


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 233
> > 
> >
> > Instead of having an `Optional` and then later 
> > checking it's present when we compute the backoff, we could simplify the 
> > internal implementation by having non repeatable work items default to some 
> > sort of `NoopBackoffStrategy` instead.

I don't really see how that would be a better solution in this particular case. 
We do need a `BackoffStrategy` in the repeatable work case, and it had better be 
a real one (and hard fail if not provided). Having a noop strategy softens that 
constraint.


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/SchedulerModule.java, line 69
> > 
> >
> Just curious, is this default arbitrary or something that was derived 
> from your deployment in production?

It's semi-arbitrary but proven in production :).


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148950
---


On Sept. 14, 2016, 5:25 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 5:25 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task event processing, 
> cron scheduling and task schedul

Re: Review Request 51876: @ReviewBot retry Modify executor state transition logic to rely on health checks (if enabled)

2016-09-14 Thread Kai Huang


> On Sept. 14, 2016, 10:03 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/executor/common/health_checker.py, lines 40-43
> > 
> >
> > Do we really need this enum? A task that is in the middle of the 
> > `initial_interval_secs` timeout technically is still healthy, right? I feel 
> > having a boolean `isHealthy` should be enough to cover all possible 
> > scenarios as `StatusManager` checks for `isHealthy` AND the task status 
> > itself to invoke a callback.
> 
> Kai Huang wrote:
> How do we distinguish an in-progress health check from a successful health 
> check (completed within initial_interval) if isHealthy is True in both 
> cases? Should we add a second variable to indicate that?

Oh, never mind, it seems I missed one point here: when we reach the required 
number of successful health checks before initial_interval expires, we should 
pre-terminate the initial_interval_secs by setting initial_in_progress to 
expired. So that will work. Thanks!


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review148937
---


On Sept. 14, 2016, 4:43 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 14, 2016, 4:43 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when the required number of successful health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on the vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when the required number of successful health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message will be sent if the health checker fails to reach the 
> required number of health checks within that period, or the maximum number of 
> failed health checks is reached after the initial_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will send 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning of 
> initial_interval_secs.
> 
> [TODOs]
> Currently I fixed all broken tests caused by my changes. However, more tests 
> need to be added in order to accommodate the executor change. I will send a 
> follow-up review update when I cover more edge cases. But any feedback on the 
> implementation and tests is greatly appreciated.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/
---

(Updated Sept. 14, 2016, 10:41 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Stephan suggested offline a small optimization for the Work. Also, dropping 
a flaky assert in BatchWorkerTest that apparently trips our jenkins build.


Repository: aurora


Description
---

This is the first (out of 3) patches intending to reduce storage write lock 
contention and as such improve overall system write throughput. It introduces 
the `BatchWorker` and migrates the majority of storage writes due to task 
status change events to use `TaskEventBatchWorker`.

#Problem
Our current storage system writes effectively behave as `SERIALIZABLE` 
transaction isolation level in SQL terms. This means all writes require 
exclusive access to the storage and no two transactions can happen in parallel 
[1]. While it certainly simplifies our implementation, it creates a single 
hotspot where multiple threads are competing for the storage write access. This 
type of contention only worsens as the cluster size grows, more tasks are 
scheduled, more status updates are processed, more subscribers are listening to 
status updates and etc. Eventually, the scheduler throughput (and especially 
task scheduling) becomes degraded to the extent that certain operations wait 
much longer (4x and more) for the lock acquisition than it takes to process 
their payload when inside the transaction. Some ops (like event processing) are 
generally tolerant of these types of delays. Others - not as much. The task 
scheduling suffers the most as backing up the scheduling queue directly affects 
the Median Time To Assigned (MTTA).

#Remediation
Given the above, it's natural to assume that reducing the number of write 
transactions should help reduce the lock contention. This patch introduces a 
generic `BatchWorker` service that delivers a "best effort" batching approach 
by redirecting multiple individual write requests into a single FIFO queue 
served non-stop by a single dedicated thread. Every batch shares a single write 
transaction thus reducing the number of potential write lock requests. To 
minimize wait-in-queue time, items are dispatched immediately and the max 
number of items is bounded. There are a few `BatchWorker` instances specialized 
on particular workload types: task event processing, cron scheduling and task 
scheduling. Every instance can be tuned independently (max batch size) and 
provides specialized metrics helping to monitor each workload type's performance.
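
The queue-draining idea described above can be sketched as follows. This is a
conceptual illustration only, not the actual Aurora `BatchWorker` (which is
Java): the `storage_write` hook, the `max_batch_size` default, and all names
here are assumptions.

```python
import queue
import threading

class BatchWorker(object):
  """Drains a FIFO queue on a single dedicated thread.

  Each drained batch is executed inside one shared 'write transaction', so N
  queued items cost one lock acquisition instead of N.
  """

  def __init__(self, storage_write, max_batch_size=32):
    self._storage_write = storage_write  # runs a batch inside one transaction
    self._queue = queue.Queue()
    self._max_batch_size = max_batch_size
    threading.Thread(target=self._run, daemon=True).start()

  def execute(self, work_item):
    self._queue.put(work_item)  # call sites enqueue and return immediately

  def _run(self):
    while True:
      # Block for the first item, then greedily drain up to the batch cap so
      # items are dispatched immediately rather than on a clock.
      batch = [self._queue.get()]
      while len(batch) < self._max_batch_size:
        try:
          batch.append(self._queue.get_nowait())
        except queue.Empty:
          break
      self._storage_write(batch)

results = []
done = threading.Event()

def write_transaction(batch):  # stand-in for the storage write lock + commit
  results.extend(item() for item in batch)
  if len(results) >= 3:
    done.set()

worker = BatchWorker(write_transaction)
for i in (1, 2, 3):
  worker.execute(lambda i=i: i * 10)
done.wait(timeout=5)
```

Because a single consumer thread drains the queue, FIFO ordering of work items
is preserved even when several items land in one batch.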

#Results
The proposed approach has been heavily tested in production and delivered the 
best results. The lock contention latencies went down by between 2x and 5x 
depending on the cluster load. A number of other approaches were tried but 
discarded as not performing well, or even performing much worse than the current 
master:
- Clock-driven batch execution - every batch is dispatched on a time schedule
- Max batch with a deadline - a batch is dispatched when max size is reached OR 
a timeout expires
- Various combinations of the above - some `BatchWorkers` are using 
clock-driven execution while others are using max batch with a deadline
- Completely non-blocking (event-based) completion notification - all call 
sites are notified of item completion via a `BatchWorkCompleted` event

Happy to provide more details on the above if interested.

#Upcoming
The introduction of the `BatchWorker` by itself was not enough to substantially 
improve the MTTA. It, however, paves the way for the next phase of scheduling 
perf improvement - taking more than 1 task from a given `TaskGroup` in a single 
scheduling round (coming soon). That improvement wouldn't have been achievable 
without decreasing the lock contention first.

Note: it wasn't easy to have a clean diff split, so some functionality in 
`BatchWorker` (e.g.: `executeWithReplay`) appears to be unused in the current 
patch but will become obvious in the part 2 (coming out shortly).  

[1] - 
https://github.com/apache/aurora/blob/f6ac13b169aaad5aad73ef3cc72873781e30a705/src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java#L540-L555


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/BatchWorker.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/SchedulerModule.java 
4a7ef0b1b90607f68d89fe8e207f42c42a8c56a0 
  src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java 
f07746c2b990c1c2235e99f9e4775fc84f9c27b1 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskThrottler.java 
bbd971a2aa8a96cf79edd879ad60e1bebd933d79 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
3c7cda09ab292d696070ca4d9dfedb1f6f71b0fe 
  
src/main/java/org/apache/aurora/scheduler/updater/JobUpdateController

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Zameer Manji


> On Sept. 14, 2016, 11:36 a.m., Zameer Manji wrote:
> > This patch LGTM. Just a single concern here and some questions. I will be 
> > moving on to the subsequent patches shortly.
> > Please do not commit this until all parts are shipped. Also, please don't 
> > forget to rebase/update subsequent patches if changes are made here.
> > 
> > Also, does this cover all state change consumers that interact on storage 
> > when the event is fired or is this a subset? Just wondering if this is 
> > supposed to apply to all of them or just a small set that you think need 
> > this change.
> 
> Maxim Khutornenko wrote:
> > Also, does this cover all state change consumers that interact on 
> storage when the event is fired or is this a subset
> 
> This covers all known high frequency consumers that require a write 
> transaction. The list is derived by running a stack-dumping version of the 
> `CallOrderEnforcingStorage.write()` in a live cluster for a while and 
> analysing top offenders. It may not be fully comprehensive though, so if you 
> find any other candidates feel free to report them.

I think some comments at the consumers saying that the batch worker is required 
for high throughput would be greatly appreciated by future contributors.


- Zameer


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148950
---


On Sept. 14, 2016, 3:41 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 3:41 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies got down between 2x and 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well, or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a deadline
> - Completely non-blocking (event-based) completion notification - all call 

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review148988
---




src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
 (line 82)


Did you try to rollback to pre 0.15 scheduler while changing the framework 
name? Trying to see if we can drop this 'backwards incompatible' statement now.


- Maxim Khutornenko


On Sept. 14, 2016, 8:58 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 14, 2016, 8:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'Aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> Testing to ensure backward compatibility:
> 
> Case 1: Rolling forward does not impact running tasks:
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with 
> fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Scheduler log:
> I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
> value: "071c44a1-b4d4-4339-a727-03a79f725851-"
> , master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
> ip: 119646400
> port: 5050
> pid: "master@192.168.33.7:5050"
> hostname: "aurora.local"
> version: "1.0.0"
> address {
>   hostname: "aurora.local"
>   ip: "192.168.33.7"
>   port: 5050
> }
> 
> Case 2: Rolling backward does not impact running tasks:
> Rolling back framework name from 'Aurora' to 'TwitterScheduler':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'TwitterScheduler' at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
> TwitterScheduler with checkpointing enabled and capabilities [ 
> REVOCABLE_RESOURCES, GPU_RESOURCES ]
> I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
> 071

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > This patch LGTM. Just a single concern here and some questions. I will be 
> > moving on to the subsequent patches shortly.
> > Please do not commit this until all parts are shipped. Also, please don't 
> > forget to rebase/update subsequent patches if changes are made here.
> > 
> > Also, does this cover all state change consumers that interact on storage 
> > when the event is fired or is this a subset? Just wondering if this is 
> > supposed to apply to all of them or just a small set that you think need 
> > this change.
> 
> Maxim Khutornenko wrote:
> > Also, does this cover all state change consumers that interact on 
> storage when the event is fired or is this a subset
> 
> This covers all known high frequency consumers that require a write 
> transaction. The list is derived by running a stack-dumping version of the 
> `CallOrderEnforcingStorage.write()` in a live cluster for a while and 
> analysing top offenders. It may be not fully comprehensive though, so if you 
> find any other candidates feel free to report them.
> 
> Zameer Manji wrote:
> I think some comments at the consumers to say that batch worker is 
> required for high throughput would be greatly appreciated for future 
> contributors.

I'd rather avoid comment copypasta everywhere we use `BatchWorker`. I thought 
it's enough to trace the execution from the callsite to the `BatchWorker` 
itself to see how/why it's used, no?


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148950
---


On Sept. 14, 2016, 10:41 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 10:41 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intended to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized in particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
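The batching approach the description outlines (a FIFO queue drained non-stop by one
dedicated thread, each bounded batch sharing a single write transaction) can be sketched
as below. All names are hypothetical and the "transaction" is simulated; this is not the
actual Aurora `BatchWorker` API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

class BatchWorkerSketch {
  interface Work { void run(); }

  // Max batch size is bounded and tunable per worker instance.
  private static final int MAX_BATCH_SIZE = 3;
  private final BlockingQueue<Work> queue = new LinkedBlockingQueue<>();

  // Callers enqueue work items instead of taking the write lock themselves.
  void execute(Work item) { queue.add(item); }

  // Single dedicated thread: block for the first item, top up the batch
  // best-effort, then run the whole batch inside one write transaction.
  void processBatchesForever() throws InterruptedException {
    while (true) {
      List<Work> batch = new ArrayList<>();
      batch.add(queue.take());
      queue.drainTo(batch, MAX_BATCH_SIZE - 1);
      inWriteTransaction(() -> batch.forEach(Work::run));
    }
  }

  // Placeholder for an exclusive storage.write() transaction.
  private void inWriteTransaction(Runnable r) { r.run(); }

  public static void main(String[] args) throws Exception {
    BatchWorkerSketch worker = new BatchWorkerSketch();
    CountDownLatch done = new CountDownLatch(5);
    Thread t = new Thread(() -> {
      try { worker.processBatchesForever(); } catch (InterruptedException ignored) { }
    });
    t.setDaemon(true);
    t.start();
    for (int i = 0; i < 5; i++) worker.execute(done::countDown);
    done.await();
    System.out.println("all items processed");
  }
}
```

Because items are dispatched immediately rather than on a clock, wait-in-queue time
stays low while multiple items still share one lock acquisition.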
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down by 2x to 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well, or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout exp

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Zameer Manji


> On Sept. 14, 2016, 11:36 a.m., Zameer Manji wrote:
> > This patch LGTM. Just a single concern here and some questions. I will be 
> > moving on to the subsequent patches shortly.
> > Please do not commit this until all parts are shipped. Also, please don't 
> > forget to rebase/update subsequent patches if changes are made here.
> > 
> > Also, does this cover all state change consumers that interact on storage 
> > when the event is fired or is this a subset? Just wondering if this is 
> > supposed to apply to all of them or just a small set that you think need 
> > this change.
> 
> Maxim Khutornenko wrote:
> > Also, does this cover all state change consumers that interact on 
> storage when the event is fired or is this a subset
> 
> This covers all known high frequency consumers that require a write 
> transaction. The list is derived by running a stack-dumping version of the 
> `CallOrderEnforcingStorage.write()` in a live cluster for a while and 
> analysing top offenders. It may be not fully comprehensive though, so if you 
> find any other candidates feel free to report them.
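The stack-dumping technique described above (logging the caller's stack at the write
callsite, then counting the top frames offline) can be sketched roughly as follows.
The names here are illustrative, not the actual instrumented Aurora code:

```java
class WriteStackDump {
  // Wraps a write so that every lock request logs who asked for it.
  // Aggregating these lines offline reveals the top write-lock offenders.
  static void write(Runnable work) {
    StringBuilder sb = new StringBuilder("storage write requested by:");
    for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
      sb.append("\n    at ").append(frame);
    }
    System.out.println(sb);
    work.run();
  }

  public static void main(String[] args) {
    write(() -> System.out.println("payload done"));
  }
}
```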
> 
> Zameer Manji wrote:
> I think some comments at the consumers to say that batch worker is 
> required for high throughput would be greatly appreciated for future 
> contributors.
> 
> Maxim Khutornenko wrote:
> I'd rather avoid comment copypasta everywhere we use `BatchWorker`. I 
> thought it's enough to trace the execution from the callsite to the 
> `BatchWorker` itself to see how/why it's used, no?

Agreed, dropping this.


- Zameer


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148950
---


On Sept. 14, 2016, 3:41 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 3:41 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intended to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized in particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down by 2x to 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well, or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a b

Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 5:40 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java, 
> > lines 164-172
> > 
> >
> > I probably don't completely understand the logic, so stupid question 
> > ahead:
> > 
> > Couldn't we do that cleanup of JobDataMap in the handler where you 
> > currently do the `killFollowups.add(key)`? Having this spread out over the 
> > large `doExecute` function doesn't help understandability very much.

Unfortunately, we can't. That was my first thought but the context and its 
detail map are reconstructed for every trigger and are gone the moment a 
trigger is released.


> On Sept. 14, 2016, 5:40 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java, 
> > line 174
> > 
> >
> > I will probably find this message confusing if I see it in the logs, in 
> > particular if I have configured the job with KILL_EXISTING.

Any suggestions? There are lots of moving parts addressed in this message, so I 
am not sure how to better explain it without describing how quartz works :)


> On Sept. 14, 2016, 5:40 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java, 
> > line 274
> > 
> >
> > The term "followup" is rather generic and difficult to understand if 
> > seen in isolation in the log. You should at least mention it is about 
> > killing a cron job.

Good point. Rephrased.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review148933
---


On Sept. 14, 2016, 12:23 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> ---
> 
> (Updated Sept. 14, 2016, 12:23 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the second part of the `BatchWorker` conversion work that moves cron 
> jobs to use non-blocking kill followups and reduces the number of trigger 
> threads. See https://reviews.apache.org/r/51759 for more background on the 
> `BatchWorker`.
> 
> #Problem
> The current implementation of the cron scheduling relies on a large number of 
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
> killing existing tasks according to `KILL_EXISTING` collision policy. This 
> creates large spikes of activity at synchronized intervals as users tend to 
> schedule their cron runs around similar schedules. Moreover, the current 
> implementation re-acquires write locks multiple times to deliver on 
> `KILL_EXISTING` policy. 
> 
> #Remediation
> Trigger level batching is still done in a blocking way but multiple cron 
> triggers may be bundled together to share the same write transaction. Any 
> followups, however, are performed in a non-blocking way by relying on a 
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. 
> In order to still ensure non-concurrent execution of a given job key trigger, 
> a token (job key) is saved within the trigger itself. A concurrent trigger 
> will bail if a kill followup is still in progress (token is set AND no entry 
> in `killFollowups` set exists yet).
> 
> #Results
> The above approach allowed reducing the number of cron threads to 10, and the 
> count can likely be reduced even further. See https://reviews.apache.org/r/51759 
> for the lock contention results.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
> 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
> bc30990d57f444f7d64805ed85c363f1302736d0 
>   config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
> 155d702d68367b247dd066f773c662407f0e3b5b 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
> 5c64ff2994e200b3453603ac5470e8e152cebc55 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 
> 
> Diff: https://reviews.apache.org/r/51763/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim

Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/
---

(Updated Sept. 14, 2016, 11:12 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebasing and comments.


Repository: aurora


Description
---

This is the second part of the `BatchWorker` conversion work that moves cron 
jobs to use non-blocking kill followups and reduces the number of trigger 
threads. See https://reviews.apache.org/r/51759 for more background on the 
`BatchWorker`.

#Problem
The current implementation of the cron scheduling relies on a large number of 
threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
killing existing tasks according to `KILL_EXISTING` collision policy. This 
creates large spikes of activity at synchronized intervals as users tend to 
schedule their cron runs around similar schedules. Moreover, the current 
implementation re-acquires write locks multiple times to deliver on 
`KILL_EXISTING` policy. 

#Remediation
Trigger level batching is still done in a blocking way but multiple cron 
triggers may be bundled together to share the same write transaction. Any 
followups, however, are performed in a non-blocking way by relying on a 
`BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. In 
order to still ensure non-concurrent execution of a given job key trigger, a 
token (job key) is saved within the trigger itself. A concurrent trigger will 
bail if a kill followup is still in progress (token is set AND no entry in 
`killFollowups` set exists yet).
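The bail-out check described above (a token stored in the trigger marks a kill followup
as in flight; a concurrent trigger proceeds only if no followup is pending or the
followup has already completed) can be sketched as below. The names are hypothetical,
and an in-memory map stands in for the token kept in the quartz trigger:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class CronDedupSketch {
  // Stands in for the token (job key) saved within the trigger itself.
  private final Map<String, Boolean> pendingToken = new ConcurrentHashMap<>();
  // Job keys whose kill followup has completed (the `killFollowups` set).
  private final Set<String> killFollowups = ConcurrentHashMap.newKeySet();

  // Returns true if this trigger may proceed with the KILL_EXISTING run.
  boolean tryTrigger(String jobKey) {
    if (pendingToken.containsKey(jobKey) && !killFollowups.contains(jobKey)) {
      return false;  // a kill followup is still in progress: bail
    }
    killFollowups.remove(jobKey);             // consume the completion marker
    pendingToken.put(jobKey, Boolean.TRUE);   // mark a new followup in progress
    return true;
  }

  // Invoked from the (non-blocking) BatchWorkCompleted-style notification.
  void onKillFollowupCompleted(String jobKey) { killFollowups.add(jobKey); }

  public static void main(String[] args) {
    CronDedupSketch s = new CronDedupSketch();
    System.out.println(s.tryTrigger("role/env/job"));   // first trigger proceeds
    System.out.println(s.tryTrigger("role/env/job"));   // concurrent trigger bails
    s.onKillFollowupCompleted("role/env/job");
    System.out.println(s.tryTrigger("role/env/job"));   // next trigger proceeds
  }
}
```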

#Results
The above approach allowed reducing the number of cron threads to 10, and the count 
can likely be reduced even further. See https://reviews.apache.org/r/51759 for the 
lock contention results.


Diffs (updated)
-

  commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
  commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
bc30990d57f444f7d64805ed85c363f1302736d0 
  config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
c07551e94f9221b5b21c5dc9715e82caa290c2e8 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
155d702d68367b247dd066f773c662407f0e3b5b 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
5c64ff2994e200b3453603ac5470e8e152cebc55 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 

Diff: https://reviews.apache.org/r/51763/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148994
---


Ship it!




Master (5069f93) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 10:41 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 10:41 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intended to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates, etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reduce the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized in particular workload types: task event processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies went down by 2x to 5x 
> depending on the cluster load. A number of other approaches were tried but 
> discarded as not performing well, or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a deadline
> - Completely non-blocking (event-based) completion notification - all call 
> sites are notified of item completion via a `BatchWorkCompleted` event
> 
> Happy to provide more details on the above if interested.
> 
> #Upcoming
> The introduction of the `BatchWorker` by itself was not enough to 
> substantially improve the MTTA. It, however, paves the way for the next phase 
> of scheduling perf improvement - taking more than 1 task from a given 
> `TaskGroup` in a single scheduling round (coming soon). That improvement 
> wouldn't deliver without decreasing the lock contention first. 
> 
> Note: it wasn't easy to have a clean diff split, so some functionality in 
> `BatchWorker` (e.g.: `executeWithReplay`) appears to be unused in the current 
> patch but will become obvious in the part 2 (coming out shortly).  
> 
> [1] - 
> https://github.com/apache/aurora/blob/f6ac13b169aaad5aad73ef3cc72873781e30a705/src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java#L540-L555
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/BatchWorker.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/Sched

Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/
---

(Updated Sept. 14, 2016, 11:17 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebased.


Repository: aurora


Description
---

This is the final part of the `BatchWorker` conversion work that converts 
`TaskScheduler`. See https://reviews.apache.org/r/51759 for more background on 
the `BatchWorker`.

#Problem
See https://reviews.apache.org/r/51759

#Remediation
Task scheduling is one of the most dominant users of the write lock. It's also 
one of the heaviest and the most latency-sensitive. As such, the default max 
batch size is chosen conservatively low (3) and batch items are executed in a 
blocking way. 
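The blocking execution path described above (the caller enqueues an item and waits
until the batch containing it has run) can be sketched as below. Names are hypothetical
and the shared write transaction is simulated; this is not the actual Aurora code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

class BlockingBatchSketch {
  // Conservatively small max batch, mirroring the default of 3 described above.
  private static final int MAX_BATCH_SIZE = 3;

  private static final class Item {
    final Runnable work;
    final CompletableFuture<Void> done = new CompletableFuture<>();
    Item(Runnable work) { this.work = work; }
  }

  private final BlockingQueue<Item> queue = new LinkedBlockingQueue<>();

  // Caller-facing API: enqueue, then block until this item has run in a batch.
  void executeBlocking(Runnable work) {
    Item item = new Item(work);
    queue.add(item);
    item.done.join();
  }

  void start() {
    Thread t = new Thread(() -> {
      try {
        while (true) {
          List<Item> batch = new ArrayList<>();
          batch.add(queue.take());
          queue.drainTo(batch, MAX_BATCH_SIZE - 1);
          // One shared (simulated) write transaction for the whole batch.
          for (Item i : batch) { i.work.run(); i.done.complete(null); }
        }
      } catch (InterruptedException ignored) { }
    });
    t.setDaemon(true);
    t.start();
  }

  public static void main(String[] args) {
    BlockingBatchSketch worker = new BlockingBatchSketch();
    worker.start();
    worker.executeBlocking(() -> System.out.println("scheduled inside a batch"));
    System.out.println("caller resumed");
  }
}
```

Blocking the caller here avoids routing completion events through a shared `EventBus`,
which is the alternative the text explains performed worse for task scheduling.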

BTW, attempting to make task scheduling non-blocking resulted in a much worse 
scheduling performance. The way our `DBTaskStore` is wired, all async 
activities, including `EventBus` are bound to use a single async `Executor`, 
which is currently limited at 8 threads [1]. Relying on the same `EventBus` to 
deliver scheduling completion events resulted in slower scheduling perf as 
> those events were backed up behind all other activities, including task status 
> events, reconciliation, etc. Increasing the executor thread pool size to a 
> larger number, on the other hand, also increased the lock contention, defeating 
> the whole purpose of this work.

#Results
See https://reviews.apache.org/r/51759 for the lock contention results.

[1] - https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
9d0d40b82653fb923bed16d06546288a1576c21d 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
11e8033438ad0808e446e41bb26b3fa4c04136c7 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
c044ebe6f72183a67462bbd8e5be983eb592c3e9 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
d266f6a25ae2360db2977c43768a19b1f1efe8ff 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c2ceb4e7685a9301f8014a9183e02fbad65bca26 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
72562e6bd9a9860c834e6a9faa094c28600a8fed 

Diff: https://reviews.apache.org/r/51765/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/
---

(Updated Sept. 14, 2016, 11:18 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Really rebasing.


Repository: aurora


Description
---

This is the final part of the `BatchWorker` conversion work that converts 
`TaskScheduler`. See https://reviews.apache.org/r/51759 for more background on 
the `BatchWorker`.

#Problem
See https://reviews.apache.org/r/51759

#Remediation
Task scheduling is one of the most dominant users of the write lock. It's also 
one of the heaviest and the most latency-sensitive. As such, the default max 
batch size is chosen conservatively low (3) and batch items are executed in a 
blocking way. 

BTW, attempting to make task scheduling non-blocking resulted in a much worse 
scheduling performance. The way our `DBTaskStore` is wired, all async 
activities, including `EventBus` are bound to use a single async `Executor`, 
which is currently limited at 8 threads [1]. Relying on the same `EventBus` to 
deliver scheduling completion events resulted in slower scheduling perf as 
those events were backed up behind all other activities, including task status 
events, reconciliation, etc. Increasing the executor thread pool size to a 
larger number, on the other hand, also increased the lock contention, defeating 
the whole purpose of this work.

#Results
See https://reviews.apache.org/r/51759 for the lock contention results.

[1] - https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
9d0d40b82653fb923bed16d06546288a1576c21d 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
11e8033438ad0808e446e41bb26b3fa4c04136c7 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
c044ebe6f72183a67462bbd8e5be983eb592c3e9 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
d266f6a25ae2360db2977c43768a19b1f1efe8ff 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c2ceb4e7685a9301f8014a9183e02fbad65bca26 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
72562e6bd9a9860c834e6a9faa094c28600a8fed 

Diff: https://reviews.apache.org/r/51765/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review148996
---


Ship it!




Master (5069f93) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 11:12 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> ---
> 
> (Updated Sept. 14, 2016, 11:12 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the second part of the `BatchWorker` conversion work that moves cron 
> jobs to use non-blocking kill followups and reduces the number of trigger 
> threads. See https://reviews.apache.org/r/51759 for more background on the 
> `BatchWorker`.
> 
> #Problem
> The current implementation of the cron scheduling relies on a large number of 
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
> killing existing tasks according to `KILL_EXISTING` collision policy. This 
> creates large spikes of activity at synchronized intervals as users tend to 
> schedule their cron runs around similar schedules. Moreover, the current 
> implementation re-acquires write locks multiple times to deliver on 
> `KILL_EXISTING` policy. 
> 
> #Remediation
> Trigger level batching is still done in a blocking way but multiple cron 
> triggers may be bundled together to share the same write transaction. Any 
> followups, however, are performed in a non-blocking way by relying on a 
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. 
> In order to still ensure non-concurrent execution of a given job key trigger, 
> a token (job key) is saved within the trigger itself. A concurrent trigger 
> will bail if a kill followup is still in progress (token is set AND no entry 
> in `killFollowups` set exists yet).
> 
> #Results
> The above approach allowed reducing the number of cron threads to 10, and the 
> count can likely be reduced even further. See https://reviews.apache.org/r/51759 
> for the lock contention results.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
> 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
> bc30990d57f444f7d64805ed85c363f1302736d0 
>   config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
> 155d702d68367b247dd066f773c662407f0e3b5b 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
> 5c64ff2994e200b3453603ac5470e8e152cebc55 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 
> 
> Diff: https://reviews.apache.org/r/51763/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/#review148998
---


Ship it!




Master (5069f93) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 14, 2016, 11:18 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51765/
> ---
> 
> (Updated Sept. 14, 2016, 11:18 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the final part of the `BatchWorker` conversion work that converts 
> `TaskScheduler`. See https://reviews.apache.org/r/51759 for more background 
> on the `BatchWorker`.
> 
> #Problem
> See https://reviews.apache.org/r/51759
> 
> #Remediation
> Task scheduling is one of the dominant users of the write lock. It is also 
> one of the heaviest and most latency-sensitive. As such, the default 
> max batch size is chosen conservatively low (3) and batch items are executed 
> in a blocking way. 
> 
> BTW, attempting to make task scheduling non-blocking resulted in much worse 
> scheduling performance. The way our `DBTaskStore` is wired, all async 
> activities, including `EventBus`, are bound to use a single async `Executor`, 
> which is currently limited at 8 threads [1]. Relying on the same `EventBus` 
> to deliver scheduling completion events resulted in slower scheduling perf, as 
> those events were backed up behind all other activities, including task 
> status events, reconciliation, etc. Increasing the executor thread pool 
> size, on the other hand, also increased the lock contention, 
> defeating the whole purpose of this work.
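The batching scheme described above might be sketched as follows. This is a hypothetical, minimal sketch; the class and method names (`BlockingBatchWorker`, `execute`, `processBatch`) are assumptions for illustration, not Aurora's actual `BatchWorker` API. Callers block until the batch containing their item has run, and the whole batch executes under a single lock acquisition:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RunnableFuture;

// Sketch: work items are queued by callers, then drained in batches capped
// at maxBatchSize; each batch runs under one acquisition of the write lock.
class BlockingBatchWorker {
  private final BlockingQueue<RunnableFuture<Void>> queue = new LinkedBlockingQueue<>();
  private final int maxBatchSize;
  private final Object writeLock = new Object();

  BlockingBatchWorker(int maxBatchSize) {
    this.maxBatchSize = maxBatchSize;
  }

  /** Enqueues work and blocks the caller until its batch has executed. */
  void execute(Runnable work) throws Exception {
    RunnableFuture<Void> item = new FutureTask<>(work, null);
    queue.put(item);
    item.get();  // blocking: the caller waits for batch completion
  }

  /** Drains up to maxBatchSize items and runs them under one lock hold. */
  void processBatch() {
    List<RunnableFuture<Void>> batch = new ArrayList<>();
    queue.drainTo(batch, maxBatchSize);
    synchronized (writeLock) {  // single lock acquisition per batch
      batch.forEach(RunnableFuture::run);
    }
  }
}
```

The small batch cap (3 in the review) bounds how long any one caller blocks behind others' work, which matters for a latency-sensitive path like scheduling, while still amortizing the lock acquisition across several writes.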
> 
> #Results
> See https://reviews.apache.org/r/51759 for the lock contention results.
> 
> https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 9d0d40b82653fb923bed16d06546288a1576c21d 
>   src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
> 11e8033438ad0808e446e41bb26b3fa4c04136c7 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
> c044ebe6f72183a67462bbd8e5be983eb592c3e9 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
> d266f6a25ae2360db2977c43768a19b1f1efe8ff 
>   src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
> c2ceb4e7685a9301f8014a9183e02fbad65bca26 
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
> 95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
>   
> src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java
>  72562e6bd9a9860c834e6a9faa094c28600a8fed 
> 
> Diff: https://reviews.apache.org/r/51765/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/
---

(Updated Sept. 14, 2016, 5:33 p.m.)


Review request for Aurora, Joshua Cohen and Maxim Khutornenko.


Bugs: AURORA-1688
https://issues.apache.org/jira/browse/AURORA-1688


Repository: aurora


Description
---

Change framework_name default value from 'TwitterScheduler' to 'Aurora'


Diffs
-

  RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
  
src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
 8a386bd208956eb0c8c2f48874b0c6fb3af58872 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
97677f24a50963178a123b420d7ac136e4fde3fe 

Diff: https://reviews.apache.org/r/51874/diff/


Testing (updated)
---

./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Testing to verify backward compatibility:

# HEAD of master:

# Case 1: Rolling forward does not impact running tasks:
Renaming framework from 'TwitterScheduler' to 'Aurora':

The framework re-registers after restart (treated by the master as a failover) 
and gets the same framework-id. Running tasks remain unaffected.

Master log:
I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
failover
I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with fd 
28: Transport endpoint is not connected
I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'Aurora' at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora with 
checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083

Scheduler log:
I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
value: "071c44a1-b4d4-4339-a727-03a79f725851-"
, master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
ip: 119646400
port: 5050
pid: "master@192.168.33.7:5050"
hostname: "aurora.local"
version: "1.0.0"
address {
  hostname: "aurora.local"
  ip: "192.168.33.7"
  port: 5050
}

# Case 2: Rolling backward does not impact running tasks:
Rolling back framework name from 'Aurora' to 'TwitterScheduler':

The framework re-registers after restart (treated by the master as a failover) 
and gets the same framework-id. Running tasks remain unaffected.

Master log:
I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
failover
I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
framework 'TwitterScheduler' at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
TwitterScheduler with checkpointing enabled and capabilities [ 
REVOCABLE_RESOURCES, GPU_RESOURCES ]
I0914 16:51:49.614312  9813 master.cpp:2564] Updating info for framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.614359  9813 master.cpp:2577] Framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 failed over
I0914 16:51:49.614977  9813 hierarchical.cpp:348] Activated framework 
071c44a1-b4d4-4339-a727-03a79f725851-
I0914 16:51:49.615170  9813 master.cpp:5709] Sending 1 offers to framework 
071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083

Scheduler log:
I0914 16:51:50.249 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
value: "071c44a1-b4d4-4339-a727-03a79f725851-"
, master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
ip: 119646400
port: 5050
pid: "master@192.168.33.7:5050"
hostname: "aurora.local"
version: "1.0.0"
address {
  hostname: "aurora.local"
  ip: "192.168.33.7"
  port: 5050
}

# Ca

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Santhosh Kumar Shanmugham


> On Sept. 14, 2016, 3:48 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java,
> >  line 82
> > 
> >
> > Did you try to roll back to a pre-0.15 scheduler while changing the 
> > framework name? Trying to see if we can drop this 'backwards incompatible' 
> > statement now.

Tested "roll-forward" (to Aurora) and "roll-back" (via release and config 
change) (to TwitterScheduler) on Aurora-0.14 (depends on Mesos-0.27.2) and 
Aurora-0.15 (depends on Mesos-0.28.2). The master was able to re-register the 
framework with the same "id", and the running tasks continued to make 
progress. (See details in the testing section.)

However, I could not roll back the scheduler from 0.15 to 0.14 from source 
inside vagrant. "aurorabuild all" started to complain with the message,
"Could not satisfy all requirements for mesos.native==0.27.2"


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review148988
---


On Sept. 14, 2016, 1:58 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 14, 2016, 1:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'Aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> Testing to verify backward compatibility:
> 
> Case 1: Rolling forward does not impact running tasks:
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by the master as a failover) 
> and gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with 
> fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Scheduler log:
> I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
> value: "071c44a1-b4d4-4339-a727-03a79f725851-"
> , master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
> ip: 119646400
> port: 5050
> pid: "master@192.168.33.7:5050"
> hostname: "aurora.local"
> version: "1.0.0"
> address {
>   hostname: "aurora.local"
>   ip: "192.168.33.7"
>   port: 5050
> }
> 
> Case 2: Rolling backward does not impact running tasks:
> Rolling back framework name from 'Aurora' to 'TwitterScheduler':
> 
> The framework re-registers after restart (treated by the master as a failover) 
> and gets the same framework-id. Running tasks remain unaffected.
> 
> Master log:
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> fa