Re: Review Request 65494: Add task page to Scheduler UI

2018-02-02 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65494/#review196780
---


Ship it!




Ship It!

- Kai Huang


On Feb. 3, 2018, 12:27 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65494/
> ---
> 
> (Updated Feb. 3, 2018, 12:27 a.m.)
> 
> 
> Review request for Aurora, Kai Huang, Reza Motamedi, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add task page to Scheduler UI. I haven't created any inbound links to this 
> yet because there isn't any really useful information, but we have a need to 
> permalink to tasks from outside the Scheduler so right now this scratches 
> that itch. 
> 
> If anyone has ideas about where we can add permalinks or inbound clicks.. let 
> me know! In the future we could maybe bring some of the Observer stuff into 
> this page and make it really useful.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/Breadcrumb.js 
> 76c62709cf3ebf76e74e08573bc8b245a45d5fe9 
>   ui/src/main/js/components/TaskStatus.js 
> 2e5a2b46121d6cc7c80c6d7589803a10973b5c1b 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js 
> 47f7afb29b9192e214040097dbfee2ff7f7e29ed 
>   ui/src/main/js/index.js 9f94d4bd6f649d74bdd9c3aa99783e01cae25d93 
>   ui/src/main/js/pages/Task.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Task-test.js PRE-CREATION 
>   ui/src/main/sass/components/_instance-page.scss 
> 1da87dcb0f66c258179295b23bda5c895bf5308f 
> 
> 
> Diff: https://reviews.apache.org/r/65494/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> 
> See screenshot for Vagrant test.
> 
> 
> File Attachments
> 
> 
> Screen Shot 2018-02-02 at 4.22.16 PM.png
>   
> https://reviews.apache.org/media/uploaded/files/2018/02/03/588521b3-6014-4e63-aa4c-8389d4ed44e3__Screen_Shot_2018-02-02_at_4.22.16_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63650: Display pending task reasons in TaskList

2017-11-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63650/#review190399
---


Ship it!




Ship It!

- Kai Huang


On Nov. 7, 2017, 11:48 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63650/
> ---
> 
> (Updated Nov. 7, 2017, 11:48 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add reasons to pending tasks
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobHistory.js 
> 7eff462777deb05c494399f7accc8b703417a447 
>   ui/src/main/js/components/JobStatus.js 
> 8f7023beda21f4c0ba8952b4df13ac80841e8750 
>   ui/src/main/js/components/TaskList.js 
> c0815cfea05e4669083e53660aef52f4221c9cc2 
>   ui/src/main/js/components/__tests__/TaskList-test.js 
> 948e86bd9f44380308bbd4bff9a254708545953c 
>   ui/src/main/js/pages/Job.js 2f96e23008603ca1bc61b9cc80967e3efedb0fbf 
>   ui/src/main/js/test-utils/TaskBuilders.js 
> f2b06451eb601e5c52a0902eeb29152ee39926b0 
> 
> 
> Diff: https://reviews.apache.org/r/63650/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> 
> File Attachments
> 
> 
> Pending reasons in UI
>   
> https://reviews.apache.org/media/uploaded/files/2017/11/07/43c3dc08-76e8-48c2-a0a9-1e8cf90d8e1d__Screen_Shot_2017-11-07_at_3.45.03_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63436: Enabling ErrorBoundary in Scheduler UI

2017-10-31 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63436/#review189779
---


Ship it!





ui/src/main/js/components/__tests__/ErrorBoundary-test.js
Lines 35 (patched)
<https://reviews.apache.org/r/63436/#comment267006>

Pull the common prefix into a constant?


- Kai Huang


On Oct. 31, 2017, 8:15 p.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63436/
> ---
> 
> (Updated Oct. 31, 2017, 8:15 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Kai Huang.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> # Enabling ErrorBoundary in Scheduler UI
> React 16 introduces a new concept of an “error boundary” that allows us to 
> limit the impact of an error and not unmount the whole component tree. I am 
> open to keeping or removing the stack trace.
> 
> from React docs:
> > As of React 16, errors that were not caught by any error boundary will 
> > result in unmounting of the whole React component tree.
> 
> 
> Diffs
> -
> 
>   ui/.eslintrc 5cdc4e67030a79c3f81c06f585cc9ff5ce959e52 
>   ui/src/main/js/components/ErrorBoundary.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/ErrorBoundary-test.js PRE-CREATION 
>   ui/src/main/js/index.js 9f94d4bd6f649d74bdd9c3aa99783e01cae25d93 
> 
> 
> Diff: https://reviews.apache.org/r/63436/diff/2/
> 
> 
> Testing
> ---
> 
> - Unit tests added.
> 
> ```
> # rmotamedi@tw-mbp-rmotamedi:~/oss/aurora on git:error-boundry-test ? 
> [13:06:52]
> ? ./gradlew ui:test
> 
> BUILD SUCCESSFUL in 0s
> 2 actionable tasks: 2 up-to-date
> 
> # rmotamedi@tw-mbp-rmotamedi:~/oss/aurora on git:error-boundry-test ? 
> [13:07:46]
> ? ./gradlew ui:lint
> 
> BUILD SUCCESSFUL in 5s
> 4 actionable tasks: 4 up-to-date
> ```
> 
> 
> - For more testing, I added the following component (that raises an error) 
> and installed it under different components. I attached the screen-shots of 
> how this will render. Clicking the link initializes the process of Jira 
> ticket creation.
> 
> ## ComponentWithError
> 
> ```
> import React from 'react'
> 
> export default class ComponentWithError extends React.Component {
>   render() {
> throw new Error('Crashed!');
> return ;
>   }
> }
> ```
> 
> 
> File Attachments
> 
> 
> Apache Jira ticket template filled with information on the bug
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/31/0b50175f-0dd5-44fe-8bc4-c69ca1723729__Screen_Shot_2017-10-31_at_11.55.39_AM.png
> Catching exceptions on the router
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/31/915182ef-4229-4b22-b7c6-20a5f24f8e1a__Screen_Shot_2017-10-31_at_11.44.56_AM.png
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>



Re: Review Request 63436: Enabling ErrorBoundary in Scheduler UI

2017-10-30 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63436/#review189684
---



+1 to fix the error message in test. Wonder if we could try this? 
https://github.com/facebook/react/issues/11098

- Kai Huang


On Oct. 31, 2017, 1:58 a.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63436/
> ---
> 
> (Updated Oct. 31, 2017, 1:58 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Kai Huang.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> # Enabling ErrorBoundary in Scheduler UI
> React 16 introduces a new concept of an “error boundary” that allows us to 
> limit the impact of an error and not unmount the whole component tree. I am 
> open to keeping or removing the stack trace.
> 
> from React docs:
> > As of React 16, errors that were not caught by any error boundary will 
> > result in unmounting of the whole React component tree.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/ErrorBoundary.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/ErrorBoundary-test.js PRE-CREATION 
>   ui/src/main/js/index.js 9f94d4bd6f649d74bdd9c3aa99783e01cae25d93 
>   ui/src/main/sass/app.scss 8a10e1c5a501828c39ca7b9cf6b518a2d7a99d28 
>   ui/src/main/sass/components/_error-boundary.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63436/diff/1/
> 
> 
> Testing
> ---
> 
> - Unit tests added. This however causes an error message to be logged on 
> stdout.
> 
> ```
> ? ./gradlew ui:test
> ...
> ASS src/main/js/pages/__tests__/Job-test.js
> PASS src/main/js/components/__tests__/InstanceViz-test.js
> PASS src/main/js/pages/__tests__/Update-test.js
> PASS src/main/js/components/__tests__/InstanceHistoryItem-test.js
> PASS src/main/js/components/__tests__/Pagination-test.js
> PASS src/main/js/components/__tests__/ErrorBoundary-test.js
>   ? Console
> 
> console.error node_modules/react-dom/cjs/react-dom.development.js:8305
>   The above error occurred in the  component:
>   in ComponentWithError
>   in ErrorBoundary (created by WrapperComponent)
>   in WrapperComponent
> 
>   React will try to recreate this component tree from scratch using the 
> error boundary you provided, ErrorBoundary.
> ...
> 
> BUILD SUCCESSFUL in 0s
> 2 actionable tasks: 2 up-to-date
> 
> # rmotamedi@tw-mbp-rmotamedi:~/oss/aurora on git:error-boundry ? [18:52:22]
> ? ./gradlew ui:lint
> 
> BUILD SUCCESSFUL in 6s
> 4 actionable tasks: 4 up-to-date
> ```
> 
> 
> - For more testing, I added the following component (that raises an error) 
> and installed it under `UpdateInstanceEvents`. I then caught the error at 
> various depths in the doom tree. I attached the screen-shots of how this will 
> render.
> 
> ## ComponentWithError
> 
> ```
> import React from 'react'
> 
> export default class ComponentWithError extends React.Component {
>   render() {
> throw new Error('Crashed!');
> return ;
>   }
> }
> ```
> 
> 
> File Attachments
> 
> 
> handling boundary on the main router
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/31/d95e9f85-3dad-4332-ac33-3cb9f8a4455c__Screen_Shot_2017-10-30_at_6.30.22_PM.png
> handling boundary on a component deep in the doom tree.
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/31/64d47905-b058-4ef5-9d19-fa3311a47f9d__Screen_Shot_2017-10-30_at_6.36.18_PM.png
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>



Re: Review Request 63375: Add resource units to config summary

2017-10-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63375/#review189480
---


Ship it!




Ship It!

- Kai Huang


On Oct. 27, 2017, 7:58 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63375/
> ---
> 
> (Updated Oct. 27, 2017, 7:58 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add resource units to config summary
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/RoleQuota.js 
> bb1b6e7e5dbedf875f27caf9bc9ca28319b82c5b 
>   ui/src/main/js/components/TaskConfigSummary.js 
> b1cf5a956e0230932944f76ad8d25b356b9d8f23 
>   ui/src/main/js/utils/Quota.js PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63375/diff/1/
> 
> 
> Testing
> ---
> 
> See screenshot.
> 
> 
> File Attachments
> 
> 
> Screen Shot 2017-10-27 at 12.55.52 PM.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/27/be64586c-3ab8-4e17-bec2-b166719da2e7__Screen_Shot_2017-10-27_at_12.55.52_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63374: Revert to old Job Page tab names and add counts

2017-10-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63374/#review189477
---


Ship it!




Ship It!

- Kai Huang


On Oct. 27, 2017, 7:48 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63374/
> ---
> 
> (Updated Oct. 27, 2017, 7:48 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Changing the names of tabs causes unnecessary confusion. Revert to "active 
> tasks/completed tasks" and add the instance count back to both.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobHistory.js 
> 74f6fcb1117e80bcd5a033dad06ca00f9bc2ddfc 
>   ui/src/main/js/components/JobStatus.js 
> 84e335c91da7231eda30ae945dd669f80fb1cc38 
> 
> 
> Diff: https://reviews.apache.org/r/63374/diff/1/
> 
> 
> Testing
> ---
> 
> See screenshots.
> 
> 
> File Attachments
> 
> 
> Screen Shot 2017-10-27 at 12.46.52 PM.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/27/33061824-a52c-4b54-a14f-27793f5bc9ea__Screen_Shot_2017-10-27_at_12.46.52_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63344: Support updates with no desiredState on Job and Update pages

2017-10-26 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63344/#review189365
---


Ship it!




Ship It!

- Kai Huang


On Oct. 26, 2017, 10:01 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63344/
> ---
> 
> (Updated Oct. 26, 2017, 10:01 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> When updates only delete instances, desiredState is null.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/UpdateConfig.js 
> 89dea2512d6fed712c7f059006034cd3d1ffb0f7 
>   ui/src/main/js/utils/Update.js dee239312ecc6214d954336f1e57730f1e87802d 
>   ui/src/main/js/utils/__tests__/Update-test.js 
> 88fa5f7edb267246a1a0489eba414e9f7e84ee05 
> 
> 
> Diff: https://reviews.apache.org/r/63344/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> Verified in Vagrant.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63339: Search entire job name for query string on JobList

2017-10-26 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63339/#review189364
---


Ship it!




Ship It!

- Kai Huang


On Oct. 26, 2017, 8:53 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63339/
> ---
> 
> (Updated Oct. 26, 2017, 8:53 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Search entire job name for query string on JobList
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobList.js 
> bb7f4885ae6a3452f953f69c7c89d87200539b00 
>   ui/src/main/js/components/__tests__/JobList-test.js 
> deda6fe8ba362c01ce5c9c9a2ab43e334043796d 
> 
> 
> Diff: https://reviews.apache.org/r/63339/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63333: Do not fetch neighbor tasks if no active task

2017-10-26 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6/#review189335
---


Ship it!




Ship It!

- Kai Huang


On Oct. 26, 2017, 7:42 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6/
> ---
> 
> (Updated Oct. 26, 2017, 7:42 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Do not fetch neighbor tasks if no active task
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/pages/Instance.js 2979b3c9f1a13ead6dbc673a6710c5d9cab5ca5e 
> 
> 
> Diff: https://reviews.apache.org/r/6/diff/1/
> 
> 
> Testing
> ---
> 
> Verified in Vagrant.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63281: Clean up TaskList component layout.

2017-10-25 Thread Kai Huang


> On Oct. 25, 2017, 5:10 p.m., Kai Huang wrote:
> > ui/src/main/js/components/TaskListItemActions.js
> > Lines 3 (patched)
> > <https://reviews.apache.org/r/63281/diff/1/?file=1867950#file1867950line3>
> >
> > s/TaskListItemActioins/TaskListItemHost/ ?
> 
> David McLaughlin wrote:
> The idea is that we might add more actions in the future.

Ah, that will make sense.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63281/#review189207
---


On Oct. 24, 2017, 11:32 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63281/
> ---
> 
> (Updated Oct. 24, 2017, 11:32 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Clean up TaskList component layout by separating out some of the components 
> (to help with extensibility).
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/TaskList.js 
> 4a4b8d365737b002acbe47c2d9b0253c5ce4f2ed 
>   ui/src/main/js/components/TaskListItem.js PRE-CREATION 
>   ui/src/main/js/components/TaskListItemActions.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/TaskList-test.js 
> c222b61de65c6333e9689123302206802871eb90 
>   ui/src/main/js/components/__tests__/TaskListItem-test.js PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63281/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63281: Clean up TaskList component layout.

2017-10-25 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63281/#review189207
---


Ship it!




LGTM


ui/src/main/js/components/TaskListItemActions.js
Lines 3 (patched)
<https://reviews.apache.org/r/63281/#comment266175>

s/TaskListItemActioins/TaskListItemHost/ ?


- Kai Huang


On Oct. 24, 2017, 11:32 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63281/
> ---
> 
> (Updated Oct. 24, 2017, 11:32 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Clean up TaskList component layout by separating out some of the components 
> (to help with extensibility).
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/TaskList.js 
> 4a4b8d365737b002acbe47c2d9b0253c5ce4f2ed 
>   ui/src/main/js/components/TaskListItem.js PRE-CREATION 
>   ui/src/main/js/components/TaskListItemActions.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/TaskList-test.js 
> c222b61de65c6333e9689123302206802871eb90 
>   ui/src/main/js/components/__tests__/TaskListItem-test.js PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63281/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63260: Fix alignment of text on JobList

2017-10-24 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63260/#review189096
---


Ship it!




Ship It!

- Kai Huang


On Oct. 24, 2017, 8:44 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63260/
> ---
> 
> (Updated Oct. 24, 2017, 8:44 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> They mix of upper and lowercase letters is making the centered text look off 
> (when the box model is actually doing the right thing). To compensate, 
> manually raise the job name text.
> 
> 
> Diffs
> -
> 
>   ui/src/main/sass/components/_job-list-page.scss 
> 4738de849cf13ced71b2ace3cae7c8a9a8342698 
> 
> 
> Diff: https://reviews.apache.org/r/63260/diff/1/
> 
> 
> Testing
> ---
> 
> In Vagrant (see screenshots).
> 
> 
> File Attachments
> 
> 
> before
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/24/7f0997c4-2cf6-42fc-b0d9-07bd43a61767__Screen_Shot_2017-10-24_at_1.41.50_PM.png
> after
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/24/7b5ee436-5f72-4a61-9c99-cd60d1a08efc__Screen_Shot_2017-10-24_at_1.42.02_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63197: Fix back button issue on Jobs page

2017-10-23 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63197/#review188948
---


Ship it!




Ship It!

- Kai Huang


On Oct. 21, 2017, 1:34 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63197/
> ---
> 
> (Updated Oct. 21, 2017, 1:34 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> The Tabs component manages its own state, and the parent component controls 
> the activeTab based on the URL by setting the 'activeTab' property. So we 
> need to listen to changes to the property (and to the state) in order to 
> support back (and forward) button. 
> 
> FWIW, this is the type of problem that disappears with the bloated and 
> complex Redux style architecture.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/Tabs.js a48c600ddea38481fbfc4e6af4f42ed1fb834287 
> 
> 
> Diff: https://reviews.apache.org/r/63197/diff/1/
> 
> 
> Testing
> ---
> 
> Tested in Vagrant and confirmed back button now works.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63135: Refactor Job Page to make it more pluggable

2017-10-19 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63135/#review188749
---



Fix the gradle build error:
```
* Exception is:
org.gradle.api.tasks.TaskExecutionException: Execution failed for task 
':ui:test'.
at 
org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeActions(ExecuteActionsTaskExecuter.java:100)
```

- Kai Huang


On Oct. 19, 2017, 4 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63135/
> ---
> 
> (Updated Oct. 19, 2017, 4 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Refactor Job Page to create cleaner abstractions. Also refactors Tabs to make 
> them easier to write tests for (using shallow from enzyme). 
> 
> Also improved Pagination code (was generating a warning) now that React can 
> handle returning null from components (and doing the right thing).
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobHistory.js PRE-CREATION 
>   ui/src/main/js/components/JobOverview.js PRE-CREATION 
>   ui/src/main/js/components/JobStatus.js PRE-CREATION 
>   ui/src/main/js/components/Pagination.js 
> 094ef2b248bd802ec63f562a978fd1b4d4bb5eb8 
>   ui/src/main/js/components/Tabs.js e382aa74e0f73e78c8092a58a605bffc54d15a81 
>   ui/src/main/js/components/__tests__/JobHistory-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/JobStatus-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Tabs-test.js 
> 252e5e001478b5a727e208b6b2d09d10128e83b9 
>   ui/src/main/js/pages/Job.js 070f1e43e4fe9c9eac688a9514c14e3296ef41af 
>   ui/src/main/js/pages/__tests__/Job-test.js 
> 03ef20ddefcc915ed4bd6242de3fe8686efdc202 
> 
> 
> Diff: https://reviews.apache.org/r/63135/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63098: Clean up Job Page CSS

2017-10-18 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63098/#review188524
---


Ship it!




Ship It!

- Kai Huang


On Oct. 18, 2017, 3:29 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63098/
> ---
> 
> (Updated Oct. 18, 2017, 3:29 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Clean up Job Page CSS. 
> 
> * Make update list smaller (was too dominant on the page).
> * Show update progress/size of history.
> * Tidy up whitespace. 
> * Move expander to end of task list item.
> * Wrap the main job overview loading element in a panel group to prevent 
> jarring page change as content loads.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/TaskConfigSummary.js 
> 43b50d9e485d1d8c03572453c4c01a01c528ffe7 
>   ui/src/main/js/components/TaskList.js 
> 5a61de8fc46cc729906eb756c54c208320d34da7 
>   ui/src/main/js/components/UpdateList.js 
> 2df28394713d31da7d60b6b32028c16613bec3a0 
>   ui/src/main/js/pages/Job.js 5f92ad0b801c047f6c056623069438a5aa781c02 
>   ui/src/main/js/pages/__tests__/Job-test.js 
> 2b126b65bb09c40a528f87148942615dd035e36d 
>   ui/src/main/sass/components/_instance-page.scss 
> 99204fdfca4441d824c3dfff083f78e1d094b4c9 
>   ui/src/main/sass/components/_task-list.scss 
> a6e2f0a1994134381c6d16b05209771ab2b09988 
>   ui/src/main/sass/components/_update-list.scss 
> 83a1f5a07291ea3aaabb8145877f5d7f8d8f433e 
> 
> 
> Diff: https://reviews.apache.org/r/63098/diff/2/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> See screenshots.
> 
> 
> File Attachments
> 
> 
> Tighter task list item. 
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/18/8effbff4-5b1a-4883-857e-b6107ff25c6c__Screen_Shot_2017-10-17_at_8.06.09_PM.png
> Job Update List
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/18/9b23-1d3b-4eb4-94c3-cb0b332dfd95__Screen_Shot_2017-10-17_at_8.06.16_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63082: Hide InstanceHistory when there are no old tasks.

2017-10-17 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63082/#review188348
---


Ship it!




Ship It!

- Kai Huang


On Oct. 17, 2017, 5:30 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63082/
> ---
> 
> (Updated Oct. 17, 2017, 5:30 p.m.)
> 
> 
> Review request for Aurora, Kai Huang, Reza Motamedi, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Hide InstanceHistory when no old tasks.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/InstanceHistory.js 
> fb0639079d227ba19cf97ed1061d6a55db0f0497 
>   ui/src/main/js/components/__tests__/InstanceHistory-test.js 
> 16314810312a48d44faeaa795640db48b225d243 
> 
> 
> Diff: https://reviews.apache.org/r/63082/diff/2/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> No old tasks
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/17/be3f31f6-d681-48e2-8ca7-708a85ca278b__Screen_Shot_2017-10-17_at_10.28.08_AM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63079: Center pagination links when not a table.

2017-10-17 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63079/#review188347
---


Ship it!




Ship It!

- Kai Huang


On Oct. 17, 2017, 5:14 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63079/
> ---
> 
> (Updated Oct. 17, 2017, 5:14 p.m.)
> 
> 
> Review request for Aurora, Kai Huang, Reza Motamedi, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Center pagination links when not a table.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/Pagination.js 
> 6a8b73ed0a9466e263c1647db97c2e3fce381c3f 
>   ui/src/main/sass/app.scss 3a799b658ad6c94144cda7beec38c4b72ec393d8 
>   ui/src/main/sass/components/_pagination.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63079/diff/1/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> Updates Page fix
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/17/6f2323d9-05b7-4e55-b94b-b2429fb5e0de__Screen_Shot_2017-10-17_at_10.13.59_AM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63081: Format constraints on Task Config Summary

2017-10-17 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63081/#review188346
---


Ship it!




Ship It!

- Kai Huang


On Oct. 17, 2017, 5:21 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63081/
> ---
> 
> (Updated Oct. 17, 2017, 5:21 p.m.)
> 
> 
> Review request for Aurora, Kai Huang, Reza Motamedi, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Format constraints on Task Config Summary
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/TaskConfigSummary.js 
> 69b1a40cc4419d5aecb3da7c8bcfb549b789dc7f 
>   ui/src/main/sass/components/_job-page.scss 
> 8523fcf4c917b46c9d514325f073bd5788798156 
> 
> 
> Diff: https://reviews.apache.org/r/63081/diff/1/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> Constraint fix
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/17/020c1035-0e54-42ad-b685-17130e370e51__Screen_Shot_2017-10-17_at_10.20.52_AM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 63062: Expose list of neighbors in the instance page

2017-10-17 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63062/#review188341
---


Fix it, then Ship it!




LGTM


ui/src/main/sass/components/_instance-page.scss
Lines 31 (patched)
<https://reviews.apache.org/r/63062/#comment265326>

s/taskkey/task-key/
or task-instance-id


- Kai Huang


On Oct. 17, 2017, 5:34 a.m., Reza Motamedi wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63062/
> ---
> 
> (Updated Oct. 17, 2017, 5:34 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> # Expose list of neighbors in the instance page
> 
> Exposing the list of neighbor task on the host is useful when application 
> tail letencies degrade because of noisy neighbors. Currenlty our service 
> oweners need to jump to the thermos-UI to get this info. This feature will 
> definately help spotting misbehaving neighbors a lot easier.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/TaskNeighbors.js PRE-CREATION 
>   ui/src/main/js/components/TaskStatus.js 
> b5149182c80b0edcdec88d5c6e24d3015409b0fc 
>   ui/src/main/js/components/__tests__/TaskNeighbors-test.js PRE-CREATION 
>   ui/src/main/js/pages/Instance.js c4d625c063b6037ae2ff93e638a3040436515cd5 
>   ui/src/main/js/utils/Common.js 8f2da7c8a5b44d2ded89a2070ea6ac4bf1fb6ed5 
>   ui/src/main/sass/components/_instance-page.scss 
> 99204fdfca4441d824c3dfff083f78e1d094b4c9 
> 
> 
> Diff: https://reviews.apache.org/r/63062/diff/3/
> 
> 
> Testing
> ---
> 
> ? ./gradlew ui:test
> ? ./gradlew ui:lint
> 
> 
> File Attachments
> 
> 
> Screen Shot 2017-10-16 at 6.27.44 PM.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/17/7fa71abf-f83e-412f-b554-dc880ccdb25c__Screen_Shot_2017-10-16_at_6.27.44_PM.png
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>



Re: Review Request 63065: Fix link on Navigation logo

2017-10-17 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63065/#review188339
---


Ship it!




Ship It!

- Kai Huang


On Oct. 17, 2017, 4:10 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63065/
> ---
> 
> (Updated Oct. 17, 2017, 4:10 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Fix link on Navigation logo
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/Navigation.js 
> e46cafdaec08d24698b60434fb821f9933f130b9 
> 
> 
> Diff: https://reviews.apache.org/r/63065/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62958: Add URL handling for tab switching on Job page

2017-10-16 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62958/#review188181
---


Ship it!




LGTM

- Kai Huang


On Oct. 13, 2017, 12:10 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62958/
> ---
> 
> (Updated Oct. 13, 2017, 12:10 a.m.)
> 
> 
> Review request for Aurora, Kai Huang, Reza Motamedi, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This maintains the existing UI functionality where clicking on tabs updates 
> the browser location, to create shareable URLs to tab views.
> 
> 
> Diffs
> -
> 
>   ui/package.json cde8d106346d9ac498cfee9b5291ebc637fc6a2a 
>   ui/src/main/js/components/Tabs.js 43b1950b11b80dc3017730801992341bc527c39c 
>   ui/src/main/js/components/__tests__/Tabs-test.js 
> e028c2dd86739ed9762aa1d0be5c609d5487e06e 
>   ui/src/main/js/pages/Job.js fc400f7442a1f8a5f0ebfe366dfe40ef40e7108e 
>   ui/src/main/js/pages/__tests__/Job-test.js 
> 4cc76b8731d71ceca87a5d1e259360b2af8feba0 
> 
> 
> Diff: https://reviews.apache.org/r/62958/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> Unfortunately the testing here is only on one side (restoring the URL state). 
> I cannot simulate the change events from the unit tests because of 
> limitations with shallow rendering. When I figure out how to add integration 
> tests that work with React Router, I'll add coverage.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62955: Manage Bootstrap with webpack in new UI

2017-10-12 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62955/#review187909
---


Ship it!




Ship It!

- Kai Huang


On Oct. 12, 2017, 10:22 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62955/
> ---
> 
> (Updated Oct. 12, 2017, 10:22 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This serves bootstrap and all static dependencies via the UI bundle generated 
> by webpack. It means we can remove the vendored code from the repo when we 
> delete the old UI. In additon, the Scheduler no longer has to serve them as 
> resources.
> 
> 
> Diffs
> -
> 
>   .gitignore 2f24af533a93615df9545a99a3acaa86ed9e2bc5 
>   src/main/resources/scheduler/assets/scheduler/new-index.html 
> d59a6f50bd8661e38e796f4b087bc25a09adb061 
>   ui/package.json cde8d106346d9ac498cfee9b5291ebc637fc6a2a 
>   ui/src/main/js/index.js 13d722fa06f7c5c86f16eae772367f1438ad795f 
>   ui/webpack.config.js 4fd7b352627ac8e3dc55efc4fadb17a96894b937 
> 
> 
> Diff: https://reviews.apache.org/r/62955/diff/1/
> 
> 
> Testing
> ---
> 
> Verified bootstrap styles and fonts (icons) are still applied in Vagrant.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62908: Implement Job page in React

2017-10-11 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62908/#review187732
---


Ship it!




Ship It!

- Kai Huang


On Oct. 12, 2017, 2:46 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62908/
> ---
> 
> (Updated Oct. 12, 2017, 2:46 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Job page implemented in React. This is the last of the big commits before a 
> bug bash and other small nits than can (hopefully) be collaborated on. 
> 
> Added deep configuration diff widget (for jobs with multiple configs) but 
> otherwise very similar to the existing page.
> 
> 
> Diffs
> -
> 
>   ui/package.json f4532dfcb7c7b65445190a8039eca5d89b4dcd56 
>   ui/src/main/js/components/Breadcrumb.js 
> 76c62709cf3ebf76e74e08573bc8b245a45d5fe9 
>   ui/src/main/js/components/ConfigDiff.js PRE-CREATION 
>   ui/src/main/js/components/JobConfig.js PRE-CREATION 
>   ui/src/main/js/components/Pagination.js 
> 0aac09d7360eee1cf971b35d1874afe4fa95b0f7 
>   ui/src/main/js/components/Tabs.js PRE-CREATION 
>   ui/src/main/js/components/TaskConfigSummary.js PRE-CREATION 
>   ui/src/main/js/components/TaskList.js PRE-CREATION 
>   ui/src/main/js/components/TaskStateMachine.js PRE-CREATION 
>   ui/src/main/js/components/UpdateList.js 
> 3f57669140b04b29e12003d68ded387825e83f85 
>   ui/src/main/js/components/UpdatePreview.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js 
> 47f7afb29b9192e214040097dbfee2ff7f7e29ed 
>   ui/src/main/js/components/__tests__/ConfigDiff-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/JobConfig-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Tabs-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/TaskList-test.js PRE-CREATION 
>   ui/src/main/js/index.js 30646f78a422788c5dbc07f97ca61b739a339bd9 
>   ui/src/main/js/pages/Job.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Job-test.js PRE-CREATION 
>   ui/src/main/js/test-utils/TaskBuilders.js 
> 35b915217f67a712710473b5901d964f1675e4d9 
>   ui/src/main/js/utils/Task.js c58ff7d86af013bd977f4b114c3be5f84aac 
>   ui/src/main/sass/app.scss 315b666bd4b56b6c0c1f5b53b38a9cd61a1e4b9c 
>   ui/src/main/sass/components/_job-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_task-list.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62908/diff/2/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> See screenshots for live API testing.
> 
> 
> File Attachments
> 
> 
> Simple job - no updates
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/b6b53783-67d5-4cd8-a6ee-a9dc93f6f3e1__Screen_Shot_2017-10-11_at_5.16.48_PM.png
> Job With Update In Progress
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/552bfba1-eb4d-4dfd-a21a-9d73d715230b__Screen_Shot_2017-10-11_at_5.18.29_PM.png
> Configuration Tab - with diffing
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/6be8dc50-e231-4430-b3cf-7e383fc420bd__Screen_Shot_2017-10-11_at_5.19.25_PM.png
> Multi-Config Diffing support
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/5bb4b753-af84-4fe1-9ce3-4110b2c96225__Screen_Shot_2017-10-11_at_5.20.15_PM.png
> Expanded task item with update history too
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/7f2fc9e4-fa25-4b6b-ae50-afa90ac6992c__Screen_Shot_2017-10-11_at_5.20.39_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62908: Implement Job page in React

2017-10-11 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62908/#review187721
---




ui/src/main/js/components/ConfigDiff.js
Lines 19 (patched)
<https://reviews.apache.org/r/62908/#comment264767>

1. Why use `diff-before` here if the css style has applied in the wrapper 
of picker(divNavigation below)?

2. Is this an unimplemented css style?

Perhaps we should add a legend alongside the instanceGroup picker, so that 
it's more intuitive to figure out the color theory.



ui/src/main/js/components/JobConfig.js
Lines 16 (patched)
<https://reviews.apache.org/r/62908/#comment264776>

UX:
When there are 3 configs on the page, there is a lot of empty space on the 
right. Can we fill each row by 3 columns?



ui/src/main/js/components/TaskConfigSummary.js
Lines 55 (patched)
<https://reviews.apache.org/r/62908/#comment264774>

remove the `?



ui/src/main/js/components/__tests__/ConfigDiff-test.js
Lines 44 (patched)
<https://reviews.apache.org/r/62908/#comment264770>

add a space: dropdownwhen



ui/src/main/js/pages/Job.js
Lines 3 (patched)
<https://reviews.apache.org/r/62908/#comment264777>

It appers the link in the navigation Breadcrumb still points to the old 
scheduler ui. We should fixing the links since it's going to production.


- Kai Huang


On Oct. 12, 2017, 12:25 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62908/
> ---
> 
> (Updated Oct. 12, 2017, 12:25 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Job page implemented in React. This is the last of the big commits before a 
> bug bash and other small nits than can (hopefully) be collaborated on. 
> 
> Added deep configuration diff widget (for jobs with multiple configs) but 
> otherwise very similar to the existing page.
> 
> 
> Diffs
> -
> 
>   ui/package.json f4532dfcb7c7b65445190a8039eca5d89b4dcd56 
>   ui/src/main/js/components/ConfigDiff.js PRE-CREATION 
>   ui/src/main/js/components/JobConfig.js PRE-CREATION 
>   ui/src/main/js/components/Pagination.js 
> 0aac09d7360eee1cf971b35d1874afe4fa95b0f7 
>   ui/src/main/js/components/Tabs.js PRE-CREATION 
>   ui/src/main/js/components/TaskConfigSummary.js PRE-CREATION 
>   ui/src/main/js/components/TaskList.js PRE-CREATION 
>   ui/src/main/js/components/TaskStateMachine.js PRE-CREATION 
>   ui/src/main/js/components/UpdateList.js 
> 3f57669140b04b29e12003d68ded387825e83f85 
>   ui/src/main/js/components/UpdatePreview.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/ConfigDiff-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/JobConfig-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Tabs-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/TaskList-test.js PRE-CREATION 
>   ui/src/main/js/index.js 30646f78a422788c5dbc07f97ca61b739a339bd9 
>   ui/src/main/js/pages/Job.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Job-test.js PRE-CREATION 
>   ui/src/main/js/test-utils/TaskBuilders.js 
> 35b915217f67a712710473b5901d964f1675e4d9 
>   ui/src/main/js/utils/Task.js c58ff7d86af013bd977f4b114c3be5f84aac 
>   ui/src/main/sass/app.scss 315b666bd4b56b6c0c1f5b53b38a9cd61a1e4b9c 
>   ui/src/main/sass/components/_job-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_task-list.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62908/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> See screenshots for live API testing.
> 
> 
> File Attachments
> 
> 
> Simple job - no updates
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/b6b53783-67d5-4cd8-a6ee-a9dc93f6f3e1__Screen_Shot_2017-10-11_at_5.16.48_PM.png
> Job With Update In Progress
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/552bfba1-eb4d-4dfd-a21a-9d73d715230b__Screen_Shot_2017-10-11_at_5.18.29_PM.png
> Configuration Tab - with diffing
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/6be8dc50-e231-4430-b3cf-7e383fc420bd__Screen_Shot_2017-10-11_at_5.19.25_PM.png
> Multi-Config Diffing support
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/5bb4b753-af84-4fe1-9ce3-4110b2c96225__Screen_Shot_2017-10-11_at_5.20.15_PM.png
> Expanded task item with update history too
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/12/7f2fc9e4-fa25-4b6b-ae50-afa90ac6992c__Screen_Shot_2017-10-11_at_5.20.39_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62720: Implement Instance pages in React

2017-10-09 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62720/#review187435
---


Ship it!




LGTM, except missing some test cases for showing correct task duration.


ui/src/main/js/components/__tests__/InstanceHistoryItem-test.js
Lines 54 (patched)
<https://reviews.apache.org/r/62720/#comment264387>

Add a few more test case for this suite


- Kai Huang


On Oct. 5, 2017, 11:50 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62720/
> ---
> 
> (Updated Oct. 5, 2017, 11:50 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Instance page moved over to the new UI. 
> 
> One of the big issues I ran into here was dealing with Thrift types. 
> Currently the Thrift compiler generates a file for use in JS. The compiler 
> gives you two options for JavaScript - one is 'node' which plays well with 
> our module system but assumes you'll be using the built-in socket 
> implementation in node, and then one for 'jquery' which uses the global 
> namespace but doesn't use vars (so we can't eval as it breaks strict mode). 
> So the TL;DR is - to write tests against scheduler states, I've had to copy 
> and paste the Thrift enums into a test setup file. 
> 
> The alternative is to simply ignore Thrift types and duplicate it all without 
> relying on globals. Thus making sure the tests and application code are using 
> the same enums. However, this approach won't help you find any UI regressions 
> when you update Thrift - and even worse, you won't spot any API regressions 
> when developing the UI. So I've just gone for this approach for now.
> 
> 
> Diffs
> -
> 
>   ui/.eslintrc 8d37c60e1edd4528ce4abdf6dda18ec2dfc078f4 
>   ui/package.json fe0397a888b337137c1d13b6f9559dae13698b0a 
>   ui/src/main/js/components/InstanceHistory.js PRE-CREATION 
>   ui/src/main/js/components/InstanceHistoryItem.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js 
> 4ca54e318786859e23dc0444d7d2750817428e81 
>   ui/src/main/js/components/StateMachine.js PRE-CREATION 
>   ui/src/main/js/components/TaskDetails.js PRE-CREATION 
>   ui/src/main/js/components/TaskStatus.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistory-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistoryItem-test.js 
> PRE-CREATION 
>   ui/src/main/js/components/__tests__/StateMachine-test.js PRE-CREATION 
>   ui/src/main/js/index.js 4a879e6163eaabc203f2d3133419438c4e9730e8 
>   ui/src/main/js/pages/Instance.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Instance-test.js PRE-CREATION 
>   ui/src/main/js/test-utils/Builder.js PRE-CREATION 
>   ui/src/main/js/test-utils/TaskBuilders.js PRE-CREATION 
>   ui/src/main/js/test-utils/__tests__/Builder-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js 2e12e3cc6af75a62de9f11797f89dbf3279d2ce6 
>   ui/src/main/js/utils/Task.js PRE-CREATION 
>   ui/src/main/js/utils/Thrift.js PRE-CREATION 
>   ui/src/main/sass/app.scss d9673d0e8faa0b565f92e068ed4e01de0c789e8d 
>   ui/src/main/sass/components/_instance-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_job-list-page.scss 
> d31344d8eabb41114f4ff4678ff841119dfdac82 
>   ui/src/main/sass/components/_layout.scss 
> 1d0553b8fdee2c9e11727831b7a112f534ce3ec6 
>   ui/src/main/sass/components/_state-machine.scss PRE-CREATION 
>   ui/src/main/sass/components/_status.scss PRE-CREATION 
>   ui/test-setup.js 054e7c2302aa70bc94a18581af5bd8e4e70559b1 
> 
> 
> Diff: https://reviews.apache.org/r/62720/diff/3/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> Testing in Vagrant.
> 
> 
> File Attachments
> 
> 
> Active Task with State Machine component
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/04/16f93047-a94b-4ffa-92f2-7fc27dbed9c9__Screen_Shot_2017-10-03_at_10.12.34_PM.png
> Expanded Finished Task
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/04/f5daaf35-4e4b-4106-b7ce-2773b97d2277__Screen_Shot_2017-10-03_at_10.28.18_PM.png
> Tooltip
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/04/0fe652c3-48a3-4292-8179-7d09bfbcc1e4__Screen_Shot_2017-10-03_at_10.28.33_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62720: Implement Instance pages in React

2017-10-04 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62720/#review187150
---




ui/src/main/js/components/StateMachine.js
Lines 37 (patched)
<https://reviews.apache.org/r/62720/#comment264064>

Add a key property here to suppress the warnings in the browser console?


- Kai Huang


On Oct. 4, 2017, 5:43 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62720/
> ---
> 
> (Updated Oct. 4, 2017, 5:43 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Instance page moved over to the new UI. 
> 
> One of the big issues I ran into here was dealing with Thrift types. 
> Currently the Thrift compiler generates a file for use in JS. The compiler 
> gives you two options for JavaScript - one is 'node' which plays well with 
> our module system but assumes you'll be using the built-in socket 
> implementation in node, and then one for 'jquery' which uses the global 
> namespace but doesn't use vars (so we can't eval as it breaks strict mode). 
> So the TL;DR is - to write tests against scheduler states, I've had to copy 
> and paste the Thrift enums into a test setup file. 
> 
> The alternative is to simply ignore Thrift types and duplicate it all without 
> relying on globals. Thus making sure the tests and application code are using 
> the same enums. However, this approach won't help you find any UI regressions 
> when you update Thrift - and even worse, you won't spot any API regressions 
> when developing the UI. So I've just gone for this approach for now.
> 
> 
> Diffs
> -
> 
>   ui/.eslintrc 8d37c60e1edd4528ce4abdf6dda18ec2dfc078f4 
>   ui/package.json fe0397a888b337137c1d13b6f9559dae13698b0a 
>   ui/src/main/js/components/InstanceHistory.js PRE-CREATION 
>   ui/src/main/js/components/InstanceHistoryItem.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js 
> 4ca54e318786859e23dc0444d7d2750817428e81 
>   ui/src/main/js/components/StateMachine.js PRE-CREATION 
>   ui/src/main/js/components/TaskDetails.js PRE-CREATION 
>   ui/src/main/js/components/TaskStatus.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistory-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistoryItem-test.js 
> PRE-CREATION 
>   ui/src/main/js/components/__tests__/StateMachine-test.js PRE-CREATION 
>   ui/src/main/js/index.js 4a879e6163eaabc203f2d3133419438c4e9730e8 
>   ui/src/main/js/pages/Instance.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Instance-test.js PRE-CREATION 
>   ui/src/main/js/test-utils/Builder.js PRE-CREATION 
>   ui/src/main/js/test-utils/TaskBuilders.js PRE-CREATION 
>   ui/src/main/js/test-utils/__tests__/Builder-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js 2e12e3cc6af75a62de9f11797f89dbf3279d2ce6 
>   ui/src/main/js/utils/Task.js PRE-CREATION 
>   ui/src/main/js/utils/Thrift.js PRE-CREATION 
>   ui/src/main/sass/app.scss d9673d0e8faa0b565f92e068ed4e01de0c789e8d 
>   ui/src/main/sass/components/_instance-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_job-list-page.scss 
> d31344d8eabb41114f4ff4678ff841119dfdac82 
>   ui/src/main/sass/components/_layout.scss 
> 1d0553b8fdee2c9e11727831b7a112f534ce3ec6 
>   ui/src/main/sass/components/_state-machine.scss PRE-CREATION 
>   ui/src/main/sass/components/_status.scss PRE-CREATION 
>   ui/test-setup.js 054e7c2302aa70bc94a18581af5bd8e4e70559b1 
> 
> 
> Diff: https://reviews.apache.org/r/62720/diff/2/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> Testing in Vagrant.
> 
> 
> File Attachments
> 
> 
> Active Task with State Machine component
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/04/16f93047-a94b-4ffa-92f2-7fc27dbed9c9__Screen_Shot_2017-10-03_at_10.12.34_PM.png
> Expanded Finished Task
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/04/f5daaf35-4e4b-4106-b7ce-2773b97d2277__Screen_Shot_2017-10-03_at_10.28.18_PM.png
> Tooltip
>   
> https://reviews.apache.org/media/uploaded/files/2017/10/04/0fe652c3-48a3-4292-8179-7d09bfbcc1e4__Screen_Shot_2017-10-03_at_10.28.33_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62720: Implement Instance pages in React

2017-10-04 Thread Kai Huang


> On Oct. 4, 2017, 9:01 p.m., Kai Huang wrote:
> > ui/src/main/js/index.js
> > Line 21 (original), 22 (patched)
> > <https://reviews.apache.org/r/62720/diff/2/?file=1845631#file1845631line22>
> >
> > I'm not sure why I cannot access the job page and the instance page.
> > Visiting this url 
> > http://192.168.33.7:8081/beta/scheduler/vagrant/test/hello/0 on my vagrant 
> > image gives me the following error:
> > 
> > ```
> > Home.js:15 Uncaught TypeError: Cannot read property 'getRoleSummary' of 
> > undefined
> > at HomePage.componentWillMount (Home.js:15)
> > at setComponentProps (preact.esm.js:617)
> > at renderComponent (preact.esm.js:711)
> > at setComponentProps (preact.esm.js:634)
> > at buildComponentFromVNode (preact.esm.js:814)
> > at idiff (preact.esm.js:355)
> > at innerDiffNode (preact.esm.js:474)
> > at idiff (preact.esm.js:397)
> > at diff (preact.esm.js:306)
> > at renderComponent (preact.esm.js:727)
> > ```
> 
> David McLaughlin wrote:
> I don't think you've applied the patch properly then? It means the api 
> isn't injected.

It turns out to be an issue of my local vagrant environment. Please ignore this 
comment.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62720/#review187129
---


On Oct. 4, 2017, 5:43 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62720/
> ---
> 
> (Updated Oct. 4, 2017, 5:43 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Instance page moved over to the new UI. 
> 
> One of the big issues I ran into here was dealing with Thrift types. 
> Currently the Thrift compiler generates a file for use in JS. The compiler 
> gives you two options for JavaScript - one is 'node' which plays well with 
> our module system but assumes you'll be using the built-in socket 
> implementation in node, and then one for 'jquery' which uses the global 
> namespace but doesn't use vars (so we can't eval as it breaks strict mode). 
> So the TL;DR is - to write tests against scheduler states, I've had to copy 
> and paste the Thrift enums into a test setup file. 
> 
> The alternative is to simply ignore Thrift types and duplicate it all without 
> relying on globals. Thus making sure the tests and application code are using 
> the same enums. However, this approach won't help you find any UI regressions 
> when you update Thrift - and even worse, you won't spot any API regressions 
> when developing the UI. So I've just gone for this approach for now.
> 
> 
> Diffs
> -
> 
>   ui/.eslintrc 8d37c60e1edd4528ce4abdf6dda18ec2dfc078f4 
>   ui/package.json fe0397a888b337137c1d13b6f9559dae13698b0a 
>   ui/src/main/js/components/InstanceHistory.js PRE-CREATION 
>   ui/src/main/js/components/InstanceHistoryItem.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js 
> 4ca54e318786859e23dc0444d7d2750817428e81 
>   ui/src/main/js/components/StateMachine.js PRE-CREATION 
>   ui/src/main/js/components/TaskDetails.js PRE-CREATION 
>   ui/src/main/js/components/TaskStatus.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistory-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistoryItem-test.js 
> PRE-CREATION 
>   ui/src/main/js/components/__tests__/StateMachine-test.js PRE-CREATION 
>   ui/src/main/js/index.js 4a879e6163eaabc203f2d3133419438c4e9730e8 
>   ui/src/main/js/pages/Instance.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Instance-test.js PRE-CREATION 
>   ui/src/main/js/test-utils/Builder.js PRE-CREATION 
>   ui/src/main/js/test-utils/TaskBuilders.js PRE-CREATION 
>   ui/src/main/js/test-utils/__tests__/Builder-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js 2e12e3cc6af75a62de9f11797f89dbf3279d2ce6 
>   ui/src/main/js/utils/Task.js PRE-CREATION 
>   ui/src/main/js/utils/Thrift.js PRE-CREATION 
>   ui/src/main/sass/app.scss d9673d0e8faa0b565f92e068ed4e01de0c789e8d 
>   ui/src/main/sass/components/_instance-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_job-list-page.scss 
> d31344d8eabb41114f4ff4678ff841119dfdac82 
>   ui/src/main/sass/components/_layout.scss 
> 1d0553b8fdee2c9e11727831b7a112f534ce3ec6 
>

Re: Review Request 62720: Implement Instance pages in React

2017-10-04 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62720/#review187129
---




ui/src/main/js/index.js
Line 21 (original), 22 (patched)
<https://reviews.apache.org/r/62720/#comment264050>

I'm not sure why I cannot access the job page and the instance page.
Visiting this url 
http://192.168.33.7:8081/beta/scheduler/vagrant/test/hello/0 on my vagrant 
image gives me the following error:

```
Home.js:15 Uncaught TypeError: Cannot read property 'getRoleSummary' of 
undefined
at HomePage.componentWillMount (Home.js:15)
at setComponentProps (preact.esm.js:617)
at renderComponent (preact.esm.js:711)
at setComponentProps (preact.esm.js:634)
at buildComponentFromVNode (preact.esm.js:814)
at idiff (preact.esm.js:355)
at innerDiffNode (preact.esm.js:474)
at idiff (preact.esm.js:397)
at diff (preact.esm.js:306)
at renderComponent (preact.esm.js:727)
```


- Kai Huang


On Oct. 4, 2017, 5:43 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62720/
> ---
> 
> (Updated Oct. 4, 2017, 5:43 a.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Instance page moved over to the new UI. 
> 
> One of the big issues I ran into here was dealing with Thrift types. 
> Currently the Thrift compiler generates a file for use in JS. The compiler 
> gives you two options for JavaScript - one is 'node' which plays well with 
> our module system but assumes you'll be using the built-in socket 
> implementation in node, and then one for 'jquery' which uses the global 
> namespace but doesn't use vars (so we can't eval as it breaks strict mode). 
> So the TL;DR is - to write tests against scheduler states, I've had to copy 
> and paste the Thrift enums into a test setup file. 
> 
> The alternative is to simply ignore Thrift types and duplicate it all without 
> relying on globals. Thus making sure the tests and application code are using 
> the same enums. However, this approach won't help you find any UI regressions 
> when you update Thrift - and even worse, you won't spot any API regressions 
> when developing the UI. So I've just gone for this approach for now.
> 
> 
> Diffs
> -
> 
>   ui/.eslintrc 8d37c60e1edd4528ce4abdf6dda18ec2dfc078f4 
>   ui/package.json fe0397a888b337137c1d13b6f9559dae13698b0a 
>   ui/src/main/js/components/InstanceHistory.js PRE-CREATION 
>   ui/src/main/js/components/InstanceHistoryItem.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js 
> 4ca54e318786859e23dc0444d7d2750817428e81 
>   ui/src/main/js/components/StateMachine.js PRE-CREATION 
>   ui/src/main/js/components/TaskDetails.js PRE-CREATION 
>   ui/src/main/js/components/TaskStatus.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistory-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/InstanceHistoryItem-test.js 
> PRE-CREATION 
>   ui/src/main/js/components/__tests__/StateMachine-test.js PRE-CREATION 
>   ui/src/main/js/index.js 4a879e6163eaabc203f2d3133419438c4e9730e8 
>   ui/src/main/js/pages/Instance.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Instance-test.js PRE-CREATION 
>   ui/src/main/js/test-utils/Builder.js PRE-CREATION 
>   ui/src/main/js/test-utils/TaskBuilders.js PRE-CREATION 
>   ui/src/main/js/test-utils/__tests__/Builder-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js 2e12e3cc6af75a62de9f11797f89dbf3279d2ce6 
>   ui/src/main/js/utils/Task.js PRE-CREATION 
>   ui/src/main/js/utils/Thrift.js PRE-CREATION 
>   ui/src/main/sass/app.scss d9673d0e8faa0b565f92e068ed4e01de0c789e8d 
>   ui/src/main/sass/components/_instance-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_job-list-page.scss 
> d31344d8eabb41114f4ff4678ff841119dfdac82 
>   ui/src/main/sass/components/_layout.scss 
> 1d0553b8fdee2c9e11727831b7a112f534ce3ec6 
>   ui/src/main/sass/components/_state-machine.scss PRE-CREATION 
>   ui/src/main/sass/components/_status.scss PRE-CREATION 
>   ui/test-setup.js 054e7c2302aa70bc94a18581af5bd8e4e70559b1 
> 
> 
> Diff: https://reviews.apache.org/r/62720/diff/2/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:lint
> ./gradlew ui:test
> 
> Testing in Vagrant.
> 
> 
> File Attachments
> 
> 
> Active Task with State Machine component
>   
> https://rev

Re: Review Request 62451: Implement Role and Environment pages in Preact.

2017-09-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62451/#review186511
---


Ship it!




LGTM

- Kai Huang


On Sept. 28, 2017, 12:18 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62451/
> ---
> 
> (Updated Sept. 28, 2017, 12:18 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implement Role and Environment pages in Preact. The color scheme and other 
> design elements aren't final yet - the plan is to wait until the structure is 
> all in place and then go back and style things. 
> 
> Major difference from the current UI is the use of color to visualize the 
> task states and the move away from the table format of the data.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobList.js PRE-CREATION 
>   ui/src/main/js/components/JobListItem.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js PRE-CREATION 
>   ui/src/main/js/components/Pagination.js 
> dec89be93d531f755097ac8b3ee48d643d31262f 
>   ui/src/main/js/components/RoleQuota.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/JobList-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/RoleQuota-test.js PRE-CREATION 
>   ui/src/main/js/index.js 717aaa080d1365785a603ed7d1ece69efdaafec7 
>   ui/src/main/js/pages/Jobs.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Jobs-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js PRE-CREATION 
>   ui/src/main/js/utils/Job.js PRE-CREATION 
>   ui/src/main/sass/app.scss 0b3967d39ed443eaf5bfadb473816135e0a978e4 
>   ui/src/main/sass/components/_job-list-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss 
> b8a83b585fcef11964dc8349703ad453d37b5d2d 
>   ui/src/main/sass/components/_tables.scss 
> 2ea60bc64ccb2b556ef6edf25016f6a01cd075f6 
>   ui/src/main/sass/modules/_colors.scss 
> 147cff691058972aebc5f79f22955cf2448ce48c 
> 
> 
> Diff: https://reviews.apache.org/r/62451/diff/3/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> See screenshot for local development with some Twitter data.
> 
> 
> File Attachments
> 
> 
> Screenshot 
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/21/684d8630-a9b4-4fbf-8955-a7fc118afcce__Screen_Shot_2017-09-21_at_4.11.05_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62451: Implement Role and Environment pages in Preact.

2017-09-27 Thread Kai Huang


> On Sept. 27, 2017, 12:15 a.m., Kai Huang wrote:
> > ui/src/main/js/index.js
> > Line 17 (original), 18 (patched)
> > <https://reviews.apache.org/r/62451/diff/1/?file=1832481#file1832481line18>
> >
> > In the homepage, it seems the role link still points to the old 
> > role/environment page?
> > ```
> > {r.role}
> > ```
> 
> David McLaughlin wrote:
> If you're talking about code not in this review (i.e. RoleList) then that 
> was fixed in another review.

Yes, saw that in the other review.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62451/#review186347
---


On Sept. 28, 2017, 12:18 a.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62451/
> ---
> 
> (Updated Sept. 28, 2017, 12:18 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implement Role and Environment pages in Preact. The color scheme and other 
> design elements aren't final yet - the plan is to wait until the structure is 
> all in place and then go back and style things. 
> 
> Major difference from the current UI is the use of color to visualize the 
> task states and the move away from the table format of the data.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobList.js PRE-CREATION 
>   ui/src/main/js/components/JobListItem.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js PRE-CREATION 
>   ui/src/main/js/components/Pagination.js 
> dec89be93d531f755097ac8b3ee48d643d31262f 
>   ui/src/main/js/components/RoleQuota.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/JobList-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/RoleQuota-test.js PRE-CREATION 
>   ui/src/main/js/index.js 717aaa080d1365785a603ed7d1ece69efdaafec7 
>   ui/src/main/js/pages/Jobs.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Jobs-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js PRE-CREATION 
>   ui/src/main/js/utils/Job.js PRE-CREATION 
>   ui/src/main/sass/app.scss 0b3967d39ed443eaf5bfadb473816135e0a978e4 
>   ui/src/main/sass/components/_job-list-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss 
> b8a83b585fcef11964dc8349703ad453d37b5d2d 
>   ui/src/main/sass/components/_tables.scss 
> 2ea60bc64ccb2b556ef6edf25016f6a01cd075f6 
>   ui/src/main/sass/modules/_colors.scss 
> 147cff691058972aebc5f79f22955cf2448ce48c 
> 
> 
> Diff: https://reviews.apache.org/r/62451/diff/3/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> See screenshot for local development with some Twitter data.
> 
> 
> File Attachments
> 
> 
> Screenshot 
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/21/684d8630-a9b4-4fbf-8955-a7fc118afcce__Screen_Shot_2017-09-21_at_4.11.05_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62607: Replace Preact and custom testing with React + Enzyme

2017-09-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62607/#review186485
---


Ship it!




Ship It!

- Kai Huang


On Sept. 27, 2017, 9:31 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62607/
> ---
> 
> (Updated Sept. 27, 2017, 9:31 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Replace Preact and custom testing with React + jest + Enzyme now that they 
> are MIT licensed.
> 
> Side-effects:
> 
> * The Reactable library is not compatible with React 16.0 (the MIT licensed 
> version we are forced to use) and attempts to flag this started as early as 
> June this year. But the author seems to have abandoned the library. So I had 
> to create my own Pagination class. I found myself repeatedly using Reactable 
> in places where I just wanted pagination (for lists of divs) anyway, so maybe 
> this is a good thing.
> * Deleted custom (and questionable) shallow renderer.
> * The download time will be reduced due to not having to download PhantomJS.
> 
> 
> Diffs
> -
> 
>   build.gradle 460500aaeab28dcfeb29ec602d057ea4bca12378 
>   ui/karma.conf.js d42cda7d025f8c5618039888f4fb49ff388c3af1 
>   ui/package.json d6802029e0aeb9f6692ec8cc065949ce17b2ca07 
>   ui/src/__mocks__/react.js PRE-CREATION 
>   ui/src/main/js/components/Pagination.js PRE-CREATION 
>   ui/src/main/js/components/RoleList.js 
> 32595601aae06730f537d37d95be2e746f779103 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js 
> 18af0fe6c360f375fb0fa1694304d821041d9e16 
>   ui/src/main/js/components/__tests__/Home-test.js 
> 8e6bc09050148ea9711dbfe6f52b83c8210262be 
>   ui/src/main/js/components/__tests__/Pagination-test.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Home-test.js 
> 4f13f99ea24c0252b5f6cbb6980e0c96832d5120 
>   ui/src/main/js/utils/ShallowRender.js 
> 52e8bb259264e7dd47ee97cec35553acc147e818 
>   ui/src/main/js/utils/__tests__/ShallowRender-test.js 
> d5663a7c44de795a5aa0b28d4f373871ffcc3fd2 
>   ui/src/main/sass/components/_tables.scss 
> 58f176cffcdae34de7c776e31b3f0342efa3024f 
>   ui/test-setup.js PRE-CREATION 
>   ui/webpack.config.js e7cd672df62006bd5c5b60a6ba0903e16a767d13 
> 
> 
> Diff: https://reviews.apache.org/r/62607/diff/4/
> 
> 
> Testing
> ---
> 
> $ ./gradlew ui:lint
> $ ./gradlew ui:test
> 
> I also tested the UI in my Vagrant image.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62607: Replace Preact and custom testing with React + Enzyme

2017-09-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62607/#review186469
---




ui/src/main/js/components/Pagination.js
Lines 75 (patched)
<https://reviews.apache.org/r/62607/#comment263020>

nit, the variable name here collide with the `currentPage` attribute in 
Pages component. s/currentPage/currentPageItems/



ui/src/main/js/components/Pagination.js
Lines 87 (patched)
<https://reviews.apache.org/r/62607/#comment263021>

s/maxPages/pageRange/



ui/src/main/js/components/Pagination.js
Lines 93 (patched)
<https://reviews.apache.org/r/62607/#comment263031>

Should this be extracted into renderer? So that we don't need to pass 
isTable flags?


- Kai Huang


On Sept. 27, 2017, 5:51 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62607/
> ---
> 
> (Updated Sept. 27, 2017, 5:51 p.m.)
> 
> 
> Review request for Aurora, Kai Huang and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Replace Preact and custom testing with React + jest + Enzyme now that they 
> are MIT licensed.
> 
> Side-effects:
> 
> * The Reactable library is not compatible with React 16.0 (the MIT licensed 
> version we are forced to use) and attempts to flag this started as early as 
> June this year. But the author seems to have abandoned the library. So I had 
> to create my own Pagination class. I found myself repeatedly using Reactable 
> in places where I just wanted pagination (for lists of divs) anyway, so maybe 
> this is a good thing.
> * Deleted custom (and questionable) shallow renderer.
> * The download time will be reduced due to not having to download PhantomJS.
> 
> 
> Diffs
> -
> 
>   build.gradle 460500aaeab28dcfeb29ec602d057ea4bca12378 
>   ui/karma.conf.js d42cda7d025f8c5618039888f4fb49ff388c3af1 
>   ui/package.json d6802029e0aeb9f6692ec8cc065949ce17b2ca07 
>   ui/src/__mocks__/react.js PRE-CREATION 
>   ui/src/main/js/components/Pagination.js PRE-CREATION 
>   ui/src/main/js/components/RoleList.js 
> 32595601aae06730f537d37d95be2e746f779103 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js 
> 18af0fe6c360f375fb0fa1694304d821041d9e16 
>   ui/src/main/js/components/__tests__/Home-test.js 
> 8e6bc09050148ea9711dbfe6f52b83c8210262be 
>   ui/src/main/js/components/__tests__/Pagination-test.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Home-test.js 
> 4f13f99ea24c0252b5f6cbb6980e0c96832d5120 
>   ui/src/main/js/utils/ShallowRender.js 
> 52e8bb259264e7dd47ee97cec35553acc147e818 
>   ui/src/main/js/utils/__tests__/ShallowRender-test.js 
> d5663a7c44de795a5aa0b28d4f373871ffcc3fd2 
>   ui/src/main/sass/components/_tables.scss 
> 58f176cffcdae34de7c776e31b3f0342efa3024f 
>   ui/test-setup.js PRE-CREATION 
>   ui/webpack.config.js e7cd672df62006bd5c5b60a6ba0903e16a767d13 
> 
> 
> Diff: https://reviews.apache.org/r/62607/diff/3/
> 
> 
> Testing
> ---
> 
> $ ./gradlew ui:lint
> $ ./gradlew ui:test
> 
> I also tested the UI in my Vagrant image.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62451: Implement Role and Environment pages in Preact.

2017-09-26 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62451/#review186347
---




ui/src/main/js/index.js
Line 17 (original), 18 (patched)
<https://reviews.apache.org/r/62451/#comment262891>

In the homepage, it seems the role link still points to the old 
role/environment page?
```
{r.role}
```


- Kai Huang


On Sept. 21, 2017, 11:17 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62451/
> ---
> 
> (Updated Sept. 21, 2017, 11:17 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implement Role and Environment pages in Preact. The color scheme and other 
> design elements aren't final yet - the plan is to wait until the structure is 
> all in place and then go back and style things. 
> 
> Major difference from the current UI is the use of color to visualize the 
> task states and the move away from the table format of the data.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobList.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js PRE-CREATION 
>   ui/src/main/js/components/RoleQuota.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/JobList-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/RoleQuota-test.js PRE-CREATION 
>   ui/src/main/js/index.js 717aaa080d1365785a603ed7d1ece69efdaafec7 
>   ui/src/main/js/pages/Jobs.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Jobs-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js PRE-CREATION 
>   ui/src/main/sass/app.scss 0b3967d39ed443eaf5bfadb473816135e0a978e4 
>   ui/src/main/sass/components/_job-list-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss 
> b8a83b585fcef11964dc8349703ad453d37b5d2d 
>   ui/src/main/sass/components/_tables.scss 
> 58f176cffcdae34de7c776e31b3f0342efa3024f 
>   ui/src/main/sass/modules/_colors.scss 
> 147cff691058972aebc5f79f22955cf2448ce48c 
> 
> 
> Diff: https://reviews.apache.org/r/62451/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> See screenshot for local development with some Twitter data.
> 
> 
> File Attachments
> 
> 
> Screenshot 
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/21/684d8630-a9b4-4fbf-8955-a7fc118afcce__Screen_Shot_2017-09-21_at_4.11.05_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62451: Implement Role and Environment pages in Preact.

2017-09-25 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62451/#review186142
---




ui/src/main/js/components/RoleQuota.js
Lines 70 (patched)
<https://reviews.apache.org/r/62451/#comment262575>

From the user perspective, a pie chart for the resource utilization would 
be much easier to consume.


- Kai Huang


On Sept. 21, 2017, 11:17 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62451/
> ---
> 
> (Updated Sept. 21, 2017, 11:17 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implement Role and Environment pages in Preact. The color scheme and other 
> design elements aren't final yet - the plan is to wait until the structure is 
> all in place and then go back and style things. 
> 
> Major difference from the current UI is the use of color to visualize the 
> task states and the move away from the table format of the data.
> 
> 
> Diffs
> -
> 
>   ui/src/main/js/components/JobList.js PRE-CREATION 
>   ui/src/main/js/components/Layout.js PRE-CREATION 
>   ui/src/main/js/components/RoleQuota.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/JobList-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/RoleQuota-test.js PRE-CREATION 
>   ui/src/main/js/index.js 717aaa080d1365785a603ed7d1ece69efdaafec7 
>   ui/src/main/js/pages/Jobs.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Jobs-test.js PRE-CREATION 
>   ui/src/main/js/utils/Common.js PRE-CREATION 
>   ui/src/main/sass/app.scss 0b3967d39ed443eaf5bfadb473816135e0a978e4 
>   ui/src/main/sass/components/_job-list-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss 
> b8a83b585fcef11964dc8349703ad453d37b5d2d 
>   ui/src/main/sass/components/_tables.scss 
> 58f176cffcdae34de7c776e31b3f0342efa3024f 
>   ui/src/main/sass/modules/_colors.scss 
> 147cff691058972aebc5f79f22955cf2448ce48c 
> 
> 
> Diff: https://reviews.apache.org/r/62451/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> ./gradlew ui:lint
> 
> See screenshot for local development with some Twitter data.
> 
> 
> File Attachments
> 
> 
> Screenshot 
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/21/684d8630-a9b4-4fbf-8955-a7fc118afcce__Screen_Shot_2017-09-21_at_4.11.05_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62135: HomePage implemented in Preact

2017-09-12 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62135/#review185193
---


Ship it!




Ship It!

- Kai Huang


On Sept. 8, 2017, 11:40 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62135/
> ---
> 
> (Updated Sept. 8, 2017, 11:40 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implementation of the home page in PreactJS. This was delayed heavily by the 
> lack of a shallow renderer 
> (https://facebook.github.io/react/docs/shallow-renderer.html) for testing. It 
> was too painful to test without it (e.g. Links when fully rendered must be 
> rendered within a Router context), so I hand-rolled one based on some 
> community attempts. The key motivation is just to allow us to decouple markup 
> from component logic, and also to avoid having to test child components 
> within component heirarchies. IMO the end result is very clean and easy to 
> read (and write!). 
> 
> If Preact gets a shallow renderer in the future, we can swap the one included 
> here out.
> 
> 
> Diffs
> -
> 
>   src/main/resources/scheduler/assets/images/aurora_logo_white.png 
> PRE-CREATION 
>   ui/.eslintrc 355b6a8a5f5d25dd4c71dc546dc7d4acfffdd506 
>   ui/package.json f712518b27477bccb03f49c86eac3ee5f769fc88 
>   ui/src/main/js/client/scheduler-client.js PRE-CREATION 
>   ui/src/main/js/components/Breadcrumb.js PRE-CREATION 
>   ui/src/main/js/components/Home.js 91d60b387cf3b1fb268e73b7b50922a83898c31f 
>   ui/src/main/js/components/Icon.js PRE-CREATION 
>   ui/src/main/js/components/Loading.js PRE-CREATION 
>   ui/src/main/js/components/Navigation.js PRE-CREATION 
>   ui/src/main/js/components/RoleList.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Home-test.js 
> 2a80958d9303d1d3b9ae8f95013c66cb39f6bac3 
>   ui/src/main/js/index.js 2f7467b41ac3373a90c5453a7534d384a585b464 
>   ui/src/main/js/pages/Home.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Home-test.js PRE-CREATION 
>   ui/src/main/js/utils/ShallowRender.js PRE-CREATION 
>   ui/src/main/js/utils/__tests__/ShallowRender-test.js PRE-CREATION 
>   ui/src/main/sass/app.scss PRE-CREATION 
>   ui/src/main/sass/components/_base.scss PRE-CREATION 
>   ui/src/main/sass/components/_breadcrumb.scss PRE-CREATION 
>   ui/src/main/sass/components/_home-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss PRE-CREATION 
>   ui/src/main/sass/components/_navigation.scss PRE-CREATION 
>   ui/src/main/sass/components/_tables.scss PRE-CREATION 
>   ui/src/main/sass/modules/_all.scss PRE-CREATION 
>   ui/src/main/sass/modules/_colors.scss PRE-CREATION 
>   ui/src/main/sass/modules/_typography.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62135/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> 
> Running in vagrant (screenshots attached).
> 
> 
> File Attachments
> 
> 
> preview
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/08/011f99fb-a14e-4547-b90d-a5a1909c737c__Screen_Shot_2017-09-08_at_3.58.48_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62135: HomePage implemented in Preact

2017-09-11 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62135/#review185096
---




ui/src/main/js/utils/__tests__/ShallowRender-test.js
Lines 44 (patched)
<https://reviews.apache.org/r/62135/#comment261325>

nit. Add the following assertion as well, since the name is tesing handlie 
multiple elements?
```
expect(el.contains()).to.be.true;
```


- Kai Huang


On Sept. 8, 2017, 11:40 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62135/
> ---
> 
> (Updated Sept. 8, 2017, 11:40 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implementation of the home page in PreactJS. This was delayed heavily by the 
> lack of a shallow renderer 
> (https://facebook.github.io/react/docs/shallow-renderer.html) for testing. It 
> was too painful to test without it (e.g. Links when fully rendered must be 
> rendered within a Router context), so I hand-rolled one based on some 
> community attempts. The key motivation is just to allow us to decouple markup 
> from component logic, and also to avoid having to test child components 
> within component heirarchies. IMO the end result is very clean and easy to 
> read (and write!). 
> 
> If Preact gets a shallow renderer in the future, we can swap the one included 
> here out.
> 
> 
> Diffs
> -
> 
>   src/main/resources/scheduler/assets/images/aurora_logo_white.png 
> PRE-CREATION 
>   ui/.eslintrc 355b6a8a5f5d25dd4c71dc546dc7d4acfffdd506 
>   ui/package.json f712518b27477bccb03f49c86eac3ee5f769fc88 
>   ui/src/main/js/client/scheduler-client.js PRE-CREATION 
>   ui/src/main/js/components/Breadcrumb.js PRE-CREATION 
>   ui/src/main/js/components/Home.js 91d60b387cf3b1fb268e73b7b50922a83898c31f 
>   ui/src/main/js/components/Icon.js PRE-CREATION 
>   ui/src/main/js/components/Loading.js PRE-CREATION 
>   ui/src/main/js/components/Navigation.js PRE-CREATION 
>   ui/src/main/js/components/RoleList.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Home-test.js 
> 2a80958d9303d1d3b9ae8f95013c66cb39f6bac3 
>   ui/src/main/js/index.js 2f7467b41ac3373a90c5453a7534d384a585b464 
>   ui/src/main/js/pages/Home.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Home-test.js PRE-CREATION 
>   ui/src/main/js/utils/ShallowRender.js PRE-CREATION 
>   ui/src/main/js/utils/__tests__/ShallowRender-test.js PRE-CREATION 
>   ui/src/main/sass/app.scss PRE-CREATION 
>   ui/src/main/sass/components/_base.scss PRE-CREATION 
>   ui/src/main/sass/components/_breadcrumb.scss PRE-CREATION 
>   ui/src/main/sass/components/_home-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss PRE-CREATION 
>   ui/src/main/sass/components/_navigation.scss PRE-CREATION 
>   ui/src/main/sass/components/_tables.scss PRE-CREATION 
>   ui/src/main/sass/modules/_all.scss PRE-CREATION 
>   ui/src/main/sass/modules/_colors.scss PRE-CREATION 
>   ui/src/main/sass/modules/_typography.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62135/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> 
> Running in vagrant (screenshots attached).
> 
> 
> File Attachments
> 
> 
> preview
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/08/011f99fb-a14e-4547-b90d-a5a1909c737c__Screen_Shot_2017-09-08_at_3.58.48_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62135: HomePage implemented in Preact

2017-09-11 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62135/#review185088
---




ui/src/main/js/components/Home.js
Line 4 (original), 3-5 (patched)
<https://reviews.apache.org/r/62135/#comment261308>

Don't have a strong preference on one over another, as long as we keep 
consistent. In general it seems reasonable to use named functions and avoid 
anonymous ones.


- Kai Huang


On Sept. 8, 2017, 11:40 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62135/
> ---
> 
> (Updated Sept. 8, 2017, 11:40 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implementation of the home page in PreactJS. This was delayed heavily by the 
> lack of a shallow renderer 
> (https://facebook.github.io/react/docs/shallow-renderer.html) for testing. It 
> was too painful to test without it (e.g. Links when fully rendered must be 
> rendered within a Router context), so I hand-rolled one based on some 
> community attempts. The key motivation is just to allow us to decouple markup 
> from component logic, and also to avoid having to test child components 
> within component heirarchies. IMO the end result is very clean and easy to 
> read (and write!). 
> 
> If Preact gets a shallow renderer in the future, we can swap the one included 
> here out.
> 
> 
> Diffs
> -
> 
>   src/main/resources/scheduler/assets/images/aurora_logo_white.png 
> PRE-CREATION 
>   ui/.eslintrc 355b6a8a5f5d25dd4c71dc546dc7d4acfffdd506 
>   ui/package.json f712518b27477bccb03f49c86eac3ee5f769fc88 
>   ui/src/main/js/client/scheduler-client.js PRE-CREATION 
>   ui/src/main/js/components/Breadcrumb.js PRE-CREATION 
>   ui/src/main/js/components/Home.js 91d60b387cf3b1fb268e73b7b50922a83898c31f 
>   ui/src/main/js/components/Icon.js PRE-CREATION 
>   ui/src/main/js/components/Loading.js PRE-CREATION 
>   ui/src/main/js/components/Navigation.js PRE-CREATION 
>   ui/src/main/js/components/RoleList.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Home-test.js 
> 2a80958d9303d1d3b9ae8f95013c66cb39f6bac3 
>   ui/src/main/js/index.js 2f7467b41ac3373a90c5453a7534d384a585b464 
>   ui/src/main/js/pages/Home.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Home-test.js PRE-CREATION 
>   ui/src/main/js/utils/ShallowRender.js PRE-CREATION 
>   ui/src/main/js/utils/__tests__/ShallowRender-test.js PRE-CREATION 
>   ui/src/main/sass/app.scss PRE-CREATION 
>   ui/src/main/sass/components/_base.scss PRE-CREATION 
>   ui/src/main/sass/components/_breadcrumb.scss PRE-CREATION 
>   ui/src/main/sass/components/_home-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss PRE-CREATION 
>   ui/src/main/sass/components/_navigation.scss PRE-CREATION 
>   ui/src/main/sass/components/_tables.scss PRE-CREATION 
>   ui/src/main/sass/modules/_all.scss PRE-CREATION 
>   ui/src/main/sass/modules/_colors.scss PRE-CREATION 
>   ui/src/main/sass/modules/_typography.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62135/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> 
> Running in vagrant (screenshots attached).
> 
> 
> File Attachments
> 
> 
> preview
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/08/011f99fb-a14e-4547-b90d-a5a1909c737c__Screen_Shot_2017-09-08_at_3.58.48_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 62135: HomePage implemented in Preact

2017-09-11 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62135/#review185062
---




ui/src/main/js/index.js
Line 5 (original), 5 (patched)
<https://reviews.apache.org/r/62135/#comment261283>

Dumb question: I didn't see SchedulerClient defined in 
client/scheduler-client.js. Is it implicitly provided by the makeClient() 
function?


- Kai Huang


On Sept. 8, 2017, 11:40 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62135/
> ---
> 
> (Updated Sept. 8, 2017, 11:40 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Santhosh Kumar 
> Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implementation of the home page in PreactJS. This was delayed heavily by the 
> lack of a shallow renderer 
> (https://facebook.github.io/react/docs/shallow-renderer.html) for testing. It 
> was too painful to test without it (e.g. Links when fully rendered must be 
> rendered within a Router context), so I hand-rolled one based on some 
> community attempts. The key motivation is just to allow us to decouple markup 
> from component logic, and also to avoid having to test child components 
> within component heirarchies. IMO the end result is very clean and easy to 
> read (and write!). 
> 
> If Preact gets a shallow renderer in the future, we can swap the one included 
> here out.
> 
> 
> Diffs
> -
> 
>   src/main/resources/scheduler/assets/images/aurora_logo_white.png 
> PRE-CREATION 
>   ui/.eslintrc 355b6a8a5f5d25dd4c71dc546dc7d4acfffdd506 
>   ui/package.json f712518b27477bccb03f49c86eac3ee5f769fc88 
>   ui/src/main/js/client/scheduler-client.js PRE-CREATION 
>   ui/src/main/js/components/Breadcrumb.js PRE-CREATION 
>   ui/src/main/js/components/Home.js 91d60b387cf3b1fb268e73b7b50922a83898c31f 
>   ui/src/main/js/components/Icon.js PRE-CREATION 
>   ui/src/main/js/components/Loading.js PRE-CREATION 
>   ui/src/main/js/components/Navigation.js PRE-CREATION 
>   ui/src/main/js/components/RoleList.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Breadcrumb-test.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Home-test.js 
> 2a80958d9303d1d3b9ae8f95013c66cb39f6bac3 
>   ui/src/main/js/index.js 2f7467b41ac3373a90c5453a7534d384a585b464 
>   ui/src/main/js/pages/Home.js PRE-CREATION 
>   ui/src/main/js/pages/__tests__/Home-test.js PRE-CREATION 
>   ui/src/main/js/utils/ShallowRender.js PRE-CREATION 
>   ui/src/main/js/utils/__tests__/ShallowRender-test.js PRE-CREATION 
>   ui/src/main/sass/app.scss PRE-CREATION 
>   ui/src/main/sass/components/_base.scss PRE-CREATION 
>   ui/src/main/sass/components/_breadcrumb.scss PRE-CREATION 
>   ui/src/main/sass/components/_home-page.scss PRE-CREATION 
>   ui/src/main/sass/components/_layout.scss PRE-CREATION 
>   ui/src/main/sass/components/_navigation.scss PRE-CREATION 
>   ui/src/main/sass/components/_tables.scss PRE-CREATION 
>   ui/src/main/sass/modules/_all.scss PRE-CREATION 
>   ui/src/main/sass/modules/_colors.scss PRE-CREATION 
>   ui/src/main/sass/modules/_typography.scss PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62135/diff/1/
> 
> 
> Testing
> ---
> 
> ./gradlew ui:test
> 
> Running in vagrant (screenshots attached).
> 
> 
> File Attachments
> 
> 
> preview
>   
> https://reviews.apache.org/media/uploaded/files/2017/09/08/011f99fb-a14e-4547-b90d-a5a1909c737c__Screen_Shot_2017-09-08_at_3.58.48_PM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 61864: Bootstrap the build pipeline for new Preact UI.

2017-08-24 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61864/#review183766
---




build.gradle
Lines 27 (patched)
<https://reviews.apache.org/r/61864/#comment259821>

Do we need to apply this plugin at global scope?


- Kai Huang


On Aug. 23, 2017, 11:53 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61864/
> ---
> 
> (Updated Aug. 23, 2017, 11:53 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, Santhosh Kumar 
> Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Bootstrap the build pipeline for new Preact UI. Obviously due to the 
> disappointing decision by Facebook to keep their PATENTS clause, we can no 
> longer use React or jest or any of the other Facebook OSS projects. 
> Thankfully Preact has a compatibility layer that mostly hides this (see 
> code), so it's really only jest (which is so much simpler than Karma) that is 
> missed here. 
> 
> Right now we only insert a "Hello, World" message under a new path in the 
> Scheduler. The goal here is to verify that this works in the Apache 
> infrastructure, and that people can download the patch and confirm their 
> local development environment is built correctly by gradle. Will be 
> particularly useful if you don't have any of the modern Node/UI stack 
> installed, because the idea is that Gradle will do it all for you.
> 
> Note: this will add time, perhaps significant time, to the Scheduler build 
> process the first time it sets up the environment. I've made sure to set up 
> input/output directories in Gradle to make sure the UP-TO-DATE mechanism is 
> respected and the work isn't repeated. 
> 
> The following tasks become available:
> 
> ./gradlew ui:install (runs npm install and fetches 3rd party JS dependencies)
> ./gradlew ui:webpack (the main build step - performed by webpack)
> ./gradlew ui:test (runs Karma-Jasmine-Webpack-Chai powered test suite) 
> ./gradlew ui:lint (runs eslint on the source code)
> 
> And all of these are now performed as part of the root ./gradlew build step. 
> This also maintains feature parity with ./gradlew processResources 
> --continuous. 
> 
> This is the first time I've ever really had to do significant changes to our 
> Gradle build, so any feedback on my changes there are particularly 
> appreciated.
> 
> 
> Diffs
> -
> 
>   .gitignore b4e2bcbef6fdb1c049acda7c4fda4bd47988bea2 
>   build.gradle c2c402f3ed0043b1e9befb6f9c4423649ee5c105 
>   settings.gradle b097e2fd958fa0ce6076fc104eb3890c4029295d 
>   src/main/java/org/apache/aurora/scheduler/http/JettyServerModule.java 
> f7e72e77e50680d937d727cb4c0eb8940aabf03b 
>   src/main/resources/scheduler/assets/scheduler/new-index.html PRE-CREATION 
>   ui/.babelrc PRE-CREATION 
>   ui/.eslintrc PRE-CREATION 
>   ui/karma.conf.js PRE-CREATION 
>   ui/package.json PRE-CREATION 
>   ui/src/main/js/components/Home.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Home-test.js PRE-CREATION 
>   ui/src/main/js/index.js PRE-CREATION 
>   ui/webpack.config.js PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/61864/diff/2/
> 
> 
> Testing
> ---
> 
> The Scheduler is running in my Vagrant image with the Hello World message 
> served under /beta.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 61864: Bootstrap the build pipeline for new Preact UI.

2017-08-24 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61864/#review183761
---


Ship it!




./gradlew build works for me in vagrant box now.

- Kai Huang


On Aug. 23, 2017, 11:53 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61864/
> ---
> 
> (Updated Aug. 23, 2017, 11:53 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, Santhosh Kumar 
> Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Bootstrap the build pipeline for new Preact UI. Obviously due to the 
> disappointing decision by Facebook to keep their PATENTS clause, we can no 
> longer use React or jest or any of the other Facebook OSS projects. 
> Thankfully Preact has a compatibility layer that mostly hides this (see 
> code), so it's really only jest (which is so much simpler than Karma) that is 
> missed here. 
> 
> Right now we only insert a "Hello, World" message under a new path in the 
> Scheduler. The goal here is to verify that this works in the Apache 
> infrastructure, and that people can download the patch and confirm their 
> local development environment is built correctly by gradle. Will be 
> particularly useful if you don't have any of the modern Node/UI stack 
> installed, because the idea is that Gradle will do it all for you.
> 
> Note: this will add time, perhaps significant time, to the Scheduler build 
> process the first time it sets up the environment. I've made sure to set up 
> input/output directories in Gradle to make sure the UP-TO-DATE mechanism is 
> respected and the work isn't repeated. 
> 
> The following tasks become available:
> 
> ./gradlew ui:install (runs npm install and fetches 3rd party JS dependencies)
> ./gradlew ui:webpack (the main build step - performed by webpack)
> ./gradlew ui:test (runs Karma-Jasmine-Webpack-Chai powered test suite) 
> ./gradlew ui:lint (runs eslint on the source code)
> 
> And all of these are now performed as part of the root ./gradlew build step. 
> This also maintains feature parity with ./gradlew processResources 
> --continuous. 
> 
> This is the first time I've ever really had to do significant changes to our 
> Gradle build, so any feedback on my changes there are particularly 
> appreciated.
> 
> 
> Diffs
> -
> 
>   .gitignore b4e2bcbef6fdb1c049acda7c4fda4bd47988bea2 
>   build.gradle c2c402f3ed0043b1e9befb6f9c4423649ee5c105 
>   settings.gradle b097e2fd958fa0ce6076fc104eb3890c4029295d 
>   src/main/java/org/apache/aurora/scheduler/http/JettyServerModule.java 
> f7e72e77e50680d937d727cb4c0eb8940aabf03b 
>   src/main/resources/scheduler/assets/scheduler/new-index.html PRE-CREATION 
>   ui/.babelrc PRE-CREATION 
>   ui/.eslintrc PRE-CREATION 
>   ui/karma.conf.js PRE-CREATION 
>   ui/package.json PRE-CREATION 
>   ui/src/main/js/components/Home.js PRE-CREATION 
>   ui/src/main/js/components/__tests__/Home-test.js PRE-CREATION 
>   ui/src/main/js/index.js PRE-CREATION 
>   ui/webpack.config.js PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/61864/diff/2/
> 
> 
> Testing
> ---
> 
> The Scheduler is running in my Vagrant image with the Hello World message 
> served under /beta.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 60714: aurora job restart request should be idempotent and retryable

2017-07-16 Thread Kai Huang
urora.local:8081/api, not retrying
```


Thanks,

Kai Huang



Re: Review Request 60714: aurora job restart request should be idempotent and retryable

2017-07-07 Thread Kai Huang
ARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Detected RUNNING instance 0
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Instance 0 has been up and healthy for at least 30 seconds
 INFO] All instances were restarted successfully
 Job devcluster/vagrant/test/hello restarted successfully
```


Thanks,

Kai Huang



Re: Review Request 60714: aurora job restart request should be idempotent and retryable

2017-07-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60714/
---

(Updated July 7, 2017, 8:20 p.m.)


Review request for Aurora, David McLaughlin, Mehrdad Nurolahzade, Santhosh 
Kumar Shanmugham, and Zameer Manji.


Changes
---

Removed unused property in restarter.


Bugs: AURORA-1940
https://issues.apache.org/jira/browse/AURORA-1940


Repository: aurora


Description
---

aurora job restart request should be idempotent and retryable.

There was a recent change to the Aurora client to provide "at most once" 
instead of "at least once" retries for non-idempotent operations. See: 
https://github.com/apache/aurora/commit/f1e25375def5a047da97d8bdfb47a3a9101568f6

Technically, `aurora job restart` is a non-idempotent operation, thus it was 
not retried. However, when a transport exception occurs, the operator has to 
babysit simple operations like aurora job restart if it were not retried. 
Compared to the requests that were causing problems (admin tasks, job creating, 
updates, etc.), restarts in general should be retried rather than erring on the 
side of caution.

Job restart can be divided into three steps:
- 1. get instance status (getTasksWithoutConfigs)
- 2. restart shards (restartShards)
- 3. watch instance until healthy (getTasksWithoutConfigs)

TTransport exception can be thrown at each of these step, ideally we should 
make __ALL__ of the steps above __idempotent__ and retryable. 

However, given that the `watch` logic is also used in --wait options of job 
create/kill command), making this step retryable will have an impact job 
create/kill commands as well. IMO, it's probably OK for us to retry the watch 
during job create/kill since the watch step is readonly.

In this CR, I will make the first __TWO__ steps retryable since they are 
self-contained in job restart command.
If people are OK with this strategy, I'll make the `watch` step retryable as 
well.


Diffs (updated)
-

  src/main/python/apache/aurora/client/api/restarter.py 
6600c6b608ee70a02e11ca8550a06b7fb76fd863 
  src/test/python/apache/aurora/api_util.py 
bd6e3a6abb5a2eec621121fc1b4d547c54a9d19d 
  src/test/python/apache/aurora/client/api/test_restarter.py 
a81003e79e090bbfa151431366f5394f405d99eb 
  src/test/python/apache/aurora/client/cli/test_restart.py 
cb4adc55caec354d2cdf6a92ff2dd8a3b4a9eca8 


Diff: https://reviews.apache.org/r/60714/diff/2/

Changes: https://reviews.apache.org/r/60714/diff/1-2/


Testing
---

./build-support/jenkins/build.sh
 
 To test the retry logic, random TTranport exceptions were thrown at the client 
side(for all api calls to the scheduler proxy), aurora job restart command was 
issued against scheduler, and the output was like:
```
 vagrant@aurora:~$ aurora job restart devcluster/vagrant/test/hello
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Performing rolling restart of job devcluster/vagrant/test/hello 
(instances: [0])
 INFO] Restarting instances: [0]
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Watching instances: [0]
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Detected RUNNING instance 0
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Instance 0 has been up and healthy for at least 30 seconds
 INFO] All instances were restarted successfully
 Job devcluster/vagrant/test/hello restarted successfully
```


Thanks,

Kai Huang



Review Request 60714: aurora job restart request should be idempotent and retryable

2017-07-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60714/
---

Review request for Aurora, David McLaughlin, Mehrdad Nurolahzade, Santhosh 
Kumar Shanmugham, and Zameer Manji.


Bugs: AURORA-1940
https://issues.apache.org/jira/browse/AURORA-1940


Repository: aurora


Description
---

aurora job restart request should be idempotent and retryable.

There was a recent change to the Aurora client to provide "at most once" 
instead of "at least once" retries for non-idempotent operations. See: 
https://github.com/apache/aurora/commit/f1e25375def5a047da97d8bdfb47a3a9101568f6

Technically, `aurora job restart` is a non-idempotent operation, thus it was 
not retried. However, when a transport exception occurs, the operator has to 
babysit simple operations like aurora job restart if it were not retried. 
Compared to the requests that were causing problems (admin tasks, job creating, 
updates, etc.), restarts in general should be retried rather than erring on the 
side of caution.

Job restart can be divided into three steps:
- 1. get instance status (getTasksWithoutConfigs)
- 2. restart shards (restartShards)
- 3. watch instance until healthy (getTasksWithoutConfigs)

TTransport exception can be thrown at each of these step, ideally we should 
make __ALL__ of the steps above __idempotent__ and retryable. 

However, given that the `watch` logic is also used in --wait options of job 
create/kill command), making this step retryable will have an impact job 
create/kill commands as well. IMO, it's probably OK for us to retry the watch 
during job create/kill since the watch step is readonly.

In this CR, I will make the first __TWO__ steps retryable since they are 
self-contained in job restart command.
If people are OK with this strategy, I'll make the `watch` step retryable as 
well.


Diffs
-

  src/main/python/apache/aurora/client/api/restarter.py 
6600c6b608ee70a02e11ca8550a06b7fb76fd863 
  src/test/python/apache/aurora/api_util.py 
bd6e3a6abb5a2eec621121fc1b4d547c54a9d19d 
  src/test/python/apache/aurora/client/api/test_restarter.py 
a81003e79e090bbfa151431366f5394f405d99eb 
  src/test/python/apache/aurora/client/cli/test_restart.py 
cb4adc55caec354d2cdf6a92ff2dd8a3b4a9eca8 


Diff: https://reviews.apache.org/r/60714/diff/1/


Testing
---

./build-support/jenkins/build.sh
 
 To test the retry logic, random TTranport exceptions were thrown at the client 
side(for all api calls to the scheduler proxy), aurora job restart command was 
issued against scheduler, and the output was like:
```
 vagrant@aurora:~$ aurora job restart devcluster/vagrant/test/hello
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Performing rolling restart of job devcluster/vagrant/test/hello 
(instances: [0])
 INFO] Restarting instances: [0]
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Watching instances: [0]
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Detected RUNNING instance 0
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 WARN] Transport error communicating with scheduler: Timed out talking to 
http://aurora.local:8081/api, retrying...
 INFO] Instance 0 has been up and healthy for at least 30 seconds
 INFO] All instances were restarted successfully
 Job devcluster/vagrant/test/hello restarted successfully
```


Thanks,

Kai Huang



Review Request 60437: Add timing metrics in MesosCallbackHandler for backward compatibility.

2017-06-26 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60437/
---

Review request for Aurora, David McLaughlin, Santhosh Kumar, and Zameer Manji.


Repository: aurora


Description
---

Add timing metrics in MesosCallbackHandler for backward compatibility.

We have an internal performance dashboard that reads the following timing 
metrics:
```
scheduler_resource_offers,
scheduler_framework_message,
scheduler_status_update
```
Currently, these timing metrics merely reside in MesosSchedulerImpl class. If 
we were to enable the Mesos HTTP API in aurora(switch from 
__MesosSchedulerImpl__ to __VersionedMesosSchedulerImpl__), we will lose the 
timing metrics in MesosSchedulerImpl. This CR migrates the missing timing 
metrics from MesosSchedulerImpl to MesosCallbackHandler, so that the above 
timing metrics will be visible after enabling Mesos HTTP API.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
772a04cbad99eb088105822f41b0758f47f7915a 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImpl.java 
c3a34d2ae51a02247a7e6176dec0f04bddc0270d 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
b955d61bbed2a6f4cb7f2581ebec398d637f0895 


Diff: https://reviews.apache.org/r/60437/diff/1/


Testing
---

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Review Request 60350: Add missing stats in MesosCallbackHandler

2017-06-21 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60350/
---

Review request for Aurora, David McLaughlin, Santhosh Kumar, and Zameer Manji.


Bugs: AURORA-1937
https://issues.apache.org/jira/browse/AURORA-1937


Repository: aurora


Description
---

*Add missing stats in MesosCallbackHandler.

We are thinking about switching to a new driver implementation around V1Mesos 
(https://reviews.apache.org/r/57061). The V1 Mesos code requires a Scheduler 
callback with a different API. To maximize code reuse, event handling logic was 
extracted into a MesosCallbackHandler class. However, two metrics for handling 
__task status update__, and for handling __framework message__ are missing in 
this class.

This CR adds the two missing stats in the MesosCallbackHandler class.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
772a04cbad99eb088105822f41b0758f47f7915a 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
b955d61bbed2a6f4cb7f2581ebec398d637f0895 


Diff: https://reviews.apache.org/r/60350/diff/1/


Testing
---

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 59940: Add a whitelist for TaskStateChange events in Webhook.

2017-06-13 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/
---

(Updated June 13, 2017, 11:34 p.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan 
Erb, and Zameer Manji.


Changes
---

Updated the RELEASE-NOTES for this feature.


Bugs: AURORA-1934
https://issues.apache.org/jira/browse/AURORA-1934


Repository: aurora


Description
---

Add a whitelist for TaskStateChange events in Webhook.

Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
send events to configured endpoint. This will flood the endpoint with a lot of 
noise if we only care about certain types of TaskStateChange event(e.g. task 
state change from RUNNING -> LOST). This CR allows user to provide a whitelist 
of TaskStateChange event types in their webhook configuration file, so that the 
webhook will only post these events to the configured endpoint.


Diffs (updated)
-

  RELEASE-NOTES.md c73643d669363a56e530c2d8c79d6ea06cc9415b 
  docs/features/webhooks.md a060975488a50ea34c79aebd66d056df8a3b437e 
  src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
3868779986285ac302d028f8713f683192951b83 
  src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
  src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 


Diff: https://reviews.apache.org/r/59940/diff/6/

Changes: https://reviews.apache.org/r/59940/diff/5-6/


Testing
---

./build-support/jenkins/build.sh

This change is backward compatible. By default, all TaskStateChange statuses 
will be matched and posted to the configured endpoint.

The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["*"]
}
```

If they are only interested in TaskStateChange statuses: LOST, FAILED, they can 
provide them in the whitelist:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["LOST", "FAILED"]
}
```


Thanks,

Kai Huang



Re: Review Request 59940: Add a whitelist for TaskStateChange events in Webhook.

2017-06-13 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/
---

(Updated June 13, 2017, 9:40 p.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan 
Erb, and Zameer Manji.


Changes
---

Fixed variable accessibility and naming issues.


Bugs: AURORA-1934
https://issues.apache.org/jira/browse/AURORA-1934


Repository: aurora


Description
---

Add a whitelist for TaskStateChange events in Webhook.

Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
send events to configured endpoint. This will flood the endpoint with a lot of 
noise if we only care about certain types of TaskStateChange event(e.g. task 
state change from RUNNING -> LOST). This CR allows user to provide a whitelist 
of TaskStateChange event types in their webhook configuration file, so that the 
webhook will only post these events to the configured endpoint.


Diffs (updated)
-

  docs/features/webhooks.md a060975488a50ea34c79aebd66d056df8a3b437e 
  src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
3868779986285ac302d028f8713f683192951b83 
  src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
  src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 


Diff: https://reviews.apache.org/r/59940/diff/4/

Changes: https://reviews.apache.org/r/59940/diff/3-4/


Testing
---

./build-support/jenkins/build.sh

This change is backward compatible. By default, all TaskStateChange statuses 
will be matched and posted to the configured endpoint.

The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["*"]
}
```

If they are only interested in TaskStateChange statuses: LOST, FAILED, they can 
provide them in the whitelist:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["LOST", "FAILED"]
}
```


Thanks,

Kai Huang



Re: Review Request 59940: Add a whitelist for TaskStateChange events in Webhook.

2017-06-13 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/
---

(Updated June 13, 2017, 6:21 p.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan 
Erb, and Zameer Manji.


Changes
---

Update documentation for webhook feature.


Bugs: AURORA-1934
https://issues.apache.org/jira/browse/AURORA-1934


Repository: aurora


Description
---

Add a whitelist for TaskStateChange events in Webhook.

Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
send events to configured endpoint. This will flood the endpoint with a lot of 
noise if we only care about certain types of TaskStateChange event(e.g. task 
state change from RUNNING -> LOST). This CR allows user to provide a whitelist 
of TaskStateChange event types in their webhook configuration file, so that the 
webhook will only post these events to the configured endpoint.


Diffs (updated)
-

  docs/features/webhooks.md a060975488a50ea34c79aebd66d056df8a3b437e 
  src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
3868779986285ac302d028f8713f683192951b83 
  src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
  src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 


Diff: https://reviews.apache.org/r/59940/diff/3/

Changes: https://reviews.apache.org/r/59940/diff/2-3/


Testing
---

./build-support/jenkins/build.sh

This change is backward compatible. By default, all TaskStateChange statuses 
will be matched and posted to the configured endpoint.

The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["*"]
}
```

If they are only interested in TaskStateChange statuses: LOST, FAILED, they can 
provide them in the whitelist:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["LOST", "FAILED"]
}
```


Thanks,

Kai Huang



Re: Review Request 59940: Add a whitelist for TaskStateChange events in Webhook.

2017-06-13 Thread Kai Huang


> On June 13, 2017, 2:24 p.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/events/Webhook.java
> > Lines 54 (patched)
> > <https://reviews.apache.org/r/59940/diff/2/?file=1749519#file1749519line54>
> >
> > This seems clearer:
> > 
> >   webhookInfo.getWhitelistedStatuses().isPresent() && 
> > webhookInfo.getWhitelistedStatuses().get().contains(status)

The negation here corresponds to case a): the whitelist is absent (not provided 
by user or contains wildcard character)


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/#review177740
-----------


On June 12, 2017, 10:51 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59940/
> ---
> 
> (Updated June 12, 2017, 10:51 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, 
> Stephan Erb, and Zameer Manji.
> 
> 
> Bugs: AURORA-1934
> https://issues.apache.org/jira/browse/AURORA-1934
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add a whitelist for TaskStateChange events in Webhook.
> 
> Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
> send events to configured endpoint. This will flood the endpoint with a lot 
> of noise if we only care about certain types of TaskStateChange event(e.g. 
> task state change from RUNNING -> LOST). This CR allows user to provide a 
> whitelist of TaskStateChange event types in their webhook configuration file, 
> so that the webhook will only post these events to the configured endpoint.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
> 3868779986285ac302d028f8713f683192951b83 
>   src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
> 37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
>   src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
> e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 
> 
> 
> Diff: https://reviews.apache.org/r/59940/diff/2/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> This change is backward compatible. By default, all TaskStateChange statuses 
> will be matched and posted to the configured endpoint.
> 
> The user can also match all TaskStateChange statuses using a wildcard 
> character "*" in webhook.json like below:
> ```
> {
>   "headers": {
> "Content-Type": "application/vnd.kafka.json.v1+json",
> "Producer-Type": "reliable"
>   },
>   "targetURL": "http://localhost:5000/;,
>   "timeoutMsec": 50,
>   "statuses": ["*"]
> }
> ```
> 
> If they are only interested in TaskStateChange statuses: LOST, FAILED, they 
> can provide them in the whitelist:
> ```
> {
>   "headers": {
> "Content-Type": "application/vnd.kafka.json.v1+json",
> "Producer-Type": "reliable"
>   },
>   "targetURL": "http://localhost:5000/;,
>   "timeoutMsec": 50,
>   "statuses": ["LOST", "FAILED"]
> }
> ```
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59940: Add a whitelist for TaskStateChange events in Webhook.

2017-06-13 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/#review177756
---




src/main/java/org/apache/aurora/scheduler/events/Webhook.java
Lines 54 (patched)
<https://reviews.apache.org/r/59940/#comment251468>

So a task status is whitelisted if:

a) the whitelist is absent(not provided by user or contains wildcard 
character)

b) the task status is explicitly specified in the whitelist.


- Kai Huang


On June 12, 2017, 10:51 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59940/
> ---
> 
> (Updated June 12, 2017, 10:51 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, 
> Stephan Erb, and Zameer Manji.
> 
> 
> Bugs: AURORA-1934
> https://issues.apache.org/jira/browse/AURORA-1934
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add a whitelist for TaskStateChange events in Webhook.
> 
> Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
> send events to configured endpoint. This will flood the endpoint with a lot 
> of noise if we only care about certain types of TaskStateChange event(e.g. 
> task state change from RUNNING -> LOST). This CR allows user to provide a 
> whitelist of TaskStateChange event types in their webhook configuration file, 
> so that the webhook will only post these events to the configured endpoint.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
> 3868779986285ac302d028f8713f683192951b83 
>   src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
> 37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
>   src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
> e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 
> 
> 
> Diff: https://reviews.apache.org/r/59940/diff/2/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> This change is backward compatible. By default, all TaskStateChange statuses 
> will be matched and posted to the configured endpoint.
> 
> The user can also match all TaskStateChange statuses using a wildcard 
> character "*" in webhook.json like below:
> ```
> {
>   "headers": {
> "Content-Type": "application/vnd.kafka.json.v1+json",
> "Producer-Type": "reliable"
>   },
>   "targetURL": "http://localhost:5000/;,
>   "timeoutMsec": 50,
>   "statuses": ["*"]
> }
> ```
> 
> If they are only interested in TaskStateChange statuses: LOST, FAILED, they 
> can provide them in the whitelist:
> ```
> {
>   "headers": {
> "Content-Type": "application/vnd.kafka.json.v1+json",
> "Producer-Type": "reliable"
>   },
>   "targetURL": "http://localhost:5000/;,
>   "timeoutMsec": 50,
>   "statuses": ["LOST", "FAILED"]
> }
> ```
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59940: Add a whitelist for TaskStateChange events in Webhook.

2017-06-12 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/
---

(Updated June 12, 2017, 10:51 p.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan 
Erb, and Zameer Manji.


Changes
---

Add a WebhookInfoBuilder to obtain the desired testability after adding the 
whitelist field into WebhookInfo.


Bugs: AURORA-1934
https://issues.apache.org/jira/browse/AURORA-1934


Repository: aurora


Description
---

Add a whitelist for TaskStateChange events in Webhook.

Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
send events to configured endpoint. This will flood the endpoint with a lot of 
noise if we only care about certain types of TaskStateChange event(e.g. task 
state change from RUNNING -> LOST). This CR allows user to provide a whitelist 
of TaskStateChange event types in their webhook configuration file, so that the 
webhook will only post these events to the configured endpoint.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
3868779986285ac302d028f8713f683192951b83 
  src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
  src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 


Diff: https://reviews.apache.org/r/59940/diff/2/

Changes: https://reviews.apache.org/r/59940/diff/1-2/


Testing
---

./build-support/jenkins/build.sh

This change is backward compatible. By default, all TaskStateChange statuses 
will be matched and posted to the configured endpoint.

The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["*"]
}
```

If they are only interested in TaskStateChange statuses: LOST, FAILED, they can 
provide them in the whitelist:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["LOST", "FAILED"]
}
```


Thanks,

Kai Huang



Re: Review Request 59940: Add a whitelist for TaskStateChange events in Webhook.

2017-06-08 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/
---

(Updated June 9, 2017, 5:13 a.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan 
Erb, and Zameer Manji.


Changes
---

Update the summary, no code change.


Summary (updated)
-

Add a whitelist for TaskStateChange events in Webhook.


Bugs: AURORA-1934
https://issues.apache.org/jira/browse/AURORA-1934


Repository: aurora


Description
---

Add a whitelist for TaskStateChange events in Webhook.

Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
send events to configured endpoint. This will flood the endpoint with a lot of 
noise if we only care about certain types of TaskStateChange event(e.g. task 
state change from RUNNING -> LOST). This CR allows user to provide a whitelist 
of TaskStateChange event types in their webhook configuration file, so that the 
webhook will only post these events to the configured endpoint.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
3868779986285ac302d028f8713f683192951b83 
  src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
  src/main/java/org/apache/aurora/scheduler/events/WebhookModule.java 
1f10af71830386652d21961b733bd0927c5436a1 
  src/main/resources/org/apache/aurora/scheduler/webhook_whitelist.json 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 


Diff: https://reviews.apache.org/r/59940/diff/1/


Testing
---

./build-support/jenkins/build.sh

This change is backward compatible. By default, all TaskStateChange statuses 
will be matched and posted to the configured endpoint.

The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["*"]
}
```

If they are only interested in TaskStateChange statuses: LOST, FAILED, they can 
provide them in the whitelist:
The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
    "Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["LOST", "FAILED"]
}
```


Thanks,

Kai Huang



Review Request 59940: whitelist event stream

2017-06-08 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59940/
---

Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan 
Erb, and Zameer Manji.


Repository: aurora


Description
---

Add a whitelist for TaskStateChange events in Webhook.

Aurora Scheduler has a webhook module that watches all TaskStateChanges and 
send events to configured endpoint. This will flood the endpoint with a lot of 
noise if we only care about certain types of TaskStateChange event(e.g. task 
state change from RUNNING -> LOST). This CR allows user to provide a whitelist 
of TaskStateChange event types in their webhook configuration file, so that the 
webhook will only post these events to the configured endpoint.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
3868779986285ac302d028f8713f683192951b83 
  src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
37c0d7944c7a37e1a5c65dafb290fcdd28765db3 
  src/main/java/org/apache/aurora/scheduler/events/WebhookModule.java 
1f10af71830386652d21961b733bd0927c5436a1 
  src/main/resources/org/apache/aurora/scheduler/webhook_whitelist.json 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 


Diff: https://reviews.apache.org/r/59940/diff/1/


Testing
---

./build-support/jenkins/build.sh

This change is backward compatible. By default, all TaskStateChange statuses 
will be matched and posted to the configured endpoint.

The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
"Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["*"]
}
```

If they are only interested in TaskStateChange statuses: LOST, FAILED, they can 
provide them in the whitelist:
The user can also match all TaskStateChange statuses using a wildcard character 
"*" in webhook.json like below:
```
{
  "headers": {
"Content-Type": "application/vnd.kafka.json.v1+json",
    "Producer-Type": "reliable"
  },
  "targetURL": "http://localhost:5000/;,
  "timeoutMsec": 50,
  "statuses": ["LOST", "FAILED"]
}
```


Thanks,

Kai Huang



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-02 Thread Kai Huang


> On June 2, 2017, 12:35 a.m., David McLaughlin wrote:
> > Ship It!

It looks like the deleteTasks() method could be even simpler: perhaps we don't 
need to move taskStore.deleteTasks out of the deleteEvent factory method? The 
only trick here is that we generate a single event for all task deletions.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176716
---


On June 2, 2017, 12:01 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 2, 2017, 12:01 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar, Stephan Erb, and 
> Zameer Manji.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/5/
> 
> 
> Testing
> ---
> 
> __unit_tests:__
> 
> ./build-support/jenkins/build.sh
> 
> No unit tests were created for this patch since it does not add new 
> functionalities or alter the interface, but improves the efficiency of the 
> existing code.
> 
> __e2e tests:__
> 
> Attached was a screenshot of the task history pruning benchmark obtained from 
> a scale test in Twitter's test cluster.
> 
> - Before applying this patch, the task history pruning takes ~30 minutes on 
> 130K tasks.
> 
> - After applying the patch, the pruning takes ~1 minute.
> 
> 
> File Attachments
> --------
> 
> task_history_pruning_benchmark.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/06/01/74eb5104-d338-4530-abd2-b82fbdc6bf84__task_history_pruning_benchmark.png
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-01 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/
---

(Updated June 2, 2017, 12:01 a.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar, Stephan Erb, and 
Zameer Manji.


Changes
---

fixed style issue


Bugs: AURORA-1929
https://issues.apache.org/jira/browse/AURORA-1929


Repository: aurora


Description
---

Improve task history pruning by batch deleting tasks.

The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
cluster has a large number of inactive tasks.

This CR batches all removeTasks operations and execute them all at once to 
avoid additional cost of coalescing. The fix will also benefit implicit task 
history pruning since it has similar underlying implementation. See 
https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
details.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
73878210f9028901fda3b08e66c6a63c24260d35 


Diff: https://reviews.apache.org/r/59699/diff/5/

Changes: https://reviews.apache.org/r/59699/diff/4-5/


Testing
---

__unit_tests:__

./build-support/jenkins/build.sh

No unit tests were created for this patch since it does not add new 
functionalities or alter the interface, but improves the efficiency of the 
existing code.

__e2e tests:__

Attached was a screenshot of the task history pruning benchmark obtained from a 
scale test in Twitter's test cluster.

- Before applying this patch, the task history pruning takes ~30 minutes on 
130K tasks.

- After applying the patch, the pruning takes ~1 minute.


File Attachments


task_history_pruning_benchmark.png
  
https://reviews.apache.org/media/uploaded/files/2017/06/01/74eb5104-d338-4530-abd2-b82fbdc6bf84__task_history_pruning_benchmark.png


Thanks,

Kai Huang



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-01 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/
---

(Updated June 1, 2017, 11:48 p.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar, Stephan Erb, and 
Zameer Manji.


Changes
---

Refactored the createDeleteEvent code to ensure that the tasks deleted from 
TaskStore are consistent with the tasks in PubsubEvent.


Bugs: AURORA-1929
https://issues.apache.org/jira/browse/AURORA-1929


Repository: aurora


Description
---

Improve task history pruning by batch deleting tasks.

The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
cluster has a large number of inactive tasks.

This CR batches all removeTasks operations and execute them all at once to 
avoid additional cost of coalescing. The fix will also benefit implicit task 
history pruning since it has similar underlying implementation. See 
https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
details.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
73878210f9028901fda3b08e66c6a63c24260d35 


Diff: https://reviews.apache.org/r/59699/diff/4/

Changes: https://reviews.apache.org/r/59699/diff/3-4/


Testing
---

__unit_tests:__

./build-support/jenkins/build.sh

No unit tests were created for this patch since it does not add new 
functionalities or alter the interface, but improves the efficiency of the 
existing code.

__e2e tests:__

Attached was a screenshot of the task history pruning benchmark obtained from a 
scale test in Twitter's test cluster.

- Before applying this patch, the task history pruning takes ~30 minutes on 
130K tasks.

- After applying the patch, the pruning takes ~1 minute.


File Attachments


task_history_pruning_benchmark.png
  
https://reviews.apache.org/media/uploaded/files/2017/06/01/74eb5104-d338-4530-abd2-b82fbdc6bf84__task_history_pruning_benchmark.png


Thanks,

Kai Huang



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-01 Thread Kai Huang


> On June 1, 2017, 10:46 p.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java
> > Line 376 (original), 377 (patched)
> > <https://reviews.apache.org/r/59699/diff/3/?file=1739138#file1739138line379>
> >
> > I wonder if it's using the taskIds from the event? Since they have been 
> > selected from the database and might not match the taskIds we passed in. 
> > This logic was there in the previous patch too. This would ensure 
> > consistency between taskStore.deleteTasks and the taskIds we broadcast as 
> > deleted.

Previously taskStore.deleteTasks() is not using the taskIds from the event. But 
it doesn't hurt to ensure the consistency here.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176694
-------


On June 1, 2017, 10:41 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 1, 2017, 10:41 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Santhosh Kumar.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/3/
> 
> 
> Testing
> ---
> 
> __unit_tests:__
> 
> ./build-support/jenkins/build.sh
> 
> No unit tests were created for this patch since it does not add new 
> functionalities or alter the interface, but improves the efficiency of the 
> existing code.
> 
> __e2e tests:__
> 
> Attached was a screenshot of the task history pruning benchmark obtained from 
> a scale test in Twitter's test cluster.
> 
> - Before applying this patch, the task history pruning takes ~30 minutes on 
> 130K tasks.
> 
> - After applying the patch, the pruning takes ~1 minute.
> 
> 
> File Attachments
> 
> 
> task_history_pruning_benchmark.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/06/01/74eb5104-d338-4530-abd2-b82fbdc6bf84__task_history_pruning_benchmark.png
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-01 Thread Kai Huang


> On June 1, 2017, 9:22 p.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java
> > Lines 389-391 (original), 374-376 (patched)
> > <https://reviews.apache.org/r/59699/diff/1-2/?file=1736080#file1736080line391>
> >
> > We probably don't even need the separate events. We could just have:
> > 
> > eventSink.post(createDeleteEvent(taskStore, taskIds));
> 
> Kai Huang wrote:
> So this will change the semantics that: The Delete Event is published 
> after we delete task from TaskStore?
> 
> David McLaughlin wrote:
> I don't think it changes the semantics at all? In both cases, the events 
> are published after the batch delete. All I'm suggesting is we send a single 
> event for the batch rather than one event per task in the batch.

addressed.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176670
-------


On June 1, 2017, 10:41 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 1, 2017, 10:41 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Santhosh Kumar.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/3/
> 
> 
> Testing
> ---
> 
> __unit_tests:__
> 
> ./build-support/jenkins/build.sh
> 
> No unit tests were created for this patch since it does not add new 
> functionalities or alter the interface, but improves the efficiency of the 
> existing code.
> 
> __e2e tests:__
> 
> Attached was a screenshot of the task history pruning benchmark obtained from 
> a scale test in Twitter's test cluster.
> 
> - Before applying this patch, the task history pruning takes ~30 minutes on 
> 130K tasks.
> 
> - After applying the patch, the pruning takes ~1 minute.
> 
> 
> File Attachments
> 
> 
> task_history_pruning_benchmark.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/06/01/74eb5104-d338-4530-abd2-b82fbdc6bf84__task_history_pruning_benchmark.png
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-01 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/
---

(Updated June 1, 2017, 10:41 p.m.)


Review request for Aurora, David McLaughlin and Santhosh Kumar.


Changes
---

- 1. Refactored the deletTasks code to post a single event for all task 
deletions.

- 2. Upload benchmark result for task history pruning.


Bugs: AURORA-1929
https://issues.apache.org/jira/browse/AURORA-1929


Repository: aurora


Description
---

Improve task history pruning by batch deleting tasks.

The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
cluster has a large number of inactive tasks.

This CR batches all removeTasks operations and execute them all at once to 
avoid additional cost of coalescing. The fix will also benefit implicit task 
history pruning since it has similar underlying implementation. See 
https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
details.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
73878210f9028901fda3b08e66c6a63c24260d35 


Diff: https://reviews.apache.org/r/59699/diff/3/

Changes: https://reviews.apache.org/r/59699/diff/2-3/


Testing (updated)
---

__unit_tests:__

./build-support/jenkins/build.sh

No unit tests were created for this patch since it does not add new 
functionalities or alter the interface, but improves the efficiency of the 
existing code.

__e2e tests:__

Attached was a screenshot of the task history pruning benchmark obtained from a 
scale test in Twitter's test cluster.

- Before applying this patch, the task history pruning takes ~30 minutes on 
130K tasks.

- After applying the patch, the pruning takes ~1 minute.


File Attachments (updated)


task_history_pruning_benchmark.png
  
https://reviews.apache.org/media/uploaded/files/2017/06/01/74eb5104-d338-4530-abd2-b82fbdc6bf84__task_history_pruning_benchmark.png


Thanks,

Kai Huang



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-01 Thread Kai Huang


> On June 1, 2017, 9:22 p.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java
> > Lines 389-391 (original), 374-376 (patched)
> > <https://reviews.apache.org/r/59699/diff/1-2/?file=1736080#file1736080line391>
> >
> > We probably don't even need the separate events. We could just have:
> > 
> > eventSink.post(createDeleteEvent(taskStore, taskIds));

So this will change the semantics that: The Delete Event is published after we 
delete task from TaskStore?


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176670
-----------


On June 1, 2017, 9:13 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 1, 2017, 9:13 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Santhosh Kumar.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/2/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-06-01 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/
---

(Updated June 1, 2017, 9:13 p.m.)


Review request for Aurora, David McLaughlin and Santhosh Kumar.


Changes
---

- 1. Do a batch delete of all tasks to be pruned.

- 2. Rename the deleteTasks method to createDeleteEvent.


Bugs: AURORA-1929
https://issues.apache.org/jira/browse/AURORA-1929


Repository: aurora


Description
---

Improve task history pruning by batch deleting tasks.

The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
cluster has a large number of inactive tasks.

This CR batches all removeTasks operations and execute them all at once to 
avoid additional cost of coalescing. The fix will also benefit implicit task 
history pruning since it has similar underlying implementation. See 
https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
details.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
73878210f9028901fda3b08e66c6a63c24260d35 


Diff: https://reviews.apache.org/r/59699/diff/2/

Changes: https://reviews.apache.org/r/59699/diff/1-2/


Testing
---

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-05-31 Thread Kai Huang


> On June 1, 2017, 12:53 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java
> > Line 388 (original), 411 (patched)
> > <https://reviews.apache.org/r/59699/diff/1/?file=1736080#file1736080line411>
> >
> > I'm not sure this is the right approach. I don't think we want to skip 
> > writing transactions to the replicated log. 
> > 
> > All of the APIs down to the storage layer already support batching so 
> > seems like we just want to move the delete command outside of the code that 
> > generates pubsub events. E.g. call taskStore.deleteTasks(taskIds) here when 
> > we receive the batch from the prune tasks endpoint:
> > 
> > 
> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java#L371
> > 
> > And then remove the taskStore.deleteTasks from the PubsubEvent factory 
> > method (have no idea why it's in there).
> 
> Kai Huang wrote:
> Moving the taskStore.deleteTasks out will break some other operations:
> e.g.
> When we kill a pending task, we will also delete the task (generates a 
> PubsubEvent and calls taskStore.deleteTasks). 
> https://github.com/apache/aurora/blob/master/src/test/java/org/apache/aurora/scheduler/state/StateManagerImplTest.java#L215
>     
> 
> We need to add taskStore.deleteTasks for all other use cases that 
> indirectly calls taskStore.deleteTasks.
> 
> Kai Huang wrote:
> Instead of rewriting the code, I think I just need to refactor those 
> tests mentioned above. Currently they were all written in a way that assumes 
> the taskStore.deleteTasks was included in the PubSubEvent factory.
> 
> David McLaughlin wrote:
> I see. Also looking at the code it seems here we do actually still write 
> a transaction, so the boolean flag you've introduced is more like 
> "skipDelete" rather than "skipTransaction". Having a skipDelete flag in a 
> method called deleteTasks would be a pretty big code smell :)
> 
> deleteTasks is only called by the TaskHistoryPruner and the admin 
> endpoint, and is always a batch endpoint - so I think the flaw in this code 
> was pushing it into the per-task state machine code. All we want in our 
> deleteTasks code is:
> 
> @Override
> public void deleteTasks(MutableStoreProvider storeProvider, final 
> Set taskIds) {
>   TaskStore.Mutable taskStore = storeProvider.getUnsafeTaskStore();
>   taskStore.deleteTasks(taskIds);
>   Set events = taskIds.stream().map(id -> 
> deleteTasks(taskStore, id)).collect(...);
>   eventSink.post(events);
> }
> 
> This would require moving the taskStore.deletedTasks out of the event 
> creation code (which I would consider renaming from deleteTasks to 
> taskDeletedEvent or something) and into the side-effect handler like so:
> 
> 
> case DELETED:
>   taskStore.deleteTasks(...);
>   events.add(deleteTasks(taskStore, ...));
>   
>   
> Does that make sense?

Yes, that makes much more sense. I'll extract the taskStore.deleteTasks and 
rewrite deleteTasks code for the task prune endpoint.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176567
---


On June 1, 2017, 12:18 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 1, 2017, 12:18 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Santhosh Kumar.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/1/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-05-31 Thread Kai Huang


> On June 1, 2017, 12:53 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java
> > Line 388 (original), 411 (patched)
> > <https://reviews.apache.org/r/59699/diff/1/?file=1736080#file1736080line411>
> >
> > I'm not sure this is the right approach. I don't think we want to skip 
> > writing transactions to the replicated log. 
> > 
> > All of the APIs down to the storage layer already support batching so 
> > seems like we just want to move the delete command outside of the code that 
> > generates pubsub events. E.g. call taskStore.deleteTasks(taskIds) here when 
> > we receive the batch from the prune tasks endpoint:
> > 
> > 
> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java#L371
> > 
> > And then remove the taskStore.deleteTasks from the PubsubEvent factory 
> > method (have no idea why it's in there).
> 
> Kai Huang wrote:
> Moving the taskStore.deleteTasks out will break some other operations:
> e.g.
> When we kill a pending task, we will also delete the task (generates a 
> PubsubEvent and calls taskStore.deleteTasks). 
> https://github.com/apache/aurora/blob/master/src/test/java/org/apache/aurora/scheduler/state/StateManagerImplTest.java#L215
> 
> 
> We need to add taskStore.deleteTasks for all other use cases that 
> indirectly calls taskStore.deleteTasks.

Instead of rewriting the code, I think I just need to refactor those tests 
mentioned above. Currently they were all written in a way that assumes the 
taskStore.deleteTasks was included in the PubSubEvent factory.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176567
---


On June 1, 2017, 12:18 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 1, 2017, 12:18 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Santhosh Kumar.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/1/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-05-31 Thread Kai Huang


> On June 1, 2017, 12:57 a.m., David McLaughlin wrote:
> > Can you also provide metrics showing how much time is spent in the coalesce 
> > method? And show before/after results using the scale test?

Sure. I will add some benchmarks for the task history pruning.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176568
---


On June 1, 2017, 12:18 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 1, 2017, 12:18 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Santhosh Kumar.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/1/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 59699: Improve task history pruning by batch deleting tasks

2017-05-31 Thread Kai Huang


> On June 1, 2017, 12:53 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java
> > Line 388 (original), 411 (patched)
> > <https://reviews.apache.org/r/59699/diff/1/?file=1736080#file1736080line411>
> >
> > I'm not sure this is the right approach. I don't think we want to skip 
> > writing transactions to the replicated log. 
> > 
> > All of the APIs down to the storage layer already support batching so 
> > seems like we just want to move the delete command outside of the code that 
> > generates pubsub events. E.g. call taskStore.deleteTasks(taskIds) here when 
> > we receive the batch from the prune tasks endpoint:
> > 
> > 
> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java#L371
> > 
> > And then remove the taskStore.deleteTasks from the PubsubEvent factory 
> > method (have no idea why it's in there).

Moving the taskStore.deleteTasks out will break some other operations:
e.g.
When we kill a pending task, we will also delete the task (generates a 
PubsubEvent and calls taskStore.deleteTasks). 
https://github.com/apache/aurora/blob/master/src/test/java/org/apache/aurora/scheduler/state/StateManagerImplTest.java#L215


We need to add taskStore.deleteTasks for all other use cases that indirectly 
calls taskStore.deleteTasks.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/#review176567
---


On June 1, 2017, 12:18 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59699/
> ---
> 
> (Updated June 1, 2017, 12:18 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Santhosh Kumar.
> 
> 
> Bugs: AURORA-1929
> https://issues.apache.org/jira/browse/AURORA-1929
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Improve task history pruning by batch deleting tasks.
> 
> The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
> cluster has a large number of inactive tasks.
> 
> This CR batches all removeTasks operations and execute them all at once to 
> avoid additional cost of coalescing. The fix will also benefit implicit task 
> history pruning since it has similar underlying implementation. See 
> https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
> details.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 73878210f9028901fda3b08e66c6a63c24260d35 
> 
> 
> Diff: https://reviews.apache.org/r/59699/diff/1/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Review Request 59699: Improve task history pruning by batch deleting tasks

2017-05-31 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59699/
---

Review request for Aurora, David McLaughlin and Santhosh Kumar.


Bugs: AURORA-1929
https://issues.apache.org/jira/browse/AURORA-1929


Repository: aurora


Description
---

Improve task history pruning by batch deleting tasks.

The `'aurora_admin prune_tasks'` endpoint seems to be very slow when the 
cluster has a large number of inactive tasks.

This CR batches all removeTasks operations and execute them all at once to 
avoid additional cost of coalescing. The fix will also benefit implicit task 
history pruning since it has similar underlying implementation. See 
https://issues.apache.org/jira/browse/AURORA-1929 for more information and 
details.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
73878210f9028901fda3b08e66c6a63c24260d35 


Diff: https://reviews.apache.org/r/59699/diff/1/


Testing
---

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-08 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160868
---




commons/src/test/java/org/apache/aurora/common/stats/SlidingStatsTest.java 
(line 48)
<https://reviews.apache.org/r/55105/#comment232119>

Can we add a clock for SlidingStats like 
https://reviews.apache.org/r/54967/, so that we can use a Fake clock to test if 
the elapsed time was accumulated correctly?


- Kai Huang


On Jan. 5, 2017, 4:36 a.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Jan. 5, 2017, 4:36 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/stats/SlidingStats.java 
> f7a5ae41e307627fc55157758e9b7cdd861c3268 
>   commons/src/test/java/org/apache/aurora/common/stats/SlidingStatsTest.java 
> PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> snapshot_restore_scheduler_metadata_events_per_sec 0.0
> snapshot_restore_scheduler_metadata_nanos_per_event 0.0
> snapshot_restore_scheduler_metadata_nanos_total 70816
> snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
> snapshot_restore_tasks_events 1
> snapshot_restore_tasks_events_per_sec 0.0
> snapshot_restore_tasks_nanos_per_event 0.0
> snapshot_restore_tasks_nanos_total 91377
> snapshot_restore_tasks_nanos_total_per_sec 0.0
> ```
> 
> ```
> $ aurora_admin scheduler_snapshot devcluster
>  INFO] Response from scheduler: OK (message: )
>  
> $ curl localhost:8081/vars | grep snapshot_save
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100 482260 482260 0  3266k  0 --:--:-- --:--:-- --:--:-- 3363k
> snapshot_save_crons_events 1
> snapshot_save_crons_events_per_sec 0.0
> snapshot_save_crons_nanos_per_event 0.0

Re: Review Request 54967: AURORA-1856 Expose stats on deleted job updates in JobUpdateHistoryPruner

2017-01-08 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54967/#review160828
---


Ship it!




lgtm!

- Kai Huang


On Dec. 22, 2016, 7:37 a.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54967/
> ---
> 
> (Updated Dec. 22, 2016, 7:37 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1856
> https://issues.apache.org/jira/browse/AURORA-1856
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1856 Expose stats on deleted job updates in JobUpdateHistoryPruner
> 
> 
> Diffs
> -
> 
>   
> src/main/java/org/apache/aurora/scheduler/pruning/JobUpdateHistoryPruner.java 
> 6ab39ca0a64dfab9fe2fdec79fef1b4d320b6dc6 
>   
> src/test/java/org/apache/aurora/scheduler/pruning/JobUpdateHistoryPrunerTest.java
>  20f790ef4d04cd8aaa7cdab4442040a31fa72838 
> 
> Diff: https://reviews.apache.org/r/54967/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160517
---




src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
(line 521)
<https://reviews.apache.org/r/55105/#comment231657>

If we need this elsewhere, can we make the timing part more reusable like:

Stats.time("statsName", () -> {#function to time})


- Kai Huang


On Dec. 30, 2016, 9:18 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Dec. 30, 2016, 9:18 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> snapshot_restore_scheduler_metadata_events_per_sec 0.0
> snapshot_restore_scheduler_metadata_nanos_per_event 0.0
> snapshot_restore_scheduler_metadata_nanos_total 70816
> snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
> snapshot_restore_tasks_events 1
> snapshot_restore_tasks_events_per_sec 0.0
> snapshot_restore_tasks_nanos_per_event 0.0
> snapshot_restore_tasks_nanos_total 91377
> snapshot_restore_tasks_nanos_total_per_sec 0.0
> ```
> 
> ```
> $ aurora_admin scheduler_snapshot devcluster
>  INFO] Response from scheduler: OK (message: )
>  
> $ curl localhost:8081/vars | grep snapshot_save
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100 482260 482260 0  3266k  0 --:--:-- --:--:-- --:--:-- 3363k
> snapshot_save_crons_events 1
> snapshot_save_crons_events_per_sec 0.0
> snapshot_save_crons_nanos_per_event 0.0
> snapshot_save_crons_nanos_total 466181
> snapshot_save_crons_nanos_total_per_sec 0.0
> snapshot_save_dbscript_events 1
> snapshot_save_dbscript_events_per_sec 0.0
> snapshot_save_dbscript_nanos_per_event 0.0
> snapshot_save_dbscr

Re: Review Request 52806: Revert "Add min_consecutive_health_checks in HealthCheckConfig"

2016-10-12 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52806/#review152411
---


Ship it!




Ship It!

- Kai Huang


On Oct. 12, 2016, 9:09 p.m., David McLaughlin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52806/
> ---
> 
> (Updated Oct. 12, 2016, 9:09 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Kai Huang, and Zameer Manji.
> 
> 
> Bugs: AURORA-1793
> https://issues.apache.org/jira/browse/AURORA-1793
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This reverts commit ed72b1bf662d1e29d2bb483b317c787630c26a9e.
> 
> See AURORA-1793
> 
> Revert "Add support for receiving min_consecutive_successes in health checker"
> 
> This reverts commit e91130e49445c3933b6e27f5fde18c3a0e61b87a.
> 
> See AURORA-1793
> 
> Revert "Modify executor state transition logic to rely on health checks (if 
> enabled)."
> 
> This reverts commit ca683cb9e27bae76424a687bc6c3af5a73c501b9.
> 
> See AURORA-1793
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md f3dd8bb0f983c560f29ac39824e517c9f145d69e 
>   docs/features/job-updates.md c4ec42e1b641718bcb11ad5221640e359459751f 
>   docs/reference/configuration.md 71d2ce5faa9169efe864e77235f4089e3b9398e9 
>   src/main/python/apache/aurora/client/api/updater_util.py 
> ebeddab4fbdc0678878bff3215e58f026092ff79 
>   src/main/python/apache/aurora/client/config.py 
> ce4bffee49a97b7b0e16bcb6368141e97a676991 
>   src/main/python/apache/aurora/config/schema/base.py 
> baea660a6ea9f3f7213f98f8bfdb95d912083ac0 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> d30f2d3be08e8aa9f75b4ba92a6af70adcd6a6a8 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 1e0be108b49480d57c5ab94b1d2903bb57bae20a 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> fccb69a13effbf4f06a9180872ef52d9ef399698 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 26aea23c35fd860320075cc224022798078bac44 
>   src/test/python/apache/aurora/client/test_config.py 
> ff46558590c1a5b6ebd0b06db6238bff3c6858b6 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> 28769dca68a6353fc1283a8bb279fae05173aaac 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 8942639594131037e04b3e1f34847147c1e832ed 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> 7c0efe8e7d42621a5e880e2c99b475c3f1244095 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> ac914db6c990f6762439f04c06ba64fd0c8457b3 
> 
> Diff: https://reviews.apache.org/r/52806/diff/
> 
> 
> Testing
> ---
> 
> bash src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>



Re: Review Request 52766: Fix a bug in insufficient successes during initial_interval_secs

2016-10-11 Thread Kai Huang


> On Oct. 12, 2016, 1:49 a.m., David McLaughlin wrote:
> > src/test/python/apache/aurora/executor/common/test_health_checker.py, line 
> > 500
> > <https://reviews.apache.org/r/52766/diff/1/?file=1531886#file1531886line500>
> >
> > Maybe I'm misunderstanding something - but this test suite seems to 
> > lack any test for the value of hc.health_check_passed? This is what 
> > ultimately signals that the task should move into RUNNING:
> > 
> > 
> > https://github.com/apache/aurora/blob/master/src/main/python/apache/aurora/executor/common/health_checker.py#L222

Thanks for the reminder. Added test coverage for this value. 
initial_interval_secs can expire either at (a) before TASK_RUNNING or at (b) 
after TASK_RUNNING.

In zameer's job, the status returned in case (a) is "not health_check_passed & 
not healthy".
One additional health check is required at the boundary (c) to further ensure 
the task passed health check.


We do not detect this because test_task_health_ok() in test_thermos_executor 
only covers case (b). 
https://github.com/apache/aurora/blob/master/src/test/python/apache/aurora/executor/test_thermos_executor.py#L458


   achieve min successes
 
   health_check_passed & healthy
  
TASK_STARTING ---(a)(c)---> 
TASK_RUNNING---(b)-> TASK_FINISHED
|   
|   
|   
| reach max failure limit
|fail to achieve min successes  
|
|   
| health_check_passed & not healthy
|not health_check_passed & not healthy  
   \|/  
|   
   TASK_FAILED
   \|/
   TASK_FAILED


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52766/#review152252
---


On Oct. 12, 2016, 5:01 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52766/
> ---
> 
> (Updated Oct. 12, 2016, 5:01 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Zameer Manji.
> 
> 
> Bugs: AURORA-1791
> https://issues.apache.org/jira/browse/AURORA-1791
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Fix a bug in commit ca683cb. The commit is related to this review 
> https://reviews.apache.org/r/51876/. Please see it for more details and 
> backgrounds.
> 
> Currently, health checks are performed during a grace period called 
> initial_interval_secs. It is likely that HealthChecker fails to see 
> sufficient number of successes before the intitial_interval_secs expires. For 
> example, for a task with HealthCheckConfig(initital_interval_secs=15, 
> interval_secs=10, min_consecutive_successes=1). If the task sleeps during the 
> first 12 seconds and becomes healthy afterwards, the health checker will 
> report the task status as "TASK_FAILED" and miss the "healthy" status between 
> second 12-15. This is because only one health check is performed at second 10 
> before the initial_interval_secs expires. This is an implementation flaw that 
> breaks backward-compatability. 
> 
> To address this problem, I rewrite the function that is responsible for 
> updating the failure counts and the healthy status. The expected behavior is 
> that for the task described above, the health checker will performs a health 
> check after the initial_interval_secs expires and sets the health check 
> status to be healthy. Please see this review for more details.
> 
> Will add some more tests since the current e2e tests does not include the 
> above test case.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 1e0be108b49480d57c5ab94b1d2903bb57bae20a 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> 28769dca68a6353fc1283a8bb279fae05173aaac 
> 
> Diff: https://reviews.apache.org/r/52766/diff/
> 
> 
> Testing
> ---
&

Re: Review Request 52766: Fix a bug in insufficient successes during initial_interval_secs

2016-10-11 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52766/
---

(Updated Oct. 12, 2016, 5:01 a.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Add test coverage for the value of hc.health_check_passed which signals the 
task state transition.


Bugs: AURORA-1791
https://issues.apache.org/jira/browse/AURORA-1791


Repository: aurora


Description
---

Fix a bug in commit ca683cb. The commit is related to this review 
https://reviews.apache.org/r/51876/. Please see it for more details and 
backgrounds.

Currently, health checks are performed during a grace period called 
initial_interval_secs. It is likely that HealthChecker fails to see sufficient 
number of successes before the intitial_interval_secs expires. For example, for 
a task with HealthCheckConfig(initital_interval_secs=15, interval_secs=10, 
min_consecutive_successes=1). If the task sleeps during the first 12 seconds 
and becomes healthy afterwards, the health checker will report the task status 
as "TASK_FAILED" and miss the "healthy" status between second 12-15. This is 
because only one health check is performed at second 10 before the 
initial_interval_secs expires. This is an implementation flaw that breaks 
backward-compatability. 

To address this problem, I rewrite the function that is responsible for 
updating the failure counts and the healthy status. The expected behavior is 
that for the task described above, the health checker will performs a health 
check after the initial_interval_secs expires and sets the health check status 
to be healthy. Please see this review for more details.

Will add some more tests since the current e2e tests does not include the above 
test case.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/common/health_checker.py 
1e0be108b49480d57c5ab94b1d2903bb57bae20a 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
28769dca68a6353fc1283a8bb279fae05173aaac 

Diff: https://reviews.apache.org/r/52766/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Modified the test in http_example.py. Let the http server sleep for the first 
10 seconds.

Launch a job that contains the task with Default 
HealthCheckConfig(initial_interval_secs=15, interval_secs=10, 
min_consecutive_successes=1) in vagrant aurora cluster. The task transitions to 
TASK_RUNNING state after ~20 seconds.


File Attachments


Task with default Health Check Config
  
https://reviews.apache.org/media/uploaded/files/2016/10/12/64cf6610-9294-46cb-b159-6e5721da5fff__Screen_Shot_2016-10-11_at_6.17.00_PM.png


Thanks,

Kai Huang



Review Request 52766: Fix a bug in insufficient successes during initial_interval_secs

2016-10-11 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52766/
---

Review request for Aurora, Joshua Cohen and Zameer Manji.


Bugs: AURORA-1791
https://issues.apache.org/jira/browse/AURORA-1791


Repository: aurora


Description
---

Fix a bug in commit ca683cb. The commit is related to this review 
https://reviews.apache.org/r/51876/. Please see it for more details and 
backgrounds.

Currently, health checks are performed during a grace period called 
initial_interval_secs. It is likely that HealthChecker fails to see sufficient 
number of successes before the intitial_interval_secs expires. For example, for 
a task with HealthCheckConfig(initital_interval_secs=15, interval_secs=10, 
min_consecutive_successes=1). If the task sleeps during the first 12 seconds 
and becomes healthy afterwards, the health checker will report the task status 
as "TASK_FAILED" and miss the "healthy" status between second 12-15. This is 
because only one health check is performed at second 10 before the 
initial_interval_secs expires. This is an implementation flaw that breaks 
backward-compatability. 

To address this problem, I rewrite the function that is responsible for 
updating the failure counts and the healthy status. The expected behavior is 
that for the task described above, the health checker will performs a health 
check after the initial_interval_secs expires and sets the health check status 
to be healthy. Please see this review for more details.

Will add some more tests since the current e2e tests does not include the above 
test case.


Diffs
-

  src/main/python/apache/aurora/executor/common/health_checker.py 
1e0be108b49480d57c5ab94b1d2903bb57bae20a 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
28769dca68a6353fc1283a8bb279fae05173aaac 

Diff: https://reviews.apache.org/r/52766/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Modified the test in http_example.py. Let the http server sleep for the first 
10 seconds.

Launch a job that contains the task with Default 
HealthCheckConfig(initial_interval_secs=15, interval_secs=10, 
min_consecutive_successes=1) in vagrant aurora cluster. The task transitions to 
TASK_RUNNING state after ~20 seconds.


File Attachments


Task with default Health Check Config
  
https://reviews.apache.org/media/uploaded/files/2016/10/12/64cf6610-9294-46cb-b159-6e5721da5fff__Screen_Shot_2016-10-11_at_6.17.00_PM.png


Thanks,

Kai Huang



Re: Review Request 52453: Add support for receiving min_consecutive_successes in health checker

2016-10-05 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/
---

(Updated Oct. 5, 2016, 10:01 p.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Fix merge conflict in RELEASE-NOTE.md


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Add support for receiving a new HealthCheckConfig attribute 
"min_consecutive_successes" in health checker.
- Add an entry in release note that describes the health check driven update 
feature.

This patch is related to https://reviews.apache.org/r/52094/, in which I added 
a new configuration value "min_consecutive_successes" in HealthCheckConfig.


Diffs (updated)
-

  RELEASE-NOTES.md 97f05d52cfd2ed481feae0d01172b5b1af72b80a 
  docs/features/job-updates.md 792f2ae5fd14b1ea9af8be000629ce5a7fc2fe8f 
  src/main/python/apache/aurora/client/api/updater_util.py 
c649316edb876565c92cc90c9f030e153c008924 
  src/main/python/apache/aurora/executor/common/health_checker.py 
03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 

Diff: https://reviews.apache.org/r/52453/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Kai Huang



Re: Review Request 52453: Add support for receiving min_consecutive_successes in health checker

2016-10-05 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/
---

(Updated Oct. 5, 2016, 9:49 p.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Remove dependencies.


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Add support for receiving a new HealthCheckConfig attribute 
"min_consecutive_successes" in health checker.
- Add an entry in release note that describes the health check driven update 
feature.

This patch is related to https://reviews.apache.org/r/52094/, in which I added 
a new configuration value "min_consecutive_successes" in HealthCheckConfig.


Diffs
-

  RELEASE-NOTES.md 49c03e85ae4c2e3ebc8af89e9ce41df9fd52d6cd 
  docs/features/job-updates.md 792f2ae5fd14b1ea9af8be000629ce5a7fc2fe8f 
  src/main/python/apache/aurora/client/api/updater_util.py 
c649316edb876565c92cc90c9f030e153c008924 
  src/main/python/apache/aurora/executor/common/health_checker.py 
03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 

Diff: https://reviews.apache.org/r/52453/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Kai Huang



Re: Review Request 52453: Add support for receiving min_consecutive_successes in health checker

2016-10-05 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/
---

(Updated Oct. 5, 2016, 6:09 a.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Add documentations for health check driven update in job update documentation.


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Add support for receiving a new HealthCheckConfig attribute 
"min_consecutive_successes" in health checker.
- Add an entry in release note that describes the health check driven update 
feature.

This patch is related to https://reviews.apache.org/r/52094/, in which I added 
a new configuration value "min_consecutive_successes" in HealthCheckConfig.


Diffs (updated)
-

  RELEASE-NOTES.md 49c03e85ae4c2e3ebc8af89e9ce41df9fd52d6cd 
  docs/features/job-updates.md 792f2ae5fd14b1ea9af8be000629ce5a7fc2fe8f 
  src/main/python/apache/aurora/client/api/updater_util.py 
c649316edb876565c92cc90c9f030e153c008924 
  src/main/python/apache/aurora/executor/common/health_checker.py 
03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 

Diff: https://reviews.apache.org/r/52453/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Kai Huang



Re: Review Request 52453: Add support for receiving min_consecutive_successes in health checker

2016-10-04 Thread Kai Huang


> On Oct. 4, 2016, 7:17 p.m., Joshua Cohen wrote:
> > src/main/python/apache/aurora/client/api/updater_util.py, line 39
> > <https://reviews.apache.org/r/52453/diff/2/?file=1519298#file1519298line39>
> >
> > "Watch seconds should not be negative."

sure. Will do.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/#review151372
-------


On Oct. 4, 2016, 12:42 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52453/
> ---
> 
> (Updated Oct. 4, 2016, 12:42 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Add support for receiving a new HealthCheckConfig attribute 
> "min_consecutive_successes" in health checker.
> - Add an entry in release note that describes the health check driven update 
> feature.
> 
> This patch is related to https://reviews.apache.org/r/52094/, in which I 
> added a new configuration value "min_consecutive_successes" in 
> HealthCheckConfig.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md 49c03e85ae4c2e3ebc8af89e9ce41df9fd52d6cd 
>   src/main/python/apache/aurora/client/api/updater_util.py 
> c649316edb876565c92cc90c9f030e153c008924 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 
> 
> Diff: https://reviews.apache.org/r/52453/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 52453: Add support for receiving min_consecutive_successes in health checker

2016-10-04 Thread Kai Huang


> On Oct. 4, 2016, 6:36 a.m., Kai Huang wrote:
> > I noticed that AuroraBot says the test is green but some coverage are 
> > missing...
> > 
> > When I test the code locally, the code passes pants tests and style check 
> > in:
> > ./build-support/jenkins/build.sh
> > 
> > it also passes the e2e tests in:
> > ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> > 
> > However, I noticed some abnormalies in individual tests:
> > 
> > The following test command is always green:
> > ./pants test.pytest --options="-s -k test_thermos_executor" 
> > src/test/python/apache/aurora/executor::
> > 
> > However, the following test command with coverage flag turned-on is flaky 
> > for one test: test_health_check_ok.
> > ./pants test.pytest --options="-s -k test_thermos_executor" 
> > --coverage=1 src/test/python/apache/aurora/executor::
> > 
> > This test is testing health check in a thermos executor with very small 
> > initial_interval(0.1 seconds) plus a fast StatusManager(Polling status 
> > every 10 milli_seconds). See 
> > https://github.com/apache/aurora/blob/master/src/test/python/apache/aurora/executor/test_thermos_executor.py#L216
> >  and 
> > https://github.com/apache/aurora/blob/master/src/test/python/apache/aurora/executor/test_thermos_executor.py#L458
> > 
> > This may explain why AuroraBot says green tests but coverage are missing.
> > 
> > I'm not sure if this is purely an artifact of the coverage flag or caused 
> > by my code change in the Aurora Executor. But I'm now concerned that the 
> > flakiness is not exposed by our pants test and e2e test.
> > 
> > My guess is that it might be caused by my code 
> > change(https://github.com/apache/aurora/blob/master/src/main/python/apache/aurora/executor/aurora_executor.py#L120),
> >  and exacerbated by the --coverage flag. 
> > 
> > I was wondering if we should just proceed and ignore the warning or be more 
> > precautious and revisit executor change?

Currently the test case is doing a health check with initial_interval_secs=0.1 
secs, interval_secs=0.1 secs. We can eliminate the flakiness by slightly 
increasing the initial_interval_secs in the test case (e.g. from 0.1 to 0.12 
sec).

But I believe that should be addressed in an separate review.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/#review151303
---


On Oct. 4, 2016, 12:42 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52453/
> ---
> 
> (Updated Oct. 4, 2016, 12:42 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Add support for receiving a new HealthCheckConfig attribute 
> "min_consecutive_successes" in health checker.
> - Add an entry in release note that describes the health check driven update 
> feature.
> 
> This patch is related to https://reviews.apache.org/r/52094/, in which I 
> added a new configuration value "min_consecutive_successes" in 
> HealthCheckConfig.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md 49c03e85ae4c2e3ebc8af89e9ce41df9fd52d6cd 
>   src/main/python/apache/aurora/client/api/updater_util.py 
> c649316edb876565c92cc90c9f030e153c008924 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 
> 
> Diff: https://reviews.apache.org/r/52453/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 52453: Add support for receiving min_consecutive_successes in health checker

2016-10-04 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/#review151303
---



I noticed that AuroraBot says the test is green but some coverage are missing...

When I test the code locally, the code passes pants tests and style check in:
./build-support/jenkins/build.sh

it also passes the e2e tests in:
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

However, I noticed some abnormalies in individual tests:

The following test command is always green:
./pants test.pytest --options="-s -k test_thermos_executor" 
src/test/python/apache/aurora/executor::

However, the following test command with coverage flag turned-on is flaky for 
one test: test_health_check_ok.
./pants test.pytest --options="-s -k test_thermos_executor" --coverage=1 
src/test/python/apache/aurora/executor::

This test is testing health check in a thermos executor with very small 
initial_interval(0.1 seconds) plus a fast StatusManager(Polling status every 10 
milli_seconds). See 
https://github.com/apache/aurora/blob/master/src/test/python/apache/aurora/executor/test_thermos_executor.py#L216
 and 
https://github.com/apache/aurora/blob/master/src/test/python/apache/aurora/executor/test_thermos_executor.py#L458

This may explain why AuroraBot says green tests but coverage are missing.

I'm not sure if this is purely an artifact of the coverage flag or caused by my 
code change in the Aurora Executor. But I'm now concerned that the flakiness is 
not exposed by our pants test and e2e test.

My guess is that it might be caused by my code 
change(https://github.com/apache/aurora/blob/master/src/main/python/apache/aurora/executor/aurora_executor.py#L120),
 and exacerbated by the --coverage flag. 

I was wondering if we should just proceed and ignore the warning or be more 
precautious and revisit executor change?

- Kai Huang


On Oct. 4, 2016, 12:42 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52453/
> ---
> 
> (Updated Oct. 4, 2016, 12:42 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Add support for receiving a new HealthCheckConfig attribute 
> "min_consecutive_successes" in health checker.
> - Add an entry in release note that describes the health check driven update 
> feature.
> 
> This patch is related to https://reviews.apache.org/r/52094/, in which I 
> added a new configuration value "min_consecutive_successes" in 
> HealthCheckConfig.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md 49c03e85ae4c2e3ebc8af89e9ce41df9fd52d6cd 
>   src/main/python/apache/aurora/client/api/updater_util.py 
> c649316edb876565c92cc90c9f030e153c008924 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 
> 
> Diff: https://reviews.apache.org/r/52453/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 52453: Add support for receiving min_consecutive_successes in health checker

2016-10-03 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/
---

(Updated Oct. 4, 2016, 12:42 a.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Relax client-side constraint on watch_secs to accept zero as value.


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Add support for receiving a new HealthCheckConfig attribute 
"min_consecutive_successes" in health checker.
- Add an entry in release note that describes the health check driven update 
feature.

This patch is related to https://reviews.apache.org/r/52094/, in which I added 
a new configuration value "min_consecutive_successes" in HealthCheckConfig.


Diffs (updated)
-

  RELEASE-NOTES.md 49c03e85ae4c2e3ebc8af89e9ce41df9fd52d6cd 
  src/main/python/apache/aurora/client/api/updater_util.py 
c649316edb876565c92cc90c9f030e153c008924 
  src/main/python/apache/aurora/executor/common/health_checker.py 
03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 

Diff: https://reviews.apache.org/r/52453/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Kai Huang



Review Request 52453: Add a patch that links client and executor change

2016-09-30 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52453/
---

Review request for Aurora, Joshua Cohen and Zameer Manji.


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Add support for receiving a new HealthCheckConfig attribute 
"min_consecutive_successes" in health checker.
- Add an entry in release note that describes the health check driven update 
feature.

This patch is related to https://reviews.apache.org/r/52094/, in which I added 
a new configuration value "min_consecutive_successes" in HealthCheckConfig.


Diffs
-

  RELEASE-NOTES.md 49c03e85ae4c2e3ebc8af89e9ce41df9fd52d6cd 
  src/main/python/apache/aurora/executor/common/health_checker.py 
03fbffdc3862a94c2ba42c9b9e8f2be4094129b8 

Diff: https://reviews.apache.org/r/52453/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::

./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Kai Huang



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-30 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/
---

(Updated Sept. 30, 2016, 5:17 p.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Add test case for the scenario where a status provider throws during 
initialization.


Bugs: AURORA-1225
https://issues.apache.org/jira/browse/AURORA-1225


Repository: aurora


Description
---

Modify executor state transition logic to rely on health checks (if enabled).

[Summary]
Executor needs to start executing user content in STARTING and transition to 
RUNNING when a successful required number of health checks is reached.

This review contains a series of executor changes that implement the health 
check driven updates. It gives more context of the design of this feature.

[Background]
Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

[Description]
If health check is enabled on vCurrent executor, the health checker will send a 
"TASK_RUNNING" message when a successful required number of health checks is 
reached within the initial_interval_secs. On the other hand, a "TASK_FAILED" 
message was sent if the health checker fails to reach the required number of 
health checks within that period, or a maximum number of failed health check 
limit is reached after the initital_interval_secs.

If health check is disabled on the vCurrent executor, it will sends 
"TASK_RUNNING" message to scheduler after the thermos runner was started. In 
this scenario, the behavior of vCurrent executor will be the same as the vPrev 
executor.

[Change List]
The current change set includes:
1. Removed the status memoization in ChainedStatusChecker.
2. Modified the StatusManager to be edge triggered.
3. Changed the Aurora Executor callback function.
4. Modified the Health Checker and redefined the meaning initial_interval_secs.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
ce5ef680f01831cd89fced8969ae3246c7f60cfd 
  src/main/python/apache/aurora/executor/common/health_checker.py 
5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
  src/main/python/apache/aurora/executor/common/status_checker.py 
795dae2d6b661fc528d952c2315196d94127961f 
  src/main/python/apache/aurora/executor/status_manager.py 
228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
  src/test/python/apache/aurora/executor/common/test_status_checker.py 
5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
  src/test/python/apache/aurora/executor/test_status_manager.py 
ce4679ba1aa7b42cf0115c943d84663030182d23 
  src/test/python/apache/aurora/executor/test_thermos_executor.py 
0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 

Diff: https://reviews.apache.org/r/51876/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::


Thanks,

Kai Huang



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-30 Thread Kai Huang


> On Sept. 30, 2016, 4:11 p.m., Joshua Cohen wrote:
> > Can you add a test case for the scenario where a status provider throws 
> > during initialization?

Oh, that's a good point. Will do.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review151029
---


On Sept. 29, 2016, 11:11 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 29, 2016, 11:11 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 52094: Add min_consecutive_health_checks in HealthCheckConfig

2016-09-29 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52094/
---

(Updated Sept. 29, 2016, 11:43 p.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Add validation for min_consecutive_successes in HealthCheckConfig.


Bugs: AURORA-1224
https://issues.apache.org/jira/browse/AURORA-1224


Repository: aurora


Description
---

Add min_consecutive_health_checks to HealthCheckConfig.

[Summary]
HealthCheckConfig should accept a new configuration value that will tell how 
many positive consecutive health checks an instance requires to move from 
STARTING to RUNNING.

[Background]
This review depends on the executor change(AURORA-1225). Please see 
https://reviews.apache.org/r/51876/ for more details and background.

[Change List]
1. Add a configuration value "min_consecutive_health_checks"(default=1) to 
HealthCheckConfig struct.
2. Modify the default value of watch_secs to be 0.
3. Add a client-side constraint: 
initial_interval_secs >= min_consecutive_health_checks * interval_secs
4. Update the unit tests for health check config in client/config.py, skip unit 
tests related to watch_secs.


Diffs (updated)
-

  docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 
  src/main/python/apache/aurora/client/config.py 
0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 
  src/main/python/apache/aurora/config/schema/base.py 
845163043b0b7b2f9e7aca14677ca9f094658551 
  src/test/python/apache/aurora/client/test_config.py 
5cf68a5145ddf9478baa30453c0bcb73136fa7eb 

Diff: https://reviews.apache.org/r/52094/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/client::

./pants test.pytest src/test/python/apache/aurora/config::


Thanks,

Kai Huang



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-29 Thread Kai Huang


> On Sept. 28, 2016, 9:52 p.m., Joshua Cohen wrote:
> > I tried to commit this, but e2e tests hung for me.
> > 
> > Kai, can you investigate?
> 
> Kai Huang wrote:
> There is a bug of thermos kill for task running in docker 
> container(https://issues.apache.org/jira/browse/AURORA-1426 ). It is likely 
> due to my code change triggered this bug or exacerbate the effect of it. I'll 
> investigate this.

It turned out to be a bug related to my changes in aurora executor. Fixed it by 
adding missing error handling. See the new review request.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review150769
-------


On Sept. 29, 2016, 11:11 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 29, 2016, 11:11 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-29 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/
---

(Updated Sept. 29, 2016, 11:11 p.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Fixed a bug in aurora executor. Add error handling when setting up 
StatusCheckers.


Bugs: AURORA-1225
https://issues.apache.org/jira/browse/AURORA-1225


Repository: aurora


Description
---

Modify executor state transition logic to rely on health checks (if enabled).

[Summary]
Executor needs to start executing user content in STARTING and transition to 
RUNNING when a successful required number of health checks is reached.

This review contains a series of executor changes that implement the health 
check driven updates. It gives more context of the design of this feature.

[Background]
Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

[Description]
If health check is enabled on vCurrent executor, the health checker will send a 
"TASK_RUNNING" message when a successful required number of health checks is 
reached within the initial_interval_secs. On the other hand, a "TASK_FAILED" 
message was sent if the health checker fails to reach the required number of 
health checks within that period, or a maximum number of failed health check 
limit is reached after the initital_interval_secs.

If health check is disabled on the vCurrent executor, it will sends 
"TASK_RUNNING" message to scheduler after the thermos runner was started. In 
this scenario, the behavior of vCurrent executor will be the same as the vPrev 
executor.

[Change List]
The current change set includes:
1. Removed the status memoization in ChainedStatusChecker.
2. Modified the StatusManager to be edge triggered.
3. Changed the Aurora Executor callback function.
4. Modified the Health Checker and redefined the meaning initial_interval_secs.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
ce5ef680f01831cd89fced8969ae3246c7f60cfd 
  src/main/python/apache/aurora/executor/common/health_checker.py 
5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
  src/main/python/apache/aurora/executor/common/status_checker.py 
795dae2d6b661fc528d952c2315196d94127961f 
  src/main/python/apache/aurora/executor/status_manager.py 
228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
  src/test/python/apache/aurora/executor/common/test_status_checker.py 
5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
  src/test/python/apache/aurora/executor/test_status_manager.py 
ce4679ba1aa7b42cf0115c943d84663030182d23 
  src/test/python/apache/aurora/executor/test_thermos_executor.py 
0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 

Diff: https://reviews.apache.org/r/51876/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::


Thanks,

Kai Huang



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-29 Thread Kai Huang


> On Sept. 28, 2016, 9:52 p.m., Joshua Cohen wrote:
> > I tried to commit this, but e2e tests hung for me.
> > 
> > Kai, can you investigate?

There is a bug of thermos kill for task running in docker 
container(https://issues.apache.org/jira/browse/AURORA-1426 ). It is likely due 
to my code change triggered this bug or exacerbate the effect of it. I'll 
investigate this.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review150769
---


On Sept. 28, 2016, 9:07 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 28, 2016, 9:07 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-28 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/
---

(Updated Sept. 28, 2016, 9:07 p.m.)


Review request for Aurora, Joshua Cohen and Zameer Manji.


Changes
---

Explicitly check all callback types in StatusManager


Bugs: AURORA-1225
https://issues.apache.org/jira/browse/AURORA-1225


Repository: aurora


Description
---

Modify executor state transition logic to rely on health checks (if enabled).

[Summary]
Executor needs to start executing user content in STARTING and transition to 
RUNNING when a successful required number of health checks is reached.

This review contains a series of executor changes that implement the health 
check driven updates. It gives more context of the design of this feature.

[Background]
Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

[Description]
If health check is enabled on vCurrent executor, the health checker will send a 
"TASK_RUNNING" message when a successful required number of health checks is 
reached within the initial_interval_secs. On the other hand, a "TASK_FAILED" 
message was sent if the health checker fails to reach the required number of 
health checks within that period, or a maximum number of failed health check 
limit is reached after the initital_interval_secs.

If health check is disabled on the vCurrent executor, it will sends 
"TASK_RUNNING" message to scheduler after the thermos runner was started. In 
this scenario, the behavior of vCurrent executor will be the same as the vPrev 
executor.

[Change List]
The current change set includes:
1. Removed the status memoization in ChainedStatusChecker.
2. Modified the StatusManager to be edge triggered.
3. Changed the Aurora Executor callback function.
4. Modified the Health Checker and redefined the meaning initial_interval_secs.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
ce5ef680f01831cd89fced8969ae3246c7f60cfd 
  src/main/python/apache/aurora/executor/common/health_checker.py 
5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
  src/main/python/apache/aurora/executor/common/status_checker.py 
795dae2d6b661fc528d952c2315196d94127961f 
  src/main/python/apache/aurora/executor/status_manager.py 
228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
  src/test/python/apache/aurora/executor/common/test_status_checker.py 
5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
  src/test/python/apache/aurora/executor/test_status_manager.py 
ce4679ba1aa7b42cf0115c943d84663030182d23 
  src/test/python/apache/aurora/executor/test_thermos_executor.py 
0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 

Diff: https://reviews.apache.org/r/51876/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::


Thanks,

Kai Huang



Re: Review Request 52094: Add min_consecutive_health_checks in HealthCheckConfig

2016-09-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52094/
---

(Updated Sept. 28, 2016, 4:08 a.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Removed deprecated unit tests for watch_secs.

Resumed default value of watch_secs.


Bugs: AURORA-1224
https://issues.apache.org/jira/browse/AURORA-1224


Repository: aurora


Description
---

Add min_consecutive_health_checks to HealthCheckConfig.

[Summary]
HealthCheckConfig should accept a new configuration value that will tell how 
many positive consecutive health checks an instance requires to move from 
STARTING to RUNNING.

[Background]
This review depends on the executor change(AURORA-1225). Please see 
https://reviews.apache.org/r/51876/ for more details and background.

[Change List]
1. Add a configuration value "min_consecutive_health_checks"(default=1) to 
HealthCheckConfig struct.
2. Modify the default value of watch_secs to be 0.
3. Add a client-side constraint: 
initial_interval_secs >= min_consecutive_health_checks * interval_secs
4. Update the unit tests for health check config in client/config.py, skip unit 
tests related to watch_secs.


Diffs (updated)
-

  docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 
  src/main/python/apache/aurora/client/config.py 
0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 
  src/main/python/apache/aurora/config/schema/base.py 
845163043b0b7b2f9e7aca14677ca9f094658551 
  src/test/python/apache/aurora/client/test_config.py 
5cf68a5145ddf9478baa30453c0bcb73136fa7eb 

Diff: https://reviews.apache.org/r/52094/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/client::

./pants test.pytest src/test/python/apache/aurora/config::


Thanks,

Kai Huang



Re: Review Request 52094: Add min_consecutive_health_checks in HealthCheckConfig

2016-09-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52094/
---

(Updated Sept. 27, 2016, 11:33 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Renamed "min_consecutive_health_checks" to "min_consecutive_successes", in 
accordance with executor change.

Added documentaton for the HealthCheckConfig change.


Bugs: AURORA-1224
https://issues.apache.org/jira/browse/AURORA-1224


Repository: aurora


Description
---

Add min_consecutive_health_checks to HealthCheckConfig.

[Summary]
HealthCheckConfig should accept a new configuration value that will tell how 
many positive consecutive health checks an instance requires to move from 
STARTING to RUNNING.

[Background]
This review depends on the executor change(AURORA-1225). Please see 
https://reviews.apache.org/r/51876/ for more details and background.

[Change List]
1. Add a configuration value "min_consecutive_health_checks"(default=1) to 
HealthCheckConfig struct.
2. Modify the default value of watch_secs to be 0.
3. Add a client-side constraint: 
initial_interval_secs >= min_consecutive_health_checks * interval_secs
4. Update the unit tests for health check config in client/config.py, skip unit 
tests related to watch_secs.


Diffs (updated)
-

  docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 
  src/main/python/apache/aurora/client/config.py 
0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 
  src/main/python/apache/aurora/config/schema/base.py 
845163043b0b7b2f9e7aca14677ca9f094658551 
  src/test/python/apache/aurora/client/test_config.py 
5cf68a5145ddf9478baa30453c0bcb73136fa7eb 

Diff: https://reviews.apache.org/r/52094/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/client::

./pants test.pytest src/test/python/apache/aurora/config::


Thanks,

Kai Huang



Re: Review Request 52094: Add min_consecutive_health_checks in HealthCheckConfig

2016-09-27 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52094/
---

(Updated Sept. 27, 2016, 11:30 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Rename "min_consecutive_health_checks" to "min_consecutive_successes", in 
accordance with executor change. Add documentations for config change.


Bugs: AURORA-1224
https://issues.apache.org/jira/browse/AURORA-1224


Repository: aurora


Description
---

Add min_consecutive_health_checks to HealthCheckConfig.

[Summary]
HealthCheckConfig should accept a new configuration value that will tell how 
many positive consecutive health checks an instance requires to move from 
STARTING to RUNNING.

[Background]
This review depends on the executor change(AURORA-1225). Please see 
https://reviews.apache.org/r/51876/ for more details and background.

[Change List]
1. Add a configuration value "min_consecutive_health_checks"(default=1) to 
HealthCheckConfig struct.
2. Modify the default value of watch_secs to be 0.
3. Add a client-side constraint: 
initial_interval_secs >= min_consecutive_health_checks * interval_secs
4. Update the unit tests for health check config in client/config.py, skip unit 
tests related to watch_secs.


Diffs (updated)
-

  docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 
  src/main/python/apache/aurora/client/config.py 
0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 
  src/main/python/apache/aurora/config/schema/base.py 
845163043b0b7b2f9e7aca14677ca9f094658551 
  src/test/python/apache/aurora/client/test_config.py 
5cf68a5145ddf9478baa30453c0bcb73136fa7eb 

Diff: https://reviews.apache.org/r/52094/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/client::

./pants test.pytest src/test/python/apache/aurora/config::


Thanks,

Kai Huang



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-27 Thread Kai Huang


> On Sept. 27, 2016, 9:27 p.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/common/status_checker.py, line 104
> > <https://reviews.apache.org/r/51876/diff/5/?file=1509365#file1509365line104>
> >
> > This is slightly beyond the pull request, but for the matter of 
> > generalization: We could only break here if the returned state is in 
> > `ExecutorBase.TERMINAL_STATES`.

It is possible the status is in a unknown or unmapped state, in this case it's 
safer to invoke the shutdown callback function.


> On Sept. 27, 2016, 9:27 p.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/aurora_executor.py, line 219
> > <https://reviews.apache.org/r/51876/diff/5/?file=1509363#file1509363line219>
> >
> > The `status_result` is carrying that is healthy string already as its 
> > `reason`. To reduce duplication you should use that instead.

Thanks, will fix it.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review150612
-------


On Sept. 23, 2016, 6:58 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 23, 2016, 6:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-26 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review150428
---




src/main/python/apache/aurora/executor/common/health_checker.py (line 196)
<https://reviews.apache.org/r/51876/#comment218408>

We should export consecutive_successes as well?


- Kai Huang


On Sept. 23, 2016, 6:58 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 23, 2016, 6:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-23 Thread Kai Huang


> On Sept. 23, 2016, 10:18 p.m., Zameer Manji wrote:
> > src/main/python/apache/aurora/executor/aurora_executor.py, line 120
> > <https://reviews.apache.org/r/51876/diff/5/?file=1509363#file1509363line120>
> >
> > This check is brittle to determine if health checking is enabled. 
> > Please consider an alternative approach.
> > 
> > You have access to the `assigned_task` object. Passing that in to the 
> > helper method `mesos_task_instance_from_assigned_task` will give you an 
> > instance of the executor config.
> > 
> > You can then check the `health_check_config()` property of the result 
> > to see if health checking is enabled for the task.

Seems like we are taking a step back? In the first revision, I implemented 
pretty much the way you mentioned above, that is creating a 
is_health_check_enabled(assigned_task) function in task_info.py. 

However, Maxim raised a valid point that is_health_check_enabled has some 
duplication with the creation of health checker in later step.
So the problem would be whether we should reuse the logic of 
is_health_check_enabled in health_checker?

One solution is to store all the computation result(prot_map, health_checker, 
health_check_config) in a utility class. So that it can be reuse later. But a 
downside here is that the is_health_check_enabled now serves multiple purposes, 
and the meaning of this function is not clear. It should only answer one 
question: is health check enabled on this task? 

>From my perspective, I think we should allow some sacrifice of reusability 
>here.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review150246
-------


On Sept. 23, 2016, 6:58 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 23, 2016, 6:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/co

Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-23 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/
---

(Updated Sept. 23, 2016, 6:58 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Modified the logic of checking if health check is enabled for a task.


Bugs: AURORA-1225
https://issues.apache.org/jira/browse/AURORA-1225


Repository: aurora


Description
---

Modify executor state transition logic to rely on health checks (if enabled).

[Summary]
Executor needs to start executing user content in STARTING and transition to 
RUNNING when a successful required number of health checks is reached.

This review contains a series of executor changes that implement the health 
check driven updates. It gives more context of the design of this feature.

[Background]
Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

[Description]
If health check is enabled on vCurrent executor, the health checker will send a 
"TASK_RUNNING" message when a successful required number of health checks is 
reached within the initial_interval_secs. On the other hand, a "TASK_FAILED" 
message was sent if the health checker fails to reach the required number of 
health checks within that period, or a maximum number of failed health check 
limit is reached after the initital_interval_secs.

If health check is disabled on the vCurrent executor, it will sends 
"TASK_RUNNING" message to scheduler after the thermos runner was started. In 
this scenario, the behavior of vCurrent executor will be the same as the vPrev 
executor.

[Change List]
The current change set includes:
1. Removed the status memoization in ChainedStatusChecker.
2. Modified the StatusManager to be edge triggered.
3. Changed the Aurora Executor callback function.
4. Modified the Health Checker and redefined the meaning initial_interval_secs.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
ce5ef680f01831cd89fced8969ae3246c7f60cfd 
  src/main/python/apache/aurora/executor/common/health_checker.py 
5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
  src/main/python/apache/aurora/executor/common/status_checker.py 
795dae2d6b661fc528d952c2315196d94127961f 
  src/main/python/apache/aurora/executor/status_manager.py 
228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
  src/test/python/apache/aurora/executor/common/test_status_checker.py 
5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
  src/test/python/apache/aurora/executor/test_status_manager.py 
ce4679ba1aa7b42cf0115c943d84663030182d23 
  src/test/python/apache/aurora/executor/test_thermos_executor.py 
0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 

Diff: https://reviews.apache.org/r/51876/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::


Thanks,

Kai Huang



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-23 Thread Kai Huang


> On Sept. 23, 2016, 4:53 p.m., Joshua Cohen wrote:
> > src/main/python/apache/aurora/executor/aurora_executor.py, line 81
> > <https://reviews.apache.org/r/51876/diff/4/?file=1508698#file1508698line81>
> >
> > I don't think we should expose this simply for the sake of testing, 
> > besides breaking abstractions, it's also brittle. A bug could set this 
> > value to `True` even if the behavior we want to test is broken.
> > 
> > We should be able to test the health check logic directly based on the 
> > behavior of the executor. I.e. assert that the `TASK_RUNNING` update is not 
> > sent right away if one of the status providers is a health checker.

Actually I've tried this approach in the tests. Checking the presence of a 
health checker seems to break the test possibly because it duplicates the 
health checker preparation in launchTask function. The problem can be solved by 
placing it before we launch a task(in this way, checking the presence of health 
checker does not interferes with the status update sending). I'm kind of 
curious why the test is so time sensitive.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review150200
-------


On Sept. 23, 2016, 12:57 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 23, 2016, 12:57 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 52094: Add min_consecutive_health_checks in HealthCheckConfig

2016-09-22 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52094/#review150134
---




docs/reference/configuration.md (line 368)
<https://reviews.apache.org/r/52094/#comment218001>

In addition to documentation, we also need to notify the user:
1. For task with health check, they can leave out watch_secs and relies on 
initial_interval_secs.
2. For task without health check, they still have to provide watch_secs to 
cover the warming up period.

I'm thinking that whether we should incorporate this information as a 
client warning message to prevent misuse of the configuration?

I'll wait until Maxim to weigh in. But thanks for your feedback, Dmitriy!


- Kai Huang


On Sept. 23, 2016, 4:34 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52094/
> ---
> 
> (Updated Sept. 23, 2016, 4:34 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1224
> https://issues.apache.org/jira/browse/AURORA-1224
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add min_consecutive_health_checks to HealthCheckConfig.
> 
> [Summary]
> HealthCheckConfig should accept a new configuration value that will tell how 
> many positive consecutive health checks an instance requires to move from 
> STARTING to RUNNING.
> 
> [Background]
> This review depends on the executor change(AURORA-1225). Please see 
> https://reviews.apache.org/r/51876/ for more details and background.
> 
> [Change List]
> 1. Add a configuration value "min_consecutive_health_checks"(default=1) to 
> HealthCheckConfig struct.
> 2. Modify the default value of watch_secs to be 0.
> 3. Add a client-side constraint: 
> initial_interval_secs >= min_consecutive_health_checks * interval_secs
> 4. Update the unit tests for health check config in client/config.py, skip 
> unit tests related to watch_secs.
> 
> 
> Diffs
> -
> 
>   docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 
>   src/main/python/apache/aurora/client/config.py 
> 0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 
>   src/main/python/apache/aurora/config/schema/base.py 
> 845163043b0b7b2f9e7aca14677ca9f094658551 
>   src/test/python/apache/aurora/client/test_config.py 
> 5cf68a5145ddf9478baa30453c0bcb73136fa7eb 
> 
> Diff: https://reviews.apache.org/r/52094/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/client::
> 
> ./pants test.pytest src/test/python/apache/aurora/config::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 52094: Add min_consecutive_health_checks in HealthCheckConfig

2016-09-22 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52094/
---

(Updated Sept. 23, 2016, 4:34 a.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Add documentation for min_consecutive_health_checks, update the documentaion of 
watch_secs and initial_interval_secs.


Bugs: AURORA-1224
https://issues.apache.org/jira/browse/AURORA-1224


Repository: aurora


Description
---

Add min_consecutive_health_checks to HealthCheckConfig.

[Summary]
HealthCheckConfig should accept a new configuration value that will tell how 
many positive consecutive health checks an instance requires to move from 
STARTING to RUNNING.

[Background]
This review depends on the executor change(AURORA-1225). Please see 
https://reviews.apache.org/r/51876/ for more details and background.

[Change List]
1. Add a configuration value "min_consecutive_health_checks"(default=1) to 
HealthCheckConfig struct.
2. Modify the default value of watch_secs to be 0.
3. Add a client-side constraint: 
initial_interval_secs >= min_consecutive_health_checks * interval_secs
4. Update the unit tests for health check config in client/config.py, skip unit 
tests related to watch_secs.


Diffs (updated)
-

  docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 
  src/main/python/apache/aurora/client/config.py 
0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 
  src/main/python/apache/aurora/config/schema/base.py 
845163043b0b7b2f9e7aca14677ca9f094658551 
  src/test/python/apache/aurora/client/test_config.py 
5cf68a5145ddf9478baa30453c0bcb73136fa7eb 

Diff: https://reviews.apache.org/r/52094/diff/


Testing
---

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/client::

./pants test.pytest src/test/python/apache/aurora/config::


Thanks,

Kai Huang



Re: Review Request 52094: Add min_consecutive_health_checks in HealthCheckConfig

2016-09-22 Thread Kai Huang


> On Sept. 23, 2016, 1:16 a.m., Dmitriy Shirchenko wrote:
> > src/main/python/apache/aurora/client/config.py, line 113
> > <https://reviews.apache.org/r/52094/diff/1/?file=1505992#file1505992line113>
> >
> > why was this configuration option deleted? it is still referenced: 
> > https://github.com/apache/aurora/blob/master/docs/reference/configuration.md#healthcheckconfig-objects
> > 
> > if you are introducing a new option it should be documented.

This configuration option is not deleted. We just won't validate the constraint 
between watch_secs, max_consecutive_failures here.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52094/#review150118
-------


On Sept. 20, 2016, 7:43 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52094/
> ---
> 
> (Updated Sept. 20, 2016, 7:43 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1224
> https://issues.apache.org/jira/browse/AURORA-1224
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add min_consecutive_health_checks to HealthCheckConfig.
> 
> [Summary]
> HealthCheckConfig should accept a new configuration value that will tell how 
> many positive consecutive health checks an instance requires to move from 
> STARTING to RUNNING.
> 
> [Background]
> This review depends on the executor change(AURORA-1225). Please see 
> https://reviews.apache.org/r/51876/ for more details and background.
> 
> [Change List]
> 1. Add a configuration value "min_consecutive_health_checks"(default=1) to 
> HealthCheckConfig struct.
> 2. Modify the default value of watch_secs to be 0.
> 3. Add a client-side constraint: 
> initial_interval_secs >= min_consecutive_health_checks * interval_secs
> 4. Update the unit tests for health check config in client/config.py, skip 
> unit tests related to watch_secs.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/config.py 
> 0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 
>   src/main/python/apache/aurora/config/schema/base.py 
> 845163043b0b7b2f9e7aca14677ca9f094658551 
>   src/test/python/apache/aurora/client/test_config.py 
> 5cf68a5145ddf9478baa30453c0bcb73136fa7eb 
> 
> Diff: https://reviews.apache.org/r/52094/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/client::
> 
> ./pants test.pytest src/test/python/apache/aurora/config::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



  1   2   >