Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-06-14 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review204770
---



As you have agreed with this patch in general, I will merge it now and follow 
up with an additional performance patch set in the future (hopefully in a week 
or two).

- Stephan Erb


On March 20, 2018, 11:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 11:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-06-14 Thread Stephan Erb


> On March 21, 2018, 6:47 p.m., Santhosh Kumar Shanmugham wrote:
> > Ship It!
> 
> Santhosh Kumar Shanmugham wrote:
> I am a little confused about the performance testing results that are 
> posted here, since the pervious results indicated gains from 2secs to 
> 0.2secs, while the current one is much lesser. Can you add a little bit of 
> context regarding the results? Thanks.

Yes definitely. I somehow missed that this one here was still open.

* Code on master, give list of filesystem paths: Parse each path into its 
components, do some basic validation, join components back to a path string. 
Afterwards filter all paths that point to the same sandbox. What is expensive 
here is the parsing/splitting and the realpath computation needed for the 
duplication check.

* First version of the patch: Eliminates both sources of a slowdown and removes 
some code that was now used in the Aurora repository. 

* Second version of the patch: Just removes the slowdown of realpath by 
switching to a simpler symlink check. The parsing/splitting is still there. I 
decided to keep it in there as Reza indicated that you are still using this.

What came on top was that my bechmarking environment was not stable so the 
overall time varied across both patches.


- Stephan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199681
---


On March 20, 2018, 11:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 11:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-21 Thread Santhosh Kumar Shanmugham


> On March 21, 2018, 10:47 a.m., Santhosh Kumar Shanmugham wrote:
> > Ship It!

I am a little confused about the performance testing results that are posted 
here, since the pervious results indicated gains from 2secs to 0.2secs, while 
the current one is much lesser. Can you add a little bit of context regarding 
the results? Thanks.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199681
---


On March 20, 2018, 3:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 3:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-21 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199681
---


Ship it!




Ship It!

- Santhosh Kumar Shanmugham


On March 20, 2018, 3:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 3:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-21 Thread Reza Motamedi


> On March 21, 2018, 5:36 p.m., Reza Motamedi wrote:
> > Ship It!

Can you explain a bit more about the exeperiement settings?


- Reza


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199678
---


On March 20, 2018, 10:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 10:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-21 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199678
---


Ship it!




Ship It!

- Reza Motamedi


On March 20, 2018, 10:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 10:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199612
---


Ship it!




Master (b3fa9fe) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 20, 2018, 10:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 10:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/
---

(Updated March 20, 2018, 11:41 p.m.)


Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.


Changes
---

Optimize path detection without changing or removing existing code of 
`ExecutorDetector`.


Repository: aurora


Description (updated)
---

Profiling indicates that a significant part of the refresh time os spend in 
`os.path.realpath`. 
This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
the `latest`
symlink in the Mesos folder layout. 

This patch takes a slightly different approach to solve this problem based on 
`os.path.islink`.
The latter is faster as it just needs to look at a single folder rather than an 
entire path.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/common/path_detector.py 
ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
  src/test/python/apache/aurora/executor/common/test_path_detector.py 
7b5ef0cf552d22d4cfbf3357071de036551026dc 


Diff: https://reviews.apache.org/r/66139/diff/2/

Changes: https://reviews.apache.org/r/66139/diff/1-2/


Testing (updated)
---

I have tested this build on a node with 55 running tasks and 2004 finished ones.

Before this patch:

D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.92s
D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.93s
D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.89s
 
With this patch:

D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.48s
D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.49s
D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.48s


Thanks,

Stephan Erb



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Stephan Erb


> On March 20, 2018, 9:20 p.m., Reza Motamedi wrote:
> > src/main/python/apache/aurora/executor/common/executor_detector.py
> > Lines 28-85 (original), 26-38 (patched)
> > 
> >
> > We are using the functionalities here. Is the performance tunning 
> > related to this code delete?

Oh interesting. Do you use all removed functionality? 

The performance improvement is only partially related. There are cycles wasted 
in the translation from `path` to the parsing result of `ScanfParser` and back 
again, but the dominant factor is the removed `realpath` in `path_detector.py`.

I should be able to rewrite the patch so that `executor_detector.py` remains 
unchanged.


> On March 20, 2018, 9:20 p.m., Reza Motamedi wrote:
> > src/main/python/apache/aurora/executor/common/path_detector.py
> > Lines 31-33 (original), 31-33 (patched)
> > 
> >
> > Curious if this generator pattern is really useful. Don't we just 
> > consume everything it generates?

We need the result of `os.path.join(path, self._sandbox_path)` as a variable, 
in order to both return it and check via `os.path.exists(path)`. An alternative 
would be to use an explicit for loop here.


- Stephan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199564
---


On March 20, 2018, 12:22 a.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 12:22 a.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicated that roughly 70% of the refresh time was spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> To facilitate this optimization the patch removes unused functionality from 
> ExecutorDetector.
> Most probably this  was used by former GC executor that got removed a few 
> years ago.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/BUILD 
> 486230db34a22ea5dd0f68da911c0afb1afbcac0 
>   src/main/python/apache/aurora/executor/common/executor_detector.py 
> a07bfc34caa5f86153ace8184b061e253c39e92e 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/main/python/apache/thermos/monitoring/detector.py 
> 6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
>   src/test/python/apache/aurora/executor/common/test_executor_detector.py 
> d2a948f2e95cbc1723b63462787c4256cc56731d 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/1/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 75 running tasks and 4500 finished 
> ones.
> 
> Before this patch:
> 
> D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.26s
> D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.39s
> D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.29s
>  
> With this patch:
> 
> D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.78s
> D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> 
> The regular 2-3 second freezes when navigating the Thermos UI are now almost 
> gone for me.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-20 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199564
---




src/main/python/apache/aurora/executor/common/executor_detector.py
Lines 28-85 (original), 26-38 (patched)


We are using the functionalities here. Is the performance tunning related 
to this code delete?



src/main/python/apache/aurora/executor/common/path_detector.py
Lines 31-33 (original), 31-33 (patched)


Curious if this generator pattern is really useful. Don't we just consume 
everything it generates?


- Reza Motamedi


On March 19, 2018, 11:22 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 19, 2018, 11:22 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicated that roughly 70% of the refresh time was spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> To facilitate this optimization the patch removes unused functionality from 
> ExecutorDetector.
> Most probably this  was used by former GC executor that got removed a few 
> years ago.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/BUILD 
> 486230db34a22ea5dd0f68da911c0afb1afbcac0 
>   src/main/python/apache/aurora/executor/common/executor_detector.py 
> a07bfc34caa5f86153ace8184b061e253c39e92e 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/main/python/apache/thermos/monitoring/detector.py 
> 6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
>   src/test/python/apache/aurora/executor/common/test_executor_detector.py 
> d2a948f2e95cbc1723b63462787c4256cc56731d 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/1/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 75 running tasks and 4500 finished 
> ones.
> 
> Before this patch:
> 
> D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.26s
> D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.39s
> D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.29s
>  
> With this patch:
> 
> D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.78s
> D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> 
> The regular 2-3 second freezes when navigating the Thermos UI are now almost 
> gone for me.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-19 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199501
---


Ship it!




Master (b3fa9fe) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 19, 2018, 11:22 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 19, 2018, 11:22 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicated that roughly 70% of the refresh time was spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> To facilitate this optimization the patch removes unused functionality from 
> ExecutorDetector.
> Most probably this  was used by former GC executor that got removed a few 
> years ago.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/BUILD 
> 486230db34a22ea5dd0f68da911c0afb1afbcac0 
>   src/main/python/apache/aurora/executor/common/executor_detector.py 
> a07bfc34caa5f86153ace8184b061e253c39e92e 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/main/python/apache/thermos/monitoring/detector.py 
> 6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
>   src/test/python/apache/aurora/executor/common/test_executor_detector.py 
> d2a948f2e95cbc1723b63462787c4256cc56731d 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/1/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 75 running tasks and 4500 finished 
> ones.
> 
> Before this patch:
> 
> D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.26s
> D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.39s
> D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.29s
>  
> With this patch:
> 
> D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.78s
> D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> 
> The regular 2-3 second freezes when navigating the Thermos UI are now almost 
> gone for me.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-19 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199490
---



@ReviewBot retry

- Stephan Erb


On March 19, 2018, 5:24 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 19, 2018, 5:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly and Renan DelValle.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicated that roughly 70% of the refresh time was spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> To facilitate this optimization the patch removes unused functionality from 
> ExecutorDetector.
> Most probably this  was used by former GC executor that got removed a few 
> years ago.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/BUILD 
> 486230db34a22ea5dd0f68da911c0afb1afbcac0 
>   src/main/python/apache/aurora/executor/common/executor_detector.py 
> a07bfc34caa5f86153ace8184b061e253c39e92e 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/main/python/apache/thermos/monitoring/detector.py 
> 6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
>   src/test/python/apache/aurora/executor/common/test_executor_detector.py 
> d2a948f2e95cbc1723b63462787c4256cc56731d 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/1/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 75 running tasks and 4500 finished 
> ones.
> 
> Before this patch:
> 
> D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.26s
> D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.39s
> D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.29s
>  
> With this patch:
> 
> D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.78s
> D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> 
> The regular 2-3 second freezes when navigating the Thermos UI are now almost 
> gone for me.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-19 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199442
---



Master (aaadad7) is red with this patch.
  ./build-support/jenkins/build.sh

Pass 2: Analyzing classes (319 / 332) - 96% complete
Pass 2: Analyzing classes (320 / 332) - 96% complete
Pass 2: Analyzing classes (321 / 332) - 96% complete
Pass 2: Analyzing classes (322 / 332) - 96% complete
Pass 2: Analyzing classes (323 / 332) - 97% complete
Pass 2: Analyzing classes (324 / 332) - 97% complete
Pass 2: Analyzing classes (325 / 332) - 97% complete
Pass 2: Analyzing classes (326 / 332) - 98% complete
Pass 2: Analyzing classes (327 / 332) - 98% complete
Pass 2: Analyzing classes (328 / 332) - 98% complete
Pass 2: Analyzing classes (329 / 332) - 99% complete
Pass 2: Analyzing classes (330 / 332) - 99% complete
Pass 2: Analyzing classes (331 / 332) - 99% complete
Pass 2: Analyzing classes (332 / 332) - 100% complete
Done with analysis
:test

org.apache.aurora.scheduler.events.WebhookTest > 
testTaskChangedWithOldStateError FAILED
java.lang.AssertionError at WebhookTest.java:251
I0319 17:27:43.181 [ShutdownHook, SchedulerMain] Stopping scheduler services. 

1081 tests completed, 1 failed, 1 skipped
:test FAILED
:jacocoTestReport
Coverage report generated: 
file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/jacoco/test/html/index.html
:jacocoTestCoverageVerification

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> There were failing tests. See the report at: 
> file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/tests/test/index.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 6m 54s
45 actionable tasks: 36 executed, 9 up-to-date


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On March 19, 2018, 4:24 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 19, 2018, 4:24 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly and Renan DelValle.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicated that roughly 70% of the refresh time was spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> To facilitate this optimization the patch removes unused functionality from 
> ExecutorDetector.
> Most probably this  was used by former GC executor that got removed a few 
> years ago.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/BUILD 
> 486230db34a22ea5dd0f68da911c0afb1afbcac0 
>   src/main/python/apache/aurora/executor/common/executor_detector.py 
> a07bfc34caa5f86153ace8184b061e253c39e92e 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/main/python/apache/thermos/monitoring/detector.py 
> 6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
>   src/test/python/apache/aurora/executor/common/test_executor_detector.py 
> d2a948f2e95cbc1723b63462787c4256cc56731d 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/1/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 75 running tasks and 4500 finished 
> ones.
> 
> Before this patch:
> 
> D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.26s
> D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.39s
> D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 2.29s
>  
> With this patch:
> 
> D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.78s
> D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.77s
> 
> The regular 2-3 second freezes when navigating the Thermos UI are now almost 
> gone for me.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-03-19 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/
---

Review request for Aurora, Jordan Ly and Renan DelValle.


Repository: aurora


Description
---

Profiling indicated that roughly 70% of the refresh time was spend in 
`os.path.realpath`. 
This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
the `latest`
symlink in the Mesos folder layout. 

This patch takes a slightly different approach to solve this problem based on 
`os.path.islink`.
The latter is faster as it just needs to look at a single folder rather than an 
entire path.

To facilitate this optimization the patch removes unused functionality from 
ExecutorDetector.
Most probably this  was used by former GC executor that got removed a few years 
ago.


Diffs
-

  src/main/python/apache/aurora/executor/BUILD 
486230db34a22ea5dd0f68da911c0afb1afbcac0 
  src/main/python/apache/aurora/executor/common/executor_detector.py 
a07bfc34caa5f86153ace8184b061e253c39e92e 
  src/main/python/apache/aurora/executor/common/path_detector.py 
ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
  src/main/python/apache/thermos/monitoring/detector.py 
6e5a620e9f49e0960a814f4c1f701a21cc7678fd 
  src/test/python/apache/aurora/executor/common/test_executor_detector.py 
d2a948f2e95cbc1723b63462787c4256cc56731d 
  src/test/python/apache/aurora/executor/common/test_path_detector.py 
7b5ef0cf552d22d4cfbf3357071de036551026dc 


Diff: https://reviews.apache.org/r/66139/diff/1/


Testing
---

I have tested this build on a node with 75 running tasks and 4500 finished ones.

Before this patch:

D0319 09:05:07.219057 14514 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 2.26s
D0319 09:05:14.412610 14514 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 2.39s
D0319 09:05:21.600985 14514 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 2.29s
 
With this patch:

D0319 09:00:24.084657 5427 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.77s
D0319 09:00:29.869699 5427 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.78s
D0319 09:00:35.643407 5427 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.77s

The regular 2-3 second freezes when navigating the Thermos UI are now almost 
gone for me.


Thanks,

Stephan Erb