Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-06-14 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review204770
---



As you have agreed with this patch in general, I will merge it now and follow 
up with an additional performance patch set in the future (hopefully in a week 
or two).

- Stephan Erb


On March 20, 2018, 11:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 11:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh

2018-06-14 Thread Stephan Erb


> On March 21, 2018, 6:47 p.m., Santhosh Kumar Shanmugham wrote:
> > Ship It!
> 
> Santhosh Kumar Shanmugham wrote:
> I am a little confused about the performance testing results that are 
> posted here, since the pervious results indicated gains from 2secs to 
> 0.2secs, while the current one is much lesser. Can you add a little bit of 
> context regarding the results? Thanks.

Yes definitely. I somehow missed that this one here was still open.

* Code on master, give list of filesystem paths: Parse each path into its 
components, do some basic validation, join components back to a path string. 
Afterwards filter all paths that point to the same sandbox. What is expensive 
here is the parsing/splitting and the realpath computation needed for the 
duplication check.

* First version of the patch: Eliminates both sources of a slowdown and removes 
some code that was now used in the Aurora repository. 

* Second version of the patch: Just removes the slowdown of realpath by 
switching to a simpler symlink check. The parsing/splitting is still there. I 
decided to keep it in there as Reza indicated that you are still using this.

What came on top was that my bechmarking environment was not stable so the 
overall time varied across both patches.


- Stephan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199681
---


On March 20, 2018, 11:41 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> ---
> 
> (Updated March 20, 2018, 11:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Profiling indicates that a significant part of the refresh time os spend in 
> `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
> the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on 
> `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than 
> an entire path.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py 
> ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 
> 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> ---
> 
> I have tested this build on a node with 55 running tasks and 2004 finished 
> ones.
> 
> Before this patch:
> 
> D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.92s
> D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.93s
> D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.89s
>  
> With this patch:
> 
> D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.49s
> D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
> checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>