> On July 18, 2017, 12:46 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/thermos/monitoring/resource.py
> > Line 54 (original), 53-60 (patched)
> > <https://reviews.apache.org/r/60376/diff/3/?file=1766459#file1766459line54>
> >
> >     Can you add some docstrings to these classes to explain what they will 
> > contain? Particularly the difference between the `AggregateResourceResult` 
> > and `(Proc|Full)ResourceResult`.

Done.


> On July 18, 2017, 12:46 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/thermos/monitoring/resource.py
> > Line 61 (original), 67 (patched)
> > <https://reviews.apache.org/r/60376/diff/3/?file=1766459#file1766459line68>
> >
> >     Update docstring.

Done.


> On July 18, 2017, 12:46 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/thermos/monitoring/resource.py
> > Line 68 (original), 74 (patched)
> > <https://reviews.apache.org/r/60376/diff/3/?file=1766459#file1766459line75>
> >
> >     Update docstring.

Done.


> On July 18, 2017, 12:46 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/thermos/monitoring/resource.py
> > Lines 88 (patched)
> > <https://reviews.apache.org/r/60376/diff/3/?file=1766459#file1766459line89>
> >
> >     snake-case the field names.

Done.


> On July 18, 2017, 12:46 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/thermos/monitoring/resource.py
> > Lines 175-179 (patched)
> > <https://reviews.apache.org/r/60376/diff/3/?file=1766459#file1766459line181>
> >
> >     Can we use `sum` and `zip` to sum the tuples column-wise?

Not sure if that makes it easier to read. I moved to something that matches 
what this part of the calculation looked like before:

```
    aggregated_procs = sum(
        map(attrgetter('num_procs'), full_resources.proc_usage.values()))
    aggregated_sample = sum(
        map(attrgetter('process_sample'), full_resources.proc_usage.values()),
        ProcessSample.empty())
```
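
For comparison, here is a minimal sketch of both options: the `sum`/`zip` column-wise variant suggested above and the `sum` with a start value variant. `Sample` and `ProcResult` are hypothetical stand-ins with made-up fields, not the classes in the patch. The sketch also assumes that the sample type supports `+` field by field, which is what lets `sum` fold the samples together.

```python
from collections import namedtuple
from operator import attrgetter


class Sample(namedtuple('Sample', 'rate user system')):
    """Hypothetical stand-in for ProcessSample: a tuple of numeric counters."""

    @classmethod
    def empty(cls):
        return cls(0, 0, 0)

    def __add__(self, other):
        # Field-by-field addition, so that sum() can fold samples together.
        return Sample(*(a + b for a, b in zip(self, other)))


# Hypothetical stand-in for the per-process result.
ProcResult = namedtuple('ProcResult', 'process_sample num_procs')

proc_usage = {
    'proc_a': ProcResult(Sample(1, 2, 3), 2),
    'proc_b': ProcResult(Sample(4, 5, 6), 1),
}

# Variant 1: sum() with a start value, relying on Sample.__add__.
by_add = sum(map(attrgetter('process_sample'), proc_usage.values()),
             Sample.empty())

# Variant 2: zip(*...) transposes the tuples, then sum() adds each column.
by_zip = Sample(*map(sum, zip(*(r.process_sample for r in proc_usage.values()))))

# The process count is a plain integer sum in either variant.
num_procs = sum(map(attrgetter('num_procs'), proc_usage.values()))
```

Both variants give the same totals here. One difference is that the `zip` variant needs a separate guard when there are no processes, because transposing an empty collection yields no columns, while the start value covers that case in the first variant.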


> On July 18, 2017, 12:46 a.m., Santhosh Kumar Shanmugham wrote:
> > src/test/python/apache/thermos/monitoring/test_resource.py
> > Lines 104-105 (patched)
> > <https://reviews.apache.org/r/60376/diff/3/?file=1766461#file1766461line106>
> >
> >     Feed some random values.

Done.


> On July 18, 2017, 12:46 a.m., Santhosh Kumar Shanmugham wrote:
> > src/test/python/apache/thermos/monitoring/test_resource.py
> > Lines 113 (patched)
> > <https://reviews.apache.org/r/60376/diff/3/?file=1766461#file1766461line115>
> >
> >     Assert disk_usage.

Done.


- Reza


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60376/#review180775
-----------------------------------------------------------


On July 18, 2017, 6:26 a.m., Reza Motamedi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60376/
> -----------------------------------------------------------
> 
> (Updated July 18, 2017, 6:26 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, Jordan Ly, and 
> Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> # Observer task page to load consumption info from history
> 
> Resource consumption of Thermos Processes is periodically calculated by 
> TaskResourceMonitor threads (one thread per Thermos task). This information 
> is used to display a (semi) fresh state of the tasks running on a host in the 
> Observer host page, aka the landing page. An aggregate history of the 
> consumption is kept at the task level, although TaskResourceMonitor needs to 
> first collect the resources at the Process level and then aggregate them.
> 
> On the other hand, when an Observer _task page_ is visited, the resource 
> consumption of the Thermos Processes within that task is calculated again and 
> displayed without being aggregated. This can become very slow, since the time 
> to complete the resource calculation is affected by the load on the host.
> 
> By applying this patch we take advantage of the periodic work and serve the 
> resource information requested by the Observer task page from the already 
> collected resource consumption data.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/thermos/monitoring/resource.py 
> 434666696e600a0e6c19edd986c86575539976f2 
>   src/test/python/apache/aurora/executor/common/test_resource_manager.py 
> a898e4d81d34d1e30e39db1be1a66bc9e0ab1a35 
>   src/test/python/apache/thermos/monitoring/test_resource.py 
> d794a998f1d9fc52ba260cd31ac444aee7f8ed28 
> 
> 
> Diff: https://reviews.apache.org/r/60376/diff/4/
> 
> 
> Testing
> -------
> 
> I stress tested this patch on a host that had a slow Observer page. 
> Interestingly, I did not need to do much to make the Observer slow. A few 
> points need to be made clear first.
> - We at Twitter limit the resources allocated to the Observer using 
> `systemd`. The Observer is allowed to use only 20% of a CPU core. The 
> attached screenshots are from such a setup.
> - With 20% of a CPU core assigned to the Observer, starting only 8 `task`s, 
> each with 3 `process`es, is enough to make the Observer slow: the `task page` 
> takes 11 seconds to load.
> 
> 
> File Attachments
> ----------------
> 
> without the patch -- Screen Shot 2017-06-22 at 1.11.12 PM.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/06/22/03968028-a2f5-4a99-ba57-b7a41c471436__without_the_patch_--_Screen_Shot_2017-06-22_at_1.11.12_PM.png
> with the patch -- Screen Shot 2017-06-22 at 1.07.41 PM.png
>   
> https://reviews.apache.org/media/uploaded/files/2017/06/22/5962c018-27d3-4463-a277-f6ad48b7f2d7__with_the_patch_--_Screen_Shot_2017-06-22_at_1.07.41_PM.png
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>
