-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60376/
-----------------------------------------------------------

Review request for Aurora, David McLaughlin, Joshua Cohen, and Santhosh Kumar 
Shanmugham.


Repository: aurora


Description
-------

# Observer task page to load consumption info from history

Resource consumptions of Thermos Processes are periodically calculated by 
TaskResourceMonitor threads (one thread per Thermos task). This information is 
used to display a (semi) fresh state of the tasks running on a host in the 
Observer host page, aka landing page. An aggregate history of the consumptions 
is kept at the task level, although TaskResourceMonitor needs to first collect 
the resource at the Process level and then aggregate them.

On the other hand, when an Observer _task page_ is visited, the resources 
consumption of Thermos Processes within that task are calculated again and 
displayed without being aggregated. This can become very slow since time to 
complete resource calculation is affected by the load on the host.

By applying this patch we take advantage of the periodic work and fulfill 
information resource requested in Observer task page from already collected 
resource consumptions.


Diffs
-----

  src/main/python/apache/thermos/monitoring/resource.py 
434666696e600a0e6c19edd986c86575539976f2 
  src/test/python/apache/thermos/monitoring/test_resource.py 
d794a998f1d9fc52ba260cd31ac444aee7f8ed28 


Diff: https://reviews.apache.org/r/60376/diff/1/


Testing
-------

I stress tested this patch on a host that had a slow Observer page. 
Interestingly, I did not need to do much to make the Observer slow. There are a 
few points to be made clear first.
- We at Twitter limit the resources allocated to the Observer using `systemd`. 
The observer is allowed to use only 20% of a CPU core. The attached screen 
shots are from such a setup.
- Having assigned 20% of a cpu core to Observer, starting only 8 `task`s, each 
with 3 `process`es is enough to make the Observer slow; 11secs to load `task 
page`.


File Attachments
----------------

without the patch -- Screen Shot 2017-06-22 at 1.11.12 PM.png
  
https://reviews.apache.org/media/uploaded/files/2017/06/22/03968028-a2f5-4a99-ba57-b7a41c471436__without_the_patch_--_Screen_Shot_2017-06-22_at_1.11.12_PM.png
with the patch -- Screen Shot 2017-06-22 at 1.07.41 PM.png
  
https://reviews.apache.org/media/uploaded/files/2017/06/22/5962c018-27d3-4463-a277-f6ad48b7f2d7__with_the_patch_--_Screen_Shot_2017-06-22_at_1.07.41_PM.png


Thanks,

Reza Motamedi

Reply via email to