----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60376/#review179043 -----------------------------------------------------------
Let's see how this works out in practice! I am really looking forward to snappy thermos pages again. src/main/python/apache/thermos/monitoring/resource.py Line 54 (original), 53 (patched) <https://reviews.apache.org/r/60376/#comment253446> It might be worth to rename this to something like 'AggregateResourceResult' or 'TaskResourceResult'. src/main/python/apache/thermos/monitoring/resource.py Line 160 (original), 166 (patched) <https://reviews.apache.org/r/60376/#comment253448> The usages of `ts_value_pair` has made it difficult to understand what is going on. I believe it becomes a little bit clearer to the casual reader if you do something like this: timestamp, resources = self._history.get(timestamp) This will replace usages of `ts_value_pair[1].proc_usage.values()` with `resources.proc_usage.values()` which makes it slightly clearer. Calling this `full_resources` instead of `resources` should work as well. src/main/python/apache/thermos/monitoring/resource.py Lines 190 (patched) <https://reviews.apache.org/r/60376/#comment253449> in this case you coud do _, resources = self._history.get(time.time()) as you don't care about the timestamp. src/main/python/apache/thermos/monitoring/resource.py Line 222 (original), 243-244 (patched) <https://reviews.apache.org/r/60376/#comment253450> We normally don't do hanging indentation https://github.com/twitter/commons/blob/master/src/python/twitter/common/styleguide.md#best-practices src/test/python/apache/thermos/monitoring/test_resource.py Lines 84-103 (patched) <https://reviews.apache.org/r/60376/#comment253451> I am not a big fan of the mocking here, even though the sibling testcases are not much better :). Mocking results in high coupling, making the implementation harder to change in the future. Please give it a try if you can manage to inject a normal, pre-populated `ResourceHistory` into the constructor of the `TaskResourceMonitor` instead. (If you don't have any conclusive results after 10-15 minutes feel free to give up. There might be some further interdependency that I am not spotting right now) - Stephan Erb On June 22, 2017, 10:18 p.m., Reza Motamedi wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60376/ > ----------------------------------------------------------- > > (Updated June 22, 2017, 10:18 p.m.) > > > Review request for Aurora, David McLaughlin, Joshua Cohen, and Santhosh Kumar > Shanmugham. > > > Repository: aurora > > > Description > ------- > > # Observer task page to load consumption info from history > > Resource consumptions of Thermos Processes are periodically calculated by > TaskResourceMonitor threads (one thread per Thermos task). This information > is used to display a (semi) fresh state of the tasks running on a host in the > Observer host page, aka landing page. An aggregate history of the > consumptions is kept at the task level, although TaskResourceMonitor needs to > first collect the resource at the Process level and then aggregate them. > > On the other hand, when an Observer _task page_ is visited, the resources > consumption of Thermos Processes within that task are calculated again and > displayed without being aggregated. This can become very slow since time to > complete resource calculation is affected by the load on the host. > > By applying this patch we take advantage of the periodic work and fulfill > information resource requested in Observer task page from already collected > resource consumptions. > > > Diffs > ----- > > src/main/python/apache/thermos/monitoring/resource.py > 434666696e600a0e6c19edd986c86575539976f2 > src/test/python/apache/thermos/monitoring/test_resource.py > d794a998f1d9fc52ba260cd31ac444aee7f8ed28 > > > Diff: https://reviews.apache.org/r/60376/diff/1/ > > > Testing > ------- > > I stress tested this patch on a host that had a slow Observer page. > Interestingly, I did not need to do much to make the Observer slow. There are > a few points to be made clear first. > - We at Twitter limit the resources allocated to the Observer using > `systemd`. The observer is allowed to use only 20% of a CPU core. The > attached screen shots are from such a setup. > - Having assigned 20% of a cpu core to Observer, starting only 8 `task`s, > each with 3 `process`es is enough to make the Observer slow; 11secs to load > `task page`. > > > File Attachments > ---------------- > > without the patch -- Screen Shot 2017-06-22 at 1.11.12 PM.png > > https://reviews.apache.org/media/uploaded/files/2017/06/22/03968028-a2f5-4a99-ba57-b7a41c471436__without_the_patch_--_Screen_Shot_2017-06-22_at_1.11.12_PM.png > with the patch -- Screen Shot 2017-06-22 at 1.07.41 PM.png > > https://reviews.apache.org/media/uploaded/files/2017/06/22/5962c018-27d3-4463-a277-f6ad48b7f2d7__with_the_patch_--_Screen_Shot_2017-06-22_at_1.07.41_PM.png > > > Thanks, > > Reza Motamedi > >