-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67627/#review204916
-----------------------------------------------------------


Ship it!




Master (4719fa7) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing
"@ReviewBot retry"

- Aurora ReviewBot


On June 18, 2018, 8:57 a.m., Stephan Erb wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67627/
> -----------------------------------------------------------
> 
> (Updated June 18, 2018, 8:57 a.m.)
> 
> 
> Review request for Aurora, Renan DelValle, Reza Motamedi, and
> Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add observer command line option `--disable_task_resource_collection` to
> disable the collection of CPU, memory, and disk metrics for observed tasks.
> This is useful in setups where metrics cannot be gathered reliably (e.g. when
> using PID namespaces) or when it is expensive due to hundreds of active tasks
> per host.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md edc081f502370190597ad028f3275cdfd572f5ca
>   docs/reference/observer-configuration.md c791b3480e5bf35e6eb0fbea908ff3242eab315d
>   src/main/python/apache/aurora/config/BUILD 12e7fe973f456d0847ce63d3b293131a7f4c3bdd
>   src/main/python/apache/aurora/tools/thermos_observer.py fd9465d2e2b3135f3fdf8230777117adaa89337c
>   src/main/python/apache/thermos/monitoring/resource.py 72ed4e5a82dfd8a09e0a8262f6da4992ac98542a
>   src/main/python/apache/thermos/observer/task_observer.py 94cd6c541bb7f8a4c153cc51caa63d2c08888a49
>   src/test/python/apache/thermos/monitoring/test_resource.py 44450647a180f86903ebd37f2a9f4327496597e9
> 
> 
> Diff: https://reviews.apache.org/r/67627/diff/1/
> 
> 
> Testing
> -------
> 
> We are running our Mesos agents with PID namespaces enabled (i.e.
> `--isolation='namespaces/ipc,namespaces/pid,...'`). Sometimes the hosts are
> also tightly packed with many small tasks (e.g. `~130` active tasks and
> `~1000` finished tasks). Even with very relaxed scrape settings of
> `--task_process_collection_interval_secs=3000` and
> `--task_disk_collection_interval_secs=3000` it can take between `150ms-2500ms`
> to render the observer landing page `/main`. This patch reduces this to about
> `100ms-150ms`. There is no immediate downside as metrics reporting is broken
> anyway due to the PID namespacing.
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>
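
For readers skimming the thread, the idea behind `--disable_task_resource_collection`
can be sketched roughly as below. This is only an illustration, not the code from
the patch: `ResourceSample`, `SamplingMonitor`, `NullMonitor`, `build_parser`, and
`make_monitor` are hypothetical names, and `argparse` stands in for whatever
option handling the observer actually uses. The assumption is that the flag
swaps an expensive per-task sampler for one that returns empty readings.

```python
import argparse
from collections import namedtuple

# Hypothetical per-task reading; not the type used in the Aurora code base.
ResourceSample = namedtuple(
    "ResourceSample", ["cpu", "ram_bytes", "disk_bytes", "num_procs"])


class SamplingMonitor(object):
    """Stand-in for a monitor that inspects processes and the sandbox (expensive)."""

    def sample(self):
        # Placeholder values; a real monitor would walk the process tree
        # and measure disk usage here.
        return ResourceSample(0.25, 64 * 1024 * 1024, 10 * 1024 * 1024, 3)


class NullMonitor(object):
    """Monitor that does no work and always reports an empty reading."""

    def sample(self):
        return ResourceSample(0.0, 0, 0, 0)


def build_parser():
    # argparse is an assumption made to keep the sketch self-contained.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--disable_task_resource_collection",
        action="store_true",
        default=False,
        help="Skip CPU, memory, and disk collection for observed tasks.")
    return parser


def make_monitor(disable_collection):
    """Pick the cheap monitor when collection is disabled."""
    return NullMonitor() if disable_collection else SamplingMonitor()
```

With the flag set, `make_monitor(True).sample()` returns an all-zero reading
without touching any processes, which is the kind of saving the Testing section
describes for tightly packed hosts.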
