[
https://issues.apache.org/jira/browse/MESOS-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964570#comment-13964570
]
Dominic Hamon commented on MESOS-1036:
--------------------------------------
Statistical bits added in
https://reviews.apache.org/r/20047
https://reviews.apache.org/r/20015
https://reviews.apache.org/r/20018
> Implement a library for exposing statistical metrics.
> -----------------------------------------------------
>
> Key: MESOS-1036
> URL: https://issues.apache.org/jira/browse/MESOS-1036
> Project: Mesos
> Issue Type: Improvement
> Components: statistics
> Reporter: Benjamin Mahler
> Assignee: Dominic Hamon
>
> At the current time, reporting of statistical metrics is dedicated to
> specific endpoints for each component, primarily the following two:
> {noformat}
> /master/stats.json
> /slave/stats.json
> {noformat}
> Additional endpoints have not been added (for example, containerization
> statistics, allocator statistics, libprocess statistics) due to the inherent
> difficulty involved: one must either expose this data up to these higher
> level endpoints, or add a new endpoint for exposing the component specific
> statistics.
> This is why the {{Statistics}} class in libprocess was created; however, it is
> not being used for any statistical reporting at the current time.
> [~benjaminhindman] and I had white-boarded the kinds of abstractions we
> wanted to build to make statistical reporting trivial from anywhere in the
> code:
> Create the notion of a {{Statistic}} or {{Metric}} object that can be
> directly manipulated to store statistics, for example:
> {code}
> // In the Registrar initialization:
> Metric storage_latency = statistics.create("registrar", "storage_latency");
> // Recording an individual storage latency.
> storage_latency.set(latency);
> {code}
> In addition to this, we wanted the notion of a {{Meter}}, which automatically
> exposes a metered version of a statistic, for example:
> {code}
> Metric storage_latency = statistics.create("registrar", "storage_latency");
> // Adds "storage_latency_average" which computes average over the window.
> statistics.meter(storage_latency, Average());
> // Adds a "storage_latency_p99", percentile is a non-trivial implementation.
> statistics.meter(storage_latency, Percentile(99));
> // Adds a "storage_latency_maximum"
> statistics.meter(storage_latency, Maximum());
> {code}
> Of course, I'm not advocating a particular API in the above examples; I'm
> just laying out the types of things we wanted to see available.
> As we add these types of abstractions, we will want to avoid storing large
> time series data in memory as is currently done in {{Statistics}}. There are
> a number of things to consider with respect to the windowing technique, but I
> think the notion of a window should transition from "amount of history to be
> kept" to "a statistical rolling window". For example, when computing an
> average, you would most likely want a rolling 1 minute average, as opposed to
> the average for a 2 week window.
> Efficiency of this library will be important to avoid high RSS overhead.
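
The "statistical rolling window" idea in the description above can be illustrated with a minimal sketch. This is not the API from the linked reviews and not the existing libprocess {{Statistics}} class: the class name RollingMetric, its methods, and the nearest-rank percentile are all assumptions made for illustration. It assumes samples arrive in time order and returns NaN when the window is empty.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <deque>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical sketch: none of these names come from Mesos or libprocess.
// Keeps only the samples that fall inside a rolling time window.
class RollingMetric {
public:
  using Clock = std::chrono::steady_clock;

  explicit RollingMetric(std::chrono::seconds window) : window_(window) {}

  // Records a sample, then drops samples older than the window.
  // Assumes calls are made with non-decreasing timestamps.
  void set(double value, Clock::time_point now = Clock::now()) {
    samples_.emplace_back(now, value);
    expire(now);
  }

  // Mean of the samples in the window, or NaN if there are none.
  double average(Clock::time_point now = Clock::now()) {
    expire(now);
    if (samples_.empty()) return nan();
    double sum = 0.0;
    for (const auto& sample : samples_) sum += sample.second;
    return sum / static_cast<double>(samples_.size());
  }

  // Largest sample in the window, or NaN if there are none.
  double maximum(Clock::time_point now = Clock::now()) {
    expire(now);
    if (samples_.empty()) return nan();
    double best = samples_.front().second;
    for (const auto& sample : samples_) best = std::max(best, sample.second);
    return best;
  }

  // Nearest-rank percentile for p in (0, 100], or NaN if the window is empty.
  // Sorting a copy is simple but O(n log n) per call.
  double percentile(double p, Clock::time_point now = Clock::now()) {
    assert(p > 0.0 && p <= 100.0);
    expire(now);
    if (samples_.empty()) return nan();
    std::vector<double> values;
    values.reserve(samples_.size());
    for (const auto& sample : samples_) values.push_back(sample.second);
    std::sort(values.begin(), values.end());
    std::size_t rank =
        static_cast<std::size_t>(std::ceil(p / 100.0 * values.size()));
    rank = std::min(std::max<std::size_t>(rank, 1), values.size());
    return values[rank - 1];
  }

  std::size_t size() const { return samples_.size(); }

private:
  static double nan() { return std::numeric_limits<double>::quiet_NaN(); }

  // Removes samples whose age strictly exceeds the window.
  void expire(Clock::time_point now) {
    while (!samples_.empty() && now - samples_.front().first > window_) {
      samples_.pop_front();
    }
  }

  std::chrono::seconds window_;
  std::deque<std::pair<Clock::time_point, double>> samples_;
};
```

With a 60 second window, a sample recorded 90 seconds ago no longer contributes to average(), maximum(), or percentile(). Memory here is bounded by the number of samples per window rather than by total history, but every sample in the window is still stored; a version that cares about RSS would probably keep per-bucket aggregates instead.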