Nico Kruber created FLINK-12983:
-----------------------------------
Summary: Replace descriptive histogram's storage back-end
Key: FLINK-12983
URL: https://issues.apache.org/jira/browse/FLINK-12983
Project: Flink
Issue Type: Sub-task
Components: Runtime / Metrics
Reporter: Nico Kruber
Assignee: Nico Kruber
{{DescriptiveStatistics}} relies on {{ResizableDoubleArray}} to store the
double values of its histograms. However, this class constantly resizes an
internal array and seems to add considerable overhead.
Additionally, we're not using {{SynchronizedDescriptiveStatistics}}, which,
according to its docs, we should. Currently, we seem to be somewhat safe
because {{ResizableDoubleArray}} has some synchronized parts, but these are
scheduled to go away with commons.math version 4.
Internal tests compared the current implementation, one based on a linear
array of twice the histogram size (moving values back to the start once the
window reaches the end), and one using a circular array (wrapping around with
a flexible start position). Using the optimised code from FLINK-10236,
FLINK-12981, and FLINK-12982, they have shown these numbers:
# only adding values to the histogram
{code}
Benchmark                                       Mode  Cnt       Score       Error   Units
HistogramBenchmarks.dropwizardHistogramAdd      thrpt   30   47985.359 ±    25.847  ops/ms
HistogramBenchmarks.descriptiveHistogramAdd     thrpt   30   70158.792 ±   276.858  ops/ms
--- with FLINK-10236, FLINK-12981, and FLINK-12982 ---
HistogramBenchmarks.descriptiveHistogramAdd     thrpt   30   75303.040 ±   475.355  ops/ms
HistogramBenchmarks.histogramCircularArrayAdd   thrpt   30  790123.475 ± 48420.672  ops/ms
HistogramBenchmarks.histogramLinearArrayAdd     thrpt   30  385126.074 ±  3038.773  ops/ms
{code}
# after adding each value, also retrieving a common set of metrics:
{code}
Benchmark                                    Mode  Cnt    Score    Error   Units
HistogramBenchmarks.dropwizardHistogram      thrpt   30  400.274 ±  4.930  ops/ms
HistogramBenchmarks.descriptiveHistogram     thrpt   30  124.533 ±  1.060  ops/ms
--- with FLINK-10236, FLINK-12981, and FLINK-12982 ---
HistogramBenchmarks.descriptiveHistogram     thrpt   30  251.895 ±  1.809  ops/ms
HistogramBenchmarks.histogramCircularArray   thrpt   30  298.881 ± 10.027  ops/ms
HistogramBenchmarks.histogramLinearArray     thrpt   30  234.380 ±  5.014  ops/ms
{code}
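For illustration, the circular-array variant described above could look roughly like the following sketch. The class name {{CircularDoubleArray}}, its method names, and the use of {{synchronized}} are assumptions made for this example; they are not taken from the actual patch or from commons.math.

```java
import java.util.Arrays;

/**
 * Hypothetical fixed-size sliding-window store for histogram values.
 * Illustrative sketch only; not the class used in Flink.
 */
final class CircularDoubleArray {
    private final double[] data;
    private int next = 0;          // slot that the next value is written to
    private boolean full = false;  // true once every slot has been written

    CircularDoubleArray(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.data = new double[windowSize];
    }

    /** Overwrites the oldest value once the window is full; never resizes. */
    synchronized void addValue(double value) {
        data[next] = value;
        next++;
        if (next == data.length) {
            next = 0;      // wrap around to the start of the array
            full = true;
        }
    }

    /** Number of values currently held (at most the window size). */
    synchronized int size() {
        return full ? data.length : next;
    }

    /**
     * Copy of the current window in storage order, which is not
     * insertion order once the array has wrapped around.
     */
    synchronized double[] toUnsortedArray() {
        return Arrays.copyOf(data, size());
    }
}
```

Since adding a value is a single array write plus an index update, no resizing or copying happens on the hot path. Statistics such as percentiles would need to sort the returned copy anyway, so returning it in storage order loses nothing in this sketch.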