Hi Andor, Ted, Enrico,

Thanks again for your discussions and inputs. The changes have been
submitted. https://github.com/apache/zookeeper/pull/1698

Would you mind reviewing the PR?

Best,

Li

On Wed, May 5, 2021 at 10:00 AM Li Wang <li4w...@gmail.com> wrote:

> Thanks Andor!
>
> Li
>
> On Tue, May 4, 2021 at 2:00 PM Andor Molnar <an...@apache.org> wrote:
>
>> Awesome work Li, indeed.
>> I’m happy to review the PR. Thanks for the contribution.
>>
>> Andor
>>
>>
>>
>> > On 2021. May 4., at 3:49, Li Wang <li4w...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > Thanks Ted and Enrico again for the inputs and discussions. I would
>> like to
>> > share some updates from my side.
>> >
>> > 1. The main issue is that Prometheus summary computation is expensive
>> and
>> > locked/synchronized. To reduce the perf impact, I changed the
>> > PrometheusLabelledSummary.add() operation to be async, so it is handled
>> > by separate threads instead of the main thread.
>> >
>> > 2. Ran benchmarks against the changes. The perf of read and write
>> > operations is significantly improved.
>> >
>> > a) For read operations, the avg latency is reduced by about 50% and avg
>> > throughput is increased by about 95%.
>> > b) For write operations, the avg latency is reduced by about 28% and avg
>> > throughput is increased by about 20%.
>> >
>> > 3. I opened a JIRA for the issue and can submit a PR for the change.
>> >
>> > https://issues.apache.org/jira/browse/ZOOKEEPER-4289
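The async change described in point 1 can be sketched roughly as below. This is an illustrative stand-in, not the PR code: the class and parameter names are hypothetical, and a plain consumer stands in for the real PrometheusLabelledSummary update. Request threads only pay for a queue insert; a single reporter thread performs the expensive, synchronized summary work.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.DoubleConsumer;

// Sketch only: decouple metric recording from the request path by queueing
// samples and draining them on a dedicated reporter thread.
public class AsyncMetricSink implements AutoCloseable {
    private final BlockingQueue<Double> samples = new LinkedBlockingQueue<>();
    private final ExecutorService reporter = Executors.newSingleThreadExecutor();

    public AsyncMetricSink(DoubleConsumer expensiveObserve) {
        reporter.submit(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    // The costly, locked summary insert happens here,
                    // off the request-handling threads.
                    expensiveObserve.accept(samples.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Hot path: O(1) enqueue, no summary lock taken on the caller's thread.
    public void add(double value) {
        samples.offer(value);
    }

    @Override
    public void close() {
        reporter.shutdownNow();
    }
}
```

The trade-off is that samples are recorded slightly later and can be dropped on shutdown, which is usually acceptable for metrics.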
>> >
>> > Please let me know if you have any thoughts or questions.
>> >
>> > Thanks,
>> >
>> > Li
>> >
>> >
>> >
>> > On Wed, Apr 28, 2021 at 11:32 PM Enrico Olivelli <eolive...@gmail.com>
>> > wrote:
>> >
>> >> Il giorno mer 28 apr 2021 alle ore 21:48 Li Wang <li4w...@gmail.com>
>> ha
>> >> scritto:
>> >>>
>> >>> Hi Enrico,
>> >>>
>> >>> Please see my comments inline.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Li.
>> >>>
>> >>> On Mon, Apr 26, 2021 at 10:42 PM Enrico Olivelli <eolive...@gmail.com
>> >
>> >>> wrote:
>> >>>
>> >>>> Sorry for top posting.
>> >>>>
>> >>>> I have never seen in other applications that Prometheus has such a
>> >>>> significant impact.
>> >>>
>> >>>
>> >>> Please see my reply in the previous response.
>> >>>
>> >>>
>> >>>>
>> >>>> The first things that come into my mind:
>> >>>> - collect a couple of dumps with some perf tool and dig into the
>> >> problem
>> >>>
>> >>>
>> >>> Heap dump, thread dump and CPU profiling with Prometheus enabled vs
>> >>> disabled have been collected. That's how we found the issue.
>> >>>
>> >>>
>> >>>> -  verify that we have the latest version of Prometheus client
>> >>>>
>> >>>
>> >>> ZK uses 0.9.0 and the latest version of Prometheus client is 0.10.0.
>> >>> Checked in the change logs, it doesn't look like there are major
>> changes
>> >>> between these two. https://github.com/prometheus/client_java/releases
>> We
>> >>> can upgrade the version and see if it makes any difference.
>> >>>
>> >>> - tune the few knobs we have in the Prometheus client
>> >>>>
>> >>>
>> >>> What are the knobs we can tune? Can you provide more info on this?
>> >>
>> >> Unfortunately there is nothing we can tune directly with the zookeeper
>> >> configuration file
>> >>
>> >> In ZK code we currently have only some basic configuration of
>> >> summaries, and we are using mostly the default values
>> >>
>> >>
>> https://github.com/apache/zookeeper/blob/11c07921c15e2fb7692375327b53f26a583b77ca/zookeeper-metrics-providers/zookeeper-prometheus-metrics/src/main/java/org/apache/zookeeper/metrics/prometheus/PrometheusMetricsProvider.java#L340
>> >> there is no way to fine tune the buckets and all of the structures.
>> >>
>> >> we should work on looking for optimal values
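For reference, these are the kinds of knobs the simpleclient Summary builder exposes; the metric name below is made up, and the values are examples to experiment with, not recommendations:

```java
import io.prometheus.client.Summary;

// Each tracked quantile and each age bucket adds per-observe() work, so
// fewer quantiles, a looser error tolerance, a shorter window, or fewer
// buckets all reduce the cost of Summary.observe().
Summary latency = Summary.build()
        .name("example_latency")          // hypothetical metric name
        .help("Request latency sketch")
        .quantile(0.5, 0.05)              // median, 5% tolerated error
        .quantile(0.99, 0.01)             // p99; tighter error costs more CPU
        .maxAgeSeconds(60)                // sliding window (default is 600)
        .ageBuckets(2)                    // rotating buckets (default is 5)
        .register();
```

Exposing some of these through the ZooKeeper configuration file would let operators trade accuracy for throughput.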
>> >>
>> >
>> >
>> >
>> >
>> >>
>> >>
>> >>>
>> >>>>
>> >>>> In Apache Pulsar and in Apache Bookkeeper we have some customizations
>> >> in
>> >>>> the Prometheus metrics collectors, we could take a look and port those
>> >> to
>> >>>> Zookeeper (initially I worked on the Prometheus integration based on
>> my
>> >>>> usecases I have with Pulsar, Bookkeeper and other systems that
>> already
>> >> use
>> >>>> Prometheus, but here in Zookeeper we are using the basic Prometheus
>> >> client)
>> >>>>
>> >>>
>> >>> Are there any customizations for minimizing the performance impact?
>> Is
>> >>> there anything that we can port to ZK to help the issue?
>> >>
>> >> Yes, there is much we can port, this is the Prometheus Metrics
>> >> provider in BookKeeper
>> >>
>> >>
>> https://github.com/apache/bookkeeper/tree/master/bookkeeper-stats-providers/prometheus-metrics-provider/src/main/java/org/apache/bookkeeper/stats/prometheus
>> >>
>> >> Enrico
>> >>
>> >>>
>> >>>
>> >>>> Enrico
>> >>>>
>> >>>> Il Mar 27 Apr 2021, 06:35 Ted Dunning <ted.dunn...@gmail.com> ha
>> >> scritto:
>> >>>>
>> >>>>> Batching metrics reporting is very similar to option (c) but with
>> >> locking
>> >>>>> like option (a). That can usually be made faster by passing a
>> >> reference
>> >>>> to
>> >>>>> the metrics accumulator to the reporting thread which can do the
>> >> batch
>> >>>>> update without locks. Usually requires ping-pong metrics
>> >> accumulators so
>> >>>>> that a thread can accumulate in one accumulator for a bit, pass that
>> >> to
>> >>>> the
>> >>>>> merge thread and switch to using the alternate accumulator. Since
>> all
>> >>>>> threads typically report at the same time, this avoids a stampede on
>> >> the
>> >>>>> global accumulator.
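A minimal single-accumulator version of that swap idea might look like the sketch below (hypothetical class; a full ping-pong design would keep one such accumulator per thread and have the merge thread fold the drained values into the global summary):

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.LongAdder;

// Sketch: workers record into the "active" accumulator lock-free; the
// reporting thread swaps in a fresh one and reads the drained accumulator
// at leisure, so workers never contend on the global structure.
public class PingPongCounter {
    private final AtomicReference<LongAdder> active =
            new AtomicReference<>(new LongAdder());

    // Hot path: lock-free add into whichever accumulator is active.
    public void record(long value) {
        active.get().add(value);
    }

    // Reporting thread: swap, then sum the retired accumulator.
    // Note: a sample racing with the swap can be missed by this drain;
    // real implementations quiesce writers or hand off per-thread buffers.
    public long drain() {
        LongAdder old = active.getAndSet(new LongAdder());
        return old.sum();
    }
}
```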
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Apr 26, 2021 at 9:30 PM Li Wang <li4w...@gmail.com> wrote:
>> >>>>>
>> >>>>>> batching metrics reporting can help. For example, in the
>> >>>> CommitProcessor,
>> >>>>>> increasing the maxCommitBatchSize helps improve the
>> >> performance
>> >>>> of
>> >>>>>> write operation.
>> >>>>>>
>> >>>>>>
>> >>>>>> On Mon, Apr 26, 2021 at 9:21 PM Li Wang <li4w...@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> Yes, I am thinking that handling metrics reporting in a separate
>> >>>>> thread,
>> >>>>>>> so it doesn't impact the "main" thread.
>> >>>>>>>
>> >>>>>>> Not sure about the idea of merging at the end of a reporting
>> >> period.
>> >>>>> Can
>> >>>>>>> you elaborate a bit on it?
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>>
>> >>>>>>> Li
>> >>>>>>>
>> >>>>>>> On Mon, Apr 26, 2021 at 9:11 PM Ted Dunning <
>> >> ted.dunn...@gmail.com>
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>>> Would it help to keep per thread metrics that are either
>> >> reported
>> >>>>>>>> independently or are merged at the end of a reporting period?
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Mon, Apr 26, 2021 at 8:51 PM Li Wang <li4w...@gmail.com>
>> >> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hi Community,
>> >>>>>>>>>
>> >>>>>>>>> I've done further investigation on the issue and found the
>> >>>> following
>> >>>>>>>>>
>> >>>>>>>>> 1. The perf of the read operation was decreased due to the
>> >> lock
>> >>>>>>>> contention
>> >>>>>>>>> in the Prometheus TimeWindowQuantiles APIs. 3 out of 4
>> >>>>>> CommitProcWorker
>> >>>>>>>>> threads were blocked on the TimeWindowQuantiles.insert() API
>> >>>>>>>>> when the test was running.
>> >>>>>>>>>
>> >>>>>>>>> 2. The perf of the write operation was decreased because of
>> >> the
>> >>>> high
>> >>>>>> CPU
>> >>>>>>>>> usage from Prometheus Summary type of metrics. The CPU usage
>> >> of
>> >>>>>>>>> CommitProcessor increased about 50% when Prometheus was enabled
>> >>>>>>>>> compared to disabled (46% vs 80% with 4 CPU, 63% vs 99% with 12
>> >>>>>>>>> CPU).
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Prometheus integration is a great feature, however the
>> >> negative
>> >>>>>>>> performance
>> >>>>>>>>> impact is very significant.  I wonder if anyone has any
>> >> thoughts
>> >>>> on
>> >>>>>> how
>> >>>>>>>> to
>> >>>>>>>>> reduce the perf impact.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Li
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Tue, Apr 6, 2021 at 12:33 PM Li Wang <li4w...@gmail.com>
>> >>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi,
>> >>>>>>>>>>
>> >>>>>>>>>> I would like to reach out to the community to see if anyone
>> >> has
>> >>>>> some
>> >>>>>>>>>> insights or experience with the performance impact of
>> >> enabling
>> >>>>>>>> prometheus
>> >>>>>>>>>> metrics.
>> >>>>>>>>>>
>> >>>>>>>>>> I have done load comparison tests for Prometheus enabled vs
>> >>>>> disabled
>> >>>>>>>> and
>> >>>>>>>>>> found the performance is reduced about 40%-60% for both
>> >> read and
>> >>>>>> write
>> >>>>>>>>>> operations (i.e. getData, getChildren and createNode).
>> >>>>>>>>>>
>> >>>>>>>>>> The load test was done with Zookeeper 3.7, cluster size of 5
>> >>>>>>>> participants
>> >>>>>>>>>> and 5 observers, each ZK server has 10G heap size and 4
>> >> cpu, 500
>> >>>>>>>>> concurrent
>> >>>>>>>>>> users sending requests.
>> >>>>>>>>>>
>> >>>>>>>>>> The performance impact is quite significant.  I wonder if
>> >> this
>> >>>> is
>> >>>>>>>>> expected
>> >>>>>>>>>> and what we can do to have ZK perform the same while
>> >>>>>>>>>> leveraging the new Prometheus metrics feature.
>> >>>>>>>>>>
>> >>>>>>>>>> Best,
>> >>>>>>>>>>
>> >>>>>>>>>> Li
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>
>>
>>
