I definitely agree with Guozhang's "meta" comment: if it's possible to
allow users to pick and choose individual RocksDB metrics that would be
ideal. One further question is whether these will be debug or info level
metrics, or a separate level altogether? If there is a nontrivial overhead
associated with attaching RocksDB metrics it would probably be good to be
able to independently turn on/off Rocks metrics

On Tue, May 21, 2019 at 9:00 AM Guozhang Wang <wangg...@gmail.com> wrote:

> Hello Bruno,
>
> Thanks for the KIP, I have a few minor comments and a meta one which are
> relatively aligned with other folks:
>
> Minor:
>
> 1) Regarding the "rocksdb-state-id = [store ID]", to be consistent with
> other state store metrics (see
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-444%3A+Augment+metrics+for+Kafka+Streams),
> this tag should be either "rocksdb-window-state-id",
> "rocksdb-session-state-id" or "rocksdb-state-id". For window / session
> store backed by rocksDB, the tags should not be "rocksdb-state-id".
>
> 2) Also for window / session store, my take is that you plan to have
> multiple rocksDB behind the scene to report to the same set of metrics, is
> that right? My concern is that for such types of state stores, most of the
> access would be on the first segment rocksDB instance, and hence coalescing
> all of instances as a single set of metrics may dilute it.
>
> 3) I agree with @sop...@confluent.io <sop...@confluent.io> that we should
> better have some documentation educating users what to do when see what
> anomalies in metrics; though I think this is a long endeavoring effort that
> may go beyond this KIP's scope, let's keep that as a separate umbrella JIRA
> to accumulate knowledge over time.
>
>
> Meta:
>
> 4) Instead of trying to enumerate all the ones that might be helpful, I'd
> recommend that we expose a user-friendly API in StreamsMetrics to allow
> users to add more metrics from RocksDB they'd like to have, while only
> keeping a small set of most-meaningful ones that are ubiquitously useful
> out-of-the-box. WDYT?
>
>
> Guozhang
>
>
>
> On Tue, May 21, 2019 at 8:04 AM Dongjin Lee <dong...@apache.org> wrote:
>
>> Hi Bruno,
>>
>> I just read the KIP. I think this feature is great. As far as I know, most
>> Kafka users monitor the host resources, JVM resources, and Kafka metrics
>> only, not RocksDB for configuring the statistics feature is a little bit
>> tiresome. Since RocksDB impacts the performance of Kafka Streams, I
>> believe
>> providing these metrics with other metrics in one place is much better.
>>
>> However, I am a little bit not assured for how much information should be
>> provided to the users with the documentation - how the user can control
>> the
>>  RocksDB may on the boundary of the scope. How do you think?
>>
>> +1. I entirely agree to Bill's comments; I also think `rocksdb-store-id`
>> is
>> better than `rocksdb-state-id` and metrics on total compactions and an
>> average number of compactions is also needed.
>>
>> Regards,
>> Dongjin
>>
>> On Tue, May 21, 2019 at 2:48 AM John Roesler <j...@confluent.io> wrote:
>>
>> > Hi Bruno,
>> >
>> > Looks really good overall. This is going to be an awesome addition.
>> >
>> > My only thought was that we have "bytes-flushed-(rate|total) and
>> > flush-time-(avg|min|max)" metrics, and the description states that
>> > these are specifically for Memtable flush operations. What do you
>> > think about calling it "memtable-bytes-flushed... (etc)"? On one hand,
>> > I could see this being redundant, since that's the only thing that
>> > gets "flushed" inside of Rocks. But on the other, we have an
>> > independent "flush" operation in streams, which might be confusing.
>> > Plus, it might help people who are looking at the full "menu" of
>> > hundreds of metrics. They can't read and remember every description
>> > while trying to understand the full list of metrics, so going for
>> > maximum self-documentation in the name seems nice.
>> >
>> > But that's a minor thought. Modulo the others' comments, this looks good
>> > to me.
>> >
>> > Thanks,
>> > -John
>> >
>> > On Mon, May 20, 2019 at 11:07 AM Bill Bejeck <bbej...@gmail.com> wrote:
>> > >
>> > > Hi Bruno,
>> > >
>> > > Thanks for the KIP, this will be a useful addition.
>> > >
>> > > Overall the KIP looks good to me, and I have two minor comments.
>> > >
>> > > 1. For the tags should, I'm wondering if rocksdb-state-id should be
>> > > rocksdb-store-id
>> > > instead?
>> > >
>> > > 2. With the compaction metrics, would it be possible to add total
>> > > compactions and an average number of compactions?  I've taken a look
>> at
>> > the
>> > > available RocksDB metrics, and I'm not sure.  But users can control
>> how
>> > > many L0 files it takes to trigger compaction so if it is possible; it
>> may
>> > > be useful.
>> > >
>> > > Thanks,
>> > > Bill
>> > >
>> > >
>> > > On Mon, May 20, 2019 at 9:15 AM Bruno Cadonna <br...@confluent.io>
>> > wrote:
>> > >
>> > > > Hi Sophie,
>> > > >
>> > > > Thank you for your comments.
>> > > >
>> > > > It's a good idea to supplement the metrics with configuration option
>> > > > to change the metrics. I also had some thoughts about it. However, I
>> > > > think I need some experimentation to get this right.
>> > > >
>> > > > I added the block cache hit rates for index and filter blocks to the
>> > > > KIP. As far as I understood, they should stay at zero, if users do
>> not
>> > > > configure RocksDB to include index and filter blocks into the block
>> > > > cache. Did you also understand this similarly? I guess also in this
>> > > > case some experimentation would be good to be sure.
>> > > >
>> > > > Best,
>> > > > Bruno
>> > > >
>> > > >
>> > > > On Sat, May 18, 2019 at 2:29 AM Sophie Blee-Goldman <
>> > sop...@confluent.io>
>> > > > wrote:
>> > > > >
>> > > > > Actually I wonder if it might be useful to users to be able to
>> break
>> > up
>> > > > the
>> > > > > cache hit stats by type? Some people may choose to store index and
>> > filter
>> > > > > blocks alongside data blocks, and it would probably be very
>> helpful
>> > for
>> > > > > them to know who is making more effective use of the cache in
>> order
>> > to
>> > > > tune
>> > > > > how much of it is allocated to each. I'm not sure how common this
>> > really
>> > > > is
>> > > > > but I think it would be invaluable to those who do. RocksDB
>> > performance
>> > > > can
>> > > > > be quite opaque..
>> > > > >
>> > > > > Cheers,
>> > > > > Sophie
>> > > > >
>> > > > > On Fri, May 17, 2019 at 5:01 PM Sophie Blee-Goldman <
>> > sop...@confluent.io
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hey Bruno!
>> > > > > >
>> > > > > > This all looks pretty good to me, but one suggestion I have is
>> to
>> > > > > > supplement each of the metrics with some info on how the user
>> can
>> > > > control
>> > > > > > them. In other words, which options could/should they set in
>> > > > > > RocksDBConfigSetter should they discover a particular
>> bottleneck?
>> > > > > >
>> > > > > > I don't think this necessarily needs to go into the KIP, but I
>> do
>> > > > think it
>> > > > > > should be included in the docs somewhere (happy to help build up
>> > the
>> > > > list
>> > > > > > of associated options when the time comes)
>> > > > > >
>> > > > > > Thanks!
>> > > > > > Sophie
>> > > > > >
>> > > > > > On Fri, May 17, 2019 at 2:54 PM Bruno Cadonna <
>> br...@confluent.io>
>> > > > wrote:
>> > > > > >
>> > > > > >> Hi all,
>> > > > > >>
>> > > > > >> this KIP describes the extension of the Kafka Streams' metrics
>> to
>> > > > include
>> > > > > >> RocksDB's internal statistics.
>> > > > > >>
>> > > > > >> Please have a look at it and let me know what you think. Since
>> I
>> > am
>> > > > not a
>> > > > > >> RocksDB expert, I am thankful for any additional pair of eyes
>> that
>> > > > > >> evaluates this KIP.
>> > > > > >>
>> > > > > >>
>> > > > > >>
>> > > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-471:+Expose+RocksDB+Metrics+in+Kafka+Streams
>> > > > > >>
>> > > > > >> Best regards,
>> > > > > >> Bruno
>> > > > > >>
>> > > > > >
>> > > >
>> >
>>
>>
>> --
>> *Dongjin Lee*
>>
>> *A hitchhiker in the mathematical world.*
>> *github:  <http://goog_969573159/>github.com/dongjinleekr
>> <https://github.com/dongjinleekr>linkedin:
>> kr.linkedin.com/in/dongjinleekr
>> <https://kr.linkedin.com/in/dongjinleekr>speakerdeck:
>> speakerdeck.com/dongjin
>> <https://speakerdeck.com/dongjin>*
>>
>
>
> --
> -- Guozhang
>

Reply via email to