Re: [DISCUSS] KIP-607: Add Metrics to Record the Memory Used by RocksDB to Kafka Streams

Bruno Cadonna Mon, 20 Jul 2020 11:01:01 -0700

Hi,

During the implementation of this KIP and after some discussion aboutRocksDB metrics, I decided to make some major modifications to this KIPand kick off discussion again.


https://cwiki.apache.org/confluence/display/KAFKA/KIP-607%3A+Add+Metrics+to+Kafka+Streams+to+Report+Properties+of+RocksDB

Best,
Bruno

On 15.05.20 17:11, Bill Bejeck wrote:

Thanks for the KIP, Bruno. Having sensible, easy to access RocksDB memory
reporting will be a welcomed addition.

FWIW I also agree to have the metrics reported on a store level. I'm glad
you changed the KIP to that effect.

-Bill



On Wed, May 13, 2020 at 6:24 PM Guozhang Wang <wangg...@gmail.com> wrote:

Hi Bruno,

Sounds good to me.

I think I'm just a bit more curious to see its impact on performance: as
long as we have one INFO level rocksDB metrics, then we'd have to turn on
the scheduled rocksdb metrics recorder whereas previously, we can decide to
not turn on the recorder at all if all are set as DEBUG and we configure at
INFO level in production. But this is an implementation detail anyways and
maybe the impact is negligible after all. We can check and re-discuss this
afterwards :)


Guozhang


On Wed, May 13, 2020 at 9:34 AM Sophie Blee-Goldman <sop...@confluent.io>
wrote:

Thanks Bruno! I took a look at the revised KIP and it looks good to me.

Sophie

On Wed, May 13, 2020 at 6:59 AM Bruno Cadonna <br...@confluent.io>

wrote:

Hi John,

Thank you for the feedback!

I agree and I will change the KIP as I stated in my previous e-mail to
Guozhang.

Best,
Bruno

On Tue, May 12, 2020 at 3:07 AM John Roesler <vvcep...@apache.org>

wrote:


Thanks, all.

If you don’t mind, I’ll pitch in a few cents’ worth.

In my life I’ve generally found more granular metrics to be more

useful,

as long as there’s a sane way to roll them up. It does seem nice to see

it

on the per-store level. For roll-up purposes, the task and thread tags
should be sufficient.


I think the only reason we make some metrics Debug is that

_recording_

them can be expensive. If there’s no added expense, I think we can just
register store-level metrics at Info level.


Thanks for the KIP, Bruno!
-John

On Mon, May 11, 2020, at 17:32, Guozhang Wang wrote:

Hello Sophie / Bruno,

I've also thought about the leveling question, and one motivation I

had for

setting it in instance-level is that we want to expose it in INFO

level:

today our report leveling is not very finer grained --- which I

think

is

sth. worth itself --- such that one have to either turn on all

DEBUG

metrics recording or none of them. If we can allow users to e.g.

specify

"turn on streams-metrics and stream-state-metrics, but not others"

and

then

I think it should be just at store-level. However, right now if we

want to

set it as store-level then it would be DEBUG and not exposed by

default.


So it seems we have several options in addition to the proposed

one:


a) we set it at store-level as INFO; but then one can argue why

this

is

INFO while others (bytes-written, etc) are DEBUG.
b) we set it at store-level as DEBUG, believing that we do not

usually

need

to turn it on.
c) maybe, we can set it at task-level (? I'm not so sure myself

about

this.. :P) as INFO.


Guozhang




On Mon, May 11, 2020 at 12:29 PM Sophie Blee-Goldman <

sop...@confluent.io>

wrote:

Hey Bruno,

Thanks for the KIP! I have one high-level concern, which is that

we

should

consider
reporting these metrics on the per-store level rather than

instance-wide. I

know I was
the one who first proposed making it instance-wide, so bear with

me:


While I would still argue that the instance-wide memory usage is

probably

the most *useful*,
exposing them at the store-level does not prevent users from

monitoring the

instance-wide
memory. They should be able to roll up all the store-level

metrics

on an

instance to
compute the total off-heap memory. But rolling it up for the

users

does

prevent them from
using this to debug rare cases where one store may be using

significantly

more memory than
expected.

It's also worth considering that some users may be using the

bounded

memory

config setter
to put a cap on the off-heap memory of the entire process, in

which

case

the memory usage
metric for any one store should reflect the memory usage of the

entire

instance. In that case
any effort to roll up the memory usages ourselves would just be

wasted.


Sorry for the reversal, but after a second thought I'm pretty

strongly in

favor of reporting these
at the store level.

Best,
Sophie

On Wed, May 6, 2020 at 8:41 AM Bruno Cadonna <br...@confluent.io

wrote:

Hi all,

I'd like to discuss KIP-607 that aims to add RocksDB memory

usage

metrics to Kafka Streams.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-607%3A+Add+Metrics+to+Record+the+Memory+Used+by+RocksDB+to+Kafka+Streams


Best,
Bruno



--
-- Guozhang



--
-- Guozhang

Re: [DISCUSS] KIP-607: Add Metrics to Record the Memory Used by RocksDB to Kafka Streams

Reply via email to