Thanks for the updates, Evan. Sounds good. I have not looked at the KIP
again yet, but I think you can start a VOTE?
-Matthias
On 1/21/26 3:25 PM, Evan Zhou via dev wrote:
Hi Matthias,
MJS1: I did some more digging, and I now see what you were initially asking
about. I've updated the KIP to specify that a gauge, rather than a sensor,
will be used to collect the metric, which makes the performance impact
trivial compared to the initial proposal. I've also changed the recording
level to INFO, matching the RocksDB equivalent metric.
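
For reference, here is a minimal sketch of the gauge-based approach using
Kafka's public metrics API; the class, map, and metric names are
hypothetical stand-ins, not the KIP's actual implementation:

    import java.util.Map;
    import java.util.TreeMap;
    import org.apache.kafka.common.MetricName;
    import org.apache.kafka.common.metrics.Gauge;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.utils.Bytes;

    public class GaugeSketch {
        public static void main(String[] args) {
            // Stand-in for the store's internal map.
            Map<Bytes, byte[]> store = new TreeMap<>();
            Metrics metrics = new Metrics();
            MetricName name = metrics.metricName(
                "in-memory-state-num-keys", "stream-state-metrics");
            // The gauge is evaluated only when the metric is read, so
            // puts and deletes on the store itself stay untouched.
            metrics.addMetric(name, (Gauge<Integer>) (config, now) -> store.size());
        }
    }

Since nothing runs on the write path, recording at INFO level seems safe.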
Thanks,
Evan
On Fri, Jan 16, 2026 at 1:43 PM Evan Zhou <[email protected]> wrote:
Hi Matthias,
MJS1: When thinking about how to record this metric, I figured the
simplest and most straightforward way would be to update the metric on
every put and delete, which would have a nonzero performance impact; that
is why I decided to add it only at the DEBUG level (see the sketch below).
I'm not sure how RocksDB collects its equivalent metric, but my
understanding is that Kafka Streams just passes along the value it gets
from RocksDB and doesn't do any of the collecting itself, which is why I
believe the in-memory implementation would be more expensive.
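
To make the cost concern concrete, here is a rough sketch of the
sensor-based approach I had in mind; all names here are hypothetical, and
the point is only that the metric update sits on the hot path of every
write:

    import java.util.TreeMap;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Value;
    import org.apache.kafka.common.utils.Bytes;

    public class SensorSketch {
        private final TreeMap<Bytes, byte[]> store = new TreeMap<>();
        private final Sensor numKeys;

        SensorSketch(Metrics metrics) {
            numKeys = metrics.sensor("num-keys");
            numKeys.add(metrics.metricName(
                "in-memory-state-num-keys", "stream-state-metrics"), new Value());
        }

        void put(Bytes key, byte[] value) {
            store.put(key, value);
            // Extra work on every single write, unlike the gauge approach.
            numKeys.record(store.size());
        }
    }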
MJS2: Good point, I've updated the KIP to reflect this. To repeat what is
in the KIP, this metric will be implemented for all classes starting with
`Metered` under `streams/state/internals`. For completeness, the metered
iterators are able to remove elements, so they are included as well; the
sketch below shows why.
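
Here is a rough, hypothetical wrapper (not the KIP's code) illustrating
the iterator point: remove() mutates the underlying store, so any scheme
that counts on writes has to hook it too:

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.state.KeyValueIterator;

    class CountingIterator<K, V> implements KeyValueIterator<K, V> {
        private final KeyValueIterator<K, V> inner;
        private final Runnable onRemove; // e.g. decrements the key count

        CountingIterator(KeyValueIterator<K, V> inner, Runnable onRemove) {
            this.inner = inner;
            this.onRemove = onRemove;
        }

        @Override public boolean hasNext() { return inner.hasNext(); }
        @Override public KeyValue<K, V> next() { return inner.next(); }
        @Override public K peekNextKey() { return inner.peekNextKey(); }
        @Override public void close() { inner.close(); }

        @Override
        public void remove() {
            inner.remove();  // deletes the current entry from the store
            onRemove.run();  // keeps the num-keys count in sync
        }
    }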
Thanks,
Evan
On Mon, Jan 12, 2026 at 6:13 PM Matthias J. Sax <[email protected]> wrote:
Thanks for the KIP Evan.
MJS1: Why is the metric added at DEBUG level? The corresponding RocksDB
metric is added at INFO level. The KIP says: "The addition of this metric
will have no effect on existing users. The performance impact of adding
this metric is non-trivial, thus we only record it at the DEBUG level."
Why is it expensive? That is not clear to me at the moment. And why would
it be cheaper for the RocksDB case?
About Lucas's question: I'm not sure we would need to do anything about
it? Given that we have specific RocksDB metrics, which are not available
for in-memory stores, why should we not have specific in-memory store
metrics? The RocksDB metric names also don't contain "rocksdb"? As long
as it's properly documented as an "in-memory metric" (and the description
already mentions it explicitly) it might just be fine?
MJS2: The KIP says we would implement this only for
`MeteredKeyValueStore`. Why not also add it to the other store types,
such as windowed and session stores?
-Matthias
On 12/12/25 4:14 PM, Evan Zhou via dev wrote:
Hi Lucas,
Thanks for the feedback.
My original intention was to use "estimate" in the metric name as the
differentiator, but I agree that this is not very clear. How does changing
the metric name to "in-memory-state-num-keys" sound?
Thanks,
Evan
On Thu, Dec 11, 2025 at 1:21 AM Lucas Brutschy via dev <[email protected]> wrote:
Hi Evan,
thanks for the KIP!
If this is specific to in-memory stores, I wonder if we should add
this to the metric name? People could be confused when RocksDB state
does not show up.
Cheers,
Lucas
On Thu, Dec 11, 2025 at 1:40 AM Evan Zhou via dev <[email protected]> wrote:
Hi all,
I'd like to start the discussion for KIP-1250, which adds a new metric
to track the size of in-memory state stores. Today, a similar metric
exists in Kafka, but only for RocksDB, and this KIP intends to close
that gap.
https://cwiki.apache.org/confluence/x/noTMFw
Thanks,
Evan