Re: [DISCUSS] KIP-989: RocksDB Iterator Metrics

Lucas Brutschy Tue, 14 May 2024 10:06:10 -0700

Hi Nick,

you are still referring to oldest-open-iterator-age-ms in the
`Proposed Changes` section.


Cheers,
Lucas

On Thu, May 2, 2024 at 4:00 PM Lucas Brutschy <[email protected]> wrote:
>
> Hi Nick!
>
> I agree, the age variant is a bit nicer since the semantics are very
> clear from the name. If you'd rather go for the simple implementation,
> how about calling it `oldest-iterator-open-since-ms`? I believe this
> could be understood without docs. Either way, I think we should be
> able to open the vote for this KIP because nobody raised any major /
> blocking concerns.
>
> Looking forward to getting this voted on soon!
>
> Cheers
> Lucas
>
> On Sun, Mar 31, 2024 at 5:23 PM Nick Telford <[email protected]> wrote:
> >
> > Hi Matthias,
> >
> > > For the oldest iterator metric, I would propose something simple like
> > > `iterator-opened-ms` and it would just be the actual timestamp when the
> > > iterator was opened. I don't think we need to compute the actual age,
> > > but user can to this computation themselves?
> >
> > That works for me; it's easier to implement like that :-D I'm a little
> > concerned that the name "iterator-opened-ms" may not be obvious enough
> > without reading the docs.
> >
> > > If we think reporting the age instead of just the timestamp is better, I
> > > would propose `iterator-max-age-ms`. I should be sufficient to call out
> > > (as it's kinda "obvious" anyway) that the metric applies to open
> > > iterator only.
> >
> > While I think it's preferable to record the timestamp, rather than the age,
> > this does have the benefit of a more obvious metric name.
> >
> > > Nit: the KIP says it's a store-level metric, but I think it would be
> > > good to say explicitly that it's recorded with DEBUG level only?
> >
> > Yes, I've already updated the KIP with this information in the table.
> >
> > Regards,
> >
> > Nick
> >
> > On Sun, 31 Mar 2024 at 10:53, Matthias J. Sax <[email protected]> wrote:
> >
> > > The time window thing was just an idea. Happy to drop it.
> > >
> > > For the oldest iterator metric, I would propose something simple like
> > > `iterator-opened-ms` and it would just be the actual timestamp when the
> > > iterator was opened. I don't think we need to compute the actual age,
> > > but user can to this computation themselves?
> > >
> > > If we think reporting the age instead of just the timestamp is better, I
> > > would propose `iterator-max-age-ms`. I should be sufficient to call out
> > > (as it's kinda "obvious" anyway) that the metric applies to open
> > > iterator only.
> > >
> > > And yes, I was hoping that the code inside MetereXxxStore might already
> > > be setup in a way that custom stores would inherit the iterator metrics
> > > automatically -- I am just not sure, and left it as an exercise for
> > > somebody to confirm :)
> > >
> > >
> > > Nit: the KIP says it's a store-level metric, but I think it would be
> > > good to say explicitly that it's recorded with DEBUG level only?
> > >
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 3/28/24 2:52 PM, Nick Telford wrote:
> > > > Quick addendum:
> > > >
> > > > My suggested metric "oldest-open-iterator-age-seconds" should be
> > > > "oldest-open-iterator-age-ms". Milliseconds is obviously a better
> > > > granularity for such a metric.
> > > >
> > > > Still accepting suggestions for a better name.
> > > >
> > > > On Thu, 28 Mar 2024 at 13:41, Nick Telford <[email protected]>
> > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> Sorry for leaving this for so long. So much for "3 weeks until KIP
> > > freeze"!
> > > >>
> > > >> On Sophie's comments:
> > > >> 1. Would Matthias's suggestion of a separate metric tracking the age of
> > > >> the oldest open iterator (within the tag set) satisfy this? That way we
> > > can
> > > >> keep iterator-duration-(avg|max) for closed iterators, which can be
> > > useful
> > > >> for performance debugging for iterators that don't leak. I'm not sure
> > > what
> > > >> we'd call this metric, maybe: "oldest-open-iterator-age-seconds"? Seems
> > > >> like a mouthful.
> > > >>
> > > >> 2. You're right, it makes more sense to provide
> > > >> iterator-duration-(avg|max). Honestly, I can't remember why I had
> > > "total"
> > > >> before, or why I was computing a rate-of-change over it.
> > > >>
> > > >> 3, 4, 5, 6. Agreed, I'll make all those changes as suggested.
> > > >>
> > > >> 7. Combined with Matthias's point about RocksDB, I'm convinced that 
> > > >> this
> > > >> is the wrong KIP for these. I'll introduce the additional Rocks metrics
> > > in
> > > >> another KIP.
> > > >>
> > > >> On Matthias's comments:
> > > >> A. Not sure about the time window. I'm pretty sure all existing avg/max
> > > >> metrics are since the application was started? Any other suggestions
> > > here
> > > >> would be appreciated.
> > > >>
> > > >> B. Agreed. See point 1 above.
> > > >>
> > > >> C. Good point. My focus was very much on Rocks memory leaks when I 
> > > >> wrote
> > > >> the first draft. I can generalise it. My only concern is that it might
> > > make
> > > >> it more difficult to detect Rocks iterator leaks caused *within* our
> > > >> high-level iterator, e.g. RocksJNI, RocksDB, RocksDBStore, etc. But we
> > > >> could always provide a RocksDB-specific metric for this, as you
> > > suggested.
> > > >>
> > > >> D. Hmm, we do already have MeteredKeyValueIterator, which automatically
> > > >> wraps the iterator from inner-stores of MeteredKeyValueStore. If we
> > > >> implemented these metrics there, then custom stores would automatically
> > > >> gain the functionality, right? This seems like a pretty logical place 
> > > >> to
> > > >> implement these metrics, since MeteredKeyValueStore is all about adding
> > > >> metrics to state stores.
> > > >>
> > > >>> I imagine the best way to implement this would be to do so at the
> > > >>> high-level iterator rather than implementing it separately for each
> > > >>> specific iterator implementation for every store type.
> > > >>
> > > >> Sophie, does MeteredKeyValueIterator fit with your recommendation?
> > > >>
> > > >> Thanks for your thoughts everyone, I'll update the KIP now.
> > > >>
> > > >> Nick
> > > >>
> > > >> On Thu, 14 Mar 2024 at 03:37, Sophie Blee-Goldman <
> > > [email protected]>
> > > >> wrote:
> > > >>
> > > >>> About your last two points: I completely agree that we should try to
> > > >>> make this independent of RocksDB, and should probably adopt a
> > > >>> general philosophy of being store-implementation agnostic unless
> > > >>> there is good reason to focus on a particular store type: eg if it was
> > > >>> only possible to implement for certain stores, or only made sense in
> > > >>> the context of a certain store type but not necessarily stores in
> > > general.
> > > >>>
> > > >>> While leaking memory due to unclosed iterators on RocksDB stores is
> > > >>> certainly the most common issue, I think Matthias sufficiently
> > > >>> demonstrated that the problem of leaking iterators is not actually
> > > >>> unique to RocksDB, and we should consider including in-memory
> > > >>> stores at the very least. I also think that at this point, we may as
> > > well
> > > >>> just implement the metrics for *all* store types, whether rocksdb or
> > > >>> in-memory or custom. Not just because it probably applies to all
> > > >>> store types (leaking iterators are rarely a good thing!) but because
> > > >>> I imagine the best way to implement this would be to do so at the
> > > >>> high-level iterator rather than implementing it separately for each
> > > >>> specific iterator implementation for every store type.
> > > >>>
> > > >>> That said, I haven't thought all that carefully about the
> > > implementation
> > > >>> yet -- it just strikes me as easiest to do at the top level of the
> > > store
> > > >>> hierarchy rather than at the bottom. My gut instinct may very well be
> > > >>> wrong, but that's what it's saying
> > > >>>
> > > >>> On Thu, Mar 7, 2024 at 10:43 AM Matthias J. Sax <[email protected]>
> > > wrote:
> > > >>>
> > > >>>> Seems I am late to this party. Can we pick this up again aiming for
> > > 3.8
> > > >>>> release? I think it would be a great addition. Few comments:
> > > >>>>
> > > >>>>
> > > >>>> - I think it does make sense to report `iterator-duration-avg` and
> > > >>>> `iterator-duration-max` for all *closed* iterators -- it just seems 
> > > >>>> to
> > > >>>> be a useful metric (wondering if this would be _overall_ or bounded 
> > > >>>> to
> > > >>>> some time window?)
> > > >>>>
> > > >>>> - About the duration iterators are currently open, I believe the only
> > > >>>> useful way is to report the "oldest iterator", ie, the smallest
> > > iterator
> > > >>>> open-time, of all currently open-iterator? We all agree that in
> > > general,
> > > >>>> leaking iterator would bump the count metric, and if there is a few
> > > ones
> > > >>>> which are not closed and open for a long time, it seem sufficient to
> > > >>>> detect the single oldest one for alerting purpose?
> > > >>>>
> > > >>>> - What I don't like about the KIP is it focus on RocksDB. I don't
> > > think
> > > >>>> we should build on the internal RocksDB counters as proposed (I 
> > > >>>> guess,
> > > >>>> we could still expose them, similar to other RocksDB metrics which we
> > > >>>> expose already). However, for this new metric, we should track it
> > > >>>> ourselves and thus make it independent of RocksDB -- in the end, an
> > > >>>> in-memory store could also leak memory (and kill a JVM with an
> > > >>>> out-of-memory error) and we should be able to track it.
> > > >>>>
> > > >>>> - Not sure if we would like to add support for custom stores, to 
> > > >>>> allow
> > > >>>> them to register their iterators with this metric? Or would this not
> > > be
> > > >>>> necessary, because custom stores could just register a custom metric
> > > >>>> about it to begin with?
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> -Matthias
> > > >>>>
> > > >>>> On 10/25/23 4:41 PM, Sophie Blee-Goldman wrote:
> > > >>>>>>
> > > >>>>>>    If we used "iterator-duration-max", for
> > > >>>>>> example, would it not be confusing that it includes Iterators that
> > > >>> are
> > > >>>>>> still open, and therefore the duration is not yet known?
> > > >>>>>
> > > >>>>>
> > > >>>>> 1. Ah, I think I understand your concern better now -- I totally
> > > agree
> > > >>>> that
> > > >>>>> a
> > > >>>>>    "iterator-duration-max" metric would be confusing/misleading. I
> > > was
> > > >>>>> thinking about it a bit differently, something more akin to the
> > > >>>>> "last-rebalance-seconds-ago" consumer metric. As the name suggests,
> > > >>>>> that basically just tracks how long the consumer has gone without
> > > >>>>> rebalancing -- it doesn't purport to represent the actual duration
> > > >>>> between
> > > >>>>> rebalances, just the current time since the last one.  The hard part
> > > >>> is
> > > >>>>> really
> > > >>>>> in choosing a name that reflects this -- maybe you have some better
> > > >>> ideas
> > > >>>>> but off the top of my head, perhaps something like
> > > >>>> "iterator-lifetime-max"?
> > > >>>>>
> > > >>>>> 2. I'm not quite sure how to interpret the "iterator-duration-total"
> > > >>>> metric
> > > >>>>> -- what exactly does it mean to add up all the iterator durations?
> > > For
> > > >>>>> some context, while this is not a hard-and-fast rule, in general
> > > >>> you'll
> > > >>>>> find that Kafka/Streams metrics tend to come in pairs of avg/max or
> > > >>>>> rate/total. Something that you might measure the avg for usually is
> > > >>>>> also useful to measure the max, whereas a total metric is probably
> > > >>>>> also useful as a rate but not so much as an avg. I actually think
> > > this
> > > >>>>> is part of why it feels like it makes so much sense to include a
> > > "max"
> > > >>>>> version of this metric, as Lucas suggested, even if the name of
> > > >>>>> "iterator-duration-max" feels misleading. Ultimately the metric 
> > > >>>>> names
> > > >>>>> are up to you, but for this reason, I would personally advocate for
> > > >>>>> just going with an "iterator-duration-avg" and
> > > "iterator-duration-max"
> > > >>>>>
> > > >>>>> I did see your example in which you mention one could monitor the
> > > >>>>> rate of change of the "-total" metric. While this does make sense to
> > > >>>>> me, if the only way to interpret a metric is by computing another
> > > >>>>> function over it, then why not just make that computation the metric
> > > >>>>> and cut out the middle man? And in this case, to me at least, it
> > > feels
> > > >>>>> much easier to understand a metric like "iterator-duration-max" vs
> > > >>>>> something like "iterator-duration-total-rate"
> > > >>>>>
> > > >>>>> 3. By the way, can you add another column to the table with the new
> > > >>>> metrics
> > > >>>>> that lists the recording level? My suggestion would be to put the
> > > >>>>> "number-open-iterators" at INFO and the other two at DEBUG. See
> > > >>>>> the following for my reasoning behind this recommendation
> > > >>>>>
> > > >>>>> 4. I would change the "Type" entry for the "number-open-iterators"
> > > >>> from
> > > >>>>> "Value" to "Gauge". This helps justify the "INFO" level for this
> > > >>> metric,
> > > >>>>> since unlike the other metrics which are "Measurables", the current
> > > >>>>> timestamp won't need to be retrieved on each recording
> > > >>>>>
> > > >>>>> 5. Can you list the tags that would be associated with each of these
> > > >>>>> metrics (either in the table, or separately above/below if they will
> > > >>> be
> > > >>>>> the same for all)
> > > >>>>>
> > > >>>>> 6. Do you have a strong preference for the name
> > > >>> "number-open-iterators"
> > > >>>>> or would you be alright in shortening this to "num-open-iterators"?
> > > >>> The
> > > >>>>> latter is more in line with the naming scheme used elsewhere in 
> > > >>>>> Kafka
> > > >>>>> for similar kinds of metrics, and a shorter name is always nice.
> > > >>>>>
> > > >>>>> 7. With respect to the rocksdb cache metrics, those sound useful but
> > > >>>>> if it was me, I would probably save them for a separate KIP mainly
> > > >>> just
> > > >>>>> because the KIP freeze deadline is in a few weeks, and I wouldn't
> > > want
> > > >>>>> to end up blocking all the new metrics just because there was 
> > > >>>>> ongoing
> > > >>>>> debate about a subset of them. That said, you do have 3 full weeks,
> > > so
> > > >>>>> I would hope that you could get both sets of metrics agreed upon in
> > > >>>>> that timeframe!
> > > >>>>>
> > > >>>>>
> > > >>>>> On Tue, Oct 24, 2023 at 6:35 AM Nick Telford <[email protected]
> > > >
> > > >>>> wrote:
> > > >>>>>
> > > >>>>>> I don't really have a problem with adding such a metric, I'm just
> > > not
> > > >>>>>> entirely sure how it would work. If we used 
> > > >>>>>> "iterator-duration-max",
> > > >>> for
> > > >>>>>> example, would it not be confusing that it includes Iterators that
> > > >>> are
> > > >>>>>> still open, and therefore the duration is not yet known? When
> > > >>> graphing
> > > >>>> that
> > > >>>>>> over time, I suspect it would be difficult to understand.
> > > >>>>>>
> > > >>>>>> 3.
> > > >>>>>> FWIW, this would still be picked up by "open-iterators", since that
> > > >>>> metric
> > > >>>>>> is only decremented when Iterator#close is called (via the
> > > >>>>>> ManagedKeyValueIterator#onClose hook).
> > > >>>>>>
> > > >>>>>> I'm actually considering expanding the scope of this KIP slightly 
> > > >>>>>> to
> > > >>>>>> include improved Block Cache metrics, as my own memory leak
> > > >>>> investigations
> > > >>>>>> have trended in that direction. Do you think the following metrics
> > > >>>> should
> > > >>>>>> be included in this KIP, or should I create a new KIP?
> > > >>>>>>
> > > >>>>>>      - block-cache-index-usage (number of bytes occupied by index
> > > >>> blocks)
> > > >>>>>>      - block-cache-filter-usage (number of bytes occupied by filter
> > > >>>> blocks)
> > > >>>>>>
> > > >>>>>> Regards,
> > > >>>>>> Nick
> > > >>>>>>
> > > >>>>>> On Tue, 24 Oct 2023 at 07:09, Sophie Blee-Goldman <
> > > >>>> [email protected]>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> I actually think we could implement Lucas' suggestion pretty 
> > > >>>>>>> easily
> > > >>> and
> > > >>>>>>> without too much additional effort. We have full control over the
> > > >>>>>> iterator
> > > >>>>>>> that is returned by the various range queries, so it would be easy
> > > >>> to
> > > >>>>>>> register a gauge metric for how long it has been since the 
> > > >>>>>>> iterator
> > > >>> was
> > > >>>>>>> created. Then we just deregister the metric when the iterator is
> > > >>>> closed.
> > > >>>>>>>
> > > >>>>>>> With respect to how useful this metric would be, both Nick and
> > > Lucas
> > > >>>> have
> > > >>>>>>> made good points: I would agree that in general, leaking iterators
> > > >>>> would
> > > >>>>>>> mean an ever-increasing iterator count that should be possible to
> > > >>> spot
> > > >>>>>>> without this. However, a few things to consider:
> > > >>>>>>>
> > > >>>>>>> 1. it's really easy to set up an alert based on some maximum
> > > >>> threshold
> > > >>>> of
> > > >>>>>>> how long an iterator should remain open for. It's relatively more
> > > >>>> tricky
> > > >>>>>> to
> > > >>>>>>> set up alerts based on the current count of open iterators and how
> > > >>> it
> > > >>>>>>> changes over time.
> > > >>>>>>> 2. As Lucas mentioned, it only takes a few iterators to wreak 
> > > >>>>>>> havoc
> > > >>> in
> > > >>>>>>> extreme cases. Sometimes more advanced applications end up with
> > > >>> just a
> > > >>>>>> few
> > > >>>>>>> leaking iterators despite closing the majority of them. I've seen
> > > >>> this
> > > >>>>>>> happen just once personally, but it was driving everyone crazy
> > > >>> until we
> > > >>>>>>> figured it out.
> > > >>>>>>> 3. A metric for how long the iterator has been open would help to
> > > >>>>>> identify
> > > >>>>>>> hanging iterators due to some logic where the iterator is properly
> > > >>>> closed
> > > >>>>>>> but for whatever reason just isn't being advanced to the end, and
> > > >>> thus
> > > >>>>>> not
> > > >>>>>>> reached the iterator#close line of the user code. This case seems
> > > >>>>>> difficult
> > > >>>>>>> to spot without the specific metric for iterator lifetime
> > > >>>>>>> 4. Lastly, I think you could argue that all of the above are 
> > > >>>>>>> fairly
> > > >>>>>>> advanced use cases, but this seems like a fairly advanced feature
> > > >>>>>> already,
> > > >>>>>>> so it doesn't seem unreasonable to try and cover all the bases.
> > > >>>>>>>
> > > >>>>>>> All that said, my philosophy is that the KIP author gets the final
> > > >>> word
> > > >>>>>> on
> > > >>>>>>> what to pull into scope as long as the proposal isn't harming
> > > anyone
> > > >>>>>>> without the extra feature/changes. So it's up to you Nick --  just
> > > >>>> wanted
> > > >>>>>>> to add some context on how it could work, and why it would be
> > > >>> helpful
> > > >>>>>>>
> > > >>>>>>> Thanks for the KIP!
> > > >>>>>>>
> > > >>>>>>> On Wed, Oct 18, 2023 at 7:04 AM Lucas Brutschy
> > > >>>>>>> <[email protected]> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Nick,
> > > >>>>>>>>
> > > >>>>>>>> I did not think in detail about how to implement it, just about
> > > >>> what
> > > >>>>>>>> metrics would be nice to have. You are right, we'd have to
> > > >>>>>>>> register/deregister the iterators during open/close. This would 
> > > >>>>>>>> be
> > > >>>>>>>> more complicated to implement than the other metrics, but I do 
> > > >>>>>>>> not
> > > >>> see
> > > >>>>>>>> a fundamental problem with it.
> > > >>>>>>>>
> > > >>>>>>>> As far as I understand, even a low number of leaked iterators can
> > > >>> hurt
> > > >>>>>>>> RocksDB compaction significantly. So we may even want to detect 
> > > >>>>>>>> if
> > > >>> the
> > > >>>>>>>> iterators are opened by one-time / rare queries against the state
> > > >>>>>>>> store.
> > > >>>>>>>>
> > > >>>>>>>> But, as I said, that would be an addition and not a change of the
> > > >>>>>>>> current contents of the KIP, so I'd support the KIP moving 
> > > >>>>>>>> forward
> > > >>>>>>>> even without this extension.
> > > >>>>>>>>
> > > >>>>>>>> Cheers, Lucas
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Oct 17, 2023 at 3:45 PM Nick Telford <
> > > >>> [email protected]>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Hi Lucas,
> > > >>>>>>>>>
> > > >>>>>>>>> Hmm, I'm not sure how we could reliably identify such leaked
> > > >>>>>> Iterators.
> > > >>>>>>>> If
> > > >>>>>>>>> we tried to include open iterators when calculating
> > > >>>>>> iterator-duration,
> > > >>>>>>>> we'd
> > > >>>>>>>>> need some kind of registry of all the open iterator creation
> > > >>>>>>> timestamps,
> > > >>>>>>>>> wouldn't we?
> > > >>>>>>>>>
> > > >>>>>>>>> In general, if you have a leaky Iterator, it should manifest as 
> > > >>>>>>>>> a
> > > >>>>>>>>> persistently climbing "open-iterators" metric, even on a busy
> > > >>> node,
> > > >>>>>>>> because
> > > >>>>>>>>> each time that Iterator is used, it will leak another one. So
> > > >>> even in
> > > >>>>>>> the
> > > >>>>>>>>> presence of many non-leaky Iterators on a busy instance, the
> > > >>> metric
> > > >>>>>>>> should
> > > >>>>>>>>> still consistently climb.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>>
> > > >>>>>>>>> Nick
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, 16 Oct 2023 at 14:24, Lucas Brutschy <
> > > >>> [email protected]
> > > >>>>>>>> .invalid>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi Nick!
> > > >>>>>>>>>>
> > > >>>>>>>>>> thanks for the KIP! I think this could be quite useful, given
> > > the
> > > >>>>>>>>>> problems that we had with leaks due to RocksDB resources not
> > > >>> being
> > > >>>>>>>>>> closed.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I don't have any pressing issues why we can't accept it like it
> > > >>> is,
> > > >>>>>>>>>> just one minor point for discussion: would it also make sense 
> > > >>>>>>>>>> to
> > > >>>>>> make
> > > >>>>>>>>>> it possible to identify a few very long-running / leaked
> > > >>>>>> iterators? I
> > > >>>>>>>>>> can imagine on a busy node, it would be hard to spot that 1% of
> > > >>> the
> > > >>>>>>>>>> iterators never close when looking only at closed iterator or
> > > the
> > > >>>>>>>>>> number of iterators. But it could still be good to identify
> > > those
> > > >>>>>>>>>> leaks early. One option would be to add `iterator-duration-max`
> > > >>> and
> > > >>>>>>>>>> take open iterators into account when computing the metric.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Cheers,
> > > >>>>>>>>>> Lucas
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Thu, Oct 5, 2023 at 3:50 PM Nick Telford <
> > > >>>>>> [email protected]>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Hi Colt,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I kept the details out of the KIP, because KIPs generally
> > > >>>>>> document
> > > >>>>>>>>>>> high-level design, but the way I'm doing it is like this:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>           final ManagedKeyValueIterator<Bytes, byte[]>
> > > >>>>>>>>>>> rocksDbPrefixSeekIterator = cf.prefixScan(accessor,
> > > >>> prefixBytes);
> > > >>>>>>>>>>> -->     final long startedAt = System.nanoTime();
> > > >>>>>>>>>>>           openIterators.add(rocksDbPrefixSeekIterator);
> > > >>>>>>>>>>>           rocksDbPrefixSeekIterator.onClose(() -> {
> > > >>>>>>>>>>> -->
> > > >>>>>>>    metricsRecorder.recordIteratorDuration(System.nanoTime()
> > > >>>>>>>> -
> > > >>>>>>>>>>> startedAt);
> > > >>>>>>>>>>>               openIterators.remove(rocksDbPrefixSeekIterator);
> > > >>>>>>>>>>>           });
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The lines with the arrow are the new code. This pattern is
> > > >>>>>> repeated
> > > >>>>>>>>>>> throughout RocksDBStore, wherever a new RocksDbIterator is
> > > >>>>>> created.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards,
> > > >>>>>>>>>>> Nick
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Thu, 5 Oct 2023 at 12:32, Colt McNealy <[email protected]
> > > >
> > > >>>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Thank you for the KIP, Nick!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> This would be highly useful for many reasons. Much more sane
> > > >>>>>> than
> > > >>>>>>>>>> checking
> > > >>>>>>>>>>>> for leaked iterators by profiling memory usage while running
> > > >>>>>>> 100's
> > > >>>>>>>> of
> > > >>>>>>>>>>>> thousands of range scans via interactive queries (:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> One small question:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> The iterator-duration metrics will be updated whenever an
> > > >>>>>>>> Iterator's
> > > >>>>>>>>>>>> close() method is called
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Does the Iterator have its own "createdAt()" or equivalent
> > > >>>>>> field,
> > > >>>>>>>> or
> > > >>>>>>>>>> do we
> > > >>>>>>>>>>>> need to keep track of the Iterator's start time upon 
> > > >>>>>>>>>>>> creation?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Cheers,
> > > >>>>>>>>>>>> Colt McNealy
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> *Founder, LittleHorse.dev*
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Wed, Oct 4, 2023 at 9:07 AM Nick Telford <
> > > >>>>>>>> [email protected]>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> KIP-989 is a small Kafka Streams KIP to add a few new 
> > > >>>>>>>>>>>>> metrics
> > > >>>>>>>> around
> > > >>>>>>>>>> the
> > > >>>>>>>>>>>>> creation and use of RocksDB Iterators, to aid users in
> > > >>>>>>>> identifying
> > > >>>>>>>>>>>>> "Iterator leaks" that could cause applications to leak 
> > > >>>>>>>>>>>>> native
> > > >>>>>>>> memory.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Let me know what you think!
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-989%3A+RocksDB+Iterator+Metrics
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> P.S. I'm not too sure about the formatting of the "New
> > > >>>>>> Metrics"
> > > >>>>>>>>>> table,
> > > >>>>>>>>>>>> any
> > > >>>>>>>>>>>>> advice there would be appreciated.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Regards,
> > > >>>>>>>>>>>>> Nick
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > > >
> > >

Re: [DISCUSS] KIP-989: RocksDB Iterator Metrics

Reply via email to