Hi Lucas,

Hmm, I'm not sure how we could reliably identify such leaked Iterators. If
we tried to include open iterators when calculating iterator-duration, we'd
need some kind of registry of all the open iterator creation timestamps,
wouldn't we?

In general, if you have a leaky Iterator, it should manifest as a
persistently climbing "open-iterators" metric, even on a busy node, because
each time that Iterator is used, it will leak another one. So even in the
presence of many non-leaky Iterators on a busy instance, the metric should
still consistently climb.

Regards,

Nick

On Mon, 16 Oct 2023 at 14:24, Lucas Brutschy <lbruts...@confluent.io.invalid>
wrote:

> Hi Nick!
>
> thanks for the KIP! I think this could be quite useful, given the
> problems that we had with leaks due to RocksDB resources not being
> closed.
>
> I don't have any pressing issues why we can't accept it like it is,
> just one minor point for discussion: would it also make sense to make
> it possible to identify a few very long-running / leaked iterators? I
> can imagine on a busy node, it would be hard to spot that 1% of the
> iterators never close when looking only at closed iterator or the
> number of iterators. But it could still be good to identify those
> leaks early. One option would be to add `iterator-duration-max` and
> take open iterators into account when computing the metric.
>
> Cheers,
> Lucas
>
>
> On Thu, Oct 5, 2023 at 3:50 PM Nick Telford <nick.telf...@gmail.com>
> wrote:
> >
> > Hi Colt,
> >
> > I kept the details out of the KIP, because KIPs generally document
> > high-level design, but the way I'm doing it is like this:
> >
> >         final ManagedKeyValueIterator<Bytes, byte[]>
> > rocksDbPrefixSeekIterator = cf.prefixScan(accessor, prefixBytes);
> > -->     final long startedAt = System.nanoTime();
> >         openIterators.add(rocksDbPrefixSeekIterator);
> >         rocksDbPrefixSeekIterator.onClose(() -> {
> > -->         metricsRecorder.recordIteratorDuration(System.nanoTime() -
> > startedAt);
> >             openIterators.remove(rocksDbPrefixSeekIterator);
> >         });
> >
> > The lines with the arrow are the new code. This pattern is repeated
> > throughout RocksDBStore, wherever a new RocksDbIterator is created.
> >
> > Regards,
> > Nick
> >
> > On Thu, 5 Oct 2023 at 12:32, Colt McNealy <c...@littlehorse.io> wrote:
> >
> > > Thank you for the KIP, Nick!
> > >
> > > This would be highly useful for many reasons. Much more sane than
> checking
> > > for leaked iterators by profiling memory usage while running 100's of
> > > thousands of range scans via interactive queries (:
> > >
> > > One small question:
> > >
> > > >The iterator-duration metrics will be updated whenever an Iterator's
> > > close() method is called
> > >
> > > Does the Iterator have its own "createdAt()" or equivalent field, or
> do we
> > > need to keep track of the Iterator's start time upon creation?
> > >
> > > Cheers,
> > > Colt McNealy
> > >
> > > *Founder, LittleHorse.dev*
> > >
> > >
> > > On Wed, Oct 4, 2023 at 9:07 AM Nick Telford <nick.telf...@gmail.com>
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > KIP-989 is a small Kafka Streams KIP to add a few new metrics around
> the
> > > > creation and use of RocksDB Iterators, to aid users in identifying
> > > > "Iterator leaks" that could cause applications to leak native memory.
> > > >
> > > > Let me know what you think!
> > > >
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-989%3A+RocksDB+Iterator+Metrics
> > > >
> > > > P.S. I'm not too sure about the formatting of the "New Metrics"
> table,
> > > any
> > > > advice there would be appreciated.
> > > >
> > > > Regards,
> > > > Nick
> > > >
> > >
>

Reply via email to