Hi James and all,

I checked again, and I can see that when creating a UnifiedLog, we expect the logs/indexes/snapshots to be in a good state. So, I don't think we should break the current design to expose the `RemainingBytesToRecovery` metric.
If there are no other comments, I'll start a vote within this week.

Thank you.
Luke

On Fri, May 6, 2022 at 6:00 PM Luke Chen <show...@gmail.com> wrote:

> Hi James,
>
> Thanks for your input.
>
> For the `RemainingBytesToRecovery` metric proposal, I think there's one
> thing I didn't make clear.
> Currently, when the log manager starts up, we try to load all logs
> (segments), and during log loading, we recover logs if necessary.
> The log loading does use a thread pool, as you thought.
>
> So, here's the problem:
> The segments in each log folder (partition) are loaded by a single log
> recovery thread, and only after they are loaded can we know how many
> segments (or how many bytes) need to be recovered.
> That means, if we have 10 partition logs on one broker and 2 log
> recovery threads (num.recovery.threads.per.data.dir=2), then before the
> threads load the segments in each log, we only know how many logs
> (partitions) we have on the broker (i.e., the RemainingLogsToRecover
> metric). We cannot know how many segments/bytes need to be recovered
> until a thread starts to load the segments under one log (partition).
>
> So, the example in the KIP shows:
> Currently, there are still 5 logs (partitions) to recover under the
> /tmp/log1 dir, and there are 2 threads doing the job, where one thread
> has 10000 segments to recover and the other has 3 segments to recover.
>
> - kafka.log
>   - LogManager
>     - RemainingLogsToRecover
>       - /tmp/log1 => 5    ← there are 5 logs under /tmp/log1 to be recovered
>       - /tmp/log2 => 0
>     - RemainingSegmentsToRecover
>       - /tmp/log1         ← 2 threads are doing log recovery for /tmp/log1
>         - 0 => 10000      ← there are 10000 segments to be recovered by thread 0
>         - 1 => 3
>       - /tmp/log2
>         - 0 => 0
>         - 1 => 0
>
> So, after a while, the metrics might look like this:
> This says that now there are only 3 logs left to recover in /tmp/log1,
> thread 0 has 9000 segments left, and thread 1 has 5 segments left (which
> should imply that the threads already completed 2 logs' recovery in that
> period):
>
> - kafka.log
>   - LogManager
>     - RemainingLogsToRecover
>       - /tmp/log1 => 3    ← there are 3 logs under /tmp/log1 to be recovered
>       - /tmp/log2 => 0
>     - RemainingSegmentsToRecover
>       - /tmp/log1         ← 2 threads are doing log recovery for /tmp/log1
>         - 0 => 9000       ← there are 9000 segments to be recovered by thread 0
>         - 1 => 5
>       - /tmp/log2
>         - 0 => 0
>         - 1 => 0
>
> That said, the `RemainingBytesToRecovery` metric is difficult to achieve
> as you expected. I think the current proposal with RemainingLogsToRecover
> and RemainingSegmentsToRecover should already provide enough info about
> the log recovery progress.
>
> I've also updated the KIP example to make it clear.
>
> Thank you.
> Luke
>
> On Thu, May 5, 2022 at 3:31 AM James Cheng <wushuja...@gmail.com> wrote:
>
>> Hi Luke,
>>
>> Thanks for adding RemainingSegmentsToRecovery.
>>
>> Another thought: different topics can have different segment sizes. I
>> don't know how common it is, but it is possible. Some topics might want
>> small segment sizes for more granular expiration of data.
>>
>> The downside of RemainingLogsToRecovery and RemainingSegmentsToRecovery
>> is that the rate at which they decrement depends on the configuration
>> and patterns of the topics, partitions, and segment sizes.
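The per-thread accounting behind the `RemainingSegmentsToRecover` example above can be sketched in a small simulation. This is illustrative only, not Kafka code: the thread/gauge structure and names are assumptions modeled on the KIP example, where each recovery thread pulls whole logs from a shared set and learns the segment count only once it opens a log.

```python
# Minimal simulation of per-thread RemainingSegmentsToRecover gauges.
# Illustrative only -- not the actual Kafka LogManager implementation.
from queue import Queue, Empty
from threading import Thread

def recover(thread_id, logs, gauges):
    """Each recovery thread takes whole logs (partitions) from a shared
    queue; the segment count is known only once the log is opened."""
    while True:
        try:
            partition, num_segments = logs.get_nowait()
        except Empty:
            return
        gauges[thread_id] = num_segments   # segment count known only now
        for _ in range(num_segments):      # "recover" one segment at a time
            gauges[thread_id] -= 1
        logs.task_done()

logs = Queue()
for partition, segments in [("t-0", 10000), ("t-1", 3), ("t-2", 7)]:
    logs.put((partition, segments))

gauges = {0: 0, 1: 0}                      # num.recovery.threads.per.data.dir=2
threads = [Thread(target=recover, args=(i, logs, gauges)) for i in gauges]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(gauges)  # both per-thread gauges drop back to 0 once recovery completes
```

Until a thread dequeues a log, its gauge says nothing about that log's size, which is the constraint Luke describes above.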
>> If someone is monitoring those metrics, they might see times when the
>> metric decrements slowly, followed by a burst where it decrements
>> quickly.
>>
>> What about RemainingBytesToRecovery? This would not depend on the
>> configuration of the topic or of the data. It would actually be a pretty
>> good metric, because I think this metric would change at a constant rate
>> (based on the disk I/O speed that the broker allocates to recovery).
>> Because it changes at a constant rate, you would be able to use the
>> rate of change to predict when it hits zero, which would let you know
>> when the broker is going to start up. I would imagine that if we graphed
>> RemainingBytesToRecovery, we'd see a fairly straight line decrementing
>> at a steady rate towards zero.
>>
>> What do you think about adding RemainingBytesToRecovery?
>>
>> Or, what would you think about making the primary metric
>> RemainingBytesToRecovery and getting rid of the others?
>>
>> I don't know if I personally would rather have all 3 metrics or would
>> just use RemainingBytesToRecovery. I too would like more community input
>> on which of those metrics would be useful to people.
>>
>> About the JMX metrics, you said that if
>> num.recovery.threads.per.data.dir=2, there might be a separate
>> RemainingSegmentsToRecovery counter for each thread. Is that actually
>> how the data is structured within the Kafka recovery threads? Does each
>> thread get a fixed set of partitions, or is there just one big pool of
>> partitions that the threads all work on?
>>
>> As a more concrete example:
>> * If I have 9 small partitions and 1 big partition, and
>> num.recovery.threads.per.data.dir=2:
>> Does each thread get 5 partitions, which means one thread will finish
>> much sooner than the other?
>> OR
>> Do both threads just work on the set of 10 partitions, which means
>> likely 1 thread will be busy with the big partition while the other one
>> ends up plowing through the 9 small partitions?
>>
>> If each thread gets assigned 5 partitions, then it would make sense
>> that each thread has its own counter.
>> If the threads work on a single pool of 10 partitions, then it would
>> probably mean that the counter is on the pool of partitions itself, and
>> not on each thread.
>>
>> -James
>>
>> > On May 4, 2022, at 5:55 AM, Luke Chen <show...@gmail.com> wrote:
>> >
>> > Hi devs,
>> >
>> > If there are no other comments, I'll start a vote tomorrow.
>> >
>> > Thank you.
>> > Luke
>> >
>> > On Sun, May 1, 2022 at 5:08 PM Luke Chen <show...@gmail.com> wrote:
>> >
>> >> Hi James,
>> >>
>> >> Sorry for the late reply.
>> >>
>> >> Yes, this is a good point: to know how many segments need to be
>> >> recovered when there are some large partitions.
>> >> I've updated the KIP to add a `RemainingSegmentsToRecover` metric for
>> >> each log recovery thread, to show the value.
>> >> The example in the Proposed Changes section here
>> >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress#KIP831:Addmetricforlogrecoveryprogress-ProposedChanges>
>> >> shows what it will look like.
>> >>
>> >> Thanks for the suggestion.
>> >>
>> >> Thank you.
>> >> Luke
>> >>
>> >> On Sat, Apr 23, 2022 at 8:54 AM James Cheng <wushuja...@gmail.com> wrote:
>> >>
>> >>> The KIP describes RemainingLogsToRecovery, which seems to be the
>> >>> number of partitions in each log.dir.
>> >>>
>> >>> We have some partitions which are much, much larger than others.
>> >>> Those large partitions have many, many more segments than others.
>> >>>
>> >>> Is there a way the metric can reflect partition size? Could it be
>> >>> RemainingSegmentsToRecover? Or even RemainingBytesToRecover?
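James's earlier argument for a bytes-based metric is that, if it falls at a roughly constant rate, two samples are enough to project when recovery finishes. A minimal sketch of that arithmetic (the metric name comes from the proposal; the sampling function itself is a hypothetical helper, not a Kafka API):

```python
# Project the time remaining from two samples of a steadily decreasing
# RemainingBytesToRecovery metric. Illustrative helper, not a Kafka API.
def eta_seconds(sample1, sample2):
    """Each sample is (timestamp_sec, remaining_bytes)."""
    t1, b1 = sample1
    t2, b2 = sample2
    rate = (b1 - b2) / (t2 - t1)   # bytes recovered per second
    if rate <= 0:
        return None                # no forward progress observed yet
    return b2 / rate

# e.g. samples 60 s apart: 90 GB remaining, then 84 GB remaining
print(eta_seconds((0, 90e9), (60, 84e9)))  # 840.0 seconds, i.e. ~14 minutes
```

This prediction only works if the metric really does decrease linearly; per-log or per-segment counters decrement in the uneven bursts James describes, which is exactly why he prefers bytes.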
>> >>>
>> >>> -James
>> >>>
>> >>> Sent from my iPhone
>> >>>
>> >>>> On Apr 20, 2022, at 2:01 AM, Luke Chen <show...@gmail.com> wrote:
>> >>>>
>> >>>> Hi all,
>> >>>>
>> >>>> I'd like to propose a KIP to expose a metric for log recovery progress.
>> >>>> This metric would let the admins have a way to monitor the log recovery
>> >>>> progress.
>> >>>> Details can be found here:
>> >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress
>> >>>>
>> >>>> Any feedback is appreciated.
>> >>>>
>> >>>> Thank you.
>> >>>> Luke
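James's 9-small-plus-1-big scenario can be contrasted in a small model. This is a sketch under stated assumptions, not how Kafka's LogManager is actually implemented: "fixed split" pre-assigns logs round-robin, while "shared pool" lets whichever thread is free take the next log; recovery cost is measured in abstract segment units.

```python
# Contrast the two scheduling strategies from the thread: fixed per-thread
# assignment vs. one shared pool of logs. Model only, not Kafka code.
import heapq

logs = [100] + [1] * 9   # recovery cost per log: 1 big partition, 9 small

def fixed_split(logs, num_threads=2):
    """Pre-assign logs round-robin; total time is the slowest thread."""
    work = [0] * num_threads
    for i, cost in enumerate(logs):
        work[i % num_threads] += cost
    return max(work)

def shared_pool(logs, num_threads=2):
    """The next free thread takes the next log (greedy list scheduling)."""
    finish = [0] * num_threads        # each thread's current finish time
    heapq.heapify(finish)
    for cost in logs:
        t = heapq.heappop(finish)     # earliest-free thread takes the log
        heapq.heappush(finish, t + cost)
    return max(finish)

print(fixed_split(logs))  # 104: one thread gets the big log plus 4 small ones
print(shared_pool(logs))  # 100: the other thread absorbs all 9 small logs
```

With a fixed split, one thread idles after finishing its 5 small logs, which also means a per-thread counter makes sense; with a shared pool, both threads stay busy and the remaining work is more naturally counted on the pool itself, matching James's observation.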