Hey Mickael, Just checking to see if you have any thoughts on this. thanks!
On Thu, 11 Apr 2024 at 15:11, Elxan Eminov <elxanemino...@gmail.com> wrote: > Hi Mickael! > Any thoughts on this? > Thanks! > > On Wed, 3 Apr 2024 at 13:21, Elxan Eminov <elxanemino...@gmail.com> wrote: > >> Hi Mickael, >> Thanks for your response and apologies for a huge delay in mine. >> >> My thinking is that any partition could go stale if there are no records >> being produced into it. If enough of such partitions are present and are >> owned by a single MM task, an OOM could happen. >> >> Regarding the scenario where the TTL value is lower than the refresh >> interval - I believe that this is an edge that we need to document and >> prevent against, for example either failing to start on such a combination >> or resorting to a default value that would satisfy the constraint and >> logging an error. >> >> Thanks, >> Elkhan >> >> On Thu, 8 Feb 2024 at 14:17, Mickael Maison <mickael.mai...@gmail.com> >> wrote: >> >>> Hi, >>> >>> Thanks for the updates. >>> I'm wondering whether we really need the ttl eviction mechanism. The >>> motivation is to "avoid storing stale LRO entries which can cause an >>> eventual OOM error". How could it contain stake entries? I would >>> expect its cache to only contain entries for partitions assigned to >>> the task that owns it. Also what is the expected behavior if there's >>> no available LRO in the cache? If we keep this mechanism what happens >>> if its value is lower than >>> replication.record.lag.metric.refresh.interval? >>> >>> Thanks, >>> Mickael >>> >>> On Mon, Feb 5, 2024 at 5:23 PM Elxan Eminov <elxanemino...@gmail.com> >>> wrote: >>> > >>> > Hi Mickael! >>> > Any further thoughts on this? >>> > >>> > Thanks, >>> > Elkhan >>> > >>> > On Thu, 18 Jan 2024 at 11:53, Mickael Maison <mickael.mai...@gmail.com >>> > >>> > wrote: >>> > >>> > > Hi Elxan, >>> > > >>> > > Thanks for the updates. >>> > > >>> > > We used dots to separate words in configuration names, so I think >>> > > replication.offset.lag.metric.last-replicated-offset.ttl should be >>> > > named replication.offset.lag.metric.last.replicated.offset.ttl >>> > > instead. >>> > > >>> > > About the names of the metrics, fair enough if you prefer keeping the >>> > > replication prefix. Out of the alternatives you mentioned, I think I >>> > > prefer replication-record-lag. I think the metrics and configuration >>> > > names should match too. Let's see what the others think about it. >>> > > >>> > > Thanks, >>> > > Mickael >>> > > >>> > > On Mon, Jan 15, 2024 at 9:50 PM Elxan Eminov < >>> elxanemino...@gmail.com> >>> > > wrote: >>> > > > >>> > > > Apologies, forgot to reply on your last comment about the metric >>> name. >>> > > > I believe both replication-lag and record-lag are a little too >>> abstract - >>> > > > what do you think about either leaving it as >>> replication-offset-lag or >>> > > > renaming to replication-record-lag? >>> > > > >>> > > > Thanks >>> > > > >>> > > > On Wed, 10 Jan 2024 at 15:31, Mickael Maison < >>> mickael.mai...@gmail.com> >>> > > > wrote: >>> > > > >>> > > > > Hi Elxan, >>> > > > > >>> > > > > Thanks for the KIP, it looks like a useful addition. >>> > > > > >>> > > > > Can you add to the KIP the default value you propose for >>> > > > > replication.lag.metric.refresh.interval? In MirrorMaker most >>> interval >>> > > > > configs can be set to -1 to disable them, will it be the case >>> for this >>> > > > > new feature or will this setting only accept positive values? >>> > > > > I also wonder if replication-lag, or record-lag would be clearer >>> names >>> > > > > instead of replication-offset-lag, WDYT? >>> > > > > >>> > > > > Thanks, >>> > > > > Mickael >>> > > > > >>> > > > > On Wed, Jan 3, 2024 at 6:15 PM Elxan Eminov < >>> elxanemino...@gmail.com> >>> > > > > wrote: >>> > > > > > >>> > > > > > Hi all, >>> > > > > > Here is the vote thread: >>> > > > > > >>> https://lists.apache.org/thread/ftlnolcrh858dry89sjg06mdcdj9mrqv >>> > > > > > >>> > > > > > Cheers! >>> > > > > > >>> > > > > > On Wed, 27 Dec 2023 at 11:23, Elxan Eminov < >>> elxanemino...@gmail.com> >>> > > > > wrote: >>> > > > > > >>> > > > > > > Hi all, >>> > > > > > > I've updated the KIP with the details we discussed in this >>> thread. >>> > > > > > > I'll call in a vote after the holidays if everything looks >>> good. >>> > > > > > > Thanks! >>> > > > > > > >>> > > > > > > On Sat, 26 Aug 2023 at 15:49, Elxan Eminov < >>> > > elxanemino...@gmail.com> >>> > > > > > > wrote: >>> > > > > > > >>> > > > > > >> Relatively minor change with a new metric for MM2 >>> > > > > > >> >>> > > > > > >> >>> > > > > >>> > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-971%3A+Expose+replication-offset-lag+MirrorMaker2+metric >>> > > > > > >> >>> > > > > > > >>> > > > > >>> > > >>> >>