[GitHub] [kafka-site] satishd merged pull request #543: Added the new key to KEYS file

2023-09-16 Thread via GitHub


satishd merged PR #543:
URL: https://github.com/apache/kafka-site/pull/543


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [VOTE] KIP-953: partition method to be overloaded to accept headers as well.

2023-09-16 Thread Jack Tomy
Hey Ismael, Sagar and everyone,

Sorry, I seem to have misread the thread. Before we go ahead with the
DTO-based approach, I have a few reasons not to go with it.
a. It does not follow the pattern we use today. That said, I agree that
patterns can be changed for the better.
b. The client is not supposed to modify the record object (this is my
understanding; if it is not a hard requirement, please call it out), and
passing the entire object lets the client do that. To avoid that, there
has to be a way to deep copy the record object each time, which adds an
unnecessary requirement on the record object to support a deep-copy
implementation. I see a lot of complexity and coupling coming in here
because of this, and I believe it's a strong reason not to go ahead with
the DTO approach.
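
A minimal Java sketch of the concern in (b), using simplified stand-in types
rather than Kafka's real classes. The names `Record`, `ArgsPartitioner` and
`RecordPartitioner` are hypothetical and exist only for illustration. It
contrasts a partitioner that receives the whole record object with one that
receives individual arguments, where the caller can hand over a read-only view
of the headers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// All types here are simplified stand-ins, not Kafka's real classes.
class PartitionSignatureSketch {

    // Hypothetical record whose header list is mutable.
    static final class Record {
        final String topic;
        final String key;
        final List<String> headers = new ArrayList<>();

        Record(String topic, String key) {
            this.topic = topic;
            this.key = key;
        }
    }

    // Separate-argument style: the partitioner sees only what it is handed.
    interface ArgsPartitioner {
        int partition(String topic, String key, List<String> headers, int numPartitions);
    }

    // DTO style: the partitioner receives the whole record object.
    interface RecordPartitioner {
        int partition(Record record, int numPartitions);
    }

    // Returns how many headers are left after a careless DTO-style partitioner runs.
    static int headersLeftAfterDtoCall() {
        Record record = new Record("orders", "k1");
        record.headers.add("tenant=a");
        RecordPartitioner careless = (r, n) -> {
            r.headers.clear(); // mutates the caller's record
            return 0;
        };
        careless.partition(record, 3);
        return record.headers.size();
    }

    // Returns true if a read-only view blocks the same mutation and the
    // original header survives.
    static boolean readOnlyViewRejectsMutation() {
        Record record = new Record("orders", "k1");
        record.headers.add("tenant=a");
        ArgsPartitioner careless = (topic, key, headers, n) -> {
            headers.clear(); // throws on an unmodifiable view
            return 0;
        };
        try {
            careless.partition(record.topic, record.key,
                    Collections.unmodifiableList(record.headers), 3);
            return false;
        } catch (UnsupportedOperationException e) {
            return record.headers.size() == 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(headersLeftAfterDtoCall());      // prints 0
        System.out.println(readOnlyViewRejectsMutation());  // prints true
    }
}
```

This only shows where the mutation risk comes from. Whether the real record
type would need a deep copy, or could be protected some other way, is the
open question for the thread.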

Please let me know what you think.

Thanks.





On Wed, Sep 6, 2023 at 7:06 AM Sagar wrote:

> Hey Jack,
>
> The way I interpreted this thread, it seems like there's more alignment on
> the DTO based approach. I spent some time on the suggestion that Ismael had
> regarding the usage of ProducerRecord. Did you get a chance to look at the
> reply I had posted and whether that makes sense? Also, checking out the
> AdminClient APIs examples provided by Ismael will give you more context.
> Let me know what you think.
>
> Thanks!
> Sagar.
>
> On Thu, Aug 31, 2023 at 12:49 PM Jack Tomy wrote:
>
> > Hey everyone,
> >
> > As I see devs favouring the current style of implementation, which is
> > in line with the existing code, I would like to go ahead with the same
> > approach as mentioned in the KIP.
> > Can I get a few more votes so that I can take the KIP forward?
> >
> > Thanks
> >
> >
> >
> > On Sun, Aug 27, 2023 at 1:38 PM Sagar wrote:
> >
> > > Hi Ismael,
> > >
> > > Thanks for pointing us towards the direction of a DTO based approach.
> The
> > > AdminClient examples seem very neat and extensible in that sense.
> > > Personally, I was trying to think only along the lines of how the
> current
> > > Partitioner interface has been designed, i.e having all requisite
> > > parameters as separate arguments (Topic, Key, Value etc).
> > >
> > > Regarding this question of yours:
> > >
> > > > A more concrete question: did we consider having the method `partition`
> > > > take `ProducerRecord` as one of its parameters and `Cluster` as the
> > > > other?
> > >
> > >
> > > No, I don't think in the discussion thread it was brought up and as I
> > said
> > > it appears that could be due to an attempt to keep the new method's
> > > signature similar to the existing one within Partitioner. If I
> understood
> > > the intent of the question correctly, are you trying to hint here that
> > > `ProducerRecord` already contains all the arguments that the
> `partition`
> > > method accepts and also has a `headers` field within it. So, instead of
> > > adding another method for the `headers` field, why not create a new
> > method
> > > taking ProducerRecord directly?
> > >
> > > If my understanding is correct, then it seems like a very clean way of
> > > adding support for `headers`. Anyways, the partition method within
> > > KafkaProducer already takes a ProducerRecord as an argument so that
> makes
> > > it easier. Keeping that in mind, should this new method's (which takes
> a
> > > ProducerRecord as an input) default implementation invoke the existing
> > > method ? One challenge I see there is that the existing partition
> method
> > > expects serialized keys and values while ProducerRecord doesn't have
> > access
> > > to those (It directly operates on K, V).
> > >
> > > Thanks!
> > > Sagar.
> > >
> > >
> > > On Sun, Aug 27, 2023 at 8:51 AM Ismael Juma wrote:
> > >
> > > > A more concrete question: did we consider having the method
> > > > `partition` take `ProducerRecord` as one of its parameters and
> > > > `Cluster` as the other?
> > > >
> > > > Ismael
> > > >
> > > > On Sat, Aug 26, 2023 at 12:50 PM Greg Harris wrote:
> > > >
> > > > > Hey Ismael,
> > > > >
> > > > > > The mention of "runtime" is specific to Connect. When it comes to
> > > > > clients,
> > > > > one typically compiles and runs with the same version or runs with
> a
> > > > newer
> > > > > version than the one used for compilation. This is standard
> practice
> > in
> > > > > Java and not something specific to Kafka.
> > > > >
> > > > > When I said "older runtimes" I was being lazy, and should have said
> > > > > "older versions of clients at runtime," thank you for figuring out
> > > > > what I meant.
> > > > >
> > > > > I don't know how common it is to compile a partitioner against one
> > > > > version of clients, and then distribute and run the partitioner
> with
> > > > > older versions of clients and expect graceful degradation of
> > features.
> > > > > If you say that it is very uncommon and not something that we
> should
> > > > > optimize for, then I won't suggest otherwise.
> > > > >
> > > > > > With regards to the Admin APIs, they have been e

[GitHub] [kafka-site] satishd opened a new pull request, #543: Added the new key to KEYS file

2023-09-16 Thread via GitHub


satishd opened a new pull request, #543:
URL: https://github.com/apache/kafka-site/pull/543

   (no comment)





Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2209

2023-09-16 Thread Apache Jenkins Server




Re: Re: Re: Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-09-16 Thread Elxan Eminov
Hi Viktor and hudeqi,
Apologies for the late reply - yes, I believe a periodic update of the LEO
is the better approach here.
I will update the KIP accordingly and ping back in this thread once done.

Thanks a lot for the input!
Best, Elkhan
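
A rough Java sketch of the periodic-update idea discussed in this thread,
under stated assumptions. `OffsetLagSketch` and its `endOffsetFetcher` are
hypothetical stand-ins, not MirrorMaker 2 code, and partitions are plain
strings. Lag is computed as the cached end offset (LEO) minus the last
replicated offset, following the wording used below in the thread; the exact
off-by-one convention is left open.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper, not MirrorMaker 2 code.
class OffsetLagSketch {
    // Offset of the last record replicated, per source partition.
    private final Map<String, Long> lastReplicatedSourceOffsets = new ConcurrentHashMap<>();
    // End offsets (LEO) as of the most recent refresh.
    private final Map<String, Long> cachedEndOffsets = new ConcurrentHashMap<>();
    // Stand-in for whatever call fetches end offsets from the source cluster.
    private final Function<Set<String>, Map<String, Long>> endOffsetFetcher;
    private ScheduledExecutorService scheduler;

    OffsetLagSketch(Function<Set<String>, Map<String, Long>> endOffsetFetcher) {
        this.endOffsetFetcher = endOffsetFetcher;
    }

    // Called from the replication path after a record has been written.
    void recordReplicated(String partition, long offset) {
        lastReplicatedSourceOffsets.put(partition, offset);
    }

    // One refresh: fetch end offsets for every tracked partition in one call.
    void refreshEndOffsets() {
        Set<String> tracked = new HashSet<>(lastReplicatedSourceOffsets.keySet());
        cachedEndOffsets.putAll(endOffsetFetcher.apply(tracked));
    }

    // Runs the refresh on a background thread instead of on every poll.
    void start(long periodSeconds) {
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::refreshEndOffsets, 0, periodSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
        }
    }

    // Returns -1 when no end offset has been fetched yet for the partition.
    long lag(String partition) {
        Long endOffset = cachedEndOffsets.get(partition);
        Long lastReplicated = lastReplicatedSourceOffsets.get(partition);
        if (endOffset == null || lastReplicated == null) {
            return -1L;
        }
        // The cached end offset can be stale, so clamp at zero.
        return Math.max(0L, endOffset - lastReplicated);
    }

    public static void main(String[] args) {
        OffsetLagSketch sketch = new OffsetLagSketch(partitions -> {
            Map<String, Long> fake = new ConcurrentHashMap<>();
            for (String p : partitions) {
                fake.put(p, 100L); // pretend every partition ends at offset 100
            }
            return fake;
        });
        sketch.recordReplicated("topic-0", 90L);
        sketch.refreshEndOffsets(); // normally driven by start(periodSeconds)
        System.out.println(sketch.lag("topic-0")); // prints 10
    }
}
```

The refresh period passed to `start` is the knob discussed below: a shorter
period gives a fresher lag value at the cost of more requests to the source
cluster.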


On Wed, 13 Sept 2023 at 15:34, Viktor Somogyi-Vass wrote:

> Elkhan, do you think making yours similar would make sense?
>
> On Wed, Sep 6, 2023 at 4:12 AM hudeqi <16120...@bjtu.edu.cn> wrote:
>
> > Hey, Viktor.
> > As far as my implementation is concerned, the default setting is 30s, but
> > I added it to `MirrorConnectorConfig`, which can be adjusted freely
> > according to the load of the source cluster and the number of tasks.
> >
> > best,
> > hudeqi
> >
> > "Viktor Somogyi-Vass" wrote:
> > > Hey Elkhan and hudeqi,
> > >
> > > I'm reading your debate around the implementation. I also think a
> > > scheduled task would be better in overall accuracy and performance
> > > (compared to calling endOffsets with every poll).
> > > Hudeqi, do you have any experience of what works best for you in terms
> of
> > > time intervals? I would think refreshing the metric every 5-10sec would
> > be
> > > overall good and sufficient for the users (as short intervals can be
> > quite
> > > noisy anyways).
> > >
> > > Best,
> > > Viktor
> > >
> > > On Mon, Sep 4, 2023 at 11:41 AM hudeqi <16120...@bjtu.edu.cn> wrote:
> > >
> > > > My approach is to create another thread to regularly request and
> update
> > > > the end offset of each partition for the `keySet` in the collection
> > > > `lastReplicatedSourceOffsets` mentioned by your kip (if there is no
> > update
> > > > for a long time, it will be removed from
> > `lastReplicatedSourceOffsets`).
> > > > Obviously, such processing makes the calculation of the partition
> > offset
> > > > lag less real-time and accurate.
> > > > But this also meets our needs, because we need the partition offset
> > lag to
> > > > analyze the replication performance of the task and which task may
> have
> > > > performance problems; and if you monitor the overall offset lag of
> the
> > > > topic, then using the
> > > > "kafka_consumer_consumer_fetch_manager_metrics_records_lag" metric
> > will be
> > > > more real-time and accurate.
> > > > This is just my suggestion, offered as a starting point in the hope
> > > > that we can come up with a better solution.
> > > >
> > > > best,
> > > > hudeqi
> > > >
> > > > "Elxan Eminov" wrote:
> > > > > @hudeqi replying to your comment on the PR (
> > > > > https://github.com/apache/kafka/pull/14077#discussion_r1314592488
> ),
> > > > quote:
> > > > >
> > > > > "I guess we have a disagreement about lag? My understanding of lag
> > is:
> > > > the
> > > > > real LEO of the source cluster partition minus the LEO that has
> been
> > > > > written to the target cluster. It seems that your definition of lag
> > is:
> > > > the
> > > > > lag between the mirror task getting data from consumption and
> > writing it
> > > > to
> > > > > the target cluster?"
> > > > >
> > > > > Yes, this is the case. I've missed the fact that the consumer
> itself
> > > > might
> > > > > be lagging behind the actual data in the partition.
> > > > > I believe your definition of the lag is more precise, but:
> > > > > Implementing it this way will come at the cost of an extra
> > listOffsets
> > > > > request, introducing the overhead that you mentioned in your
> initial
> > > > > comment.
> > > > >
> > > > > If you have enough insight into this, what would you say are the
> > > > > chances of the task consumer lagging behind the LEO of the partition?
> > > > > Are they big enough to justify the extra call to listOffsets?
> > > > > @Viktor,  any thoughts?
> > > > >
> > > > > Thanks,
> > > > > Elkhan
> > > > >
> > > > > On Mon, 4 Sept 2023 at 09:36, Elxan Eminov <
> elxanemino...@gmail.com>
> > > > wrote:
> > > > >
> > > > > > I already have the PR for this so if it will make it easier to
> > discuss,
> > > > > > feel free to take a look:
> > https://github.com/apache/kafka/pull/14077
> > > > > >
> > > > > > On Mon, 4 Sept 2023 at 09:17, hudeqi <16120...@bjtu.edu.cn>
> wrote:
> > > > > >
> > > > > >> But does the offset of the last `ConsumerRecord` obtained in
> poll
> > not
> > > > > >> only represent the offset of this record in the source cluster?
> It
> > > > seems
> > > > > >> that it cannot represent the LEO of the source cluster for this
> > > > partition.
> > > > > >> I understand that the offset lag introduced here should be the
> > LEO of
> > > > the
> > > > > >> source cluster minus the offset of the last record to be polled?
> > > > > >>
> > > > > >> best,
> > > > > >> hudeqi
> > > > > >>
> > > > > >>
> > > > > >> > -Original Message-
> > > > > >> > From: "Elxan Eminov"
> > > > > >> > Sent: 2023-09-04 14:52:08 (Monday)
> > > > > >> > To: dev@kafka.apache.org
> > > > > >> > Cc:
> > > > > >> > Subject: Re: [DISCUSS] KIP-971 Expose replication-offset-lag
> > > > MirrorMake