Re: [DISCUSS] Apache Kafka 3.1.0 release

2021-12-07 Thread David Jacot
Hi Justine,

I am +1 on getting this in the 3.1 release as it is a serious
regression in clusters with a high number of partitions.

Thanks,
David

On Tue, Dec 7, 2021 at 10:39 PM Justine Olshan
 wrote:
>
> Hi all,
> I've filed a bug for an extra map allocation that is used in the fetch
> path. https://issues.apache.org/jira/browse/KAFKA-13512
> I think it qualifies as a blocker since this path is used pretty frequently
> and it looks to be a regression.
>
> I also have a PR open to fix the issue. With this change, the performance
> looks much better. https://github.com/apache/kafka/pull/11576
> Thanks,
> Justine
>
> On Fri, Dec 3, 2021 at 5:29 AM David Jacot 
> wrote:
>
> > Hi Rajini,
> >
> > Interesting bug. The patch seems to be low risk so I suppose that
> > it is fine to keep it in 3.1.0.
> >
> > Thanks,
> > David
> >
> > On Fri, Dec 3, 2021 at 2:26 PM David Jacot  wrote:
> > >
> > > Hi Colin,
> > >
> > > Thanks for the heads up. It makes sense to include it in order
> > > to keep KRaft in line with ZK behavior.
> > >
> > > Thanks,
> > > David
> > >
> > > On Fri, Dec 3, 2021 at 9:44 AM Rajini Sivaram 
> > wrote:
> > > >
> > > > Hi David,
> > > >
> > > > Sorry, I had completely forgotten about code freeze and merged
> > > > https://issues.apache.org/jira/browse/KAFKA-13461 to 3.1 branch
> > yesterday.
> > > > Can you take a look and see if we want it in 3.1.0? It is not a
> > regression
> > > > in 3.1, but we see this issue in tests and when it happens, the
> > controller
> > > > no longer operates as a controller.
> > > >
> > > > Thank you,
> > > >
> > > > Rajini
> > > >
> > > > On Thu, Dec 2, 2021 at 10:56 PM Colin McCabe 
> > wrote:
> > > >
> > > > > Hi David,
> > > > >
> > > > > We'd like to include "KAFKA-13490: Fix createTopics and
> > > > > incrementalAlterConfigs for KRaft mode #11416" in the upcoming
> > release.
> > > > > This fixes some bugs in how createTopics and incrementalAlterConfigs
> > are
> > > > > handled by the controller. It is specific to KRaft, so will not
> > affect ZK
> > > > > mode.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > On Wed, Nov 24, 2021, at 01:20, David Jacot wrote:
> > > > > > Hi Mickael,
> > > > > >
> > > > > > Thanks for reporting it. It makes sense to include it in the 3.1
> > release
> > > > > > as well, since it is a regression.
> > > > > >
> > > > > > Thanks,
> > > > > > David
> > > > > >
> > > > > > On Tue, Nov 23, 2021 at 6:52 PM Mickael Maison <
> > mickael.mai...@gmail.com>
> > > > > wrote:
> > > > > >>
> > > > > >> Hi David,
> > > > > >>
> > > > > >> Can we also consider
> > https://issues.apache.org/jira/browse/KAFKA-13397?
> > > > > >> It's essentially a regression but in a very specific case. To hit
> > it,
> > > > > >> you must be running MirrorMaker in dedicated mode and have
> > changed the
> > > > > >> separator of the default replication policy.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Mickael
> > > > > >>
> > > > > >> On Tue, Nov 23, 2021 at 4:58 PM David Jacot
> > 
> > > > > wrote:
> > > > > >> >
> > > > > >> > Hi Ron,
> > > > > >> >
> > > > > >> > Thank you for reaching out about this. While this is clearly
> > not a
> > > > > >> > regression, I agree with including it in 3.1 in order to have
> > proper
> > > > > >> > and correct configuration constraints for KRaft. You can
> > proceed.
> > > > > >> >
> > > > > >> > Cheers,
> > > > > >> > David
> > > > > >> >
> > > > > >> > On Tue, Nov 23, 2021 at 2:55 PM Ron Dagostino <
> > rndg...@gmail.com>
> > > > > wrote:
> > > > > >> > >
> > > > > >> > > Hi David.  I would like to nominate
> > > > > >> > >
> > https://issues.apache.org/jira/projects/KAFKA/issues/KAFKA-13456
> > > > > >> > > "Tighten KRaft config checks/constraints" as a 3.1.0
> > blocker.  The
> > > > > >> > > existing configuration constraints/checks related to KRaft
> > currently
> > > > > >> > > do not eliminate certain illegal configuration combinations.
> > The
> > > > > >> > > illegal combinations do not cause harm at the moment, but we
> > would
> > > > > >> > > like to implement constraints in 3.1.0 to catch them while
> > KRaft is
> > > > > >> > > still in Preview.  We could add these additional checks later
> > in
> > > > > 3.2.x
> > > > > >> > > instead, but we would like to add these as early as possible:
> > we
> > > > > >> > > expect more people to begin trying KRaft with each subsequent
> > > > > release,
> > > > > >> > > and it would be best to eliminate as quickly as we can the
> > > > > possibility
> > > > > >> > > of people using configurations that would need fixing later.
> > > > > >> > >
> > > > > >> > > A patch is available at
> > https://github.com/apache/kafka/pull/11503/
> > > > > .
> > > > > >> > >
> > > > > >> > > Ron
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Tue, Nov 23, 2021 at 3:19 AM David Jacot
> > > > >  wrote:
> > > > > >> > > >
> > > > > >> > > > Hi Chris,
> > > > > >> > > >
> > > > > >> > > > Thanks for reporting both issues. As both are regressions,
> > I do
> > > 

[jira] [Created] (KAFKA-13517) Add ConfigurationKeys to ConfigResource class

2021-12-07 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-13517:
---

 Summary: Add ConfigurationKeys to ConfigResource class
 Key: KAFKA-13517
 URL: https://issues.apache.org/jira/browse/KAFKA-13517
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 3.0.0, 2.8.1
Reporter: Vikas Singh
Assignee: Vikas Singh
 Fix For: 2.8.1


A list of {{ConfigResource}} objects is passed as an argument to the 
{{AdminClient::describeConfigs}} API to indicate the entities whose 
configuration should be fetched. The {{ConfigResource}} class is made up of two 
fields: the name and the type of the entity. Kafka returns *all* configurations 
for the entities provided to the admin client API.

This admin API in turn uses the {{DescribeConfigsRequest}} Kafka API to get the 
configuration for the entities in question. In addition to the name and type of 
the entity whose configuration to get, the Kafka {{DescribeConfigsResource}} 
structure also lets users provide a {{ConfigurationKeys}} list, which allows 
them to fetch only the configurations that are needed.

However, this field isn't exposed in the {{ConfigResource}} class that is used 
by AdminClient, so users of AdminClient have no way to ask for specific 
configurations. The API always returns *all* configurations. Users of 
{{AdminClient::describeConfigs}} must then go over the returned list and filter 
out the config keys they are interested in.

This results in boilerplate code for all users of the 
{{AdminClient::describeConfigs}} API, in addition to being a wasteful use of 
resources. It becomes painful in large clusters, where fetching one 
configuration for all topics requires fetching all configurations of all 
topics, which can be huge in size.

Creating this Jira to add the same field (i.e. {{ConfigurationKeys}}) to the 
{{ConfigResource}} structure, bringing it to parity with the 
{{DescribeConfigsResource}} Kafka API structure. There should be no backward 
compatibility issue, as the field will be optional and the API will behave the 
same way when it is not specified (i.e. null will be passed to the backend 
Kafka API).
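
For illustration, a minimal sketch of the client-side filtering that is 
required today (the topic name and config key are arbitrary examples):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class DescribeOneConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // The broker returns every configuration for the topic...
            Config config = admin.describeConfigs(Collections.singleton(topic))
                    .all().get().get(topic);
            // ...and the caller must filter for the single key of interest.
            ConfigEntry retentionMs = config.get("retention.ms");
            System.out.println(retentionMs == null ? "not set" : retentionMs.value());
        }
    }
}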



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

2021-12-07 Thread Luke Chen
Hi Colin,

I'm not quite sure I understand your thoughts correctly.
If I'm wrong, please let me know.

Also, I'm not quite sure how I could lock this feature to a new IBP
version.
I saw that "KIP-584: Versioning scheme for features" is still under development.
If I do need to lock this to an IBP version, how should I do that?

Thank you.
Luke

On Tue, Dec 7, 2021 at 9:41 PM Luke Chen  wrote:

> Hi Colin,
>
> Thanks for your comments. I've updated the KIP to mention that it
> won't affect current broker-side behavior.
>
> > One scenario that we need to consider is what happens during a rolling
> upgrade. If the coordinator moves back and forth between brokers with
> different IBPs, it seems that the same epoch numbers could be reused for a
> group, if things are done in the obvious manner (old IBP = don't read or
> write epoch, new IBP = do)
>
> I think this KIP doesn't care about the group epoch number at all. The
> subscription metadata is passed from each member to the group coordinator,
> and then the group coordinator passes all of it back to the consumer lead.
> So even if the epoch number is reused in a group, it should be fine. On the
> other hand, the group coordinator will have no idea whether the join group
> request sent from a consumer contains the new subscription "generation"
> field, because the group coordinator won't deserialize the metadata.
>
> I've also added this to the KIP.
>
> Thank you.
> Luke
>
> On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe  wrote:
>
>> Hi Luke,
>>
>> Thanks for the explanation.
>>
>> I don't see any description of how the broker decides to use the new
>> version of ConsumerProtocolSubscription or not. This probably needs to be
>> locked to a new IBP version.
>>
>> One scenario that we need to consider is what happens during a rolling
>> upgrade. If the coordinator moves back and forth between brokers with
>> different IBPs, it seems that the same epoch numbers could be reused for a
>> group, if things are done in the obvious manner (old IBP = don't read or
>> write epoch, new IBP = do).
>>
>> best,
>> Colin
>>
>>
>> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
>> > Hi Colin,
>> > Thanks for your comment.
>> >
>> >> How are we going to avoid the situation where the broker restarts, and
>> > the same generation number is reused?
>> >
>> > Actually, this KIP doesn't have anything to do with the brokers. The
>> > "generation" field I added, is in the subscription metadata, which will
>> not
>> > be deserialized by brokers. The metadata is only deserialized by
>> consumer
>> > lead. And for the consumer lead, the only thing the lead cared about, is
>> > the highest generation of the ownedPartitions among all the consumers.
>> With
>> > the highest generation of the ownedPartitions, the consumer lead can
>> > distribute the partitions as sticky as possible, and most importantly,
>> > without errors.
>> >
>> > That is, after this KIP, if the broker restarts, and the same generation
>> > number is reused, it won't break current rebalance behavior. But it'll
>> help
>> > the consumer lead do the sticky assignments correctly.
>> >
>> > Thank you.
>> > Luke
>> >
>> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe  wrote:
>> >
>> >> How are we going to avoid the situation where the broker restarts, and
>> the
>> >> same generation number is reused?
>> >>
>> >> best,
>> >> Colin
>> >>
>> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
>> >> > Hi all,
>> >> >
>> >> > I'd like to start the vote for KIP-792: Add "generation" field into
>> >> > consumer protocol.
>> >> >
>> >> > The goal of this KIP is to allow the assignor/consumer coordinator to
>> >> have
>> >> > a way to identify the out-of-date members/assignments, to avoid
>> rebalance
>> >> > stuck issues in current protocol.
>> >> >
>> >> > Detailed description can be found here:
>> >> >
>> >>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>> >> >
>> >> > Any feedback is welcome.
>> >> >
>> >> > Thank you.
>> >> > Luke
>> >>
>>
>


Re: [DISCUSS] KIP-782: Expandable batch size in producer

2021-12-07 Thread Jun Rao
Hi, Luke,

Thanks for the KIP.  A few comments below.

10. Accumulating small batches could improve memory usage. Will that
introduce extra copying when generating a produce request? Currently, a
produce request takes a single MemoryRecords per partition.
11. Do we need to introduce a new config batch.max.size? Could we just
increase the default of batch.size? We probably need to have KIP-794
resolved before increasing the default batch size since the larger the
batch size, the worse the problem in KIP-794.
12. As for max.request.size, currently it's used for both the max record
size and the max request size, which is unintuitive. Perhaps we could
introduce a new config max.record.size that defaults to 1MB. We could then
increase max.request.size to something like 10MB.

Thanks,

Jun


On Mon, Nov 29, 2021 at 6:02 PM Artem Livshits
 wrote:

> Hi Luke,
>
> I don't mind increasing the max.request.size to a higher number, e.g. 2MB
> could be good.  I think we should also run some benchmarks to see the
> effects of different sizes.
>
> I agree that changing round robin to random solves an independent existing
> issue, however the logic in this KIP exacerbates the issue, so there is
> some dependency.
>
> -Artem
>
> On Wed, Nov 24, 2021 at 12:43 AM Luke Chen  wrote:
>
> > Hi Artem,
> > Yes, I agree that if we go with random selection instead of round-robin
> > selection, the latency distribution will be fairer. That is, if there are
> > 10 partitions, the 10th partition will always be the last choice in each
> > round in the current design, but with random selection, the chance of
> > being selected is fairer.
> >
> > However, I think that's kind of out of scope with this KIP. This is an
> > existing issue, and it might need further discussion to decide if this
> > change is necessary.
> >
> > I agree the default 32KB for "batch.max.size" might not be a huge
> > improvement compared with 256KB. I'm thinking, maybe default to "64KB" for
> > "batch.max.size", and make the documentation clear that if
> > "batch.max.size" is increased, there might be cases where the "ready"
> > partitions need to wait for the next request to send to the broker,
> > because of the "max.request.size" (default 1MB) limitation. Increasing
> > "max.request.size" can also be considered to avoid this issue. What do
> > you think?
> >
> > Thank you.
> > Luke
> >
> > On Wed, Nov 24, 2021 at 2:26 AM Artem Livshits
> >  wrote:
> >
> > > >  maybe I can firstly decrease the "batch.max.size" to 32KB
> > >
> > > I think 32KB is too small.  With 5 in-flight and 100ms latency we can
> > > produce 1.6MB/s per partition.  With 256KB we can produce 12.8MB/s per
> > > partition.  We should probably set up some testing and see if 256KB has
> > > problems.
> > >
> > > To illustrate latency dynamics, let's consider a simplified model: 1
> > > in-flight request per broker, produce latency 125ms, 256KB max request
> > > size, 16 partitions assigned to the same broker, every second 128KB is
> > > produced to each partition (total production rate is 2MB/sec).
> > >
> > > If the batch size is 16KB, then the pattern would be the following:
> > >
> > > 0ms - produce 128KB into each partition
> > > 0ms - take 16KB from each partition send (total 256KB)
> > > 125ms - complete first 16KB from each partition, send next 16KB
> > > 250ms - complete second 16KB, send next 16KB
> > > ...
> > > 1000ms - complete 8th 16KB from each partition
> > >
> > > From this model it's easy to see that there are 256KB that are sent
> > > immediately, 256KB that are sent in 125ms, ... 256KB that are sent in
> > > 875ms.
> > >
> > > If the batch size is 256KB, then the pattern would be the following:
> > >
> > > 0ms - produce 128KB into each partition
> > > 0ms - take 128KB each from first 2 partitions and send (total 256KB)
> > > 125ms - complete 2 first partitions, send data from next 2 partitions
> > > ...
> > > 1000ms - complete last 2 partitions
> > >
> > > even though the pattern is different, there are still 256KB that are
> sent
> > > immediately, 256KB that are sent in 125ms, ... 256KB that are sent in
> > > 875ms.
> > >
> > > Now, in this example, if we do strict round-robin (the current
> > > implementation) and we have this exact pattern (not sure how often such
> > > a regular pattern would happen in practice -- I would expect it to be a
> > > bit more random), some partitions would experience higher latency than
> > > others (not sure how much it would matter in practice -- at the end of
> > > the day, some bytes produced to a topic would have higher latency and
> > > some bytes would have lower latency).  This pattern is easily fixed by
> > > choosing the next partition randomly instead of using round-robin.
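> > >
> > > To make this concrete, here is a quick sketch of the simplified model
> > > above (assuming strict round-robin fill, one in-flight 256KB request,
> > > and a 125ms round trip), which reproduces the per-round completion
> > > times for both batch sizes:
> > >
> > > import java.util.Arrays;
> > >
> > > public class BatchLatencyModel {
> > >     public static void main(String[] args) {
> > >         model(16);   // 16KB batches: every request takes 16KB from all 16 partitions
> > >         model(128);  // 128KB batches: every request fully drains 2 partitions
> > >     }
> > >
> > >     // 16 partitions with 128KB of backlog each; every round ships one
> > >     // 256KB request that completes one round trip (125ms) later.
> > >     static void model(int batchKb) {
> > >         int partitions = 16, perPartitionKb = 128, requestKb = 256, rttMs = 125;
> > >         int[] left = new int[partitions];
> > >         Arrays.fill(left, perPartitionKb);
> > >         int cursor = 0;
> > >         for (int round = 1; Arrays.stream(left).sum() > 0; round++) {
> > >             int room = requestKb;
> > >             while (room > 0 && Arrays.stream(left).sum() > 0) {
> > >                 int take = Math.min(batchKb, Math.min(left[cursor], room));
> > >                 left[cursor] -= take;
> > >                 room -= take;
> > >                 cursor = (cursor + 1) % partitions;  // strict round-robin
> > >             }
> > >             System.out.printf("batch=%dKB: round %d completes at %dms%n",
> > >                     batchKb, round, round * rttMs);
> > >         }
> > >     }
> > > }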
> > >
> > > -Artem
> > >
> > > On Tue, Nov 23, 2021 at 12:08 AM Luke Chen  wrote:
> > >
> > > > Hi Tom,
> > > > Thanks for your comments. And thanks for Artem's 

[jira] [Created] (KAFKA-13516) Connection level metrics are not closed

2021-12-07 Thread Aman Agarwal (Jira)
Aman Agarwal created KAFKA-13516:


 Summary: Connection level metrics are not closed
 Key: KAFKA-13516
 URL: https://issues.apache.org/jira/browse/KAFKA-13516
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 3.0.0
Reporter: Aman Agarwal


Connection level metrics are not closed by the Selector on connection close, 
hence leaking the sensors.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13515) Fix KRaft config validation issues

2021-12-07 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-13515:


 Summary: Fix KRaft config validation issues
 Key: KAFKA-13515
 URL: https://issues.apache.org/jira/browse/KAFKA-13515
 Project: Kafka
  Issue Type: Bug
Reporter: Colin McCabe






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] KIP-802: Validation Support for Kafka Connect SMT Options

2021-12-07 Thread Chris Egerton
Hi Gunnar,

Thanks for the KIP! The section on backwards compatibility is especially
impressive and was enjoyable to read.

Overall I like the direction of the KIP (and in fact just ran into a
situation yesterday where it would be valuable). I only have one major
thought: could we add similar validate methods for the Converter and
HeaderConverter interfaces? With KIP-769 [1], it looks like we'll have a
new Converter::config method, so if that makes it through, it should be a
matter of just adding the same methods to those interfaces as well
(although we may want to be tolerant of null ConfigDef objects being
returned from HeaderConverter::config since the Connect framework has not
been enforcing this requirement to date).
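
As a rough illustration, such a method could mirror Connector::validate and
default to validating the supplied props against the ConfigDef returned by
config(). A minimal sketch (the interface name and null handling here are my
assumptions, not part of the KIP yet):

import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.Config;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigValue;

public interface ValidatableTransformation {
    // Existing contract: a Transformation/Converter already exposes its ConfigDef.
    ConfigDef config();

    // Sketch of the proposed hook, mirroring Connector::validate; tolerates a
    // null ConfigDef since some HeaderConverters return one today.
    default Config validate(Map<String, String> props) {
        ConfigDef configDef = config();
        List<ConfigValue> values = configDef == null ? List.of() : configDef.validate(props);
        return new Config(values);
    }
}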

That aside, a few small nits:

1. The "This page is meant as a template" section can be removed :)
2. The "Current Status" can be updated to "Under Discussion"
3. Might want to add javadocs to the newly-proposed validate method (I'm
assuming they'll largely mirror the ones for the existing
Connector::validate method, but we may also want to add a {@since} tag or
some other information on which versions of Connect will leverage the
method).

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+plugins+and+retrieve+their+configuration+definitions#KIP769:ConnectAPIstolistallpluginsandretrievetheirconfigurationdefinitions-PublicInterfaces
(section labeled "Converter interface")

Cheers,

Chris

On Wed, Nov 24, 2021 at 11:32 AM Gunnar Morling
 wrote:

> Hey all,
>
> I would like to propose a KIP for Apache Kafka Connect which adds
> validation support for SMT-related configuration options:
>
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-802%3A+Validation+Support+for+Kafka+Connect+SMT+Options
>
> This feature allows users to make sure an SMT is configured correctly
> before actually putting a connector with that SMT in place.
>
> Any feedback, comments, and suggestions around this proposal will
> be greatly appreciated.
>
> Thanks,
>
> --Gunnar
>


Re: [Discuss] KIP-803: Add Task ID and Connector Name to Connect Task Context

2021-12-07 Thread Chris Egerton
Hi Sarah,

Thanks for the KIP! I have two major thoughts:

1. Adding new methods like this to the Connect API comes with the risk that
connectors that invoke them become incompatible with older versions of
Connect. For example, if I updated my connector to use the newly-proposed
SourceTaskContext::connector method, it would fail with a NoSuchMethodError
when run on Connect version 3.1 (which will not have this feature). We've
been careful to document this limitation in any newly-introduced methods in
the past (see KIP-610 [1] and KIP-618 [2], for example), but even then,
there's still risk with anything like this and we may not want to expand
the API if the benefits don't outweigh the costs.

2. It's already possible to implement this logic directly in a connector
today, without any changes to the Connect framework. Every connector can
learn its own name in Connector::start by reading the "name" configuration
property, and can then choose to pass that information along to its tasks
as part of the configs it generates in Connector::taskConfigs. And, in the
same way, connectors can choose to provide IDs for each task in the task
configs that they generate. If this isn't sufficient for your use cases, it
should be documented as a rejected alternative.
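
For reference, a sketch of that workaround (the "my.connector.name" and
"my.task.id" keys are made up for illustration; the framework does not
define them):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TaskIdentityConfigs {
    // Builds per-task configs that carry the connector's name (from the
    // framework-supplied "name" property) plus a zero-based task index,
    // suitable for returning from Connector::taskConfigs.
    public static List<Map<String, String>> withIdentity(
            Map<String, String> connectorProps, int maxTasks) {
        List<Map<String, String>> taskConfigs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            Map<String, String> taskConfig = new HashMap<>(connectorProps);
            taskConfig.put("my.connector.name", connectorProps.get("name"));
            taskConfig.put("my.task.id", Integer.toString(i));
            taskConfigs.add(taskConfig);
        }
        return taskConfigs;
    }
}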

BTW, it looks like this aims to accomplish something very similar or even
identical to KIP-438 [3]. Ryanne Dolan (the author of that KIP) may want to
weigh in here; I've CC'd them.

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors#KIP610:ErrorReportinginSinkConnectors-Method
(see javadocs for SinkTaskContext::errantRecordReporter, paragraph starting
with "This method was added in Apache Kafka 2.6")
[2] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-ConnectorAPIexpansions
(see javadocs for SourceTaskContext::transactionContext, paragraph starting
with "This method was added in Apache Kafka 3.0")
[3] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-438%3A+Expose+task%2C+connector+IDs+in+Connect+API

Cheers,

Chris

On Tue, Dec 7, 2021 at 10:07 AM Sarah Story 
wrote:

> Hi all! Just wanted to bump this KIP for adding Task ID and Connector Name
> to the task context. It's a small change, and I'd love some feedback on it!
>
> Thanks!
> Sarah
>
> On Wed, Nov 24, 2021 at 10:26 AM Sarah Story <
> sa...@sarahstoryengineering.com> wrote:
>
> > I have written a KIP for adding Task ID and Connector name to the Connect
> > API task context.
> >
> > Here is the doc: https://cwiki.apache.org/confluence/x/4pKqCw
> >
> > Looking forward to hearing all of your thoughts!
> >
>


Re: [DISCUSS] Apache Kafka 3.1.0 release

2021-12-07 Thread Justine Olshan
Hi all,
I've filed a bug for an extra map allocation that is used in the fetch
path. https://issues.apache.org/jira/browse/KAFKA-13512
I think it qualifies as a blocker since this path is used pretty frequently
and it looks to be a regression.

I also have a PR open to fix the issue. With this change, the performance
looks much better. https://github.com/apache/kafka/pull/11576
Thanks,
Justine

On Fri, Dec 3, 2021 at 5:29 AM David Jacot 
wrote:

> Hi Rajini,
>
> Interesting bug. The patch seems to be low risk so I suppose that
> it is fine to keep it in 3.1.0.
>
> Thanks,
> David
>
> On Fri, Dec 3, 2021 at 2:26 PM David Jacot  wrote:
> >
> > Hi Colin,
> >
> > Thanks for the heads up. It makes sense to include it in order
> > to keep KRaft in line with ZK behavior.
> >
> > Thanks,
> > David
> >
> > On Fri, Dec 3, 2021 at 9:44 AM Rajini Sivaram 
> wrote:
> > >
> > > Hi David,
> > >
> > > Sorry, I had completely forgotten about code freeze and merged
> > > https://issues.apache.org/jira/browse/KAFKA-13461 to 3.1 branch
> yesterday.
> > > Can you take a look and see if we want it in 3.1.0? It is not a
> regression
> > > in 3.1, but we see this issue in tests and when it happens, the
> controller
> > > no longer operates as a controller.
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> > > On Thu, Dec 2, 2021 at 10:56 PM Colin McCabe 
> wrote:
> > >
> > > > Hi David,
> > > >
> > > > We'd like to include "KAFKA-13490: Fix createTopics and
> > > > incrementalAlterConfigs for KRaft mode #11416" in the upcoming
> release.
> > > > This fixes some bugs in how createTopics and incrementalAlterConfigs
> are
> > > > handled by the controller. It is specific to KRaft, so will not
> affect ZK
> > > > mode.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > On Wed, Nov 24, 2021, at 01:20, David Jacot wrote:
> > > > > Hi Mickael,
> > > > >
> > > > > Thanks for reporting it. It makes sense to include it in the 3.1
> release
> > > > > as well, since it is a regression.
> > > > >
> > > > > Thanks,
> > > > > David
> > > > >
> > > > > On Tue, Nov 23, 2021 at 6:52 PM Mickael Maison <
> mickael.mai...@gmail.com>
> > > > wrote:
> > > > >>
> > > > >> Hi David,
> > > > >>
> > > > >> Can we also consider
> https://issues.apache.org/jira/browse/KAFKA-13397?
> > > > >> It's essentially a regression but in a very specific case. To hit
> it,
> > > > >> you must be running MirrorMaker in dedicated mode and have
> changed the
> > > > >> separator of the default replication policy.
> > > > >>
> > > > >> Thanks,
> > > > >> Mickael
> > > > >>
> > > > >> On Tue, Nov 23, 2021 at 4:58 PM David Jacot
> 
> > > > wrote:
> > > > >> >
> > > > >> > Hi Ron,
> > > > >> >
> > > > >> > Thank you for reaching out about this. While this is clearly
> not a
> > > > >> > regression, I agree with including it in 3.1 in order to have
> proper
> > > > >> > and correct configuration constraints for KRaft. You can
> proceed.
> > > > >> >
> > > > >> > Cheers,
> > > > >> > David
> > > > >> >
> > > > >> > On Tue, Nov 23, 2021 at 2:55 PM Ron Dagostino <
> rndg...@gmail.com>
> > > > wrote:
> > > > >> > >
> > > > >> > > Hi David.  I would like to nominate
> > > > >> > >
> https://issues.apache.org/jira/projects/KAFKA/issues/KAFKA-13456
> > > > >> > > "Tighten KRaft config checks/constraints" as a 3.1.0
> blocker.  The
> > > > >> > > existing configuration constraints/checks related to KRaft
> currently
> > > > >> > > do not eliminate certain illegal configuration combinations.
> The
> > > > >> > > illegal combinations do not cause harm at the moment, but we
> would
> > > > >> > > like to implement constraints in 3.1.0 to catch them while
> KRaft is
> > > > >> > > still in Preview.  We could add these additional checks later
> in
> > > > 3.2.x
> > > > >> > > instead, but we would like to add these as early as possible:
> we
> > > > >> > > expect more people to begin trying KRaft with each subsequent
> > > > release,
> > > > >> > > and it would be best to eliminate as quickly as we can the
> > > > possibility
> > > > >> > > of people using configurations that would need fixing later.
> > > > >> > >
> > > > >> > > A patch is available at
> https://github.com/apache/kafka/pull/11503/
> > > > .
> > > > >> > >
> > > > >> > > Ron
> > > > >> > >
> > > > >> > >
> > > > >> > > On Tue, Nov 23, 2021 at 3:19 AM David Jacot
> > > >  wrote:
> > > > >> > > >
> > > > >> > > > Hi Chris,
> > > > >> > > >
> > > > >> > > > Thanks for reporting both issues. As both are regressions,
> I do
> > > > agree that
> > > > >> > > > they are blockers and that we would fix them for 3.1.
> > > > >> > > >
> > > > >> > > > Cheers,
> > > > >> > > > David
> > > > >> > > >
> > > > >> > > > On Mon, Nov 22, 2021 at 10:50 PM Chris Egerton
> > > > >> > > >  wrote:
> > > > >> > > > >
> > > > >> > > > > Hi David,
> > > > >> > > > >
> > > > >> > > > > I have another blocker to propose. KAFKA-13472 (
> > > > >> > > > > https://issues.apache.org/jira/browse/KAFKA-13472) is
> another
> > > > regression in
> > > > >> 

[jira] [Created] (KAFKA-13514) Flakey test StickyAssignorTest

2021-12-07 Thread Matthias J. Sax (Jira)
Matthias J. Sax created KAFKA-13514:
---

 Summary: Flakey test StickyAssignorTest
 Key: KAFKA-13514
 URL: https://issues.apache.org/jira/browse/KAFKA-13514
 Project: Kafka
  Issue Type: Test
  Components: clients, unit tests
Reporter: Matthias J. Sax


org.apache.kafka.clients.consumer.StickyAssignorTest.testLargeAssignmentAndGroupWithNonEqualSubscription()

No real stack trace, but only:
{quote}java.util.concurrent.TimeoutException: testLargeAssignmentAndGroupWithNonEqualSubscription() timed out after 60 seconds{quote}
STDOUT
{quote}[2021-12-07 01:32:23,920] ERROR Found multiple consumers consumer1 and consumer2 claiming the same TopicPartition topic-0 in the same generation -1, this will be invalidated and removed from their previous assignment. (org.apache.kafka.clients.consumer.internals.AbstractStickyAssignor:150)
[2021-12-07 01:32:58,964] ERROR Found multiple consumers consumer1 and consumer2 claiming the same TopicPartition topic-0 in the same generation -1, this will be invalidated and removed from their previous assignment. (org.apache.kafka.clients.consumer.internals.AbstractStickyAssignor:150)
[2021-12-07 01:32:58,976] ERROR Found multiple consumers consumer1 and consumer2 claiming the same TopicPartition topic-0 in the same generation -1, this will be invalidated and removed from their previous assignment. (org.apache.kafka.clients.consumer.internals.AbstractStickyAssignor:150){quote}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13513) Flaky test AdjustStreamThreadCountTest

2021-12-07 Thread Matthias J. Sax (Jira)
Matthias J. Sax created KAFKA-13513:
---

 Summary: Flaky test AdjustStreamThreadCountTest
 Key: KAFKA-13513
 URL: https://issues.apache.org/jira/browse/KAFKA-13513
 Project: Kafka
  Issue Type: Test
  Components: streams, unit tests
Reporter: Matthias J. Sax


org.apache.kafka.streams.integration.AdjustStreamThreadCountTest.testConcurrentlyAccessThreads
{quote}java.lang.AssertionError: expected null, but was:
 at org.junit.Assert.fail(Assert.java:89)
 at org.junit.Assert.failNotNull(Assert.java:756)
 at org.junit.Assert.assertNull(Assert.java:738)
 at org.junit.Assert.assertNull(Assert.java:748)
 at org.apache.kafka.streams.integration.AdjustStreamThreadCountTest.testConcurrentlyAccessThreads(AdjustStreamThreadCountTest.java:367)
{quote}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13512) topicIdsToNames and topicNamesToIds allocate unnecessary maps

2021-12-07 Thread Justine Olshan (Jira)
Justine Olshan created KAFKA-13512:
--

 Summary: topicIdsToNames and topicNamesToIds allocate unnecessary 
maps
 Key: KAFKA-13512
 URL: https://issues.apache.org/jira/browse/KAFKA-13512
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Justine Olshan
Assignee: Justine Olshan


Currently we write the methods as follows:

def topicNamesToIds(): util.Map[String, Uuid] = {
    new util.HashMap(metadataSnapshot.topicIds.asJava)
}

We do not need to allocate a new map, however; we can simply use

Collections.unmodifiableMap(metadataSnapshot.topicIds.asJava)

We can do something similar for the topicIdsToNames implementation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [Discuss] KIP-803: Add Task ID and Connector Name to Connect Task Context

2021-12-07 Thread Sarah Story
Hi all! Just wanted to bump this KIP for adding Task ID and Connector Name
to the task context. It's a small change, and I'd love some feedback on it!

Thanks!
Sarah

On Wed, Nov 24, 2021 at 10:26 AM Sarah Story <
sa...@sarahstoryengineering.com> wrote:

> I have written a KIP for adding Task ID and Connector name to the Connect
> API task context.
>
> Here is the doc: https://cwiki.apache.org/confluence/x/4pKqCw
>
> Looking forward to hearing all of your thoughts!
>


[jira] [Created] (KAFKA-13511) Update TimestampConverter SMT to support unix epoch as millis, micros, and seconds

2021-12-07 Thread Julien Chanaud (Jira)
Julien Chanaud created KAFKA-13511:
--

 Summary: Update TimestampConverter SMT to support unix epoch as 
millis, micros, and seconds
 Key: KAFKA-13511
 URL: https://issues.apache.org/jira/browse/KAFKA-13511
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Julien Chanaud


Currently, the TimestampConverter SMT can convert a timestamp from a source 
String, Long, or Date to a target String, Long, or Date.

The problem is that a Long source or target is required to be a Unix epoch 
value in milliseconds.

In many cases, epoch is represented with different precisions. This leads to 
several Jira tickets :
 * KAFKA-12364
 * KAFKA-10561

I propose adding a new config to TimestampConverter called "epoch.precision", 
which defaults to "millis" so as not to impact existing code, and allows for 
more precisions: seconds, millis, micros.

"transforms": "TimestampConverter",
"transforms.TimestampConverter.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.TimestampConverter.field": "event_date",
*"transforms.TimestampConverter.epoch.precision": "micros"*,
"transforms.TimestampConverter.target.type": "Timestamp"



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE] KIP-792: Add "generation" field into consumer protocol

2021-12-07 Thread Luke Chen
Hi Colin,

Thanks for your comments. I've updated the KIP to mention that it
won't affect current broker-side behavior.

> One scenario that we need to consider is what happens during a rolling
upgrade. If the coordinator moves back and forth between brokers with
different IBPs, it seems that the same epoch numbers could be reused for a
group, if things are done in the obvious manner (old IBP = don't read or
write epoch, new IBP = do)

I think this KIP doesn't care about the group epoch number at all. The
subscription metadata is passed from each member to the group coordinator,
and then the group coordinator passes all of it back to the consumer lead.
So even if the epoch number is reused in a group, it should be fine. On the
other hand, the group coordinator will have no idea whether the join group
request sent from a consumer contains the new subscription "generation"
field, because the group coordinator won't deserialize the metadata.

I've also added this to the KIP.

Thank you.
Luke

On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe  wrote:

> Hi Luke,
>
> Thanks for the explanation.
>
> I don't see any description of how the broker decides to use the new
> version of ConsumerProtocolSubscription or not. This probably needs to be
> locked to a new IBP version.
>
> One scenario that we need to consider is what happens during a rolling
> upgrade. If the coordinator moves back and forth between brokers with
> different IBPs, it seems that the same epoch numbers could be reused for a
> group, if things are done in the obvious manner (old IBP = don't read or
> write epoch, new IBP = do).
>
> best,
> Colin
>
>
> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> > Hi Colin,
> > Thanks for your comment.
> >
> >> How are we going to avoid the situation where the broker restarts, and
> > the same generation number is reused?
> >
> > Actually, this KIP doesn't have anything to do with the brokers. The
> > "generation" field I added, is in the subscription metadata, which will
> not
> > be deserialized by brokers. The metadata is only deserialized by consumer
> > lead. And for the consumer lead, the only thing the lead cared about, is
> > the highest generation of the ownedPartitions among all the consumers.
> With
> > the highest generation of the ownedPartitions, the consumer lead can
> > distribute the partitions as sticky as possible, and most importantly,
> > without errors.
> >
> > That is, after this KIP, if the broker restarts, and the same generation
> > number is reused, it won't break current rebalance behavior. But it'll
> help
> > the consumer lead do the sticky assignments correctly.
> >
> > Thank you.
> > Luke
> >
> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe  wrote:
> >
> >> How are we going to avoid the situation where the broker restarts, and
> the
> >> same generation number is reused?
> >>
> >> best,
> >> Colin
> >>
> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
> >> > Hi all,
> >> >
> >> > I'd like to start the vote for KIP-792: Add "generation" field into
> >> > consumer protocol.
> >> >
> >> > The goal of this KIP is to allow the assignor/consumer coordinator to
> >> have
> >> > a way to identify the out-of-date members/assignments, to avoid
> rebalance
> >> > stuck issues in current protocol.
> >> >
> >> > Detailed description can be found here:
> >> >
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
> >> >
> >> > Any feedback is welcome.
> >> >
> >> > Thank you.
> >> > Luke
> >>
>


Re: [DISCUSS] KIP-784: Add top-level error code field to DescribeLogDirsResponse

2021-12-07 Thread Mickael Maison
Hi Magnus,

That's a good point, I've updated the KIP.

Considering it's a very small KIP, if I don't see any more feedback,
I'll start a vote in the next couple of days.

Thanks,
Mickael

On Mon, Nov 22, 2021 at 11:05 AM Magnus Edenhill  wrote:
>
> Hey Mickael and thanks for the KIP.
>
> I'm a bit late to the game, but one small nit for consistency:
> The global response error codes are typically at the start of the response,
> usually right after the ThrottleTime.
> This also makes it easier for the response parser to assign the global
> error for each sub-result, if needed.
>
> Thanks,
> Magnus
>
>
> Den mån 18 okt. 2021 kl 17:08 skrev Mickael Maison  >:
>
> > Hi,
> >
> > I wrote a very minor KIP that proposes adding a top-level error code
> > field to DescribeLogDirsResponse:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-784%3A+Add+top-level+error+code+field+to+DescribeLogDirsResponse
> >
> > Please let me know if you have any feedback.
> > Thanks
> >


[jira] [Created] (KAFKA-13510) KIP-769: Connect APIs to list all connector plugins and retrieve their configuration definitions

2021-12-07 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-13510:
--

 Summary: KIP-769: Connect APIs to list all connector plugins and 
retrieve their configuration definitions
 Key: KAFKA-13510
 URL: https://issues.apache.org/jira/browse/KAFKA-13510
 Project: Kafka
  Issue Type: Bug
Reporter: Mickael Maison
Assignee: Mickael Maison






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] KIP-682: Connect TimestampConverter support for multiple fields and multiple input formats

2021-12-07 Thread Mickael Maison
Thanks Joshua for this KIP, it looks like a useful addition.

I have a few questions:

1. In Public Interfaces and Proposed Changes, the KIP mentions the
current "field" setting will be renamed to "fields". Do you instead
mean a new setting "fields" will be added and the existing setting
will be marked deprecated?

2. What happens if a user specifies both "field" and "fields"? Is one
setting taking precedence? Is it a bad configuration that will be
rejected? Same for the format and format.input/format.output settings.

3. Can you also add the proposed documentation for the new settings in the KIP?

Best,
Mickael

On Mon, Nov 22, 2021 at 11:37 AM Joshua Grisham  wrote:
>
> Hi Randall, thanks again for the great feedback! Sorry it took me a minute
> to get back on this one--I have been a bit full up over the last few
> months--but now I will try to brush it off and see if we can figure out if
> it makes sense to move forward and possibly close this one out.
>
> Regarding backward compatibility -- the intention of my proposed changes
> (including the candidate PR) was that everything remains backward
> compatible, but I realize that what I wrote in the KIP was more theoretical
> about what could happen, rather than what I actually did in the PR (which
> is backward compatible).  So now I have updated the text in the KIP to
> reflect this more accurately, and I hope that it, along with what is in the
> candidate PR, addresses everything you mentioned in your first and second
> paragraphs.
>
> For clarity here is exactly what I have done in the PR:
>
> First, I saw that the config properties for TimestampConverter lacked the
> same structure used by other SMTs (e.g. ReplaceField, as one example),
> especially since I was adding several new properties, so I added the same
> ConfigName and ConfigDefault interfaces (instead of just having various
> string attributes on the TimestampConverter class itself). The existing
> public class string attributes FIELD_CONFIG, TARGET_TYPE_CONFIG, and
> FORMAT_CONFIG all still exist and can continue to be used by anyone else;
> they will just get warnings at build time since I marked them with
> @Deprecated.  I did remove the private class strings since they were no
> longer used in the class, but this should not have any impact on
> compatibility.
>
> Second, I added the new properties to CONFIG_DEF but kept the existing
> properties there (so they still exist and can be used), and in the
> configure() method I have used ConfigUtils.translateDeprecatedConfigs() to
> alias the old "field" property to the new "fields" property automatically,
> just like was done in ReplaceField as part of KIP-629 (
> https://github.com/apache/kafka/commit/eab61cad2c418786ab5e58aa1267f689d82a61d1
> ).
> In the end it means that a user can still have their connector use the
> "field" property, but when the configure() method is called it will just
> put whatever they set in "field" into "fields" automatically.
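>
> For illustration, the aliasing boils down to something like this (I'm
> assuming the translateDeprecatedConfigs signature that ReplaceField uses
> after KIP-629, where each alias group lists the new name first):
>
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.kafka.common.utils.ConfigUtils;
>
> public class FieldAliasExample {
>     public static void main(String[] args) {
>         Map<String, Object> props = new HashMap<>();
>         props.put("field", "event_date"); // old, deprecated property name
>         // Surfaces the value of the deprecated "field" under the new "fields" key.
>         Map<String, Object> resolved = ConfigUtils.translateDeprecatedConfigs(
>                 props, new String[][]{{"fields", "field"}});
>         System.out.println(resolved.get("fields")); // prints event_date
>     }
> }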
>
> Also the property "format" still behaves exactly the same as before if it
> is used (it continues to function as a SimpleDateFormatter input and
> output) it is just that there is an additional capability added by
> voluntarily adopting the two new properties "format.input" and
> "format.output".
>
> As far as I can tell, someone could have an existing connector using
> TimestampConverter (or anything else that uses the API), upgrade to this
> version without changing anything else, and it would still behave the same
> as before.  But perhaps it should not be marked as @Deprecated, to avoid
> the warnings -- or is there a clearly defined deprecation policy that can
> be followed? I searched the Kafka docs but could not find anything after
> several attempts and rabbit holes, unfortunately (the closest I found is
> just documentation listing what has been deprecated in each release).
>
> However please do say if there is anything else you think is missing either
> in the implementation or now in the updated KIP regarding this!
>
> Also hopefully with my updates to the KIP it has addressed your other
> comments or questions, but please feel free to give it another look when
> you have time and I would welcome any feedback that you have.
>
> I also noticed there was one small change where my PR now has a merge
> conflict, but it should be pretty minimal to update as long as everything
> else is still looking OK. I will try to take a look at this soon as well,
> if I am able.
>
> Thanks again and have a great week!
>
> Best,
> Joshua
>
>
>
On Wed, Jul 28, 2021 at 4:05 PM Randall Hauch  wrote:
>
> > Thanks for the contribution, Joshua. Overall I think this is a really good
> > improvement to this connector, for all the reasons cited in the motivation
> > section. I do have some requests, though.
> >
> > The biggest issue I have is that the change is not backward compatible.
> > We've worked really 

Re: Revoke Access

2021-12-07 Thread Tom Bentley
Hi Dnyaneshwar,

You need to do this yourself by sending an email to
dev-unsubscr...@kafka.apache.org from your subscribed email address. The
subject and body can be empty. No one else can unsubscribe you on your
behalf.

Kind regards,

Tom

On Tue, Dec 7, 2021 at 8:38 AM Dnyaneshwar Jadhav <
dnyaneshwar.jadh...@gmail.com> wrote:

> Hi,
>
> I would like to unsubscribe from kafka maintenance emails
> Accidentally I have been registered to the dev team.
> Hopefully in the future I might be in a good shape to contribute.
>
> *Thanks & Regards*
> Dnyaneshwar Dattaram Jadhav
> New Mumbai, INDIA
> +91-8097020379
>
> LinkedIn Profile 
>


Revoke Access

2021-12-07 Thread Dnyaneshwar Jadhav
Hi,

I would like to unsubscribe from the Kafka maintenance emails.
I was accidentally registered to the dev list.
Hopefully, in the future I will be in good shape to contribute.

*Thanks & Regards*
Dnyaneshwar Dattaram Jadhav
New Mumbai, INDIA
+91-8097020379

LinkedIn Profile