[jira] [Created] (KAFKA-13421) Fix ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
Colin McCabe created KAFKA-13421: Summary: Fix ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup Key: KAFKA-13421 URL: https://issues.apache.org/jira/browse/KAFKA-13421 Project: Kafka Issue Type: Bug Reporter: Colin McCabe ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup is failing with this error: {code} ConsumerBounceTest > testSubscribeWhenTopicUnavailable() PASSED kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup() failed, lo g available in /home/cmccabe/src/kafka9/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.testRollingBr okerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup().test.stdout ConsumerBounceTest > testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup() FAILED org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists at org.apache.zookeeper.KeeperException.create(KeeperException.java:126) at kafka.zk.KafkaZkClient$CheckedEphemeral.getAfterNodeExists(KafkaZkClient.scala:1904) at kafka.zk.KafkaZkClient$CheckedEphemeral.create(KafkaZkClient.scala:1842) at kafka.zk.KafkaZkClient.checkedEphemeralCreate(KafkaZkClient.scala:1809) at kafka.zk.KafkaZkClient.registerBroker(KafkaZkClient.scala:96) at kafka.server.KafkaServer.startup(KafkaServer.scala:320) at kafka.integration.KafkaServerTestHarness.$anonfun$restartDeadBrokers$2(KafkaServerTestHarness.scala:2 12) at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.scala:18) at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563) at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561) at scala.collection.AbstractIterable.foreach(Iterable.scala:919) at scala.collection.IterableOps$WithFilter.foreach(Iterable.scala:889) at kafka.integration.KafkaServerTestHarness.restartDeadBrokers(KafkaServerTestHarness.scala:203) at kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsB igGroup$1(ConsumerBounceTest.scala:327) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190) at kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(C onsumerBounceTest.scala:319) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [kafka-site] stanislavkozlovski opened a new pull request #379: MINOR: Update coding guide to mention need to maintain public client API compatibility
stanislavkozlovski opened a new pull request #379: URL: https://github.com/apache/kafka-site/pull/379 Previously, the coding guide would imply that it's acceptable to break API compatibility due to the project's early stage in maturity. For years now, the project has been mature and adopted enough to warrant not breaking API compatibility. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: Wiki Permissions Request
Added. On 10/29/21 3:15 AM, Ahmed Oubalas wrote: Hello all, Wiki ID: ahmed.oulabas I need to create a new KIP page. Thank you for your help! Ahmed
Re: [DISCUSS] KIP-779: Allow Source Tasks to Handle Producer Exceptions
Arjun, Thank you for your feedback, I have updated the KIP. This solution is more elegant than my original proposal; however, after working on the implementation, we have now pushed the configuration from the connector/task itself back to the connect worker. All tasks running on the worker would share this ignore producer exception configuration flag. This works for my use cases where I cannot envision setting this for only one type of connector we have, but this does take the choice out of the hands of the connector developer. I suppose that is for the best, in a vacuum only the worker should have a say in how it handles message production. Additional thoughts and feedback are welcome. Knowles On Thu, Oct 28, 2021 at 10:54 AM Arjun Satish wrote: > Yes, that makes sense. And it fits in very nicely with the current error > handling framework. > > On Thu, Oct 28, 2021 at 10:39 AM Knowles Atchison Jr < > katchiso...@gmail.com> > wrote: > > > That would work. I originally thought that it would be confusing to > > overload that function when a Record that wasn't actually written, but > > looking at SourceTask more closely, in commitRecord(SourceRecord, > > RecordMetadata), the RecordMetadata is set to null in the event of a > > filtered transformation so the framework is already doing this in a > certain > > regard. > > > > Knowles > > > > On Thu, Oct 28, 2021 at 10:29 AM Arjun Satish > > wrote: > > > > > To ack the message back to the source system, we already have a > > > commitRecord method. Once the bad record is handled by skip/dlq, we > could > > > just call commitRecord() on it? > > > > > > On Thu, Oct 28, 2021 at 9:35 AM Knowles Atchison Jr < > > katchiso...@gmail.com > > > > > > > wrote: > > > > > > > Hi Chris, > > > > > > > > Thank you for your reply! > > > > > > > > It is a clarity error regarding the javadoc. I am not operationally > > > > familiar with all of the exceptions Kafka considers non-retriable, > so I > > > > pulled the list from Callback.java: > > > > > > > > > > > > > > https://github.com/apache/kafka/blob/1afe2a5190e9c98e38c84dc793f4303ea51bc19b/clients/src/main/java/org/apache/kafka/clients/producer/Callback.java#L35 > > > > to be an illustrative example of the types of exceptions that would > > kill > > > > the connector outright. Any exception thrown during the producer > write > > > will > > > > be passed to this handler. I will update the KIP/PR to be more clear > on > > > > this matter. > > > > > > > > You raise an excellent point, how should the framework protect the > > > > connector or developer from themselves? If a connector enables > > > exactly-once > > > > semantics, it would make sense to me to have the task killed. The > > > framework > > > > should enforce this type of misconfiguration that would break the > > > internal > > > > semantics of KIP-618. WorkerSourceTask could check the configuration > > > before > > > > handing off the records and exception to this function, fail initial > > > > configuration check, or something of that nature. > > > > > > > > Hi Arjun, > > > > > > > > Thank you for your response! > > > > > > > > My specific use case is our custom JMS connector. We ack back to the > > jms > > > > broker once Kafka commits the record. We thread out our JMS consumer > > such > > > > that I would need access to the SourceRecord to confirm we are going > to > > > > throw away the message. > > > > > > > > Skipping such records, writing some log messages, and/or writing some > > > error > > > > context to a DLQ would cover most if not all of the use cases I > > envision. > > > > > > > > "discard.message.on.producer.exception": "true" > > > > > > > > or some equivalent would get my personal use case 99% of the way > > there. I > > > > would still need some kind of callback from inside the connector with > > the > > > > Source Record to successfully ack back to my source system. > > > > > > > > I have updated the KIP regarding the callback being executed in a > > > different > > > > thread than poll(). > > > > > > > > Knowles > > > > > > > > On Thu, Oct 28, 2021 at 2:02 AM Arjun Satish > > > > > wrote: > > > > > > > > > Hi Knowles, > > > > > > > > > > Thanks for the KIP! > > > > > > > > > > Could you please call out some use-cases on what the source > > connectors > > > > > would do when they hit such exceptions? I'm wondering if we would > > need > > > to > > > > > do anything other than skipping such records, writing some log > > > messages, > > > > > and/or writing some error context to a DLQ? > > > > > > > > > > One of the goals for Connect was to abstract away intricacies of > > Kafka > > > > > topics, clients etc, so that connectors could focus on the external > > > > systems > > > > > themselves. Ideally, we'd want to see if we could call out the most > > > > common > > > > > cases and handle them in the framework itself, instead of > delegating > > > them > > > > > back to the connector. This way, instead of the new API, we'd > > probably > > > >
[jira] [Created] (KAFKA-13420) consumer protocol should include "generation" field for assignor to distinguish between new/old OwnedPartitions
Luke Chen created KAFKA-13420: - Summary: consumer protocol should include "generation" field for assignor to distinguish between new/old OwnedPartitions Key: KAFKA-13420 URL: https://issues.apache.org/jira/browse/KAFKA-13420 Project: Kafka Issue Type: Improvement Components: clients, consumer Reporter: Luke Chen Assignee: Luke Chen In [KIP-429|https://cwiki.apache.org/confluence/display/KAFKA/KIP-429%3A+Kafka+Consumer+Incremental+Rebalance+Protocol], we add a new field: `OwnedPartitions` into consumer protocol, for cooperative protocol do partition revoking things. But recently, we found the `ownedPartitions` info might be out-of-date due to some reasons (ex: unstable network), and the out-of-date `ownedPartitions` causes unexpected rebalance stuck issue (ex: KAFKA-12984, KAFKA-13406). To fix it, we should consider to add the "generation" field in the consumer protocol, so that we can rely on the "generation" info to identify if the `ownedPartition` is up-to-date or out-of-date. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Apache Kafka 3.1.0 release
Hi David, Please update the components of the following KIPs: - KIP-390: Support Compression Level - Core, Clients - KIP-653: Upgrade log4j to log4j2 - Clients, Connect, Core, Streams (that is, Log4j-appender, Tools, and Trogdor are excluded.) Best, Dongjin On Fri, Oct 29, 2021 at 2:24 AM Chris Egerton wrote: > Hi David, > > I've moved KIP-618 to the "postponed" section as it will not be merged in > time due to lack of review. > > Cheers, > > Chris > > On Thu, Oct 28, 2021 at 1:07 PM David Jacot > wrote: > > > Hi team, > > > > Just a quick reminder that the Feature freeze is tomorrow (October 29th). > > In order to be fair with everyone in all the time zones, I plan to cut > the > > release branch early next week. > > > > Cheers, > > David > > > > On Mon, Oct 18, 2021 at 9:56 AM David Jacot wrote: > > > > > Hi team, > > > > > > KIP freeze for the next major release of Apache Kafka was reached > > > last week. > > > > > > I have updated the release plan with all the adopted KIPs which are > > > considered > > > for AK 3.1.0. Please, verify the plan and let me know if any KIP should > > be > > > added > > > to or removed from the release plan. > > > > > > For the KIPs which are still in progress, please work closely with your > > > reviewers > > > to make sure that they land on time for the feature freeze. > > > > > > The next milestone for the AK 3.1.0 release is the feature freeze on > > > October 29th, > > > 2021. > > > > > > Cheers, > > > David > > > > > > On Fri, Oct 15, 2021 at 9:05 AM David Jacot > wrote: > > > > > >> Hi folks, > > >> > > >> Just a quick reminder that the KIP freeze is today. Don't forget to > > close > > >> your ongoing votes. > > >> > > >> Best, > > >> David > > >> > > >> On Thu, Oct 14, 2021 at 5:31 PM David Jacot > > wrote: > > >> > > >>> Hi Luke, > > >>> > > >>> Added it to the plan. > > >>> > > >>> Thanks, > > >>> David > > >>> > > >>> On Thu, Oct 14, 2021 at 10:09 AM Luke Chen > wrote: > > >>> > > Hi David, > > KIP-766 is merged into trunk. Please help add it into the release > > plan. > > > > Thank you. > > Luke > > > > On Mon, Oct 11, 2021 at 10:50 PM David Jacot > > > > wrote: > > > > > Hi Michael, > > > > > > Sure. I have updated the release plan to include it. Thanks for > the > > > heads up. > > > > > > Best, > > > David > > > > > > On Mon, Oct 11, 2021 at 4:39 PM Mickael Maison < > > mickael.mai...@gmail.com> > > > wrote: > > > > > > > Hi David, > > > > > > > > You can add KIP-690 to the release plan. The vote passed months > > ago > > > > and I merged the PR today. > > > > > > > > Thanks > > > > > > > > On Fri, Oct 8, 2021 at 8:32 AM David Jacot > > > > > > wrote: > > > > > > > > > > Hi folks, > > > > > > > > > > Just a quick reminder that KIP Freeze is next Friday, October > > 15th. > > > > > > > > > > Cheers, > > > > > David > > > > > > > > > > On Wed, Sep 29, 2021 at 3:52 PM Chris Egerton > > > > > > > > > wrote: > > > > > > > > > > > Thanks David! > > > > > > > > > > > > On Wed, Sep 29, 2021 at 2:56 AM David Jacot > > > > > > > > > > wrote: > > > > > > > > > > > > > Hi Chris, > > > > > > > > > > > > > > Sure thing. I have added KIP-618 to the release plan. > Thanks > > for > > > the > > > > > > heads > > > > > > > up. > > > > > > > > > > > > > > Best, > > > > > > > David > > > > > > > > > > > > > > On Wed, Sep 29, 2021 at 8:53 AM David Jacot < > > dja...@confluent.io> > > > > wrote: > > > > > > > > > > > > > > > Hi Kirk, > > > > > > > > > > > > > > > > Yes, it is definitely possible if you can get the KIP > > voted > > > before > > > > the > > > > > > > KIP > > > > > > > > freeze > > > > > > > > and the code committed before the feature freeze. > Please, > > let me > > > > know > > > > > > > when > > > > > > > > the > > > > > > > > KIP is voted and I will add it to the release plan. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > David > > > > > > > > > > > > > > > > On Tue, Sep 28, 2021 at 7:05 PM Chris Egerton > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > >> Hi David, > > > > > > > >> > > > > > > > >> Wondering if we can get KIP-618 included? The vote > passed > > months > > > > ago > > > > > > > and a > > > > > > > >> PR has been available since mid-June. > > > > > > > >> > > > > > > > >> Cheers, > > > > > > > >> > > > > > > > >> Chris > > > > > > > >> > > > > > > > >> On Tue, Sep 28, 2021 at 12:53 PM Kirk True < > > > k...@mustardgrain.com > > > > > > > > > > > > wrote: > > > > > > > >> > > > > > > > >>
Wiki Permissions Request
Hello all, Wiki ID: ahmed.oulabas I need to create a new KIP page. Thank you for your help! Ahmed
[jira] [Created] (KAFKA-13419) sync group failed with retriable error might cause out-of-date ownedPartition in Cooperative protocol
Luke Chen created KAFKA-13419: - Summary: sync group failed with retriable error might cause out-of-date ownedPartition in Cooperative protocol Key: KAFKA-13419 URL: https://issues.apache.org/jira/browse/KAFKA-13419 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 3.0.0 Reporter: Luke Chen Assignee: Luke Chen In KAFKA-13406, we found there's user got stuck when in rebalancing with cooperative sticky assignor. The reason is the "ownedPartition" is out-of-date, and it failed the cooperative assignment validation. Investigate deeper, I found the root cause is we didn't reset generation and state after sync group fail. In KAFKA-12983, we fixed the issue that the onJoinPrepare is not called in resetStateAndRejoin method. And it causes the ownedPartition not get cleared. But there's another case that the ownedPartition will be out-of-date. Here's the example: # consumer A joined and synced group successfully with generation 1 # New rebalance started with generation 2, consumer A joined successfully, but somehow, consumer A doesn't send out sync group immediately # other consumer completed sync group successfully in generation 2, except consumer A. # After consumer A send out sync group, the new rebalance start, with generation 3. So consumer A got REBALANCE_IN_PROGRESS error with sync group response # When receiving REBALANCE_IN_PROGRESS, we re-join the group, with generation 3, with the assignment (ownedPartition) in generation 1. # So, now, we have out-of-date ownedPartition sent, with unexpected results happened We might want to do resetStateAndRejoin when retriable errors happend in *sync group*. Because when we got sync group error, it means, join group passed, and other consumers (and the leader) might already completed this round of rebalance. The assignment distribution this consumer have is already out-of-date. The errors should resetStateAndRejoin in sync group are: {code:java} if (exception instanceof UnknownMemberIdException || exception instanceof IllegalGenerationException || exception instanceof RebalanceInProgressException || exception instanceof MemberIdRequiredException) continue; else if (!future.isRetriable()) throw exception; {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)