[jira] [Created] (KAFKA-13961) GKE kafka-consumer-groups returns incorrect hosts

2022-06-06 Thread Mikhail Filatov (Jira)
Mikhail Filatov created KAFKA-13961:
---

 Summary: GKE kafka-consumer-groups returns incorrect hosts
 Key: KAFKA-13961
 URL: https://issues.apache.org/jira/browse/KAFKA-13961
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 2.8.1
Reporter: Mikhail Filatov


Hello,

We faced with strange behavior on Google Kubernetes Engine with 
kafka-consumer-groups.sh script:
{code:java}
./bin/kafka-consumer-groups.sh -bootstrap-server "localhost:9093" 
--command-config ${KAFKA_HOME}/bin/adminclient.properties --describe --group 
group{code}
Script returned results successfully, but host column value is incorrect: 
instead of particular consumer host we have GKE node ip. Hadn't luck to find, 
how describeGroup request managed, so asking for your suggestions.

Could it be connected with some GKE configuration (LoadBalancer or something 
like that)? On other clouds we haven't faced such behavior.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] KIP-787 - MM2 Interface to manage Kafka resources

2022-06-06 Thread Omnia Ibrahim
Hi Chris, Thanks for having the time to look into this.

1. Is the introduction of the new "ForwardingAdmin" class necessary, or can
> the same behavior can be achieved by subclassing the existing
> KafkaAdminClient class?

forwarding decorators give more flexibility than the inheritance, in this
case, ForwardingAdmin gives the ability to use the default KafkaAdminClient
for reading/describing resources and at the same time configure another
client to connect to the federated solution to create and update resources.
Using forwarding seems cleaner and more flexible for this use case than
inheritance.

2. Would it be just as accurate to name the new Mirror Maker 2 property
> "admin.class" instead of "forwarding.admin.class"? I think brevity may work
> in our favor here
>
I don't mind renaming it to "admin.class" if this is better.

3. Would the admin class specified by the user also take effect for KIP-158
> [1] style automatic topic creation? (Forgive me if this isn't applicable
> for Mirror Maker 2; I'm asking solely based on the knowledge that MM2 can
> be run as a source connector and has its own source task class [2].)

No this only control creating/updating mirrored topics and MM2 internal
topics. And not going to affect connect runtime's internal topics that are
needed by connect cluster that runs MM2.

4. Would the admin class specified by the user also take effect for
> internal topics created by the Connect framework (i.e., the statue, config,
> and offsets topics)?

 No this wouldn't address connect as connect "realtime" uses TopicAdmin to
manage topics needed for KafkaOffsetBackingStore, KafkaStatusBackingStore
and KafkaConfigBackingStore directly. These 3 topics aren't a huge concern
for MM2 as they are a small set of topics and can be created up-front as a
one-time job. However, the main concern of the KIP is addressing the
mirrored topics, synced configs, synced ACLs and the internal topics of MM2
which are needed for MM2 features like heartbeat and offset mapping between
Kafka clusters.

The KIP states that Mirror Maker 2 will "Use
> ForwardingAdmin in MirrorUtils instead of TopicAdmin to create internal
> compacted topics", but IIUC these topics (the ones created with the
> MirrorUtils class) are Mirror Maker 2-specific and different from the
> Connect framework's internal topics.

MM2 has been using 2 patterns to create topics
1- Use AdminClient directly to create/update mirrored topics and their ACLs
2- Use TopicAdmin in MirrorUtils to create MM2 internal topics which are
heartbeat, mm2-offset-syncs..internal and
.checkpoints.internal

This KIP will only replace AdminClient and TopicAdmin in the MM2 codebase
by ForwardingAdmin and not connect related topics.
As Colin mentioned before we can have a feature KIP where we use
ForwardingAdmin outside MM2 but this is not addressed in this KIP.

Hope this answered your questions.

Best
Omnia



On Wed, Jun 1, 2022 at 2:14 AM Chris Egerton 
wrote:

> Hi Omnia,
>
> Thank you for your patience with this KIP! I have a few quick thoughts:
>
> 1. Is the introduction of the new "ForwardingAdmin" class necessary, or can
> the same behavior can be achieved by subclassing the existing
> KafkaAdminClient class?
>
> 2. Would it be just as accurate to name the new Mirror Maker 2 property
> "admin.class" instead of "forwarding.admin.class"? I think brevity may work
> in our favor here
>
> 3. Would the admin class specified by the user also take effect for KIP-158
> [1] style automatic topic creation? (Forgive me if this isn't applicable
> for Mirror Maker 2; I'm asking solely based on the knowledge that MM2 can
> be run as a source connector and has its own source task class [2].)
>
> 4. Would the admin class specified by the user also take effect for
> internal topics created by the Connect framework (i.e., the statue, config,
> and offsets topics)? The KIP states that Mirror Maker 2 will "Use
> ForwardingAdmin in MirrorUtils instead of TopicAdmin to create internal
> compacted topics", but IIUC these topics (the ones created with the
> MirrorUtils class) are Mirror Maker 2-specific and different from the
> Connect framework's internal topics.
>
> [1] -
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-158%3A+Kafka+Connect+should+allow+source+connectors+to+set+topic-specific+settings+for+new+topics
> [2] -
>
> https://github.com/apache/kafka/blob/4c9eeef5b2dff9a4f0977fbc5ac7eaaf930d0d0e/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceTask.java
>
> Cheers,
>
> Chris
>
> On Wed, May 25, 2022 at 5:26 AM Omnia Ibrahim 
> wrote:
>
> > Hi everyone, If there's no major concern anymore, I'll start the
> > voting process.
> >
> > On Fri, May 20, 2022 at 5:58 PM Omnia Ibrahim 
> > wrote:
> >
> > > Hi Colin,
> > >
> > > >Thanks for the clarification. I agree it's reasonable for people to
> want
> > > to use their own implementations of Admin. And we could have a config
> for
> > > this, so that it becomes pluggable (possibly in other pla

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #977

2022-06-06 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 6158 lines...]
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 17 and Scala 2.12
[2022-06-06T18:19:42.362Z] 
[2022-06-06T18:19:42.362Z] > Task :streams:checkstyleTest
[2022-06-06T18:19:47.326Z] 
[2022-06-06T18:19:47.326Z] > Task :examples:spotbugsMain
[2022-06-06T18:19:47.326Z] [main] INFO edu.umd.cs.findbugs.ExitCodes - 
Calculating exit code...
[2022-06-06T18:19:50.723Z] 
[2022-06-06T18:19:50.723Z] > Task :shell:spotbugsMain
[2022-06-06T18:19:50.723Z] [main] INFO edu.umd.cs.findbugs.ExitCodes - 
Calculating exit code...
[2022-06-06T18:19:55.366Z] 
[2022-06-06T18:19:55.366Z] > Task :clients:spotbugsMain
[2022-06-06T18:20:02.629Z] 
[2022-06-06T18:20:02.629Z] > Task :core:compileTestScala
[2022-06-06T18:20:02.629Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/integration/kafka/server/DynamicBrokerReconfigurationTest.scala:38:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:02.629Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/integration/kafka/server/MultipleListenersWithSameSecurityProtocolBaseTest.scala:30:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:16.946Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/AdvertiseBrokerTest.scala:22:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:16.946Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/BrokerEpochIntegrationTest.scala:27:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:18.157Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/DynamicConfigTest.scala:21:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:21.742Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/KafkaMetricReporterClusterIdTest.scala:23:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:21.742Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/KafkaMetricsReporterTest.scala:24:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:21.742Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/LeaderElectionTest.scala:32:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:21.742Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/LogRecoveryTest.scala:25:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:22.946Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/ReplicaFetchTest.scala:23:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:24.150Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/ReplicationQuotasTest.scala:28:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:24.150Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/ServerGenerateBrokerIdTest.scala:23:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:24.150Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/ServerGenerateClusterIdTest.scala:29:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:24.150Z] [Warn] 
/home/jenkins/jenkins-slave/workspace/Kafka_kafka_trunk/core/src/test/scala/unit/kafka/server/ServerStartupTest.scala:21:21:
 imported `QuorumTestHarness` is permanently hidden by definition of object 
QuorumTestHarness in package server
[2022-06-06T18:20:31.802Z] 
[2022-06-06T18:20:31.802Z] > Task :core:spotbugsMain
[2022-06-06T18:20:31.802Z] [main] IN

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #978

2022-06-06 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 6184 lines...]
[2022-06-06T18:30:48.114Z] > Task :jmh-benchmarks:classes
[2022-06-06T18:30:48.114Z] > Task :jmh-benchmarks:compileTestJava NO-SOURCE
[2022-06-06T18:30:49.050Z] > Task :jmh-benchmarks:checkstyleMain
[2022-06-06T18:30:49.050Z] > Task :jmh-benchmarks:testClasses UP-TO-DATE
[2022-06-06T18:30:49.050Z] > Task :jmh-benchmarks:checkstyleTest NO-SOURCE
[2022-06-06T18:30:49.299Z] > Task :core:spotbugsMain
[2022-06-06T18:30:49.988Z] > Task :connect:runtime:compileTestJava
[2022-06-06T18:30:49.988Z] > Task :connect:runtime:testClasses
[2022-06-06T18:30:51.049Z] 
[2022-06-06T18:30:51.050Z] > Task :jmh-benchmarks:spotbugsMain
[2022-06-06T18:30:51.050Z] [main] INFO edu.umd.cs.findbugs.ExitCodes - 
Calculating exit code...
[2022-06-06T18:30:51.742Z] > Task :connect:mirror:compileTestJava
[2022-06-06T18:30:51.742Z] > Task :connect:mirror:testClasses
[2022-06-06T18:30:52.680Z] > Task :connect:mirror:checkstyleTest
[2022-06-06T18:30:52.857Z] > Task :jmh-benchmarks:spotbugsMain
[2022-06-06T18:31:02.706Z] > Task :connect:runtime:checkstyleTest
[2022-06-06T18:31:04.614Z] > Task :streams:testClasses
[2022-06-06T18:31:04.614Z] > Task :streams:streams-scala:compileTestJava 
NO-SOURCE
[2022-06-06T18:31:07.677Z] 
[2022-06-06T18:31:07.677Z] > Task :streams:checkstyleTest
[2022-06-06T18:31:10.224Z] 
[2022-06-06T18:31:10.224Z] > Task :streams:checkstyleTest
[2022-06-06T18:31:10.326Z] > Task :streams:streams-scala:compileTestScala
[2022-06-06T18:31:10.326Z] > Task :streams:streams-scala:testClasses
[2022-06-06T18:31:10.326Z] > Task :streams:streams-scala:checkstyleTest 
NO-SOURCE
[2022-06-06T18:31:12.921Z] > Task :streams:compileTestJava
[2022-06-06T18:31:13.878Z] 
[2022-06-06T18:31:13.878Z] > Task :jmh-benchmarks:spotbugsMain
[2022-06-06T18:31:13.878Z] [main] INFO edu.umd.cs.findbugs.ExitCodes - 
Calculating exit code...
[2022-06-06T18:31:14.681Z] > Task :core:spotbugsMain
[2022-06-06T18:31:18.274Z] > Task :jmh-benchmarks:spotbugsMain
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] > Task :core:spotbugsMain
[2022-06-06T18:31:32.370Z] [main] INFO edu.umd.cs.findbugs.ExitCodes - 
Calculating exit code...
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] FAILURE: Build failed with an exception.
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] * What went wrong:
[2022-06-06T18:31:32.370Z] Execution failed for task ':clients:checkstyleTest'.
[2022-06-06T18:31:32.370Z] > Checkstyle rule violations were found. See the 
report at: 
file:///home/jenkins/workspace/Kafka_kafka_trunk/clients/build/reports/checkstyle/test.html
[2022-06-06T18:31:32.370Z]   Checkstyle files with violations: 1
[2022-06-06T18:31:32.370Z]   Checkstyle violations by severity: [error:1]
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] * Try:
[2022-06-06T18:31:32.370Z] > Run with --stacktrace option to get the stack 
trace.
[2022-06-06T18:31:32.370Z] > Run with --info or --debug option to get more log 
output.
[2022-06-06T18:31:32.370Z] > Run with --scan to get full insights.
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] * Get more help at https://help.gradle.org
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 8.0.
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] You can use '--warning-mode all' to show the 
individual deprecation warnings and determine if they come from your own 
scripts or plugins.
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] See 
https://docs.gradle.org/7.4.2/userguide/command_line_interface.html#sec:command_line_warnings
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] Execution optimizations have been disabled for 1 
invalid unit(s) of work during this build to ensure correctness.
[2022-06-06T18:31:32.370Z] Please consult deprecation warnings for more details.
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] BUILD FAILED in 7m 53s
[2022-06-06T18:31:32.370Z] 235 actionable tasks: 191 executed, 44 up-to-date
[2022-06-06T18:31:32.370Z] 
[2022-06-06T18:31:32.370Z] See the profiling report at: 
file:///home/jenkins/workspace/Kafka_kafka_trunk/build/reports/profile/profile-2022-06-06-18-23-39.html
[2022-06-06T18:31:32.370Z] A fine-grained performance profile is available: use 
the --scan option.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 17 and Scala 2.12
[2022-06-06T18:31:35.950Z] 
[2022-06-06T18:31:35.950Z] > Task :streams:checkstyleTest
[2022-06-06T18:31:40.498Z] [main] INFO edu.umd.cs.findbugs.ExitCodes - 
Calculating exit code...
[2022-06-06T18:31:47.270Z] 
[2022-06-06T18:3

[jira] [Resolved] (KAFKA-13592) Fix flaky test ControllerIntegrationTest.testTopicIdUpgradeAfterReassigningPartitions

2022-06-06 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-13592.
-
Resolution: Fixed

> Fix flaky test 
> ControllerIntegrationTest.testTopicIdUpgradeAfterReassigningPartitions
> -
>
> Key: KAFKA-13592
> URL: https://issues.apache.org/jira/browse/KAFKA-13592
> Project: Kafka
>  Issue Type: Bug
>Reporter: David Jacot
>Assignee: Kvicii.Yu
>Priority: Minor
>
> {noformat}
> java.util.NoSuchElementException: None.get
>   at scala.None$.get(Option.scala:529)
>   at scala.None$.get(Option.scala:527)
>   at 
> kafka.controller.ControllerIntegrationTest.testTopicIdUpgradeAfterReassigningPartitions(ControllerIntegrationTest.scala:1239){noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] KIP-842: Add richer group offset reset mechanisms

2022-06-06 Thread 邓子明
I think making it a separate enum config is ok, it is similar to a boolean 
config but provide more choices, so the config name should be 
offset.reset.strategy.offset-out-of-range? 

- -
Best,
Ziming

> On May 30, 2022, at 9:48 PM, hudeqi <16120...@bjtu.edu.cn> wrote:
> 
> Hi, Ziming.
> I thought about it again and thought it might be better to add an additional 
> auxiliaryStrategy, so that we can implement more auxiliary strategies in this 
> way, not just nearest. What do you think?
> Best,
> hudeqi
> 
> 
> > -原始邮件-
> > 发件人: hudeqi <16120...@bjtu.edu.cn>
> > 发送时间: 2022-05-30 15:33:45 (星期一)
> > 收件人: dev@kafka.apache.org
> > 抄送: 
> > 主题: Re: Re: [DISCUSS] KIP-842: Add richer group offset reset mechanisms
> > 
> > Thank you for your reply.
> > According to the current implementation in pull request, it may not be 
> possible to directly remove the enumeration value of nearest. However, on the 
> whole, putting nearest in the OffsetResetStrategy enumeration class may cause 
> some misunderstandings in use. There are two solutions. One is to make a 
> defensive check on the value of "auto.offset.reset" when initializing the 
> consumer, and the other one is that I change the implementation. For example, 
> I add an additional variable named "auxiliaryStrategy" to the 
> SubscriptionState class to reset offset when triggering out-of-range.
> > Besides this problem, is there any other problem? What can I do next to 
> push this kip up for adoption? This is my first time I do, I don't understand 
> very well.
> > 
> > > -原始邮件-
> > > 发件人: "deng ziming" 
> > > 发送时间: 2022-05-30 13:23:53 (星期一)
> > > 收件人: dev@kafka.apache.org
> > > 抄送: 
> > > 主题: Re: [DISCUSS] KIP-842: Add richer group offset reset mechanisms
> > > 
> > > Thank you for your reply.
> > > 
> > > According to your description, strategy=nearest is redundant, 
> right? We only rely on nearest.offset.reset=true/false to handle 
> OffsetOutOfRange, I think we can directly remove this enum value, WDYT?
> > > 
> > > --
> > > Best,
> > > Ziming
> > > 
> > > > On May 27, 2022, at 5:19 PM, hudeqi 
> <16120...@bjtu.edu.cn> wrote:
> > > > 
> > > > Thank you for your attention and reply. Here are my reply to 
> your questions:
> > > > 
> > > > 1. If strategy=safe_latest and there is not committed offset, 
> whether the group is newly started is determined in this way: when the group 
> is started, a timestamp "createTimeStamp" will be passed as the start time of 
> the group. When the offset needs to be reset, the timestamp will be added to 
> "ListOffsetsRequest" as a new field. The server compares the timestamp 
> "createTimeStamp" with the timestamp "logStartTime", which is the first 
> message time for each partition. If "createTimeStamp" "greater than 
> "logStartTime" means that the group is newly started for this partition and 
> consumed from the latest, otherwise it means that the partition is newly 
> expanded and needs to be consumed from the earliest. For details, you can see 
> related jira and pr.
> > > > 
> > > > 
> > > > 
> > > > 2. Strictly speaking, nearest is not a level strategy with 
> earlyliest/latest/safe_latest/earliest_on_start/latest_on_start. It is more 
> like an auxiliary strategy, which is only dealt with out-of-range. So if you 
> set nearest.offset.reset=true, no matter what strategy "auto.offset.reset" is 
> set to, it will be performed according to the strategy of nearest when 
> out-of-range (to the earliest if it was under the range , or to the latest if 
> it was over the range), and for the case where no committed offset, nearest 
> will naturally have no effect, instead, it is determined by the main 
> strategy(auto.offset.reset).
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 3. This I have added to the form of module "Proposed Changes" 
> in kip-842.
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 4. The meaning of nearest.offset.reset has been clearly 
> expressed in point 2, this configuration is disabled default, that is to say, 
> when out-of-range, reset strategy is performed according to the main strategy 
> (auto.offset.reset).
> > > > 
> > > > 
> > > > 
> > > >> -原始邮件-
> > > >> 发件人: "deng ziming" 
> > > >> 发送时间: 2022-05-27 16:02:53 (星期五)
> > > >> 收件人: dev@kafka.apache.org
> > > >> 抄送: 
> > > >> 主题: Re: [DISCUSS] KIP-842: Add richer group offset reset 
> mechanisms
> > > >> 
> > > >> Thank you for this KIP, the motivation makes sense to me, 
> left some questions:
> > > >> 
> > > >> 1. If strategy=safe_latest and there is not committed 
> offset, we have 2 choices based on whether the group is started newly, can 
> you elaborate on how can we decide the group is started newly? It would be 
> clear.
> > > >> 
> > > >> 2. If strategy=nearest and there is not committed offset, 
> its behavior is determined by the earliest, or latest, or safe_latest used 
> together. can you elaborate on it more clearly?
> > > >> 
> > > >> 3. Can you also add a column "current reset behavior” and 
> change "reset behav

Re: [DISCUSS] KIP-821: Connect Transforms support for nested structures

2022-06-06 Thread Chris Egerton
Hi Jorge,

Thanks! Sorry for the delay; here are my thoughts:

1. Under the "Accessing multiple values by deep-scan" header it's stated
that "If deep-scan is used, it must have only one field after the asterisk
level.". However, in example 3 for the Cast SMT and other examples for
other SMTs, the spec contains a field of "*.child.k2", which appears to
have two fields after the asterisk level. I may be misunderstanding the
proposal, but it seems like the two contradict each other.

2. I'm a little unclear on why we need the special handling for arrays
where, for an array field "a", the field name "a" can be treated as either
the array itself, or every element in the array. Is there a reason we can't
use the field name "a.*" to handle the latter case, and "a" to handle the
former?

3. How would a user specify that they'd like to access a field with the
literal name "*"?

4. For the Cast SMT, do you think it might bite some people if fields that
can't be cast correctly are silently ignored? I'm imagining the case where
none of the fields in a multi-path expression can be cast correctly and it
ends up eating half of someone's day to track down why their SMT isn't
doing anything.

5. For the ExtractField and ValueToKey SMTs, what happens if a deep-scan
field name is used, but only one field is found? Is the resulting field
still an array, or is it just the single field that was found? (FWIW I'm
leaning towards keeping it an array just to keep schemas consistent in a
pipeline in case the number of fields found fluctuates across records.)

6. (Nit) For the HeaderFrom SMT, it's stated that "As this SMT affects only
existing fields, additional configurations will not be required.". Given
the new "field.syntax.version" property, this part should probably be
removed.

7. Is recursive descent intentionally excluded? That was an important part
of Joshua's KIP and his feedback on this KIP; I think it's worth pursuing
if we can.

Cheers,

Chris

On Tue, May 24, 2022 at 3:49 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks, Chris!
>
> I have updated the KIP with the rejected alternatives updated. Also, I have
> drafted the support for arrays and deep scans as part of the proposed
> notation to make it more complete, even though these can be implemented in
> multiple PRs.
>
> Looking forward to your feedback.
>
> Cheers,
> Jorge.
>
> On Sat, 21 May 2022 at 17:39, Chris Egerton 
> wrote:
>
> > Hi Jorge,
> >
> > I really appreciate the effort you've made to simplify the syntax and
> > feature set of a JSONPath-based approach as much as possible. I'm still
> > hesitant to continue with it, though.
> >
> > 1. The syntax is much less friendly. Just compare "top.mid.bottom" to
> > "$['top']['mid']['bottom']"... not everyone uses JSONPath or even JSON,
> and
> > the learning curve for the former is going to be steeper. The examples in
> > the new KIP draft you published demonstrate this pretty well, and this is
> > without diving into the details of what escape syntax would look like.
> >
> > 2. The three new major features that this syntax adds (array accesses,
> deep
> > scans, and multi-value paths) could all be added pretty easily to the dot
> > notation syntax without introducing brackets and dollar signs. Array
> > accesses can be described using the same syntax as struct/map field
> access,
> > deep scans can be described using '*', and multi-value paths can be
> > described by referencing the name of a field that's expected to have
> > children. These are all top-of-the-head ideas and can probably be
> refined,
> > but hopefully they demonstrate that we can keep the syntax simple without
> > sacrificing features. Of course, the question of leaving room for future
> > features might arise... given that these are the out-of-the-box SMTs that
> > are likely to be the first that many people encounter, I'd err on the
> side
> > of simplicity and a gentle learning curve; if people need to get more out
> > of transforms, the option to implement their own is still there. If we
> can
> > address 95% of use cases with something easy to use, it's not worth
> making
> > the feature harder for everyone to use just to accommodate the remaining
> > 5%.
> >
> > 3. The advantage of leveraging an existing syntax is twofold: users who
> are
> > already familiar with that syntax don't need to learn a new syntax, and
> > maintainers of the syntax get to leverage existing libraries and
> > documentation for the syntax. With the current proposal, reusing
> libraries
> > is off the table, which means that we're going to have to parse this
> > ourselves (including all the escape syntax edge cases), and we won't be
> > able to automatically leverage new features added to that syntax. And
> given
> > how stripped-down the syntax is in comparison to full JSONPath, there's
> > still going to be a learning curve for users who are already familiar
> with
> > it, and we'll still have to document how Connect's variant of JSONP

Re: [DISCUSS] KIP-787 - MM2 Interface to manage Kafka resources

2022-06-06 Thread Chris Egerton
Hi Omnia,

1. I might be missing something, but can you give a concrete Java example
of how the proposed ForwardingAdmin class is more convenient than
subclassing the KafkaAdminClient class? AFAICT the two would be virtually
identical. I might be misunderstanding exactly how that class will be used;
I'm envisioning it as the pluggable class that users will implement for
custom administration logic and specify as the value for the
"forwarding.admin.class" (or ".forwarding.admin.class") property,
and that it will be instantiated with a KafkaAdminClient instance that can
be used to get the same logic that MM2 provides today. In the case you
mentioned (KafkaAdminClient for read/describe, custom Admin for
create/update), I'd imagine one could override the createTopics,
deleteTopics, createAcls, deleteAcls (maybe?), alterConfigs (maybe?), etc.
methods, and then leave other methods such as listTopics, describeTopics,
describeCluster, etc. as they are.

3. KIP-158 deals with the topics that source connectors write to, not the
internal topics used by the Connect framework. IIUC this includes the
topics that MM2 mirrors. I think it's still fine if we want to leave
support for this out and say that this KIP only addresses code that lives
inside the connect/mirror (and possibly connect/mirror-client?) modules, I
just want to make sure that whatever behavior we settle on is specified
clearly in the KIP and user-facing documentation.

4. That's fine 👍 Just out of curiosity, is the motivation for this
decision to simplify the implementation? I can imagine it'd be easier to
modify the MM2 codebase exclusively and not have to worry about touching
the Connect framework as well given the possibility for unintended
consequences for other connectors with the latter. Wondering if there's a
distinction on the feature front as well that makes MM2 internal topics
different from Connect internal topics.

Cheers,

Chris

On Mon, Jun 6, 2022 at 6:36 AM Omnia Ibrahim 
wrote:

> Hi Chris, Thanks for having the time to look into this.
>
> 1. Is the introduction of the new "ForwardingAdmin" class necessary, or can
> > the same behavior can be achieved by subclassing the existing
> > KafkaAdminClient class?
>
> forwarding decorators give more flexibility than the inheritance, in this
> case, ForwardingAdmin gives the ability to use the default KafkaAdminClient
> for reading/describing resources and at the same time configure another
> client to connect to the federated solution to create and update resources.
> Using forwarding seems cleaner and more flexible for this use case than
> inheritance.
>
> 2. Would it be just as accurate to name the new Mirror Maker 2 property
> > "admin.class" instead of "forwarding.admin.class"? I think brevity may
> work
> > in our favor here
> >
> I don't mind renaming it to "admin.class" if this is better.
>
> 3. Would the admin class specified by the user also take effect for KIP-158
> > [1] style automatic topic creation? (Forgive me if this isn't applicable
> > for Mirror Maker 2; I'm asking solely based on the knowledge that MM2 can
> > be run as a source connector and has its own source task class [2].)
>
> No this only control creating/updating mirrored topics and MM2 internal
> topics. And not going to affect connect runtime's internal topics that are
> needed by connect cluster that runs MM2.
>
> 4. Would the admin class specified by the user also take effect for
> > internal topics created by the Connect framework (i.e., the statue,
> config,
> > and offsets topics)?
>
>  No this wouldn't address connect as connect "realtime" uses TopicAdmin to
> manage topics needed for KafkaOffsetBackingStore, KafkaStatusBackingStore
> and KafkaConfigBackingStore directly. These 3 topics aren't a huge concern
> for MM2 as they are a small set of topics and can be created up-front as a
> one-time job. However, the main concern of the KIP is addressing the
> mirrored topics, synced configs, synced ACLs and the internal topics of MM2
> which are needed for MM2 features like heartbeat and offset mapping between
> Kafka clusters.
>
> The KIP states that Mirror Maker 2 will "Use
> > ForwardingAdmin in MirrorUtils instead of TopicAdmin to create internal
> > compacted topics", but IIUC these topics (the ones created with the
> > MirrorUtils class) are Mirror Maker 2-specific and different from the
> > Connect framework's internal topics.
>
> MM2 has been using 2 patterns to create topics
> 1- Use AdminClient directly to create/update mirrored topics and their ACLs
> 2- Use TopicAdmin in MirrorUtils to create MM2 internal topics which are
> heartbeat, mm2-offset-syncs..internal and
> .checkpoints.internal
>
> This KIP will only replace AdminClient and TopicAdmin in the MM2 codebase
> by ForwardingAdmin and not connect related topics.
> As Colin mentioned before we can have a feature KIP where we use
> ForwardingAdmin outside MM2 but this is not addressed in this KIP.
>
> Hope this answered your questions.
>
> Best
> Omn