Re: [DISCUSSION] KIP-418: A method-chaining way to branch KStream
Ivan,

I tried to make a very rough proof of concept of a fluent API based off of KStream here (https://github.com/apache/kafka/pull/6512), and I think I succeeded at removing both cons.

- Compatibility: I was incorrect earlier about compatibility issues; there aren't any direct ones. I was unaware that Java is smart enough to distinguish between a branch(varargs...) returning one thing and branch() with no arguments returning another thing.

- Requiring a terminal method: We don't actually need it. We can just build up the branches in the KBranchedStream, which shares its state with the ProcessorSupplier that will actually do the branching. It's not terribly pretty in its current form, but I think it demonstrates its feasibility.

To be clear, I don't think that pull request should be final or even a starting point if we go in this direction, I just wanted to see how challenging it would be to get the API working.

I will say, though, that I'm not sure the existing solution could be deprecated in favor of this, which I had originally suggested was a possibility. The reason is that the newly branched streams are not available in the same scope as each other. That is, if we wanted to merge them back together again I don't see a way to do that. The KIP proposal has the same issue, though - all this means is that for either solution, deprecating the existing branch(...) is not on the table.

Thanks, Paul

On Wed, Mar 27, 2019 at 12:08 PM Ivan Ponomarev wrote: > OK, let me summarize what we have discussed up to this point. > > First, it seems that it's commonly agreed that branch API needs > improvement. Motivation is given in the KIP. > > There are two potential ways to do it: > > 1. (as originally proposed) > > new KafkaStreamsBrancher<..>() >.branch(predicate1, ks ->..) >.branch(predicate2, ks->..) >.defaultBranch(ks->..) //optional >.onTopOf(stream).mapValues(...) //onTopOf returns its argument > > PROS: 1) Fully backwards compatible. 2) The code won't make sense until > all the necessary ingredients are provided. > > CONS: The need to create a KafkaStreamsBrancher instance contrasts the > fluency of other KStream methods. > > 2. (as Paul proposes) > > stream >.branch(predicate1, ks ->...) >.branch(predicate2, ks->...) >.defaultBranch(ks->...) //or noDefault(). Both defaultBranch(..) and > noDefault() return void > > PROS: Generally follows the way KStreams interface is defined. > > CONS: We need to define two terminal methods (defaultBranch(ks->) and > noDefault()). And for a user it is very easy to miss the fact that one > of the terminal methods should be called. If these methods are not > called, we can throw an exception at runtime. > > Colleagues, what are your thoughts? Can we do better? > > Regards, > > Ivan > > On 27.03.2019 18:46, Ivan Ponomarev wrote: > > > > > > On 25.03.2019 17:43, Ivan Ponomarev wrote: > >> Paul, > >> > >> I see your point when you are talking about > >> stream..branch..branch...default.. > >> > >> Still, I believe that this cannot be implemented the easy way. > >> Maybe we all should think further. > >> > >> Let me comment on two of your ideas. > >> > >>> user could specify a terminal method that assumes nothing will reach > >>> the default branch, > >> throwing an exception if such a case occurs. > >> > >> 1) OK, apparently this should not be the only option besides > >> `default`, because there are scenarios when we want to just silently > >> drop the messages that didn't match any predicate.
2) Throwing an > >> exception in the middle of data flow processing looks like a bad idea. > >> In stream processing paradigm, I would prefer to emit a special > >> message to a dedicated stream. This is exactly where `default` can be > >> used. > >> > >>> it would be fairly easily for the InternalTopologyBuilder to track > >>> dangling > >> branches that haven't been terminated and raise a clear error before it > >> becomes an issue. > >> > >> You mean a runtime exception, when the program is compiled and run? > >> Well, I'd prefer an API that simply won't compile if used > >> incorrectly. Can we build such an API as a method chain starting from > >> KStream object? There is a huge cost difference between runtime and > >> compile-time errors. Even if a failure uncovers instantly on unit > >> tests, it costs more for the project than a compilation failure. > >> > >> Regards, > >> > >> Ivan > >> > >> > >> 25.03.2019 0:38, Paul Whalen пишет: > >>> Ivan, > >>> > >>> Good point about the terminal operation being required. But is that > >>> really > >>> such a bad thing? If the user doesn't want a defaultBranch they can > >>> call > >>> some other terminal method (noDefaultBranch()?) just as easily. In > >>> fact I > >>> think it creates an opportunity for a nicer API - a user could specify > a > >>> terminal method that assumes nothing will reach the default branch, > >>> throwing an exception if such a case occurs. That seems
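For readers skimming the thread, a rough sketch of how option 2 (the fluent style from the proof of concept) might read in user code; all names here are illustrative placeholders from the discussion, not a finalized API:

    // Hypothetical usage of the proposed fluent branching API (names may change).
    KStream<String, String> events = builder.stream("events");
    events
        .branch((key, value) -> value.startsWith("A"), ks -> ks.to("a-events"))
        .branch((key, value) -> value.startsWith("B"), ks -> ks.to("b-events"))
        .defaultBranch(ks -> ks.to("other-events"));   // or .noDefault() to silently drop the rest

Whether the chain can enforce at compile time that one of the terminal calls is actually made is exactly the open question Ivan raises above.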
Re: MirrorMaker 2.0 and Streams interplay (topic naming control)
John, You make a good case for it already being a public API, so my nerves are definitely eased on that front. I do think we have a path to move forward with the user space solution, and if I get a chance, I'm going to try proving it out with a trivial demo using an early MM2 build - but it's nice to hear your support of the use case regardless. The ACL concern makes a lot of sense, and while I don't think it would be a deal breaker because of what you say about advanced control naturally requiring extra care, I'm generally against the added complexity of custom topic naming unless we really need it. It looks like MM2 will also support optional ACL mirroring, so that should only make things easier. Regarding the management burden of doing these switchovers: fortunately our case is something like running in pre-prod maybe 3 consecutive days out of the month, and just prod for the rest of the month. So if it wasn't the most effortless or fast process we could tolerate it. Though if it was easy I wouldn't be surprised if others wanted a similar workflow with much faster iteration - spinning up a new environment with the same data as prod is always a boon. Thanks again! Paul On Wed, Mar 27, 2019 at 2:17 PM John Roesler wrote: > Hi Paul, > > Sorry for overlooking the "offset translation" MM2 feature. I'm glad > Ryanne was able to confirm this would work. > > I'm just one voice, but FWIW, I think that the internal topic naming > scheme is a public API. We document the structure of the naming > scheme in several places. We also recommend making use of the fact > that the applicationId is a prefix of the topic name in conjunction with > Kafka Broker ACLs to grant access to the internal topics to the > applications that own them. > > Actually, for this latter reason, I'm concerned that giving more control > over the names of internal topics might make topic security and > access control more difficult. Or maybe this concern is off-base, and > folks who take advanced control over the topic name would also take > on the responsibility to make sure their naming scheme works in > conjunction with their broker configs. > > For whatever reason, I hadn't considered prefixing the application's > id with "pre-prod.". Offhand, I think this would achieve the desired > outcome. There may be some devil in the details, of course. > > > Glad to hear, by the way, that you've already considered the problem > of concurrent modifications to the changelogs (etc.). It sounds like > your plan should work, although it might become a management burden > if you start wanting to run a lot of these stream-app tests. In that case, > you could consider mirroring the relevant topics *again* into a > test-specific > prefix (like "pre-prod.test-1.", up to some point. Then, you could stop > the mirror, run the test, verify the results, and then just delete the > whole test dataset. > > > Does it seem like you have a good path forward? From what I'm > hearing, the "user-space" approach is at least worth exploring before > considering a new API. Of course, if it doesn't pan out for whatever > reason, > I'd (personally) support adding whatever features are necessary to support > your use case. > > Thanks, > -John > > > > On Mon, Mar 25, 2019 at 9:40 PM Paul Whalen wrote: > > > John and Ryanne, > > > > Thanks for the responses! 
I think Ryanne's way of describing the question > > is actually a much better summary than my long winded description: "a > > Streams app can switch between topics with and without a cluster alias > > prefix when you migrate between prod and pre-prod, while preserving > state." > > > > To address a few of John's points... > > > > But, the prod app will still be running, and its changelog will still be > > > mirrored into pre-prod when you start the pre-prod app. > > > > > The idea is actually to turn off the mirroring from prod to pre-prod > during > > this period, so the environments can operate completely independently and > > their state can comfortably diverge during the testing period. After the > > testing period we'd be happy to throw away everything in pre-prod and > start > > mirroring again from prod with a blank slate. > > > > Also, the pre-prod app won't be in the same consumer group as the prod > app, > > > so it won't know from what offset to start processing input. > > > > > This is where I'm hoping the magic of MM2 will come in - at the time we > > shut off mirroring from prod to pre-prod in order to spin of the pre-prod > > environment, we will do an "offset translation" with RemoteClusterUtils > > like Ryanne mentioned, so new Streams apps in pre-prod will see consumer > > offsets that make sense for reading from pre-prod topics. > > > > I like both of your ideas around the "user space" solution: subscribing > to > > multiple topics, or choosing a topic based on config. However, in order > to > > populate their internal state properly, when the pre-prod apps come up > they > > will need to look fo
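For context, the offset translation step Paul describes would look roughly like the sketch below. MM2 was not yet released at the time of this thread, so the RemoteClusterUtils class and its method signature are taken from the KIP-382 draft and should be treated as assumptions:

    // Assumed API from the MM2 / KIP-382 draft; the final signature may differ.
    Map<String, Object> preProdProps = new HashMap<>();
    preProdProps.put("bootstrap.servers", "preprod-kafka:9092");

    // Translate offsets committed by the prod Streams app ("my-app") into offsets
    // that are valid against the mirrored topics on the pre-prod cluster.
    Map<TopicPartition, OffsetAndMetadata> translated =
        RemoteClusterUtils.translateOffsets(preProdProps, "prod", "my-app", Duration.ofMinutes(1));

    // These offsets would then seed the consumer group used by the pre-prod
    // Streams app before it is started.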
Re: [ANNOUNCE] Apache Kafka 2.2.0
Hello Mickael, Thanks for reporting this, I double checked the release process and it is indeed mentioned: https://cwiki.apache.org/confluence/display/KAFKA/Release+Process @Matthias J Sax : I've copied the javadocs from release repo to kafka-site github and it should have fixed it now. Guozhang On Wed, Mar 27, 2019 at 7:41 AM Mickael Maison wrote: > Thanks Matthias for running this release! > > The links to the 2.2 javadocs are broken: > https://kafka.apache.org/22/javadoc/index.html > The same happened for 2.1.0 > ( > https://lists.apache.org/thread.html/57f2940225bcce36c3a01ec524dc967e81cb20e159d9b1c851b712f4@%3Cdev.kafka.apache.org%3E > ) > so we probably want to review this part of the release process. > > On Wed, Mar 27, 2019 at 6:28 AM Sanjeev Kumar > wrote: > > > > Congratulations for the 2.2.0 release! > > > > Regards, Sanjeev > > > > On Wed, Mar 27, 2019 at 6:43 AM Guozhang Wang > wrote: > > > > > Thanks Matthias for the release! > > > > > > On Tue, Mar 26, 2019 at 9:31 PM Dongjin Lee > wrote: > > > > > > > Congratulations and thanks for your great work, Matthias!! > > > > > > > > Best, > > > > Dongjin > > > > > > > > On Wed, Mar 27, 2019 at 4:49 AM Stephane Maarek < > > > kafka.tutori...@gmail.com > > > > > > > > > wrote: > > > > > > > > > Congratulations on this amazing release! Lots of cool new features > :) > > > > > > > > > > I've also released a YouTube video that will hopefully help the > > > community > > > > > get up to speed: https://www.youtube.com/watch?v=kaWbp1Cnfo4&t=5s > > > > > > > > > > Happy watching! > > > > > > > > > > On Tue, Mar 26, 2019 at 7:02 PM Matthias J. Sax > > > > wrote: > > > > > > > > > > > The Apache Kafka community is pleased to announce the release for > > > > Apache > > > > > > Kafka 2.2.0 > > > > > > > > > > > > - Added SSL support for custom principal name > > > > > > - Allow SASL connections to periodically re-authenticate > > > > > > - Command line tool bin/kafka-topics.sh adds AdminClient support > > > > > > - Improved consumer group management > > > > > >- default group.id is `null` instead of empty string > > > > > > - API improvement > > > > > >- Producer: introduce close(Duration) > > > > > >- AdminClient: introduce close(Duration) > > > > > >- Kafka Streams: new flatTransform() operator in Streams DSL > > > > > >- KafkaStreams (and other classed) now implement AutoClosable > to > > > > > > support try-with-resource > > > > > >- New Serdes and default method implementations > > > > > > - Kafka Streams exposed internal client.id via ThreadMetadata > > > > > > - Metric improvements: All `-min`, `-avg` and `-max` metrics > will > > > now > > > > > > output `NaN` as default value > > > > > > > > > > > > All of the changes in this release can be found in the release > notes: > > > > > > https://www.apache.org/dist/kafka/2.2.0/RELEASE_NOTES.html > > > > > > > > > > > > > > > > > > You can download the source and binary release (Scala 2.11 and > 2.12) > > > > > > from: https://kafka.apache.org/downloads#2.2.0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > > > > > Apache Kafka is a distributed streaming platform with four core > APIs: > > > > > > > > > > > > > > > > > > ** The Producer API allows an application to publish a stream > records > > > > to > > > > > > one or more Kafka topics. > > > > > > > > > > > > ** The Consumer API allows an application to subscribe to one or > more > > > > > > topics and process the stream of records produced to them. 
> > > > > > > > > > > > ** The Streams API allows an application to act as a stream > > > processor, > > > > > > consuming an input stream from one or more topics and producing > an > > > > > > output stream to one or more output topics, effectively > transforming > > > > the > > > > > > input streams to output streams. > > > > > > > > > > > > ** The Connector API allows building and running reusable > producers > > > or > > > > > > consumers that connect Kafka topics to existing applications or > data > > > > > > systems. For example, a connector to a relational database might > > > > > > capture every change to a table. > > > > > > > > > > > > > > > > > > With these APIs, Kafka can be used for two broad classes of > > > > application: > > > > > > > > > > > > ** Building real-time streaming data pipelines that reliably get > data > > > > > > between systems or applications. > > > > > > > > > > > > ** Building real-time streaming applications that transform or > react > > > > > > to the streams of data. > > > > > > > > > > > > > > > > > > Apache Kafka is in use at large and small companies worldwide, > > > > including > > > > > > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, > > > > Rabobank, > > > > > > Target, The New York Times, Uber, Yelp, and Zalando, among > others. > > > > > > > > > > > > A big thank you
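Two of the API improvements listed above, shown in a small illustrative snippet (variable setup omitted; this is just to show the new calls, not official example code):

    // KafkaStreams now implements AutoCloseable, so try-with-resources works:
    try (KafkaStreams streams = new KafkaStreams(topology, streamsProps)) {
        streams.start();
        // ... run until shutdown; close() is called automatically on exit
    }

    // The producer and AdminClient gain a bounded close:
    producer.close(Duration.ofSeconds(10));   // waits at most 10 seconds for buffered sends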
[jira] [Resolved] (KAFKA-8166) Kafka 2.2 Javadoc broken
[ https://issues.apache.org/jira/browse/KAFKA-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang resolved KAFKA-8166. -- Resolution: Fixed Assignee: Guozhang Wang Fix Version/s: 2.2.0 > Kafka 2.2 Javadoc broken > > > Key: KAFKA-8166 > URL: https://issues.apache.org/jira/browse/KAFKA-8166 > Project: Kafka > Issue Type: Bug > Components: documentation >Affects Versions: 2.2.0 >Reporter: Sachin NS >Assignee: Guozhang Wang >Priority: Major > Labels: documentation > Fix For: 2.2.0 > > Attachments: 2.2_javadoc.JPG > > > The Kafka 2.2 javadocs link mentioned in the Apache Kafka API documentation > ([http://kafka.apache.org/documentation/#producerapi)] errors out and gives a > 404 Not Found error. Below is the link which is mentioned in the > documentation: > [http://kafka.apache.org/22/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html] > I don't see javadoc folder within [https://kafka.apache.org/22/] > This error is not present in Kafka 2.1 or previous documentation. > Attaching screenshot for reference. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [DISCUSS] KIP-433: Provide client API version to authorizer
Thanks, Ying Zheng. Looks good overall. One question is, should the version be specified as a Kafka version rather than as a RPC API version? I don't think most users are aware of RPC versions, but something like "min kafka version" would be easier to understand. That is how we handle the minimum inter-broker protocol version and the minimum on-disk format version, after all. best, Colin On Tue, Mar 26, 2019, at 17:52, Ying Zheng wrote: > I have rewritten the KIP. The new proposal is adding a new configuration > min.api.version in Kafka broker. > > Please review the new KIP. Thank you! > > On Fri, Mar 1, 2019 at 11:06 AM Colin McCabe wrote: > > > On Wed, Feb 27, 2019, at 15:53, Harsha wrote: > > > HI Colin, > > > Overlooked the IDEMPOTENT_WRITE ACL. This along with > > > client.min.version should solve the cases proposed in the KIP. > > > Can we turn this KIP into adding min.client.version config to broker > > > and it could be part of the dynamic config . > > > > +1, sounds like a good idea. > > > > Colin > > > > > > > > > > Thanks, > > > Harsha > > > > > > On Wed, Feb 27, 2019, at 12:17 PM, Colin McCabe wrote: > > > > On Tue, Feb 26, 2019, at 16:33, Harsha wrote: > > > > > Hi Colin, > > > > > > > > > > "> I think Ismael and Gwen here bring up a good point. The version > > of the > > > > > > request is a technical detail that isn't really related to > > > > > > authorization. There are a lot of other technical details like > > this > > > > > > like the size of the request, the protocol it came in on, etc. > > None of > > > > > > them are passed to the authorizer-- they all have configuration > > knobs > > > > > > to control how we handle them. If we add this technical detail, > > > > > > logically we'll have to start adding all the others, and the > > authorizer > > > > > > API will get really bloated. It's better to keep it focused on > > > > > > authorization, I think." > > > > > > > > > > probably my previous email is not clear but I am agreeing with > > Gwen's point. > > > > > I am not in favor of extending authorizer to support this. > > > > > > > > > > > > > > > "> Another thing to consider is that if we add a new broker > > configuration > > > > > > that lets us set a minimum client version which is allowed, that > > could > > > > > > be useful to other users as well. On the other hand, most users > > are > > > > > > not likely to write a custom authorizer to try to take advantage > > of > > > > > > version information being passed to the authorizer. So, I think > > using> a configuration is clearly the better way to go here. Perhaps it > > can > > > > > > be a KIP-226 dynamic configuration to make this easier to deploy?" > > > > > > > > > > Although minimum client version might help to a certain extent there > > > > > are other cases where we want users to not start using transactions > > for > > > > > example. My proposal in the previous thread was to introduce another > > > > > module/interface, let's say > > > > > "SupportedAPIs" which will take in dynamic configuration to check > > which > > > > > APIs are allowed. > > > > > It can throw UnsupportedException just like we are throwing > > > > > Authorization Exception. > > > > > > > > Hi Harsha, > > > > > > > > We can already prevent people from using transactions using ACLs, > > > > right? That's what the IDEMPOTENT_WRITE ACL was added for. > > > > > > > > In general, I think users should be able to think of ACLs in terms of > > > > "what can I do" rather than "how is it implemented." 
For example, > > > > maybe some day we will replace FetchRequest with GetStuffRequest. But > > > > users who have READ permission on a topic shouldn't have to change > > > > anything. So I think the Authorizer interface should not be aware of > > > > individual RPC types or message versions. > > > > > > > > best, > > > > Colin > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > Harsha > > > > > > > > > > > > > > > n Tue, Feb 26, 2019, at 10:04 AM, Colin McCabe wrote: > > > > > > Hi Harsha, > > > > > > > > > > > > I think Ismael and Gwen here bring up a good point. The version > > of the > > > > > > request is a technical detail that isn't really related to > > > > > > authorization. There are a lot of other technical details like > > this > > > > > > like the size of the request, the protocol it came in on, etc. > > None of > > > > > > them are passed to the authorizer-- they all have configuration > > knobs > > > > > > to control how we handle them. If we add this technical detail, > > > > > > logically we'll have to start adding all the others, and the > > authorizer > > > > > > API will get really bloated. It's better to keep it focused on > > > > > > authorization, I think. > > > > > > > > > > > > Another thing to consider is that if we add a new broker > > configuration > > > > > > that lets us set a minimum client version which is allowed, that > > could > > > > > > be useful to other users as well. On t
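For readers following along, the existing mechanism Colin refers to is the cluster-level IdempotentWrite ACL: denying it to a principal already blocks idempotent (and therefore transactional) producing. A hedged CLI sketch, with the authorizer properties depending on your own setup:

    # Grant IdempotentWrite only to principals that should use idempotent/transactional
    # producers; everyone else is rejected at the broker.
    bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
      --add --allow-principal User:alice \
      --operation IdempotentWrite --cluster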
Re: [VOTE] KIP-422: Use the default value of max.poll.interval in Streams
Ah, good point, Guozhang. I'll remove that mention from the KIP. On Wed, Mar 27, 2019 at 3:30 PM Bill Bejeck wrote: > +1 for me, > > Thanks, > Bill > > On Wed, Mar 27, 2019 at 4:13 PM Guozhang Wang wrote: > > > +1 from me. > > > > Though note that we cannot make such changes in older versions since even > > if we release new versions out of those branches they are considered > > bug-fix only and hence should not have any interface impacting changes. > > > > > > Guozhang > > > > On Wed, Mar 27, 2019 at 12:55 PM John Roesler wrote: > > > > > Hi all, > > > > > > Since the KIP is so small, I'm going to optimistically start the vote > for > > > KIP-422 to remove our "max int" default max.poll.interval.ms in > Streams > > > and > > > fall back to the Consumer default of five minutes. > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams > > > > > > Permalink: https://cwiki.apache.org/confluence/x/1COGBg > > > > > > See also: https://issues.apache.org/jira/browse/KAFKA-6399 > > > > > > Please let me know if you have any objections and wish to return to the > > > discussion phase! > > > > > > Thanks, > > > -John > > > > > > > > > -- > > -- Guozhang > > >
[jira] [Created] (KAFKA-8168) Add a generated ApiMessageType class
Colin P. McCabe created KAFKA-8168: -- Summary: Add a generated ApiMessageType class Key: KAFKA-8168 URL: https://issues.apache.org/jira/browse/KAFKA-8168 Project: Kafka Issue Type: Improvement Reporter: Colin P. McCabe Assignee: Colin P. McCabe Add a generated ApiMessageType class. This will make it easier to do operations based on the type of an ApiMessage. Once all the RPCs are converted to use protocol generation, we can switch to using this instead of ApiKeys.java (possibly renaming this to ApiKeys.java?) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] KIP-422: Use the default value of max.poll.interval in Streams
+1 for me, Thanks, Bill On Wed, Mar 27, 2019 at 4:13 PM Guozhang Wang wrote: > +1 from me. > > Though note that we cannot make such changes in older versions since even > if we release new versions out of those branches they are considered > bug-fix only and hence should not have any interface impacting changes. > > > Guozhang > > On Wed, Mar 27, 2019 at 12:55 PM John Roesler wrote: > > > Hi all, > > > > Since the KIP is so small, I'm going to optimistically start the vote for > > KIP-422 to remove our "max int" default max.poll.interval.ms in Streams > > and > > fall back to the Consumer default of five minutes. > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams > > > > Permalink: https://cwiki.apache.org/confluence/x/1COGBg > > > > See also: https://issues.apache.org/jira/browse/KAFKA-6399 > > > > Please let me know if you have any objections and wish to return to the > > discussion phase! > > > > Thanks, > > -John > > > > > -- > -- Guozhang >
[jira] [Resolved] (KAFKA-1149) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/KAFKA-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang resolved KAFKA-1149. -- Resolution: Fixed > Please delete old releases from mirroring system > > > Key: KAFKA-1149 > URL: https://issues.apache.org/jira/browse/KAFKA-1149 > Project: Kafka > Issue Type: Bug > Environment: http://www.apache.org/dist/kafka/old_releases/ >Reporter: Sebb >Assignee: Guozhang Wang >Priority: Major > > To reduce the load on the ASF mirrors, projects are required to delete old > releases [1] > Please can you remove all non-current releases? > Thanks! > [Note that older releases are always available from the ASF archive server] > [1] http://www.apache.org/dev/release.html#when-to-archive -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] KIP-422: Use the default value of max.poll.interval in Streams
+1 from me. Though note that we cannot make such changes in older versions since even if we release new versions out of those branches they are considered bug-fix only and hence should not have any interface impacting changes. Guozhang On Wed, Mar 27, 2019 at 12:55 PM John Roesler wrote: > Hi all, > > Since the KIP is so small, I'm going to optimistically start the vote for > KIP-422 to remove our "max int" default max.poll.interval.ms in Streams > and > fall back to the Consumer default of five minutes. > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams > > Permalink: https://cwiki.apache.org/confluence/x/1COGBg > > See also: https://issues.apache.org/jira/browse/KAFKA-6399 > > Please let me know if you have any objections and wish to return to the > discussion phase! > > Thanks, > -John > -- -- Guozhang
[jira] [Created] (KAFKA-8167) Document named stateful operators
John Roesler created KAFKA-8167: --- Summary: Document named stateful operators Key: KAFKA-8167 URL: https://issues.apache.org/jira/browse/KAFKA-8167 Project: Kafka Issue Type: Task Components: documentation, streams Affects Versions: 2.1.1, 2.2.0, 2.1.0, 2.3.0, 2.1.2, 2.2.1 Reporter: John Roesler In KIP-372 / KAFKA-7406, we added the ability to name all persistent resources in support of topology compatibility. We missed documenting it, though, or at least, I couldn't find the docs. Since this feature is a higher-level, cross-cutting concern, we should add a section to the docs describing the compatibility problem and the full set of practices that you can employ to achieve compatibility by naming persistent resources. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
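Until that documentation section exists, a minimal illustration of the practice in question (naming the resources explicitly so that generated topic and store names stay stable across topology changes):

    // Illustrative: explicit names pin the repartition topic and the changelog/state store name.
    KTable<String, Long> counts = builder.<String, String>stream("events")
        .groupByKey(Grouped.as("events-by-key"))      // names the repartition topic, if one is created
        .count(Materialized.as("event-counts"));      // names the store, and thus its changelog topic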
[VOTE] KIP-422: Use the default value of max.poll.interval in Streams
Hi all, Since the KIP is so small, I'm going to optimistically start the vote for KIP-422 to remove our "max int" default max.poll.interval.ms in Streams and fall back to the Consumer default of five minutes. https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams Permalink: https://cwiki.apache.org/confluence/x/1COGBg See also: https://issues.apache.org/jira/browse/KAFKA-6399 Please let me know if you have any objections and wish to return to the discussion phase! Thanks, -John
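Concretely, the proposal means Streams applications would pick up the plain consumer default instead of the current Streams override; anyone who relies on the old behaviour can still pin it explicitly:

    # Consumer default that Streams would fall back to under this KIP:
    max.poll.interval.ms=300000        # 5 minutes

    # To keep the previous Streams behaviour, set the old value explicitly in the application config:
    max.poll.interval.ms=2147483647    # Integer.MAX_VALUE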
Re: [DISCUSS] KIP-422: Return to default max poll interval in Streams
Since this KIP is so small in scope, I'll optimistically start the vote (in a separate thread). If there are objections, we can return to the discussion. Thanks, -John On Wed, Mar 27, 2019 at 2:31 PM John Roesler wrote: > Thanks for the response, Guozhang. > > Oh, I should have mentioned that. I was thinking that we might as well > make this change in 1.0+, not just in trunk. I'll update the KIP. > > Thanks, > -John > > On Wed, Mar 27, 2019 at 12:30 PM Guozhang Wang wrote: > >> This is a reasonable one and we should probably do it post 1.0 already. >> I'm +1 (binding) on this. >> >> >> Guozhang >> >> On Wed, Mar 27, 2019 at 10:23 AM John Roesler wrote: >> >> > Hi all, >> > >> > I'd like to propose KIP-422 to remove our "max int" default >> > max.poll.interval.ms in Streams and fall back to the Consumer default >> of >> > five minutes. >> > >> > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams >> > >> > Permalink: https://cwiki.apache.org/confluence/x/1COGBg >> > >> > See also: https://issues.apache.org/jira/browse/KAFKA-6399 >> > >> > Thanks, >> > -John >> > >> >> >> -- >> -- Guozhang >> >
Re: [DISCUSS] KIP-422: Return to default max poll interval in Streams
Thanks for the response, Guozhang. Oh, I should have mentioned that. I was thinking that we might as well make this change in 1.0+, not just in trunk. I'll update the KIP. Thanks, -John On Wed, Mar 27, 2019 at 12:30 PM Guozhang Wang wrote: > This is a reasonable one and we should probably do it post 1.0 already. > I'm +1 (binding) on this. > > > Guozhang > > On Wed, Mar 27, 2019 at 10:23 AM John Roesler wrote: > > > Hi all, > > > > I'd like to propose KIP-422 to remove our "max int" default > > max.poll.interval.ms in Streams and fall back to the Consumer default of > > five minutes. > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams > > > > Permalink: https://cwiki.apache.org/confluence/x/1COGBg > > > > See also: https://issues.apache.org/jira/browse/KAFKA-6399 > > > > Thanks, > > -John > > > > > -- > -- Guozhang >
Re: MirrorMaker 2.0 and Streams interplay (topic naming control)
Hi Paul, Sorry for overlooking the "offset translation" MM2 feature. I'm glad Ryanne was able to confirm this would work. I'm just one voice, but FWIW, I think that the internal topic naming scheme is a public API. We document the structure of the naming scheme in several places. We also recommend making use of the fact that the applicationId is a prefix of the topic name in conjunction with Kafka Broker ACLs to grant access to the internal topics to the applications that own them. Actually, for this latter reason, I'm concerned that giving more control over the names of internal topics might make topic security and access control more difficult. Or maybe this concern is off-base, and folks who take advanced control over the topic name would also take on the responsibility to make sure their naming scheme works in conjunction with their broker configs. For whatever reason, I hadn't considered prefixing the application's id with "pre-prod.". Offhand, I think this would achieve the desired outcome. There may be some devil in the details, of course. Glad to hear, by the way, that you've already considered the problem of concurrent modifications to the changelogs (etc.). It sounds like your plan should work, although it might become a management burden if you start wanting to run a lot of these stream-app tests. In that case, you could consider mirroring the relevant topics *again* into a test-specific prefix (like "pre-prod.test-1.", up to some point. Then, you could stop the mirror, run the test, verify the results, and then just delete the whole test dataset. Does it seem like you have a good path forward? From what I'm hearing, the "user-space" approach is at least worth exploring before considering a new API. Of course, if it doesn't pan out for whatever reason, I'd (personally) support adding whatever features are necessary to support your use case. Thanks, -John On Mon, Mar 25, 2019 at 9:40 PM Paul Whalen wrote: > John and Ryanne, > > Thanks for the responses! I think Ryanne's way of describing the question > is actually a much better summary than my long winded description: "a > Streams app can switch between topics with and without a cluster alias > prefix when you migrate between prod and pre-prod, while preserving state." > > To address a few of John's points... > > But, the prod app will still be running, and its changelog will still be > > mirrored into pre-prod when you start the pre-prod app. > > > The idea is actually to turn off the mirroring from prod to pre-prod during > this period, so the environments can operate completely independently and > their state can comfortably diverge during the testing period. After the > testing period we'd be happy to throw away everything in pre-prod and start > mirroring again from prod with a blank slate. > > Also, the pre-prod app won't be in the same consumer group as the prod app, > > so it won't know from what offset to start processing input. > > > This is where I'm hoping the magic of MM2 will come in - at the time we > shut off mirroring from prod to pre-prod in order to spin of the pre-prod > environment, we will do an "offset translation" with RemoteClusterUtils > like Ryanne mentioned, so new Streams apps in pre-prod will see consumer > offsets that make sense for reading from pre-prod topics. > > I like both of your ideas around the "user space" solution: subscribing to > multiple topics, or choosing a topic based on config. 
However, in order to > populate their internal state properly, when the pre-prod apps come up they > will need to look for repartition and changelog topics with the right > prefix. This seems problematic to me since the user doesn't have direct > control over those topic names, though it did just occur to me now that the > user *sort of* does. Since the naming scheme is currently just > applicationId + "-" + storeName + "-changelog", we could translate the > consumer group offsets to a consumer group with a new name that has the > same prefix as the mirrored topics do. That seems a bit clumsly/lucky to > me (is the internal topic naming convention really a "public API"?), but I > think it would work. > > I'd be curious to hear if folks think that solution would work and be an > acceptable pattern, since my original proposal of more user control of > internal topic naming did seem a bit heavy handed. > > Thanks very much for your help! > Paul > > On Mon, Mar 25, 2019 at 3:14 PM Ryanne Dolan > wrote: > > > Hey Paul, thanks for the kind words re MM2. > > > > I'm not a Streams expert first off, but I think I understand your > question: > > if a Streams app can switch between topics with and without a cluster > alias > > prefix when you migrate between prod and pre-prod, while preserving > state. > > Streams supports regexes and lists of topics as input, so you can use > e.g. > > builder.stream(List.of("topic1", "prod.topic1")), which is a good place > to > > start. In this case, the combined subscription is s
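For reference, the internal naming convention being relied on here is documented for Streams as roughly the following; whether it should be treated as a public contract is exactly the question raised in this thread:

    <application.id>-<storeName>-changelog          // e.g. my-app-myStore-changelog
    <application.id>-<operatorName>-repartition     // for repartition topics

    // Illustratively, a pre-prod app started with application.id set to
    // "<cluster-alias>.my-app" would look for "<cluster-alias>.my-app-myStore-changelog",
    // matching the prefixed names MM2 gives the mirrored changelog topics.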
[jira] [Created] (KAFKA-8166) Kafka 2.2 Javadoc broken
Sachin NS created KAFKA-8166: Summary: Kafka 2.2 Javadoc broken Key: KAFKA-8166 URL: https://issues.apache.org/jira/browse/KAFKA-8166 Project: Kafka Issue Type: Bug Components: documentation Affects Versions: 2.2.0 Reporter: Sachin NS Attachments: 2.2_javadoc.JPG The Kafka 2.2 javadocs link mentioned in the Apache Kafka API documentation ([http://kafka.apache.org/documentation/#producerapi)] errors out and gives a 404 Not Found error. Below is the link which is mentioned in the documentation: [http://kafka.apache.org/22/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html] I don't see javadoc folder within [https://kafka.apache.org/22/] This error is not present in Kafka 2.1 or previous documentation. Attaching screenshot for reference. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8165) Streams task causes Out Of Memory after connection and store restoration
Di Campo created KAFKA-8165: --- Summary: Streams task causes Out Of Memory after connection and store restoration Key: KAFKA-8165 URL: https://issues.apache.org/jira/browse/KAFKA-8165 Project: Kafka Issue Type: Bug Components: streams Affects Versions: 2.1.0 Environment: Amazon Linux container, on Docker based on wurstmeister/kafka image. Reporter: Di Campo Having a Kafka Streams 2.1 application, when Kafka brokers are stable, the (largely stateful) application has been consuming ~160 messages per second at a sustained rate for several hours. However it started having connection issues to the brokers. {code:java} Connection to node 3 (/172.31.36.118:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient){code} Also it began showing a lot of these errors: {code:java} WARN [Consumer clientId=stream-processor-81e1ce17-1765-49f8-9b44-117f983a2d19-StreamThread-2-consumer, groupId=stream-processor] 1 partitions have leader brokers without a matching listener, including [broker-2-health-check-0] (org.apache.kafka.clients.NetworkClient){code} In fact, the _health-check_ topic is in the broker but not consumed by this topology or used in any way by the Streams application (it is just broker healthcheck). It does not complain about topics that are actually consumed by the topology. Some time after these errors (that appear at a rate of 24 appearances per second during ~5 minutes), then the following logs appear: {code:java} [2019-03-27 15:14:47,709] WARN [Consumer clientId=stream-processor-81e1ce17-1765-49f8-9b44-117f983a2d19-StreamThread-1-restore-consumer, groupId=] Connection to node -3 (/ip3:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient){code} In between 6 and then 3 lines of "Connection could not be established" error messages, 3 of these ones slipped in: [2019-03-27 15:14:47,723] WARN Started Restoration of visitorCustomerStore partition 15 total records to be restored 17 (com.divvit.dp.streams.applications.monitors.ConsoleGlobalRestoreListener) ... one for each different KV store I have (I still have another KV that does not appear, and a WindowedStore store that also does not appear). Then I finally see "Restoration Complete" (using a logging ConsoleGlobalRestoreListener as in docs) messages for all of my stores. So it seems it may be fine now to restart the processing. Three minutes later, some events get processed, and I see an OOM error: java.lang.OutOfMemoryError: GC overhead limit exceeded ... so given that it usually allows to process during hours under same circumstances, I'm wondering whether there is some memory leak in the connection resources or somewhere in the handling of this scenario. Kafka and KafkaStreams 2.1 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
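For context, the restore listener the reporter mentions is a user-supplied implementation of the Streams StateRestoreListener interface, registered on the KafkaStreams instance; the reporter's ConsoleGlobalRestoreListener is presumably something along these lines (a sketch, not the actual class from the report):

    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.streams.processor.StateRestoreListener;

    public class ConsoleGlobalRestoreListener implements StateRestoreListener {
        @Override
        public void onRestoreStart(TopicPartition partition, String store, long startOffset, long endOffset) {
            System.out.println("Started Restoration of " + store + " partition " + partition.partition()
                + " total records to be restored " + (endOffset - startOffset));
        }

        @Override
        public void onBatchRestored(TopicPartition partition, String store, long batchEndOffset, long numRestored) {
            // optionally log per-batch progress
        }

        @Override
        public void onRestoreEnd(TopicPartition partition, String store, long totalRestored) {
            System.out.println("Restoration complete for " + store + " partition " + partition.partition());
        }
    }

    // Registered before start(): kafkaStreams.setGlobalStateRestoreListener(new ConsoleGlobalRestoreListener());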
Re: [DISCUSS] KIP-401 TransformerSupplier/ProcessorSupplier enhancements
Hello Paul,

Thanks for the uploaded PR and the detailed description! I've made a pass on it and left some comments.

Overall I think I agree with you that passing in the StoreBuilder directly, rather than the store name, is more convenient as it does not require another `addStore` call, but we just need to spend some more documentation effort on educating users about the two ways of connecting their stores. I'm slightly concerned about this education curve, but I can be convinced if most people feel it is worthwhile.

Guozhang

On Sat, Mar 23, 2019 at 5:15 PM Paul Whalen wrote:
> I'd like to resurrect this discussion with a cursory, proof-of-concept
> implementation of the KIP which combines many of our ideas:
> https://github.com/apache/kafka/pull/6496. I tried to keep the diff as
> small as possible for now, just using it to convey the main ideas. But
> I'll separately address some of our earlier discussion:
>
> - Will there be a new, separate interface for users to implement for the
> new functionality? No, to hopefully keep things simple, all of the
> Processor/TransformerSupplier interfaces will just extend
> StateStoresSupplier, allowing users to opt in to this functionality by
> overriding the default implementation that gives an empty list.
> - Will the interface allow users to specify the store name, or the
> entire StoreBuilder? The entire StoreBuilder, so the
> Processor/TransformerSupplier can completely encapsulate the name and
> implementation of a state store if desired.
> - Will the old way of specifying store names alongside the supplier when
> calling stream.process/transform() be deprecated? No, this is still a
> legitimate way to wire up Processors/Transformers and their stores. But I
> would recommend not allowing stream.process/transform() calls that use both
> store declaration mechanisms (this restriction is not in the proof of
> concept).
> - How will we handle adding the same state store to the topology
> multiple times because different Processor/TransformerSuppliers declare it?
> topology.addStateStore() will be slightly relaxed for convenience, and will
> allow adding the same StoreBuilder multiple times as long as the exact same
> StoreBuilder instance is being added for the same store name. This seems
> to prevent in practice the issue of accidentally making two state stores
> one by adding with the same name. For additional safety, if we wanted to
> (not in the proof of concept), we could allow for this relaxation only for
> internal callers of topology.addStateStore().
>
> So, in summary, the use cases look like:
>
> - 1 transformer/processor that owns its store: Using the new
> StateStoresSupplier interface method to supply its StoreBuilders that will
> be added to the topology automatically.
> - Multiple transformers/processors that share the same store: Either
>
> 1. The old way: the StoreBuilder is defined "far away" from the
> Transformer/Processor implementations, and is added to the topology
> manually by the user
> 2. The new way: the StoreBuilder is defined closer to the
> Transformer/Processor implementations, and the same instance is returned by
> all Transformer/ProcessorSuppliers that need it
>
> This makes the KIP wiki a bit stale; I'll update it if we want to bring this
> design to a vote.
>
> Thanks!
> Paul > > On Sun, Dec 16, 2018 at 6:04 PM Guozhang Wang wrote: > > > Matthias / Paul, > > > > The concern I had about introducing `StoreBuilderSupplier` is simply > > because it is another XXSupplier to the public API, so I'd like to ask if > > we really have to add it :) > > > > The difference between encapsulating the store name and encapsulating the > > full state store builder is that, in the former: > > > > --- > > > > String storeName = "store1"; > > builder.addStore(new MyStoreBuilder(storeName)); > > stream1.transform(new MyTransformerSupplier(storeName)); // following > my > > proposal, that the store name can be passed in and used for both > > `listStores` and in the `Transformer#init`; so the Transformer function > > does not need to get the constant string name again. > > > > // one caveat to admit, is that > > MyTransofmerSupplier logic may be just unique to `store1` so it cannot be > > reused with a different store name anyways. > > --- > > > > While in the latter: > > > > --- > > > > stream1.transform(new MyTransformerSupplierForStore1); // the name is > > just indicating that we may have one such supplier for each store. > > > > --- > > > > I understand the latter introduce more convenience from the API, but the > > cost is that since we still cannot completely `builder.addStore`, but > only > > reduce its semantic scope to shared state stores only,; hence users need > to > > learn two ways of creating state stores for those two patterns. > > > > My argument is that more public A
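To make the wiring styles in the summary concrete, here is a rough sketch of the "new way" as it appears in the proof-of-concept PR; StateStoresSupplier and its method name are placeholders from this discussion, not a finalized API:

    // Sketch only: interface/method names follow the PoC and may change.
    class MyTransformerSupplier implements TransformerSupplier<String, String, KeyValue<String, String>> {

        private final StoreBuilder<KeyValueStore<String, Long>> storeBuilder =
            Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("my-store"),
                                        Serdes.String(), Serdes.Long());

        // The supplier declares its own stores, so callers no longer need
        // builder.addStateStore(...) plus repeating the store name in process()/transform().
        public List<StoreBuilder<?>> stateStores() {
            return Collections.singletonList(storeBuilder);
        }

        @Override
        public Transformer<String, String, KeyValue<String, String>> get() {
            return new MyTransformer("my-store");   // hypothetical Transformer that looks up the store in init()
        }
    }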
Re: [DISCUSS] KIP-422: Return to default max poll interval in Streams
This is a reasonable one and we should probably do it post 1.0 already. I'm +1 (binding) on this. Guozhang On Wed, Mar 27, 2019 at 10:23 AM John Roesler wrote: > Hi all, > > I'd like to propose KIP-422 to remove our "max int" default > max.poll.interval.ms in Streams and fall back to the Consumer default of > five minutes. > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams > > Permalink: https://cwiki.apache.org/confluence/x/1COGBg > > See also: https://issues.apache.org/jira/browse/KAFKA-6399 > > Thanks, > -John > -- -- Guozhang
[DISCUSS] KIP-422: Return to default max poll interval in Streams
Hi all, I'd like to propose KIP-422 to remove our "max int" default max.poll.interval.ms in Streams and fall back to the Consumer default of five minutes. https://cwiki.apache.org/confluence/display/KAFKA/KIP-442%3A+Return+to+default+max+poll+interval+in+Streams Permalink: https://cwiki.apache.org/confluence/x/1COGBg See also: https://issues.apache.org/jira/browse/KAFKA-6399 Thanks, -John
Re: [DISCUSS] KIP-439: Deprecate Interface WindowStoreIterator
Hello Matthias, Just to clarify the naming conflicts is between the newly added function and the old functions that we want to deprecate / remove right? The existing ones have different signatures with parameters so that they should not have conflicts. I was thinking about just make the change directly without deprecating existing ones which would require users of 2.3 to make code changes -- this code change looks reasonably straight-forward to me and not much worth deferring to later when the deprecated ones are removed. On the other hand, just deprecating "WindowIterator" without add new functions seems not very useful for users either since it is only used as an indicator but users cannot make code changes during this phase anyways, so it is still a `one-cut` deal when we eventually remove the deprecated ones and add the new one. Hence I'm slightly inclining to trade compatibility and replace it with new functions in one release, but if people have a good idea of the renaming approach (I do not have a good one on top of my head :) I can also be convinced that way. Guozhang On Mon, Mar 11, 2019 at 10:41 AM Matthias J. Sax wrote: > I am open to change the return type to > > KeyValueIterator, V> > > However, this requires to rename > > #fetch(K key, long startTimestamp, long endTimestamp) > #fetch(K key, Instant startTimestamp, Instant endTimestamp) > > to avoid naming conflicts. > > What new name would you suggest? The existing methods are called > `fetch()`, `fetchAll()`, `all()`, `put()`. > > While I think it would be good to get fully aligned return types, I am > not sure how we can get aligned method names (without renaming all of > them...)? If we think it's worth to rename all to get this cleaned up, I > am no opposed. > > > Thoughts? > > > -Matthias > > > On 3/11/19 10:27 AM, Guozhang Wang wrote: > > I was thinking about changing the return type even, to > > `KeyValueIterator, V>` since it is confusing to users about > the > > key typed `Long` (Streams javadoc today did not explain it clearly > either), > > note it is not backward compatible at all. > > > > Personally I'd prefer to just deprecate the API and new new ones that > > return `KeyValueIterator, V>` directly, but if most people > felt > > it is too intrusive for compatibility I can be convinced with > > `KeyValueIterator` as well. > > > > Guozhang > > > > On Mon, Mar 11, 2019 at 10:17 AM Sophie Blee-Goldman < > sop...@confluent.io> > > wrote: > > > >> I remember thinking this while working on window stores, am definitely > for > >> it. > >> > >> On Mon, Mar 11, 2019 at 9:20 AM John Roesler wrote: > >> > >>> Sounds great to me. Thanks, Matthias! > >>> -John > >>> > >>> On Sun, Mar 10, 2019 at 11:58 PM Matthias J. Sax < > matth...@confluent.io> > >>> wrote: > >>> > Hi, > > I would like to propose KIP-439 to deprecate interface > `WindowStoreIterator`. > > > > >>> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-439%3A+Deprecate+Interface+WindowStoreIterator > > Looking forward to your feedback. > > > -Matthias > > > >>> > >> > > > > > > -- -- Guozhang
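For concreteness, the shape of the change being discussed is roughly the following; the replacement method name is exactly the open naming question, so "fetchWindowed" below is a pure placeholder:

    // Today (candidate for deprecation): the window start is exposed as a raw Long key.
    WindowStoreIterator<V> fetch(K key, Instant from, Instant to);

    // Direction under discussion: align with fetch(K, K, ...), fetchAll(...) and all(),
    // which already return the richer iterator type. A new name avoids an overload clash.
    KeyValueIterator<Windowed<K>, V> fetchWindowed(K key, Instant from, Instant to);  // placeholder name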
Re: [DISCUSSION] KIP-418: A method-chaining way to branch KStream
OK, let me summarize what we have discussed up to this point.

First, it seems that it's commonly agreed that the branch API needs improvement. Motivation is given in the KIP.

There are two potential ways to do it:

1. (as originally proposed)

new KafkaStreamsBrancher<..>()
 .branch(predicate1, ks ->..)
 .branch(predicate2, ks->..)
 .defaultBranch(ks->..) //optional
 .onTopOf(stream).mapValues(...) //onTopOf returns its argument

PROS: 1) Fully backwards compatible. 2) The code won't make sense until all the necessary ingredients are provided.

CONS: The need to create a KafkaStreamsBrancher instance contrasts with the fluency of other KStream methods.

2. (as Paul proposes)

stream
 .branch(predicate1, ks ->...)
 .branch(predicate2, ks->...)
 .defaultBranch(ks->...) //or noDefault(). Both defaultBranch(..) and noDefault() return void

PROS: Generally follows the way the KStream interface is defined.

CONS: We need to define two terminal methods (defaultBranch(ks->) and noDefault()). And for a user it is very easy to miss the fact that one of the terminal methods should be called. If these methods are not called, we can throw an exception at runtime.

Colleagues, what are your thoughts? Can we do better?

Regards,

Ivan

On 27.03.2019 18:46, Ivan Ponomarev wrote:

On 25.03.2019 17:43, Ivan Ponomarev wrote:

Paul,

I see your point when you are talking about stream..branch..branch...default..

Still, I believe that this cannot be implemented the easy way. Maybe we all should think further.

Let me comment on two of your ideas.

> user could specify a terminal method that assumes nothing will reach the default branch, throwing an exception if such a case occurs.

1) OK, apparently this should not be the only option besides `default`, because there are scenarios when we want to just silently drop the messages that didn't match any predicate.

2) Throwing an exception in the middle of data flow processing looks like a bad idea. In the stream processing paradigm, I would prefer to emit a special message to a dedicated stream. This is exactly where `default` can be used.

> it would be fairly easy for the InternalTopologyBuilder to track dangling branches that haven't been terminated and raise a clear error before it becomes an issue.

You mean a runtime exception, when the program is compiled and run? Well, I'd prefer an API that simply won't compile if used incorrectly. Can we build such an API as a method chain starting from the KStream object? There is a huge cost difference between runtime and compile-time errors. Even if a failure is uncovered instantly by unit tests, it costs more for the project than a compilation failure.

Regards,

Ivan

On 25.03.2019 0:38, Paul Whalen wrote:

Ivan,

Good point about the terminal operation being required. But is that really such a bad thing? If the user doesn't want a defaultBranch they can call some other terminal method (noDefaultBranch()?) just as easily. In fact I think it creates an opportunity for a nicer API - a user could specify a terminal method that assumes nothing will reach the default branch, throwing an exception if such a case occurs. That seems like an improvement over the current branch() API, which allows for the more subtle behavior of records unexpectedly getting dropped. The need for a terminal operation certainly has to be well documented, but it would be fairly easy for the InternalTopologyBuilder to track dangling branches that haven't been terminated and raise a clear error before it becomes an issue.
Especially now that there is a "build step" where the topology is actually wired up, when StreamsBuilder.build() is called. Regarding onTopOf() returning its argument, I agree that it's critical to allow users to do other operations on the input stream. With the fluent solution, it ought to work the same way all other operations do - if you want to process off the original KStream multiple times, you just need the stream as a variable so you can call as many operations on it as you desire. Thoughts? Best, Paul On Sun, Mar 24, 2019 at 2:02 PM Ivan Ponomarev wrote: Hello Paul, I afraid this won't work because we do not always need the defaultBranch. And without a terminal operation we don't know when to finalize and build the 'branch switch'. In my proposal, onTopOf returns its argument, so we can do something more with the original branch after branching. I understand your point that the need of special object construction contrasts the fluency of most KStream methods. But here we have a special case: we build the switch to split the flow, so I think this is still idiomatic. Regards, Ivan 24.03.2019 4:02, Paul Whalen пишет: Ivan, I think it's a great idea to improve this API, but I find the onTopOff() mechanism a little confusing since it contrasts the fluency of other KStream method calls. Ideally I'd like to just call a method on the stream so it still reads top to bottom if the br
Re: [VOTE] KIP-392: Allow consumers to fetch from the closest replica
@Jun Re; 200: It's a fair point that it is useful to minimize the client changes that are needed to get a benefit from affinity. I think the high level argument that this is mostly the concern of operators and should be under their control. Since there is a protocol bump here, users will have to upgrade clients at a minimum. An alternative would be to make "preferred" the default option for `replica.selection.policy`. But I agree that the value of the configuration becomes less clear in this case. Overall this suggestion sounds good to me, but let me see if there is any additional feedback before I update the KIP. Re; 201: Ack. @Guozhang I think rack.id is still an easier and more reliable way for many users to determine local affinity. This lets us provide the simple rack-aware implementation which is probably sufficient for a fair number of use cases and wouldn't require users to write any custom code. Thanks, Jason On Wed, Mar 27, 2019 at 9:05 AM Guozhang Wang wrote: > Hello Jun, > > Regarding 200: if we assume that most client would not bother setting > rack.id at all and affinity can be determined w/o rack.id via TCP header, > plus rack.id may not be "future-proof" additional information is needed as > well, then do we still need to change the protocol of metadata request to > add `rack.id`? > > > Guozhang > > On Tue, Mar 26, 2019 at 6:23 PM Jun Rao wrote: > > > Hi, Jason, > > > > Thanks for the KIP. Just a couple of more comments. > > > > 200. I am wondering if we really need the replica.selection.policy config > > in the consumer. A slight variant is that we (1) let the consumer always > > fetch from the PreferredReplica and (2) provide a default implementation > of > > ReplicaSelector that always returns the leader replica in select() for > > backward compatibility. Then, we can get rid of replica.selection.policy > in > > the consumer. The benefits are that (1) fewer configs, (2) affinity > > optimization can potentially be turned on with just a broker side change > > (assuming affinity can be determined w/o client rack.id). > > > > 201. I am wondering if PreferredReplica in the protocol should be named > > PreferredReadReplica since it's intended for reads? > > > > Jun > > > > On Mon, Mar 25, 2019 at 9:07 AM Jason Gustafson > > wrote: > > > > > Hi All, discussion on the KIP seems to have died down, so I'd like to > go > > > ahead and start a vote. Here is a link to the KIP: > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica > > > . > > > > > > +1 from me (duh) > > > > > > -Jason > > > > > > > > -- > -- Guozhang >
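For readers skimming the vote thread, the knobs being debated look roughly like this; the names come from the KIP discussion at this point in time and may well change before it is finalized:

    # Consumer side (as proposed in KIP-392; subject to rename):
    rack.id=us-east-1a                    # the consumer's location, matched against the broker's broker.rack
    replica.selection.policy=preferred    # "leader" preserves today's behaviour; "preferred" enables follower fetching

    # Broker side (already exists today):
    broker.rack=us-east-1a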
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
Hi All, Thanks for your initial feedback. We updated the KIP. Please take a look and let us know if you have any questions. https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage Thanks, Harsha On Wed, Feb 6, 2019, at 10:30 AM, Harsha wrote: > Thanks Eno, Adam & Satish for you review and questions. I'll address > these in KIP and update the thread here. > > Thanks, > Harsha > > On Wed, Feb 6, 2019, at 7:09 AM, Satish Duggana wrote: > > Thanks, Harsha for the KIP. It is a good start for tiered storage in > > Kafka. I have a few comments/questions. > > > > It may be good to have a configuration to keep the number of local > > segments instead of keeping only the active segment. This config can > > be exposed at cluster and topic levels with default value as 1. In > > some use cases, few consumers may lag over one segment, it will be > > better to serve from local storage instead of remote storage. > > > > It may be better to keep “remote.log.storage.enable” and respective > > configuration at topic level along with cluster level. It will be > > helpful in environments where few topics are configured with > > local-storage and other topics are configured with remote storage. > > > > Each topic-partition leader pushes its log segments with respective > > index files to remote whenever active log rolls over, it updates the > > remote log index file for the respective remote log segment. The > > second option is to add offset index files also for each segment. It > > can serve consumer fetch requests for old segments from local log > > segment instead of serving directly from the remote log which may > > cause high latencies. There can be different strategies in when the > > remote segment is copied to a local segment. > > > > What is “remote.log.manager.scheduler.interval.ms” config about? > > > > How do followers sync RemoteLogSegmentIndex files? Do they request > > from leader replica? This looks to be important as the failed over > > leader should have RemoteLogSegmentIndex updated and ready to avoid > > high latencies in serving old data stored in remote logs. > > > > Thanks, > > Satish. > > > > On Tue, Feb 5, 2019 at 10:42 PM Ryanne Dolan wrote: > > > > > > Thanks Harsha, makes sense. > > > > > > Ryanne > > > > > > On Mon, Feb 4, 2019 at 5:53 PM Harsha Chintalapani > > > wrote: > > > > > > > "I think you are saying that this enables additional (potentially > > > > cheaper) > > > > storage options without *requiring* an existing ETL pipeline. “ > > > > Yes. > > > > > > > > " But it's not really a replacement for the sort of pipelines people > > > > build > > > > with Connect, Gobblin etc.” > > > > > > > > It is not. But also making an assumption that everyone runs these > > > > pipelines for storing raw Kafka data into HDFS or S3 is also wrong > > > > assumption. > > > > The aim of this KIP is to provide tiered storage as whole package not > > > > asking users to ship the data on their own using existing ETL, which > > > > means > > > > running a consumer and maintaining those pipelines. > > > > > > > > " My point was that, if you are already offloading records in an ETL > > > > pipeline, why do you need a new pipeline built into the broker to ship > > > > the > > > > same data to the same place?” > > > > > > > > As you said its ETL pipeline, which means users of these pipelines are > > > > reading the data from broker and transforming its state and storing it > > > > somewhere. 
> > > > The point of this KIP is to store log segments as they are without changing > > > > their structure, so that we can use the existing offset mechanisms to > > > > look > > > > it up when the consumer needs to read old data. When you do load it via > > > > your existing pipelines you are reading the topic as a whole, which > > > > doesn’t guarantee that you’ll produce this data back into HDFS or S3 in > > > > the > > > > same order, and who is going to generate the index files again? > > > > > > > > > > > > "So you'd end up with one of 1) cold segments are only useful to Kafka; > > > > 2) > > > > you have the same data written to HDFS/etc twice, once for Kafka and > > > > once > > > > for everything else, in two separate formats” > > > > > > > > You are talking about two different use cases. If someone is storing raw data > > > > out of Kafka for long-term access, > > > > storing the data as it is in HDFS through Kafka will solve this issue. > > > > They do not need to run another pipeline to ship these logs. > > > > > > > > If they are running pipelines to store in HDFS in a different format, > > > > that's a different use case. Maybe they are transforming Kafka logs to > > > > ORC > > > > so that they can query through Hive. Once you transform the log > > > > segment, it > > > > loses its ability to use the existing offset index. > > > > The main objective here is not to change the existing protocol and still be > > > > able > > > > to write and read logs from remote stor
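To make the cluster-level vs. topic-level configuration point in this thread concrete, a topic could opt in to remote storage at creation time along these lines. This is only a sketch: it assumes the remote.log.storage.enable property discussed above ends up being exposed as a topic-level config, and the name may well change:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTieredTopic {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // "remote.log.storage.enable" is the name proposed in the KIP discussion;
                // it is assumed here to be a topic-level override and may change.
                NewTopic topic = new NewTopic("payments", 12, (short) 3)
                        .configs(Collections.singletonMap("remote.log.storage.enable", "true"));
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }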
Re: [VOTE] KIP-392: Allow consumers to fetch from the closest replica
Hi, Guozhang, In general, users may want to optimize affinity in different ways, e.g. latency, cost, etc. I am not sure if all those cases can be captured by client IP addresses. So, it seems that having a rack.id in the consumer is still potentially useful. Thanks, Jun On Wed, Mar 27, 2019 at 9:05 AM Guozhang Wang wrote: > Hello Jun, > > Regarding 200: if we assume that most client would not bother setting > rack.id at all and affinity can be determined w/o rack.id via TCP header, > plus rack.id may not be "future-proof" additional information is needed as > well, then do we still need to change the protocol of metadata request to > add `rack.id`? > > > Guozhang > > On Tue, Mar 26, 2019 at 6:23 PM Jun Rao wrote: > > > Hi, Jason, > > > > Thanks for the KIP. Just a couple of more comments. > > > > 200. I am wondering if we really need the replica.selection.policy config > > in the consumer. A slight variant is that we (1) let the consumer always > > fetch from the PreferredReplica and (2) provide a default implementation > of > > ReplicaSelector that always returns the leader replica in select() for > > backward compatibility. Then, we can get rid of replica.selection.policy > in > > the consumer. The benefits are that (1) fewer configs, (2) affinity > > optimization can potentially be turned on with just a broker side change > > (assuming affinity can be determined w/o client rack.id). > > > > 201. I am wondering if PreferredReplica in the protocol should be named > > PreferredReadReplica since it's intended for reads? > > > > Jun > > > > On Mon, Mar 25, 2019 at 9:07 AM Jason Gustafson > > wrote: > > > > > Hi All, discussion on the KIP seems to have died down, so I'd like to > go > > > ahead and start a vote. Here is a link to the KIP: > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica > > > . > > > > > > +1 from me (duh) > > > > > > -Jason > > > > > > > > -- > -- Guozhang >
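As an illustration of the client-side knobs being debated in this thread, a consumer opting into rack affinity might look roughly like this. Both rack.id and replica.selection.policy are the names proposed in the discussion and are not final:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class RackAffinityConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "analytics");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            // Proposed in this thread (names not final): identify the client's rack and opt in
            // to fetching from the closest replica instead of always fetching from the leader.
            props.put("rack.id", "us-east-1a");
            props.put("replica.selection.policy", "preferred");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("events"));
                consumer.poll(Duration.ofMillis(100));
            }
        }
    }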
Re: [VOTE] KIP-392: Allow consumers to fetch from the closest replica
Hello Jun, Regarding 200: if we assume that most clients would not bother setting rack.id at all and affinity can be determined w/o rack.id via the TCP header, plus rack.id may not be "future-proof" since additional information may be needed as well, then do we still need to change the protocol of the metadata request to add `rack.id`? Guozhang On Tue, Mar 26, 2019 at 6:23 PM Jun Rao wrote: > Hi, Jason, > > Thanks for the KIP. Just a couple of more comments. > > 200. I am wondering if we really need the replica.selection.policy config > in the consumer. A slight variant is that we (1) let the consumer always > fetch from the PreferredReplica and (2) provide a default implementation of > ReplicaSelector that always returns the leader replica in select() for > backward compatibility. Then, we can get rid of replica.selection.policy in > the consumer. The benefits are that (1) fewer configs, (2) affinity > optimization can potentially be turned on with just a broker side change > (assuming affinity can be determined w/o client rack.id). > > 201. I am wondering if PreferredReplica in the protocol should be named > PreferredReadReplica since it's intended for reads? > > Jun > > On Mon, Mar 25, 2019 at 9:07 AM Jason Gustafson > wrote: > > > Hi All, discussion on the KIP seems to have died down, so I'd like to go > > ahead and start a vote. Here is a link to the KIP: > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica > > . > > > > +1 from me (duh) > > > > -Jason > > > -- -- Guozhang
Re: Speeding up integration tests
Hi All, I've created a PR for what we have internally for retrying flaky tests. Any reviews and ideas are welcome: https://github.com/apache/kafka/pull/6506 It basically collects the failed classes and reruns them at the end. If they are successful, it overwrites the test report. Thanks, Viktor On Mon, Mar 11, 2019 at 1:05 PM Stanislav Kozlovski wrote: > I agree with Ron. > I think improving the framework with a configurable number of retries on > some tests will yield the highest ROI in terms of passing builds. > > On Fri, Mar 8, 2019 at 10:48 PM Ron Dagostino wrote: > > > It's a classic problem: you can't string N things together serially and > > expect high reliability. 5,000 tests in a row isn't going to give you a > > bunch of 9's. It feels to me that the test frameworks themselves should > > support a more robust model -- like a way to tag a test as "retry me up > to > > N times before you really consider me a failure" or something like that. > > > > Ron > > > > On Fri, Mar 8, 2019 at 11:40 AM Stanislav Kozlovski < > > stanis...@confluent.io> > > wrote: > > > > > > We internally have an improvement for a half a year now which reruns > > the > > > flaky test classes at the end of the test gradle task, lets you know > that > > > they were rerun and probably flaky. It fails the build only if the > second > > > run of the test class was also unsuccessful. I think it works pretty > > good, > > > we mostly have green builds. If there is interest, I can try to > > contribute > > > that. > > > > > > That does sound very intriguing. Does it rerun the test classes that > > failed > > > or some known, marked classes? If it is the former, I can see a lot of > > > value in having that automated in our PR builds. I wonder what others > > think > > > of this > > > > > > On Thu, Feb 28, 2019 at 6:04 PM Viktor Somogyi-Vass < > > > viktorsomo...@gmail.com> > > > wrote: > > > > > > > Hey All, > > > > > > > > Thanks for the loads of ideas. > > > > > > > > @Stanislav, @Sonke > > > > I probably left it out from my email but I really imagined this as a > > > > case-by-case basis change. If we think that it wouldn't cause > problems, > > > > then it might be applied. That way we'd limit the blast radius > > somewhat. > > > > The 1 hour gain is really just the most optimistic scenario, I'm > almost > > > > sure that not every test could be transformed to use a common > cluster. > > > > We internally have an improvement for a half a year now which reruns > > the > > > > flaky test classes at the end of the test gradle task, lets you know > > that > > > > they were rerun and probably flaky. It fails the build only if the > > second > > > > run of the test class was also unsuccessful. I think it works pretty > > > good, > > > > we mostly have green builds. If there is interest, I can try to > > > contribute > > > > that. > > > > > > > > >I am also extremely annoyed at times by the amount of coffee I have > to > > > > drink before tests finish > > > > Just please don't get a heart attack :) > > > > > > > > @Ron, @Colin > > > > You bring up a very good point that it is easier and frees up more > > > > resources if we just run change specific tests and it's good to know > > > that a > > > > similar solution (meaning using a shared resource for testing) have > > > failed > > > > elsewhere. I second Ron on the test categorization though, although > as > > a > > > > first attempt I think using a flaky retry + running only the > necessary > > > > tests would help in both time saving and effectiveness. 
Also it would > > be > > > > easier to achieve. > > > > > > > > @Ismael > > > > Yea, it'd be interesting to profile the startup/shutdown, I've never > > done > > > > that. Perhaps I'll set some time apart for that :). It's definitely > > true > > > > though that if we see a significant delay there we wouldn't just > > improve > > > > the efficiency of the tests but also customer experience. > > > > > > > > Best, > > > > Viktor > > > > > > > > > > > > > > > > On Thu, Feb 28, 2019 at 8:12 AM Ismael Juma > wrote: > > > > > > > > > It's an idea that has come up before and worth exploring > eventually. > > > > > However, I'd first try to optimize the server startup/shutdown > > process. > > > > If > > > > > we measure where the time is going, maybe some opportunities will > > > present > > > > > themselves. > > > > > > > > > > Ismael > > > > > > > > > > On Wed, Feb 27, 2019, 3:09 AM Viktor Somogyi-Vass < > > > > viktorsomo...@gmail.com > > > > > > > > > > > wrote: > > > > > > > > > > > Hi Folks, > > > > > > > > > > > > I've been observing lately that unit tests usually take 2.5 hours > > to > > > > run > > > > > > and a very big portion of these are the core tests where a new > > > cluster > > > > is > > > > > > spun up for every test. This takes most of the time. I ran a test > > > > > > (TopicCommandWithAdminClient with 38 test inside) through the > > > profiler > > > > > and > > > > > > it shows for instance that running the whole cl
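Viktor's PR works at the Gradle level, but Ron's "retry me up to N times" idea from this thread can also be sketched per test with a plain JUnit 4 rule. This is only an illustration of the concept, not what the linked PR does:

    import org.junit.Rule;
    import org.junit.Test;
    import org.junit.rules.TestRule;
    import org.junit.runner.Description;
    import org.junit.runners.model.Statement;

    public class FlakyRetryExample {

        // Re-runs a failing test up to maxAttempts times and only fails if every attempt fails.
        static class RetryRule implements TestRule {
            private final int maxAttempts;

            RetryRule(int maxAttempts) {
                this.maxAttempts = maxAttempts;
            }

            @Override
            public Statement apply(Statement base, Description description) {
                return new Statement() {
                    @Override
                    public void evaluate() throws Throwable {
                        Throwable lastFailure = null;
                        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                            try {
                                base.evaluate();
                                return;
                            } catch (Throwable t) {
                                lastFailure = t;
                                System.err.println(description + " failed, attempt " + attempt + "/" + maxAttempts);
                            }
                        }
                        throw lastFailure;
                    }
                };
            }
        }

        @Rule
        public RetryRule retry = new RetryRule(3);

        @Test
        public void sometimesFlakyTest() {
            // test body
        }
    }

A rule like this only masks flakiness for the annotated class, whereas the Gradle-level approach reruns whatever failed, so the two are complementary rather than alternatives.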
Re: [ANNOUNCE] Apache Kafka 2.2.0
Thanks Matthias for running this release! The links to the 2.2 javadocs are broken: https://kafka.apache.org/22/javadoc/index.html The same happened for 2.1.0 (https://lists.apache.org/thread.html/57f2940225bcce36c3a01ec524dc967e81cb20e159d9b1c851b712f4@%3Cdev.kafka.apache.org%3E) so we probably want to review this part of the release process. On Wed, Mar 27, 2019 at 6:28 AM Sanjeev Kumar wrote: > > Congratulations for the 2.2.0 release! > > Regards, Sanjeev > > On Wed, Mar 27, 2019 at 6:43 AM Guozhang Wang wrote: > > > Thanks Matthias for the release! > > > > On Tue, Mar 26, 2019 at 9:31 PM Dongjin Lee wrote: > > > > > Congratulations and thanks for your great work, Matthias!! > > > > > > Best, > > > Dongjin > > > > > > On Wed, Mar 27, 2019 at 4:49 AM Stephane Maarek < > > kafka.tutori...@gmail.com > > > > > > > wrote: > > > > > > > Congratulations on this amazing release! Lots of cool new features :) > > > > > > > > I've also released a YouTube video that will hopefully help the > > community > > > > get up to speed: https://www.youtube.com/watch?v=kaWbp1Cnfo4&t=5s > > > > > > > > Happy watching! > > > > > > > > On Tue, Mar 26, 2019 at 7:02 PM Matthias J. Sax > > > wrote: > > > > > > > > > The Apache Kafka community is pleased to announce the release for > > > Apache > > > > > Kafka 2.2.0 > > > > > > > > > > - Added SSL support for custom principal name > > > > > - Allow SASL connections to periodically re-authenticate > > > > > - Command line tool bin/kafka-topics.sh adds AdminClient support > > > > > - Improved consumer group management > > > > >- default group.id is `null` instead of empty string > > > > > - API improvement > > > > >- Producer: introduce close(Duration) > > > > >- AdminClient: introduce close(Duration) > > > > >- Kafka Streams: new flatTransform() operator in Streams DSL > > > > >- KafkaStreams (and other classed) now implement AutoClosable to > > > > > support try-with-resource > > > > >- New Serdes and default method implementations > > > > > - Kafka Streams exposed internal client.id via ThreadMetadata > > > > > - Metric improvements: All `-min`, `-avg` and `-max` metrics will > > now > > > > > output `NaN` as default value > > > > > > > > > > All of the changes in this release can be found in the release notes: > > > > > https://www.apache.org/dist/kafka/2.2.0/RELEASE_NOTES.html > > > > > > > > > > > > > > > You can download the source and binary release (Scala 2.11 and 2.12) > > > > > from: https://kafka.apache.org/downloads#2.2.0 > > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > > Apache Kafka is a distributed streaming platform with four core APIs: > > > > > > > > > > > > > > > ** The Producer API allows an application to publish a stream records > > > to > > > > > one or more Kafka topics. > > > > > > > > > > ** The Consumer API allows an application to subscribe to one or more > > > > > topics and process the stream of records produced to them. > > > > > > > > > > ** The Streams API allows an application to act as a stream > > processor, > > > > > consuming an input stream from one or more topics and producing an > > > > > output stream to one or more output topics, effectively transforming > > > the > > > > > input streams to output streams. > > > > > > > > > > ** The Connector API allows building and running reusable producers > > or > > > > > consumers that connect Kafka topics to existing applications or data > > > > > systems. For example, a connector to a relational database might > > > > > capture every change to a table. 
> > > > > > > > > > > > > > > With these APIs, Kafka can be used for two broad classes of > > > application: > > > > > > > > > > ** Building real-time streaming data pipelines that reliably get data > > > > > between systems or applications. > > > > > > > > > > ** Building real-time streaming applications that transform or react > > > > > to the streams of data. > > > > > > > > > > > > > > > Apache Kafka is in use at large and small companies worldwide, > > > including > > > > > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, > > > Rabobank, > > > > > Target, The New York Times, Uber, Yelp, and Zalando, among others. > > > > > > > > > > A big thank you for the following 98 contributors to this release! > > > > > > > > > > Alex Diachenko, Andras Katona, Andrew Schofield, Anna Povzner, Arjun > > > > > Satish, Attila Sasvari, Benedict Jin, Bert Roos, Bibin Sebastian, > > Bill > > > > > Bejeck, Bob Barrett, Boyang Chen, Bridger Howell, cadonna, Chia-Ping > > > > > Tsai, Chris Egerton, Colin Hicks, Colin P. Mccabe, Colin Patrick > > > McCabe, > > > > > cwildman, Cyrus Vafadari, David Arthur, Dhruvil Shah, Dong Lin, > > Edoardo > > > > > Comar, Flavien Raynaud, forficate, Gardner Vickers, Guozhang Wang, > > Gwen > > > > > (Chen) Shapira, hackerwin7, hejiefang, huxi, Ismael Juma, Ja
[jira] [Created] (KAFKA-8164) Improve test passing rate by rerunning flaky tests
Viktor Somogyi-Vass created KAFKA-8164: -- Summary: Improve test passing rate by rerunning flaky tests Key: KAFKA-8164 URL: https://issues.apache.org/jira/browse/KAFKA-8164 Project: Kafka Issue Type: Improvement Affects Versions: 2.3.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Failing flaky tests are often a problem: * pull requests need to be rerun to achieve a green build * rerunning the whole build for a PR is resource- and time-consuming * green builds give high confidence when releasing (and in general too), but flaky tests often erode this and are annoying for developers The aim of this JIRA is to provide an improvement which automatically retries any tests that failed on the first run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Failing test on PR
Hi Mateusz, Just write a comment that only says "retest this please" (without the quotation marks). Best, Viktor On Wed, Mar 27, 2019 at 1:02 PM Mateusz Zakarczemny wrote: > Hi, > I'm working on https://github.com/apache/kafka/pull/4807 PR. Last PR build > failed in some random place. My changes are related to console consumer and > the job failed in > > *kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup* > I'm assuming it's some flaky test. Is there any way to rerun Jenkins PR > build? Or I should proceed in a other way? > > Regards, > Mateusz Zakarczemny >
[jira] [Created] (KAFKA-8163) SetSchemaMetadata SMT does not apply to nested types
pierre bouvret created KAFKA-8163: - Summary: SetSchemaMetadata SMT does not apply to nested types Key: KAFKA-8163 URL: https://issues.apache.org/jira/browse/KAFKA-8163 Project: Kafka Issue Type: Bug Reporter: pierre bouvret In a schema, I want to replace the pg.public.foufou namespace with the pg.other_public.foufou namespace. The schema (Envelope from Debezium) has an inner definition for Value that also belongs to the pg.public.foufou namespace. When a SetSchemaMetadata SMT is applied, the inner namespace is not updated.
{quote}
{
  "type": "record",
  "name": "Envelope",
  "namespace": "pg.other_public.foufou",
  "fields": [
    {
      "name": "before",
      "type": [
        "null",
        {
          "type": "record",
          "name": "Value",
          "namespace": "pg.public.foufou",
          "fields": [
            { "name": "id", "type": "int" },
            { "name": "lib", "type": [ "null", "string" ], "default": null }
          ],
          "connect.name": "pg.public.foufou.Value"
        }
      ],
      "default": null
    },
{quote}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
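For context on why this is awkward to fix, SetSchemaMetadata only renames the top-level schema; handling the case reported here means rebuilding nested struct schemas as well. The sketch below shows the core recursion (struct fields only, ignoring arrays, maps and default values for brevity) and is an illustration, not a proposed patch:

    import org.apache.kafka.connect.data.Field;
    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.data.SchemaBuilder;

    public class NamespaceRewrite {

        // Replaces oldPrefix with newPrefix in the schema name of a struct and,
        // recursively, in the names of all nested struct fields.
        public static Schema renamePrefix(Schema schema, String oldPrefix, String newPrefix) {
            if (schema.type() != Schema.Type.STRUCT) {
                return schema;
            }
            String name = schema.name();
            String newName = (name != null && name.startsWith(oldPrefix))
                    ? newPrefix + name.substring(oldPrefix.length())
                    : name;
            SchemaBuilder builder = SchemaBuilder.struct().name(newName);
            if (schema.isOptional()) {
                builder.optional();
            }
            for (Field field : schema.fields()) {
                builder.field(field.name(), renamePrefix(field.schema(), oldPrefix, newPrefix));
            }
            return builder.build();
        }
    }

Note that the corresponding Struct values are bound to their schema, so they would also have to be rebuilt against the renamed schemas, which is part of why the existing SMT does not simply do this.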
[jira] [Created] (KAFKA-8162) IBM JDK Class not found error when handling SASL authentication exception
Arkadiusz Firus created KAFKA-8162: -- Summary: IBM JDK Class not found error when handling SASL authentication exception Key: KAFKA-8162 URL: https://issues.apache.org/jira/browse/KAFKA-8162 Project: Kafka Issue Type: Bug Components: clients, core Affects Versions: 2.1.1, 2.2.0, 2.1.0 Environment: Any with IBM JDK 8 SR5 FP10 Reporter: Arkadiusz Firus When there is a problem with SASL authentication, the enum KerberosError is used to retrieve the error code. When the IBM JDK is used, it tries to load the class com.ibm.security.krb5.internal.KrbException, which is not present in all IBM JDK versions. This leads to a NoClassDefFoundError which is not handled. I tested it on:
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 8.0.5.10 - pxa6480sr5fp10-20180214_01(SR5 FP10))
IBM J9 VM (build 2.9, JRE 1.8.0 Linux amd64-64 Compressed References 20180208_378436 (JIT enabled, AOT enabled)
In this version of the JDK, the class KrbException is in the package com.ibm.security.krb5 (without "internal"), so the fully qualified class name is: com.ibm.security.krb5.KrbException
Full stack trace from the logs:
[2019-03-27 06:50:00,113] ERROR Processor got uncaught exception. (kafka.network.Processor)
java.lang.NoClassDefFoundError: org.apache.kafka.common.security.kerberos.KerberosError (initialization failure)
    at java.lang.J9VMInternals.initializationAlreadyFailed(J9VMInternals.java:96)
    at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleSaslToken(SaslServerAuthenticator.java:384)
    at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:256)
    at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
    at kafka.network.Processor.poll(SocketServer.scala:689)
    at kafka.network.Processor.run(SocketServer.scala:594)
    at java.lang.Thread.run(Thread.java:811)
Caused by: org.apache.kafka.common.KafkaException: Kerberos exceptions could not be initialized
    at org.apache.kafka.common.security.kerberos.KerberosError.(KerberosError.java:59)
    ... 8 more
Caused by: java.lang.ClassNotFoundException: com.ibm.security.krb5.internal.KrbException
    at java.lang.Class.forNameImpl(Native Method)
    at java.lang.Class.forName(Class.java:297)
    at org.apache.kafka.common.security.kerberos.KerberosError.(KerberosError.java:53)
    ... 8 more
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
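One possible direction for a fix is to probe the known KrbException locations in order instead of assuming a single class name. The sketch below only illustrates that idea; it is not the actual KerberosError code, and the candidate list is illustrative:

    // Illustrative sketch only; not the actual KerberosError implementation.
    public class KrbExceptionLookup {

        private static final String[] CANDIDATE_CLASSES = {
            "sun.security.krb5.KrbException",               // Oracle/OpenJDK
            "com.ibm.security.krb5.internal.KrbException",  // some IBM JDK builds
            "com.ibm.security.krb5.KrbException"            // other IBM JDK builds, as reported above
        };

        // Returns the first KrbException class found on this JVM, or null if none is present,
        // so callers can degrade gracefully instead of failing static initialization.
        static Class<?> findKrbExceptionClass() {
            for (String className : CANDIDATE_CLASSES) {
                try {
                    return Class.forName(className);
                } catch (ClassNotFoundException | NoClassDefFoundError e) {
                    // try the next candidate
                }
            }
            return null;
        }
    }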
Failing test on PR
Hi, I'm working on https://github.com/apache/kafka/pull/4807 PR. The last PR build failed in some random place. My changes are related to the console consumer, and the job failed in *kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup* I'm assuming it's some flaky test. Is there any way to rerun the Jenkins PR build? Or should I proceed in another way? Regards, Mateusz Zakarczemny
[jira] [Resolved] (KAFKA-8160) To add ACL with SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sönke Liebau resolved KAFKA-8160. - Resolution: Information Provided Hi [~suseem...@gmail.com], you can absolutely use SSL-based authentication with ACLs; please refer to the docs [here|https://kafka.apache.org/documentation/#security_ssl] and [here|https://kafka.apache.org/documentation/#security_authz] for more information. For your specific question, you will have to use a custom PrincipalBuilder to ensure that the principals extracted from certificates conform to what you set as the username for your SCRAM users. As this is more of a support request than a new feature, I'll close this ticket. If you have any further questions, please don't hesitate to reach out on the users mailing list! > To add ACL with SSL authentication > -- > > Key: KAFKA-8160 > URL: https://issues.apache.org/jira/browse/KAFKA-8160 > Project: Kafka > Issue Type: New Feature > Components: consumer, producer >Affects Versions: 1.1.0 >Reporter: suseendramani >Priority: Major > > We want to setup the SSL based authentication along with ACL in place. Is > that doable and can it be added as a feature ? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
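For reference, the kind of custom PrincipalBuilder mentioned above might look roughly like the sketch below, which maps the CN of the client certificate's subject to the principal name used in ACLs. It is a minimal illustration without handling for every certificate shape:

    import javax.naming.ldap.LdapName;
    import javax.naming.ldap.Rdn;
    import org.apache.kafka.common.security.auth.AuthenticationContext;
    import org.apache.kafka.common.security.auth.KafkaPrincipal;
    import org.apache.kafka.common.security.auth.KafkaPrincipalBuilder;
    import org.apache.kafka.common.security.auth.SslAuthenticationContext;

    // Sketch: maps an SSL client certificate DN such as "CN=alice,OU=eng,O=example"
    // to the principal User:alice so it lines up with usernames used elsewhere.
    public class CnPrincipalBuilder implements KafkaPrincipalBuilder {

        @Override
        public KafkaPrincipal build(AuthenticationContext context) {
            if (context instanceof SslAuthenticationContext) {
                try {
                    String dn = ((SslAuthenticationContext) context).session().getPeerPrincipal().getName();
                    for (Rdn rdn : new LdapName(dn).getRdns()) {
                        if ("CN".equalsIgnoreCase(rdn.getType())) {
                            return new KafkaPrincipal(KafkaPrincipal.USER_TYPE, rdn.getValue().toString());
                        }
                    }
                } catch (Exception e) {
                    // fall through to anonymous if the peer is unverified or the DN cannot be parsed
                }
            }
            return KafkaPrincipal.ANONYMOUS;
        }
    }

The broker would then be pointed at such a class via the principal.builder.class setting so that ACLs can be written against short names like User:alice.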
[jira] [Created] (KAFKA-8161) config conflict when run script bin/kafka-configs.sh --entity-type topics --entity-name flink-mirror-test1 --zookeeper 10.39.0.54:2181/kafka102_test --add-config 'foll
Haiping created KAFKA-8161: -- Summary: config conflict when run script bin/kafka-configs.sh --entity-type topics --entity-name flink-mirror-test1 --zookeeper 10.39.0.54:2181/kafka102_test --add-config 'follower.replication.throttled.replicas=0,1:1,2' --alter Key: KAFKA-8161 URL: https://issues.apache.org/jira/browse/KAFKA-8161 Project: Kafka Issue Type: Bug Components: admin Affects Versions: 0.10.2.1 Reporter: Haiping -- This message was sent by Atlassian JIRA (v7.6.3#76005)