Thanks Matthias. I have updated the KIP according to this and take KAFKA-6058. We will use `KafkaConsumer` instead of reusing `ConsumerGroupCommand` and keep `StreamsResetter` in `core` until `KAFKA-6058 is fixed.
Just as a reminder, [VOTE] thread is already open if there are no more feedback on this KIP :) El vie., 13 oct. 2017 a las 0:28, Matthias J. Sax (<matth...@confluent.io>) escribió: > Jorge, > > thanks for the update. > > I would suggest to not reuse `ConsumerGroupCommand` and re implement > what we need in `StreamsResetter` directly. > > Even if we need to keep `StreamsResetter` in `core` for now, I think we > should not introduce new dependencies. > > Currently, we still use old `kafka.admin.AdmitClient` in > `StreamsResetter`. We need new `KafkaAdminClient` to support "describe > consumer group" to get rid of this part. Than we can move > `StreamsResetter` to `streams` package. > > Cf. https://issues.apache.org/jira/browse/KAFKA-6058 > https://issues.apache.org/jira/browse/KAFKA-5965 > > Feel free to pick up KAFKA-6058 and KAFKA-5965. > > > -Matthias > > > > On 10/9/17 12:54 AM, Jorge Esteban Quilcate Otoya wrote: > > Matthias, > > > > Thanks for the heads up! > > > > I think the main dependency is from `StreamResseter` to > > `ConsumerGroupCommand` class to actually reuse `#reset-offsets` > > functionality. > > > > Not sure what would be the better way to remove it. To expose commands > > (e.g. `ConsumerGroupCommand`) as part of AdminClient, they have to be > > re-implemented on the `client` module right? Is this an option? If not I > > think we should keep `StreamResseter` as part of `core` module until we > > have `ConsumerGroupCommand` on `client` module as well. > > > > El vie., 6 oct. 2017 a las 0:05, Matthias J. Sax (<matth...@confluent.io > >) > > escribió: > > > >> Jorge, > >> > >> KIP-198 (that got merged already) overlaps with this KIP. Can you please > >> update your KIP accordingly? > >> > >> Also, while working on KIP-198, we identified some shortcomings in > >> AdminClient that do not allow us to move StreamsResetter our of core > >> package. We want to address those shortcoming in another KIP to add > >> missing functionality to the new AdminClient. > >> > >> Having say this, and remembering a discussion about dependencies that > >> might be introduced by this KIP, it might be good to understand those > >> dependencies in detail. Maybe we can resolve those dependencies somehow > >> and thus, be able to more StreamsResetter out of core package. Could you > >> summarize those dependencies in the KIP or just as a reply? > >> > >> Thanks! > >> > >> > >> -Matthias > >> > >> On 9/11/17 3:02 PM, Jorge Esteban Quilcate Otoya wrote: > >>> Thanks Guozhang! > >>> > >>> I have updated the KIP to: > >>> > >>> 1. Only one scenario param is allowed. If none, `to-earliest` will be > >> used, > >>> behaving as the current version. > >>> > >>> 2. > >>> 1. An exception will be printed mentioning that there is no existing > >>> offsets registered. > >>> 2. inputTopics format could support define partition numbers as in > >>> reset-offsets option for kafka-consumer-groups. > >>> > >>> 3. That should be handled by KIP-198. > >>> > >>> I will start the VOTE thread in a following email. > >>> > >>> > >>> El mié., 30 ago. 2017 a las 2:01, Guozhang Wang (<wangg...@gmail.com>) > >>> escribió: > >>> > >>>> Hi Jorge, > >>>> > >>>> Thanks for the KIP. It would be a great to add feature to the reset > >> tools. > >>>> I made a pass over it and it looks good to me overall. I have a few > >>>> comments: > >>>> > >>>> 1. For all the scenarios, do we allow users to specify more than one > >>>> parameters? If not could you make that clear in the wiki, e.g. we > would > >>>> return with an error message saying that only one is allowed; if yes > >> then > >>>> what precedence order we are following? > >>>> > >>>> 2. Personally I feel that "--by-duration", "--to-offset" and > >> "--shift-by" > >>>> are a tad overkill, because 1) they assume there exist some committed > >>>> offset for each of the topic, but that may not be always true, 2) > >> offset / > >>>> time shifting amount on different topics may not be a good fit > >> universally, > >>>> i.e. one could imagine the we want to reset all input topics to their > >>>> offsets of a given time, but resetting all topics' offset to the same > >> value > >>>> or let all of them shifting the same amount of offsets are usually not > >>>> applicable. For "--by-duration" it seems could be easily supported by > >> the > >>>> "to-date". > >>>> > >>>> For the general consumer group reset tool, since it could be set one > per > >>>> partition these parameters may be more useful. > >>>> > >>>> 3. As for the implementation details, when removing zookeeper config > in > >>>> `kafka-streams-application-reset`, we should consider return a meaning > >>>> error message otherwise it would be "unrecognized config" blah. > >>>> > >>>> > >>>> If you feel confident about the wiki after discussing about these > >> points, > >>>> please feel free to move on to start a voting thread. Note that we are > >>>> about 3 weeks away from KIP deadline and 4 weeks away from feature > >>>> deadline. > >>>> > >>>> > >>>> Guozhang > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> On Tue, Aug 22, 2017 at 1:45 PM, Matthias J. Sax < > matth...@confluent.io > >>> > >>>> wrote: > >>>> > >>>>> Thanks for the update Jorge. > >>>>> > >>>>> I don't have any further comments. > >>>>> > >>>>> > >>>>> -Matthias > >>>>> > >>>>> On 8/12/17 6:43 PM, Jorge Esteban Quilcate Otoya wrote: > >>>>>> I have updated the KIP: > >>>>>> > >>>>>> - Change execution parameters, using `--dry-run` > >>>>>> - Reference KAFKA-4327 > >>>>>> - And advise about changes on `StreamResetter` > >>>>>> > >>>>>> Also includes that it will cover a change on `ConsumerGroupCommand` > to > >>>>>> align execution options. > >>>>>> > >>>>>> El dom., 16 jul. 2017 a las 5:37, Matthias J. Sax (< > >>>>> matth...@confluent.io>) > >>>>>> escribió: > >>>>>> > >>>>>>> Thanks a lot for the update! > >>>>>>> > >>>>>>> I like the KIP! > >>>>>>> > >>>>>>> One more question about `--dry-run` vs `--execute`: While I agree > >> that > >>>>>>> we should use the same flag for both tools, I am not sure which one > >> is > >>>>>>> the better one... My personal take is, that I like `--dry-run` > >> better. > >>>>>>> Not sure what others think. > >>>>>>> > >>>>>>> One more comment: with the removal of ZK, we can also tackle this > >>>> JIRA: > >>>>>>> https://issues.apache.org/jira/browse/KAFKA-4327 If we do so, I > >> think > >>>>> we > >>>>>>> should mention it in the KIP. > >>>>>>> > >>>>>>> I am also not sure about backward compatibility issue for this > case. > >>>>>>> Actually, I don't expect people to call `StreamsResetter` from Java > >>>>>>> code, but you can never know. So if we break this, we need to make > >>>> sure > >>>>>>> to cover it in the KIP and later on in the release notes. > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> On 7/14/17 7:15 AM, Jorge Esteban Quilcate Otoya wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> KIP is updated. > >>>>>>>> Changes: > >>>>>>>> 1. Approach discussed to keep both tools (streams application > >>>> resetter > >>>>>>> and > >>>>>>>> consumer group reset offset). > >>>>>>>> 2. Options has been aligned between both tools. > >>>>>>>> 3. Zookeeper option from streams-application-resetted will be > >>>> removed, > >>>>>>> and > >>>>>>>> replaced internally for Kafka AdminClient. > >>>>>>>> > >>>>>>>> Looking forward to your feedback. > >>>>>>>> > >>>>>>>> El jue., 29 jun. 2017 a las 15:04, Matthias J. Sax (< > >>>>>>> matth...@confluent.io>) > >>>>>>>> escribió: > >>>>>>>> > >>>>>>>>> Damian, > >>>>>>>>> > >>>>>>>>>> resets everything and clears up > >>>>>>>>>>> the state stores. > >>>>>>>>> > >>>>>>>>> That's not correct. The reset tool does not touch local store. > For > >>>>> this, > >>>>>>>>> we have `KafkaStreams#clenup` -- otherwise, you would need to run > >>>> the > >>>>>>>>> tool in every machine you run an application instance. > >>>>>>>>> > >>>>>>>>> With regard to state, the tool only deletes the underlying > >> changelog > >>>>>>>>> topics. > >>>>>>>>> > >>>>>>>>> Just to clarify. The idea of allowing to rest with different > offset > >>>> is > >>>>>>>>> to clear all state but just use a different start offset (instead > >> of > >>>>>>>>> zero). This is for use case where your topic has a larger > retention > >>>>> time > >>>>>>>>> than the amount of data you want to reprocess. But we always need > >> to > >>>>>>>>> start with an empty state. (Resetting with consistent state is > >>>>> something > >>>>>>>>> we might do at some point, but it's much hard and not part of > this > >>>>> KIP) > >>>>>>>>> > >>>>>>>>>> @matthias, could we remove the ZK dependency from the streams > >> reset > >>>>>>> tool > >>>>>>>>>> now? > >>>>>>>>> > >>>>>>>>> I think so. The new AdminClient provide the feature we need > AFAIK. > >> I > >>>>>>>>> guess we can picky back this into the KIP (we would need a KIP > >>>> anyway > >>>>>>>>> because we deprecate `--zookeeper` what is an public API change). > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I just want to point out, that even without ZK dependency, I > prefer > >>>> to > >>>>>>>>> wrap the consumer offset tool and keep two tools. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Matthias > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 6/29/17 9:14 AM, Damian Guy wrote: > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Thanks for the KIP. What is not clear is how is this going to > >>>> handle > >>>>>>>>> state > >>>>>>>>>> stores? Right now the streams reset tool, resets everything and > >>>>> clears > >>>>>>> up > >>>>>>>>>> the state stores. What are we going to do if we reset to a > >>>> particular > >>>>>>>>>> offset? If we clear the state then we've lost any previously > >>>>> aggregated > >>>>>>>>>> values (which may or may not be what is expected). If we don't > >>>> clear > >>>>>>>>> them, > >>>>>>>>>> then we will end up with incorrect aggregates. > >>>>>>>>>> > >>>>>>>>>> @matthias, could we remove the ZK dependency from the streams > >> reset > >>>>>>> tool > >>>>>>>>>> now? > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> Damian > >>>>>>>>>> > >>>>>>>>>> On Thu, 29 Jun 2017 at 15:22 Jorge Esteban Quilcate Otoya < > >>>>>>>>>> quilcate.jo...@gmail.com> wrote: > >>>>>>>>>> > >>>>>>>>>>> You're right, I was not considering Zookeeper dependency. > >>>>>>>>>>> > >>>>>>>>>>> I'm starting to like the idea to wrap `reset-offset` from > >>>>>>>>>>> `streams-application-reset`. > >>>>>>>>>>> > >>>>>>>>>>> I will wait a bit for more feedback and work on a draft with > this > >>>>>>>>> changes. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> El mié., 28 jun. 2017 a las 0:20, Matthias J. Sax (< > >>>>>>>>> matth...@confluent.io > >>>>>>>>>>>> ) > >>>>>>>>>>> escribió: > >>>>>>>>>>> > >>>>>>>>>>>> I agree, that we should not duplicate functionality. > >>>>>>>>>>>> > >>>>>>>>>>>> However, I am worried, that a non-streams users using the > offset > >>>>>>> reset > >>>>>>>>>>>> tool might delete topics unintentionally (even if the changes > >> are > >>>>>>>>> pretty > >>>>>>>>>>>> small). Also, currently the stream reset tool required > Zookeeper > >>>>>>> while > >>>>>>>>>>>> the offset reset tool does not -- I don't think we should add > >>>> this > >>>>>>>>>>>> dependency to the offset reset tool. > >>>>>>>>>>>> > >>>>>>>>>>>> Thus, it think it might be better to keep both tools, but > >>>>> internally > >>>>>>>>>>>> rewrite the streams reset entry class, to reuse as much as > >>>> possible > >>>>>>>>> from > >>>>>>>>>>>> the offset reset tool. Ie. the streams tool would be a wrapper > >>>>> around > >>>>>>>>>>>> the offset tool and add some functionality it needs that is > >>>> Streams > >>>>>>>>>>>> specific. > >>>>>>>>>>>> > >>>>>>>>>>>> I also think, that keeping separate tools for consumers and > >>>> streams > >>>>>>> is > >>>>>>>>> a > >>>>>>>>>>>> good thing. We might want to add new features that don't apply > >> to > >>>>>>> plain > >>>>>>>>>>>> consumers -- note, a Streams applications is more than just a > >>>>> client. > >>>>>>>>>>>> > >>>>>>>>>>>> WDYT? > >>>>>>>>>>>> > >>>>>>>>>>>> Would be good to get some feedback from others, too. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> -Matthias > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On 6/27/17 9:05 AM, Jorge Esteban Quilcate Otoya wrote: > >>>>>>>>>>>>> Thanks for the feedback Matthias! > >>>>>>>>>>>>> > >>>>>>>>>>>>> The main idea is to use only 1 tool to reset offsets and > don't > >>>>>>>>>>> replicate > >>>>>>>>>>>>> functionality between tools. > >>>>>>>>>>>>> Reset Offset (KIP-122) tool not only reset but support > execute > >>>> the > >>>>>>>>>>> reset > >>>>>>>>>>>>> but also export, import from files, etc. > >>>>>>>>>>>>> If we extend the current tool (kafka-streams-application- > >>>>> reset.sh) > >>>>>>> we > >>>>>>>>>>>> will > >>>>>>>>>>>>> have to duplicate all this functionality also. > >>>>>>>>>>>>> Maybe another option is to move the current implementation > into > >>>>>>>>>>>>> `kafka-consumer-group` and add a new command > >>>>>>> `--reset-offset-streams` > >>>>>>>>>>>> with > >>>>>>>>>>>>> the current implementation functionality and add > >>>> `--reset-offset` > >>>>>>>>>>> options > >>>>>>>>>>>>> for input topics. Does this make sense? > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> El lun., 26 jun. 2017 a las 23:32, Matthias J. Sax (< > >>>>>>>>>>>> matth...@confluent.io>) > >>>>>>>>>>>>> escribió: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Jorge, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> thanks a lot for this KIP. Allowing the reset streams > >>>>> applications > >>>>>>>>>>> with > >>>>>>>>>>>>>> arbitrary start offset is something we got multiple requests > >>>>>>> already. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Couple of clarification question: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - why do you want to deprecate the current tool instead of > >>>>>>> extending > >>>>>>>>>>>>>> the current tool with the stuff the offset reset tool can do > >>>> (ie, > >>>>>>> use > >>>>>>>>>>>>>> the offset reset tool internally) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> - you suggest to extend the offset reset tool to replace > the > >>>>>>> stream > >>>>>>>>>>>>>> reset tool: how would the reset tool know if it is > resetting a > >>>>>>>>> streams > >>>>>>>>>>>>>> applications or a regular consumer group? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 6/26/17 1:28 PM, Jorge Esteban Quilcate Otoya wrote: > >>>>>>>>>>>>>>> Hi all, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I'd like to start the discussion to add reset offset > tooling > >>>> for > >>>>>>>>>>> Stream > >>>>>>>>>>>>>>> applications. > >>>>>>>>>>>>>>> The KIP can be found here: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>> 171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>> Jorge. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>>> > >>>> -- > >>>> -- Guozhang > >>>> > >>> > >> > >> > > > >