Jorge,

thanks for the update.

I would suggest to not reuse `ConsumerGroupCommand` and re implement
what we need in `StreamsResetter` directly.

Even if we need to keep `StreamsResetter` in `core` for now, I think we
should not introduce new dependencies.

Currently, we still use old `kafka.admin.AdmitClient` in
`StreamsResetter`. We need new `KafkaAdminClient` to support "describe
consumer group" to get rid of this part. Than we can move
`StreamsResetter` to `streams` package.

Cf. https://issues.apache.org/jira/browse/KAFKA-6058
https://issues.apache.org/jira/browse/KAFKA-5965

Feel free to pick up KAFKA-6058 and KAFKA-5965.


-Matthias



On 10/9/17 12:54 AM, Jorge Esteban Quilcate Otoya wrote:
> Matthias,
> 
> Thanks for the heads up!
> 
> I think the main dependency is from `StreamResseter` to
> `ConsumerGroupCommand` class to actually reuse `#reset-offsets`
> functionality.
> 
> Not sure what would be the better way to remove it. To expose commands
> (e.g. `ConsumerGroupCommand`) as part of AdminClient, they have to be
> re-implemented on the `client` module right? Is this an option? If not I
> think we should keep `StreamResseter` as part of `core` module until we
> have `ConsumerGroupCommand` on `client` module as well.
> 
> El vie., 6 oct. 2017 a las 0:05, Matthias J. Sax (<matth...@confluent.io>)
> escribió:
> 
>> Jorge,
>>
>> KIP-198 (that got merged already) overlaps with this KIP. Can you please
>> update your KIP accordingly?
>>
>> Also, while working on KIP-198, we identified some shortcomings in
>> AdminClient that do not allow us to move StreamsResetter our of core
>> package. We want to address those shortcoming in another KIP to add
>> missing functionality to the new AdminClient.
>>
>> Having say this, and remembering a discussion about dependencies that
>> might be introduced by this KIP, it might be good to understand those
>> dependencies in detail. Maybe we can resolve those dependencies somehow
>> and thus, be able to more StreamsResetter out of core package. Could you
>> summarize those dependencies in the KIP or just as a reply?
>>
>> Thanks!
>>
>>
>> -Matthias
>>
>> On 9/11/17 3:02 PM, Jorge Esteban Quilcate Otoya wrote:
>>> Thanks Guozhang!
>>>
>>> I have updated the KIP to:
>>>
>>> 1. Only one scenario param is allowed. If none, `to-earliest` will be
>> used,
>>> behaving as the current version.
>>>
>>> 2.
>>>   1. An exception will be printed mentioning that there is no existing
>>> offsets registered.
>>>   2. inputTopics format could support define partition numbers as in
>>> reset-offsets option for kafka-consumer-groups.
>>>
>>> 3. That should be handled by KIP-198.
>>>
>>> I will start the VOTE thread in a following email.
>>>
>>>
>>> El mié., 30 ago. 2017 a las 2:01, Guozhang Wang (<wangg...@gmail.com>)
>>> escribió:
>>>
>>>> Hi Jorge,
>>>>
>>>> Thanks for the KIP. It would be a great to add feature to the reset
>> tools.
>>>> I made a pass over it and it looks good to me overall. I have a few
>>>> comments:
>>>>
>>>> 1. For all the scenarios, do we allow users to specify more than one
>>>> parameters? If not could you make that clear in the wiki, e.g. we would
>>>> return with an error message saying that only one is allowed; if yes
>> then
>>>> what precedence order we are following?
>>>>
>>>> 2. Personally I feel that "--by-duration", "--to-offset" and
>> "--shift-by"
>>>> are a tad overkill, because 1) they assume there exist some committed
>>>> offset for each of the topic, but that may not be always true, 2)
>> offset /
>>>> time shifting amount on different topics may not be a good fit
>> universally,
>>>> i.e. one could imagine the we want to reset all input topics to their
>>>> offsets of a given time, but resetting all topics' offset to the same
>> value
>>>> or let all of them shifting the same amount of offsets are usually not
>>>> applicable. For "--by-duration" it seems could be easily supported by
>> the
>>>> "to-date".
>>>>
>>>> For the general consumer group reset tool, since it could be set one per
>>>> partition these parameters may be more useful.
>>>>
>>>> 3. As for the implementation details, when removing zookeeper config in
>>>> `kafka-streams-application-reset`, we should consider return a meaning
>>>> error message otherwise it would be "unrecognized config" blah.
>>>>
>>>>
>>>> If you feel confident about the wiki after discussing about these
>> points,
>>>> please feel free to move on to start a voting thread. Note that we are
>>>> about 3 weeks away from KIP deadline and 4 weeks away from feature
>>>> deadline.
>>>>
>>>>
>>>> Guozhang
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 22, 2017 at 1:45 PM, Matthias J. Sax <matth...@confluent.io
>>>
>>>> wrote:
>>>>
>>>>> Thanks for the update Jorge.
>>>>>
>>>>> I don't have any further comments.
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 8/12/17 6:43 PM, Jorge Esteban Quilcate Otoya wrote:
>>>>>> I have updated the KIP:
>>>>>>
>>>>>> - Change execution parameters, using `--dry-run`
>>>>>> - Reference KAFKA-4327
>>>>>> - And advise about changes on `StreamResetter`
>>>>>>
>>>>>> Also includes that it will cover a change on `ConsumerGroupCommand` to
>>>>>> align execution options.
>>>>>>
>>>>>> El dom., 16 jul. 2017 a las 5:37, Matthias J. Sax (<
>>>>> matth...@confluent.io>)
>>>>>> escribió:
>>>>>>
>>>>>>> Thanks a lot for the update!
>>>>>>>
>>>>>>> I like the KIP!
>>>>>>>
>>>>>>> One more question about `--dry-run` vs `--execute`: While I agree
>> that
>>>>>>> we should use the same flag for both tools, I am not sure which one
>> is
>>>>>>> the better one... My personal take is, that I like `--dry-run`
>> better.
>>>>>>> Not sure what others think.
>>>>>>>
>>>>>>> One more comment: with the removal of ZK, we can also tackle this
>>>> JIRA:
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-4327 If we do so, I
>> think
>>>>> we
>>>>>>> should mention it in the KIP.
>>>>>>>
>>>>>>> I am also not sure about backward compatibility issue for this case.
>>>>>>> Actually, I don't expect people to call `StreamsResetter` from Java
>>>>>>> code, but you can never know. So if we break this, we need to make
>>>> sure
>>>>>>> to cover it in the KIP and later on in the release notes.
>>>>>>>
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 7/14/17 7:15 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> KIP is updated.
>>>>>>>> Changes:
>>>>>>>> 1. Approach discussed to keep both tools (streams application
>>>> resetter
>>>>>>> and
>>>>>>>> consumer group reset offset).
>>>>>>>> 2. Options has been aligned between both tools.
>>>>>>>> 3. Zookeeper option from streams-application-resetted will be
>>>> removed,
>>>>>>> and
>>>>>>>> replaced internally for Kafka AdminClient.
>>>>>>>>
>>>>>>>> Looking forward to your feedback.
>>>>>>>>
>>>>>>>> El jue., 29 jun. 2017 a las 15:04, Matthias J. Sax (<
>>>>>>> matth...@confluent.io>)
>>>>>>>> escribió:
>>>>>>>>
>>>>>>>>> Damian,
>>>>>>>>>
>>>>>>>>>> resets everything and clears up
>>>>>>>>>>> the state stores.
>>>>>>>>>
>>>>>>>>> That's not correct. The reset tool does not touch local store. For
>>>>> this,
>>>>>>>>> we have `KafkaStreams#clenup` -- otherwise, you would need to run
>>>> the
>>>>>>>>> tool in every machine you run an application instance.
>>>>>>>>>
>>>>>>>>> With regard to state, the tool only deletes the underlying
>> changelog
>>>>>>>>> topics.
>>>>>>>>>
>>>>>>>>> Just to clarify. The idea of allowing to rest with different offset
>>>> is
>>>>>>>>> to clear all state but just use a different start offset (instead
>> of
>>>>>>>>> zero). This is for use case where your topic has a larger retention
>>>>> time
>>>>>>>>> than the amount of data you want to reprocess. But we always need
>> to
>>>>>>>>> start with an empty state. (Resetting with consistent state is
>>>>> something
>>>>>>>>> we might do at some point, but it's much hard and not part of this
>>>>> KIP)
>>>>>>>>>
>>>>>>>>>> @matthias, could we remove the ZK dependency from the streams
>> reset
>>>>>>> tool
>>>>>>>>>> now?
>>>>>>>>>
>>>>>>>>> I think so. The new AdminClient provide the feature we need AFAIK.
>> I
>>>>>>>>> guess we can picky back this into the KIP (we would need a KIP
>>>> anyway
>>>>>>>>> because we deprecate `--zookeeper` what is an public API change).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I just want to point out, that even without ZK dependency, I prefer
>>>> to
>>>>>>>>> wrap the consumer offset tool and keep two tools.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -Matthias
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/29/17 9:14 AM, Damian Guy wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Thanks for the KIP. What is not clear is how is this going to
>>>> handle
>>>>>>>>> state
>>>>>>>>>> stores? Right now the streams reset tool, resets everything and
>>>>> clears
>>>>>>> up
>>>>>>>>>> the state stores. What are we going to do if we reset to a
>>>> particular
>>>>>>>>>> offset? If we clear the state then we've lost any previously
>>>>> aggregated
>>>>>>>>>> values (which may or may not be what is expected). If we don't
>>>> clear
>>>>>>>>> them,
>>>>>>>>>> then we will end up with incorrect aggregates.
>>>>>>>>>>
>>>>>>>>>> @matthias, could we remove the ZK dependency from the streams
>> reset
>>>>>>> tool
>>>>>>>>>> now?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Damian
>>>>>>>>>>
>>>>>>>>>> On Thu, 29 Jun 2017 at 15:22 Jorge Esteban Quilcate Otoya <
>>>>>>>>>> quilcate.jo...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> You're right, I was not considering Zookeeper dependency.
>>>>>>>>>>>
>>>>>>>>>>> I'm starting to like the idea to wrap `reset-offset` from
>>>>>>>>>>> `streams-application-reset`.
>>>>>>>>>>>
>>>>>>>>>>> I will wait a bit for more feedback and work on a draft with this
>>>>>>>>> changes.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> El mié., 28 jun. 2017 a las 0:20, Matthias J. Sax (<
>>>>>>>>> matth...@confluent.io
>>>>>>>>>>>> )
>>>>>>>>>>> escribió:
>>>>>>>>>>>
>>>>>>>>>>>> I agree, that we should not duplicate functionality.
>>>>>>>>>>>>
>>>>>>>>>>>> However, I am worried, that a non-streams users using the offset
>>>>>>> reset
>>>>>>>>>>>> tool might delete topics unintentionally (even if the changes
>> are
>>>>>>>>> pretty
>>>>>>>>>>>> small). Also, currently the stream reset tool required Zookeeper
>>>>>>> while
>>>>>>>>>>>> the offset reset tool does not -- I don't think we should add
>>>> this
>>>>>>>>>>>> dependency to the offset reset tool.
>>>>>>>>>>>>
>>>>>>>>>>>> Thus, it think it might be better to keep both tools, but
>>>>> internally
>>>>>>>>>>>> rewrite the streams reset entry class, to reuse as much as
>>>> possible
>>>>>>>>> from
>>>>>>>>>>>> the offset reset tool. Ie. the streams tool would be a wrapper
>>>>> around
>>>>>>>>>>>> the offset tool and add some functionality it needs that is
>>>> Streams
>>>>>>>>>>>> specific.
>>>>>>>>>>>>
>>>>>>>>>>>> I also think, that keeping separate tools for consumers and
>>>> streams
>>>>>>> is
>>>>>>>>> a
>>>>>>>>>>>> good thing. We might want to add new features that don't apply
>> to
>>>>>>> plain
>>>>>>>>>>>> consumers -- note, a Streams applications is more than just a
>>>>> client.
>>>>>>>>>>>>
>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>
>>>>>>>>>>>> Would be good to get some feedback from others, too.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/27/17 9:05 AM, Jorge Esteban Quilcate Otoya wrote:
>>>>>>>>>>>>> Thanks for the feedback Matthias!
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main idea is to use only 1 tool to reset offsets and don't
>>>>>>>>>>> replicate
>>>>>>>>>>>>> functionality between tools.
>>>>>>>>>>>>> Reset Offset (KIP-122) tool not only reset but support execute
>>>> the
>>>>>>>>>>> reset
>>>>>>>>>>>>> but also export, import from files, etc.
>>>>>>>>>>>>> If we extend the current tool (kafka-streams-application-
>>>>> reset.sh)
>>>>>>> we
>>>>>>>>>>>> will
>>>>>>>>>>>>> have to duplicate all this functionality also.
>>>>>>>>>>>>> Maybe another option is to move the current implementation into
>>>>>>>>>>>>> `kafka-consumer-group` and add a new command
>>>>>>> `--reset-offset-streams`
>>>>>>>>>>>> with
>>>>>>>>>>>>> the current implementation functionality and add
>>>> `--reset-offset`
>>>>>>>>>>> options
>>>>>>>>>>>>> for input topics. Does this make sense?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> El lun., 26 jun. 2017 a las 23:32, Matthias J. Sax (<
>>>>>>>>>>>> matth...@confluent.io>)
>>>>>>>>>>>>> escribió:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jorge,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks a lot for this KIP. Allowing the reset streams
>>>>> applications
>>>>>>>>>>> with
>>>>>>>>>>>>>> arbitrary start offset is something we got multiple requests
>>>>>>> already.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Couple of clarification question:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  - why do you want to deprecate the current tool instead of
>>>>>>> extending
>>>>>>>>>>>>>> the current tool with the stuff the offset reset tool can do
>>>> (ie,
>>>>>>> use
>>>>>>>>>>>>>> the offset reset tool internally)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  - you suggest to extend the offset reset tool to replace the
>>>>>>> stream
>>>>>>>>>>>>>> reset tool: how would the reset tool know if it is resetting a
>>>>>>>>> streams
>>>>>>>>>>>>>> applications or a regular consumer group?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 6/26/17 1:28 PM, Jorge Esteban Quilcate Otoya wrote:
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd like to start the discussion to add reset offset tooling
>>>> for
>>>>>>>>>>> Stream
>>>>>>>>>>>>>>> applications.
>>>>>>>>>>>>>>> The KIP can be found here:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>> 171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Jorge.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -- Guozhang
>>>>
>>>
>>
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to