Re: [VOTE] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-29 Thread Jorge Esteban Quilcate Otoya
Thanks everyone for voting!

With 3 binding votes (Matthias, Guozhang, and John) and 2 non-binding
votes (Leah and Sophie), I will mark this KIP as accepted.

Thanks,

Jorge.


On Tue, Jul 28, 2020 at 3:27 AM Matthias J. Sax  wrote:

> +1 (binding)
>
> On 7/27/20 4:55 PM, Guozhang Wang wrote:
> > +1. Thanks Jorge for bringing in this KIP!
> >
> > Guozhang
> >
> > On Mon, Jul 27, 2020 at 10:07 AM Leah Thomas 
> wrote:
> >
> >> Hi Jorge,
> >>
> >> Looks great. +1 (non-binding)
> >>
> >> Best,
> >> Leah
> >>
> >> On Thu, Jul 16, 2020 at 6:39 PM Sophie Blee-Goldman <
> sop...@confluent.io>
> >> wrote:
> >>
> >>> Hey Jorge,
> >>>
> >>> Thanks for the reminder -- +1 (non-binding)
> >>>
> >>> Cheers,
> >>> Sophie
> >>>
> >>> On Thu, Jul 16, 2020 at 4:06 PM Jorge Esteban Quilcate Otoya <
> >>> quilcate.jo...@gmail.com> wrote:
> >>>
> >>>> Bumping this vote thread to check if there's any feedback.
> >>>>
> >>>> Cheers,
> >>>> Jorge.
> >>>>
> >>>> On Sat, Jul 4, 2020 at 6:20 PM John Roesler 
> >> wrote:
> >>>>
> >>>>> Thanks Jorge,
> >>>>>
> >>>>> I’m +1 (binding)
> >>>>>
> >>>>> -John
> >>>>>
> >>>>> On Fri, Jul 3, 2020, at 10:26, Jorge Esteban Quilcate Otoya wrote:
> >>>>>> Hola everyone,
> >>>>>>
> >>>>>> I'd like to start a new thread to vote for KIP-617 as there have
> >> been
> >>>>>> significant changes since the previous vote started.
> >>>>>>
> >>>>>> KIP wiki page:
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> >>>>>>
> >>>>>> Many thanks!
> >>>>>>
> >>>>>> Jorge.
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
> >
>
>


Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-23 Thread Jorge Esteban Quilcate Otoya
Thanks Matthias.

1. PR looks good to me.

2. About adding default methods: for correctness and backward
compatibility, I think they should be added to avoid breaking potential
client implementations.
I have updated the KIP accordingly.
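To make the compatibility argument concrete, here is a minimal sketch of the default-method approach. The names `ReadOnlyStore`, `fetch`, and `backwardFetch` are illustrative stand-ins, not the actual KIP-617 signatures:

```java
import java.util.Iterator;
import java.util.List;

public class DefaultMethodSketch {

    interface ReadOnlyStore<K, V> {
        Iterator<V> fetch(K key);

        // New method added later. The default implementation keeps older
        // implementors source- and binary-compatible: they compile unchanged,
        // and callers get a clear error if the store has not opted in.
        default Iterator<V> backwardFetch(K key) {
            throw new UnsupportedOperationException(
                "backwardFetch is not supported by " + getClass().getName());
        }
    }

    // An "old" implementation written before backwardFetch existed.
    static class LegacyStore implements ReadOnlyStore<String, Integer> {
        @Override
        public Iterator<Integer> fetch(String key) {
            return List.of(1, 2, 3).iterator();
        }
    }

    public static void main(String[] args) {
        ReadOnlyStore<String, Integer> store = new LegacyStore();
        System.out.println(store.fetch("k").next());
        try {
            store.backwardFetch("k");
        } catch (UnsupportedOperationException e) {
            System.out.println("unsupported");
        }
    }
}
```

Existing stores keep working untouched; only stores that override the new method actually support backward iteration.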


Jorge.

On Thu, Jul 23, 2020 at 5:19 AM Matthias J. Sax  wrote:

> Finally cycling back to this.
>
> Overall I like the KIP.
>
> Two comments:
>
>  - I tried to figure out why the two InMemorySessionStore methods are
> deprecated, and it seems those annotations have been there since the class
> was added; as this seems to be a bug and there are no backward
> compatibility concerns, I just did a PR to remove those annotations:
>
> https://github.com/apache/kafka/pull/9061
>
>
>  - about moving the "read" methods from SessionStore to
> ReadOnlySessionStore: while I agree that this makes sense, it is
> strictly speaking backward incompatible if we don't add default
> implementations. On the other hand, it seems the likelihood that
> one only implements ReadOnlySessionStore is basically zero, so I am not
> sure if it's worth bothering?
>
>
> -Matthias
>
> On 7/4/20 7:02 AM, John Roesler wrote:
> > Thanks Jorge,
> >
> > This KIP looks good to me!
> >
> > -John
> >
> > On Fri, Jul 3, 2020, at 03:19, Jorge Esteban Quilcate Otoya wrote:
> >> Hi John,
> >>
> >> Thanks for the feedback.
> >>
> >> I'd be happy to take the third option and consider moving methods to
> >> ReadOnlySessionStore as part of the KIP.
> >> Docs are updated to reflect these changes.
> >>
> >> Cheers,
> >> Jorge.
> >>
> >> On Thu, Jul 2, 2020 at 10:06 PM John Roesler 
> wrote:
> >>
> >>> Hey Jorge,
> >>>
> >>> Thanks for the details. That sounds like a mistake to me on both
> counts.
> >>>
> >>> I don’t think you need to worry about those deprecations. If the
> >>> interface methods aren’t deprecated, then the methods are not
> deprecated.
> >>> We should remove the annotations, but it doesn’t need to be in the kip.
> >>>
> >>> I think any query methods should have been in the ReadOnly interface. I
> >>> guess it’s up to you whether you want to:
> >>> 1. Add the reverse methods next to the existing methods (what you have
> in
> >>> the kip right now)
> >>> 2. Partially fix it by adding your new methods to the ReadOnly
> interface
> >>> 3. Fully fix the problem by moving the existing methods as well as your
> >>> new ones. Since SessionStore extends ReadOnlySessionStore, it’s ok
> just to
> >>> move the definitions.
> >>>
> >>> I’m ok with whatever you prefer.
> >>>
> >>> Thanks,
> >>> John
> >>>
> >>> On Thu, Jul 2, 2020, at 11:29, Jorge Esteban Quilcate Otoya wrote:
> >>>> (sorry for the spam)
> >>>>
> >>>> Actually `findSessions` are only deprecated on `InMemorySessionStore`,
> >>>> which seems strange as RocksDB and interfaces haven't marked these
> >>> methods
> >>>> as deprecated.
> >>>>
> >>>> Any hint on how to handle this?
> >>>>
> >>>> Cheers,
> >>>>
> >>>> On Thu, Jul 2, 2020 at 4:57 PM Jorge Esteban Quilcate Otoya <
> >>>> quilcate.jo...@gmail.com> wrote:
> >>>>
> >>>>> @John: I can see there are some deprecations in there as well. Will
> >>> update
> >>>>> the KIP.
> >>>>>
> >>>>> Thanks,
> >>>>> Jorge
> >>>>>
> >>>>>
> >>>>> On Thu, Jul 2, 2020 at 3:29 PM Jorge Esteban Quilcate Otoya <
> >>>>> quilcate.jo...@gmail.com> wrote:
> >>>>>
> >>>>>> Thanks John.
> >>>>>>
> >>>>>>> it looks like there’s a revision error in which two methods are
> >>>>>> proposed for SessionStore, but seem like they should be in
> >>>>>> ReadOnlySessionStore. Do I read that right?
> >>>>>>
> >>>>>> Yes, I've opted to keep the new methods alongside the existing ones.
> >>> In
> >>>>>> the case of SessionStore, `findSessions` are in `SessionStore`, and
> >>> `fetch`
> >>>>>> are in `ReadOnlySessionStore`. If it makes more sense, I can move all
> >>>>>> of them to ReadOnlySessionStore.

Re: [VOTE] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-16 Thread Jorge Esteban Quilcate Otoya
Bumping this vote thread to check if there's any feedback.

Cheers,
Jorge.

On Sat, Jul 4, 2020 at 6:20 PM John Roesler  wrote:

> Thanks Jorge,
>
> I’m +1 (binding)
>
> -John
>
> On Fri, Jul 3, 2020, at 10:26, Jorge Esteban Quilcate Otoya wrote:
> > Hola everyone,
> >
> > I'd like to start a new thread to vote for KIP-617 as there have been
> > significant changes since the previous vote started.
> >
> > KIP wiki page:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> >
> > Many thanks!
> >
> > Jorge.
> >
>


Re: [DISCUSS] KIP-634: Complementary support for headers in Kafka Streams DSL

2020-07-16 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

Bumping this thread to check if there's any feedback.

Cheers,
Jorge.

On Tue, Jun 30, 2020 at 12:46 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hi everyone,
>
> I would like to start the discussion for KIP-634:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-634%3A+Complementary+support+for+headers+in+Kafka+Streams+DSL
>
> Looking forward to your feedback.
>
> Thanks!
> Jorge.
>
>
>
>


[VOTE] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-03 Thread Jorge Esteban Quilcate Otoya
Hola everyone,

I'd like to start a new thread to vote for KIP-617 as there have been
significant changes since the previous vote started.

KIP wiki page:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards

Many thanks!

Jorge.


Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-03 Thread Jorge Esteban Quilcate Otoya
Hi John,

Thanks for the feedback.

I'd be happy to take the third option and consider moving methods to
ReadOnlySessionStore as part of the KIP.
Docs are updated to reflect these changes.

Cheers,
Jorge.

On Thu, Jul 2, 2020 at 10:06 PM John Roesler  wrote:

> Hey Jorge,
>
> Thanks for the details. That sounds like a mistake to me on both counts.
>
> I don’t think you need to worry about those deprecations. If the
> interface methods aren’t deprecated, then the methods are not deprecated.
> We should remove the annotations, but it doesn’t need to be in the kip.
>
> I think any query methods should have been in the ReadOnly interface. I
> guess it’s up to you whether you want to:
> 1. Add the reverse methods next to the existing methods (what you have in
> the kip right now)
> 2. Partially fix it by adding your new methods to the ReadOnly interface
> 3. Fully fix the problem by moving the existing methods as well as your
> new ones. Since SessionStore extends ReadOnlySessionStore, it’s ok just to
> move the definitions.
>
> I’m ok with whatever you prefer.
>
> Thanks,
> John
>
> On Thu, Jul 2, 2020, at 11:29, Jorge Esteban Quilcate Otoya wrote:
> > (sorry for the spam)
> >
> > Actually `findSessions` are only deprecated on `InMemorySessionStore`,
> > which seems strange as RocksDB and interfaces haven't marked these
> methods
> > as deprecated.
> >
> > Any hint on how to handle this?
> >
> > Cheers,
> >
> > On Thu, Jul 2, 2020 at 4:57 PM Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > @John: I can see there are some deprecations in there as well. Will
> update
> > > the KIP.
> > >
> > > Thanks,
> > > Jorge
> > >
> > >
> > > On Thu, Jul 2, 2020 at 3:29 PM Jorge Esteban Quilcate Otoya <
> > > quilcate.jo...@gmail.com> wrote:
> > >
> > >> Thanks John.
> > >>
> > >> > it looks like there’s a revision error in which two methods are
> > >> proposed for SessionStore, but seem like they should be in
> > >> ReadOnlySessionStore. Do I read that right?
> > >>
> > >> Yes, I've opted to keep the new methods alongside the existing ones.
> In
> > >> the case of SessionStore, `findSessions` are in `SessionStore`, and
> `fetch`
> > >> are in `ReadOnlySessionStore`. If it makes more sense, I can move all
> of
> > >> them to ReadOnlySessionStore.
> > >> Let me know what you think.
> > >>
> > >> Thanks,
> > >> Jorge.
> > >>
> > >> On Thu, Jul 2, 2020 at 2:36 PM John Roesler 
> wrote:
> > >>
> > >>> Hi Jorge,
> > >>>
> > >>> Thanks for the update. I think this is a good plan.
> > >>>
> > >>> I just took a look at the KIP again, and it looks like there’s a
> > >>> revision error in which two methods are proposed for SessionStore,
> but seem
> > >>> like they should be in ReadOnlySessionStore. Do I read that right?
> > >>>
> > >>> Otherwise, I’m happy with your proposal.
> > >>>
> > >>> Thanks,
> > >>> John
> > >>>
> > >>> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote:
> > >>> > Quick update: KIP is updated with latest changes now.
> > >>> > Will leave it open this week while working on the PR.
> > >>> >
> > >>> > Hope to open a new vote thread over the next few days if no
> additional
> > >>> > feedback is provided.
> > >>> >
> > >>> > Cheers,
> > >>> > Jorge.
> > >>> >
> > >>> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya <
> > >>> > quilcate.jo...@gmail.com> wrote:
> > >>> >
> > >>> > > Thanks, John!
> > >>> > >
> > >>> > > Makes sense to reconsider the current approach. I was heading in a
> > >>> similar
> > >>> > > direction while drafting the implementation. Metered, Caching,
> and
> > >>> other
> > >>> > > layers will also have to get duplicated to build up new methods
> in
> > >>> `Stores`
> > >>> > > factory, and class casting issues would appear on stores created
> by
> > >>> DSL.
> > >>> > >
> > >>> > 

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-02 Thread Jorge Esteban Quilcate Otoya
(sorry for the spam)

Actually `findSessions` are only deprecated on `InMemorySessionStore`,
which seems strange as RocksDB and interfaces haven't marked these methods
as deprecated.

Any hint on how to handle this?

Cheers,

On Thu, Jul 2, 2020 at 4:57 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> @John: I can see there are some deprecations in there as well. Will update
> the KIP.
>
> Thanks,
> Jorge
>
>
> On Thu, Jul 2, 2020 at 3:29 PM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
>> Thanks John.
>>
>> > it looks like there’s a revision error in which two methods are
>> proposed for SessionStore, but seem like they should be in
>> ReadOnlySessionStore. Do I read that right?
>>
>> Yes, I've opted to keep the new methods alongside the existing ones. In
>> the case of SessionStore, `findSessions` are in `SessionStore`, and `fetch`
>> are in `ReadOnlySessionStore`. If it makes more sense, I can move all of
>> them to ReadOnlySessionStore.
>> Let me know what you think.
>>
>> Thanks,
>> Jorge.
>>
>> On Thu, Jul 2, 2020 at 2:36 PM John Roesler  wrote:
>>
>>> Hi Jorge,
>>>
>>> Thanks for the update. I think this is a good plan.
>>>
>>> I just took a look at the KIP again, and it looks like there’s a
>>> revision error in which two methods are proposed for SessionStore, but seem
>>> like they should be in ReadOnlySessionStore. Do I read that right?
>>>
>>> Otherwise, I’m happy with your proposal.
>>>
>>> Thanks,
>>> John
>>>
>>> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote:
>>> > Quick update: KIP is updated with latest changes now.
>>> > Will leave it open this week while working on the PR.
>>> >
>>> > Hope to open a new vote thread over the next few days if no additional
>>> > feedback is provided.
>>> >
>>> > Cheers,
>>> > Jorge.
>>> >
>>> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya <
>>> > quilcate.jo...@gmail.com> wrote:
>>> >
>>> > > Thanks, John!
>>> > >
> >>> > > Makes sense to reconsider the current approach. I was heading in a
>>> similar
>>> > > direction while drafting the implementation. Metered, Caching, and
>>> other
>>> > > layers will also have to get duplicated to build up new methods in
>>> `Stores`
>>> > > factory, and class casting issues would appear on stores created by
>>> DSL.
>>> > >
>>> > > I will draft a proposal with new methods (move methods from proposed
>>> > > interfaces to existing ones) with default implementation in a KIP
>>> update
>>> > > and wait for Matthias to chime in to validate this approach.
>>> > >
>>> > > Jorge.
>>> > >
>>> > >
>>> > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler 
>>> wrote:
>>> > >
>>> > >> Hi Jorge,
>>> > >>
>>> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1
>>> releases.
>>> > >>
>>> > >> The idea to separate the new methods into "mixin" interfaces seems
>>> > >> like a good one, but as we've discovered in KIP-614, it doesn't work
>>> > >> out that way in practice. The problem is that the store
>>> implementations
>>> > >> are just the base layer that get composed with other layers in
>>> Streams
>>> > >> before they can be accessed in the DSL. This is extremely subtle, so
>>> > >> I'm going to put everyone to sleep with a detailed explanation:
>>> > >>
>>> > >> For example, this is the mechanism by which all KeyValueStore
>>> > >> implementations get added to Streams:
>>> > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build
>>> > >> return new MeteredKeyValueStore<>(
>>> > >>   maybeWrapCaching(maybeWrapLogging(storeSupplier.get())),
>>> > >>   storeSupplier.metricsScope(),
>>> > >>   time,
>>> > >>   keySerde,
>>> > >>   valueSerde
>>> > >> );
>>> > >>
>>> > >> In the DSL, the store that a processor gets from the context would
>>> be
>>> > >

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-02 Thread Jorge Esteban Quilcate Otoya
@John: I can see there are some deprecations in there as well. Will update
the KIP.

Thanks,
Jorge


On Thu, Jul 2, 2020 at 3:29 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks John.
>
> > it looks like there’s a revision error in which two methods are proposed
> for SessionStore, but seem like they should be in ReadOnlySessionStore. Do
> I read that right?
>
> Yes, I've opted to keep the new methods alongside the existing ones. In
> the case of SessionStore, `findSessions` are in `SessionStore`, and `fetch`
> are in `ReadOnlySessionStore`. If it makes more sense, I can move all of
> them to ReadOnlySessionStore.
> Let me know what you think.
>
> Thanks,
> Jorge.
>
> On Thu, Jul 2, 2020 at 2:36 PM John Roesler  wrote:
>
>> Hi Jorge,
>>
>> Thanks for the update. I think this is a good plan.
>>
>> I just took a look at the KIP again, and it looks like there’s a revision
>> error in which two methods are proposed for SessionStore, but seem like
>> they should be in ReadOnlySessionStore. Do I read that right?
>>
>> Otherwise, I’m happy with your proposal.
>>
>> Thanks,
>> John
>>
>> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote:
>> > Quick update: KIP is updated with latest changes now.
>> > Will leave it open this week while working on the PR.
>> >
>> > Hope to open a new vote thread over the next few days if no additional
>> > feedback is provided.
>> >
>> > Cheers,
>> > Jorge.
>> >
>> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya <
>> > quilcate.jo...@gmail.com> wrote:
>> >
>> > > Thanks, John!
>> > >
> >> > > Makes sense to reconsider the current approach. I was heading in a
>> similar
>> > > direction while drafting the implementation. Metered, Caching, and
>> other
>> > > layers will also have to get duplicated to build up new methods in
>> `Stores`
>> > > factory, and class casting issues would appear on stores created by
>> DSL.
>> > >
>> > > I will draft a proposal with new methods (move methods from proposed
>> > > interfaces to existing ones) with default implementation in a KIP
>> update
>> > > and wait for Matthias to chime in to validate this approach.
>> > >
>> > > Jorge.
>> > >
>> > >
>> > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler 
>> wrote:
>> > >
>> > >> Hi Jorge,
>> > >>
>> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1
>> releases.
>> > >>
>> > >> The idea to separate the new methods into "mixin" interfaces seems
>> > >> like a good one, but as we've discovered in KIP-614, it doesn't work
>> > >> out that way in practice. The problem is that the store
>> implementations
>> > >> are just the base layer that get composed with other layers in
>> Streams
>> > >> before they can be accessed in the DSL. This is extremely subtle, so
>> > >> I'm going to put everyone to sleep with a detailed explanation:
>> > >>
>> > >> For example, this is the mechanism by which all KeyValueStore
>> > >> implementations get added to Streams:
>> > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build
>> > >> return new MeteredKeyValueStore<>(
>> > >>   maybeWrapCaching(maybeWrapLogging(storeSupplier.get())),
>> > >>   storeSupplier.metricsScope(),
>> > >>   time,
>> > >>   keySerde,
>> > >>   valueSerde
>> > >> );
>> > >>
>> > >> In the DSL, the store that a processor gets from the context would be
>> > >> the result of this composition. So even if the storeSupplier.get()
>> returns
>> > >> a store that implements the "reverse" interface, when you try to use
>> it
>> > >> from a processor like:
>> > >> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init
>> > >> ReadOnlyBackwardWindowStore store =
>> > >>   (ReadOnlyBackwardWindowStore) context.getStateStore(..)
>> > >>
>> > >> You'd just get a ClassCastException because it's actually a
>> > >> MeteredKeyValueStore, which doesn't implement
>> > >> ReadOnlyBackwardWindowStore.
>> > >>
>> > >> The only w

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-02 Thread Jorge Esteban Quilcate Otoya
Thanks John.

> it looks like there’s a revision error in which two methods are proposed
for SessionStore, but seem like they should be in ReadOnlySessionStore. Do
I read that right?

Yes, I've opted to keep the new methods alongside the existing ones. In the
case of SessionStore, `findSessions` are in `SessionStore`, and `fetch` are
in `ReadOnlySessionStore`. If it makes more sense, I can move all of them
to ReadOnlySessionStore.
Let me know what you think.

Thanks,
Jorge.

On Thu, Jul 2, 2020 at 2:36 PM John Roesler  wrote:

> Hi Jorge,
>
> Thanks for the update. I think this is a good plan.
>
> I just took a look at the KIP again, and it looks like there’s a revision
> error in which two methods are proposed for SessionStore, but seem like
> they should be in ReadOnlySessionStore. Do I read that right?
>
> Otherwise, I’m happy with your proposal.
>
> Thanks,
> John
>
> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote:
> > Quick update: KIP is updated with latest changes now.
> > Will leave it open this week while working on the PR.
> >
> > Hope to open a new vote thread over the next few days if no additional
> > feedback is provided.
> >
> > Cheers,
> > Jorge.
> >
> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Thanks, John!
> > >
> > > Makes sense to reconsider the current approach. I was heading in a
> similar
> > > direction while drafting the implementation. Metered, Caching, and
> other
> > > layers will also have to get duplicated to build up new methods in
> `Stores`
> > > factory, and class casting issues would appear on stores created by
> DSL.
> > >
> > > I will draft a proposal with new methods (move methods from proposed
> > > interfaces to existing ones) with default implementation in a KIP
> update
> > > and wait for Matthias to chime in to validate this approach.
> > >
> > > Jorge.
> > >
> > >
> > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler 
> wrote:
> > >
> > >> Hi Jorge,
> > >>
> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1
> releases.
> > >>
> > >> The idea to separate the new methods into "mixin" interfaces seems
> > >> like a good one, but as we've discovered in KIP-614, it doesn't work
> > >> out that way in practice. The problem is that the store
> implementations
> > >> are just the base layer that get composed with other layers in Streams
> > >> before they can be accessed in the DSL. This is extremely subtle, so
> > >> I'm going to put everyone to sleep with a detailed explanation:
> > >>
> > >> For example, this is the mechanism by which all KeyValueStore
> > >> implementations get added to Streams:
> > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build
> > >> return new MeteredKeyValueStore<>(
> > >>   maybeWrapCaching(maybeWrapLogging(storeSupplier.get())),
> > >>   storeSupplier.metricsScope(),
> > >>   time,
> > >>   keySerde,
> > >>   valueSerde
> > >> );
> > >>
> > >> In the DSL, the store that a processor gets from the context would be
> > >> the result of this composition. So even if the storeSupplier.get()
> returns
> > >> a store that implements the "reverse" interface, when you try to use
> it
> > >> from a processor like:
> > >> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init
> > >> ReadOnlyBackwardWindowStore store =
> > >>   (ReadOnlyBackwardWindowStore) context.getStateStore(..)
> > >>
> > >> You'd just get a ClassCastException because it's actually a
> > >> MeteredKeyValueStore, which doesn't implement
> > >> ReadOnlyBackwardWindowStore.
> > >>
> > >> The only way to make this work would be to make the Metered,
> > >> Caching, and Logging layers also implement the new interfaces,
> > >> but this effectively forces implementations to also implement
> > >> the interface. Otherwise, the intermediate layers would have to
> > >> cast the store in each method, like this:
> > >> MeteredWindowStore#backwardFetch {
> > >>   ((ReadOnlyBackwardWindowStore) innerStore).backwardFetch(..)
> > >> }
> > >>
> > >> And then if the implementation doesn't "opt in" by implementing
> > >> the interface, you'd get a ClassCastException, not when you get the
> > >> store, but when you try to use it.

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-01 Thread Jorge Esteban Quilcate Otoya
Quick update: KIP is updated with latest changes now.
Will leave it open this week while working on the PR.

Hope to open a new vote thread over the next few days if no additional
feedback is provided.

Cheers,
Jorge.

On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks, John!
>
> Makes sense to reconsider the current approach. I was heading in a similar
> direction while drafting the implementation. Metered, Caching, and other
> layers will also have to get duplicated to build up new methods in `Stores`
> factory, and class casting issues would appear on stores created by DSL.
>
> I will draft a proposal with new methods (move methods from proposed
> interfaces to existing ones) with default implementation in a KIP update
> and wait for Matthias to chime in to validate this approach.
>
> Jorge.
>
>
> On Sat, Jun 27, 2020 at 4:01 PM John Roesler  wrote:
>
>> Hi Jorge,
>>
>> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1 releases.
>>
>> The idea to separate the new methods into "mixin" interfaces seems
>> like a good one, but as we've discovered in KIP-614, it doesn't work
>> out that way in practice. The problem is that the store implementations
>> are just the base layer that get composed with other layers in Streams
>> before they can be accessed in the DSL. This is extremely subtle, so
>> I'm going to put everyone to sleep with a detailed explanation:
>>
>> For example, this is the mechanism by which all KeyValueStore
>> implementations get added to Streams:
>> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build
>> return new MeteredKeyValueStore<>(
>>   maybeWrapCaching(maybeWrapLogging(storeSupplier.get())),
>>   storeSupplier.metricsScope(),
>>   time,
>>   keySerde,
>>   valueSerde
>> );
>>
>> In the DSL, the store that a processor gets from the context would be
>> the result of this composition. So even if the storeSupplier.get() returns
>> a store that implements the "reverse" interface, when you try to use it
>> from a processor like:
>> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init
>> ReadOnlyBackwardWindowStore store =
>>   (ReadOnlyBackwardWindowStore) context.getStateStore(..)
>>
>> You'd just get a ClassCastException because it's actually a
>> MeteredKeyValueStore, which doesn't implement
>> ReadOnlyBackwardWindowStore.
>>
>> The only way to make this work would be to make the Metered,
>> Caching, and Logging layers also implement the new interfaces,
>> but this effectively forces implementations to also implement
>> the interface. Otherwise, the intermediate layers would have to
>> cast the store in each method, like this:
>> MeteredWindowStore#backwardFetch {
>>   ((ReadOnlyBackwardWindowStore) innerStore).backwardFetch(..)
>> }
>>
>> And then if the implementation doesn't "opt in" by implementing
>> the interface, you'd get a ClassCastException, not when you get the
>> store, but when you try to use it.
>>
>> The fact that we get ClassCastExceptions no matter which way we
>> turn here indicates that we're really not getting any benefit from the
>> type system, which makes the extra interfaces seem not worth all the
>> code involved.
>>
>> Where we landed in KIP-614 is that, unless we want to completely
>> revamp the way that StateStores work in the DSL, you might as
>> well just add the new methods to the existing interfaces. To prevent
>> compilation errors, we can add default implementations that throw
>> UnsupportedOperationException. If a store doesn't opt in by
>> implementing the methods, you'd get an UnsupportedOperationException,
>> which seems no worse, and maybe better, than the ClassCastException
>> you'd get if we go with the "mixin interface" approach.
>>
>> A quick note: This entire discussion focuses on the DSL. If you're just
>> using the Processor API by directly adding a custom store to the
>> Topology:
>> org.apache.kafka.streams.Topology#addStateStore
>> and then retrieving it in the processor via:
>> org.apache.kafka.streams.processor.ProcessorContext#getStateStore
>> in
>> org.apache.kafka.streams.processor.Processor#init
>>
>> Then, you can both register and retrieve _any_ StateStore implementation.
>> There's no need to use KeyValueStore or any other built-in interface.
>> In other words, KeyValueStore and company are only part of the DSL,
>> not the PAPI. So, discussions about the built-in store interfaces are only
>> really relevant in the context of the DSL, Transformers, and Materialized.

[DISCUSS] KIP-634: Complementary support for headers in Kafka Streams DSL

2020-06-29 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I would like to start the discussion for KIP-634:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-634%3A+Complementary+support+for+headers+in+Kafka+Streams+DSL

Looking forward to your feedback.

Thanks!
Jorge.


Re: [VOTE] KIP-418: A method-chaining way to branch KStream

2020-06-29 Thread Jorge Esteban Quilcate Otoya
This will be a great addition. Thanks Ivan!

+1 (non-binding)

On Fri, Jun 26, 2020 at 7:07 PM John Roesler  wrote:

> Thanks, Ivan!
>
> I’m +1 (binding)
>
> -John
>
> On Thu, May 28, 2020, at 17:24, Ivan Ponomarev wrote:
> > Hello all!
> >
> > I'd like to start the vote for KIP-418 which proposes deprecation of
> > current `branch` method and provides a method-chaining based API for
> > branching.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-418%3A+A+method-chaining+way+to+branch+KStream
> >
> > Regards,
> >
> > Ivan
> >
>


Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-06-29 Thread Jorge Esteban Quilcate Otoya
Thanks, John!

Makes sense to reconsider the current approach. I was heading in a similar
direction while drafting the implementation. Metered, Caching, and other
layers will also have to get duplicated to build up new methods in `Stores`
factory, and class casting issues would appear on stores created by DSL.

I will draft a proposal with new methods (move methods from proposed
interfaces to existing ones) with default implementation in a KIP update
and wait for Matthias to chime in to validate this approach.

Jorge.


On Sat, Jun 27, 2020 at 4:01 PM John Roesler  wrote:

> Hi Jorge,
>
> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1 releases.
>
> The idea to separate the new methods into "mixin" interfaces seems
> like a good one, but as we've discovered in KIP-614, it doesn't work
> out that way in practice. The problem is that the store implementations
> are just the base layer that get composed with other layers in Streams
> before they can be accessed in the DSL. This is extremely subtle, so
> I'm going to put everyone to sleep with a detailed explanation:
>
> For example, this is the mechanism by which all KeyValueStore
> implementations get added to Streams:
> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build
> return new MeteredKeyValueStore<>(
>   maybeWrapCaching(maybeWrapLogging(storeSupplier.get())),
>   storeSupplier.metricsScope(),
>   time,
>   keySerde,
>   valueSerde
> );
>
> In the DSL, the store that a processor gets from the context would be
> the result of this composition. So even if the storeSupplier.get() returns
> a store that implements the "reverse" interface, when you try to use it
> from a processor like:
> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init
> ReadOnlyBackwardWindowStore store =
>   (ReadOnlyBackwardWindowStore) context.getStateStore(..)
>
> You'd just get a ClassCastException because it's actually a
> MeteredKeyValueStore, which doesn't implement
> ReadOnlyBackwardWindowStore.
>
> The only way to make this work would be to make the Metered,
> Caching, and Logging layers also implement the new interfaces,
> but this effectively forces implementations to also implement
> the interface. Otherwise, the intermediate layers would have to
> cast the store in each method, like this:
> MeteredWindowStore#backwardFetch {
>   ((ReadOnlyBackwardWindowStore) innerStore).backwardFetch(..)
> }
>
> And then if the implementation doesn't "opt in" by implementing
> the interface, you'd get a ClassCastException, not when you get the
> store, but when you try to use it.
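The layering problem described above can be sketched in plain Java. The class and interface names here (`MeteredStore`, `InMemoryStore`, `BackwardIterable`) are illustrative stand-ins, not the real Kafka Streams internals:

```java
public class MixinCastSketch {

    interface KeyValueStore { String get(String key); }

    // Hypothetical mixin interface a base store might opt into.
    interface BackwardIterable { String backwardGet(String key); }

    static class InMemoryStore implements KeyValueStore, BackwardIterable {
        public String get(String key) { return "forward:" + key; }
        public String backwardGet(String key) { return "backward:" + key; }
    }

    // The wrapper layer composed around every store; it knows nothing
    // about the mixin, so the wrapped store's capability is hidden.
    static class MeteredStore implements KeyValueStore {
        private final KeyValueStore inner;
        MeteredStore(KeyValueStore inner) { this.inner = inner; }
        public String get(String key) { return inner.get(key); }
    }

    public static void main(String[] args) {
        KeyValueStore store = new MeteredStore(new InMemoryStore());
        System.out.println(store instanceof BackwardIterable);
        try {
            // What a DSL processor would try after getStateStore(..):
            BackwardIterable b = (BackwardIterable) store;
            System.out.println(b.backwardGet("k"));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```

Even though the inner store implements the mixin, the outermost wrapper does not, so the cast fails at runtime exactly as described.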
>
> The fact that we get ClassCastExceptions no matter which way we
> turn here indicates that we're really not getting any benefit from the
> type system, which makes the extra interfaces seem not worth all the
> code involved.
>
> Where we landed in KIP-614 is that, unless we want to completely
> revamp the way that StateStores work in the DSL, you might as
> well just add the new methods to the existing interfaces. To prevent
> compilation errors, we can add default implementations that throw
> UnsupportedOperationException. If a store doesn't opt in by
> implementing the methods, you'd get an UnsupportedOperationException,
> which seems no worse, and maybe better, than the ClassCastException
> you'd get if we go with the "mixin interface" approach.
>
> A quick note: This entire discussion focuses on the DSL. If you're just
> using the Processor API by directly adding a custom store to the
> Topology:
> org.apache.kafka.streams.Topology#addStateStore
> and then retrieving it in the processor via:
> org.apache.kafka.streams.processor.ProcessorContext#getStateStore
> in
> org.apache.kafka.streams.processor.Processor#init
>
> Then, you can both register and retrieve _any_ StateStore implementation.
> There's no need to use KeyValueStore or any other built-in interface.
> In other words, KeyValueStore and company are only part of the DSL,
> not the PAPI. So, discussions about the built-in store interfaces are only
> really relevant in the context of the DSL, Transformers, and Materialized.
>
> So, in conclusion, I'd really recommend just adding any new methods to
> the existing store interfaces. We might be able to revamp the API in the
> future to support mixins, but it's a much larger scope of work than this
> KIP.
> A more minor comment is that we don't need to add Deprecated variants
> of new methods.
>
> Thanks again, and once again, I'm sorry I tuned out and didn't offer this
> feedback before you revised the KIP.
> -John
>
>
>
>
> On Mon, Jun 22, 2020, at 06:11, Jorge Esteban Quilcate Otoya wrote:
> > Hi everyone,
> >
> > I'
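John's suggested approach — adding the new methods as default methods that throw
UnsupportedOperationException until a store opts in — can be sketched as follows
(a self-contained illustration with simplified stand-in names, not the actual
Streams interfaces):

```java
import java.util.Iterator;
import java.util.List;

public class DefaultMethodSketch {

    // Simplified stand-in for a store interface such as ReadOnlyKeyValueStore.
    interface ReadOnlyStore<K, V> {
        Iterator<V> all();

        // New method added with a default implementation, so existing custom
        // stores keep compiling. Stores that don't opt in only fail when the
        // method is actually called -- at call time, not at cast time.
        default Iterator<V> reverseAll() {
            throw new UnsupportedOperationException(
                "reverseAll() is not supported by " + getClass().getName());
        }
    }

    // A store that does NOT opt in to the new method.
    static class LegacyStore implements ReadOnlyStore<String, Integer> {
        @Override
        public Iterator<Integer> all() {
            return List.of(1, 2, 3).iterator();
        }
    }

    public static void main(String[] args) {
        ReadOnlyStore<String, Integer> store = new LegacyStore();
        store.all(); // works exactly as before
        try {
            store.reverseAll(); // UnsupportedOperationException, not ClassCastException
        } catch (UnsupportedOperationException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```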

Re: New Website Layout

2020-06-26 Thread Jorge Esteban Quilcate Otoya
Looks great!!

A small comment about menus: `Get Started` and `Docs` pages have different
UX.
While Get Started pages use the whole page, Docs have a menu on the left
side.
I'd like Docs pages to also use most of the page. For instance, config and
metrics tables could look more readable using more space.
Would it be possible to make the left menu expand/collapse, similar to
the current Confluence wiki menu?

Thanks,
Jorge.

On Fri, Jun 26, 2020 at 11:49 AM Ben Stopford  wrote:

> Hey folks
>
> We've made some updates to the website's look and feel. There is a staged
> version in the link below.
>
> https://ec2-13-57-18-236.us-west-1.compute.amazonaws.com/
> username: kafka
> password: streaming
>
> Comments welcomed.
>
> Ben
>


Re: [VOTE] KIP-629: Use racially neutral terms in our codebase

2020-06-26 Thread Jorge Esteban Quilcate Otoya
+1 (non-binding)
Thank you Xavier!

On Fri, Jun 26, 2020 at 8:38 AM Bruno Cadonna  wrote:

> +1 (non-binding)
>
> On Fri, Jun 26, 2020 at 3:41 AM Jay Kreps  wrote:
> >
> > +1
> >
> > On Thu, Jun 25, 2020 at 6:39 PM Bill Bejeck  wrote:
> >
> > > Thanks for this KIP Xavier.
> > >
> > > +1(binding)
> > >
> > > -Bill
> > >
> > > On Thu, Jun 25, 2020 at 9:04 PM Gwen Shapira 
> wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Thank you Xavier!
> > > >
> > > > On Thu, Jun 25, 2020, 3:44 PM Xavier Léauté  wrote:
> > > >
> > > > > Hi Everyone,
> > > > >
> > > > > I would like to initiate the voting process for KIP-629.
> > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-629%3A+Use+racially+neutral+terms+in+our+codebase
> > > > >
> > > > > Thank you,
> > > > > Xavier
> > > > >
> > > >
> > >
>


Re: [DISCUSS] KIP-629: Use racially neutral terms in our codebase

2020-06-24 Thread Jorge Esteban Quilcate Otoya
Hi Xavier,

Thank you for this proposal! It's awesome to see the community taking
action on this.

`include` and `exclude` make sense to me.

Probably obvious but is documentation and website considered as well as
part of the KIP?
This would be interesting because it could be also important to make these
guidelines explicit for other components in the ecosystem (e.g. connectors)
to follow.
What do you think?

Cheers,
Jorge.

On Tue, Jun 23, 2020 at 8:38 AM Bruno Cadonna  wrote:

> Hi Xavier,
>
> Thank you very much for starting this initiative!
> Not only for the changes to the code base but also for showing me
> where and how we can use more appropriate terms in general.
>
> Best,
> Bruno
>
> On Tue, Jun 23, 2020 at 4:17 AM John Roesler  wrote:
> >
> > Hi Xavier,
> >
> > I think your approach made a lot of sense; I definitely didn’t mean to
> criticize. Thanks for the update! The new names look good to me.
> >
> > -John
> >
> > On Mon, Jun 22, 2020, at 18:50, Matthias J. Sax wrote:
> > > Great initiative!
> > >
> > > I liked the proposed names, too.
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 6/22/20 4:48 PM, Guozhang Wang wrote:
> > > > Xavier, thanks for the KIP! The proposed names make sense to me.
> > > >
> > > > Guozhang
> > > >
> > > > On Mon, Jun 22, 2020 at 4:24 PM Xavier Léauté 
> wrote:
> > > >
> > > >> Please check the list for updated config / argument names.
> > > >>
> > > >> I also added a proposal to replace the term "blackout" with
> "backoff",
> > > >> which is used internally in the replication protocol.
> > > >>
> > > >> On Mon, Jun 22, 2020 at 3:10 PM Xavier Léauté 
> wrote:
> > > >>
> > > >>> I agree we could improve on some of the config names. My thinking
> here is
> > > >>> that unless we had some precedent for a different name, it seemed
> > > >>> relatively straightforward to follow the approach other open source
> > > >>> projects have taken. It also makes migration for users easy if we
> are
> > > >>> consistent in the renaming, so we should find terms we can use
> across the
> > > >>> board.
> > > >>>
> > > >>> A cursory search indicates we already use include/exclude for topic
> > > >>> creation config in Connect, so I think it makes sense to align on
> that.
> > > >>> I'll update the KIP accordingly.
> > > >>>
> > > >>> On Sat, Jun 20, 2020 at 11:37 AM Ryanne Dolan <
> ryannedo...@gmail.com>
> > > >>> wrote:
> > > >>>
> > >  Xavier, I'm dismayed to see some of these instances are my fault.
> Fully
> > >  support your plan.
> > > 
> > >  John, I had the same thought -- "list" is extraneous here. In the
> case
> > > >> of
> > >  "topics.whitelist" we already have precedent to just use "topics".
> > > 
> > >  Ryanne
> > > 
> > >  On Sat, Jun 20, 2020, 12:43 PM John Roesler 
> > > >> wrote:
> > > 
> > > > Thanks Xavier!
> > > >
> > > > I’m +1 on this idea, and I’m glad this is the extent of what
> needs to
> > > >>> be
> > > > changed. I recall when I joined the project being pleased at the
> lack
> > > >>> of
> > > > common offensive terminology. I hadn’t considered
> > > >> whitelist/blacklist,
> > >  but
> > > > I can see the argument.
> > > >
> > > > Allowlist/blocklist are kind of a mouthful, though.
> > > >
> > > > What do you think of just “allow” and “deny” instead? This is
> common
> > > > terminology in ACLs for example, and it doesn’t really seem
> necessary
> > > >>> to
> > > > say “list” in the config name.
> > > >
> > > > Alternatively, looking at the actual configs, it seems like
> > > >> “include”,
> > > > “include-only” (or “only”) and “exclude” might be more
> appropriate in
> > > > context.
> > > >
> > > > I hope this doesn’t kick off a round of bikeshedding. I’m really
> fine
> > > > either way; I doubt it matters much. I just wanted to see if we
> can
> > > >>> name
> > > > these configs without making up new multi-syllable words.
> > > >
> > > > Thanks for bringing it up!
> > > > -John
> > > >
> > > > On Sat, Jun 20, 2020, at 09:31, Ron Dagostino wrote:
> > > >> Yes.  Thank you.
> > > >>
> > > >>> On Jun 20, 2020, at 12:20 AM, Gwen Shapira 
> > >  wrote:
> > > >>>
> > > >>> Thank you so much for this initiative. Small change, but it
> makes
> > > >>> our
> > > >>> community more inclusive.
> > > >>>
> > > >>> Gwen
> > > >>>
> > >  On Fri, Jun 19, 2020, 6:02 PM Xavier Léauté 
> > >  wrote:
> > > 
> > >  Hi Everyone,
> > > 
> > >  There are a number of places in our codebase that use racially
> > >  charged
> > >  terms. I am proposing we update them to use more neutral
> terms.
> > > 
> > >  The KIP lists the ones I have found and proposes alternatives.
> > > >> If
> > >  you
> > > > see
> > >  any I missed or did not consider, please reply and I'll add
> > > >> them.
> > > 
> 

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-06-22 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I've updated the KIP, applying Matthias' feedback regarding interface
hierarchy.

Also, following the last email, I think we can consider reverse operations
on KeyValue range as well, as the implementation supports lexicographic order.

I considered different naming between Key-based ranges and Time-based
ranges, to mitigate confusion when fetching key and time ranges as
WindowStore does:

Key-based ranges: reverseRange(), reverseAll()
Time-based ranges: backwardFetch()

Then, key-based changes apply to KeyValueStore, and time-based changes to
Window and Session stores.

Let me know if you have any questions.

Thanks,
Jorge.


On Tue, Jun 16, 2020 at 12:47 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hi everyone, sorry for the late reply.
>
> Thanks Matthias for your feedback. I think it makes sense to reconsider
> the current design based on your input.
>
> After digging deeper into the current implementation, I'd like to bring my
> current understanding to be double-checked as it might be redefining the
> KIP's scope:
>
> 1. There are 2 ranges been exposed by different stores:
>
> a. Key Range
> b. Timestamp Range
>
> So far, we have discussed covering both.
>
> 2. Key Range functions do not provide ordering guarantees by design:
>
> ```ReadOnlyKeyValueStore.java
> /**
>  * Get an iterator over a given range of keys. This iterator must be
> closed after use.
>  * The returned iterator must be safe from {@link
> java.util.ConcurrentModificationException}s
>  * and must not return null values. No ordering guarantees are
> provided.
>  * ...
>  */
>  KeyValueIterator<K, V> range(K from, K to);
> ```
>
> Therefore, I'd propose removing Key range operations from the scope.
>
> 3. Timestamp Range operations happen at the SegmentsStore level (internal)
> API
>
> AFAICT, Segments wrappers handle all Timestamp ranges queries.
>
> I'd propose extending `Segments#segments(timeFrom, timeTo, backwards)`
> with a flag for backwards operations.
>
> As segments returned will be processed backwards, I'm not extending
> KeyValueStores to query each segment backwards, per point 2 above.
>
> 4. Extend WindowStores implementations with a new
> WindowBackwardStore/ReadOnlyBackwardStore:
>
> ```java
> public interface ReadOnlyWindowBackwardStore<K, V> {
> WindowStoreIterator<V> backwardFetch(K key, Instant from, Instant to)
> throws IllegalArgumentException;
>
> KeyValueIterator<Windowed<K>, V> backwardFetch(K from, K to, Instant
> fromTime, Instant toTime)
> throws IllegalArgumentException;
>
> KeyValueIterator<Windowed<K>, V> backwardFetchAll(Instant from,
> Instant to) throws IllegalArgumentException;
> }
> ```
>
> 5. SessionStore is a bit different as it has fetch/find sessions spread
> between SessionStore and ReadOnlySessionStore.
>
> I'd propose a new interface `SessionBackwardStore` to expose backward find
> operations:
>
> ```java
> public interface SessionBackwardStore<K, AGG> {
> KeyValueIterator<Windowed<K>, AGG> backwardFindSessions(final K key,
> final long earliestSessionEndTime, final long latestSessionStartTime);
>
> KeyValueIterator<Windowed<K>, AGG> backwardFindSessions(final K
> keyFrom, final K keyTo, final long earliestSessionEndTime, final long
> latestSessionStartTime);
> }
> ```
>
> If this understanding is correct I'll proceed to update the KIP based on
> this.
>
> Looking forward to your feedback.
>
> Thanks,
> Jorge.
>
> On Fri, May 29, 2020 at 3:32 AM Matthias J. Sax  wrote:
>
>> Hey,
>>
>> Sorry that I am late to the game. I am not 100% convinced about the
>> current proposal. Using a new config as feature flag seems to be rather
>> "nasty" to me, and flipping from/to is a little bit too fancy for my
>> personal taste.
>>
>> I agree, that the original proposal using a "ReadDirection" enum is not
>> ideal either.
>>
>> Thus, I would like to put out a new idea: We could add a new interface
>> that offers new methods that return reverse iterators.
>>
>> The KIP already proposes to add `reverseAll()` and it seems backward
>> incompatible to just add this method to `ReadOnlyKeyValueStore` and
>> `ReadOnlyWindowStore`. I don't think we could provide a useful default
>> implementation for custom stores and thus either break compatibility or
>> need add a default that just throws an exception. Neither seems to be a
>> good option.
>>
>> Using a new interface avoid this issue and allows users implementing
>> custom stores to opt-in by adding the interface to their stores.
>> Furthermore, we don't need any config. In the end, we encapsulate the
>> c

Re: [VOTE] KIP-626: Rename StreamsConfig config variable name

2020-06-16 Thread Jorge Esteban Quilcate Otoya
+1

Thanks Matthias.

On Tue, Jun 16, 2020 at 4:02 AM Matthias J. Sax  wrote:

> Hi,
>
> I found a small inconsistency in our public API and propose a small KIP
> to fix it. As the change is trivial, I skip the discussion and call
> directly for a VOTE.
>
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-626%3A+Rename+StreamsConfig+config+variable+name
>
>
> -Matthias
>
>


Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-06-15 Thread Jorge Esteban Quilcate Otoya
reverse iterator for KeyValue and Window store, should we
> do the same for Session store?
>
>
>
> This might be more code to write, but I believe it provides the better
> user experience. Thoughts?
>
>
>
> -Matthias
>
>
>
>
> On 5/26/20 8:47 PM, John Roesler wrote:
> > Sorry for my silence, Jorge,
> >
> > I've just taken another look, and I'm personally happy with the KIP.
> >
> > Thanks,
> > -John
> >
> > On Tue, May 26, 2020, at 16:17, Jorge Esteban Quilcate Otoya wrote:
> >> If no additional comments, I will proceed to start a vote thread.
> >>
> >> Thanks a lot for your feedback!
> >>
> >> On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate Otoya <
> >> quilcate.jo...@gmail.com> wrote:
> >>
> >>> Thanks Sophie. I like the `reverseAll()` idea.
> >>>
> >>> I updated the KIP with your feedback.
> >>>
> >>>
> >>>
> >>> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman <
> sop...@confluent.io>
> >>> wrote:
> >>>
> >>>> Hm, the case of `all()` does seem to present a dilemma in the case of
> >>>> variable-length keys.
> >>>>
> >>>> In the case of fixed-length keys, you can just compute the keys that
> >>>> correspond
> >>>> to the maximum and minimum serialized bytes, then perform a `range()`
> >>>> query
> >>>> instead of an `all()`. If your keys don't have a well-defined ordering
> >>>> such
> >>>> that
> >>>> you can't determine the MAX_KEY, then you probably don't care about
> the
> >>>> iterator order anyway.
> >>>>
> >>>>  But with variable-length keys, there is no MAX_KEY. If all your keys
> were
> >>>> just
> >>>> of the form 'a', 'aa', 'a', 'aaa' then in fact the only way to
> >>>> figure out the
> >>>> maximum key in the store is by using `all()` -- and without a reverse
> >>>> iterator, you're
> >>>> doomed to iterate through every single key just to answer that simple
> >>>> question.
> >>>>
> >>>> That said, I still think determining the iterator order based on the
> >>>> to/from bytes
> >>>> makes a lot of intuitive sense and gives the API a nice symmetry.
> What if
> >>>> we
> >>>> solved the `all()` problem by just giving `all()` a reverse form to
> >>>> complement it?
> >>>> Ie we would have `all()` and `reverseAll()`, or something to that
> effect.
> >>>>
> >>>> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate Otoya <
> >>>> quilcate.jo...@gmail.com> wrote:
> >>>>
> >>>>> Thanks John.
> >>>>>
> >>>>> Agree. I like the first approach as well, with StreamsConfig flag
> >>>> passing
> >>>>> by via ProcessorContext.
> >>>>>
> >>>>> Another positive effect with "reverse parameters" is that in the
> case of
> >>>>> `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide _which_
> pair
> >>>> to
> >>>>> flip, whether with `ReadDirection` enum it apply to both.
> >>>>>
> >>>>> The only issue I've found while reviewing the KIP is that `all()`
> won't
> >>>> fit
> >>>>> within this approach.
> >>>>>
> >>>>> We could remove it from the KIP and argue that for WindowStore,
> >>>>> `fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse
> order,
> >>>> and
> >>>>> for KeyValueStore, no ordering guarantees are provided.
> >>>>>
> >>>>> If there is consensus with these changes, I will go and update the
> KIP.
> >>>>>
> >>>>> On Thu, May 21, 2020 at 3:33 PM John Roesler 
> >>>> wrote:
> >>>>>
> >>>>>> Hi Jorge,
> >>>>>>
> >>>>>> Thanks for that idea. I agree, a feature flag would protect anyone
> >>>>>> who may be depending on the current behavior.
> >>>>>>
> >>>>>> It seems better to locate the feature flag in the initialization
> >>>> logic of
> >>>>>> the store, rather than have a method

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-26 Thread Jorge Esteban Quilcate Otoya
If no additional comments, I will proceed to start a vote thread.

Thanks a lot for your feedback!

On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks Sophie. I like the `reverseAll()` idea.
>
> I updated the KIP with your feedback.
>
>
>
> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman 
> wrote:
>
>> Hm, the case of `all()` does seem to present a dilemma in the case of
>> variable-length keys.
>>
>> In the case of fixed-length keys, you can just compute the keys that
>> correspond
>> to the maximum and minimum serialized bytes, then perform a `range()`
>> query
>> instead of an `all()`. If your keys don't have a well-defined ordering
>> such
>> that
>> you can't determine the MAX_KEY, then you probably don't care about the
>> iterator order anyway.
>>
>>  But with variable-length keys, there is no MAX_KEY. If all your keys were
>> just
>> of the form 'a', 'aa', 'a', 'aaa' then in fact the only way to
>> figure out the
>> maximum key in the store is by using `all()` -- and without a reverse
>> iterator, you're
>> doomed to iterate through every single key just to answer that simple
>> question.
>>
>> That said, I still think determining the iterator order based on the
>> to/from bytes
>> makes a lot of intuitive sense and gives the API a nice symmetry. What if
>> we
>> solved the `all()` problem by just giving `all()` a reverse form to
>> complement it?
>> Ie we would have `all()` and `reverseAll()`, or something to that effect.
>>
>> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate Otoya <
>> quilcate.jo...@gmail.com> wrote:
>>
>> > Thanks John.
>> >
> >> > Agree. I like the first approach as well, with the StreamsConfig flag
> >> > passed via ProcessorContext.
>> >
>> > Another positive effect with "reverse parameters" is that in the case of
>> > `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide _which_ pair
>> to
> >> > flip, whereas with the `ReadDirection` enum it applies to both.
>> >
>> > The only issue I've found while reviewing the KIP is that `all()` won't
>> fit
>> > within this approach.
>> >
>> > We could remove it from the KIP and argue that for WindowStore,
>> > `fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse order,
>> and
>> > for KeyValueStore, no ordering guarantees are provided.
>> >
> >> > If there is consensus with these changes, I will go and update the KIP.
>> >
>> > On Thu, May 21, 2020 at 3:33 PM John Roesler 
>> wrote:
>> >
>> > > Hi Jorge,
>> > >
>> > > Thanks for that idea. I agree, a feature flag would protect anyone
>> > > who may be depending on the current behavior.
>> > >
>> > > It seems better to locate the feature flag in the initialization
>> logic of
>> > > the store, rather than have a method on the "live" store that changes
>> > > its behavior on the fly.
>> > >
>> > > It seems like there are two options here, one is to add a new config:
>> > >
>> > > StreamsConfig.ENABLE_BACKWARDS_ITERATION =
>> > >   "enable.backwards.iteration"
>> > >
>> > > Or we can add a feature flag in Materialized, like
>> > >
>> > > Materialized.enableBackwardsIteration()
>> > >
>> > > I think I'd personally lean toward the config, for the following
>> reason.
>> > > The concern that Sophie raised is that someone's program may depend
>> > > on the existing contract of getting an empty iterator. We don't want
>> to
>> > > switch behavior when they aren't expecting it, so we provide them a
>> > > config to assert that they _are_ expecting the new behavior, which
>> > > means they take responsibility for updating their code to expect the
>> new
>> > > behavior.
>> > >
>> > > There doesn't seem to be a reason to offer a choice of behaviors on a
>> > > per-query, or per-store basis. We just want people to be not surprised
>> > > by this change in general.
>> > >
>> > > What do you think?
>> > > Thanks,
>> > > -John
>> > >
>> > > On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote:
>> > > > Thank you both for the great feedback.
>> > > >
>> > > > I like

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-22 Thread Jorge Esteban Quilcate Otoya
Thanks Sophie. I like the `reverseAll()` idea.

I updated the KIP with your feedback.



On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman 
wrote:

> Hm, the case of `all()` does seem to present a dilemma in the case of
> variable-length keys.
>
> In the case of fixed-length keys, you can just compute the keys that
> correspond
> to the maximum and minimum serialized bytes, then perform a `range()` query
> instead of an `all()`. If your keys don't have a well-defined ordering such
> that
> you can't determine the MAX_KEY, then you probably don't care about the
> iterator order anyway.
>
>  But with variable-length keys, there is no MAX_KEY. If all your keys were
> just
> of the form 'a', 'aa', 'a', 'aaa' then in fact the only way to
> figure out the
> maximum key in the store is by using `all()` -- and without a reverse
> iterator, you're
> doomed to iterate through every single key just to answer that simple
> question.
>
> That said, I still think determining the iterator order based on the
> to/from bytes
> makes a lot of intuitive sense and gives the API a nice symmetry. What if
> we
> solved the `all()` problem by just giving `all()` a reverse form to
> complement it?
> Ie we would have `all()` and `reverseAll()`, or something to that effect.
>
> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Thanks John.
> >
> > Agree. I like the first approach as well, with the StreamsConfig flag
> > passed via ProcessorContext.
> >
> > Another positive effect with "reverse parameters" is that in the case of
> > `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide _which_ pair
> to
> > flip, whereas with the `ReadDirection` enum it applies to both.
> >
> > The only issue I've found while reviewing the KIP is that `all()` won't
> fit
> > within this approach.
> >
> > We could remove it from the KIP and argue that for WindowStore,
> > `fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse order,
> and
> > for KeyValueStore, no ordering guarantees are provided.
> >
> > If there is consensus with these changes, I will go and update the KIP.
> >
> > On Thu, May 21, 2020 at 3:33 PM John Roesler 
> wrote:
> >
> > > Hi Jorge,
> > >
> > > Thanks for that idea. I agree, a feature flag would protect anyone
> > > who may be depending on the current behavior.
> > >
> > > It seems better to locate the feature flag in the initialization logic
> of
> > > the store, rather than have a method on the "live" store that changes
> > > its behavior on the fly.
> > >
> > > It seems like there are two options here, one is to add a new config:
> > >
> > > StreamsConfig.ENABLE_BACKWARDS_ITERATION =
> > >   "enable.backwards.iteration"
> > >
> > > Or we can add a feature flag in Materialized, like
> > >
> > > Materialized.enableBackwardsIteration()
> > >
> > > I think I'd personally lean toward the config, for the following
> reason.
> > > The concern that Sophie raised is that someone's program may depend
> > > on the existing contract of getting an empty iterator. We don't want to
> > > switch behavior when they aren't expecting it, so we provide them a
> > > config to assert that they _are_ expecting the new behavior, which
> > > means they take responsibility for updating their code to expect the
> new
> > > behavior.
> > >
> > > There doesn't seem to be a reason to offer a choice of behaviors on a
> > > per-query, or per-store basis. We just want people to be not surprised
> > > by this change in general.
> > >
> > > What do you think?
> > > Thanks,
> > > -John
> > >
> > > On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote:
> > > > Thank you both for the great feedback.
> > > >
> > > > I like the "fancy" proposal :), and how it removes the need for
> > > > additional API methods. And with a feature flag on `StateStore`,
> > > > disabled by default, should no break current users.
> > > >
> > > > The only side-effect I can think of is that: by moving the flag
> > upwards,
> > > > all later operations become affected; which might be ok for most
> (all?)
> > > > cases. I can't think of an scenario where this would be an issue,
> just
> > > > want to point this out.
> > > >
> > > > If moving to this approach, I'd like to check 
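Sophie's point that variable-length keys admit no MAX_KEY can be illustrated
with unsigned bytewise lexicographic comparison — the ordering used by RocksDB's
default comparator. The helper names below are illustrative, not part of any
Kafka API:

```java
import java.util.Arrays;

public class NoMaxKeySketch {

    // Unsigned lexicographic comparison, as in RocksDB's default bytewise
    // comparator: a proper prefix sorts before any extension of it.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    }

    // For ANY candidate "maximum" key, appending a byte yields a strictly
    // larger key -- so no maximum exists among variable-length keys, and a
    // range(MIN_KEY, MAX_KEY) scan cannot replace all()/reverseAll().
    static byte[] strictlyLarger(byte[] candidate) {
        byte[] larger = Arrays.copyOf(candidate, candidate.length + 1);
        larger[candidate.length] = 0; // even appending 0x00 is enough
        return larger;
    }

    public static void main(String[] args) {
        byte[] candidate = new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(compare(candidate, strictlyLarger(candidate)) < 0);
    }
}
```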

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-21 Thread Jorge Esteban Quilcate Otoya
Thanks John.

Agree. I like the first approach as well, with the StreamsConfig flag passed
via ProcessorContext.

Another positive effect with "reverse parameters" is that in the case of
`fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide _which_ pair to
flip, whereas with the `ReadDirection` enum it applies to both.

The only issue I've found while reviewing the KIP is that `all()` won't fit
within this approach.

We could remove it from the KIP and argue that for WindowStore,
`fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse order, and
for KeyValueStore, no ordering guarantees are provided.

If there is consensus with these changes, I will go and update the KIP.

On Thu, May 21, 2020 at 3:33 PM John Roesler  wrote:

> Hi Jorge,
>
> Thanks for that idea. I agree, a feature flag would protect anyone
> who may be depending on the current behavior.
>
> It seems better to locate the feature flag in the initialization logic of
> the store, rather than have a method on the "live" store that changes
> its behavior on the fly.
>
> It seems like there are two options here, one is to add a new config:
>
> StreamsConfig.ENABLE_BACKWARDS_ITERATION =
>   "enable.backwards.iteration"
>
> Or we can add a feature flag in Materialized, like
>
> Materialized.enableBackwardsIteration()
>
> I think I'd personally lean toward the config, for the following reason.
> The concern that Sophie raised is that someone's program may depend
> on the existing contract of getting an empty iterator. We don't want to
> switch behavior when they aren't expecting it, so we provide them a
> config to assert that they _are_ expecting the new behavior, which
> means they take responsibility for updating their code to expect the new
> behavior.
>
> There doesn't seem to be a reason to offer a choice of behaviors on a
> per-query, or per-store basis. We just want people to be not surprised
> by this change in general.
>
> What do you think?
> Thanks,
> -John
>
> On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote:
> > Thank you both for the great feedback.
> >
> > I like the "fancy" proposal :), and how it removes the need for
> > additional API methods. And with a feature flag on `StateStore`,
> > disabled by default, should no break current users.
> >
> > The only side-effect I can think of is that: by moving the flag upwards,
> > all later operations become affected, which might be ok for most (all?)
> > cases. I can't think of a scenario where this would be an issue, just
> > want to point this out.
> >
> > If moving to this approach, I'd like to check if I got this right before
> > updating the KIP:
> >
> > - only `StateStore` will change by having a new method:
> > `backwardIteration()`, `false` by default to keep things compatible.
> > - then all `*Stores` will have to update their implementation based on
> > this flag.
> >
> >
> > On 20/05/2020 21:02, Sophie Blee-Goldman wrote:
> > >> There's no possibility that someone could be relying
> > >> on iterating over that range in increasing order, because that's not
> what
> > >> happens. However, they could indeed be relying on getting an empty
> > > iterator
> > >
> > > I just meant that they might be relying on the assumption that the
> range
> > > query
> > > will never return results with decreasing keys. The empty iterator
> wouldn't
> > > break that contract, but of course a surprise reverse iterator would.
> > >
> > > FWIW I actually am in favor of automatically converting to a reverse
> > > iterator,
> > > I just thought we should consider whether this should be off by
> default or
> > > even possible to disable at all.
> > >
> > > On Tue, May 19, 2020 at 7:42 PM John Roesler 
> wrote:
> > >
> > >> Thanks for the response, Sophie,
> > >>
> > >> I wholeheartedly agree we should take as much into account as possible
> > >> up front, rather than regretting our decisions later. I actually do
> share
> > >> your vague sense of worry, which was what led me to say initially
> that I
> > >> thought my counterproposal might be "too fancy". Sometimes, it's
> better
> > >> to be explicit instead of "elegant", if we think more people will be
> > >> confused
> > >> than not.
> > >>
> > >> I really don't think that there's any danger of "relying on a bug"
> here,
> > >> although
> > >> people certainly could be relying on current behavior. One thing to be
> > >> clear
> > >> about (which I just left a more detailed comment in KAFKA-8159 about)
> is
> > >> that
> > >> when we say something like key1 > key2, this ordering is defined by
> the
> > >> serde's output and nothing else.
> > >>
> > >> Currently, thanks to your fix in
> https://github.com/apache/kafka/pull/6521
> > >> ,
> > >> the store contract is that for range scans, if from > to, then the
> store
> > >> must
> > >> return an empty iterator. There's no possibility that someone could be
> > >> relying
> > >> on iterating over that range in increasing order, because that's not
> what
> > >> happens. However, they could 
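The "reverse parameters" idea under discussion — signalling direction by
flipping from/to instead of passing a ReadDirection enum — can be sketched over
an in-memory sorted map (an illustration only, not the Streams implementation):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ReverseParamsSketch {

    // range(from, to): ascending when from <= to, descending when from > to.
    static <K extends Comparable<K>, V> List<V> range(
            NavigableMap<K, V> store, K from, K to) {
        if (from.compareTo(to) <= 0) {
            return new ArrayList<>(store.subMap(from, true, to, true).values());
        }
        // Flipped bounds signal a backward scan over the same key range.
        return new ArrayList<>(
            store.subMap(to, true, from, true).descendingMap().values());
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> store = new TreeMap<>();
        store.put("a", 1);
        store.put("b", 2);
        store.put("c", 3);
        System.out.println(range(store, "a", "c")); // [1, 2, 3]
        System.out.println(range(store, "c", "a")); // [3, 2, 1]
    }
}
```

Note how `all()` has no place in this scheme: with no bounds to flip, it needs
its own `reverseAll()` counterpart, as proposed above.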

[jira] [Created] (KAFKA-9929) Support reverse iterator on WindowStore

2020-04-28 Thread Jorge Esteban Quilcate Otoya (Jira)
Jorge Esteban Quilcate Otoya created KAFKA-9929:
---

 Summary: Support reverse iterator on WindowStore
 Key: KAFKA-9929
 URL: https://issues.apache.org/jira/browse/KAFKA-9929
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Jorge Esteban Quilcate Otoya


Currently, WindowStore fetch operations return an iterator sorted from earliest 
to latest result:

```
* For each key, the iterator guarantees ordering of windows, starting from the
* oldest/earliest available window to the newest/latest window.
```

 

We have a use-case where traces are stored in a WindowStore and 
use Kafka Streams to create a materialized view of traces. A query request 
comes with a time range (e.g. now-1h, now) and wants to return the most recent 
results, i.e. fetch from this period of time, iterate and pattern match 
latest/most recent traces, and if enough results, then reply without moving 
further on the iterator.

The same store is used to search for previous traces. In this case, we search a
key over the last day; if traces are found, we would also like to iterate from
the most recent one.

RocksDB seems to support iterating backward and forward: 
[https://github.com/facebook/rocksdb/wiki/Iterator#iterating-upper-bound-and-lower-bound]

 

For reference: This in some way extracts some bits from this previous issue: 
https://issues.apache.org/jira/browse/KAFKA-4212:

 

> The {{RocksDBWindowsStore}}, a {{WindowsStore}}, can expire items via segment 
> dropping, but it stores multiple items per key, based on their timestamp. But 
> this store can be repurposed as a cache by fetching the items in reverse 
> chronological order and returning the first item found.

 

Would like to know if there is any impediment in RocksDB or WindowStore to 
supporting this.

Adding a direction argument to the current fetch methods would be great:

```

WindowStore.fetch(from,to,Direction.BACKWARD|FORWARD)

```
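As a concrete illustration, here is a minimal, self-contained sketch in plain Java. It uses a `TreeMap` as a stand-in for the store's time-sorted window space; the `Direction` enum and the `fetch` shape are assumptions mirroring the snippet above, not the actual Kafka Streams API:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ReverseFetchSketch {
    enum Direction { FORWARD, BACKWARD }

    // Stand-in for a windowed fetch over a time-sorted store: select the
    // [from, to] range, then iterate it forward or backward.
    static Iterable<Map.Entry<Long, String>> fetch(NavigableMap<Long, String> store,
                                                   long from, long to, Direction direction) {
        NavigableMap<Long, String> range = store.subMap(from, true, to, true);
        return direction == Direction.BACKWARD
                ? range.descendingMap().entrySet()
                : range.entrySet();
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> store = new TreeMap<>();
        store.put(10L, "trace-a");
        store.put(20L, "trace-b");
        store.put(30L, "trace-c");
        // A backward fetch yields the newest window first, so a caller can
        // stop early once it has collected enough recent results.
        for (Map.Entry<Long, String> e : fetch(store, 10L, 30L, Direction.BACKWARD)) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

With such an API, the query described above would fetch backward over (now-1h, now) and simply stop iterating once enough traces have matched.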



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Kafka JMX monitoring

2020-03-19 Thread Jorge Esteban Quilcate Otoya
Hi,

You could start by looking at Jolokia or the Prometheus JMX Exporter,
depending on your needs.

Jorge.

On Thu, Mar 19, 2020 at 7:54 AM 张祥  wrote:

> Hi,
>
> I want to know what the best practice to collect Kafka JMX metrics is. I
> haven't found a decent way to collect and parse JMX in Java (because it is
> too much) and I learn that there are tools like tools like jmxtrans to do
> this. I wonder if there is more. Thanks. Regards.
>


Re: [DISCUSS] KIP-515: Enable ZK client to use the new TLS supported authentication

2019-09-03 Thread Jorge Esteban Quilcate Otoya
Hi Pere,

Have you added your KIP to the list here
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals?

I found the KIP number already assigned to another KIP.



On Mon, Sep 2, 2019 at 2:23 PM Pere Urbón Bayes 
wrote:

> Thanks for your time Harsha,
>anyone else with comments? looking forward to hearing from you.
>
> Stupid question: when do you move from discussion to vote?
>
> Missatge de Harsha Chintalapani  del dia dv., 30 d’ag.
> 2019 a les 21:59:
>
> > Thanks Pere. KIP looks good to me.
> > -Harsha
> >
> >
> > On Fri, Aug 30, 2019 at 10:05 AM, Pere Urbón Bayes  >
> > wrote:
> >
> >> Not really,
> >>   my idea is to keep the JAAS parameter, so people don't see major
> >> changes. But if you pass a properties file, then this takes precedence
> over
> >> the other, with the idea that you can do sasl as well with the
> properties
> >> files.
> >>
> >> Makes sense?
> >>
> >> -- Pere
> >>
> >> Missatge de Harsha Chintalapani  del dia dv., 30 d’ag.
> >> 2019 a les 19:00:
> >>
> >>> Hi Pere,
> >>>   Thanks for the KIP. Enabling SSL for zookeeper for Kafka
> makes
> >>> sense.
> >>> "The changes are planned to be introduced in a compatible way, by
> >>> keeping the current JAAS variable precedence."
> >>> Can you elaborate a bit here. If the user configures a JAAS file with
> >>> Client section it will take precedence over zookeeper SSL configs?
> >>>
> >>> Thanks,
> >>> Harsha
> >>>
> >>>
> >>>
> >>> On Fri, Aug 30, 2019 at 7:50 AM, Pere Urbón Bayes <
> pere.ur...@gmail.com>
> >>> wrote:
> >>>
>  Hi,
>  quick question, I saw in another mail that 2.4 release is planned for
>  September. I think it would be really awesome to have this for this
>  release, do you think we can make it?
> 
>  -- Pere
> 
>  Missatge de Pere Urbón Bayes  del dia dj., 29
>  d’ag. 2019 a les 20:10:
> 
>  Hi,
>  this is my first KIP for a change in Apache Kafka, so I'm really need
>  to the process. Looking forward to hearing from you and learn the best
>  ropes here.
> 
>  I would like to propose this KIP-515 to enable the ZookeeperClients to
>  take full advantage of the TLS communication in the new Zookeeper
> 3.5.5.
>  Specially interesting it the Zookeeper Security Migration, that
> without
>  this change will not work with TLS, disabling users to use ACLs when
> the
>  Zookeeper cluster use TLS.
> 
>  link:
> 
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-515%3A+Enable+ZK+client+to+use+the+new+TLS+supported+authentication
> 
>  Looking forward to hearing from you on this,
> 
>  /cheers
> 
>  --
>  Pere Urbon-Bayes
>  Software Architect
>  http://www.purbon.com
>  https://twitter.com/purbon
>  https://www.linkedin.com/in/purbon/
> 
>  --
>  Pere Urbon-Bayes
>  Software Architect
>  http://www.purbon.com
>  https://twitter.com/purbon
>  https://www.linkedin.com/in/purbon/
> 
> >>>
> >>>
> >>
> >> --
> >> Pere Urbon-Bayes
> >> Software Architect
> >> http://www.purbon.com
> >> https://twitter.com/purbon
> >> https://www.linkedin.com/in/purbon/
> >>
> >
> >
>
> --
> Pere Urbon-Bayes
> Software Architect
> http://www.purbon.com
> https://twitter.com/purbon
> https://www.linkedin.com/in/purbon/
>


Re: [ANNOUNCE] New Committer: Bill Bejeck

2019-02-14 Thread Jorge Esteban Quilcate Otoya
Congratulations Bill!

On Thu, 14 Feb 2019, 09:29 Mickael Maison,  wrote:

> Congratulations Bill!
>
> On Thu, Feb 14, 2019 at 7:52 AM Gurudatt Kulkarni 
> wrote:
>
> > Congratulations Bill!
> >
> > On Thursday, February 14, 2019, Konstantine Karantasis <
> > konstant...@confluent.io> wrote:
> > > Congrats Bill!
> > >
> > > -Konstantine
> > >
> > > On Wed, Feb 13, 2019 at 8:42 PM Srinivas Reddy <
> > srinivas96all...@gmail.com
> > >
> > > wrote:
> > >
> > >> Congratulations Bill 
> > >>
> > >> Well deserved!!
> > >>
> > >> -
> > >> Srinivas
> > >>
> > >> - Typed on tiny keys. pls ignore typos.{mobile app}
> > >>
> > >> On Thu, 14 Feb, 2019, 11:21 Ismael Juma  > >>
> > >> > Congratulations Bill!
> > >> >
> > >> > On Wed, Feb 13, 2019, 5:03 PM Guozhang Wang  > wrote:
> > >> >
> > >> > > Hello all,
> > >> > >
> > >> > > The PMC of Apache Kafka is happy to announce that we've added Bill
> > >> Bejeck
> > >> > > as our newest project committer.
> > >> > >
> > >> > > Bill has been active in the Kafka community since 2015. He has
> made
> > >> > > significant contributions to the Kafka Streams project with more
> > than
> > >> 100
> > >> > > PRs and 4 authored KIPs, including the streams topology
> optimization
> > >> > > framework. Bill's also very keen on tightening Kafka's unit test /
> > >> system
> > >> > > tests coverage, which is a great value to our project codebase.
> > >> > >
> > >> > > In addition, Bill has been very active in evangelizing Kafka for
> > stream
> > >> > > processing in the community. He has given several Kafka meetup
> talks
> > in
> > >> > the
> > >> > > past year, including a presentation at Kafka Summit SF. He's also
> > >> > authored
> > >> > > a book about Kafka Streams (
> > >> > > https://www.manning.com/books/kafka-streams-in-action), as well
> as
> > >> > various
> > >> > > of posts in public venues like DZone as well as his personal blog
> (
> > >> > > http://codingjunkie.net/).
> > >> > >
> > >> > > We really appreciate the contributions and are looking forward to
> > see
> > >> > more
> > >> > > from him. Congratulations, Bill !
> > >> > >
> > >> > >
> > >> > > Guozhang, on behalf of the Apache Kafka PMC
> > >> > >
> > >> >
> > >>
> > >
> >
>


Re: [ANNOUNCE] New committer: Colin McCabe

2018-09-26 Thread Jorge Esteban Quilcate Otoya
Congratulations Colin!!

On Wed, Sep 26, 2018 at 5:54 AM, Dongjin Lee ()
wrote:

> Congratulations!!
>
> Best,
> Dongjin
>
> On Wed, Sep 26, 2018 at 11:56 AM Satish Duggana 
> wrote:
>
> > Congratulations Colin!
> >
> >
> >
> > On Wed, Sep 26, 2018 at 5:52 AM, Vahid Hashemian
> >  wrote:
> > > Congratulations Colin!
> > >
> > > Regards.
> > > --Vahid
> > >
> > > On Tue, Sep 25, 2018 at 3:43 PM Colin McCabe 
> wrote:
> > >
> > >> Thanks, everyone!
> > >>
> > >> best,
> > >> Colin
> > >>
> > >>
> > >> On Tue, Sep 25, 2018, at 15:26, Robert Barrett wrote:
> > >> > Congratulations Colin!
> > >> >
> > >> > On Tue, Sep 25, 2018 at 1:51 PM Matthias J. Sax <
> > matth...@confluent.io>
> > >> > wrote:
> > >> >
> > >> > > Congrats Colin! The was over due for some time :)
> > >> > >
> > >> > > -Matthias
> > >> > >
> > >> > > On 9/25/18 1:51 AM, Edoardo Comar wrote:
> > >> > > > Congratulations Colin !
> > >> > > > --
> > >> > > >
> > >> > > > Edoardo Comar
> > >> > > >
> > >> > > > IBM Event Streams
> > >> > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > From:   Ismael Juma 
> > >> > > > To: Kafka Users , dev <
> > >> dev@kafka.apache.org>
> > >> > > > Date:   25/09/2018 09:40
> > >> > > > Subject:[ANNOUNCE] New committer: Colin McCabe
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > Hi all,
> > >> > > >
> > >> > > > The PMC for Apache Kafka has invited Colin McCabe as a committer
> > and
> > >> we
> > >> > > > are
> > >> > > > pleased to announce that he has accepted!
> > >> > > >
> > >> > > > Colin has contributed 101 commits and 8 KIPs including
> significant
> > >> > > > improvements to replication, clients, code quality and testing.
> A
> > few
> > >> > > > highlights were KIP-97 (Improved Clients Compatibility Policy),
> > >> KIP-117
> > >> > > > (AdminClient), KIP-227 (Incremental FetchRequests to Increase
> > >> Partition
> > >> > > > Scalability), the introduction of findBugs and adding Trogdor
> > (fault
> > >> > > > injection and benchmarking tool).
> > >> > > >
> > >> > > > In addition, Colin has reviewed 38 pull requests and
> participated
> > in
> > >> more
> > >> > > > than 50 KIP discussions.
> > >> > > >
> > >> > > > Thank you for your contributions Colin! Looking forward to many
> > >> more. :)
> > >> > > >
> > >> > > > Ismael, for the Apache Kafka PMC
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > Unless stated otherwise above:
> > >> > > > IBM United Kingdom Limited - Registered in England and Wales
> with
> > >> number
> > >> > > > 741598.
> > >> > > > Registered office: PO Box 41, North Harbour, Portsmouth,
> Hampshire
> > >> PO6
> > >> > > 3AU
> > >> > > >
> > >> > >
> > >> > >
> > >>
> >
>
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


Re: Accessing Topology Builder

2018-09-26 Thread Jorge Esteban Quilcate Otoya
Good to know, thanks Matthias!

You've mentioned the previous operator, but what about
`peek().mapValues().peek()`? Will both `peek`s be on the same thread as
well?

On Tue, Sep 25, 2018 at 11:14 PM, Matthias J. Sax ()
wrote:

> Just for clarification:
>
> `peek()` would run on the same thread as the previous operator. Even
> if---strictly speaking---there is no public contract to guarantee this,
> it would be the case in the current implementation, and I also don't see
> any reason why this would change at any point in the future, because
> it's the most efficient implementation I can think of.
>
> -Matthias
>
> On 9/22/18 4:51 AM, Jorge Esteban Quilcate Otoya wrote:
> > Thanks, everyone!
> >
> > @Bill, the main issue with using `KStraem#peek()` is that AFAIK each
> `peek`
> > processor runs on a potentially different thread, then passing the trace
> > between them could be challenging. It will also require users to add
> these
> > operators themselves, which could be too cumbersome to use.
> >
> > @Guozhang and @John: I will first focus on creating the
> > `TracingProcessorSupplier` for instrumenting custom `Processors` and I
> will
> > keep the idea of a `ProcessorInterceptor` in the back of my head to see
> if
> > it make sense to propose a KIP for this.
> >
> > Thanks again for your feedback!
> >
> > Cheers,
> > Jorge.
> > El mié., 19 sept. 2018 a las 1:55, Bill Bejeck ()
> > escribió:
> >
> >> Jorge:
> >>
> >> I have a crazy idea off the top of my head.
> >>
> >> Would something as low-tech using KSteam.peek calls on either side of
> >> certain processors to record start and end times work?
> >>
> >> Thanks,
> >> Bill
> >>
> >> On Tue, Sep 18, 2018 at 4:38 PM Guozhang Wang 
> wrote:
> >>
> >>> Jorge:
> >>>
> >>> My suggestion was to let your users to implement on the
> >>> TracingProcessorSupplier
> >>> / TracingProcessor directly instead of the base-line ProcessorSupplier
> /
> >>> Processor. Would that work for you?
> >>>
> >>>
> >>> Guozhang
> >>>
> >>>
> >>> On Tue, Sep 18, 2018 at 8:02 AM, Jorge Esteban Quilcate Otoya <
> >>> quilcate.jo...@gmail.com> wrote:
> >>>
> >>>> Thanks Guozhang and John.
> >>>>
> >>>> @Guozhang:
> >>>>
> >>>>> I'd suggest to provide a
> >>>>> WrapperProcessorSupplier for the users than modifying
> >>>>> InternalStreamsTopology: more specifically, you can provide an
> >>>>> `abstract WrapperProcessorSupplier
> >>>>> implements ProcessorSupplier` and then let users to instantiate this
> >>>> class
> >>>>> instead of the "bare-metal" interface. WDYT?
> >>>>
> >>>> Yes, in the gist, I have a class implementing `ProcessorSupplier`:
> >>>>
> >>>> ```
> >>>> public class TracingProcessorSupplier<K, V> implements
> >>>> ProcessorSupplier<K, V> {
> >>>>   final KafkaTracing kafkaTracing;
> >>>>   final String name;
> >>>>   final ProcessorSupplier<K, V> delegate;
> >>>>   public TracingProcessorSupplier(KafkaTracing kafkaTracing,
> >>>>       String name, ProcessorSupplier<K, V> delegate) {
> >>>>     this.kafkaTracing = kafkaTracing;
> >>>>     this.name = name;
> >>>>     this.delegate = delegate;
> >>>>   }
> >>>>   @Override public Processor<K, V> get() {
> >>>>     return new TracingProcessor<>(kafkaTracing, name, delegate.get());
> >>>>   }
> >>>> }
> >>>> ```
> >>>>
> >>>> My challenge is how to wrap Topology Processors created by
> >>>> `StreamsBuilder#build` to make this instrumentation easy to adopt by
> >>> Kafka
> >>>> Streams users.
> >>>>
> >>>> @John:
> >>>>
> >>>>> The diff you posted only contains the library-side changes, and it's
> >>> not
> >>>>> obvious how you would use this to insert the desired tracing code.
> >>>>> Perhaps you could provide a snippet demonstrating how you want to use
> >>>> this
> >>>>> change to enable tracing?

Re: Accessing Topology Builder

2018-09-22 Thread Jorge Esteban Quilcate Otoya
Thanks, everyone!

@Bill, the main issue with using `KStream#peek()` is that AFAIK each `peek`
processor runs on a potentially different thread, so passing the trace
between them could be challenging. It would also require users to add these
operators themselves, which could be too cumbersome to use.

@Guozhang and @John: I will first focus on creating the
`TracingProcessorSupplier` for instrumenting custom `Processor`s, and I will
keep the idea of a `ProcessorInterceptor` in the back of my head to see if
it makes sense to propose a KIP for this.

Thanks again for your feedback!

Cheers,
Jorge.
On Wed, Sep 19, 2018 at 1:55 AM, Bill Bejeck ()
wrote:

> Jorge:
>
> I have a crazy idea off the top of my head.
>
> Would something as low-tech using KSteam.peek calls on either side of
> certain processors to record start and end times work?
>
> Thanks,
> Bill
>
> On Tue, Sep 18, 2018 at 4:38 PM Guozhang Wang  wrote:
>
> > Jorge:
> >
> > My suggestion was to let your users to implement on the
> > TracingProcessorSupplier
> > / TracingProcessor directly instead of the base-line ProcessorSupplier /
> > Processor. Would that work for you?
> >
> >
> > Guozhang
> >
> >
> > On Tue, Sep 18, 2018 at 8:02 AM, Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Thanks Guozhang and John.
> > >
> > > @Guozhang:
> > >
> > > > I'd suggest to provide a
> > > > WrapperProcessorSupplier for the users than modifying
> > > > InternalStreamsTopology: more specifically, you can provide an
> > > > `abstract WrapperProcessorSupplier
> > > > implements ProcessorSupplier` and then let users to instantiate this
> > > class
> > > > instead of the "bare-metal" interface. WDYT?
> > >
> > > Yes, in the gist, I have a class implementing `ProcessorSupplier`:
> > >
> > > ```
> > > public class TracingProcessorSupplier<K, V> implements
> > > ProcessorSupplier<K, V> {
> > >   final KafkaTracing kafkaTracing;
> > >   final String name;
> > >   final ProcessorSupplier<K, V> delegate;
> > >   public TracingProcessorSupplier(KafkaTracing kafkaTracing,
> > >       String name, ProcessorSupplier<K, V> delegate) {
> > >     this.kafkaTracing = kafkaTracing;
> > >     this.name = name;
> > >     this.delegate = delegate;
> > >   }
> > >   @Override public Processor<K, V> get() {
> > >     return new TracingProcessor<>(kafkaTracing, name, delegate.get());
> > >   }
> > > }
> > > ```
> > >
> > > My challenge is how to wrap Topology Processors created by
> > > `StreamsBuilder#build` to make this instrumentation easy to adopt by
> > Kafka
> > > Streams users.
> > >
> > > @John:
> > >
> > > > The diff you posted only contains the library-side changes, and it's
> > not
> > > > obvious how you would use this to insert the desired tracing code.
> > > > Perhaps you could provide a snippet demonstrating how you want to use
> > > this
> > > > change to enable tracing?
> > >
> > > My first approach was something like this:
> > >
> > > ```
> > > final StreamsBuilder builder = kafkaStreamsTracing.builder();
> > > ```
> > >
> > > Where `KafkaStreamsTracing#builder` looks like this:
> > >
> > > ```
> > >   public StreamsBuilder builder() {
> > > return new StreamsBuilder(new Topology(new
> > > TracingInternalTopologyBuilder(kafkaTracing)));
> > >   }
> > > ```
> > >
> > > Then, once the builder creates a topology, `processors` will be wrapped
> > by
> > > `TracingProcessorSupplier` described above.
> > >
> > > Probably this approach is too naive but works as an initial proof of
> > > concept.
> > >
> > > > Off the top of my head, here are some other approaches you might
> > > evaluate:
> > > > * you mentioned interceptors. Perhaps we could create a
> > > > ProcessorInterceptor interface and add a config to set it.
> > >
> > > This sounds very interesting to me. Then we won't need to touch
> internal
> > > API's, and just provide some configs. One challenge here is how to
> define
> > > the hooks. In consumer/producer, lifecycle is clear,
> > `onConsumer`/`onSend`
> > > and then `onCommit`/`onAck` methods. For 

Re: Accessing Topology Builder

2018-09-18 Thread Jorge Esteban Quilcate Otoya
> > can actually use the per-processor metrics which include latency sensors.
> >
> > If you do want to track, for a certain record, what's the latency of
> > processing it, then you'd probably need the processor implementation in
> > your repo. In this case, though, I'd suggest to provide a
> > WrapperProcessorSupplier for the users than modifying
> > InternalStreamsTopology: more specifically, you can provide an
> > `abstract WrapperProcessorSupplier
> > implements ProcessorSupplier` and then let users to instantiate this
> class
> > instead of the "bare-metal" interface. WDYT?
> >
> >
> > Guozhang
> >
> > On Sun, Sep 16, 2018 at 12:56 PM, Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Thanks for your answer, Matthias!
> > >
> > > What I'm looking for is something similar to interceptors, but for
> Stream
> > > Processors.
> > >
> > > In Zipkin -and probably other tracing implementations as well- we are
> > using
> > > Headers to propagate the context of a trace (i.e. adding metadata to
> the
> > > Kafka Record, so we can create references to a trace).
> > > Now that Headers are part of Kafka Streams Processor API, we can
> > propagate
> > > context from input (Consumers) to outputs (Producers) by using
> > > `KafkaClientSupplier` (e.g. <
> > > https://github.com/openzipkin/brave/blob/master/
> > > instrumentation/kafka-streams/src/main/java/brave/kafka/streams/
> > > TracingKafkaClientSupplier.java
> > > >).
> > >
> > > "Input to Output" traces could be enough for some use-cases, but we are
> > > looking for a more detailed trace -that could cover cases like
> > side-effects
> > > (e.g. for each processor), where input/output and processors latencies
> > can
> > > be recorded. This is why I have been looking for how to decorate the
> > > `ProcessorSupplier` and all the changes shown in the comparison. Here
> is
> > a
> > > gist of how we are planning to decorate the `addProcessor` method:
> > > https://github.com/openzipkin/brave/compare/master...jeqo:
> > > kafka-streams-topology#diff-8282914d84039affdf7c37251b905b44R7
> > >
> > > Hope this makes a bit more sense now :)
> > >
> > > El dom., 16 sept. 2018 a las 20:51, Matthias J. Sax (<
> > > matth...@confluent.io>)
> > > escribió:
> > >
> > > > >> I'm experimenting on how to add tracing to Kafka Streams.
> > > >
> > > > What do you mean by this exactly? Is there a JIRA? I am fine removing
> > > > `final` from `InternalTopologyBuilder#addProcessor()` -- it's an
> > > > internal class.
> > > >
> > > > However, the diff also shows
> > > >
> > > > > public Topology(final InternalTopologyBuilder
> > internalTopologyBuilder)
> > > {
> > > >
> > > > This has two impacts: first, it modifies `Topology` what is part of
> > > > public API and would require a KIP. Second, it exposes
> > > > `InternalTopologyBuilder` as part of the public API -- something we
> > > > should not do.
> > > >
> > > > I am also not sure, why you want to do this (btw: also public API
> > change
> > > > requiring a KIP). However, this should not be necessary.
> > > >
> > > > > public StreamsBuilder(final Topology topology)  {
> > > >
> > > >
> > > > I think I am lacking some context what you try to achieve. Maybe you
> > can
> > > > elaborate in the problem you try to solve?
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 9/15/18 10:31 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > > Hi everyone,
> > > > >
> > > > > I'm experimenting on how to add tracing to Kafka Streams.
> > > > >
> > > > > One option is to override and access
> > > > > `InternalTopologyBuilder#addProcessor`. Currently this method it is
> > > > final,
> > > > > and builder is not exposed as part of `StreamsBuilder`:
> > > > >
> > > > > ```
> > > > > public class StreamsBuilder {
> > > > >
> > > > > /** The actual topology that is constructed by this
> > StreamsBuilder.
> > > > */
> > > > > private final Topology topology = new 

Re: Accessing Topology Builder

2018-09-16 Thread Jorge Esteban Quilcate Otoya
Thanks for your answer, Matthias!

What I'm looking for is something similar to interceptors, but for Stream
Processors.

In Zipkin (and probably other tracing implementations as well) we are using
Headers to propagate the context of a trace (i.e. adding metadata to the
Kafka record, so we can create references to a trace).
Now that Headers are part of Kafka Streams Processor API, we can propagate
context from input (Consumers) to outputs (Producers) by using
`KafkaClientSupplier` (e.g. <
https://github.com/openzipkin/brave/blob/master/instrumentation/kafka-streams/src/main/java/brave/kafka/streams/TracingKafkaClientSupplier.java
>).

"Input to Output" traces could be enough for some use-cases, but we are
looking for a more detailed trace that could cover cases like side effects
(e.g. per processor), where input/output and processor latencies can
be recorded. This is why I have been looking at how to decorate the
`ProcessorSupplier`, with all the changes shown in the comparison. Here is a
gist of how we are planning to decorate the `addProcessor` method:
https://github.com/openzipkin/brave/compare/master...jeqo:kafka-streams-topology#diff-8282914d84039affdf7c37251b905b44R7

Hope this makes a bit more sense now :)

On Sun, Sep 16, 2018 at 8:51 PM, Matthias J. Sax ()
wrote:

> >> I'm experimenting on how to add tracing to Kafka Streams.
>
> What do you mean by this exactly? Is there a JIRA? I am fine removing
> `final` from `InternalTopologyBuilder#addProcessor()` -- it's an
> internal class.
>
> However, the diff also shows
>
> > public Topology(final InternalTopologyBuilder internalTopologyBuilder) {
>
> This has two impacts: first, it modifies `Topology` what is part of
> public API and would require a KIP. Second, it exposes
> `InternalTopologyBuilder` as part of the public API -- something we
> should not do.
>
> I am also not sure, why you want to do this (btw: also public API change
> requiring a KIP). However, this should not be necessary.
>
> > public StreamsBuilder(final Topology topology)  {
>
>
> I think I am lacking some context what you try to achieve. Maybe you can
> elaborate in the problem you try to solve?
>
>
> -Matthias
>
> On 9/15/18 10:31 AM, Jorge Esteban Quilcate Otoya wrote:
> > Hi everyone,
> >
> > I'm experimenting on how to add tracing to Kafka Streams.
> >
> > One option is to override and access
> > `InternalTopologyBuilder#addProcessor`. Currently this method it is
> final,
> > and builder is not exposed as part of `StreamsBuilder`:
> >
> > ```
> > public class StreamsBuilder {
> >
> > /** The actual topology that is constructed by this StreamsBuilder.
> */
> > private final Topology topology = new Topology();
> >
> > /** The topology's internal builder. */
> > final InternalTopologyBuilder internalTopologyBuilder =
> > topology.internalTopologyBuilder;
> >
> > private final InternalStreamsBuilder internalStreamsBuilder = new
> > InternalStreamsBuilder(internalTopologyBuilder);
> > ```
> >
> > The goal is that If `builder#addProcessor` is exposed, we could decorate
> > every `ProcessorSupplier` and capture traces from it:
> >
> > ```
> > @Override
> >   public void addProcessor(String name, ProcessorSupplier supplier,
> > String... predecessorNames) {
> > super.addProcessor(name, new TracingProcessorSupplier(tracer, name,
> > supplier), predecessorNames);
> >   }
> > ```
> >
> > Would it make sense to propose this as a change:
> > https://github.com/apache/kafka/compare/trunk...jeqo:tracing-topology ?
> or
> > maybe there is a better way to do this?
> > TopologyWrapper does something similar:
> >
> https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/TopologyWrapper.java
> >
> > Thanks in advance for any help.
> >
> > Cheers,
> > Jorge.
> >
>
>


Accessing Topology Builder

2018-09-15 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I'm experimenting on how to add tracing to Kafka Streams.

One option is to override and access
`InternalTopologyBuilder#addProcessor`. Currently this method is final,
and the builder is not exposed as part of `StreamsBuilder`:

```
public class StreamsBuilder {

/** The actual topology that is constructed by this StreamsBuilder. */
private final Topology topology = new Topology();

/** The topology's internal builder. */
final InternalTopologyBuilder internalTopologyBuilder =
topology.internalTopologyBuilder;

private final InternalStreamsBuilder internalStreamsBuilder = new
InternalStreamsBuilder(internalTopologyBuilder);
```

The goal is that if `builder#addProcessor` is exposed, we could decorate
every `ProcessorSupplier` and capture traces from it:

```
@Override
  public void addProcessor(String name, ProcessorSupplier supplier,
String... predecessorNames) {
super.addProcessor(name, new TracingProcessorSupplier(tracer, name,
supplier), predecessorNames);
  }
```

Would it make sense to propose this as a change:
https://github.com/apache/kafka/compare/trunk...jeqo:tracing-topology ? Or
is there a better way to do this?
TopologyWrapper does something similar:
https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/TopologyWrapper.java

Thanks in advance for any help.

Cheers,
Jorge.
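For readers following along, the decorator idea above can be sketched in a self-contained way with plain Java. The `Processor`/`ProcessorSupplier` interfaces below are minimal stand-ins for the Kafka Streams types, and the timing printout stands in for the real tracing calls:

```java
public class TracingDecoratorSketch {
    // Minimal stand-ins for the Kafka Streams Processor API types.
    interface Processor<K, V> { void process(K key, V value); }
    interface ProcessorSupplier<K, V> { Processor<K, V> get(); }

    // Wraps a delegate supplier so that every processor it creates is
    // decorated with start/finish instrumentation around process().
    static <K, V> ProcessorSupplier<K, V> tracing(String name,
                                                  ProcessorSupplier<K, V> delegate) {
        return () -> {
            Processor<K, V> inner = delegate.get();
            return (key, value) -> {
                long start = System.nanoTime();   // span start
                try {
                    inner.process(key, value);
                } finally {                       // span finish
                    System.out.println(name + " took " + (System.nanoTime() - start) + "ns");
                }
            };
        };
    }

    public static void main(String[] args) {
        ProcessorSupplier<String, String> plain =
                () -> (k, v) -> System.out.println(k + "->" + v);
        Processor<String, String> traced = tracing("print", plain).get();
        traced.process("key", "value");
    }
}
```

The design point is that the user's supplier stays untouched; only the topology wiring (e.g. an `addProcessor` override, as in the email above) needs to swap in the decorated supplier.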


Re: [ANNOUNCE] New Kafka PMC member: Dong Lin

2018-09-15 Thread Jorge Esteban Quilcate Otoya
Congratulations!!

On Sat, Sep 15, 2018 at 3:18 PM, Dongjin Lee ()
wrote:

> Congratulations!
>
> Best,
> Dongjin
>
> On Sat, Sep 15, 2018 at 3:00 PM Colin McCabe  wrote:
>
> > Congratulations, Dong Lin!
> >
> > best,
> > Colin
> >
> > On Wed, Aug 22, 2018, at 05:26, Satish Duggana wrote:
> > > Congrats Dong Lin!
> > >
> > > On Wed, Aug 22, 2018 at 10:08 AM, Abhimanyu Nagrath <
> > > abhimanyunagr...@gmail.com> wrote:
> > >
> > > > Congratulations, Dong!
> > > >
> > > > On Wed, Aug 22, 2018 at 6:20 AM Dhruvil Shah 
> > wrote:
> > > >
> > > > > Congratulations, Dong!
> > > > >
> > > > > On Tue, Aug 21, 2018 at 4:38 PM Jason Gustafson <
> ja...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Congrats!
> > > > > >
> > > > > > On Tue, Aug 21, 2018 at 10:03 AM, Ray Chiang  >
> > > > wrote:
> > > > > >
> > > > > > > Congrats Dong!
> > > > > > >
> > > > > > > -Ray
> > > > > > >
> > > > > > >
> > > > > > > On 8/21/18 9:33 AM, Becket Qin wrote:
> > > > > > >
> > > > > > >> Congrats, Dong!
> > > > > > >>
> > > > > > >> On Aug 21, 2018, at 11:03 PM, Eno Thereska <
> > eno.there...@gmail.com>
> > > > > > >>> wrote:
> > > > > > >>>
> > > > > > >>> Congrats Dong!
> > > > > > >>>
> > > > > > >>> Eno
> > > > > > >>>
> > > > > > >>> On Tue, Aug 21, 2018 at 7:05 AM, Ted Yu  >
> > > > wrote:
> > > > > > >>>
> > > > > > >>> Congratulation Dong!
> > > > > > 
> > > > > >  On Tue, Aug 21, 2018 at 1:59 AM Viktor Somogyi-Vass <
> > > > > >  viktorsomo...@gmail.com>
> > > > > >  wrote:
> > > > > > 
> > > > > >  Congrats Dong! :)
> > > > > > >
> > > > > > > On Tue, Aug 21, 2018 at 10:09 AM James Cheng <
> > > > wushuja...@gmail.com
> > > > > >
> > > > > > >
> > > > > >  wrote:
> > > > > > 
> > > > > > > Congrats Dong!
> > > > > > >>
> > > > > > >> -James
> > > > > > >>
> > > > > > >> On Aug 20, 2018, at 3:54 AM, Ismael Juma <
> ism...@juma.me.uk
> > >
> > > > > wrote:
> > > > > > >>>
> > > > > > >>> Hi everyone,
> > > > > > >>>
> > > > > > >>> Dong Lin became a committer in March 2018. Since then, he
> > has
> > > > > > >>>
> > > > > > >> remained
> > > > > > 
> > > > > > > active in the community and contributed a number of
> patches,
> > > > > reviewed
> > > > > > >>> several pull requests and participated in numerous KIP
> > > > > > discussions. I
> > > > > > >>>
> > > > > > >> am
> > > > > > >
> > > > > > >> happy to announce that Dong is now a member of the
> > > > > > >>> Apache Kafka PM
> > > > > > >>>
> > > > > > >>> Congratulation Dong! Looking forward to your future
> > > > > contributions.
> > > > > > >>>
> > > > > > >>> Ismael, on behalf of the Apache Kafka PMC
> > > > > > >>>
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> >
>
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


Checking Connection with Kafka Broker from Client-side

2018-08-15 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I'm evaluating how to validate the connection to Kafka brokers in an
application that uses the Consumer API, by running a health check with
AdminClient.
Are there any authorization considerations or best practices I should take
into account? (I'm considering calling `describeCluster` as the health
check.)

Related issue: https://github.com/openzipkin/zipkin/issues/2098

Thanks in advance for your feedback.

Jorge.
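Not an answer to the authorization question, but on shaping such a check: a common pattern is to bound the admin call with a timeout and map any failure to "unhealthy". A self-contained sketch in plain Java, where a placeholder `Supplier<Boolean>` stands in for the real `AdminClient#describeCluster` call (which needs a live broker):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class HealthCheckSketch {
    // Runs the given check with a timeout; any exception, timeout, or
    // non-true result is reported as unhealthy.
    static boolean isHealthy(Supplier<Boolean> check, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Boolean> task = check::get;
            Future<Boolean> f = pool.submit(task);
            return Boolean.TRUE.equals(f.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (Exception e) {
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // With a real AdminClient, the supplier would call describeCluster()
        // and confirm the node list / cluster id comes back.
        System.out.println(isHealthy(() -> true, 500));
        System.out.println(isHealthy(() -> { throw new RuntimeException("broker down"); }, 500));
    }
}
```

Bounding the call this way keeps a slow or unreachable broker from stalling the application's health endpoint.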


Re: [VOTE] KIP-244: Add Record Header support to Kafka Streams

2018-05-21 Thread Jorge Esteban Quilcate Otoya
Hi,

I'd like to point out that:

org.apache.kafka.streams.test.ConsumerRecordFactory


has also been included as part of this KIP to support changes to test
cases, just in case there is additional feedback here.

Cheers,
Jorge.

On Mon, May 21, 2018 at 4:46 PM, Jorge Esteban Quilcate Otoya (<
quilcate.jo...@gmail.com>) wrote:

> Thanks for your votes and your feedback.
>
> This KIP has been approved with the following results:
>
> Binding +1s: 3 (Matthias, Damina, Guozhang)
> Non-biniding +1s: (Bill, Ted)
>
> Jorge.
>
> El mar., 15 may. 2018 a las 20:01, Bill Bejeck (<bbej...@gmail.com>)
> escribió:
>
>> Thanks for the KIP!
>>
>> +1
>>
>> -Bill
>>
>> On Tue, May 15, 2018 at 1:47 PM, Damian Guy <damian....@gmail.com> wrote:
>>
>> > Thanks. +1 (binding)
>> >
>> > On Tue, 15 May 2018 at 01:04 Jorge Esteban Quilcate Otoya <
>> > quilcate.jo...@gmail.com> wrote:
>> >
>> > > @Guozhang added. Thanks!
>> > >
>> > > > On Tue, May 15, 2018 at 5:50 AM, Matthias J. Sax (<
>> > matth...@confluent.io
>> > > >)
>> > > > wrote:
>> > >
>> > > > +1 (binding)
>> > > >
>> > > > Thanks a lot for the KIP!
>> > > >
>> > > > -Matthias
>> > > >
>> > > > On 5/14/18 10:17 AM, Guozhang Wang wrote:
>> > > > > +1 from me
>> > > > >
>> > > > > One more comment on the wiki: while reviewing the PR I realized
>> that
>> > > in `
>> > > > > MockProcessorContext.java
>> > > > > <
>> > > >
>> > > https://github.com/apache/kafka/pull/4955/files#diff-
>> > d5440e7338f775230019a86e6bcacccb
>> > > > >`
>> > > > > we are also adding one additional API plus modifying the existing
>> > > > > `setRecordMetadata` API. Since this class is part of the public
>> > > > test-utils
>> > > > > package we should claim it in the wiki as well.
>> > > > >
>> > > > >
>> > > > > Guozhang
>> > > > >
>> > > > > On Mon, May 14, 2018 at 8:43 AM, Ted Yu <yuzhih...@gmail.com>
>> wrote:
>> > > > >
>> > > > >> +1
>> > > > >>
>> > > > >> On Mon, May 14, 2018 at 8:31 AM, Jorge Esteban Quilcate Otoya <
>> > > > >> quilcate.jo...@gmail.com> wrote:
>> > > > >>
>> > > > >>> Hi everyone,
>> > > > >>>
>> > > > >>> I would like to start a vote on KIP-244: Add Record Header
>> support
>> > to
>> > > > >> Kafka
>> > > > >>> Streams
>> > > > >>>
>> > > > >>> KIP wiki page:
>> > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > > > >>> 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
>> > > > >>>
>> > > > >>> The discussion thread is here:
>> > > > >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.
>> > > > >>>
>> mbox/%3CCAC3UcJvrgcBfe6%3DiW6%2BuTWsLB%2B4CsHgRmDx9TvCzJQrWvfg7_w%
>> > > > >>> 40mail.gmail.com%3E
>> > > > >>>
>> > > > >>> Cheers,
>> > > > >>> Jorge.
>> > > > >>>
>> > > > >>
>> > > > >
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > >
>> >
>>
>


Re: [VOTE] KIP-244: Add Record Header support to Kafka Streams

2018-05-21 Thread Jorge Esteban Quilcate Otoya
Thanks for your votes and your feedback.

This KIP has been approved with the following results:

Binding +1s: 3 (Matthias, Damian, Guozhang)
Non-binding +1s: 2 (Bill, Ted)

Jorge.

On Tue, May 15, 2018 at 8:01 PM, Bill Bejeck (<bbej...@gmail.com>)
wrote:

> Thanks for the KIP!
>
> +1
>
> -Bill
>
> On Tue, May 15, 2018 at 1:47 PM, Damian Guy <damian@gmail.com> wrote:
>
> > Thanks. +1 (binding)
> >
> > On Tue, 15 May 2018 at 01:04 Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > @Guozhang added. Thanks!
> > >
> > > On Tue, May 15, 2018 at 5:50 AM, Matthias J. Sax (<
> > matth...@confluent.io
> > > >)
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Thanks a lot for the KIP!
> > > >
> > > > -Matthias
> > > >
> > > > On 5/14/18 10:17 AM, Guozhang Wang wrote:
> > > > > +1 from me
> > > > >
> > > > > One more comment on the wiki: while reviewing the PR I realized
> that
> > > in `
> > > > > MockProcessorContext.java
> > > > > <
> > > >
> > > https://github.com/apache/kafka/pull/4955/files#diff-
> > d5440e7338f775230019a86e6bcacccb
> > > > >`
> > > > > we are also adding one additional API plus modifying the existing
> > > > > `setRecordMetadata` API. Since this class is part of the public
> > > > test-utils
> > > > > package we should claim it in the wiki as well.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Mon, May 14, 2018 at 8:43 AM, Ted Yu <yuzhih...@gmail.com>
> wrote:
> > > > >
> > > > >> +1
> > > > >>
> > > > >> On Mon, May 14, 2018 at 8:31 AM, Jorge Esteban Quilcate Otoya <
> > > > >> quilcate.jo...@gmail.com> wrote:
> > > > >>
> > > > >>> Hi everyone,
> > > > >>>
> > > > >>> I would like to start a vote on KIP-244: Add Record Header
> support
> > to
> > > > >> Kafka
> > > > >>> Streams
> > > > >>>
> > > > >>> KIP wiki page:
> > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > >>> 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
> > > > >>>
> > > > >>> The discussion thread is here:
> > > > >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.
> > > > >>>
> mbox/%3CCAC3UcJvrgcBfe6%3DiW6%2BuTWsLB%2B4CsHgRmDx9TvCzJQrWvfg7_w%
> > > > >>> 40mail.gmail.com%3E
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Jorge.
> > > > >>>
> > > > >>
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Apache Kafka 2.0.0 Release Plan

2018-05-21 Thread Jorge Esteban Quilcate Otoya
Ok, thanks!

On Mon, 21 May 2018, 15:17 Rajini Sivaram, <rajinisiva...@gmail.com> wrote:

> Jorge,
>
> KIP-244 has binding votes from Guozhang, Matthias and Damian. So you can
> close the vote and move the KIP to Accepted state.
>
>
> On Mon, May 21, 2018 at 2:13 PM, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > I think there is one missing vote for KIP-244 too. The pull request is
> > almost ready to be merged as well.
> >
> > On Mon, 21 May 2018, 11:48 Rajini Sivaram, <rajinisiva...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > This is a reminder that KIP freeze for 2.0.0 release is tomorrow. KIPs
> > that
> > > are not yet in voting stage will be postponed to the next release. We
> > have
> > > several KIPs that need more votes to be accepted. Please participate in
> > > reviews and votes to enable these to be added to the release (or
> > postponed
> > > if required). We need more committers to review and vote since binding
> > > votes are needed by tomorrow to include these in the release, but
> reviews
> > > and votes from everyone in the community are welcome.
> > >
> > > KIPs that require one more binding vote:
> > >
> > >    - KIP-235: https://cwiki.apache.org/confluence/display/KAFKA/KIP-235%3A+Add+DNS+alias+support+for+secured+connection
> > >    - KIP-255: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75968876
> > >    - KIP-277: https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API
> > >    - KIP-285: https://cwiki.apache.org/confluence/display/KAFKA/KIP-285%3A+Connect+Rest+Extension+Plugin
> > >    - KIP-297: https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations
> > >
> > > KIPs that require two more binding votes:
> > >
> > >    - KIP-248: https://cwiki.apache.org/confluence/display/KAFKA/KIP-248+-+Create+New+ConfigCommand+That+Uses+The+New+AdminClient
> > >    - KIP-290: https://cwiki.apache.org/confluence/display/KAFKA/KIP-290%3A+Support+for+wildcard+suffixed+ACLs
> > >    - KIP-298: https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect
> > >
> > > KIPs that require three binding votes
> > >
> > >
> > >    - KIP-275: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75977607
> > >
> > >
> > > Thanks,
> > >
> > > Rajini
> > >
> > > On Tue, May 15, 2018 at 11:23 AM, Rajini Sivaram <
> > rajinisiva...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > This is just a reminder that KIP freeze for 2.0.0 release is in a
> > week's
> > > > time and we still have a lot of KIPs in progress. KIPs that are
> > currently
> > > > being discussed should start the voting process soon to get voting
> > > complete
> > > > by 22nd of May. Please participate in discussions and votes to enable
> > > > these to be added to the release (or postponed if required).
> > > >
> > > > Voting is in progress for the following KIPs:
> > > >
> > > >
> > > >- KIP-235: htt

Re: [DISCUSS] Apache Kafka 2.0.0 Release Plan

2018-05-21 Thread Jorge Esteban Quilcate Otoya
I think there is one missing vote for KIP-244 too. The pull request is
almost ready to be merged as well.

On Mon, 21 May 2018, 11:48 Rajini Sivaram,  wrote:

> Hi all,
>
> This is a reminder that KIP freeze for 2.0.0 release is tomorrow. KIPs that
> are not yet in voting stage will be postponed to the next release. We have
> several KIPs that need more votes to be accepted. Please participate in
> reviews and votes to enable these to be added to the release (or postponed
> if required). We need more committers to review and vote since binding
> votes are needed by tomorrow to include these in the release, but reviews
> and votes from everyone in the community are welcome.
>
> KIPs that require one more binding vote:
>
>    - KIP-235: https://cwiki.apache.org/confluence/display/KAFKA/KIP-235%3A+Add+DNS+alias+support+for+secured+connection
>    - KIP-255: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75968876
>    - KIP-277: https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API
>    - KIP-285: https://cwiki.apache.org/confluence/display/KAFKA/KIP-285%3A+Connect+Rest+Extension+Plugin
>    - KIP-297: https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations
>
> KIPs that require two more binding votes:
>
>    - KIP-248: https://cwiki.apache.org/confluence/display/KAFKA/KIP-248+-+Create+New+ConfigCommand+That+Uses+The+New+AdminClient
>    - KIP-290: https://cwiki.apache.org/confluence/display/KAFKA/KIP-290%3A+Support+for+wildcard+suffixed+ACLs
>    - KIP-298: https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect
>
> KIPs that require three binding votes
>
>
>    - KIP-275: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75977607
>
>
> Thanks,
>
> Rajini
>
> On Tue, May 15, 2018 at 11:23 AM, Rajini Sivaram 
> wrote:
>
> > Hi all,
> >
> > This is just a reminder that KIP freeze for 2.0.0 release is in a week's
> > time and we still have a lot of KIPs in progress. KIPs that are currently
> > being discussed should start the voting process soon to get voting
> complete
> > by 22nd of May. Please participate in discussions and votes to enable
> > these to be added to the release (or postponed if required).
> >
> > Voting is in progress for the following KIPs:
> >
> >
> >    - KIP-235: https://cwiki.apache.org/confluence/display/KAFKA/KIP-235%3A+Add+DNS+alias+support+for+secured+connection
> >    - KIP-244: https://cwiki.apache.org/confluence/display/KAFKA/KIP-244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
> >    - KIP-248: https://cwiki.apache.org/confluence/display/KAFKA/KIP-248+-+Create+New+ConfigCommand+That+Uses+The+New+AdminClient
> >    - KIP-255: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75968876
> >    - KIP-275: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75977607
> >    - KIP-277: https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API
> >    - KIP-278: https://cwiki.apache.org/confluence/display/KAFKA/KIP-278+-+Add+version+option+to+Kafka%27s+commands
> >    - KIP-282: https://cwiki.apache.org/confluence/display/KAFKA/KIP-282%3A+Add+the+listener+name+to+the+authentication+context

Re: [VOTE] KIP-244: Add Record Header support to Kafka Streams

2018-05-15 Thread Jorge Esteban Quilcate Otoya
@Guozhang added. Thanks!

On Tue, May 15, 2018 at 5:50 AM, Matthias J. Sax (<matth...@confluent.io>)
wrote:

> +1 (binding)
>
> Thanks a lot for the KIP!
>
> -Matthias
>
> On 5/14/18 10:17 AM, Guozhang Wang wrote:
> > +1 from me
> >
> > One more comment on the wiki: while reviewing the PR I realized that in `
> > MockProcessorContext.java
> > <
> https://github.com/apache/kafka/pull/4955/files#diff-d5440e7338f775230019a86e6bcacccb
> >`
> > we are also adding one additional API plus modifying the existing
> > `setRecordMetadata` API. Since this class is part of the public
> test-utils
> > package we should claim it in the wiki as well.
> >
> >
> > Guozhang
> >
> > On Mon, May 14, 2018 at 8:43 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> >> +1
> >>
> >> On Mon, May 14, 2018 at 8:31 AM, Jorge Esteban Quilcate Otoya <
> >> quilcate.jo...@gmail.com> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> I would like to start a vote on KIP-244: Add Record Header support to
> >> Kafka
> >>> Streams
> >>>
> >>> KIP wiki page:
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>> 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
> >>>
> >>> The discussion thread is here:
> >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.
> >>> mbox/%3CCAC3UcJvrgcBfe6%3DiW6%2BuTWsLB%2B4CsHgRmDx9TvCzJQrWvfg7_w%
> >>> 40mail.gmail.com%3E
> >>>
> >>> Cheers,
> >>> Jorge.
> >>>
> >>
> >
> >
> >
>
>


Re: KIP-244: Add Record Header support to Kafka Streams

2018-05-14 Thread Jorge Esteban Quilcate Otoya
Yes, I've one already created: https://github.com/apache/kafka/pull/4955

On Mon, 14 May 2018, 17:55 Guozhang Wang, <wangg...@gmail.com> wrote:

> Thanks Jorge, that sounds good to me.
>
> Also please feel free to send out the PR for reviews while the KIP is being
> voted on.
>
>
> Guozhang
>
>
> On Mon, May 14, 2018 at 8:31 AM, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Thanks for your feedback everyone!
> >
> > If there are no more comments on this KIP, I think we can open the VOTE
> > thread.
> >
> > Cheers,
> > Jorge.
> >
> > On Sat, May 12, 2018 at 2:02 AM, Guozhang Wang (<wangg...@gmail.com>)
> > wrote:
> >
> > > Yeah I'm only talking about the DSL part (i.e. how stateful / stateless
> > > operators default inheritance protocol would be promised) to be managed
> > > with KIP-159.
> > >
> > > For allowing users to override the default behavior in PAPI, that would
> > be
> > > in a different KIP.
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Fri, May 11, 2018 at 10:41 AM, Matthias J. Sax <
> matth...@confluent.io
> > >
> > > wrote:
> > >
> > > > I am actually not sure about this. Because it's about the semantics
> at
> > > > PAPI level, but KIP-159 targets the DSL, it might actually be better
> to
> > > > have a separate KIP?
> > > >
> > > > -Matthias
> > > >
> > > > On 5/11/18 9:26 AM, Guozhang Wang wrote:
> > > > > That's a good question. I think we can manage this in KIP-159. I
> will
> > > go
> > > > > ahead and try to augment that KIP together with the original author
> > > > Jeyhun.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Fri, May 11, 2018 at 12:45 AM, Jorge Esteban Quilcate Otoya <
> > > > > quilcate.jo...@gmail.com> wrote:
> > > > >
> > > > >> Thanks Guozhang and Matthias! I do also agree with this way of
> > > handling
> > > > >> headers inheritance. I will add them to the KIP doc.
> > > > >>
> > > > >>> We can discuss about extending the current protocol and how to
> > enable
> > > > >> users
> > > > >>> override those rule, and how to expose them in the DSL layer in a
> > > > future
> > > > >>> KIP.
> > > > >>
> > > > >> About this, should this be managed on KIP-159 or a new one?
> > > > >>
> > > > >> On Thu, May 10, 2018 at 5:46 PM, Matthias J. Sax (<
> > > > matth...@confluent.io
> > > > >>> )
> > > > >> wrote:
> > > > >>
> > > > >>> Thanks Guozhang! Sounds good to me!
> > > > >>>
> > > > >>> -Matthias
> > > > >>>
> > > > >>> On 5/10/18 7:55 AM, Guozhang Wang wrote:
> > > > >>>> Thanks for your thoughts Matthias. I think if we do want to
> bring
> > > > >> KIP-244
> > > > >>>> into 2.0 then we need to keep its scope small and well defined.
> > For
> > > > >> that
> > > > >>>> I'm proposing:
> > > > >>>>
> > > > >>>> 1. Make the inheritance implementation of headers consistent
> with
> > > what
> > > > >> we
> > > > >>>> had with other record context fields. I.e. pass through the
> record
> > > > >>> context
> > > > >>>> in `context.forward()`. Note that within a processor node, users
> > can
> > > > >>>> already manipulate the Headers with the given APIs, so at the
> time
> > > of
> > > > >>>> forwarding, the library can just copy what-ever is left /
> updated
> > to
> > > > >> the
> > > > >>>> next processor node.
> > > > >>>>
> > > > >>>> 2. In the sink node, where a record is being sent to the Kafka
> > > topic,
> > > > >> we
> > > > >>>> should consider the following:
> > > > >>>>
> > > > >>>> a. For sink topics, we will set the headers into the producer
> > > record.
> > &

[VOTE] KIP-244: Add Record Header support to Kafka Streams

2018-05-14 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I would like to start a vote on KIP-244: Add Record Header support to Kafka
Streams

KIP wiki page:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API

The discussion thread is here:
http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.mbox/%3CCAC3UcJvrgcBfe6%3DiW6%2BuTWsLB%2B4CsHgRmDx9TvCzJQrWvfg7_w%40mail.gmail.com%3E

Cheers,
Jorge.
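For readers skimming the archive, here is a minimal sketch of what this KIP enables at the Processor API level, based on the `ProcessorContext#headers()` accessor the KIP describes. The header key, value, and class names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

import org.apache.kafka.common.header.Headers;
import org.apache.kafka.streams.processor.AbstractProcessor;

// Hypothetical processor that tags every record with an audit header.
public class AuditHeaderProcessor extends AbstractProcessor<String, String> {

    @Override
    public void process(String key, String value) {
        // Headers of the record currently being processed (exposed by KIP-244).
        Headers headers = context().headers();
        // Illustrative header only; mutations are visible downstream because the
        // record context travels with forward(), per the inheritance rules
        // discussed in this thread.
        headers.add("processed-by", "audit-app".getBytes(StandardCharsets.UTF_8));
        context().forward(key, value);
    }
}
```

Per the discussion above, headers forwarded this way are written to sink and repartition topics but dropped for changelog topics.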


Re: KIP-244: Add Record Header support to Kafka Streams

2018-05-14 Thread Jorge Esteban Quilcate Otoya
Thanks for your feedback everyone!

If there are no more comments on this KIP, I think we can open the VOTE
thread.

Cheers,
Jorge.

On Sat, May 12, 2018 at 2:02 AM, Guozhang Wang (<wangg...@gmail.com>)
wrote:

> Yeah I'm only talking about the DSL part (i.e. how stateful / stateless
> operators default inheritance protocol would be promised) to be managed
> with KIP-159.
>
> For allowing users to override the default behavior in PAPI, that would be
> in a different KIP.
>
>
> Guozhang
>
>
> On Fri, May 11, 2018 at 10:41 AM, Matthias J. Sax <matth...@confluent.io>
> wrote:
>
> > I am actually not sure about this. Because it's about the semantics at
> > PAPI level, but KIP-159 targets the DSL, it might actually be better to
> > have a separate KIP?
> >
> > -Matthias
> >
> > On 5/11/18 9:26 AM, Guozhang Wang wrote:
> > > That's a good question. I think we can manage this in KIP-159. I will
> go
> > > ahead and try to augment that KIP together with the original author
> > Jeyhun.
> > >
> > >
> > > Guozhang
> > >
> > > On Fri, May 11, 2018 at 12:45 AM, Jorge Esteban Quilcate Otoya <
> > > quilcate.jo...@gmail.com> wrote:
> > >
> > >> Thanks Guozhang and Matthias! I do also agree with this way of
> handling
> > >> headers inheritance. I will add them to the KIP doc.
> > >>
> > >>> We can discuss about extending the current protocol and how to enable
> > >> users
> > >>> override those rule, and how to expose them in the DSL layer in a
> > future
> > >>> KIP.
> > >>
> > >> About this, should this be managed on KIP-159 or a new one?
> > >>
> > >> On Thu, May 10, 2018 at 5:46 PM, Matthias J. Sax (<
> > matth...@confluent.io
> > >>> )
> > >> wrote:
> > >>
> > >>> Thanks Guozhang! Sounds good to me!
> > >>>
> > >>> -Matthias
> > >>>
> > >>> On 5/10/18 7:55 AM, Guozhang Wang wrote:
> > >>>> Thanks for your thoughts Matthias. I think if we do want to bring
> > >> KIP-244
> > >>>> into 2.0 then we need to keep its scope small and well defined. For
> > >> that
> > >>>> I'm proposing:
> > >>>>
> > >>>> 1. Make the inheritance implementation of headers consistent with
> what
> > >> we
> > >>>> had with other record context fields. I.e. pass through the record
> > >>> context
> > >>>> in `context.forward()`. Note that within a processor node, users can
> > >>>> already manipulate the Headers with the given APIs, so at the time
> of
> > >>>> forwarding, the library can just copy what-ever is left / updated to
> > >> the
> > >>>> next processor node.
> > >>>>
> > >>>> 2. In the sink node, where a record is being sent to the Kafka
> topic,
> > >> we
> > >>>> should consider the following:
> > >>>>
> > >>>> a. For sink topics, we will set the headers into the producer
> record.
> > >>>> b. For repartition topics, we will the headers into the producer
> > >> record.
> > >>>> c. For changelog topics, we will drop the headers in the produce
> > record
> > >>>> since they will not be used in restoration and not stored in the
> state
> > >>>> store either.
> > >>>>
> > >>>>
> > >>>> We can discuss about extending the current protocol and how to
> enable
> > >>> users
> > >>>> override those rule, and how to expose them in the DSL layer in a
> > >> future
> > >>>> KIP.
> > >>>>
> > >>>>
> > >>>>
> > >>>> Guozhang
> > >>>>
> > >>>>
> > >>>> On Mon, May 7, 2018 at 5:49 PM, Matthias J. Sax <
> > matth...@confluent.io
> > >>>
> > >>>> wrote:
> > >>>>
> > >>>>> Guozhang,
> > >>>>>
> > >>>>> if you advocate to forward headers by default, it might be a better
> > >>>>> default strategy do forward the headers for all operators (similar
> to
> > >>>>> topic/partition/offset metadata). It's usually harder for users to
> 

Re: KIP-244: Add Record Header support to Kafka Streams

2018-05-11 Thread Jorge Esteban Quilcate Otoya
t; -Matthias
> >>>>
> >>>>
> >>>> On 5/6/18 8:53 PM, Guozhang Wang wrote:
> >>>>> Matthias, thanks for sharing your opinions in the inheritance
> protocol
> >> of
> >>>>> the record context. I'm thinking maybe we should make this discussion
> >> as
> >>>> a
> >>>>> separate KIP by itself? If yes, then KIP-244's scope would be
> smaller,
> >>>> and
> >>>>> within KIP-244 we can have a simple inheritance rule that setting it
> to
> >>>>> null when 1) going through stateful operators and 2) sending to any
> >>>> topics.
> >>>>>
> >>>>>
> >>>>> Guozhang
> >>>>>
> >>>>> On Sun, May 6, 2018 at 10:24 AM, Matthias J. Sax <
> >> matth...@confluent.io>
> >>>>> wrote:
> >>>>>
> >>>>>> Making the inheritance protocol a public contract seems reasonable
> to
> >>>> me.
> >>>>>>
> >>>>>> In the current implementation, all output records inherits the
> offset,
> >>>>>> timestamp, topic, and partition metadata from the input record. We
> >>>>>> already added an API to change the timestamp explicitly for the
> output
> >>>>>> record thought.
> >>>>>>
> >>>>>> I think it make sense to keep the inheritance of offset, topic, and
> >>>>>> partition. For headers, it's worth to discuss. I see arguments for
> two
> >>>>>> strategies: (1) inherit by default, (2) set `null` by default.
> >>>>>> Independent of the default behavior, we should add an API to set
> >> headers
> >>>>>> for output records explicitly though (similar to the "set timestamp
> >>>> API").
> >>>>>>
> >>>>>> From my point of view, timestamp/headers are a different
> >>>>>> "class/category" of data/metadata than topic/partition/offset. For
> the
> >>>>>> first category, it makes sense to manipulate them and it's more than
> >>>>>> "plain metadata"; especially the timestamp. For the second category
> it
> >>>>>> does not make sense to manipulate it, and to me
> topic/partition/offset
> >>>>>> is pure metadata only---strictly speaking, it's even questionable if
> >>>>>> output records should have any value for topic/partition/offset in
> the
> >>>>>> first place, or if they should be `null`, because those attributes
> do
> >>>>>> only make sense for source records that are consumed from a topic
> >>>>>> directly only. On the other hand, if we make this difference
> explicit,
> >>>>>> it might be useful information for the use to track the current
> >>>>>> topic/partition/offset of the original source record.
> >>>>>>
> >>>>>> Furthermore, to me, timestamps and headers are somewhat different,
> >> too.
> >>>>>> For stream processing it's required that every record has a
> timestamp;
> >>>>>> thus, it make sense to inherit the input record timestamp by default
> >> (a
> >>>>>> timestamp is not really metadata but actually equally important to
> key
> >>>>>> and value from my point of view). Header however are optional, and
> >> thus
> >>>>>> inheriting them is not really required. It might be convenient
> though:
> >>>>>> for example, imagine a simple "filter-only" application -- it would
> be
> >>>>>> cumbersome for users to explicitly copy the headers from the input
> >>>>>> records to the output records -- it seems to be unnecessary
> >> boilerplate
> >>>>>> code. On the other hand, for any other more complex use case, it's
> >>>>>> questionable to inherit headers---note, that headers would be
> written
> >> to
> >>>>>> the output topics increasing the size of the messages. Overall, I am
> >> not
> >>>>>> sure which default strategy might be the better one for headers. Is
> >>>>>> there a convincing argument for either one of them? I slightly tend
> to
> >>>>>> think that using `null` as default might 

Re: KIP 244 WIP

2018-05-04 Thread Jorge Esteban Quilcate Otoya
Hi Florian,

I have updated this KIP and restarted the discussion, reducing its scope to
the Processor API first; a follow-up KIP will define the behaviour of DSL
functions.

Feel free to review the current WIP and give feedback to confirm that the
KIP supports your use case.

Cheers,
Jorge.

On Sun, Apr 29, 2018 at 7:39 PM, Ted Yu () wrote:

> The KIP is still under discussion.
>
> Have you seen this thread ?
>
>
> http://search-hadoop.com/m/Kafka/uyzND1cGBCC19brXW?subj=Re+KIP+244+Add+Record+Header+support+to+Kafka+Streams
>
> FYI
>
> On Sun, Apr 29, 2018 at 10:29 AM, Florian Garcia <
> garcia.florian.pe...@gmail.com> wrote:
>
> > Hello,
> >
> > I am really interested about the "KIP-244
> >  : Add record
> > headers to Kafka Streams".
> > I saw that the previous implementation hasn't changed since December, so I
> > decided to start my own during this weekend. For now, only the
> > `.filter((key, value, header) -> ...)` is working (Tests are in
> > KStreamFilterTest.java). You can find the changes here:
> > https://github.com/apache/kafka/compare/trunk...ImFlog:streams-headers
> >
> > Can someone take a quick look at this and tell me if I'm heading toward
> the
> > right direction ?
> > Some things feel quite wrong, as I am duplicating a lot because of the
> > AbstractProcessor new `process` methods and the `forward` methods
> > from ProcessorContextImpl.java.
> >
> > If the filter is well implemented, I will continue with the rest.
> >
> > Thank you in advance
> > Florian Garcia
> >
>


Re: KIP-244: Add Record Header support to Kafka Streams

2018-05-03 Thread Jorge Esteban Quilcate Otoya
ntext change, and it's not represented in your PR.
> >
> > Also, despite the decreased scope in this KIP, I think it might be
> valuable
> > to define what will happen to headers once this change is implemented.
> For
> > example, I think a minimal groundwork-level change might be to make the
> API
> > changes, while promising to drop all headers from input records.
> >
> > A maximal groundwork change would be to forward the headers through all
> > operators in Streams. But I think there are some unresolved questions
> about
> > forwarding, like "what happens to the headers in a join?"
> >
> > There's of course some middle ground, but instinctively, I think I'd
> prefer
> > to have a clear definition that headers are currently *not* forwarded,
> > rather than having a complex list of operators that do or don't forward
> > them. Plus, I think it might be tricky to define this behavior while not
> > allowing the scope to return to that of your original proposal!
> >
> > Thanks again for the KIP,
> > -John
> >
> > On Wed, May 2, 2018 at 8:05 AM, Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Hi Matthias,
> > >
> > > I've created a new JIRA to track this, updated the KIP and created a PR.
> > >
> > > Looking forward to your feedback,
> > >
> > > Jorge.
> > >
> > > On Tue, Feb 13, 2018 at 10:43 PM, Matthias J. Sax (<
> > matth...@confluent.io
> > > >)
> > > wrote:
> > >
> > > > Hi Jorge,
> > > >
> > > > I would like to unblock this KIP to make some progress. The tricky
> > > > question of this work, seems to be how to expose headers at DSL
> level.
> > > > This related to KIP-149 and KIP-159. However, for Processor API, it
> > > > seems to be rather straight forward to add headers to the API.
> > > >
> > > > Thus, I would suggest to de-scope this KIP and add header support for
> > > > Processor API only as a first step. If this is done, we can see in a
> > > > second step, how to add headers at DSL level.
> > > >
> > > > WDYT about this proposal?
> > > >
> > > > If you agree, please update the JIRA and KIP accordingly. Note, that
> we
> > > > have two JIRA that are duplicates atm. We can scope them accordingly:
> > > > one for PAPI only, and second as a dependent JIRA for DSL.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 12/30/17 3:11 PM, Jorge Esteban Quilcate Otoya wrote:
> > > > > Thanks for your feedback!
> > > > >
> > > > > 1. I was adding headers to KeyValue to support groupBy, but I think
> > it
> > > is
> > > > > not necessary. It should be enough with mapping headers to
> key/value
> > > and
> > > > > then group using current KeyValue structure.
> > > > >
> > > > > 2. Yes. IMO key/value stores, like RocksDB, rely on KV as
> structure,
> > > > hence
> > > > > considering headers as part of stateful operations will not fit in
> > this
> > > > > approach and increase complexity (I cannot think in a use-case that
> > > need
> > > > > this).
> > > > >
> > > > > 3. and 4. Changes on 1. will solve this issue.
> > > > >
> > > > > Probably I rush a bit proposing this change, I was not aware of
> > KIP-159
> > > > or
> > > > > KAFKA-5632.
> > > > > If KIP-159 is adopted and we reduce this KIP to add Headers to
> > > > > RecordContext will be enough, but I'm not sure about the scope of
> > > > KIP-159.
> > > > > If it includes stateful operations will be difficult to implemented
> > as
> > > > > stated in 2.
> > > > >
> > > > > Cheers,
> > > > > Jorge.
> > > > >
> > > > > On Tue, Dec 26, 2017 at 8:04 PM, Matthias J. Sax (<
> > > > matth...@confluent.io>)
> > > > > wrote:
> > > > >
> > > > >> Thanks for the KIP Jorge,
> > > > >>
> > > > >> As Bill pointed out already, we should be careful with adding new
> > > > >> overloads as this contradicts the work done via KIP-182.
> > > > >>
> > > > >> This KIP also seems to be related to KIP-14

Re: KIP-244: Add Record Header support to Kafka Streams

2018-05-02 Thread Jorge Esteban Quilcate Otoya
Hi Matthias,

I've created a new JIRA to track this, updated the KIP, and created a PR.

Looking forward to your feedback,

Jorge.

On Tue, Feb 13, 2018 at 22:43, Matthias J. Sax (<matth...@confluent.io>)
wrote:

> Hi Jorge,
>
> I would like to unblock this KIP to make some progress. The tricky
> question of this work, seems to be how to expose headers at DSL level.
> This related to KIP-149 and KIP-159. However, for Processor API, it
> seems to be rather straight forward to add headers to the API.
>
> Thus, I would suggest to de-scope this KIP and add header support for
> Processor API only as a first step. If this is done, we can see in a
> second step, how to add headers at DSL level.
>
> WDYT about this proposal?
>
> If you agree, please update the JIRA and KIP accordingly. Note, that we
> have two JIRA that are duplicates atm. We can scope them accordingly:
> one for PAPI only, and second as a dependent JIRA for DSL.
>
>
> -Matthias
>
> On 12/30/17 3:11 PM, Jorge Esteban Quilcate Otoya wrote:
> > Thanks for your feedback!
> >
> > 1. I was adding headers to KeyValue to support groupBy, but I think it is
> > not necessary. It should be enough with mapping headers to key/value and
> > then group using current KeyValue structure.
> >
> > 2. Yes. IMO key/value stores, like RocksDB, rely on KV as structure,
> hence
> > considering headers as part of stateful operations will not fit in this
> > approach and increase complexity (I cannot think of a use case that needs
> > this).
> >
> > 3. and 4. Changes on 1. will solve this issue.
> >
> > I probably rushed a bit in proposing this change; I was not aware of
> > KIP-159 or KAFKA-5632.
> > If KIP-159 is adopted, reducing this KIP to adding Headers to
> > RecordContext will be enough, but I'm not sure about the scope of KIP-159.
> > If it includes stateful operations, it will be difficult to implement, as
> > stated in 2.
> >
> > Cheers,
> > Jorge.
> >
> > El mar., 26 dic. 2017 a las 20:04, Matthias J. Sax (<
> matth...@confluent.io>)
> > escribió:
> >
> >> Thanks for the KIP Jorge,
> >>
> >> As Bill pointed out already, we should be careful with adding new
> >> overloads as this contradicts the work done via KIP-182.
> >>
> >> This KIP also seems to be related to KIP-149 and KIP-159. Are you aware
> >> of them? Both have quite long DISCUSS threads, but it might be worth
> >> browsing through them.
> >>
> >> A few further questions:
> >>
> >>  - why do you want to add the headers to `KeyValue`? I am not sure if we
> >> should consider headers as optional metadata and add it to
> >> `RecordContext` similar to timestamp, offset, etc. only
> >
> >
> >>  - You only include stateless single-record transformations at the DSL
> >> level. Do you suggest that all other operator just drop headers on the
> >> floor?
> >>
> >>  - Why do you only want to put headers into in-memory and cache but not
> >> RocksDB store? What do you mean by "pass through"? IMHO, all stores
> >> should behave the same at DSL level.
> >>-> if we store the headers in the state stores, what is the upgrade
> >> path?
> >>
> >>  - Why do we need to store record header in state in the first place, if
> >> we exclude stateful operator at DSL level?
> >>
> >>
> >> What is the motivation for the "border lines" you choose?
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 12/21/17 8:18 AM, Bill Bejeck wrote:
> >>> Jorge,
> >>>
> >>> Thanks for the KIP, I know this is a feature others in the community
> have
> >>> been interested in getting into Kafka Streams.
> >>>
> >>> I took a quick pass over it, and I have one initial question.
> >>>
> >>> We recently reduced overloads with KIP-182, and in this KIP we are
> >>> increasing them again.
> >>>
> >>> I can see from the KIP why they are necessary, but I'm wondering if
> there
> >>> is something else we can do to cut down on the overloads introduced.  I
> >>> don't have any sound suggestions ATM, so I'll have to think about it
> some
> >>> more, but I wanted to put the thought out there.
> >>>
> >>> Thanks,
> >>> Bill
> >>>
> >>> On Thu, Dec 21, 2017 at 9:06 AM, Jorge Esteban Quilcate Otoya <
> >>> quilcate.jo...@gmail.com> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> I have created a KIP to add Record Headers support to Kafka Streams
> API:
> >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>> 244%3A+Add+Record+Header+support+to+Kafka+Streams
> >>>>
> >>>>
> >>>> The main goal is to be able to use headers to filter, map and process
> >>>> records as streams. Stateful processing (joins, windows) are not
> >>>> considered.
> >>>>
> >>>> Proposed changes/Draft:
> >>>> https://github.com/apache/kafka/compare/trunk...jeqo:streams-headers
> >>>>
> >>>> Feedback and suggestions are more than welcome.
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Jorge.
> >>>>
> >>>
> >>
> >>
> >
>
>

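The stateless, header-based record operations discussed in this thread (filtering and mapping on headers, with joins and windows out of scope) can be sketched in plain Java. The `Header` and `Record` classes below are simplified stand-ins invented for illustration, not the actual Streams API:

```java
import java.util.List;
import java.util.stream.Collectors;

// Simplified stand-ins for Kafka's header/record types; these names and
// shapes are assumptions for this sketch, not the real Streams API.
class Header {
    final String key;
    final byte[] value;
    Header(String key, byte[] value) { this.key = key; this.value = value; }
}

class Record {
    final String key;
    final String value;
    final List<Header> headers;
    Record(String key, String value, List<Header> headers) {
        this.key = key; this.value = value; this.headers = headers;
    }
    boolean hasHeader(String headerKey) {
        return headers.stream().anyMatch(h -> h.key.equals(headerKey));
    }
}

public class HeaderFilterSketch {
    // Stateless, single-record filtering on headers -- the use case the
    // KIP targets; no joins or windows involved.
    static List<Record> filterByHeader(List<Record> records, String headerKey) {
        return records.stream()
                .filter(r -> r.hasHeader(headerKey))
                .collect(Collectors.toList());
    }
}
```

Since only per-record header access is needed here, the same shape works whether headers are exposed on the record itself or through a `RecordContext`-style accessor, which is the distinction debated above.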

Re: [VOTE] Kafka 2.0.0 in June 2018

2018-04-19 Thread Jorge Esteban Quilcate Otoya
+1 (non binding), thanks Ismael!

On Thu, Apr 19, 2018 at 13:01, Manikumar ()
wrote:

> +1 (non-binding).
>
> Thanks.
>
> On Thu, Apr 19, 2018 at 3:07 PM, Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
>
> > +1 (non binding). Thanks Ismael!
> >
> > On Thu., 19 Apr. 2018, 2:47 pm Gwen Shapira,  wrote:
> >
> > > +1 (binding)
> > >
> > > On Wed, Apr 18, 2018 at 11:35 AM, Ismael Juma 
> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I started a discussion last year about bumping the version of the
> June
> > > 2018
> > > > release to 2.0.0[1]. To reiterate the reasons in the original post:
> > > >
> > > > 1. Adopt KIP-118 (Drop Support for Java 7), which requires a major
> > > version
> > > > bump due to semantic versioning.
> > > >
> > > > 2. Take the chance to remove deprecated code that was deprecated
> prior
> > to
> > > > 1.0.0, but not removed in 1.0.0 (e.g. old Scala clients) so that we
> can
> > > > move faster.
> > > >
> > > > One concern that was raised is that we still do not have a rolling
> > > upgrade
> > > > path for the old ZK-based consumers. Since the Scala clients haven't
> > been
> > > > updated in a long time (they don't support security or the latest
> > message
> > > > format), users who need them can continue to use 1.1.0 with no loss
> of
> > > > functionality.
> > > >
> > > > Since it's already mid-April and people seemed receptive during the
> > > > discussion last year, I'm going straight to a vote, but we can
> discuss
> > > more
> > > > if needed (of course).
> > > >
> > > > Ismael
> > > >
> > > > [1]
> > > > https://lists.apache.org/thread.html/dd9d3e31d7e9590c1f727ef5560c93
> > > > 3281bad0de3134469b7b3c4257@%3Cdev.kafka.apache.org%3E
> > > >
> > >
> > >
> > >
> > > --
> > > *Gwen Shapira*
> > > Product Manager | Confluent
> > > 650.450.2760 <(650)%20450-2760> | @gwenshap
> > > Follow us: Twitter  | blog
> > > 
> > >
> >
>


Re: [VOTE] KIP-171 - Extend Consumer Group Reset Offset for Stream Application

2018-02-26 Thread Jorge Esteban Quilcate Otoya
Hi all,

Thanks for the feedback.

I have updated the "Compatibility, Deprecation, and Migration Plan" section
to document this change and to support the rollback. I probably should have
handled this change, small as it looks, as a new KIP to avoid this issue.

I like Colin's idea about asking for confirmation, although I'm not sure
whether any other tool already has this behavior, and it could create more
confusion (e.g. why does this command ask for confirmation while others
don't?). Maybe we will need a broader look at the CLI tools to agree on this?

Jorge.

On Thu, Feb 22, 2018 at 21:09, Guozhang Wang (<wangg...@gmail.com>)
wrote:

> Yup, agreed.
>
> On Thu, Feb 22, 2018 at 11:46 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Hi Guozhang,
> >
> > To clarify my comment: any change with a backwards compatibility impact
> > should be mentioned in the "Compatibility, Deprecation, and Migration
> Plan"
> > section (in addition to the deprecation period and only happening in a
> > major release as you said).
> >
> > Ismael
> >
> > On Thu, Feb 22, 2018 at 11:10 AM, Guozhang Wang <wangg...@gmail.com>
> > wrote:
> >
> > > Just to clarify, the KIP itself has mentioned about the change so the
> PR
> > > was not un-intentional:
> > >
> > > "
> > >
> > > 3. Keep execution parameters uniform between both tools: It will
> execute
> > by
> > > default, and have a `dry-run` parameter just show the results. This
> will
> > > involve change current `ConsumerGroupCommand` to change execution
> > options.
> > >
> > > "
> > >
> > > We were agreed that the proposed change is better than the current
> > status,
> > > since may people not using "--execute" on consumer reset tool were
> > actually
> > > surprised that nothing gets executed. What we were concerning as a
> > > hind-sight is that instead of doing such change in a minor release like
> > > 1.1, we should consider only doing that in the next major release as it
> > > breaks compatibility. In the past when we are going to remove / replace
> > > certain option we would first add a going-to-be-deprecated warning in
> the
> > > previous releases until it was finally removed. So Jason's suggestion
> is
> > to
> > > do the same: we are not reverting this change forever, but trying to
> > delay
> > > it after 1.1.
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Thu, Feb 22, 2018 at 10:56 AM, Colin McCabe <cmcc...@apache.org>
> > wrote:
> > >
> > > > Perhaps, if the user doesn't pass the --execute flag, the tool should
> > > > print a prompt like "would you like to perform this reset?" and wait
> > for
> > > a
> > > > Y / N (or yes or no) input from the command-line.  Then, if the
> > --execute
> > > > flag is passed, we skip this.  That seems 99% compatible, and also
> > > > accomplishes the goal of making the tool less confusing.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Thu, Feb 22, 2018, at 10:23, Ismael Juma wrote:
> > > > > Yes, let's revert the incompatible changes. There was no mention of
> > > > > compatibility impact on the KIP and we should ensure that is the
> case
> > > for
> > > > > 1.1.0.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Thu, Feb 22, 2018 at 9:55 AM, Jason Gustafson <
> ja...@confluent.io
> > >
> > > > wrote:
> > > > >
> > > > > > I know it's a been a while since this vote passed, but I think we
> > > need
> > > > to
> > > > > > reconsider the incompatible changes to the consumer reset tool.
> > > > > > Specifically, we have removed the --execute option without
> > > deprecating
> > > > it
> > > > > > first, and we have changed the default behavior to execute rather
> > > than
> > > > do a
> > > > > > dry run. The latter in particular seems dangerous since users who
> > > were
> > > > > > previously using the default behavior to view offsets will now
> > > suddenly
> > > > > > find the offsets already committed. As far as I can tell, this
> > change
> > > > was
> > > > > > done mostly for cosmetic reasons. Without a compelli

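Colin's prompt-for-confirmation idea above can be sketched roughly as follows; the flag name and prompt wording are assumptions for illustration, not the actual tool's interface:

```java
import java.util.Arrays;
import java.util.Scanner;

public class ConfirmSketch {
    // If --execute was passed, skip the prompt entirely; otherwise ask for
    // an explicit yes/no answer and default to a dry run on anything else.
    static boolean shouldExecute(boolean executeFlag, Scanner in) {
        if (executeFlag) return true;
        System.out.print("Would you like to perform this reset? (y/n): ");
        String answer = in.hasNextLine() ? in.nextLine().trim().toLowerCase() : "n";
        return answer.equals("y") || answer.equals("yes");
    }

    public static void main(String[] args) {
        boolean execute = Arrays.asList(args).contains("--execute");
        boolean run = shouldExecute(execute, new Scanner(System.in));
        System.out.println(run ? "executing reset" : "dry run only");
    }
}
```

This keeps the old safety property (nothing is committed without an explicit "y") while staying close to fully compatible for scripts that already pass `--execute`.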
Re: [DISCUSS] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2018-02-06 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I have add some changes to the KIP based on the Pull Request:
https://github.com/apache/kafka/pull/4454#issuecomment-360553277 :

* Reduce the scope of the operations to Consumer Groups, to avoid the
complexity of making assignments generic for Consumer and Connect groups. If
Connect group operations are required, we can handle them in another KIP, or
add them here if needed.
* Consider adding `deleteConsumerGroups` API once
https://cwiki.apache.org/confluence/display/KAFKA/KIP-229%3A+DeleteGroups+API
is
merged.

Looking forward to your feedback.

If these changes look good we can keep the discussion on the PR.

Cheers,
Jorge.

On Sun, Jan 7, 2018 at 21:00, Jorge Esteban Quilcate Otoya (<
quilcate.jo...@gmail.com>) wrote:

> Great!
>
> I have added `listGroupOffsets` to the KIP.
>
> If there are no additional feedback, VOTE thread is already open.
>
> Cheers,
> Jorge
>
>
> El mar., 2 ene. 2018 a las 17:49, Gwen Shapira (<g...@confluent.io>)
> escribió:
>
>> listGroups and listGroupOffsets will make it a snap to transition the
>> existing ConsumerGroups CLI to depend on client libraries only.
>>
>> Thanks for adding them :)
>>
>> On Sun, Dec 31, 2017 at 1:39 PM Jorge Esteban Quilcate Otoya <
>> quilcate.jo...@gmail.com> wrote:
>>
>> > Thanks all for your feedback, and sorry for the late response.
>> >
>> > I'm considering the following:
>> >
>> > ```AdminClient.java
>> > public abstract ListGroupsResult listGroups(ListGroupsOptions
>> options);
>> >
>> > public ListGroupsResult listGroups() {
>> > return listGroups(new ListGroupsOptions());
>> > }
>> >
>> > public ListGroupsResult listConsumerGroups(ListGroupsOptions
>> options) {
>> > //filtering groups by ConsumerProtocol.PROTOCOL_TYPE
>> > }
>> >
>> > public ListGroupsResult listConsumerGroups() {
>> > return listConsumerGroups(new ListGroupsOptions());
>> > }
>> > ```
>> >
>> > About `describeConsumerGroups`, I'm considering renaming it to
>> > `describeGroups`, and renaming `ConsumerGroupDescription` to
>> > `GroupDescription` and `ConsumerDescription` to `MemberDescription`.
>> > Not sure we need a deserializer, we can access `DescribeGroupsResponse`
>> > members directly.
>> >
>> > As @dan says, I also think `listGroupOffsets` could be added to this
>> KIP to
>> > make it complete.
>> >
>> > I'm thinking about renaming this KIP to "Add Consumer Group operations
>> to
>> > Admin API".
>> >
>> > I'm updating the KIP accordingly.
>> >
>> > Cheers and happy 2018!
>> >
>> > Jorge.
>> >
>> > El mié., 13 dic. 2017 a las 19:06, Colin McCabe (<cmcc...@apache.org>)
>> > escribió:
>> >
>> > > On Tue, Dec 12, 2017, at 09:39, Jason Gustafson wrote:
>> > > > Hi Colin,
>> > > >
>> > > > They do share the same namespace. We have a "protocol type" field in
>> > the
>> > > > JoinGroup request to make sure that all members are of the same
>> kind.
>> > >
>> > > Hi Jason,
>> > >
>> > > Thanks.  That makes sense.
>> > >
>> > > > Very roughly what I was thinking is something like this. First we
>> > > introduce an
>> > > > interface for deserialization:
>> > > >
>> > > > interface GroupMetadataDeserializer<Metadata, Assignment> {
>> > > >   String protocolType();
>> > > >   Metadata desrializeMetadata(ByteBuffer);
>> > > >   Assignment deserializeAssignment(ByteBuffer);
>> > > > }
>> > > >
>> > > > Then we add some kind of generic container:
>> > > >
>> > > > class MemberMetadata<Metadata, Assignment> {
>> > > >   Metadata metadata;
>> > > >   Assignment assignment;
>> > > > }
>> > > >
>> > > > Then we have two APIs: one generic and one specific to consumer
>> groups:
>> > > >
>> > > > <M, A> Map<String, MemberMetadata<M,A>> describeGroup(String
>> groupId,
>> > > > GroupMetadataDeserializer<M, A> deserializer);
>> > > >
>> > > > Map<String, ConsumerGroupMetadata> describeConsumerGroup(String
>> > groupId);
>> > > >
>> > > > (This i

Re: [VOTE] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2018-02-06 Thread Jorge Esteban Quilcate Otoya
Thanks Matthias. I have also updated the version on the KIP main page.

Some changes came up on the Pull Request; I will comment on them in the
discussion thread.

Cheers,
Jorge.

On Fri, Feb 2, 2018 at 20:50, Matthias J. Sax (<matth...@confluent.io>)
wrote:

> Feature freeze for 1.1 passed already, thus, KIP-222 will not be part of
> 1.1 release.
>
> I updated the JIRA with target version 1.2.
>
> -Matthias
>
> On 2/1/18 3:57 PM, Jeff Widman wrote:
> > Don't forget to update the wiki page now that the vote has passed--it
> > currently says this KIP is "under discussion":
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-222+-+Add+Consumer+Group+operations+to+Admin+API
> >
> > Also, should the JIRA ticket be tagged with 1.1.0 (provided this is
> merged
> > by then)?
> >
> > On Mon, Jan 22, 2018 at 3:26 AM, Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> >> My bad, KIP is updated:
> >>
> >> ```
> >> public class MemberDescription {
> >> private final String consumerId;
> >> private final String clientId;
> >> private final String host;
> >> private final MemberAssignment assignment;
> >> }
> >> public class MemberAssignment {
> >> private final List assignment;
> >> }
> >> ```
> >>
> >> Cheers,
> >> Jorge.
> >>
> >> El lun., 22 ene. 2018 a las 6:46, Jun Rao (<j...@confluent.io>)
> escribió:
> >>
> >>> Hi, Jorge,
> >>>
> >>> For #3, I wasn't suggesting using the internal Assignment. We can just
> >>> introduce a new public type that wraps List. We can
> call
> >> it
> >>> sth like MemberAssignment to distinguish it from the internal one. This
> >>> makes extending the type in the future easier.
> >>>
> >>> Thanks,
> >>>
> >>> Jun
> >>>
> >>> On Sun, Jan 21, 2018 at 3:19 PM, Jorge Esteban Quilcate Otoya <
> >>> quilcate.jo...@gmail.com> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> Thanks all for your votes and approving this KIP :)
> >>>>
> >>>> @Jun Rao:
> >>>>
> >>>> 1. Yes, KIP is updated with MemberDescription.
> >>>> 2. Changed:
> >>>> ```
> >>>> public class ListGroupOffsetsResult {
> >>>> final KafkaFuture<Map<TopicPartition, OffsetAndMetadata>> future;
> >>>> ```
> >>>> 3. Not sure about this one as Assignment type is part of
> >>>> o.a.k.clients.consumer.internals. Will we be breaking encapsulation
> >> if we
> >>>> expose it as part of AdminClient?
> >>>> Currently is defined as:
> >>>> ```
> >>>> public class MemberDescription {
> >>>> private final String consumerId;
> >>>> private final String clientId;
> >>>> private final String host;
> >>>> private final List assignment;
> >>>> }
> >>>> ```
> >>>>
> >>>> BTW: I've created a PR with the work in progress:
> >>>> https://github.com/apache/kafka/pull/4454
> >>>>
> >>>> Cheers,
> >>>> Jorge.
> >>>>
> >>>> El vie., 19 ene. 2018 a las 23:52, Jun Rao (<j...@confluent.io>)
> >>> escribió:
> >>>>
> >>>>> Hi, Jorge,
> >>>>>
> >>>>> Thanks for the KIP. Looks good to me overall. A few comments below.
> >>>>>
> >>>>> 1. It seems that ConsumerDescription should be MemberDescription?
> >>>>>
> >>>>> 2. Each offset can have an optional metadata. So, in
> >>>>> ListGroupOffsetsResult, perhaps it's better to have
> >>>>> KafkaFuture<Map<TopicPartition, OffsetAndMetadata>>, where
> >>>>> OffsetAndMetadata contains an offset and a metadata of String.
> >>>>>
> >>>>> 3. As Jason mentioned in the discussion, it would be nice to extend
> >>> this
> >>>>> api to support general group management, instead of just the consumer
> >>>> group
> >>>>> in the future. For that, it might be better for MemberDescription to
> >>> have
> >>>>> assignment of type Assignment, 

Re: [VOTE] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2018-01-22 Thread Jorge Esteban Quilcate Otoya
My bad, KIP is updated:

```
public class MemberDescription {
private final String consumerId;
private final String clientId;
private final String host;
private final MemberAssignment assignment;
}
public class MemberAssignment {
private final List assignment;
}
```

Cheers,
Jorge.

On Mon, Jan 22, 2018 at 6:46, Jun Rao (<j...@confluent.io>) wrote:

> Hi, Jorge,
>
> For #3, I wasn't suggesting using the internal Assignment. We can just
> introduce a new public type that wraps List. We can call it
> sth like MemberAssignment to distinguish it from the internal one. This
> makes extending the type in the future easier.
>
> Thanks,
>
> Jun
>
> On Sun, Jan 21, 2018 at 3:19 PM, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Hi all,
> >
> > Thanks all for your votes and approving this KIP :)
> >
> > @Jun Rao:
> >
> > 1. Yes, KIP is updated with MemberDescription.
> > 2. Changed:
> > ```
> > public class ListGroupOffsetsResult {
> > final KafkaFuture<Map<TopicPartition, OffsetAndMetadata>> future;
> > ```
> > 3. Not sure about this one as Assignment type is part of
> > o.a.k.clients.consumer.internals. Will we be breaking encapsulation if we
> > expose it as part of AdminClient?
> > Currently is defined as:
> > ```
> > public class MemberDescription {
> > private final String consumerId;
> > private final String clientId;
> > private final String host;
> > private final List assignment;
> > }
> > ```
> >
> > BTW: I've created a PR with the work in progress:
> > https://github.com/apache/kafka/pull/4454
> >
> > Cheers,
> > Jorge.
> >
> > El vie., 19 ene. 2018 a las 23:52, Jun Rao (<j...@confluent.io>)
> escribió:
> >
> > > Hi, Jorge,
> > >
> > > Thanks for the KIP. Looks good to me overall. A few comments below.
> > >
> > > 1. It seems that ConsumerDescription should be MemberDescription?
> > >
> > > 2. Each offset can have an optional metadata. So, in
> > > ListGroupOffsetsResult, perhaps it's better to have
> > > KafkaFuture<Map<TopicPartition, OffsetAndMetadata>>, where
> > > OffsetAndMetadata contains an offset and a metadata of String.
> > >
> > > 3. As Jason mentioned in the discussion, it would be nice to extend
> this
> > > api to support general group management, instead of just the consumer
> > group
> > > in the future. For that, it might be better for MemberDescription to
> have
> > > assignment of type Assignment, which consists of a list of partitions.
> > > Then, in the future, we can add other fields to Assignment.
> > >
> > > Jun
> > >
> > >
> > > On Thu, Jan 18, 2018 at 9:45 AM, Mickael Maison <
> > mickael.mai...@gmail.com>
> > > wrote:
> > >
> > > > +1 (non binding), thanks
> > > >
> > > > On Thu, Jan 18, 2018 at 5:41 PM, Colin McCabe <cmcc...@apache.org>
> > > wrote:
> > > > > +1 (non-binding)
> > > > >
> > > > > Colin
> > > > >
> > > > >
> > > > > On Thu, Jan 18, 2018, at 07:36, Ted Yu wrote:
> > > > >> +1
> > > > >>  Original message From: Bill Bejeck <
> > > bbej...@gmail.com>
> > > > >> Date: 1/18/18  6:59 AM  (GMT-08:00) To: dev@kafka.apache.org
> > Subject:
> > > > >> Re: [VOTE] KIP-222 - Add "describe consumer group" to
> > KafkaAdminClient
> > > > >> Thanks for the KIP
> > > > >>
> > > > >> +1
> > > > >>
> > > > >> Bill
> > > > >>
> > > > >> On Thu, Jan 18, 2018 at 4:24 AM, Rajini Sivaram <
> > > > rajinisiva...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >> > +1 (binding)
> > > > >> >
> > > > >> > Thanks for the KIP, Jorge.
> > > > >> >
> > > > >> > Regards,
> > > > >> >
> > > > >> > Rajini
> > > > >> >
> > > > >> > On Wed, Jan 17, 2018 at 9:04 PM, Guozhang Wang <
> > wangg...@gmail.com>
> > > > wrote:
> > > > >> >
> > > > >> > > +1 (binding). Thanks Jorge.
> > > > >> > >
> > >
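
Jun's suggestion of a public wrapper type can be filled out roughly as below. Since the archive stripped the generic parameters from the snippets, the `TopicPartition` element type (and the stand-in class itself) are assumptions for this sketch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Minimal stand-in; Kafka's real TopicPartition lives in org.apache.kafka.common.
class TopicPartition {
    final String topic;
    final int partition;
    TopicPartition(String topic, int partition) { this.topic = topic; this.partition = partition; }
    @Override public String toString() { return topic + "-" + partition; }
}

// Wrapping the list in its own public type, instead of exposing a bare
// List<TopicPartition>, leaves room to add fields later without breaking
// the MemberDescription signature.
class MemberAssignment {
    private final List<TopicPartition> topicPartitions;
    MemberAssignment(List<TopicPartition> topicPartitions) {
        this.topicPartitions = Collections.unmodifiableList(new ArrayList<>(topicPartitions));
    }
    List<TopicPartition> topicPartitions() { return topicPartitions; }
}

class MemberDescription {
    private final String consumerId;
    private final String clientId;
    private final String host;
    private final MemberAssignment assignment;
    MemberDescription(String consumerId, String clientId, String host,
                      MemberAssignment assignment) {
        this.consumerId = consumerId; this.clientId = clientId;
        this.host = host; this.assignment = assignment;
    }
    MemberAssignment assignment() { return assignment; }
}
```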

Re: [VOTE] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2018-01-21 Thread Jorge Esteban Quilcate Otoya
Hi all,

Thanks all for your votes and approving this KIP :)

@Jun Rao:

1. Yes, KIP is updated with MemberDescription.
2. Changed:
```
public class ListGroupOffsetsResult {
final KafkaFuture<Map<TopicPartition, OffsetAndMetadata>> future;
```
3. Not sure about this one as Assignment type is part of
o.a.k.clients.consumer.internals. Will we be breaking encapsulation if we
expose it as part of AdminClient?
Currently it is defined as:
```
public class MemberDescription {
private final String consumerId;
private final String clientId;
private final String host;
private final List assignment;
}
```

BTW: I've created a PR with the work in progress:
https://github.com/apache/kafka/pull/4454

Cheers,
Jorge.

On Fri, Jan 19, 2018 at 23:52, Jun Rao (<j...@confluent.io>) wrote:

> Hi, Jorge,
>
> Thanks for the KIP. Looks good to me overall. A few comments below.
>
> 1. It seems that ConsumerDescription should be MemberDescription?
>
> 2. Each offset can have an optional metadata. So, in
> ListGroupOffsetsResult, perhaps it's better to have
> KafkaFuture<Map<TopicPartition, OffsetAndMetadata>>, where
> OffsetAndMetadata contains an offset and a metadata of String.
>
> 3. As Jason mentioned in the discussion, it would be nice to extend this
> api to support general group management, instead of just the consumer group
> in the future. For that, it might be better for MemberDescription to have
> assignment of type Assignment, which consists of a list of partitions.
> Then, in the future, we can add other fields to Assignment.
>
> Jun
>
>
> On Thu, Jan 18, 2018 at 9:45 AM, Mickael Maison <mickael.mai...@gmail.com>
> wrote:
>
> > +1 (non binding), thanks
> >
> > On Thu, Jan 18, 2018 at 5:41 PM, Colin McCabe <cmcc...@apache.org>
> wrote:
> > > +1 (non-binding)
> > >
> > > Colin
> > >
> > >
> > > On Thu, Jan 18, 2018, at 07:36, Ted Yu wrote:
> > >> +1
> > >>  Original message From: Bill Bejeck <
> bbej...@gmail.com>
> > >> Date: 1/18/18  6:59 AM  (GMT-08:00) To: dev@kafka.apache.org Subject:
> > >> Re: [VOTE] KIP-222 - Add "describe consumer group" to KafkaAdminClient
> > >> Thanks for the KIP
> > >>
> > >> +1
> > >>
> > >> Bill
> > >>
> > >> On Thu, Jan 18, 2018 at 4:24 AM, Rajini Sivaram <
> > rajinisiva...@gmail.com>
> > >> wrote:
> > >>
> > >> > +1 (binding)
> > >> >
> > >> > Thanks for the KIP, Jorge.
> > >> >
> > >> > Regards,
> > >> >
> > >> > Rajini
> > >> >
> > >> > On Wed, Jan 17, 2018 at 9:04 PM, Guozhang Wang <wangg...@gmail.com>
> > wrote:
> > >> >
> > >> > > +1 (binding). Thanks Jorge.
> > >> > >
> > >> > >
> > >> > > Guozhang
> > >> > >
> > >> > > On Wed, Jan 17, 2018 at 11:29 AM, Gwen Shapira <g...@confluent.io
> >
> > >> > wrote:
> > >> > >
> > >> > > > Hey, since there were no additional comments in the discussion,
> > I'd
> > >> > like
> > >> > > to
> > >> > > > resume the voting.
> > >> > > >
> > >> > > > +1 (binding)
> > >> > > >
> > >> > > > On Fri, Nov 17, 2017 at 9:15 AM Guozhang Wang <
> wangg...@gmail.com
> > >
> > >> > > wrote:
> > >> > > >
> > >> > > > > Hello Jorge,
> > >> > > > >
> > >> > > > > I left some comments on the discuss thread. The wiki page
> itself
> > >> > looks
> > >> > > > good
> > >> > > > > overall.
> > >> > > > >
> > >> > > > >
> > >> > > > > Guozhang
> > >> > > > >
> > >> > > > > On Tue, Nov 14, 2017 at 10:02 AM, Jorge Esteban Quilcate
> Otoya <
> > >> > > > > quilcate.jo...@gmail.com> wrote:
> > >> > > > >
> > >> > > > > > Added.
> > >> > > > > >
> > >> > > > > > El mar., 14 nov. 2017 a las 19:00, Ted Yu (<
> > yuzhih...@gmail.com>)
> > >> > > > > > escribió:
> > >> > > > > >
> > >> 

Re: [ANNOUNCE] New committer: Matthias J. Sax

2018-01-15 Thread Jorge Esteban Quilcate Otoya
Congratulations Matthias!!

On Mon, Jan 15, 2018 at 9:08, Boyang Chen ()
wrote:

> Great news Matthias!
>
>
> 
> From: Kaufman Ng 
> Sent: Monday, January 15, 2018 11:32 AM
> To: us...@kafka.apache.org
> Cc: dev@kafka.apache.org
> Subject: Re: [ANNOUNCE] New committer: Matthias J. Sax
>
> Congrats Matthias!
>
> On Sun, Jan 14, 2018 at 4:52 AM, Rajini Sivaram 
> wrote:
>
> > Congratulations Matthias!
> >
> > On Sat, Jan 13, 2018 at 11:34 AM, Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > wrote:
> >
> > > Congratulations Matthias !
> > >
> > > On Sat, Jan 13, 2018 at 7:01 AM, Paolo Patierno 
> > > wrote:
> > > > Congratulations Matthias ! Very well deserved !
> > > > 
> > > > From: Guozhang Wang 
> > > > Sent: Friday, January 12, 2018 11:59:21 PM
> > > > To: dev@kafka.apache.org; us...@kafka.apache.org
> > > > Subject: [ANNOUNCE] New committer: Matthias J. Sax
> > > >
> > > > Hello everyone,
> > > >
> > > > The PMC of Apache Kafka is pleased to announce Matthias J. Sax as our
> > > > newest Kafka committer.
> > > >
> > > > Matthias has made tremendous contributions to Kafka Streams API since
> > > early
> > > > 2016. His footprint has been all over the places in Streams: in the
> > past
> > > > two years he has been the main driver on improving the join semantics
> > > > inside Streams DSL, summarizing all their shortcomings and bridging
> the
> > > > gaps; he has also been largely working on the exactly-once semantics
> of
> > > > Streams by leveraging on the transaction messaging feature in 0.11.0.
> > In
> > > > addition, Matthias have been very active in community activity that
> > goes
> > > > beyond mailing list: he's getting the close to 1000 up votes and 100
> > > > helpful flags on SO for answering almost all questions about Kafka
> > > Streams.
> > > >
> > > > Thank you for your contribution and welcome to Apache Kafka,
> Matthias!
> > > >
> > > >
> > > >
> > > > Guozhang, on behalf of the Apache Kafka PMC
> > >
> >
>
>
>
> --
> Kaufman Ng
> +1 646 961 8063 <(646)%20961-8063>
> Solutions Architect | Confluent | www.confluent.io
>
>
>
>


Re: [DISCUSS] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2018-01-07 Thread Jorge Esteban Quilcate Otoya
Great!

I have added `listGroupOffsets` to the KIP.

If there are no additional feedback, VOTE thread is already open.

Cheers,
Jorge


On Tue, Jan 2, 2018 at 17:49, Gwen Shapira (<g...@confluent.io>)
wrote:

> listGroups and listGroupOffsets will make it a snap to transition the
> existing ConsumerGroups CLI to depend on client libraries only.
>
> Thanks for adding them :)
>
> On Sun, Dec 31, 2017 at 1:39 PM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Thanks all for your feedback, and sorry for the late response.
> >
> > I'm considering the following:
> >
> > ```AdminClient.java
> > public abstract ListGroupsResult listGroups(ListGroupsOptions
> options);
> >
> > public ListGroupsResult listGroups() {
> > return listGroups(new ListGroupsOptions());
> > }
> >
> > public ListGroupsResult listConsumerGroups(ListGroupsOptions
> options) {
> > //filtering groups by ConsumerProtocol.PROTOCOL_TYPE
> > }
> >
> > public ListGroupsResult listConsumerGroups() {
> > return listConsumerGroups(new ListGroupsOptions());
> > }
> > ```
> >
> > About `describeConsumerGroups`, I'm considering renaming it to
> > `describeGroups`, and renaming `ConsumerGroupDescription` to
> > `GroupDescription` and `ConsumerDescription` to `MemberDescription`.
> > Not sure we need a deserializer, we can access `DescribeGroupsResponse`
> > members directly.
> >
> > As @dan says, I also think `listGroupOffsets` could be added to this KIP
> to
> > make it complete.
> >
> > I'm thinking about renaming this KIP to "Add Consumer Group operations to
> > Admin API".
> >
> > I'm updating the KIP accordingly.
> >
> > Cheers and happy 2018!
> >
> > Jorge.
> >
> > El mié., 13 dic. 2017 a las 19:06, Colin McCabe (<cmcc...@apache.org>)
> > escribió:
> >
> > > On Tue, Dec 12, 2017, at 09:39, Jason Gustafson wrote:
> > > > Hi Colin,
> > > >
> > > > They do share the same namespace. We have a "protocol type" field in
> > the
> > > > JoinGroup request to make sure that all members are of the same kind.
> > >
> > > Hi Jason,
> > >
> > > Thanks.  That makes sense.
> > >
> > > > Very roughly what I was thinking is something like this. First we
> > > introduce an
> > > > interface for deserialization:
> > > >
> > > > interface GroupMetadataDeserializer<Metadata, Assignment> {
> > > >   String protocolType();
> > > >   Metadata desrializeMetadata(ByteBuffer);
> > > >   Assignment deserializeAssignment(ByteBuffer);
> > > > }
> > > >
> > > > Then we add some kind of generic container:
> > > >
> > > > class MemberMetadata<Metadata, Assignment> {
> > > >   Metadata metadata;
> > > >   Assignment assignment;
> > > > }
> > > >
> > > > Then we have two APIs: one generic and one specific to consumer
> groups:
> > > >
> > > > <M, A> Map<String, MemberMetadata<M,A>> describeGroup(String groupId,
> > > > GroupMetadataDeserializer<M, A> deserializer);
> > > >
> > > > Map<String, ConsumerGroupMetadata> describeConsumerGroup(String
> > groupId);
> > > >
> > > > (This is just a sketch, so obviously we can change them to use
> futures
> > or
> > > > to batch or whatever.)
> > > >
> > > > I think it would be fine to not provide a connect-specific API since
> > this
> > > > usage will probably be limited to Connect itself.
> > >
> > > Yeah, it probably makes sense to have a separation between
> describeGroup
> > > and describeConsumerGroup.
> > >
> > > We will have to be pretty careful with cross-version compatibility in
> > > describeConsumerGroup.  It should be possible for an old client to talk
> > > to a new broker, and a new client to talk to an old broker.  So we
> > > should be prepared to read data in multiple formats.
> > >
> > > I'm not sure if we need to have a 'deserializer' argument to
> > > describeGroup.  We can just let them access a byte array, right?
> > > Theoretically they might also just want to check for the presence or
> > > absence of a group, but not deserialize anything.
> > >
> > > best,
> > > Colin
> >
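To make the API shape sketched in this thread concrete, here is a compilable rendering. All names are illustrative only, taken from Jason's sketch above; none of this is part of a released AdminClient API:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: a compilable version of the interface and container
// sketched in the thread. Names follow the sketch, not a final API.
interface GroupMetadataDeserializer<M, A> {
    String protocolType();
    M deserializeMetadata(ByteBuffer buffer);
    A deserializeAssignment(ByteBuffer buffer);
}

class MemberMetadata<M, A> {
    final M metadata;
    final A assignment;

    MemberMetadata(M metadata, A assignment) {
        this.metadata = metadata;
        this.assignment = assignment;
    }
}

// A trivial deserializer that treats both blobs as UTF-8 strings,
// standing in for a hypothetical custom group protocol:
class StringGroupDeserializer implements GroupMetadataDeserializer<String, String> {
    @Override
    public String protocolType() {
        return "custom";
    }

    @Override
    public String deserializeMetadata(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }

    @Override
    public String deserializeAssignment(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }
}
```

A generic `describeGroup(groupId, deserializer)` would then return a `Map<String, MemberMetadata<M, A>>` keyed by member id, as in the sketch above.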

Re: [DISCUSS] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2017-12-31 Thread Jorge Esteban Quilcate Otoya
> is there any update regarding this KIP?
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 11/17/17 9:14 AM, Guozhang Wang wrote:
> > > > > > Hello Jorge,
> > > > > >
> > > > > > I made a pass over the wiki, and here are a few comments:
> > > > > >
> > > > > > 1. First, regarding to Tom's comment #2 above, I think if we are
> only
> > > > > going
> > > > > > to include the String groupId. Then it is Okay to keep as a
> String
> > > than
> > > > > > using a new wrapper class. However, I think we could include the
> > > > > > protocol_type returned from the ListGroupsResponse along with the
> > > > > groupId.
> > > > > > This is a very useful information to tell which consumer groups
> are
> > > from
> > > > > > Connect, which ones are from Streams, which ones are
> user-customized
> > > etc.
> > > > > > With this, it is reasonable to keep a wrapper class.
> > > > > >
> > > > > > 2. In ConsumerDescription, could we also add the state,
> protocol_type
> > > > > > (these two are form DescribeGroupResponse), and the Node
> coordinator
> > > > > (this
> > > > > > may be returned from the AdminClient itself) as well? This is
> also
> > > for
> > > > > > information consistency with the old client (note that
> protocol_type
> > > was
> > > > > > called assignment_strategy there).
> > > > > >
> > > > > > 3. With 1) / 2) above, maybe we can rename
> "ConsumerGroupListing" to
> > > > > > "ConsumerGroupSummary" and make "ConsumerGroupDescription" an
> > > extended
> > > > > > class of the former with the additional fields?
> > > > > >
> > > > > >
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > >
> > > > > > On Tue, Nov 7, 2017 at 2:13 AM, Jorge Esteban Quilcate Otoya <
> > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > >
> > > > > >> Hi Tom,
> > > > > >>
> > > > > >> 1. You're right. I've updated the KIP accordingly.
> > > > > >> 2. Yes, I have add it to keep consistency, but I'd like to know
> what
> > > > > others
> > > > > >> think about this too.
> > > > > >>
> > > > > >> Cheers,
> > > > > >> Jorge.
> > > > > >>
> > > > > >> El mar., 7 nov. 2017 a las 9:29, Tom Bentley (<
> > > t.j.bent...@gmail.com>)
> > > > > >> escribió:
> > > > > >>
> > > > > >>> Hi again Jorge,
> > > > > >>>
> > > > > >>> A couple of minor points:
> > > > > >>>
> > > > > >>> 1. ConsumerGroupDescription has the member `name`, but
> everywhere
> > > else
> > > > > >> that
> > > > > >>> I've seen the term "group id" is used, so perhaps calling it
> "id"
> > > or
> > > > > >>> "groupId" would be more consistent.
> > > > > >>> 2. I think you've added ConsumerGroupListing for consistency
> with
> > > > > >>> TopicListing. For topics it makes sense because as well as the
> name
> > > > > there
> > > > > >>> is whether the topic is internal. For consumer groups, though
> > > there is
> > > > > >> just
> > > > > >>> the name and having a separate ConsumerGroupListing seems like
> it
> > > > > doesn't
> > > > > >>> add very much, and would mostly get in the way when using the
> API.
> > > I
> > > > > >> would
> > > > > >>> be interested in what others thought about this.
> > > > > >>>
> > > > > >>> Cheers,
> > > > > >>>
> > > > > >>> Tom
> > > > > >>>
> > > > > >>> On 6 November 2017 at 22:16, Jorge Esteban Quilcate Otoya <
> > > > > >>> 
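The summary/description split Guozhang suggests in point 3 of this thread could look roughly like the following. This is purely illustrative; field and class names follow the discussion, not the final AdminClient API:

```java
import java.util.Collections;
import java.util.List;

// Illustrative only: the ConsumerGroupSummary / ConsumerGroupDescription
// split suggested in the thread, not the actual API.
class ConsumerGroupSummary {
    final String groupId;
    final String protocolType; // e.g. "consumer", "connect", or custom

    ConsumerGroupSummary(String groupId, String protocolType) {
        this.groupId = groupId;
        this.protocolType = protocolType;
    }
}

class ConsumerGroupDescription extends ConsumerGroupSummary {
    final String state;         // from DescribeGroupsResponse
    final String coordinator;   // host:port of the group coordinator node
    final List<String> members; // member ids

    ConsumerGroupDescription(String groupId, String protocolType,
                             String state, String coordinator, List<String> members) {
        super(groupId, protocolType);
        this.state = state;
        this.coordinator = coordinator;
        this.members = Collections.unmodifiableList(members);
    }
}
```

Exposing `protocolType` on the summary is what lets callers tell Connect, Streams, and user-defined groups apart when listing groups.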

Re: KIP-244: Add Record Header support to Kafka Streams

2017-12-30 Thread Jorge Esteban Quilcate Otoya
Thanks for your feedback!

1. I was adding headers to KeyValue to support groupBy, but I think it is
not necessary. It should be enough to map headers to key/value and then
group using the current KeyValue structure.

2. Yes. IMO key/value stores, like RocksDB, rely on KV as structure, hence
considering headers as part of stateful operations will not fit this
approach and would increase complexity (I cannot think of a use case that
needs this).

3. and 4. Changes on 1. will solve this issue.

I probably rushed a bit in proposing this change; I was not aware of KIP-159
or KAFKA-5632.
If KIP-159 is adopted, reducing this KIP to adding Headers to RecordContext
will be enough, but I'm not sure about the scope of KIP-159. If it includes
stateful operations, it will be difficult to implement as stated in 2.

Cheers,
Jorge.

El mar., 26 dic. 2017 a las 20:04, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> Thanks for the KIP Jorge,
>
> As Bill pointed out already, we should be careful with adding new
> overloads as this contradicts the work done via KIP-182.
>
> This KIP also seems to be related to KIP-149 and KIP-159. Are you aware
> of them? Both have quite long DISCUSS threads, but it might be worth
> browsing through them.
>
> A few further questions:
>
>  - why do you want to add the headers to `KeyValue`? I am not sure if we
> should consider headers as optional metadata and add it to
> `RecordContext` similar to timestamp, offset, etc. only


>  - You only include stateless single-record transformations at the DSL
> level. Do you suggest that all other operators just drop headers on the
> floor?
>
>  - Why do you only want to put headers into in-memory and cache but not
> RocksDB store? What do you mean by "pass through"? IMHO, all stores
> should behave the same at DSL level.
>-> if we store the headers in the state stores, what is the upgrade
> path?
>
>  - Why do we need to store record header in state in the first place, if
> we exclude stateful operator at DSL level?
>
>
> What is the motivation for the "border lines" you choose?
>
>
> -Matthias
>
>
> On 12/21/17 8:18 AM, Bill Bejeck wrote:
> > Jorge,
> >
> > Thanks for the KIP, I know this is a feature others in the community have
> > been interested in getting into Kafka Streams.
> >
> > I took a quick pass over it, and I have one initial question.
> >
> > We recently reduced overloads with KIP-182, and in this KIP we are
> > increasing them again.
> >
> > I can see from the KIP why they are necessary, but I'm wondering if there
> > is something else we can do to cut down on the overloads introduced.  I
> > don't have any sound suggestions ATM, so I'll have to think about it some
> > more, but I wanted to put the thought out there.
> >
> > Thanks,
> > Bill
> >
> > On Thu, Dec 21, 2017 at 9:06 AM, Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> I have created a KIP to add Record Headers support to Kafka Streams API:
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 244%3A+Add+Record+Header+support+to+Kafka+Streams
> >>
> >>
> >> The main goal is to be able to use headers to filter, map and process
> >> records as streams. Stateful processing (joins, windows) are not
> >> considered.
> >>
> >> Proposed changes/Draft:
> >> https://github.com/apache/kafka/compare/trunk...jeqo:streams-headers
> >>
> >> Feedback and suggestions are more than welcome.
> >>
> >> Cheers,
> >>
> >> Jorge.
> >>
> >
>
>


KIP-244: Add Record Header support to Kafka Streams

2017-12-21 Thread Jorge Esteban Quilcate Otoya
Hi all,

I have created a KIP to add Record Headers support to Kafka Streams API:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-244%3A+Add+Record+Header+support+to+Kafka+Streams


The main goal is to be able to use headers to filter, map and process
records as streams. Stateful processing (joins, windows) are not
considered.

Proposed changes/Draft:
https://github.com/apache/kafka/compare/trunk...jeqo:streams-headers

Feedback and suggestions are more than welcome.

Cheers,

Jorge.
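To illustrate the kind of header-based filtering the KIP aims to enable, here is a self-contained sketch using stand-in types. The actual proposal operates on Kafka's own Header/Headers classes within the Streams DSL, so everything below is hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; the KIP itself targets Kafka's
// Header/Headers classes and the Kafka Streams DSL.
class HeaderFilterSketch {
    static class Header {
        final String key;
        final byte[] value;
        Header(String key, byte[] value) { this.key = key; this.value = value; }
    }

    static class SourceRecord {
        final String key;
        final String value;
        final List<Header> headers;
        SourceRecord(String key, String value, List<Header> headers) {
            this.key = key;
            this.value = value;
            this.headers = headers;
        }
    }

    // Keep only records carrying a header with the given key and UTF-8 value,
    // i.e. the filter-by-header operation the KIP motivates.
    static List<SourceRecord> filterByHeader(List<SourceRecord> records,
                                             String headerKey, String expected) {
        List<SourceRecord> out = new ArrayList<>();
        for (SourceRecord r : records) {
            for (Header h : r.headers) {
                if (h.key.equals(headerKey)
                        && new String(h.value, StandardCharsets.UTF_8).equals(expected)) {
                    out.add(r);
                    break;
                }
            }
        }
        return out;
    }
}
```

The map and process cases the KIP mentions would follow the same pattern: the operator gets read access to the record's headers alongside its key and value.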


Re: [VOTE] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2017-11-14 Thread Jorge Esteban Quilcate Otoya
Added.

El mar., 14 nov. 2017 a las 19:00, Ted Yu (<yuzhih...@gmail.com>) escribió:

> Please fill in JIRA number in Status section.
>
> On Tue, Nov 14, 2017 at 9:57 AM, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > JIRA issue title updated.
> >
> > El mar., 14 nov. 2017 a las 18:45, Ted Yu (<yuzhih...@gmail.com>)
> > escribió:
> >
> > > Can you fill in JIRA number (KAFKA-6058
> > > <https://issues.apache.org/jira/browse/KAFKA-6058>) ?
> > >
> > > If one JIRA is used for the two additions, consider updating the JIRA
> > > title.
> > >
> > > On Tue, Nov 14, 2017 at 9:04 AM, Jorge Esteban Quilcate Otoya <
> > > quilcate.jo...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > As I didn't see any further discussion around this KIP, I'd like to
> > start
> > > > voting.
> > > >
> > > > KIP documentation:
> > > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.
> > action?pageId=74686265
> > > >
> > > > Cheers,
> > > > Jorge.
> > > >
> > >
> >
>


Re: [VOTE] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2017-11-14 Thread Jorge Esteban Quilcate Otoya
JIRA issue title updated.

El mar., 14 nov. 2017 a las 18:45, Ted Yu (<yuzhih...@gmail.com>) escribió:

> Can you fill in JIRA number (KAFKA-6058
> <https://issues.apache.org/jira/browse/KAFKA-6058>) ?
>
> If one JIRA is used for the two additions, consider updating the JIRA
> title.
>
> On Tue, Nov 14, 2017 at 9:04 AM, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Hi all,
> >
> > As I didn't see any further discussion around this KIP, I'd like to start
> > voting.
> >
> > KIP documentation:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74686265
> >
> > Cheers,
> > Jorge.
> >
>


[VOTE] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2017-11-14 Thread Jorge Esteban Quilcate Otoya
Hi all,

As I didn't see any further discussion around this KIP, I'd like to start
voting.

KIP documentation:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74686265

Cheers,
Jorge.


Re: [VOTE] KIP-171 - Extend Consumer Group Reset Offset for Stream Application

2017-11-14 Thread Jorge Esteban Quilcate Otoya
Thanks to everyone for your feedback.

KIP has been accepted and discussion is moved to PR.

Cheers,
Jorge.

El lun., 6 nov. 2017 a las 17:31, Rajini Sivaram (<rajinisiva...@gmail.com>)
escribió:

> +1 (binding)
> Thanks for the KIP,  Jorge.
>
> Regards,
>
> Rajini
>
> On Tue, Oct 31, 2017 at 9:58 AM, Damian Guy <damian@gmail.com> wrote:
>
> > Thanks for the KIP - +1 (binding)
> >
> > On Mon, 23 Oct 2017 at 18:39 Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > Thanks Jorge for driving this KIP! +1 (binding).
> > >
> > >
> > > Guozhang
> > >
> > > On Mon, Oct 16, 2017 at 2:11 PM, Bill Bejeck <bbej...@gmail.com>
> wrote:
> > >
> > > > +1
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Fri, Oct 13, 2017 at 6:36 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Fri, Oct 13, 2017 at 3:32 PM, Matthias J. Sax <
> > > matth...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 9/11/17 3:04 PM, Jorge Esteban Quilcate Otoya wrote:
> > > > > > > Hi All,
> > > > > > >
> > > > > > > It seems that there is no further concern with the KIP-171.
> > > > > > > At this point we would like to start the voting process.
> > > > > > >
> > > > > > > The KIP can be found here:
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
> > > > > > >
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>


Re: [DISCUSS] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2017-11-07 Thread Jorge Esteban Quilcate Otoya
Hi Tom,

1. You're right. I've updated the KIP accordingly.
2. Yes, I have added it to keep consistency, but I'd like to know what others
think about this too.

Cheers,
Jorge.

El mar., 7 nov. 2017 a las 9:29, Tom Bentley (<t.j.bent...@gmail.com>)
escribió:

> Hi again Jorge,
>
> A couple of minor points:
>
> 1. ConsumerGroupDescription has the member `name`, but everywhere else that
> I've seen the term "group id" is used, so perhaps calling it "id" or
> "groupId" would be more consistent.
> 2. I think you've added ConsumerGroupListing for consistency with
> TopicListing. For topics it makes sense because as well as the name there
> is whether the topic is internal. For consumer groups, though there is just
> the name and having a separate ConsumerGroupListing seems like it doesn't
> add very much, and would mostly get in the way when using the API. I would
> be interested in what others thought about this.
>
> Cheers,
>
> Tom
>
> On 6 November 2017 at 22:16, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Thanks for the feedback!
> >
> > @Ted Yu: Links added.
> >
> > KIP updated. Changes:
> >
> > * `#listConsumerGroups(ListConsumerGroupsOptions options)` added to the
> > API.
> > * `DescribeConsumerGroupResult` and `ConsumerGroupDescription` classes
> > described.
> >
> > Cheers,
> > Jorge.
> >
> >
> >
> >
> > El lun., 6 nov. 2017 a las 20:28, Guozhang Wang (<wangg...@gmail.com>)
> > escribió:
> >
> > > Hi Matthias,
> > >
> > > You meant "list groups" I think?
> > >
> > > Guozhang
> > >
> > > On Mon, Nov 6, 2017 at 11:17 AM, Matthias J. Sax <
> matth...@confluent.io>
> > > wrote:
> > >
> > > > The main goal of this KIP is to enable decoupling StreamsResetter
> from
> > > > core module. For this case (ie, using AdminClient within
> > > > StreamsResetter) we get the group.id from the user as command line
> > > > argument. Thus, I think the KIP is useful without "describe group"
> > > > command too.
> > > >
> > > > I am happy to include "describe group" command in the KIP. Just want
> to
> > > > point out, that there is no reason to insist on it IMHO.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 11/6/17 7:06 PM, Guozhang Wang wrote:
> > > > > A quick question: I think we do not yet have the `list consumer
> > groups`
> > > > > func as in the old AdminClient. Without this `describe group` given
> > the
> > > > > group id would not be very useful. Could you include this as well
> in
> > > your
> > > > > KIP? More specifically, you can look at kafka.admin.AdminClient for
> > more
> > > > > details on the APIs.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Mon, Nov 6, 2017 at 7:22 AM, Ted Yu <yuzhih...@gmail.com>
> wrote:
> > > > >
> > > > >> Please fill out Discussion thread and JIRA fields.
> > > > >>
> > > > >> Thanks
> > > > >>
> > > > >> On Mon, Nov 6, 2017 at 2:02 AM, Tom Bentley <
> t.j.bent...@gmail.com>
> > > > wrote:
> > > > >>
> > > > >>> Hi Jorge,
> > > > >>>
> > > > >>> Thanks for the KIP. A few initial comments:
> > > > >>>
> > > > >>> 1. The AdminClient doesn't have any API like
> `listConsumerGroups()`
> > > > >>> currently, so in general how does a client know the group ids it
> is
> > > > >>> interested in?
> > > > >>> 2. Could you fill in the API of DescribeConsumerGroupResult, just
> > so
> > > > >>> everyone knows exactly what is being proposed.
> > > > >>> 3. Can you describe the ConsumerGroupDescription class?
> > > > >>> 4. Probably worth mentioning that this will use
> > > > >>> DescribeGroupsRequest/Response, and also enumerating the error
> > codes
> > > > >> that
> > > > >>> can return (or, equivalently, enumerate the exceptions thrown from
> > the
> > > > >>> futures obtained from the DescribeConsumerGroupResult).
> > > > >>>
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Tom
> > > > >>>
> > > > >>> On 6 November 2017 at 08:19, Jorge Esteban Quilcate Otoya <
> > > > >>> quilcate.jo...@gmail.com> wrote:
> > > > >>>
> > > > >>>> Hi everyone,
> > > > >>>>
> > > > >>>> I would like to start a discussion on KIP-222 [1] based on issue
> > > [2].
> > > > >>>>
> > > > >>>> Looking forward to feedback.
> > > > >>>>
> > > > >>>> [1]
> > > > >>>> https://cwiki.apache.org/confluence/pages/viewpage.
> > > > >>> action?pageId=74686265
> > > > >>>> [2] https://issues.apache.org/jira/browse/KAFKA-6058
> > > > >>>>
> > > > >>>> Cheers,
> > > > >>>> Jorge.
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>


Re: [ANNOUNCE] New committer: Onur Karaman

2017-11-06 Thread Jorge Esteban Quilcate Otoya
Congratulations Onur!!
On Tue, 7 Nov 2017 at 06:30, Jaikiran Pai  wrote:

> Congratulations Onur!
>
> -Jaikiran
>
>
> On 06/11/17 10:54 PM, Jun Rao wrote:
> > Hi, everyone,
> >
> > The PMC of Apache Kafka is pleased to announce a new Kafka committer Onur
> > Karaman.
> >
> > Onur's most significant work is the improvement of Kafka controller,
> which
> > is the brain of a Kafka cluster. Over time, we have accumulated quite a
> few
> > correctness and performance issues in the controller. There have been
> > attempts to fix controller issues in isolation, which would make the code
> > base more complicated without a clear path of solving all problems. Onur
> is
> > the one who took a holistic approach, by first documenting all known
> > issues, writing down a new design, coming up with a plan to deliver the
> > changes in phases and executing on it. At this point, Onur has completed
> > the two most important phases: making the controller single threaded and
> > changing the controller to use the async ZK api. The former fixed
> multiple
> > deadlocks and race conditions. The latter significantly improved the
> > performance when there are many partitions. Experimental results show
> that
> > Onur's work reduced the controlled shutdown time by a factor of 100
> > and the controller failover time by a factor of 3.
> >
> > Congratulations, Onur!
> >
> > Thanks,
> >
> > Jun (on behalf of the Apache Kafka PMC)
> >
>
>


Re: [DISCUSS] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2017-11-06 Thread Jorge Esteban Quilcate Otoya
Thanks for the feedback!

@Ted Yu: Links added.

KIP updated. Changes:

* `#listConsumerGroups(ListConsumerGroupsOptions options)` added to the API.
* `DescribeConsumerGroupResult` and `ConsumerGroupDescription` classes
described.

Cheers,
Jorge.




El lun., 6 nov. 2017 a las 20:28, Guozhang Wang (<wangg...@gmail.com>)
escribió:

> Hi Matthias,
>
> You meant "list groups" I think?
>
> Guozhang
>
> On Mon, Nov 6, 2017 at 11:17 AM, Matthias J. Sax <matth...@confluent.io>
> wrote:
>
> > The main goal of this KIP is to enable decoupling StreamsResetter from
> > core module. For this case (ie, using AdminClient within
> > StreamsResetter) we get the group.id from the user as command line
> > argument. Thus, I think the KIP is useful without "describe group"
> > command too.
> >
> > I am happy to include "describe group" command in the KIP. Just want to
> > point out, that there is no reason to insist on it IMHO.
> >
> >
> > -Matthias
> >
> > On 11/6/17 7:06 PM, Guozhang Wang wrote:
> > > A quick question: I think we do not yet have the `list consumer groups`
> > > func as in the old AdminClient. Without this `describe group` given the
> > > group id would not be very useful. Could you include this as well in
> your
> > > KIP? More specifically, you can look at kafka.admin.AdminClient for more
> > > details on the APIs.
> > >
> > >
> > > Guozhang
> > >
> > > On Mon, Nov 6, 2017 at 7:22 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > >> Please fill out Discussion thread and JIRA fields.
> > >>
> > >> Thanks
> > >>
> > >> On Mon, Nov 6, 2017 at 2:02 AM, Tom Bentley <t.j.bent...@gmail.com>
> > wrote:
> > >>
> > >>> Hi Jorge,
> > >>>
> > >>> Thanks for the KIP. A few initial comments:
> > >>>
> > >>> 1. The AdminClient doesn't have any API like `listConsumerGroups()`
> > >>> currently, so in general how does a client know the group ids it is
> > >>> interested in?
> > >>> 2. Could you fill in the API of DescribeConsumerGroupResult, just so
> > >>> everyone knows exactly what is being proposed.
> > >>> 3. Can you describe the ConsumerGroupDescription class?
> > >>> 4. Probably worth mentioning that this will use
> > >>> DescribeGroupsRequest/Response, and also enumerating the error codes
> > >> that
> > >>> can return (or, equivalently, enumerate the exceptions thrown from the
> > >>> futures obtained from the DescribeConsumerGroupResult).
> > >>>
> > >>> Cheers,
> > >>>
> > >>> Tom
> > >>>
> > >>> On 6 November 2017 at 08:19, Jorge Esteban Quilcate Otoya <
> > >>> quilcate.jo...@gmail.com> wrote:
> > >>>
> > >>>> Hi everyone,
> > >>>>
> > >>>> I would like to start a discussion on KIP-222 [1] based on issue
> [2].
> > >>>>
> > >>>> Looking forward to feedback.
> > >>>>
> > >>>> [1]
> > >>>> https://cwiki.apache.org/confluence/pages/viewpage.
> > >>> action?pageId=74686265
> > >>>> [2] https://issues.apache.org/jira/browse/KAFKA-6058
> > >>>>
> > >>>> Cheers,
> > >>>> Jorge.
> > >>>>
> > >>>
> > >>
> > >
> > >
> > >
> >
> >
>
>
> --
> -- Guozhang
>


[DISCUSS] KIP-222 - Add "describe consumer group" to KafkaAdminClient

2017-11-06 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I would like to start a discussion on KIP-222 [1] based on issue [2].

Looking forward to feedback.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74686265
[2] https://issues.apache.org/jira/browse/KAFKA-6058

Cheers,
Jorge.


Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-10-13 Thread Jorge Esteban Quilcate Otoya
Thanks Matthias.

I have updated the KIP accordingly and taken KAFKA-6058.
We will use `KafkaConsumer` instead of reusing `ConsumerGroupCommand` and
keep `StreamsResetter` in `core` until KAFKA-6058 is fixed.

Just as a reminder, [VOTE] thread is already open if there are no more
feedback on this KIP :)

El vie., 13 oct. 2017 a las 0:28, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> Jorge,
>
> thanks for the update.
>
> I would suggest to not reuse `ConsumerGroupCommand` and re-implement
> what we need in `StreamsResetter` directly.
>
> Even if we need to keep `StreamsResetter` in `core` for now, I think we
> should not introduce new dependencies.
>
> Currently, we still use the old `kafka.admin.AdminClient` in
> `StreamsResetter`. We need new `KafkaAdminClient` to support "describe
> consumer group" to get rid of this part. Then we can move
> `StreamsResetter` to `streams` package.
>
> Cf. https://issues.apache.org/jira/browse/KAFKA-6058
> https://issues.apache.org/jira/browse/KAFKA-5965
>
> Feel free to pick up KAFKA-6058 and KAFKA-5965.
>
>
> -Matthias
>
>
>
> On 10/9/17 12:54 AM, Jorge Esteban Quilcate Otoya wrote:
> > Matthias,
> >
> > Thanks for the heads up!
> >
> > I think the main dependency is from `StreamsResetter` to
> > `ConsumerGroupCommand` class to actually reuse `#reset-offsets`
> > functionality.
> >
> > Not sure what would be the better way to remove it. To expose commands
> > (e.g. `ConsumerGroupCommand`) as part of AdminClient, they have to be
> > re-implemented on the `client` module right? Is this an option? If not I
> > think we should keep `StreamsResetter` as part of `core` module until we
> > have `ConsumerGroupCommand` on `client` module as well.
> >
> > El vie., 6 oct. 2017 a las 0:05, Matthias J. Sax (<matth...@confluent.io
> >)
> > escribió:
> >
> >> Jorge,
> >>
> >> KIP-198 (that got merged already) overlaps with this KIP. Can you please
> >> update your KIP accordingly?
> >>
> >> Also, while working on KIP-198, we identified some shortcomings in
> >> AdminClient that do not allow us to move StreamsResetter out of core
> >> package. We want to address those shortcomings in another KIP to add
> >> missing functionality to the new AdminClient.
> >>
> >> Having said this, and remembering a discussion about dependencies that
> >> might be introduced by this KIP, it might be good to understand those
> >> dependencies in detail. Maybe we can resolve those dependencies somehow
> >> and thus, be able to move StreamsResetter out of core package. Could you
> >> summarize those dependencies in the KIP or just as a reply?
> >>
> >> Thanks!
> >>
> >>
> >> -Matthias
> >>
> >> On 9/11/17 3:02 PM, Jorge Esteban Quilcate Otoya wrote:
> >>> Thanks Guozhang!
> >>>
> >>> I have updated the KIP to:
> >>>
> >>> 1. Only one scenario param is allowed. If none, `to-earliest` will be
> >> used,
> >>> behaving as the current version.
> >>>
> >>> 2.
> >>>   1. An exception will be printed mentioning that there are no existing
> >>> offsets registered.
> >>>   2. inputTopics format could support defining partition numbers as in
> >>> reset-offsets option for kafka-consumer-groups.
> >>>
> >>> 3. That should be handled by KIP-198.
> >>>
> >>> I will start the VOTE thread in a following email.
> >>>
> >>>
> >>> El mié., 30 ago. 2017 a las 2:01, Guozhang Wang (<wangg...@gmail.com>)
> >>> escribió:
> >>>
> >>>> Hi Jorge,
> >>>>
> >>>> Thanks for the KIP. It would be a great feature to add to the reset
> >> tools.
> >>>> I made a pass over it and it looks good to me overall. I have a few
> >>>> comments:
> >>>>
> >>>> 1. For all the scenarios, do we allow users to specify more than one
> >>>> parameter? If not could you make that clear in the wiki, e.g. we
> would
> >>>> return with an error message saying that only one is allowed; if yes
> >> then
> >>>> what precedence order we are following?
> >>>>
> >>>> 2. Personally I feel that "--by-duration", "--to-offset" and
> >> "--shift-by"
> >>>> are a tad overkill, because 1) they assume there exist some committed
> >>>> offset for e
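The scenario options discussed for the reset tool (`--to-earliest`, `--shift-by`, `--to-offset`, etc.) all reduce to computing one target offset per partition and keeping it inside the valid range. A self-contained sketch of that core step, with hypothetical names (this is not the actual StreamsResetter code):

```java
import java.util.HashMap;
import java.util.Map;

class OffsetResetSketch {
    // Shift each partition's committed offset by `delta`, clamped into
    // [beginningOffset, endOffset] so the result is always consumable.
    // Hypothetical helper for illustration only.
    static Map<Integer, Long> shiftBy(Map<Integer, Long> committed,
                                      Map<Integer, Long> beginning,
                                      Map<Integer, Long> end,
                                      long delta) {
        Map<Integer, Long> target = new HashMap<>();
        for (Map.Entry<Integer, Long> e : committed.entrySet()) {
            int partition = e.getKey();
            long shifted = e.getValue() + delta;
            long lo = beginning.get(partition);
            long hi = end.get(partition);
            target.put(partition, Math.max(lo, Math.min(hi, shifted)));
        }
        return target;
    }
}
```

This also shows why `--shift-by` presupposes a committed offset per partition, which is exactly the concern raised in point 2 of the quoted review.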

Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-10-09 Thread Jorge Esteban Quilcate Otoya
Matthias,

Thanks for the heads up!

I think the main dependency is from `StreamsResetter` to
`ConsumerGroupCommand` class to actually reuse `#reset-offsets`
functionality.

Not sure what would be the better way to remove it. To expose commands
(e.g. `ConsumerGroupCommand`) as part of AdminClient, they have to be
re-implemented on the `client` module right? Is this an option? If not I
think we should keep `StreamsResetter` as part of `core` module until we
have `ConsumerGroupCommand` on `client` module as well.

El vie., 6 oct. 2017 a las 0:05, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> Jorge,
>
> KIP-198 (that got merged already) overlaps with this KIP. Can you please
> update your KIP accordingly?
>
> Also, while working on KIP-198, we identified some shortcomings in
> AdminClient that do not allow us to move StreamsResetter out of core
> package. We want to address those shortcomings in another KIP to add
> missing functionality to the new AdminClient.
>
> Having said this, and remembering a discussion about dependencies that
> might be introduced by this KIP, it might be good to understand those
> dependencies in detail. Maybe we can resolve those dependencies somehow
> and thus, be able to move StreamsResetter out of core package. Could you
> summarize those dependencies in the KIP or just as a reply?
>
> Thanks!
>
>
> -Matthias
>
> On 9/11/17 3:02 PM, Jorge Esteban Quilcate Otoya wrote:
> > Thanks Guozhang!
> >
> > I have updated the KIP to:
> >
> > 1. Only one scenario param is allowed. If none, `to-earliest` will be
> used,
> > behaving as the current version.
> >
> > 2.
> >   1. An exception will be printed mentioning that there are no existing
> > offsets registered.
> >   2. inputTopics format could support defining partition numbers as in
> > reset-offsets option for kafka-consumer-groups.
> >
> > 3. That should be handled by KIP-198.
> >
> > I will start the VOTE thread in a following email.
> >
> >
> > El mié., 30 ago. 2017 a las 2:01, Guozhang Wang (<wangg...@gmail.com>)
> > escribió:
> >
> >> Hi Jorge,
> >>
> >> Thanks for the KIP. It would be a great feature to add to the reset
> tools.
> >> I made a pass over it and it looks good to me overall. I have a few
> >> comments:
> >>
> >> 1. For all the scenarios, do we allow users to specify more than one
> >> parameter? If not could you make that clear in the wiki, e.g. we would
> >> return with an error message saying that only one is allowed; if yes
> then
> >> what precedence order we are following?
> >>
> >> 2. Personally I feel that "--by-duration", "--to-offset" and
> "--shift-by"
> >> are a tad overkill, because 1) they assume there exist some committed
> >> offset for each of the topics, but that may not always be true, 2)
> offset /
> >> time shifting amount on different topics may not be a good fit
> universally,
> >> i.e. one could imagine the we want to reset all input topics to their
> >> offsets of a given time, but resetting all topics' offset to the same
> value
> >> or let all of them shifting the same amount of offsets are usually not
> >> applicable. For "--by-duration" it seems it could be easily supported by
> the
> >> "to-date".
> >>
> >> For the general consumer group reset tool, since offsets can be set per
> >> partition, these parameters may be more useful.
> >>
> >> 3. As for the implementation details, when removing zookeeper config in
> >> `kafka-streams-application-reset`, we should consider returning a meaningful
> >> error message otherwise it would be "unrecognized config" blah.
> >>
> >>
> >> If you feel confident about the wiki after discussing about these
> points,
> >> please feel free to move on to start a voting thread. Note that we are
> >> about 3 weeks away from KIP deadline and 4 weeks away from feature
> >> deadline.
> >>
> >>
> >> Guozhang
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Aug 22, 2017 at 1:45 PM, Matthias J. Sax <matth...@confluent.io
> >
> >> wrote:
> >>
> >>> Thanks for the update Jorge.
> >>>
> >>> I don't have any further comments.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 8/12/17 6:43 PM, Jorge Esteban Quilcate Otoya wrote:
> >>>> I have updated the KIP:
> >>>>

[VOTE] KIP-171 - Extend Consumer Group Reset Offset for Stream Application

2017-09-11 Thread Jorge Esteban Quilcate Otoya
Hi All,

It seems that there is no further concern with the KIP-171.
At this point we would like to start the voting process.

The KIP can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application


Thanks!


Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-09-11 Thread Jorge Esteban Quilcate Otoya
Thanks Guozhang!

I have updated the KIP to:

1. Only one scenario param is allowed. If none, `to-earliest` will be used,
behaving as the current version.

2.
  1. An exception will be printed mentioning that there are no existing
offsets registered.
  2. inputTopics format could support defining partition numbers as in
reset-offsets option for kafka-consumer-groups.

3. That should be handled by KIP-198.

I will start the VOTE thread in a following email.


El mié., 30 ago. 2017 a las 2:01, Guozhang Wang (<wangg...@gmail.com>)
escribió:

> Hi Jorge,
>
> Thanks for the KIP. It would be a great feature to add to the reset tools.
> I made a pass over it and it looks good to me overall. I have a few
> comments:
>
> 1. For all the scenarios, do we allow users to specify more than one
> parameter? If not could you make that clear in the wiki, e.g. we would
> return with an error message saying that only one is allowed; if yes then
> what precedence order we are following?
>
> 2. Personally I feel that "--by-duration", "--to-offset" and "--shift-by"
> are a tad overkill, because 1) they assume there exist some committed
> offset for each of the topic, but that may not be always true, 2) offset /
> time shifting amount on different topics may not be a good fit universally,
> i.e. one could imagine the we want to reset all input topics to their
> offsets of a given time, but resetting all topics' offset to the same value
> or let all of them shifting the same amount of offsets are usually not
> applicable. For "--by-duration" it seems could be easily supported by the
> "to-date".
>
> For the general consumer group reset tool, since it could be set one per
> partition these parameters may be more useful.
>
> 3. As for the implementation details, when removing the zookeeper config in
> `kafka-streams-application-reset`, we should consider returning a meaningful
> error message; otherwise it would just be an "unrecognized config" error.
>
>
> If you feel confident about the wiki after discussing about these points,
> please feel free to move on to start a voting thread. Note that we are
> about 3 weeks away from KIP deadline and 4 weeks away from feature
> deadline.
>
>
> Guozhang
>
>
>
>
>
> On Tue, Aug 22, 2017 at 1:45 PM, Matthias J. Sax <matth...@confluent.io>
> wrote:
>
> > Thanks for the update Jorge.
> >
> > I don't have any further comments.
> >
> >
> > -Matthias
> >
> > On 8/12/17 6:43 PM, Jorge Esteban Quilcate Otoya wrote:
> > > I have updated the KIP:
> > >
> > > - Change execution parameters, using `--dry-run`
> > > - Reference KAFKA-4327
> > > - And advise about changes on `StreamResetter`
> > >
> > > Also includes that it will cover a change on `ConsumerGroupCommand` to
> > > align execution options.
> > >
> > > On Sun, Jul 16, 2017 at 5:37, Matthias J. Sax (<matth...@confluent.io>)
> > > wrote:
> > >
> > >> Thanks a lot for the update!
> > >>
> > >> I like the KIP!
> > >>
> > >> One more question about `--dry-run` vs `--execute`: While I agree that
> > >> we should use the same flag for both tools, I am not sure which one is
> > >> the better one... My personal take is, that I like `--dry-run` better.
> > >> Not sure what others think.
> > >>
> > >> One more comment: with the removal of ZK, we can also tackle this
> JIRA:
> > >> https://issues.apache.org/jira/browse/KAFKA-4327 If we do so, I think
> > we
> > >> should mention it in the KIP.
> > >>
> > >> I am also not sure about backward compatibility issue for this case.
> > >> Actually, I don't expect people to call `StreamsResetter` from Java
> > >> code, but you can never know. So if we break this, we need to make
> sure
> > >> to cover it in the KIP and later on in the release notes.
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 7/14/17 7:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > >>> Hi,
> > >>>
> > >>> KIP is updated.
> > >>> Changes:
> > >>> 1. Approach discussed to keep both tools (streams application
> resetter
> > >> and
> > >>> consumer group reset offset).
> > >>> 2. Options have been aligned between both tools.
> > >>> 3. The Zookeeper option will be removed from streams-application-reset,
> > >>> and replaced internally with the Kafka AdminClient.

Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-08-12 Thread Jorge Esteban Quilcate Otoya
I have updated the KIP:

- Change execution parameters, using `--dry-run`
- Reference KAFKA-4327
- And advise about changes on `StreamResetter`

It also includes a change on `ConsumerGroupCommand` to
align execution options.
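The `--dry-run` semantics discussed in this thread can be modeled simply: the reset plan is always computed and printed, but offsets are only committed when the flag is absent. The following is an illustrative Python sketch; the function and variable names are hypothetical and not the real tool's API:

```python
# Sketch of --dry-run: always show the plan; only apply it without --dry-run.
def reset_offsets(current: dict, target: dict, dry_run: bool) -> dict:
    """Return the offsets that are (or would be) committed per partition."""
    plan = {tp: target[tp] for tp in current}  # new offset per topic-partition
    for tp, offset in plan.items():
        suffix = " (dry run)" if dry_run else ""
        print(f"{tp}: {current[tp]} -> {offset}{suffix}")
    if not dry_run:
        current.update(plan)                   # "commit" the new offsets
    return plan

committed = {"topic-0": 42}
reset_offsets(committed, {"topic-0": 0}, dry_run=True)
print(committed)   # {'topic-0': 42} -- unchanged
reset_offsets(committed, {"topic-0": 0}, dry_run=False)
print(committed)   # {'topic-0': 0}
```

The key design point, mirrored in the sketch, is that a dry run must exercise the same planning code path as a real execution, so the user sees exactly what would happen.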

On Sun, Jul 16, 2017 at 5:37, Matthias J. Sax (<matth...@confluent.io>)
wrote:

> Thanks a lot for the update!
>
> I like the KIP!
>
> One more question about `--dry-run` vs `--execute`: While I agree that
> we should use the same flag for both tools, I am not sure which one is
> the better one... My personal take is, that I like `--dry-run` better.
> Not sure what others think.
>
> One more comment: with the removal of ZK, we can also tackle this JIRA:
> https://issues.apache.org/jira/browse/KAFKA-4327 If we do so, I think we
> should mention it in the KIP.
>
> I am also not sure about backward compatibility issue for this case.
> Actually, I don't expect people to call `StreamsResetter` from Java
> code, but you can never know. So if we break this, we need to make sure
> to cover it in the KIP and later on in the release notes.
>
>
> -Matthias
>
> On 7/14/17 7:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > Hi,
> >
> > KIP is updated.
> > Changes:
> > 1. Approach discussed to keep both tools (streams application resetter
> and
> > consumer group reset offset).
> > 2. Options have been aligned between both tools.
> > 3. The Zookeeper option will be removed from streams-application-reset,
> > and replaced internally with the Kafka AdminClient.
> >
> > Looking forward to your feedback.
> >
> > On Thu, Jun 29, 2017 at 15:04, Matthias J. Sax (<matth...@confluent.io>)
> > wrote:
> >
> >> Damian,
> >>
> >>> resets everything and clears up
> >>>> the state stores.
> >>
> >> That's not correct. The reset tool does not touch the local store. For
> >> this, we have `KafkaStreams#cleanup` -- otherwise, you would need to run the
> >> tool in every machine you run an application instance.
> >>
> >> With regard to state, the tool only deletes the underlying changelog
> >> topics.
> >>
> >> Just to clarify. The idea of allowing a reset with a different offset is
> >> to clear all state but just use a different start offset (instead of
> >> zero). This is for use cases where your topic has a larger retention time
> >> than the amount of data you want to reprocess. But we always need to
> >> start with an empty state. (Resetting with consistent state is something
> >> we might do at some point, but it's much harder and not part of this KIP.)
> >>
> >>> @matthias, could we remove the ZK dependency from the streams reset
> tool
> >>> now?
> >>
> >> I think so. The new AdminClient provides the features we need, AFAIK. I
> >> guess we can piggyback this onto the KIP (we would need a KIP anyway
> >> because we deprecate `--zookeeper`, which is a public API change).
> >>
> >>
> >> I just want to point out, that even without ZK dependency, I prefer to
> >> wrap the consumer offset tool and keep two tools.
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 6/29/17 9:14 AM, Damian Guy wrote:
> >>> Hi,
> >>>
> >>> Thanks for the KIP. What is not clear is how this is going to handle
> >>> state stores? Right now the streams reset tool resets everything and
> >>> clears up
> >>> the state stores. What are we going to do if we reset to a particular
> >>> offset? If we clear the state then we've lost any previously aggregated
> >>> values (which may or may not be what is expected). If we don't clear
> >> them,
> >>> then we will end up with incorrect aggregates.
> >>>
> >>> @matthias, could we remove the ZK dependency from the streams reset
> tool
> >>> now?
> >>>
> >>> Thanks,
> >>> Damian
> >>>
> >>> On Thu, 29 Jun 2017 at 15:22 Jorge Esteban Quilcate Otoya <
> >>> quilcate.jo...@gmail.com> wrote:
> >>>
> >>>> You're right, I was not considering the Zookeeper dependency.
> >>>>
> >>>> I'm starting to like the idea of wrapping `reset-offset` from
> >>>> `streams-application-reset`.
> >>>>
> >>>> I will wait a bit for more feedback and work on a draft with these
> >>>> changes.
> >>>>
> >>>>
> >>>> On Wed, Jun 28, 2017 at 0:20, Matthias J. Sax wrote:

Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-07-14 Thread Jorge Esteban Quilcate Otoya
Hi,

KIP is updated.
Changes:
1. Approach discussed to keep both tools (streams application resetter and
consumer group reset offset).
2. Options have been aligned between both tools.
3. The Zookeeper option will be removed from streams-application-reset, and
replaced internally with the Kafka AdminClient.

Looking forward to your feedback.

On Thu, Jun 29, 2017 at 15:04, Matthias J. Sax (<matth...@confluent.io>)
wrote:

> Damian,
>
> > resets everything and clears up
> >> the state stores.
>
> That's not correct. The reset tool does not touch the local store. For this,
> we have `KafkaStreams#cleanup` -- otherwise, you would need to run the
> tool in every machine you run an application instance.
>
> With regard to state, the tool only deletes the underlying changelog
> topics.
>
> Just to clarify. The idea of allowing a reset with a different offset is
> to clear all state but just use a different start offset (instead of
> zero). This is for use cases where your topic has a larger retention time
> than the amount of data you want to reprocess. But we always need to
> start with an empty state. (Resetting with consistent state is something
> we might do at some point, but it's much harder and not part of this KIP.)
>
> > @matthias, could we remove the ZK dependency from the streams reset tool
> > now?
>
> I think so. The new AdminClient provides the features we need, AFAIK. I
> guess we can piggyback this onto the KIP (we would need a KIP anyway
> because we deprecate `--zookeeper`, which is a public API change).
>
>
> I just want to point out, that even without ZK dependency, I prefer to
> wrap the consumer offset tool and keep two tools.
>
>
> -Matthias
>
>
> On 6/29/17 9:14 AM, Damian Guy wrote:
> > Hi,
> >
> > Thanks for the KIP. What is not clear is how this is going to handle
> > state stores? Right now the streams reset tool resets everything and clears up
> > the state stores. What are we going to do if we reset to a particular
> > offset? If we clear the state then we've lost any previously aggregated
> > values (which may or may not be what is expected). If we don't clear
> them,
> > then we will end up with incorrect aggregates.
> >
> > @matthias, could we remove the ZK dependency from the streams reset tool
> > now?
> >
> > Thanks,
> > Damian
> >
> > On Thu, 29 Jun 2017 at 15:22 Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> >> You're right, I was not considering the Zookeeper dependency.
> >>
> >> I'm starting to like the idea of wrapping `reset-offset` from
> >> `streams-application-reset`.
> >>
> >> I will wait a bit for more feedback and work on a draft with these
> >> changes.
> >>
> >>
> >> On Wed, Jun 28, 2017 at 0:20, Matthias J. Sax (<matth...@confluent.io>)
> >> wrote:
> >>
> >>> I agree, that we should not duplicate functionality.
> >>>
> >>> However, I am worried that a non-streams user using the offset reset
> >>> tool might delete topics unintentionally (even if the changes are
> >>> pretty small). Also, currently the streams reset tool requires Zookeeper
> >>> while the offset reset tool does not -- I don't think we should add this
> >>> dependency to the offset reset tool.
> >>>
> >>> Thus, I think it might be better to keep both tools, but internally
> >>> rewrite the streams reset entry class to reuse as much as possible
> >>> from the offset reset tool. I.e., the streams tool would be a wrapper
> >>> around the offset tool and add some functionality it needs that is
> >>> Streams specific.
> >>>
> >>> I also think, that keeping separate tools for consumers and streams is
> a
> >>> good thing. We might want to add new features that don't apply to plain
> >>> consumers -- note, a Streams applications is more than just a client.
> >>>
> >>> WDYT?
> >>>
> >>> Would be good to get some feedback from others, too.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>>
> >>> On 6/27/17 9:05 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>> Thanks for the feedback Matthias!
> >>>>
> >>>> The main idea is to use only 1 tool to reset offsets and don't
> >> replicate
> >>>> functionality between tools.
> >>>> Reset Offset (KIP-122) tool not only reset but support execute the
> >> reset
>

Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-06-29 Thread Jorge Esteban Quilcate Otoya
You're right, I was not considering the Zookeeper dependency.

I'm starting to like the idea of wrapping `reset-offset` from
`streams-application-reset`.

I will wait a bit for more feedback and work on a draft with these changes.


On Wed, Jun 28, 2017 at 0:20, Matthias J. Sax (<matth...@confluent.io>)
wrote:

> I agree, that we should not duplicate functionality.
>
> However, I am worried that a non-streams user using the offset reset
> tool might delete topics unintentionally (even if the changes are pretty
> small). Also, currently the streams reset tool requires Zookeeper while
> the offset reset tool does not -- I don't think we should add this
> dependency to the offset reset tool.
>
> Thus, I think it might be better to keep both tools, but internally
> rewrite the streams reset entry class, to reuse as much as possible from
> the offset reset tool. Ie. the streams tool would be a wrapper around
> the offset tool and add some functionality it needs that is Streams
> specific.
>
> I also think, that keeping separate tools for consumers and streams is a
> good thing. We might want to add new features that don't apply to plain
> consumers -- note, a Streams applications is more than just a client.
>
> WDYT?
>
> Would be good to get some feedback from others, too.
>
>
> -Matthias
>
>
> On 6/27/17 9:05 AM, Jorge Esteban Quilcate Otoya wrote:
> > Thanks for the feedback Matthias!
> >
> > The main idea is to use only one tool to reset offsets and not replicate
> > functionality between tools.
> > The Reset Offset tool (KIP-122) not only resets offsets but also supports
> > exporting to and importing from files, etc.
> > If we extend the current tool (kafka-streams-application-reset.sh) we
> > will have to duplicate all this functionality as well.
> > Maybe another option is to move the current implementation into
> > `kafka-consumer-group` and add a new command `--reset-offset-streams`
> > with the current implementation's functionality, and add `--reset-offset`
> > options for input topics. Does this make sense?
> >
> >
> > On Mon, Jun 26, 2017 at 23:32, Matthias J. Sax (<matth...@confluent.io>)
> > wrote:
> >
> >> Jorge,
> >>
> >> thanks a lot for this KIP. Allowing to reset streams applications with an
> >> arbitrary start offset is something we have gotten multiple requests for
> >> already.
> >>
> >> A couple of clarification questions:
> >>
> >>  - why do you want to deprecate the current tool instead of extending
> >> the current tool with the stuff the offset reset tool can do (ie, use
> >> the offset reset tool internally)
> >>
> >>  - you suggest to extend the offset reset tool to replace the stream
> >> reset tool: how would the reset tool know if it is resetting a streams
> >> applications or a regular consumer group?
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 6/26/17 1:28 PM, Jorge Esteban Quilcate Otoya wrote:
> >>> Hi all,
> >>>
> >>> I'd like to start the discussion to add reset offset tooling for Stream
> >>> applications.
> >>> The KIP can be found here:
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
> >>>
> >>> Thanks,
> >>> Jorge.
> >>>
> >>
> >>
> >
>
>


Re: [DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-06-27 Thread Jorge Esteban Quilcate Otoya
Thanks for the feedback Matthias!

The main idea is to use only one tool to reset offsets and not replicate
functionality between tools.
The Reset Offset tool (KIP-122) not only resets offsets but also supports
exporting to and importing from files, etc.
If we extend the current tool (kafka-streams-application-reset.sh) we will
have to duplicate all this functionality as well.
Maybe another option is to move the current implementation into
`kafka-consumer-group` and add a new command `--reset-offset-streams` with
the current implementation's functionality, and add `--reset-offset` options
for input topics. Does this make sense?


On Mon, Jun 26, 2017 at 23:32, Matthias J. Sax (<matth...@confluent.io>)
wrote:

> Jorge,
>
> thanks a lot for this KIP. Allowing to reset streams applications with an
> arbitrary start offset is something we have gotten multiple requests for
> already.
>
> A couple of clarification questions:
>
>  - why do you want to deprecate the current tool instead of extending
> the current tool with the stuff the offset reset tool can do (ie, use
> the offset reset tool internally)
>
>  - you suggest to extend the offset reset tool to replace the stream
> reset tool: how would the reset tool know if it is resetting a streams
> applications or a regular consumer group?
>
>
>
> -Matthias
>
>
> On 6/26/17 1:28 PM, Jorge Esteban Quilcate Otoya wrote:
> > Hi all,
> >
> > I'd like to start the discussion to add reset offset tooling for Stream
> > applications.
> > The KIP can be found here:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
> >
> > Thanks,
> > Jorge.
> >
>
>


[DISCUSS] KIP-171: Extend Consumer Group Reset Offset for Stream Application

2017-06-26 Thread Jorge Esteban Quilcate Otoya
Hi all,

I'd like to start the discussion to add reset offset tooling for Stream
applications.
The KIP can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application

Thanks,
Jorge.


Re: KIP-162: Enable topic deletion by default

2017-05-26 Thread Jorge Esteban Quilcate Otoya
+1

On Fri, May 26, 2017 at 16:14, Matthias J. Sax ()
wrote:

> +1
>
> On 5/26/17 7:03 AM, Gwen Shapira wrote:
> > Hi Kafka developers, users and friends,
> >
> > I've added a KIP to improve our out-of-the-box usability a bit:
> > KIP-162: Enable topic deletion by default:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-162+-+Enable+topic+deletion+by+default
> >
> > Pretty simple :) Discussion and feedback are welcome.
> >
> > Gwen
> >
>
>


Re: [VOTE] KIP-122: Add Reset Consumer Group Offsets tooling

2017-04-05 Thread Jorge Esteban Quilcate Otoya
Thanks to everyone who voted and/or provided feedback!

The vote passed with 4 binding +1s (Gwen, Grant, Jason, Becket) and 5
non-binding +1s (Matthias, Vahid, Bill, Dong, Mickael).

On Wed, Apr 5, 2017 at 11:58, Jorge Esteban Quilcate Otoya (<
quilcate.jo...@gmail.com>) wrote:

> Thanks Ismael! Response inline.
>
> On Tue, Apr 4, 2017 at 17:43, Ismael Juma (<ism...@juma.me.uk>)
> wrote:
>
> Sorry for the delay Jorge. Responses inline.
>
> On Thu, Mar 23, 2017 at 5:56 PM, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > @Ismael, thanks for your feedback!
> > 1. Good point. I will add optional support for timezone as part of the
> > datetime input. But, when datetime is without timezone, would it be more
> > consistent to get the timezone from the cluster first and then reset
> based
> > on that value? Not sure if it is possible to get that info from the
> > cluster. But, in case that's not available, I could add a note to advise
> > that in case timezone is not specified the tool will get that value from
> > the client and it would be user's responsibility to validate that is
> > aligned with the server.
> >
>
> There's no way to get such data from the Cluster today. It's relatively
> common for servers to use UTC as their timezone though. Is there any value
> in using the client timezone? Today's apps typically have data from all
> over and what are the chances that the time zone from where the client is
> running is the correct one?
>
>
> You're right, it makes sense to use UTC by default and accept a timezone as
> part of the input value. I updated the KIP.
>
>
>
> > 2. Happy to add it to the KIP.
> >
>
> OK.
>
>
> > 3. This was part of the discussion thread; we ended up with `shift-by` to
> > avoid adding `offset` to each case and to make it a bit more consistent.
> >
>
> OK.
>
>
> > 4. I could remove CURRENT-OFFSET and CURRENT-LAG from the output, and
> leave
> > it as part of `describe` operation, if that's better.
> >
>
> It seems better to me as one would hope people would look at describe to
> find the current values.
>
> Done, KIP updated.
>
>
> > 5. Agree. At the beginning we considered `shift-plus` and `shift-minus`,
> > but agreed to join them into one option and accept +/- as input. Maybe
> > that's a better option?
> >
>
> Not sure, maybe it's fine as it is. I can't think of anything better, at
> least.
>
>
> Agreed, I also think is good enough as it is now.
>
>
>
> Ismael
>
>
> Jorge.
>
>


Re: [VOTE] KIP-122: Add Reset Consumer Group Offsets tooling

2017-04-05 Thread Jorge Esteban Quilcate Otoya
Thanks Ismael! Response inline.

On Tue, Apr 4, 2017 at 17:43, Ismael Juma (<ism...@juma.me.uk>)
wrote:

Sorry for the delay Jorge. Responses inline.

On Thu, Mar 23, 2017 at 5:56 PM, Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> @Ismael, thanks for your feedback!
> 1. Good point. I will add optional support for timezone as part of the
> datetime input. But, when datetime is without timezone, would it be more
> consistent to get the timezone from the cluster first and then reset based
> on that value? Not sure if it is possible to get that info from the
> cluster. But, in case that's not available, I could add a note to advise
> that in case timezone is not specified the tool will get that value from
> the client and it would be user's responsibility to validate that is
> aligned with the server.
>

There's no way to get such data from the Cluster today. It's relatively
common for servers to use UTC as their timezone though. Is there any value
in using the client timezone? Today's apps typically have data from all
over and what are the chances that the time zone from where the client is
running is the correct one?


You're right, it makes sense to use UTC by default and accept a timezone as
part of the input value. I updated the KIP.
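The rule agreed here can be illustrated with a short sketch. This is a minimal Python example, assuming ISO 8601 input; the helper name is hypothetical and not part of the tool:

```python
# Sketch: if the datetime string carries no timezone, interpret it as UTC
# rather than using the client's local timezone.
from datetime import datetime, timezone

def parse_reset_datetime(text: str) -> datetime:
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:              # no offset given -> assume UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt

print(parse_reset_datetime("2017-04-05T10:00:00"))        # 2017-04-05 10:00:00+00:00
print(parse_reset_datetime("2017-04-05T10:00:00+02:00"))  # 2017-04-05 10:00:00+02:00
```

Defaulting to UTC makes the behavior deterministic across machines, which is exactly the concern raised about relying on the client's default timezone in a distributed environment.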



> 2. Happy to add it to the KIP.
>

OK.


> 3. This was part of the discussion thread; we ended up with `shift-by` to
> avoid adding `offset` to each case and to make it a bit more consistent.
>

OK.


> 4. I could remove CURRENT-OFFSET and CURRENT-LAG from the output, and
leave
> it as part of `describe` operation, if that's better.
>

It seems better to me as one would hope people would look at describe to
find the current values.

Done, KIP updated.


> 5. Agree. At the beginning we considered `shift-plus` and `shift-minus`, but
> agreed to join them into one option and accept +/- as input. Maybe that's a
> better option?
>

Not sure, maybe it's fine as it is. I can't think of anything better, at
least.


Agreed, I also think is good enough as it is now.



Ismael


Jorge.


Re: [VOTE] KIP-122: Add Reset Consumer Group Offsets tooling

2017-03-23 Thread Jorge Esteban Quilcate Otoya
@Ismael, thanks for your feedback!
1. Good point. I will add optional support for timezone as part of the
datetime input. But, when datetime is without timezone, would it be more
consistent to get the timezone from the cluster first and then reset based
on that value? Not sure if it is possible to get that info from the
cluster. But, in case that's not available, I could add a note to advise
that when the timezone is not specified the tool will get that value from
the client, and it would be the user's responsibility to validate that it is
aligned with the server.
2. Happy to add it to the KIP.
3. This was part of the discussion thread; we ended up with `shift-by` to
avoid adding `offset` to each case and to make it a bit more consistent.
4. I could remove CURRENT-OFFSET and CURRENT-LAG from the output, and leave
it as part of `describe` operation, if that's better.
5. Agree. At the beginning we considered `shift-plus` and `shift-minus`, but
agreed to join them into one option and accept +/- as input. Maybe that's a
better option?
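The joined `shift-by` option taking a signed value can be sketched in a few lines. The function name and clamping behavior below are an assumption for illustration (a real implementation must stay within the partition's valid offset range, since the committed offset cannot precede the earliest or exceed the latest available offset):

```python
# Sketch of `shift-by` with a signed argument: positive shifts forward,
# negative shifts backward, clamped to [earliest, latest].
def shift_offset(current: int, shift: int, earliest: int, latest: int) -> int:
    return max(earliest, min(latest, current + shift))

print(shift_offset(100, +10, earliest=0, latest=500))    # 110
print(shift_offset(100, -10, earliest=0, latest=500))    # 90
print(shift_offset(100, -500, earliest=50, latest=500))  # 50 (clamped)
```

Accepting `+`/`-` in a single option keeps the CLI surface small while covering both the former `shift-plus` and `shift-minus` cases.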

On Thu, Mar 23, 2017 at 17:17, Ismael Juma (<ism...@juma.me.uk>)
wrote:

Hi Jorge,

Thanks for the detailed KIP. The tool looks very useful. A few comments:

1. We are using the default timezone of the client for the specified date.
This seems a bit error prone. Would it be better to require the users to
specify the time zone as part of the date time? We should at least allow
it, but my experience when it comes to using the default time zone in a
distributed environment is not great.
2. It seems like we are using the ISO 8601 format for date time and
duration. It would be good to mention that.
3. `shift-by` should perhaps be `shift-offset-by` to be a bit clearer.
4. Printing the current offset via reset-offsets is a bit odd, can we not
use a different option for that?
5. It's a bit odd that `by-duration` subtracts while `shift-by` moves
forward. It would be nice if the name made it clear that `by-duration` is
subtracting, but I have no good suggestions, so maybe that's the best we
can do.

Ismael

On Fri, Feb 24, 2017 at 12:46 AM, Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hi All,
>
> It seems that there is no further concern with the KIP-122.
> At this point we would like to start the voting process.
>
> The KIP can be found here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 122%3A+Add+Reset+Consumer+Group+Offsets+tooling
>
>
> Thanks!
>
> Jorge.
>


Re: [VOTE] KIP-122: Add Reset Consumer Group Offsets tooling

2017-03-22 Thread Jorge Esteban Quilcate Otoya
@Jason, thanks for your feedback!
You're right, we are not considering the old consumer, since we rely on the
KafkaConsumer#seek operations. I'm happy to update the KIP to make this
explicit.
About the second comment: I suppose that it would work, but I would like to
include it in the test cases first. Do you know if this scenario has been
tested in other clients?

Jorge


On Wed, Mar 22, 2017 at 5:23, Dong Lin (<lindon...@gmail.com>) wrote:

> Thanks for the KIP!
>
> +1 (non-binding)
>
> On Tue, Mar 21, 2017 at 6:24 PM, Becket Qin <becket@gmail.com> wrote:
>
> > +1
> >
> > Thanks for the KIP. The tool is very useful.
> >
> > On Tue, Mar 21, 2017 at 4:46 PM, Jason Gustafson <ja...@confluent.io>
> > wrote:
> >
> > > +1 This looks super useful! Might be worth mentioning somewhere
> > > compatibility with the old consumer. It looks like offsets in zk are
> not
> > > covered, which seems fine, but probably should be explicitly noted.
> Maybe
> > > you can also add a note saying that the tool can be used for old
> > consumers
> > > which have offsets stored in Kafka, but it will not protect against an
> > > active consumer group in that case?
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Tue, Mar 14, 2017 at 10:13 AM, Dong Lin <lindon...@gmail.com>
> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Tue, Mar 14, 2017 at 8:53 AM, Bill Bejeck <bbej...@gmail.com>
> > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Tue, Mar 14, 2017 at 11:50 AM, Grant Henke <ghe...@cloudera.com
> >
> > > > wrote:
> > > > >
> > > > > > +1. Agreed. This is a great tool to have.
> > > > > >
> > > > > > On Tue, Mar 14, 2017 at 12:33 AM, Gwen Shapira <
> g...@confluent.io>
> > > > > wrote:
> > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > Nice job - this is going to be super useful.
> > > > > > >
> > > > > > > On Thu, Feb 23, 2017 at 4:46 PM, Jorge Esteban Quilcate Otoya <
> > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > It seems that there is no further concern with the KIP-122.
> > > > > > > > At this point we would like to start the voting process.
> > > > > > > >
> > > > > > > > The KIP can be found here:
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > > 122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> > > > > > > >
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Jorge.
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > *Gwen Shapira*
> > > > > > > Product Manager | Confluent
> > > > > > > 650.450.2760 | @gwenshap
> > > > > > > Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> > > > > > > <http://www.confluent.io/blog>
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Grant Henke
> > > > > > Software Engineer | Cloudera
> > > > > > gr...@cloudera.com | twitter.com/gchenke |
> > > linkedin.com/in/granthenke
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-03-02 Thread Jorge Esteban Quilcate Otoya
Hi,

I have incorporated the latest revisions into the KIP and created a PR to
check the implementation details.
If there are no more issues, note that the VOTE thread has already started.

Looking forward to your comments.

Jorge.

On Tue, Feb 28, 2017 at 19:46, Vahid S Hashemian (<
vahidhashem...@us.ibm.com>) wrote:

Thanks Jorge for addressing my suggestions. Looks good to me.

--Vahid



From:   Jorge Esteban Quilcate Otoya <quilcate.jo...@gmail.com>
To: dev@kafka.apache.org
Date:   02/27/2017 01:57 AM
Subject:Re: KIP-122: Add a tool to Reset Consumer Group Offsets



@Vahid: it makes sense to add the "new lag" info, IMO; I will update the KIP.

@Becket:

1. About deleting, I think ConsumerGroupCommand already has an option to
delete Group information by topic. From delete docs: "Pass in groups to
delete topic partition offsets and ownership information over the entire
consumer group.". Let me know if this solves is enough for your case, of
we
can consider to add something to the Reset Offsets tool.

2. Yes, for instance in the case of active consumers, the tool will
validate that there are no active consumers to avoid race conditions. I
have added some code snippets to the wiki, thanks for pointing that out.

On Sat, Feb 25, 2017 at 0:29, Becket Qin (<becket@gmail.com>)
wrote:

> Thanks for the KIP Jorge. I think this is a useful KIP. I haven't read
the
> KIP in detail yet, some comments from a quick review:
>
> 1. At a glance, it seems that there is no delete option. At LinkedIn we
> identified some cases where users want to delete the committed offset of a
> group. It would be good to include that as well.
>
> 2. It seems the KIP is missing some necessary implementation key points.
> e.g. how would the tool commit offsets for a consumer group; does the
> broker need to know this is a special tool instead of an active consumer
in
> the group (the generation check will be made on offset commit)? They are
> probably in your proof of concept code. Could you add them to the wiki
as
> well?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Feb 24, 2017 at 1:19 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Thanks Jorge for addressing my question/suggestion.
> >
> > One last thing. I noticed is that in the example you have for the
"plan"
> > option
> > (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 122%3A+Add+Reset+Consumer+Group+Offsets+tooling#KIP-122:
> > AddResetConsumerGroupOffsetstooling-ExecutionOptions
> > )
> > under "Description" column, you put 0 for lag. So I assume that is the
> > current lag being reported, and not the new lag. Might be helpful to
> > explicitly specify that (i.e. CURRENT-LAG) in the column header.
> > The other option is to report both current and new lags, but I
understand
> > if we don't want to do that since it's rather redundant info.
> >
> > Thanks again.
> > --Vahid
> >
> >
> >
> > From:   Jorge Esteban Quilcate Otoya <quilcate.jo...@gmail.com>
> > To: dev@kafka.apache.org
> > Date:   02/24/2017 12:47 PM
> > Subject:Re: KIP-122: Add a tool to Reset Consumer Group
Offsets
> >
> >
> >
> > Hi Vahid,
> >
> > Thanks for your comments. Check my answers below:
> >
> > On Fri, Feb 24, 2017 at 19:41, Vahid S Hashemian (<
> > vahidhashem...@us.ibm.com>) wrote:
> >
> > > Hi Jorge,
> > >
> > > Thanks for the useful KIP.
> > >
> > > I have a question regarding the proposed "plan" option.
> > > The "current offset" and "lag" values of a topic partition are
> > meaningful
> > > within a consumer group. In other words, different consumer groups
> could
> > > have different values for these properties of each topic partition.
> > > I don't see that reflected in the discussion around the "plan"
option.
> > > Unless we are assuming a "--group" option is also provided by user
> > (which
> > > is not clear from the KIP if that is the case).
> > >
> >
> > I have added an additional comment to state that this options will
> require
> > a "group" argument.
> > It is considered to affect only one Consumer Group.
> >
> >
> > >
> > > Also, I was wondering if you can provide at least one full command
> > example
> > > for each of the "plan", "execute", and "export" options. They would
> > > definitely help in understanding some of the details.
> > >
> > >
> > Added to the KIP.

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-27 Thread Jorge Esteban Quilcate Otoya
@Vahid: makes sense to add "new lag" info IMO, I will update the KIP.

@Becket:

1. About deleting, I think ConsumerGroupCommand already has an option to
delete Group information by topic. From delete docs: "Pass in groups to
delete topic partition offsets and ownership information over the entire
consumer group.". Let me know if this solves is enough for your case, of we
can consider to add something to the Reset Offsets tool.

2. Yes — for instance, the tool will validate that there are no active
consumers in the group, to avoid race conditions. I have added some code
snippets to the wiki, thanks for pointing that out.

El sáb., 25 feb. 2017 a las 0:29, Becket Qin (<becket@gmail.com>)
escribió:

> Thanks for the KIP Jorge. I think this is a useful KIP. I haven't read the
> KIP in detail yet, some comments from a quick review:
>
> 1. A glance at it it seems that there is no delete option. At LinkedIn we
> identified some cases that users want to delete the committed offset of a
> group. It would be good to include that as well.
>
> 2. It seems the KIP is missing some necessary implementation key points.
> e.g. how would the tool to commit offsets for a consumer group, does the
> broker need to know this is a special tool instead of an active consumer in
> the group (the generation check will be made on offset commit)? They are
> probably in your proof of concept code. Could you add them to the wiki as
> well?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Feb 24, 2017 at 1:19 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Thanks Jorge for addressing my question/suggestion.
> >
> > One last thing. I noticed is that in the example you have for the "plan"
> > option
> > (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 122%3A+Add+Reset+Consumer+Group+Offsets+tooling#KIP-122:
> > AddResetConsumerGroupOffsetstooling-ExecutionOptions
> > )
> > under "Description" column, you put 0 for lag. So I assume that is the
> > current lag being reported, and not the new lag. Might be helpful to
> > explicitly specify that (i.e. CURRENT-LAG) in the column header.
> > The other option is to report both current and new lags, but I understand
> > if we don't want to do that since it's rather redundant info.
> >
> > Thanks again.
> > --Vahid
> >
> >
> >
> > From:   Jorge Esteban Quilcate Otoya <quilcate.jo...@gmail.com>
> > To: dev@kafka.apache.org
> > Date:   02/24/2017 12:47 PM
> > Subject:Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> >
> >
> >
> > Hi Vahid,
> >
> > Thanks for your comments. Check my answers below:
> >
> > El vie., 24 feb. 2017 a las 19:41, Vahid S Hashemian (<
> > vahidhashem...@us.ibm.com>) escribió:
> >
> > > Hi Jorge,
> > >
> > > Thanks for the useful KIP.
> > >
> > > I have a question regarding the proposed "plan" option.
> > > The "current offset" and "lag" values of a topic partition are
> > meaningful
> > > within a consumer group. In other words, different consumer groups
> could
> > > have different values for these properties of each topic partition.
> > > I don't see that reflected in the discussion around the "plan" option.
> > > Unless we are assuming a "--group" option is also provided by user
> > (which
> > > is not clear from the KIP if that is the case).
> > >
> >
> > I have added an additional comment to state that this options will
> require
> > a "group" argument.
> > It is considered to affect only one Consumer Group.
> >
> >
> > >
> > > Also, I was wondering if you can provide at least one full command
> > example
> > > for each of the "plan", "execute", and "export" options. They would
> > > definitely help in understanding some of the details.
> > >
> > >
> > Added to the KIP.
> >
> >
> > > Sorry for the delayed question/suggestion. I hope they make sense.
> > >
> > > Thanks.
> > > --Vahid
> > >
> > >
> > >
> > > From:   Jorge Esteban Quilcate Otoya <quilcate.jo...@gmail.com>
> > > To: dev@kafka.apache.org
> > > Date:   02/24/2017 09:51 AM
> > > Subject:Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> > >
> > >
> > >
> > > Great! KIP updated.
> > >
> > >
> > >

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-24 Thread Jorge Esteban Quilcate Otoya
Hi Vahid,

Thanks for your comments. Check my answers below:

El vie., 24 feb. 2017 a las 19:41, Vahid S Hashemian (<
vahidhashem...@us.ibm.com>) escribió:

> Hi Jorge,
>
> Thanks for the useful KIP.
>
> I have a question regarding the proposed "plan" option.
> The "current offset" and "lag" values of a topic partition are meaningful
> within a consumer group. In other words, different consumer groups could
> have different values for these properties of each topic partition.
> I don't see that reflected in the discussion around the "plan" option.
> Unless we are assuming a "--group" option is also provided by user (which
> is not clear from the KIP if that is the case).
>

I have added an additional comment to state that this option will require
a "group" argument.
It is considered to affect only one Consumer Group.


>
> Also, I was wondering if you can provide at least one full command example
> for each of the "plan", "execute", and "export" options. They would
> definitely help in understanding some of the details.
>
>
Added to the KIP.


> Sorry for the delayed question/suggestion. I hope they make sense.
>
> Thanks.
> --Vahid
>
>
>
> From:   Jorge Esteban Quilcate Otoya <quilcate.jo...@gmail.com>
> To: dev@kafka.apache.org
> Date:   02/24/2017 09:51 AM
> Subject:Re: KIP-122: Add a tool to Reset Consumer Group Offsets
>
>
>
> Great! KIP updated.
>
>
>
> El vie., 24 feb. 2017 a las 18:22, Matthias J. Sax
> (<matth...@confluent.io>)
> escribió:
>
> > I like this!
> >
> > --by-duration and --shift-by
> >
> >
> > -Matthias
> >
> > On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > > Renaming to --by-duration LGTM
> > >
> > > Not sure about changing it to --shift-by-duration because we could end
> up
> > > with the same redundancy as before with reset: --reset-offsets
> > > --reset-to-*.
> > >
> > > Maybe changing --shift-offset-by to --shift-by 'n' could make it
> > consistent
> > > enough?
> > >
> > >
> > > El vie., 24 feb. 2017 a las 6:39, Matthias J. Sax (<
> > matth...@confluent.io>)
> > > escribió:
> > >
> > >> I just read the update KIP once more.
> > >>
> > >> I would suggest to rename --to-duration to --by-duration
> > >>
> > >> Or as a second idea, rename --to-duration to --shift-by-duration and
> at
> > >> the same time rename --shift-offset-by to --shift-by-offset
> > >>
> > >> Not sure what the best option is, but naming would be more consistent
> > IMHO.
> > >>
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > >>> Hi All,
> > >>>
> > >>> If there are no more concerns, I'd like to start vote for this KIP.
> > >>>
> > >>> Thanks!
> > >>> Jorge.
> > >>>
> > >>> El jue., 23 feb. 2017 a las 22:50, Jorge Esteban Quilcate Otoya (<
> > >>> quilcate.jo...@gmail.com>) escribió:
> > >>>
> > >>>> Oh ok :)
> > >>>>
> > >>>> So, we can keep `--topic t1:1,2,3`
> > >>>>
> > >>>> I think with this one we have most of the feedback applied. I will
> > >> update
> > >>>> the KIP with this change.
> > >>>>
> > >>>> El jue., 23 feb. 2017 a las 22:38, Matthias J. Sax (<
> > >> matth...@confluent.io>)
> > >>>> escribió:
> > >>>>
> > >>>> Sounds reasonable.
> > >>>>
> > >>>> If we have multiple --topic arguments, it does also not matter if
> we
> > use
> > >>>> t1:1,2 or t2=1,2
> > >>>>
> > >>>> I just suggested '=' because I wanted use ':' to chain multiple
> > topics.
> > >>>>
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > >>>>> Yeap, `--topic t1=1,2`LGTM
> > >>>>>
> > >>>>> Don't have idea neither about getting rid of repeated --topic, but
> > >>>> --group
> >>>>> is also repeated in the case of deletion, so it could be ok to have
> >>>>> repeated --topic arguments.

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-24 Thread Jorge Esteban Quilcate Otoya
Great! KIP updated.



El vie., 24 feb. 2017 a las 18:22, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> I like this!
>
> --by-duration and --shift-by
>
>
> -Matthias
>
> On 2/24/17 12:57 AM, Jorge Esteban Quilcate Otoya wrote:
> > Renaming to --by-duration LGTM
> >
> > Not sure about changing it to --shift-by-duration because we could end up
> > with the same redundancy as before with reset: --reset-offsets
> > --reset-to-*.
> >
> > Maybe changing --shift-offset-by to --shift-by 'n' could make it
> consistent
> > enough?
> >
> >
> > El vie., 24 feb. 2017 a las 6:39, Matthias J. Sax (<
> matth...@confluent.io>)
> > escribió:
> >
> >> I just read the update KIP once more.
> >>
> >> I would suggest to rename --to-duration to --by-duration
> >>
> >> Or as a second idea, rename --to-duration to --shift-by-duration and at
> >> the same time rename --shift-offset-by to --shift-by-offset
> >>
> >> Not sure what the best option is, but naming would be more consistent
> IMHO.
> >>
> >>
> >>
> >> -Matthias
> >>
> >> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> >>> Hi All,
> >>>
> >>> If there are no more concerns, I'd like to start vote for this KIP.
> >>>
> >>> Thanks!
> >>> Jorge.
> >>>
> >>> El jue., 23 feb. 2017 a las 22:50, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jo...@gmail.com>) escribió:
> >>>
> >>>> Oh ok :)
> >>>>
> >>>> So, we can keep `--topic t1:1,2,3`
> >>>>
> >>>> I think with this one we have most of the feedback applied. I will
> >> update
> >>>> the KIP with this change.
> >>>>
> >>>> El jue., 23 feb. 2017 a las 22:38, Matthias J. Sax (<
> >> matth...@confluent.io>)
> >>>> escribió:
> >>>>
> >>>> Sounds reasonable.
> >>>>
> >>>> If we have multiple --topic arguments, it does also not matter if we
> use
> >>>> t1:1,2 or t2=1,2
> >>>>
> >>>> I just suggested '=' because I wanted use ':' to chain multiple
> topics.
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>> Yeap, `--topic t1=1,2`LGTM
> >>>>>
> >>>>> Don't have idea neither about getting rid of repeated --topic, but
> >>>> --group
> >>>>> is also repeated in the case of deletion, so it could be ok to have
> >>>>> repeated --topic arguments.
> >>>>>
> >>>>> El jue., 23 feb. 2017 a las 19:14, Matthias J. Sax (<
> >>>> matth...@confluent.io>)
> >>>>> escribió:
> >>>>>
> >>>>>> So you suggest to merge "scope options" --topics, --topic, and
> >>>>>> --partitions into a single option? Sound good to me.
> >>>>>>
> >>>>>> I like the compact way to express it, ie,
> topicname:list-of-partitions
> >>>>>> with "all partitions" if not partitions are specified. It's quite
> >>>>>> intuitive to use.
> >>>>>>
> >>>>>> Just wondering, if we could get rid of the repeated --topic option;
> >> it's
> >>>>>> somewhat verbose. Have no good idea though who to improve it.
> >>>>>>
> >>>>>> If you concatenate multiple topic, we need one more character that
> is
> >>>>>> not allowed in topic names to separate the topics:
> >>>>>>
> >>>>>>> invalidChars = {'/', '\\', ',', '\u', ':', '"', '\'', ';', '*',
> >>>>>> '?', ' ', '\t', '\r', '\n', '='};
> >>>>>>
> >>>>>> maybe
> >>>>>>
> >>>>>> --topics t1=1,2,3:t2:t3=3
> >>>>>>
> >>>>>> use '=' to specify partitions (instead of ':' as you proposed) and
> ':'
> >>>>>> to separate topics? All other characters seem to be worse to use to
> >> me.
> >>>>>> But maybe you have a better idea.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -Matthias

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-24 Thread Jorge Esteban Quilcate Otoya
Renaming to --by-duration LGTM

Not sure about changing it to --shift-by-duration because we could end up
with the same redundancy as before with reset: --reset-offsets
--reset-to-*.

Maybe changing --shift-offset-by to --shift-by 'n' could make it consistent
enough?


El vie., 24 feb. 2017 a las 6:39, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> I just read the update KIP once more.
>
> I would suggest to rename --to-duration to --by-duration
>
> Or as a second idea, rename --to-duration to --shift-by-duration and at
> the same time rename --shift-offset-by to --shift-by-offset
>
> Not sure what the best option is, but naming would be more consistent IMHO.
>
>
>
> -Matthias
>
> On 2/23/17 4:42 PM, Jorge Esteban Quilcate Otoya wrote:
> > Hi All,
> >
> > If there are no more concerns, I'd like to start vote for this KIP.
> >
> > Thanks!
> > Jorge.
> >
> > El jue., 23 feb. 2017 a las 22:50, Jorge Esteban Quilcate Otoya (<
> > quilcate.jo...@gmail.com>) escribió:
> >
> >> Oh ok :)
> >>
> >> So, we can keep `--topic t1:1,2,3`
> >>
> >> I think with this one we have most of the feedback applied. I will
> update
> >> the KIP with this change.
> >>
> >> El jue., 23 feb. 2017 a las 22:38, Matthias J. Sax (<
> matth...@confluent.io>)
> >> escribió:
> >>
> >> Sounds reasonable.
> >>
> >> If we have multiple --topic arguments, it does also not matter if we use
> >> t1:1,2 or t2=1,2
> >>
> >> I just suggested '=' because I wanted use ':' to chain multiple topics.
> >>
> >>
> >> -Matthias
> >>
> >> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> Yeap, `--topic t1=1,2`LGTM
> >>>
> >>> Don't have idea neither about getting rid of repeated --topic, but
> >> --group
> >>> is also repeated in the case of deletion, so it could be ok to have
> >>> repeated --topic arguments.
> >>>
> >>> El jue., 23 feb. 2017 a las 19:14, Matthias J. Sax (<
> >> matth...@confluent.io>)
> >>> escribió:
> >>>
> >>>> So you suggest to merge "scope options" --topics, --topic, and
> >>>> --partitions into a single option? Sound good to me.
> >>>>
> >>>> I like the compact way to express it, ie, topicname:list-of-partitions
> >>>> with "all partitions" if not partitions are specified. It's quite
> >>>> intuitive to use.
> >>>>
> >>>> Just wondering, if we could get rid of the repeated --topic option;
> it's
> >>>> somewhat verbose. Have no good idea though who to improve it.
> >>>>
> >>>> If you concatenate multiple topic, we need one more character that is
> >>>> not allowed in topic names to separate the topics:
> >>>>
> >>>>> invalidChars = {'/', '\\', ',', '\u', ':', '"', '\'', ';', '*',
> >>>> '?', ' ', '\t', '\r', '\n', '='};
> >>>>
> >>>> maybe
> >>>>
> >>>> --topics t1=1,2,3:t2:t3=3
> >>>>
> >>>> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >>>> to separate topics? All other characters seem to be worse to use to
> me.
> >>>> But maybe you have a better idea.
> >>>>
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>>>> @Matthias about the point 9:
> >>>>>
> >>>>> What about keeping only the --topic option, and support this format:
> >>>>>
> >>>>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>>>
> >>>>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>>>> partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
> >>>> with
> >>>>> only partition 2.
> >>>>>
> >>>>> Jorge.
> >>>>>
> >>>>> El mar., 21 feb. 2017 a las 11:11, Jorge Esteban Quilcate Otoya (<
> >>>>> quilcate.jo...@gmail.com>) escribió:
> >>>>>
> >>>>>> Thanks for the feedback Matthias.
> >>>>>>
> >>>>>> * 1. You're right. I'll reorder the scenarios.

[VOTE] KIP-122: Add Reset Consumer Group Offsets tooling

2017-02-23 Thread Jorge Esteban Quilcate Otoya
Hi All,

It seems that there is no further concern with the KIP-122.
At this point we would like to start the voting process.

The KIP can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling


Thanks!

Jorge.


Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-23 Thread Jorge Esteban Quilcate Otoya
Hi All,

If there are no more concerns, I'd like to start vote for this KIP.

Thanks!
Jorge.

El jue., 23 feb. 2017 a las 22:50, Jorge Esteban Quilcate Otoya (<
quilcate.jo...@gmail.com>) escribió:

> Oh ok :)
>
> So, we can keep `--topic t1:1,2,3`
>
> I think with this one we have most of the feedback applied. I will update
> the KIP with this change.
>
> El jue., 23 feb. 2017 a las 22:38, Matthias J. Sax (<matth...@confluent.io>)
> escribió:
>
> Sounds reasonable.
>
> If we have multiple --topic arguments, it does also not matter if we use
> t1:1,2 or t2=1,2
>
> I just suggested '=' because I wanted use ':' to chain multiple topics.
>
>
> -Matthias
>
> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > Yeap, `--topic t1=1,2`LGTM
> >
> > Don't have idea neither about getting rid of repeated --topic, but
> --group
> > is also repeated in the case of deletion, so it could be ok to have
> > repeated --topic arguments.
> >
> > El jue., 23 feb. 2017 a las 19:14, Matthias J. Sax (<
> matth...@confluent.io>)
> > escribió:
> >
> >> So you suggest to merge "scope options" --topics, --topic, and
> >> --partitions into a single option? Sound good to me.
> >>
> >> I like the compact way to express it, ie, topicname:list-of-partitions
> >> with "all partitions" if not partitions are specified. It's quite
> >> intuitive to use.
> >>
> >> Just wondering, if we could get rid of the repeated --topic option; it's
> >> somewhat verbose. Have no good idea though who to improve it.
> >>
> >> If you concatenate multiple topic, we need one more character that is
> >> not allowed in topic names to separate the topics:
> >>
> >>> invalidChars = {'/', '\\', ',', '\u', ':', '"', '\'', ';', '*',
> >> '?', ' ', '\t', '\r', '\n', '='};
> >>
> >> maybe
> >>
> >> --topics t1=1,2,3:t2:t3=3
> >>
> >> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >> to separate topics? All other characters seem to be worse to use to me.
> >> But maybe you have a better idea.
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> @Matthias about the point 9:
> >>>
> >>> What about keeping only the --topic option, and support this format:
> >>>
> >>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>
> >>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>> partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
> >> with
> >>> only partition 2.
> >>>
> >>> Jorge.
> >>>
> >>> El mar., 21 feb. 2017 a las 11:11, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jo...@gmail.com>) escribió:
> >>>
> >>>> Thanks for the feedback Matthias.
> >>>>
> >>>> * 1. You're right. I'll reorder the scenarios.
> >>>>
> >>>> * 2. Agree. I'll update the KIP.
> >>>>
> >>>> * 3. I like it, updating to `reset-offsets`
> >>>>
> >>>> * 4. Agree, removing the `reset-` part
> >>>>
> >>>> * 5. Yes, 1.e option without --execute or --export will print out
> >> current
> >>>> offset, and the new offset, that will be the same. The use-case of
> this
> >>>> option is to use it in combination with --export mostly and have a
> >> current
> >>>> 'checkpoint' to reset later. I will add to the KIP how the output
> should
> >>>> looks like.
> >>>>
> >>>> * 6. Considering 4., I will update it to `--to-offset`
> >>>>
> >>>> * 7. I like the idea to unify these options (plus, minus).
> >>>> `shift-offsets-by` is a good option, but I will like some more
> feedback
> >>>> here about the name. I will update the KIP in the meantime.
> >>>>
> >>>> * 8. Yes, discussed in 9.
> >>>>
> >>>> * 9. Agree. I'll love some feedback here. `topic` is already used by
> >>>> `delete`, and we can add `--all-topics` to consider all
> >> topics/partitions
> >>>> assigned to a group. How could we define specific topics/partitions?
> >>>>
> >>>> * 10. Haven't thought about it, but make sense.
> >>>> <topic>,<partition>,<offset> would be enough.

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-23 Thread Jorge Esteban Quilcate Otoya
Oh ok :)

So, we can keep `--topic t1:1,2,3`

I think with this one we have most of the feedback applied. I will update
the KIP with this change.

El jue., 23 feb. 2017 a las 22:38, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> Sounds reasonable.
>
> If we have multiple --topic arguments, it does also not matter if we use
> t1:1,2 or t2=1,2
>
> I just suggested '=' because I wanted use ':' to chain multiple topics.
>
>
> -Matthias
>
> On 2/23/17 10:49 AM, Jorge Esteban Quilcate Otoya wrote:
> > Yeap, `--topic t1=1,2`LGTM
> >
> > Don't have idea neither about getting rid of repeated --topic, but
> --group
> > is also repeated in the case of deletion, so it could be ok to have
> > repeated --topic arguments.
> >
> > El jue., 23 feb. 2017 a las 19:14, Matthias J. Sax (<
> matth...@confluent.io>)
> > escribió:
> >
> >> So you suggest to merge "scope options" --topics, --topic, and
> >> --partitions into a single option? Sound good to me.
> >>
> >> I like the compact way to express it, ie, topicname:list-of-partitions
> >> with "all partitions" if not partitions are specified. It's quite
> >> intuitive to use.
> >>
> >> Just wondering, if we could get rid of the repeated --topic option; it's
> >> somewhat verbose. Have no good idea though who to improve it.
> >>
> >> If you concatenate multiple topic, we need one more character that is
> >> not allowed in topic names to separate the topics:
> >>
> >>> invalidChars = {'/', '\\', ',', '\u', ':', '"', '\'', ';', '*',
> >> '?', ' ', '\t', '\r', '\n', '='};
> >>
> >> maybe
> >>
> >> --topics t1=1,2,3:t2:t3=3
> >>
> >> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> >> to separate topics? All other characters seem to be worse to use to me.
> >> But maybe you have a better idea.
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> @Matthias about the point 9:
> >>>
> >>> What about keeping only the --topic option, and support this format:
> >>>
> >>> `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >>>
> >>> In this case topics t1, t2, and t3 will be selected: topic t1 with
> >>> partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
> >> with
> >>> only partition 2.
> >>>
> >>> Jorge.
> >>>
> >>> El mar., 21 feb. 2017 a las 11:11, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jo...@gmail.com>) escribió:
> >>>
> >>>> Thanks for the feedback Matthias.
> >>>>
> >>>> * 1. You're right. I'll reorder the scenarios.
> >>>>
> >>>> * 2. Agree. I'll update the KIP.
> >>>>
> >>>> * 3. I like it, updating to `reset-offsets`
> >>>>
> >>>> * 4. Agree, removing the `reset-` part
> >>>>
> >>>> * 5. Yes, 1.e option without --execute or --export will print out
> >> current
> >>>> offset, and the new offset, that will be the same. The use-case of
> this
> >>>> option is to use it in combination with --export mostly and have a
> >> current
> >>>> 'checkpoint' to reset later. I will add to the KIP how the output
> should
> >>>> looks like.
> >>>>
> >>>> * 6. Considering 4., I will update it to `--to-offset`
> >>>>
> >>>> * 7. I like the idea to unify these options (plus, minus).
> >>>> `shift-offsets-by` is a good option, but I will like some more
> feedback
> >>>> here about the name. I will update the KIP in the meantime.
> >>>>
> >>>> * 8. Yes, discussed in 9.
> >>>>
> >>>> * 9. Agree. I'll love some feedback here. `topic` is already used by
> >>>> `delete`, and we can add `--all-topics` to consider all
> >> topics/partitions
> >>>> assigned to a group. How could we define specific topics/partitions?
> >>>>
> >>>> * 10. Haven't thought about it, but make sense.
> >>>> <topic>,<partition>,<offset> would be enough.
> >>>>
> >>>> * 11. Agree. Solved with 10.
> >>>>
> >>>> Also, I have a couple of changes to mention:
> >>>>
> >>>> 1. I have add a reference to the branch where I'm working on this KIP.

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-23 Thread Jorge Esteban Quilcate Otoya
Yeap, `--topic t1=1,2` LGTM

No idea either about getting rid of the repeated --topic, but --group
is also repeated in the case of deletion, so it could be ok to have
repeated --topic arguments.
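For reference, the `--topic t1:0,1,2` scope syntax discussed in this thread is simple to decode. The sketch below is illustrative Python only (the actual tool ships in kafka-consumer-groups.sh and is written in Scala); the helper name `parse_topic_args` is made up for this example:

```python
def parse_topic_args(topic_args):
    """Parse repeated --topic values like 't1:0,1,2' or 't2' into a dict.

    A topic given without a ':<partitions>' suffix maps to None, meaning
    "all partitions of that topic".
    """
    scope = {}
    for arg in topic_args:
        if ":" in arg:
            topic, partitions = arg.split(":", 1)
            scope[topic] = [int(p) for p in partitions.split(",")]
        else:
            scope[arg] = None  # all partitions of this topic
    return scope

# The example from the thread: t1 partitions 0,1,2; t2 all; t3 partition 2.
print(parse_topic_args(["t1:0,1,2", "t2", "t3:2"]))
# → {'t1': [0, 1, 2], 't2': None, 't3': [2]}
```

Note that because ':' and ',' are both invalid characters in topic names, this split is unambiguous.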

El jue., 23 feb. 2017 a las 19:14, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> So you suggest to merge "scope options" --topics, --topic, and
> --partitions into a single option? Sound good to me.
>
> I like the compact way to express it, ie, topicname:list-of-partitions
> with "all partitions" if not partitions are specified. It's quite
> intuitive to use.
>
> Just wondering, if we could get rid of the repeated --topic option; it's
> somewhat verbose. Have no good idea though who to improve it.
>
> If you concatenate multiple topic, we need one more character that is
> not allowed in topic names to separate the topics:
>
> > invalidChars = {'/', '\\', ',', '\u', ':', '"', '\'', ';', '*',
> '?', ' ', '\t', '\r', '\n', '='};
>
> maybe
>
> --topics t1=1,2,3:t2:t3=3
>
> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> to separate topics? All other characters seem to be worse to use to me.
> But maybe you have a better idea.
>
>
>
> -Matthias
>
>
> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > @Matthias about the point 9:
> >
> > What about keeping only the --topic option, and support this format:
> >
> > `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >
> > In this case topics t1, t2, and t3 will be selected: topic t1 with
> > partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
> with
> > only partition 2.
> >
> > Jorge.
> >
> > El mar., 21 feb. 2017 a las 11:11, Jorge Esteban Quilcate Otoya (<
> > quilcate.jo...@gmail.com>) escribió:
> >
> >> Thanks for the feedback Matthias.
> >>
> >> * 1. You're right. I'll reorder the scenarios.
> >>
> >> * 2. Agree. I'll update the KIP.
> >>
> >> * 3. I like it, updating to `reset-offsets`
> >>
> >> * 4. Agree, removing the `reset-` part
> >>
> >> * 5. Yes, 1.e option without --execute or --export will print out
> current
> >> offset, and the new offset, that will be the same. The use-case of this
> >> option is to use it in combination with --export mostly and have a
> current
> >> 'checkpoint' to reset later. I will add to the KIP how the output should
> >> looks like.
> >>
> >> * 6. Considering 4., I will update it to `--to-offset`
> >>
> >> * 7. I like the idea to unify these options (plus, minus).
> >> `shift-offsets-by` is a good option, but I will like some more feedback
> >> here about the name. I will update the KIP in the meantime.
> >>
> >> * 8. Yes, discussed in 9.
> >>
> >> * 9. Agree. I'll love some feedback here. `topic` is already used by
> >> `delete`, and we can add `--all-topics` to consider all
> topics/partitions
> >> assigned to a group. How could we define specific topics/partitions?
> >>
> >> * 10. Haven't thought about it, but make sense.
> >> <topic>,<partition>,<offset> would be enough.
> >>
> >> * 11. Agree. Solved with 10.
> >>
> >> Also, I have a couple of changes to mention:
> >>
> >> 1. I have add a reference to the branch where I'm working on this KIP.
> >>
> >> 2. About the period scenario `--to-period`. I will change it to
> >> `--to-duration` given that duration (
> >> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> >> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> >> effects.
> >>
> >>
> >>
> >> El mar., 21 feb. 2017 a las 2:47, Matthias J. Sax (<
> matth...@confluent.io>)
> >> escribió:
> >>
> >> Hi,
> >>
> >> thanks for updating the KIP. Couple of follow up comments:
> >>
> >> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> >> time" option -- IMHO it belongs to "reset by position"?
> >>
> >>
> >> * Nit: Description of "Reset to Earliest"
> >>
> >>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>
> >> I think this is strictly speaking not correct (as auto.offset.reset only
> >> triggered if no valid offset is found, but this tool explicitly modified
> >> committed offset), and should be phrased as
> >>
> >>> using Kafka Consumer's #seekToBeginning

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-23 Thread Jorge Esteban Quilcate Otoya
@Matthias about the point 9:

What about keeping only the --topic option, and support this format:

`--topic t1:0,1,2 --topic t2 --topic t3:2`

In this case topics t1, t2, and t3 will be selected: topic t1 with
partitions 0,1 and 2; topic t2 with all its partitions; and topic t3, with
only partition 2.

Jorge.

El mar., 21 feb. 2017 a las 11:11, Jorge Esteban Quilcate Otoya (<
quilcate.jo...@gmail.com>) escribió:

> Thanks for the feedback Matthias.
>
> * 1. You're right. I'll reorder the scenarios.
>
> * 2. Agree. I'll update the KIP.
>
> * 3. I like it, updating to `reset-offsets`
>
> * 4. Agree, removing the `reset-` part
>
> * 5. Yes, 1.e option without --execute or --export will print out current
> offset, and the new offset, that will be the same. The use-case of this
> option is to use it in combination with --export mostly and have a current
> 'checkpoint' to reset later. I will add to the KIP how the output should
> looks like.
>
> * 6. Considering 4., I will update it to `--to-offset`
>
> * 7. I like the idea to unify these options (plus, minus).
> `shift-offsets-by` is a good option, but I will like some more feedback
> here about the name. I will update the KIP in the meantime.
>
> * 8. Yes, discussed in 9.
>
> * 9. Agree. I'll love some feedback here. `topic` is already used by
> `delete`, and we can add `--all-topics` to consider all topics/partitions
> assigned to a group. How could we define specific topics/partitions?
>
> * 10. Haven't thought about it, but make sense.
> <topic>,<partition>,<offset> would be enough.
>
> * 11. Agree. Solved with 10.
>
> Also, I have a couple of changes to mention:
>
> 1. I have add a reference to the branch where I'm working on this KIP.
>
> 2. About the period scenario `--to-period`. I will change it to
> `--to-duration` given that duration (
> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> effects.
>
>
>
> El mar., 21 feb. 2017 a las 2:47, Matthias J. Sax (<matth...@confluent.io>)
> escribió:
>
> Hi,
>
> thanks for updating the KIP. Couple of follow up comments:
>
> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> time" option -- IMHO it belongs to "reset by position"?
>
>
> * Nit: Description of "Reset to Earliest"
>
> > using Kafka Consumer's `auto.offset.reset` to `earliest`
>
> I think this is strictly speaking not correct (as auto.offset.reset only
> triggered if no valid offset is found, but this tool explicitly modified
> committed offset), and should be phrased as
>
> > using Kafka Consumer's #seekToBeginning()
>
> -> similar issue for description of "Reset to Latest"
>
>
> * Main option: rename to --reset-offsets (plural instead of singular)
>
>
> * Scenario Options: I would remove "reset" from all options, because the
> main argument "--reset-offset" says already what to do:
>
> > bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>
> better (IMHO):
>
> > bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>
>
>
> * Option 1.e ("print and export current offset") is not intuitive to use
> IMHO. The main option is "--reset-offset" but nothing happens if no
> scenario is specified. It is also not specified, what the output should
> look like?
>
> Furthermore, --describe should actually show currently committed offset
> for a group. So it seems to be redundant to have the same option in
> --reset-offsets
>
>
> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> comment above to "--to-offset")
>
>
> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> and accept positive/negative values
>
>
> * About Scope "all": maybe it's better to have an option "--all-topics"
> (or similar). IMHO explicit arguments are preferable over implicit
> setting to guard again accidental miss use of the tool.
>
>
> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
> can have two options that are easier to distinguish.
>
>
> * I still think that JSON is not the best format (it's too verbose/hard
> to write for humans from scratch). A simple CSV format with implicit
> schema (topic,partition,offset) would be sufficient.
>
>
> * Why does the JSON contain "group_id" field -- there is parameter
> "--group" to specify the group ID. Would one overwrite the othe

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-21 Thread Jorge Esteban Quilcate Otoya
Thanks for the feedback Matthias.

* 1. You're right. I'll reorder the scenarios.

* 2. Agree. I'll update the KIP.

* 3. I like it, updating to `reset-offsets`

* 4. Agree, removing the `reset-` part

* 5. Yes, the 1.e option without --execute or --export will print out the
current offset and the new offset, which will be the same. The use case of
this option is mostly to use it in combination with --export and have a
current 'checkpoint' to reset to later. I will add to the KIP what the
output should look like.

* 6. Considering 4., I will update it to `--to-offset`

* 7. I like the idea of unifying these options (plus, minus).
`shift-offsets-by` is a good option, but I would like some more feedback
on the name. I will update the KIP in the meantime.

* 8. Yes, discussed in 9.

* 9. Agree. I'd love some feedback here. `topic` is already used by
`delete`, and we can add `--all-topics` to consider all topics/partitions
assigned to a group. How could we define specific topics/partitions?

* 10. Haven't thought about it, but makes sense.
`topic,partition,offset` would be enough.

* 11. Agree. Solved with 10.
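To make point 10 concrete, here is a quick stand-alone sketch of parsing the implicit-schema CSV ("topic,partition,offset") that Matthias suggested. Plain Java; the class and method names are mine, not part of the KIP:

```java
import java.util.ArrayList;
import java.util.List;

class ResetPlanCsv {

    // One reset-plan entry following the implicit schema: topic,partition,offset.
    static final class Entry {
        final String topic;
        final int partition;
        final long offset;

        Entry(String topic, int partition, long offset) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
        }
    }

    // Topic names cannot contain commas, so a plain split is enough.
    static Entry parseLine(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected topic,partition,offset: " + line);
        }
        return new Entry(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    // Parse a whole plan file, skipping blank lines.
    static List<Entry> parse(List<String> lines) {
        List<Entry> plan = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                plan.add(parseLine(line));
            }
        }
        return plan;
    }
}
```

Being this simple to hand-write is exactly the advantage over the JSON plan file.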

Also, I have a couple of changes to mention:

1. I have added a reference to the branch where I'm working on this KIP.

2. About the period scenario `--to-period`: I will change it to
`--to-duration`, given that Duration (
https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
follows the format 'PnDTnHnMnS' and does not consider daylight saving
effects.
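For illustration, resolving such a `--to-duration` value boils down to one `java.time` call; a minimal sketch (the method name is mine):

```java
import java.time.Duration;
import java.time.Instant;

class DurationArg {

    // Resolve a --to-duration value (ISO-8601 'PnDTnHnMnS', e.g. "P2DT3H")
    // into the epoch-millis timestamp to look offsets up for: now minus
    // the duration. Duration is day/time based, so no DST surprises.
    static long resolveResetTimestamp(String isoDuration, Instant now) {
        return now.minus(Duration.parse(isoDuration)).toEpochMilli();
    }
}
```

Note that `Duration.parse` rejects calendar units like months ("P1M"), which is another reason to prefer it over the XML duration format here.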



El mar., 21 feb. 2017 a las 2:47, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> Hi,
>
> thanks for updating the KIP. Couple of follow up comments:
>
> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> time" option -- IMHO it belongs to "reset by position"?
>
>
> * Nit: Description of "Reset to Earliest"
>
> > using Kafka Consumer's `auto.offset.reset` to `earliest`
>
> I think this is strictly speaking not correct (as auto.offset.reset only
> triggered if no valid offset is found, but this tool explicitly modified
> committed offset), and should be phrased as
>
> > using Kafka Consumer's #seekToBeginning()
>
> -> similar issue for description of "Reset to Latest"
>
>
> * Main option: rename to --reset-offsets (plural instead of singular)
>
>
> * Scenario Options: I would remove "reset" from all options, because the
> main argument "--reset-offset" says already what to do:
>
> > bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
>
> better (IMHO):
>
> > bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
>
>
>
> * Option 1.e ("print and export current offset") is not intuitive to use
> IMHO. The main option is "--reset-offset" but nothing happens if no
> scenario is specified. It is also not specified, what the output should
> look like?
>
> Furthermore, --describe should actually show currently committed offset
> for a group. So it seems to be redundant to have the same option in
> --reset-offsets
>
>
> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> comment above to "--to-offset")
>
>
> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> and accept positive/negative values
>
>
> * About Scope "all": maybe it's better to have an option "--all-topics"
> (or similar). IMHO explicit arguments are preferable over implicit
> setting to guard again accidental miss use of the tool.
>
>
> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
> can have two options that are easier to distinguish.
>
>
> * I still think that JSON is not the best format (it's too verbose/hard
> to write for humans from scratch). A simple CSV format with implicit
> schema (topic,partition,offset) would be sufficient.
>
>
> * Why does the JSON contain "group_id" field -- there is parameter
> "--group" to specify the group ID. Would one overwrite the other (what
> order) or would there be an error if "--group" is used in combination
> with "--reset-from-file"?
>
>
>
> -Matthias
>
>
>
>
> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> > Hi,
> >
> > according to the feedback, I've updated the KIP:
> >
> > - We have added and ordered the scenarios, scopes and executions of the
> > Reset Offset tool.
> > - Consider it as an extension to the current `ConsumerGroupCommand` tool
> > - Execution will be possible without generating JSON files.
> >
> >
> https://cwiki.apache.o

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-17 Thread Jorge Esteban Quilcate Otoya
Hi,

according to the feedback, I've updated the KIP:

- We have added and ordered the scenarios, scopes and executions of the
Reset Offset tool.
- Consider it as an extension to the current `ConsumerGroupCommand` tool
- Execution will be possible without generating JSON files.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling

Looking forward to your feedback!

Jorge.

El mié., 8 feb. 2017 a las 23:23, Jorge Esteban Quilcate Otoya (<
quilcate.jo...@gmail.com>) escribió:

> Great. I think I got the idea. What about this options:
>
> Scenarios:
>
> 1. Current status
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>
> 2. To Datetime
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
> 2017-01-01T00:00:00.000´
>
> 3. To Period
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>
> 4. To Earliest
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>
> 5. To Latest
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>
> 6. Minus 'n' offsets
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>
> 7. Plus 'n' offsets
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>
> 8. To specific offset
>
> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>
> Scopes:
>
> a. All topics used by Consumer Group
>
> Don't specify --topics
>
> b. Specific List of Topics
>
> Add list of values in --topics t1,t2,tn
>
> c. One Topic, all Partitions
>
> Add one topic and no partitions values: --topic t1
>
> d. One Topic, List of Partitions
>
> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>
> About Reset Plan (JSON file):
>
> I think is still valid to have the option to persist reset configuration
> as a file, but I agree to give the option to run the tool without going
> down to the JSON file.
>
> Execution options:
>
> 1. Without execution argument (No args):
>
> Print out results (reset plan)
>
> 2. With --execute argument:
>
> Run reset process
>
> 3. With --output argument:
>
> Save result in a JSON format.
>
> 4. Only with --execute option and --reset-file (path to JSON)
>
> Reset based on file
>
> 4. Only with --verify option and --reset-file (path to JSON)
>
> Verify file values with current offsets
>
> I think we can remove --generate-and-execute because is a bit clumsy.
>
> With this options we will be able to execute with manual JSON
> configuration.
>
>
> El mié., 8 feb. 2017 a las 22:43, Ben Stopford (<b...@confluent.io>)
> escribió:
>
> Yes - using a tool like this to skip a set of consumer groups over a
> corrupt/bad message is definitely appealing.
>
> B
>
> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <g...@confluent.io> wrote:
>
> > I like the --reset-to-earliest and --reset-to-latest. In general,
> > since the JSON route is the most challenging for users, we want to
> > provide a lot of ways to do useful things without going there.
> >
> > Two things that can help:
> >
> > 1. A lot of times, users want to skip few messages that cause issues
> > and continue. maybe just specifying the topic, partition and delta
> > will be better than having to find the offset and write a JSON and
> > validate the JSON etc.
> >
> > 2. Thinking if there are other common use-cases that we can make easy
> > rather than just one generic but not very usable method.
> >
> > Gwen
> >
> > On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > <quilcate.jo...@gmail.com> wrote:
> > > Thanks for the feedback!
> > >
> > > @Onur, @Gwen:
> > >
> > > Agree. Actually at the first draft I considered to have it inside
> > > ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
> > tool
> > > to describe it clearly and focus it on reset functionality.
> > >
> > > But now that you mentioned, it does make sense to have it in
> > > ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
> > it?
> > >
> > > Maybe something like this:
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> --topics
> > t1
> > > --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > > plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --execute --

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-08 Thread Jorge Esteban Quilcate Otoya
Great. I think I got the idea. What about these options:

Scenarios:

1. Current status

´kafka-consumer-groups.sh --reset-offset --group cg1´

2. To Datetime

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
2017-01-01T00:00:00.000´

3. To Period

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´

4. To Earliest

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´

5. To Latest

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´

6. Minus 'n' offsets

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´

7. Plus 'n' offsets

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´

8. To specific offset

´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´

Scopes:

a. All topics used by Consumer Group

Don't specify --topics

b. Specific List of Topics

Add list of values in --topics t1,t2,tn

c. One Topic, all Partitions

Add one topic and no partitions values: --topic t1

d. One Topic, List of Partitions

Add one topic and partitions values: --topic t1 --partitions 0,1,2
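As a rough sketch of how scopes (b) and (d) could be resolved (plain Java; helper names are mine, and scope (a) would come from the group's metadata, scope (c) from the topic's partition count instead):

```java
import java.util.ArrayList;
import java.util.List;

class ScopeArgs {

    // Scope d: --topic t1 --partitions 0,1,2  ->  ["t1-0", "t1-1", "t1-2"].
    static List<String> topicPartitions(String topic, String partitionsCsv) {
        List<String> result = new ArrayList<>();
        for (String p : partitionsCsv.split(",")) {
            result.add(topic + "-" + Integer.parseInt(p.trim()));
        }
        return result;
    }

    // Scope b: --topics t1,t2  ->  ["t1", "t2"]; each topic later expands
    // to all of its partitions.
    static List<String> topics(String topicsCsv) {
        List<String> result = new ArrayList<>();
        for (String t : topicsCsv.split(",")) {
            result.add(t.trim());
        }
        return result;
    }
}
```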

About Reset Plan (JSON file):

I think it is still valid to have the option to persist the reset
configuration as a file, but I agree with giving the option to run the tool
without going down to the JSON file.

Execution options:

1. Without execution argument (No args):

Print out results (reset plan)

2. With --execute argument:

Run reset process

3. With --output argument:

Save result in a JSON format.

4. Only with --execute option and --reset-file (path to JSON)

Reset based on file

5. Only with --verify option and --reset-file (path to JSON)

Verify file values with current offsets

I think we can remove --generate-and-execute because it is a bit clumsy.

With these options we will be able to execute with a manual JSON configuration.
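The execution options above reduce to a small dispatch over the flags; a sketch of the intended rules (plain Java, the enum and names are mine):

```java
class ExecutionMode {

    enum Mode { PRINT_PLAN, EXECUTE, EXPORT, EXECUTE_FROM_FILE, VERIFY_FILE }

    // resetFile is the --reset-file value, or null when the flag is absent.
    static Mode resolve(boolean execute, boolean output, boolean verify, String resetFile) {
        if (verify && resetFile != null) return Mode.VERIFY_FILE;
        if (execute && resetFile != null) return Mode.EXECUTE_FROM_FILE;
        if (execute) return Mode.EXECUTE;
        if (output) return Mode.EXPORT;
        return Mode.PRINT_PLAN;  // no args: just print the reset plan
    }
}
```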


El mié., 8 feb. 2017 a las 22:43, Ben Stopford (<b...@confluent.io>)
escribió:

> Yes - using a tool like this to skip a set of consumer groups over a
> corrupt/bad message is definitely appealing.
>
> B
>
> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <g...@confluent.io> wrote:
>
> > I like the --reset-to-earliest and --reset-to-latest. In general,
> > since the JSON route is the most challenging for users, we want to
> > provide a lot of ways to do useful things without going there.
> >
> > Two things that can help:
> >
> > 1. A lot of times, users want to skip few messages that cause issues
> > and continue. maybe just specifying the topic, partition and delta
> > will be better than having to find the offset and write a JSON and
> > validate the JSON etc.
> >
> > 2. Thinking if there are other common use-cases that we can make easy
> > rather than just one generic but not very usable method.
> >
> > Gwen
> >
> > On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> > <quilcate.jo...@gmail.com> wrote:
> > > Thanks for the feedback!
> > >
> > > @Onur, @Gwen:
> > >
> > > Agree. Actually at the first draft I considered to have it inside
> > > ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
> > tool
> > > to describe it clearly and focus it on reset functionality.
> > >
> > > But now that you mentioned, it does make sense to have it in
> > > ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
> > it?
> > >
> > > Maybe something like this:
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> --topics
> > t1
> > > --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> > > plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> > > plan.json´
> > >
> > > ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group
> > cg1
> > > --topics t1 --reset-from 2017-01-01T00:00:00.000´
> > >
> > > @Gwen:
> > >
> > >> It looks exactly like the replica assignment tool
> > >
> > > It was influenced by ;-) I use the generate-verify-execute process here
> > to
> > > make sure user will be aware of the result of this operation. At the
> > > beginning we considered only add a couple of options to Consumer Group
> > > Command:
> > >
> > > --rewind-to-timestamp and --rewind-to-period
> > >
> > > @Onur:
> > >
> > >> You can actually get away with overriding while members of the group
> > are live
> > > with method 2 by using group 

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-08 Thread Jorge Esteban Quilcate Otoya
Thanks for the feedback!

@Onur, @Gwen:

Agree. Actually, in the first draft I considered having it inside
´kafka-consumer-groups.sh´, but I decided to propose it as a standalone tool
to describe it clearly and focus it on the reset functionality.

But now that you mention it, it does make sense to have it in
´kafka-consumer-groups.sh´. What would be a consistent way to introduce it?

Maybe something like this:

´kafka-consumer-groups.sh --reset-offset --generate --group cg1 --topics t1
--reset-from 2017-01-01T00:00:00.000 --output plan.json´

´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
plan.json´

´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
plan.json´

´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group cg1
--topics t1 --reset-from 2017-01-01T00:00:00.000´

@Gwen:

> It looks exactly like the replica assignment tool

It was influenced by ;-) I use the generate-verify-execute process here to
make sure users will be aware of the result of this operation. At the
beginning we considered only adding a couple of options to the Consumer
Group Command:

--rewind-to-timestamp and --rewind-to-period

@Onur:

> You can actually get away with overriding while members of the group are live
with method 2 by using group information from DescribeGroupsRequest.

Does this mean that we need to have the Consumer Group stopped before
executing, and start a new consumer internally to do this? Therefore, we
won't be able to consider executing a reset while the Consumer Group is
active? (trying to relate it to @Dong's 5th question)

@Dong:

> Should we allow user to use wildcard to reset offset of all groups for a
given topic as well?

I haven't thought about this scenario. It could be interesting. Following
the recommendation to add it into the Consumer Group Command, the Group
argument would in this case be optional if there is only one topic. I don't
think it would be that useful for multiple topics.

> Should we allow user to specify timestamp per topic partition in the json
file as well?

I don't think this would be valid from the tool, but if the Reset Plan is
generated and the user wants to set the offset for a specific partition to
another offset (eventually based on another timestamp) and execute it, it
will be up to her/him.

> Should the script take some credential file to make sure that this
operation is authenticated given the potential impact of this operation?

Haven't tried to secure brokers yet, but the tool should support
authorization if it's enabled in the broker.

> Should we provide constant to reset committed offset to earliest/latest
offset of a partition, e.g. -1 indicates earliest offset and -2 indicates
latest offset.

I will go for something like ´--reset-to-earliest´ and ´--reset-to-latest´

> Should we allow dynamic change of the comitted offset when consumer are
running, such that consumer will seek to the newly committed offset and
start consuming from there?

Not sure about this. I would recommend keeping it simple and asking users
to stop consumers first. But I would consider it if the trade-offs are
clear.

@Matthias

Added :). And thanks a lot for your help to define this KIP!



El mié., 8 feb. 2017 a las 7:47, Gwen Shapira (<g...@confluent.io>)
escribió:

> As long as the CLI is a bit consistent? Like, not just adding 3
> arguments and a JSON parser to the existing tool, right?
>
> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> <onurkaraman.apa...@gmail.com> wrote:
> > I think it makes sense to just add the feature to
> kafka-consumer-groups.sh
> >
> > On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <g...@confluent.io> wrote:
> >
> >> Thanks for the KIP. I'm super happy about adding the capability.
> >>
> >> I hate the interface, though. It looks exactly like the replica
> >> assignment tool. A tool everyone loves so much that there are multiple
> >> projects, open and closed, that try to fix it.
> >>
> >> Can we swap it with something that looks a bit more like the consumer
> >> group tool? or the kafka streams reset tool? Consistency is helpful in
> >> such cases. I spent some time learning existing tools and learning yet
> >> another one is a deterrent.
> >>
> >> Gwen
> >>
> >>
> >>
> >> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >> <quilcate.jo...@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > I would like to propose a KIP to Add a tool to Reset Consumer Group
> >> Offsets.
> >> >
> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >> >
> >> > Please, take a look at the proposal and share your feedback.
> >> >
> >> > Thanks,
> >> > Jorge.
> >>
> >>
> >>
> >> --
> >> Gwen Shapira
> >> Product Manager | Confluent
> >> 650.450.2760 <(650)%20450-2760> | @gwenshap
> >> Follow us: Twitter | blog
> >>
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 <(650)%20450-2760> | @gwenshap
> Follow us: Twitter | blog
>


KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-07 Thread Jorge Esteban Quilcate Otoya
Hi all,

I would like to propose a KIP to Add a tool to Reset Consumer Group Offsets.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets

Please, take a look at the proposal and share your feedback.

Thanks,
Jorge.


Re: Rewind Kafka Stream consumer offset by timestamp

2017-01-31 Thread Jorge Esteban Quilcate Otoya
Thanks Matthias!

My comments below.

Regards,
Jorge.

El lun., 30 ene. 2017 a las 18:40, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> It would be enough, IMHO :)
>
> However, we need to discuss some details about this.
>
> 1) we could extend the reset tool with an flag --start-from-offsets and
> the user can specify an offset per partition
>
> This would give the most flexibility, but it is hard to use. Especially
> if you have many partitions, we do not want to hand in this information
> per command line (maybe an "offset file" would work).
>
> Doing this per topic or even global seems to be of little use because it
> lacks proper semantic interpretation.
>

Agree, this option is of little use, but it could be helpful for backward
compatibility with clients that don't have a timestamp index but
nevertheless want to rewind to a specific offset.


>
>
> 2) we could extend the reset tool with an flag --start-from-timestamp
> that could be globally applied to all partitions (I guess, that is what
> you have in mind)
>
> Has the advantage that it is easier to use. However, what should the
> parameter format be? Milliseconds since the Epoch (what is the internal
> format) seems hard to use either.
>

I think 'dateTime' and 'duration' are valid options here: you could define
to reprocess since 2017-01-01T09:00:00, and also to reprocess since P1M (1
month ago). The XML duration (https://www.w3.org/TR/xmlschema-2/#duration)
and dateTime (https://www.w3.org/TR/xmlschema-2/#dateTime) lexical
representations could work here.


>
> There is also the question, how we deal with out-of-order data. Maybe
> it's sufficient though, to just set to the first tuple with equal of
> greater timestamp than the specified time. (and educate user, that they
> might see some older data, ie, with smaller ts, if there is later
> arriving out of order records).
>

Agree.
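That semantics in a stand-alone sketch (plain Java, no Kafka dependency; the per-partition timestamp lookups are modeled as nullable Longs, mirroring how `offsetsForTimes` returns null for partitions with no record at or after the target):

```java
import java.util.HashMap;
import java.util.Map;

class TimestampFallback {

    // Per partition: the offset of the first record with timestamp >= target
    // when one exists, otherwise fall back to the end offset (nothing newer
    // than the target on that partition, so there is nothing to re-consume).
    static Map<Integer, Long> resetTargets(Map<Integer, Long> offsetsForTimestamp,
                                           Map<Integer, Long> endOffsets) {
        Map<Integer, Long> targets = new HashMap<>();
        for (Map.Entry<Integer, Long> end : endOffsets.entrySet()) {
            Long found = offsetsForTimestamp.get(end.getKey());
            targets.put(end.getKey(), found != null ? found : end.getValue());
        }
        return targets;
    }
}
```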


>
> We might want to exploit broker's timestamp index. But what about older
> brokers that do not have timestamp index, as we do have client backward
> compatibility now? We might just say, "not supported" though.
>
>
Item (1) could be a valid way to give this option to older brokers.


> What about data, that lacks a proper timestamp and users work with
> custom timestamp extractor? Should we support this, too?
>
>
Haven't thought about this use case; it could be a valid one.


> Maybe we need a KIP discussion for this. It seems to be a broader feature.
>
>
Yes, I'd love to do that. I also believe this could be a valid use case to
be added to the 'kafka-consumer-groups' command-line tool, or to have an
external tool to rewind consumer group offsets.


> -Matthias
>
>
>
> On 1/30/17 2:07 AM, Jorge Esteban Quilcate Otoya wrote:
> > Thanks Eno and Matthias for your feedback!
> >
> > I've check KIP-95 and Matthias blog post (
> >
> https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/
> )
> > and I have a clearer idea on how stream internals work.
> >
> > In a general use-case, following Application Reset Tool's procedure:
> > ---
> >
> >1. for any specified input topic, it resets all offsets to zero
> >2. for any specified intermediate topic, seeks to the end for all
> >partitions
> >3. for all internal topic
> >   1. resets all offsets to zero
> >   2. deletes the topic
> >
> > ---
> > But instead of resetting input topics to zero, resetting input topics to
> > offset by timestamp wouldn't be enough?
> >
> > I will definitely take a look to StreamsResetter and give a try to
> support
> > this feature.
> >
> >
> > El lun., 30 ene. 2017 a las 1:43, Matthias J. Sax (<
> matth...@confluent.io>)
> > escribió:
> >
> >> You can always built you own little tool similar to StreamsResetter.java
> >> to get this done. Ie, you set the committed offset "manually" based on
> >> timestamps before you start your application.
> >>
> >> But as Eno mentioned, you need to think carefully about what a
> >> consistent reset point would be because you cannot reset the
> >> application's state...
> >>
> >> If you start you application with an empty state, this might be less of
> >> an concern though and seems reasonable.
> >>
> >>
> >> -Matthias
> >>
> >> On 1/29/17 12:55 PM, Eno Thereska wrote:
> >>> Hi Jorge,
> >>>
> >>> This is currently not possible, but it is likely to be considered for
> >> discussion. One challenge is that, if you have multiple topics, it is
> >> 

Re: Rewind Kafka Stream consumer offset by timestamp

2017-01-30 Thread Jorge Esteban Quilcate Otoya
Thanks Eno and Matthias for your feedback!

I've checked KIP-95 and Matthias' blog post (
https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/)
and I have a clearer idea of how the Streams internals work.

In a general use-case, following Application Reset Tool's procedure:
---

   1. for any specified input topic, it resets all offsets to zero
   2. for any specified intermediate topic, seeks to the end for all
   partitions
   3. for all internal topic
  1. resets all offsets to zero
  2. deletes the topic

---
But instead of resetting input topics to zero, wouldn't resetting the input
topics to an offset by timestamp be enough?

I will definitely take a look at StreamsResetter and try to support this
feature.


El lun., 30 ene. 2017 a las 1:43, Matthias J. Sax (<matth...@confluent.io>)
escribió:

> You can always built you own little tool similar to StreamsResetter.java
> to get this done. Ie, you set the committed offset "manually" based on
> timestamps before you start your application.
>
> But as Eno mentioned, you need to think carefully about what a
> consistent reset point would be because you cannot reset the
> application's state...
>
> If you start you application with an empty state, this might be less of
> an concern though and seems reasonable.
>
>
> -Matthias
>
> On 1/29/17 12:55 PM, Eno Thereska wrote:
> > Hi Jorge,
> >
> > This is currently not possible, but it is likely to be considered for
> discussion. One challenge is that, if you have multiple topics, it is
> difficult to rewind them all back to a consistent point in time. KIP-95,
> currently under discussion, is handling the slightly different issue, of
> stopping the consuming at a point in time:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-95%3A+Incremental+Batch+Processing+for+Kafka+Streams
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-95:+Incremental+Batch+Processing+for+Kafka+Streams
> >.
> >
> > Thanks
> > Eno
> >> On 29 Jan 2017, at 19:29, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
> >>
> >> Hi everyone,
> >>
> >> I was wondering if its possible to rewind consumers offset in Kafka
> Stream
> >> using timestamp as with `offsetsForTimes(Map<TopicPartition, Long>
> >> timestampsToSearch)` in KafkaConsumer.
> >>
> >> I know its possible to go back to `earliest` offset in topic or
> `latest`,
> >> but would be useful to go back using timestamp as with Consumer API do.
> >>
> >> Maybe is there an option to do this already and I'm missing something?
> >>
> >> Thanks in advance for your feedback!
> >>
> >> Jorge.
> >
> >
>
>


Rewind Kafka Stream consumer offset by timestamp

2017-01-29 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

I was wondering if it's possible to rewind consumer offsets in Kafka
Streams using a timestamp, as with `offsetsForTimes(Map<TopicPartition,
Long> timestampsToSearch)` in KafkaConsumer.

I know it's possible to go back to the `earliest` or `latest` offset in a
topic, but it would be useful to go back using a timestamp, as the Consumer
API does.
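With a plain consumer (outside Streams) the lookup-then-seek step is already possible against brokers that have the timestamp index; a hedged sketch of it (kafka-clients 0.10.1+ API, needs a live broker and an assigned consumer, so illustrative only):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

class SeekToTimestamp {

    // Rewind every assigned partition to the first offset whose record
    // timestamp is >= targetTimestamp. Partitions with a null lookup result
    // (no record at or after the target) are left where they are.
    static void seekToTimestamp(KafkaConsumer<?, ?> consumer, long targetTimestamp) {
        Map<TopicPartition, Long> query = new HashMap<>();
        for (TopicPartition tp : consumer.assignment()) {
            query.put(tp, targetTimestamp);
        }
        Map<TopicPartition, OffsetAndTimestamp> found = consumer.offsetsForTimes(query);
        for (Map.Entry<TopicPartition, OffsetAndTimestamp> e : found.entrySet()) {
            if (e.getValue() != null) {
                consumer.seek(e.getKey(), e.getValue().offset());
            }
        }
    }
}
```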

Maybe there is an option to do this already and I'm missing something?

Thanks in advance for your feedback!

Jorge.

