[jira] [Created] (KAFKA-10064) Add documentation for KIP-571

2020-05-28 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10064:
---

 Summary: Add documentation for KIP-571
 Key: KAFKA-10064
 URL: https://issues.apache.org/jira/browse/KAFKA-10064
 Project: Kafka
  Issue Type: Task
  Components: docs, streams
Reporter: Boyang Chen
Assignee: feyman
 Fix For: 2.6.0


We need to add documentation for KIP-571, similar to the documentation for other
KIPs going out in 2.6: [https://github.com/apache/kafka/pull/8621]

 

[~feyman] I'm assigning this to you for now; let me know if there is anything 
missing for context. Here are the instructions for setting up the Apache wiki locally: 
[https://cwiki.apache.org/confluence/display/KAFKA/Setup+Kafka+Website+on+Local+Apache+Server]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-602 - Change default value for client.dns.lookup

2020-05-28 Thread Badai Aqrandista
Ismael/Rajini

I have updated the KIP to reflect the deprecation of the "default" value and
the decision not to add "use_first_dns_ip". I have also updated the PR to
reflect this change, including changing all references of ClientDnsLookup.DEFAULT
to ClientDnsLookup.USE_ALL_DNS_IPS in core code and clients test code.
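
For anyone following along, the client-side effect is simply that the resolver
behaviour you previously had to opt into becomes the default. A minimal sketch,
assuming the standard clients config constants:

import java.util.Properties
import org.apache.kafka.clients.CommonClientConfigs

val props = new Properties()
props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker:9092")
// with the new default this line is redundant; explicitly setting
// "default" instead would log a deprecation warning
props.put(CommonClientConfigs.CLIENT_DNS_LOOKUP_CONFIG, "use_all_dns_ips")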

Please let me know what you think.

Thanks
Badai

On Fri, May 29, 2020 at 3:46 AM Ismael Juma  wrote:

> +1. I think we should remove this config in AK 3.0, but in the meantime, we
> can log a warning if people explicitly set the value to `default`. I think
> this would be pretty rare.
>
> Ismael
>
> On Thu, May 28, 2020 at 10:25 AM Rajini Sivaram 
> wrote:
>
> > It is a bit confusing to have a `default` value that is not the default
> and
> > in hindsight it wasn't a good choice of name. But agree that changing the
> > config default and avoiding the temporary `use_first_dns_ip` option makes
> > sense.
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Thu, May 28, 2020 at 4:49 PM Badai Aqrandista 
> > wrote:
> >
> > > Ismael/Rajini
> > >
> > > I have put some comments in the PR in response to Ismael's.
> > >
> > > I have some questions about Ismael's suggestion to not add
> > > "use_first_dns_ip" at all and instead just deprecate "default".
> > >
> > > The PR would be much cleaner if we just deprecate "default" as you
> > > suggested. But we will need to update some core code. And I will need
> to
> > > update the KIP to reflect this.
> > >
> > > ClientDnsLookup.DEFAULT is used in a few places in core (server):
> > >
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L520
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ReplicaFetcherBlockingSend.scala#L92
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L156
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/transaction/TransactionMarkerChannelManager.scala#L82
> > >
> > > And a couple of tools:
> > >
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala#L482
> > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/BrokerApiVersionsCommand.scala#L299
> > >
> > > And some tests.
> > >
> > > What do you think?
> > >
> > > Thanks
> > > Badai
> > >
> > > On Fri, May 22, 2020 at 6:45 PM Badai Aqrandista 
> > > wrote:
> > >
> > > > Voting thread has been posted.
> > > >
> > > > KIP-602 page has been updated with suggestions from Rajini.
> > > >
> > > > Thanks
> > > > Badai
> > > >
> > > > On Fri, May 22, 2020 at 6:00 AM Ismael Juma 
> wrote:
> > > >
> > > >> Badai, would you like to start a vote on this KIP?
> > > >>
> > > >> Ismael
> > > >>
> > > >> On Wed, May 20, 2020 at 7:45 AM Rajini Sivaram <
> > rajinisiva...@gmail.com
> > > >
> > > >> wrote:
> > > >>
> > > >> > Deprecating for removal in 3.0 sounds good.
> > > >> >
> > > >> > On Wed, May 20, 2020 at 3:33 PM Ismael Juma 
> > > wrote:
> > > >> >
> > > >> > > Is there any reason to use "use_first_dns_ip"? Should we remove
> it
> > > >> > > completely? Or at least deprecate it for removal in 3.0?
> > > >> > >
> > > >> > > Ismael
> > > >> > >
> > > >> > >
> > > >> > > On Wed, May 20, 2020, 1:39 AM Rajini Sivaram <
> > > rajinisiva...@gmail.com
> > > >> >
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hi Badai,
> > > >> > > >
> > > >> > > > Thanks for the KIP, sounds like a useful change. Perhaps we
> > should
> > > >> call
> > > >> > > the
> > > >> > > > new option `use_first_dns_ip` (not `_ips` since it refers to
> > one).
> > > >> We
> > > >> > > > should also mention in the KIP that only one type of address
> > (ipv4
> > > >> or
> > > >> > > ipv6,
> > > >> > > > based on the first one) will be used - that is the current
> > > behaviour
> > > >> > for
> > > >> > > > `use_all_dns_ips`.  Since we are changing `default` to be
> > exactly
> > > >> the
> > > >> > > same
> > > >> > > > as `use_all_dns_ips`, it will be good to mention that
> explicitly
> > > >> under
> > > >> > > > Public Interfaces.
> > > >> > > >
> > > >> > > > Regards,
> > > >> > > >
> > > >> > > > Rajini
> > > >> > > >
> > > >> > > >
> > > >> > > > On Mon, May 18, 2020 at 1:44 AM Badai Aqrandista <
> > > >> ba...@confluent.io>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > > > Ismael
> > > >> > > > >
> > > >> > > > > What do you think of the PR and the explanation regarding
> the
> > > >> issue
> > > >> > > > raised
> > > >> > > > > in KIP-235?
> > > >> > > > >
> > > >> > > > > Should I go ahead and build a proper PR?
> > > >> > > > >
> > > >> > > > > Thanks
> > > >> > > > > Badai
> > > >> > > > >
> > > >> > > > > On Mon, May 11, 2020 at 8:53 AM Badai Aqrandista <
> > > >> ba...@confluent.io
> > > >> > >
> > > >> > > > > wrote:
> > > >> > > > >
> > > >> > > > > > Ismael
> > > >> > > > > >
> > > >> > > > > > PR 

Re: [DISCUSS] KIP-616: Rename implicit Serdes instances in kafka-streams-scala

2020-05-28 Thread Yuriy Badalyantc
At this point, I think John's plan is better than the original plan
described in the KIP: we should create a new `Serdes` object in another
package and deprecate the old one.
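
Roughly, the idea looks like this; a sketch only, using one of the candidate
package names discussed below, with a single illustrative implicit:

// new home for the implicits, with non-clashing names
package org.apache.kafka.streams.scala.serialization

import org.apache.kafka.common.serialization.Serde
import org.apache.kafka.streams.scala.{Serdes => DeprecatedSerdes}

object Serdes {
  // delegates to the old (to-be-deprecated) object until it is removed
  implicit def stringSerde: Serde[String] = DeprecatedSerdes.String
}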

- Yuriy

On Fri, May 29, 2020 at 8:58 AM John Roesler  wrote:

> Thanks, Matthias,
>
> If we go with the approach Yuriy and I agreed on, to deprecate and replace
> the whole class and not just a few of the methods, then the timeline is
> less of a concern. Under that plan, Yuriy can just write the new class
> exactly the way he wants and people can cleanly swap over to the new
> pattern when they are ready.
>
> The timeline was more significant if we were just going to deprecate some
> methods and add new methods to the existing class. That plan requires two
> implementation phases, where we first deprecate the existing methods and
> later swap the implicits at the same time we remove the deprecated members.
> Aside from the complexity of that approach, it’s not a breakage free path,
> as some users would be forced to continue using the deprecated members
> until a future release drops them, breaking their source code, and only
> then can they update their code.
>
> That wouldn’t be the end of the world, and we’ve had to do the same thing
> in the past with the implicit conversions, but this is a much wider
> scope, since it’s all the serdes. I’m happy with the new plan, since it’s
> not only one step, but also it provides everyone a breakage-free path.
>
> We can still consider dropping the deprecated class in 3.0; I just wanted
> to clarify how the timeline issue has changed.
>
> Thanks,
> John
>
> On Thu, May 28, 2020, at 20:34, Matthias J. Sax wrote:
> > I am not a Scala person, so I cannot really contribute much. However, for
> > the deprecation period, if we get the change into 2.7, it might be ok to
> > remove the deprecated classes in 3.0.
> >
> > It would only be one minor release in between, which is a little bit short
> > (we usually prefer at least two minor releases, better three), but if we
> > have a good reason for it, it might be ok.
> >
> > If we cannot remove it in 3.0, it seems there would be a 4.0 in about a
> > year(?) when ZK removal is finished and we can remove the deprecated
> > code then.
> >
> >
> > -Matthias
> >
> > On 5/28/20 7:39 AM, John Roesler wrote:
> > > Hi Yuriy,
> > >
> > > Sounds good to me! I had a feeling we were bringing different context
> > > to the discussion; thanks for sticking with the conversation until we
> got
> > > it hashed out.
> > >
> > > I'm glad you prefer Serde*s*, since having multiple different classes
> with
> > > the same name leads to all kinds of trouble. "Serdes" seems relatively
> > > safe because people in the Scala lib won't be using the Java Serdes
> class,
> > > and they won't be using the deprecated and non-deprecated one at the
> > > same time.
> > >
> > > Thanks again,
> > > -John
> > >
> > > On Thu, May 28, 2020, at 02:21, Yuriy Badalyantc wrote:
> > >> Ok, I understood you, John. I wasn't sure about kafka deprecation
> policy
> > >> and thought that the full cycle could be done with 2.7 version.
> Waiting for
> > >> 3.0 is too much, I agree with it.
> > >>
> > >> So, I think creating one more `Serdes` in another package is our way.
> I
> > >> suggest one of the following:
> > >> 1. `org.apache.kafka.streams.scala.serde.Serdes`
> > >> 2. `org.apache.kafka.streams.scala.serialization.Serdes`
> > >>
> > >> About `Serde` vs `Serdes`. I'm strongly against `Serde` because it
> would
> > >> lead to a new name clash with the
> > >> `org.apache.kafka.common.serialization.Serde`.
> > >>
> > >> - Yuriy
> > >>
> > >> On Thu, May 28, 2020 at 11:12 AM John Roesler 
> wrote:
> > >>
> > >>> Hi Yuriy,
> > >>>
> > >>> Thanks for the clarification.
> > >>>
> > >>> I guess my concern is twofold:
> > >>> 1. We typically leave deprecated methods in place for at least a
> major
> > >>> release cycle before removing them, so it would seem abrupt to have a
> > >>> deprecation period of only one minor release. If we follow the same
> pattern
> > >>> here, it would take over a year to finish this KIP.
> > >>> 2. It doesn’t seem like there is a nonbreaking deprecation path at
> all if
> > >>> people enumerate their imports (if they don’t use a wildcard). In
> that
> > >>> case, they would have no path to implicitly use the newly named
> serdes, and
> > >>> therefore they would have no way to avoid continuing to use the
> deprecated
> > >>> ones.
> > >>>
> > >>> Since you mentioned that your reason is mainly the preference for
> the name
> > >>> “Serde” or “Serdes”, can we explore just using one of those? Would
> it cause
> > >>> some kind of conflict to use org.apache.kafka.streams.scala.Serde or
> to use
> > >>> Serdes in a different package, like
> > >>> org.apache.kafka.streams.scala.implicit.Serdes?
> > >>>
> > >>> I empathize with this desire. I faced the same dilemma when I wanted
> to
> > >>> replace Processor but keep the class name in KIP-478. I wound up
> creating 

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-28 Thread Matthias J. Sax
Hey,

Sorry that I am late to the game. I am not 100% convinced about the
current proposal. Using a new config as a feature flag seems to be rather
"nasty" to me, and flipping from/to is a little bit too fancy for my
personal taste.

I agree that the original proposal using a "ReadDirection" enum is not
ideal either.

Thus, I would like to put out a new idea: We could add a new interface
that offers new methods that return reverse iterators.

The KIP already proposes to add `reverseAll()` and it seems backward
incompatible to just add this method to `ReadOnlyKeyValueStore` and
`ReadOnlyWindowStore`. I don't think we could provide a useful default
implementation for custom stores and thus either break compatibility or
need to add a default that just throws an exception. Neither seems to be a
good option.

Using a new interface avoids this issue and allows users implementing
custom stores to opt in by adding the interface to their stores.
Furthermore, we don't need any config. In the end, we encapsulate the
change into the store, and our runtime is agnostic to it (as it should be).

The hierarchy becomes a little complex (but users would not really see
the complexity):

// existing
ReadOnlyKeyValueStore<K, V>
KeyValueStore<K, V> extends StateStore, ReadOnlyKeyValueStore<K, V>


// helper interface; users don't care
// need similar ones for other stores
ReverseReadOnlyKeyValueStore<K, V> {
    KeyValueIterator<K, V> reverseRange(K from, K to);
    KeyValueIterator<K, V> reverseAll();
}


// two new user-facing interfaces for kv-stores
// need similar ones for other stores
ReadOnlyKeyValueStoreWithReverseIterators<K, V> extends ReadOnlyKeyValueStore<K, V>,
    ReverseReadOnlyKeyValueStore<K, V>

KeyValueStoreWithReverseIterators<K, V> extends KeyValueStore<K, V>,
    ReverseReadOnlyKeyValueStore<K, V>


// updated (also internal)
// also need to update other built-in stores
RocksDB implements KeyValueStoreWithReverseIterators, BulkLoadingStore


In the end, users would only care about the two new interfaces (for the
kv case) that offer reverse iterators (read-only and regular) and can
cast stores accordingly in their Processors/Transformers or via IQ.
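
For example, a processor could opt in with a simple cast; a sketch against
the hypothetical interface names above:

// hypothetical usage; assumes the store was registered with an
// implementation that mixes in the reverse-iterator interface
val store = context.getStateStore("my-kv-store")
store match {
  case s: ReadOnlyKeyValueStoreWithReverseIterators[String, Long] @unchecked =>
    val iter = s.reverseAll()
    try { while (iter.hasNext) println(iter.next()) } finally { iter.close() }
  case _ => // store was not built with reverse-iterator support
}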


Btw: if we add reverse iterators for KeyValue and Window stores, should we
do the same for Session store?



This might be more code to write, but I believe it provides the better
user experience. Thoughts?



-Matthias




On 5/26/20 8:47 PM, John Roesler wrote:
> Sorry for my silence, Jorge,
> 
> I've just taken another look, and I'm personally happy with the KIP.
> 
> Thanks,
> -John
> 
> On Tue, May 26, 2020, at 16:17, Jorge Esteban Quilcate Otoya wrote:
>> If no additional comments, I will proceed to start a vote thread.
>>
>> Thanks a lot for your feedback!
>>
>> On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate Otoya <
>> quilcate.jo...@gmail.com> wrote:
>>
>>> Thanks Sophie. I like the `reverseAll()` idea.
>>>
>>> I updated the KIP with your feedback.
>>>
>>>
>>>
>>> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman 
>>> wrote:
>>>
 Hm, the case of `all()` does seem to present a dilemma in the case of
 variable-length keys.

 In the case of fixed-length keys, you can just compute the keys that
 correspond to the maximum and minimum serialized bytes, then perform a
 `range()` query instead of an `all()`. If your keys don't have a
 well-defined ordering such that you can't determine the MAX_KEY, then you
 probably don't care about the iterator order anyway.

 But with variable-length keys, there is no MAX_KEY. If all your keys were
 just of the form 'a', 'aa', 'aaa', then in fact the only way to figure out
 the maximum key in the store is by using `all()` -- and without a reverse
 iterator, you're doomed to iterate through every single key just to answer
 that simple question.

 That said, I still think determining the iterator order based on the
 to/from bytes makes a lot of intuitive sense and gives the API a nice
 symmetry. What if we solved the `all()` problem by just giving `all()` a
 reverse form to complement it? I.e. we would have `all()` and
 `reverseAll()`, or something to that effect.

 On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate Otoya <
 quilcate.jo...@gmail.com> wrote:

> Thanks John.
>
> Agree. I like the first approach as well, with a StreamsConfig flag
> passed via ProcessorContext.
>
> Another positive effect of "reverse parameters" is that in the case of
> `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide _which_ pair to
> flip, whereas with the `ReadDirection` enum it applies to both.
>
> The only issue I've found while reviewing the KIP is that `all()` won't fit
> within this approach.
>
> We could remove it from the KIP and argue that for WindowStore,
> `fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse order,
 and
> for KeyValueStore, no ordering guarantees are 

Re: [DISCUSS] KIP-616: Rename implicit Serdes instances in kafka-streams-scala

2020-05-28 Thread John Roesler
Thanks, Matthias,

If we go with the approach Yuriy and I agreed on, to deprecate and replace the 
whole class and not just a few of the methods, then the timeline is less of a 
concern. Under that plan, Yuriy can just write the new class exactly the way he 
wants and people can cleanly swap over to the new pattern when they are ready.

The timeline was more significant if we were just going to deprecate some 
methods and add new methods to the existing class. That plan requires two 
implementation phases, where we first deprecate the existing methods and later 
swap the implicits at the same time we remove the deprecated members. Aside 
from the complexity of that approach, it’s not a breakage free path, as some 
users would be forced to continue using the deprecated members until a future 
release drops them, breaking their source code, and only then can they update 
their code.

That wouldn’t be the end of the world, and we’ve had to do the same thing in 
the past with the implicit conversions, but this is a much wider scope, since 
it’s all the serdes. I’m happy with the new plan, since it’s not only one step, 
but also it provides everyone a breakage-free path.

We can still consider dropping the deprecated class in 3.0; I just wanted to 
clarify how the timeline issue has changed.

Thanks,
John

On Thu, May 28, 2020, at 20:34, Matthias J. Sax wrote:
> I am not a Scala person, so I cannot really contribute much. However, for
> the deprecation period, if we get the change into 2.7, it might be ok to
> remove the deprecated classes in 3.0.
> 
> It would only be one minor release in between, which is a little bit short
> (we usually prefer at least two minor releases, better three), but if we
> have a good reason for it, it might be ok.
> 
> If we cannot remove it in 3.0, it seems there would be a 4.0 in about a
> year(?) when ZK removal is finished and we can remove the deprecated
> code then.
> 
> 
> -Matthias
> 
> On 5/28/20 7:39 AM, John Roesler wrote:
> > Hi Yuriy,
> > 
> > Sounds good to me! I had a feeling we were bringing different context
> > to the discussion; thanks for sticking with the conversation until we got
> > it hashed out.
> > 
> > I'm glad you prefer Serde*s*, since having multiple different classes with
> > the same name leads to all kinds of trouble. "Serdes" seems relatively
> > safe because people in the Scala lib won't be using the Java Serdes class,
> > and they won't be using the deprecated and non-deprecated one at the
> > same time.
> > 
> > Thanks again,
> > -John
> > 
> > On Thu, May 28, 2020, at 02:21, Yuriy Badalyantc wrote:
> >> Ok, I understood you, John. I wasn't sure about kafka deprecation policy
> >> and thought that the full cycle could be done with 2.7 version. Waiting for
> >> 3.0 is too much, I agree with it.
> >>
> >> So, I think creating one more `Serdes` in another package is our way. I
> >> suggest one of the following:
> >> 1. `org.apache.kafka.streams.scala.serde.Serdes`
> >> 2. `org.apache.kafka.streams.scala.serialization.Serdes`
> >>
> >> About `Serde` vs `Serdes`. I'm strongly against `Serde` because it would
> >> lead to a new name clash with the
> >> `org.apache.kafka.common.serialization.Serde`.
> >>
> >> - Yuriy
> >>
> >> On Thu, May 28, 2020 at 11:12 AM John Roesler  wrote:
> >>
> >>> Hi Yuriy,
> >>>
> >>> Thanks for the clarification.
> >>>
> >>> I guess my concern is twofold:
> >>> 1. We typically leave deprecated methods in place for at least a major
> >>> release cycle before removing them, so it would seem abrupt to have a
> >>> deprecation period of only one minor release. If we follow the same 
> >>> pattern
> >>> here, it would take over a year to finish this KIP.
> >>> 2. It doesn’t seem like there is a nonbreaking deprecation path at all if
> >>> people enumerate their imports (if they don’t use a wildcard). In that
> >>> case, they would have no path to implicitly use the newly named serdes, 
> >>> and
> >>> therefore they would have no way to avoid continuing to use the deprecated
> >>> ones.
> >>>
> >>> Since you mentioned that your reason is mainly the preference for the name
> >>> “Serde” or “Serdes”, can we explore just using one of those? Would it 
> >>> cause
> >>> some kind of conflict to use org.apache.kafka.streams.scala.Serde or to 
> >>> use
> >>> Serdes in a different package, like
> >>> org.apache.kafka.streams.scala.implicit.Serdes?
> >>>
> >>> I empathize with this desire. I faced the same dilemma when I wanted to
> >>> replace Processor but keep the class name in KIP-478. I wound up creating 
> >>> a
> >>> new package for the new Processor.
> >>>
> >>> Thanks,
> >>> John
> >>>
> >>> On Wed, May 27, 2020, at 22:20, Yuriy Badalyantc wrote:
>  Hi John,
> 
>  I'm sticking with the `org.apache.kafka.streams.scala.Serdes` because it's
>  sort of conventional in the Scala community. If you have a typeclass
> >>> `Foo`,
>  you will probably look for `Foo`-related stuff in `Foo` or maybe `Foos`
>  

Re: [DISCUSS] KIP-616: Rename implicit Serdes instances in kafka-streams-scala

2020-05-28 Thread Matthias J. Sax
I am not a Scala person, so I cannot really contribute much. However, for
the deprecation period, if we get the change into 2.7, it might be ok to
remove the deprecated classes in 3.0.

It would only be one minor release in between, which is a little bit short
(we usually prefer at least two minor releases, better three), but if we
have a good reason for it, it might be ok.

If we cannot remove it in 3.0, it seems there would be a 4.0 in about a
year(?) when ZK removal is finished and we can remove the deprecated
code then.


-Matthias

On 5/28/20 7:39 AM, John Roesler wrote:
> Hi Yuriy,
> 
> Sounds good to me! I had a feeling we were bringing different context
> to the discussion; thanks for sticking with the conversation until we got
> it hashed out.
> 
> I'm glad you prefer Serde*s*, since having multiple different classes with
> the same name leads to all kinds of trouble. "Serdes" seems relatively
> safe because people in the Scala lib won't be using the Java Serdes class,
> and they won't be using the deprecated and non-deprecated one at the
> same time.
> 
> Thanks again,
> -John
> 
> On Thu, May 28, 2020, at 02:21, Yuriy Badalyantc wrote:
>> Ok, I understood you, John. I wasn't sure about kafka deprecation policy
>> and thought that the full cycle could be done with 2.7 version. Waiting for
>> 3.0 is too much, I agree with it.
>>
>> So, I think creating one more `Serdes` in another package is our way. I
>> suggest one of the following:
>> 1. `org.apache.kafka.streams.scala.serde.Serdes`
>> 2. `org.apache.kafka.streams.scala.serialization.Serdes`
>>
>> About `Serde` vs `Serdes`. I'm strongly against `Serde` because it would
>> lead to a new name clash with the
>> `org.apache.kafka.common.serialization.Serde`.
>>
>> - Yuriy
>>
>> On Thu, May 28, 2020 at 11:12 AM John Roesler  wrote:
>>
>>> Hi Yuriy,
>>>
>>> Thanks for the clarification.
>>>
>>> I guess my concern is twofold:
>>> 1. We typically leave deprecated methods in place for at least a major
>>> release cycle before removing them, so it would seem abrupt to have a
>>> deprecation period of only one minor release. If we follow the same pattern
>>> here, it would take over a year to finish this KIP.
>>> 2. It doesn’t seem like there is a nonbreaking deprecation path at all if
>>> people enumerate their imports (if they don’t use a wildcard). In that
>>> case, they would have no path to implicitly use the newly named serdes, and
>>> therefore they would have no way to avoid continuing to use the deprecated
>>> ones.
>>>
>>> Since you mentioned that your reason is mainly the preference for the name
>>> “Serde” or “Serdes”, can we explore just using one of those? Would it cause
>>> some kind of conflict to use org.apache.kafka.streams.scala.Serde or to use
>>> Serdes in a different package, like
>>> org.apache.kafka.streams.scala.implicit.Serdes?
>>>
>>> I empathize with this desire. I faced the same dilemma when I wanted to
>>> replace Processor but keep the class name in KIP-478. I wound up creating a
>>> new package for the new Processor.
>>>
>>> Thanks,
>>> John
>>>
>>> On Wed, May 27, 2020, at 22:20, Yuriy Badalyantc wrote:
 Hi John,

 I'm sticking with the `org.apache.kafka.streams.scala.Serdes` because it's
 sort of conventional in the Scala community. If you have a typeclass
>>> `Foo`,
 you will probably look for `Foo`-related stuff in `Foo` or maybe `Foos`
 (plural). All other places are far less discoverable for developers.

 I agree that the migration path is a bit complex for such a change. But I
 think it's more important to provide good developer experience than to
 simplify migration. Also, I think it's debatable which migration path is
 better for library users. If we created, for example, `Serdes2`,
 library users would have to modify their code if they used any part of the
 old `Serde`. With my approach, most of the old code will still work
>>> without
 changes. Only explicit usage of implicits will need to be fixed (because
 names will be changed, and old names will be deprecated). Wildcard
>>> imports
 will work without changes and will not lead to a name clash. Moreover,
>>> many
 users may not notice name clash problems. And with my migration path,
>>> they
 will not notice any changes at all.

 - Yuriy

 On Thu, May 28, 2020 at 7:48 AM John Roesler 
>>> wrote:

> Hi Yuriy,
>
> Thanks for the reply. I guess I've been out of the Scala game for a
> while; all this summoner business is totally new to me.
>
> I think I followed the rationale you provided, but I still don't see
> why you can't implement your whole plan in a new class. What
> is special about the existing Serdes class?
>
> Thanks,
> -John
>
> On Tue, May 19, 2020, at 01:18, Yuriy Badalyantc wrote:
>> Hi John,
>>
>> Your suggestion looks interesting. I think it's technically doable.
>>> But
> I'm
>> not 

Jenkins build is back to normal : kafka-trunk-jdk8 #4584

2020-05-28 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-405: Kafka Tiered Storage

2020-05-28 Thread Jun Rao
Hi, Satish,

Made another pass on the wiki. A few more comments below.

100. It would be useful to provide more details on how those APIs are used.
Otherwise, it's kind of hard to really assess whether the new APIs are
sufficient/redundant. A few examples below.
100.1 deleteRecords seems to only advance the logStartOffset in Log. How
does that trigger the deletion of remote log segments?
100.2 stopReplica with deletion is used in 2 cases (a) replica
reassignment; (b) topic deletion. We only want to delete the tiered
metadata in the second case. Also, in the second case, who initiates the
deletion of the remote segment since the leader may not exist?
100.3 "LogStartOffset of a topic can be either in local or in remote
storage." If LogStartOffset exists in both places, which one is the source
of truth?
100.4 List listRemoteLogSegments(TopicPartition
topicPartition, long minOffset): How is minOffset supposed to be used?
100.5 When copying a segment to remote storage, it seems we are calling the
same RLMM.putRemoteLogSegmentData() twice before and
after copyLogSegment(). Could you explain why?
100.6 LogSegmentData includes leaderEpochCache, but there is no api
in RemoteStorageManager to retrieve it.

101. If the __remote_log_metadata is for production usage, could you
provide more details? For example, what is the schema of the data (both
key and value)? How is the topic maintained: delete or compact?

110. Is the cache implementation in RemoteLogMetadataManager meant for
production usage? If so, could you provide more details on the schema and
how/where the data is stored?

111. "Committed offsets can be stored in a local file". Could you describe
the format of the file and where it's stored?

112. Truncation of remote segments under unclean leader election: I am not
sure who figures out the truncated remote segments and how that information
is propagated to all replicas?

113. "If there are any failures in removing remote log segments then those
are stored in a specific topic (default as
__remote_segments_to_be_deleted)". Is it necessary to add yet another
internal topic? Could we just keep retrying?

114. "We may not need to copy producer-id-snapshot as we are copying only
segments earlier to last-stable-offset." Hmm, not sure about that. The
producer snapshot includes things like the last timestamp of each open
producer id and can affect when those producer ids are expired.
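
On the atomicity point in 100.5, the two-step shape suggested in my earlier
comment 102 (quoted below) could look roughly like this; a sketch only, with
illustrative names and parameter types rather than the KIP's actual interface:

// hypothetical shape of the two-phase metadata write from comment 102
trait TwoPhasePut {
  // 1. record intent before uploading the segment bytes
  def preparePutRemoteLogSegmentData(segmentId: String): Unit
  // 2. upload the segment via copyLogSegment(), then
  // 3. commit; a crash between steps 1 and 3 leaves a "prepared" entry
  //    that a background cleaner can use to reclaim the orphaned object
  def commitPutRemoteLogSegmentData(segmentId: String): Unit
}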

Thanks,

Jun

On Thu, May 28, 2020 at 5:38 AM Satish Duggana 
wrote:

> Hi Jun,
> Gentle reminder. Please go through the updated wiki and let us know your
> comments.
>
> Thanks,
> Satish.
>
> On Tue, May 19, 2020 at 3:50 PM Satish Duggana 
> wrote:
>
>> Hi Jun,
>> Please go through the wiki which has the latest updates. Google doc is
>> updated frequently to be in sync with wiki.
>>
>> Thanks,
>> Satish.
>>
>> On Tue, May 19, 2020 at 12:30 AM Jun Rao  wrote:
>>
>>> Hi, Satish,
>>>
>>> Thanks for the update. Just to clarify. Which doc has the latest
>>> updates, the wiki or the google doc?
>>>
>>> Jun
>>>
>>> On Thu, May 14, 2020 at 10:38 AM Satish Duggana <
>>> satish.dugg...@gmail.com> wrote:
>>>
 Hi Jun,
 Thanks for your comments.  We updated the KIP with more details.

 >100. For each of the operations related to tiering, it would be useful
 to provide a description on how it works with the new API. These include
 things like consumer fetch, replica fetch, offsetForTimestamp, retention
 (remote and local) by size, time and logStartOffset, topic deletion, etc.
 This will tell us if the proposed APIs are sufficient.

 We addressed most of these APIs in the KIP. We can add more details if
 needed.

 >101. For the default implementation based on internal topic, is it
 meant as a proof of concept or for production usage? I assume that it's the
 former. However, if it's the latter, then the KIP needs to describe the
 design in more detail.

 It is for production usage, as mentioned in an earlier mail. We plan to
 update this section in the next few days.

 >102. When tiering a segment, the segment is first written to the
 object store and then its metadata is written to RLMM using the api "void
 putRemoteLogSegmentData()". One potential issue with this approach is
 that if the system fails after the first operation, it leaves garbage in
 the object store that's never reclaimed. One way to improve this is to have
 two separate APIs, sth like preparePutRemoteLogSegmentData() and
 commitPutRemoteLogSegmentData().

 That is a good point. We currently have a different way using markers
 in the segment but your suggestion is much better.

 >103. It seems that the transactional support and the ability to read
 from follower are missing.

 KIP is updated with transactional support, follower fetch semantics,
 and reading from a follower.

 >104. It would be useful to provide a testing plan for 

Jenkins build is back to normal : kafka-trunk-jdk14 #139

2020-05-28 Thread Apache Jenkins Server
See 




[VOTE] KIP-418: A method-chaining way to branch KStream

2020-05-28 Thread Ivan Ponomarev

Hello all!

I'd like to start the vote for KIP-418, which proposes deprecating the 
current `branch` method and provides a method-chaining API for 
branching.


https://cwiki.apache.org/confluence/display/KAFKA/KIP-418%3A+A+method-chaining+way+to+branch+KStream
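
For context, the branching style the KIP proposes looks roughly like this;
an illustrative sketch only, the wiki page above is authoritative and the
final names may differ:

stream
  .split()
  .branch((k, v) => v.startsWith("A"), Branched.as("a-records"))
  .branch((k, v) => v.startsWith("B"), Branched.as("b-records"))
  .defaultBranch(Branched.as("other"))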

Regards,

Ivan


Build failed in Jenkins: kafka-trunk-jdk8 #4583

2020-05-28 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Update documentation.html to refer to 2.5 (#8744)

[github] MINOR: Update documentation.html to refer to 2.6 (#8745)


--
[...truncated 6.25 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis STARTED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis PASSED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime STARTED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime PASSED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep PASSED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
PASSED

org.apache.kafka.streams.test.TestRecordTest > testConsumerRecord STARTED

org.apache.kafka.streams.test.TestRecordTest > testConsumerRecord PASSED

org.apache.kafka.streams.test.TestRecordTest > testToString STARTED

org.apache.kafka.streams.test.TestRecordTest > testToString PASSED

org.apache.kafka.streams.test.TestRecordTest > testInvalidRecords STARTED

org.apache.kafka.streams.test.TestRecordTest > testInvalidRecords PASSED

org.apache.kafka.streams.test.TestRecordTest > testPartialConstructorEquals 
STARTED

org.apache.kafka.streams.test.TestRecordTest > testPartialConstructorEquals 
PASSED

org.apache.kafka.streams.test.TestRecordTest > testMultiFieldMatcher STARTED


[jira] [Created] (KAFKA-10063) UnsupportedOperation when querying cleaner metrics after shutdown

2020-05-28 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10063:
---

 Summary: UnsupportedOperation when querying cleaner metrics after 
shutdown
 Key: KAFKA-10063
 URL: https://issues.apache.org/jira/browse/KAFKA-10063
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson


We have a few log cleaner metrics which iterate the set of cleaners. For 
example:

{code}
  newGauge("max-clean-time-secs", () => 
cleaners.iterator.map(_.lastStats.elapsedSecs).max.toInt)
{code}

It seems possible currently for LogCleaner metrics to get queried after 
shutdown of the log cleaner, which clears the `cleaners` collection. This can 
lead to the following error:
{code}
java.lang.UnsupportedOperationException: empty.max
at scala.collection.IterableOnceOps.max(IterableOnce.scala:952)
at scala.collection.IterableOnceOps.max$(IterableOnce.scala:950)
at scala.collection.AbstractIterator.max(Iterator.scala:1279)
at 
kafka.log.LogCleaner.kafka$log$LogCleaner$$$anonfun$new$9(LogCleaner.scala:132)
at kafka.log.LogCleaner$$anonfun$4.value(LogCleaner.scala:132)
at kafka.log.LogCleaner$$anonfun$4.value(LogCleaner.scala:132)
{code}
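
One possible fix is to make these gauges tolerant of an empty collection; a 
sketch, assuming that reporting 0 after shutdown is acceptable:

{code}
// hypothetical guard: avoid calling max on an empty iterator after the
// cleaners collection has been cleared on shutdown
newGauge("max-clean-time-secs",
  () => cleaners.iterator.map(_.lastStats.elapsedSecs).maxOption.getOrElse(0.0).toInt)
{code}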



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Jenkins build is back to normal : kafka-trunk-jdk11 #1510

2020-05-28 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-2.5-jdk8 #136

2020-05-28 Thread Apache Jenkins Server
See 


Changes:

[rhauch] MINOR: Update documentation.html to refer to 2.5 (#8744)


--
[...truncated 2.92 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled 

[jira] [Created] (KAFKA-10062) Add a method to retrieve the current timestamp as known by the Streams app

2020-05-28 Thread Piotr Smolinski (Jira)
Piotr Smolinski created KAFKA-10062:
---

 Summary: Add a method to retrieve the current timestamp as known 
by the Streams app
 Key: KAFKA-10062
 URL: https://issues.apache.org/jira/browse/KAFKA-10062
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 2.5.0
Reporter: Piotr Smolinski


Please add to the ProcessorContext a method to retrieve the current timestamp, 
compatible with the Punctuator#punctuate(long) method.

Proposal in ProcessorContext:

long getTimestamp(PunctuationType type);

The method should return the time value as known by the Punctuator scheduler for 
the respective PunctuationType.

The use-case is tracking of a process with timeout-based escalation.

A transformer receives process events and, in case of a missing event, executes 
an action (emits a message) after a given escalation timeout (several stages). The 
initial message may already arrive with a reference timestamp in the past and may 
trigger a different action upon arrival depending on how far in the past it is.

If the timeout should be computed against some further time only, Punctuator is 
perfectly sufficient. The problem is that I have to evaluate the current 
time-related state once the message arrives.

I am using wall-clock time. Normally accessing System.currentTimeMillis() is 
sufficient, but it breaks in unit testing with TopologyTestDriver, where the 
app wall clock time is different from the system-wide one.

To access the mentioned clock I am using reflection to access 
ProcessorContextImpl#task and then StreamTask#time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10061) Flaky Test `ReassignPartitionsIntegrationTest.testCancellation`

2020-05-28 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10061:
---

 Summary: Flaky Test `ReassignPartitionsIntegrationTest.testCancellation`
 Key: KAFKA-10061
 URL: https://issues.apache.org/jira/browse/KAFKA-10061
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson


We have seen this a few times:
```
org.scalatest.exceptions.TestFailedException: Timed out waiting for 
verifyAssignment result VerifyAssignmentResult(Map(foo-0 -> 
PartitionReassignmentState(List(0, 1, 3, 2),List(0, 1, 3),false), baz-1 -> 
PartitionReassignmentState(List(0, 2, 3, 1),List(0, 2, 
3),false)),true,Map(),false).  The latest result was 
VerifyAssignmentResult(Map(foo-0 -> PartitionReassignmentState(ArrayBuffer(0, 
1, 3),List(0, 1, 3),true), baz-1 -> PartitionReassignmentState(ArrayBuffer(0, 
2, 3),List(0, 2, 3),true)),false,HashMap(),false)
```
It looks like the reassignment is completing earlier than the test expects. See 
the following from the log:

```
Successfully started partition reassignments for baz-1,foo-0
==> verifyAssignment(adminClient, 
jsonString={"version":1,"partitions":[{"topic":"foo","partition":0,"replicas":[0,1,3],"log_dirs":["any","any","any"]},{"topic":"baz","partition":1,"replicas":[0,2,3],"log_dirs":["any","any","any"]}]})
Status of partition reassignment:
Reassignment of partition baz-1 is still in progress.
Reassignment of partition foo-0 is complete.
```

A successful run looks like this:
```
Successfully started partition reassignments for baz-1,foo-0
==> verifyAssignment(adminClient, 
jsonString={"version":1,"partitions":[{"topic":"foo","partition":0,"replicas":[0,1,3],"log_dirs":["any","any","any"]},{"topic":"baz","partition":1,"replicas":[0,2,3],"log_dirs":["any","any","any"]}]})
Status of partition reassignment:
Reassignment of partition baz-1 is still in progress.
Reassignment of partition foo-0 is still in progress.
```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: kafka-trunk-jdk14 #138

2020-05-28 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-9673: Filter and Conditional SMTs (#8699)


--
[...truncated 6.78 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 

New release branch 2.6

2020-05-28 Thread Randall Hauch
Hello Kafka developers and friends,

We now have a release branch for the 2.6 release. The branch name is "2.6"
and the version will be "2.6.0". Trunk will shortly be bumped to the
next snapshot version 2.7.0-SNAPSHOT (
https://github.com/apache/kafka/pull/8746).

I'll be going over the JIRAs to move every non-blocker feature from this
release to the next release. If you have any questions or concerns, please
ask on the "Apache Kafka 2.6.0 release" discussion thread.

From this point, most changes should go to trunk. However, all bug fixes
are still welcome on the release branch until the code freeze on June 10.
After that, only blocker bugs should be merged to the release branch.

Blockers (existing and new that we discover while testing the release) will
be committed to trunk and backported to the 2.6 release branch.

Please discuss with your reviewer whether your PR should go to trunk or to
trunk+release so they can merge accordingly.

As always, please help us test the release!

Thanks!
Randall Hauch


Build failed in Jenkins: kafka-trunk-jdk11 #1509

2020-05-28 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-9673: Filter and Conditional SMTs (#8699)


--
[...truncated 8.46 MB...]

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecord STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecord PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecord STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecord PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > shouldAdvanceTime 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > shouldAdvanceTime 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairs STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairsAndCustomTimestamps
 STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairsAndCustomTimestamps
 PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairs 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicNameAndTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicNameAndTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairsAndCustomTimestamps
 STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairsAndCustomTimestamps
 PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicName PASSED


[jira] [Created] (KAFKA-10060) Kafka is logging too verbosely at the INFO level

2020-05-28 Thread Greg Hamilton (Jira)
Greg Hamilton created KAFKA-10060:
-

 Summary: Kafka is logging too verbosely at the INFO level
 Key: KAFKA-10060
 URL: https://issues.apache.org/jira/browse/KAFKA-10060
 Project: Kafka
  Issue Type: Bug
  Components: logging
Affects Versions: 2.1.0
Reporter: Greg Hamilton


Some of the INFO-level log4j entries are quite verbose and not really useful. 
For example, in kafka.coordinator.group.GroupMetadataManager, the following log 
line can be printed constantly even when 0 offsets were expired:

 
{code:java}
info(s"Removed $numOffsetsRemoved expired offsets in ${time.milliseconds() - currentTimestamp} milliseconds.")
{code}

*Other examples include:*

kafka.coordinator.group.GroupCoordinator:

 
{code:java}
info(s"Group ${group.groupId} with generation ${group.generationId} is now 
empty " +
s"(${Topic.GROUP_METADATA_TOPIC_NAME}-${partitionFor(group.groupId)})")
{code}
{code:java}
info(s"Preparing to rebalance group ${group.groupId} in state 
${group.currentState} with old generation " +  s"${group.generationId} 
(${Topic.GROUP_METADATA_TOPIC_NAME}-${partitionFor(group.groupId)}) (reason: 
$reason)")
{code}
{code:java}
info(s"Assignment received from leader for group ${group.groupId} for 
generation ${group.generationId}")
{code}
{code:java}
info(s"Stabilized group ${group.groupId} generation ${group.generationId} " + 
s"(${Topic.GROUP_METADATA_TOPIC_NAME}-${partitionFor(group.groupId)})")
{code}
 

 

We should move these messages to DEBUG if they are expected during normal operation.
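
As an interim workaround, users can raise the threshold for the noisiest classes in the broker's log4j configuration without any code change. A minimal sketch, assuming the stock config/log4j.properties layout shipped with the broker:

{noformat}
# Quiet only the chatty group coordinator classes; everything else stays at INFO.
log4j.logger.kafka.coordinator.group.GroupMetadataManager=WARN
log4j.logger.kafka.coordinator.group.GroupCoordinator=WARN
{noformat}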



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: kafka-trunk-jdk8 #4582

2020-05-28 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-9673: Filter and Conditional SMTs (#8699)


--
[...truncated 6.24 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis STARTED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis PASSED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime STARTED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime PASSED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep PASSED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 

Re: [DISCUSS] KIP-602 - Change default value for client.dns.lookup

2020-05-28 Thread Ismael Juma
+1. I think we should remove this config in AK 3.0, but in the meantime, we
can log a warning if people explicitly set the value to `default`. I think
this would be pretty rare.

Ismael

On Thu, May 28, 2020 at 10:25 AM Rajini Sivaram 
wrote:

> It is a bit confusing to have a `default` value that is not the default and
> in hindsight it wasn't a good choice of name. But agree that changing the
> config default and avoiding the temporary `use_first_dns_ip` option makes
> sense.
>
> Regards,
>
> Rajini
>
>
> On Thu, May 28, 2020 at 4:49 PM Badai Aqrandista 
> wrote:
>
> > Ismael/Rajini
> >
> > I have put some comments in the PR in response to Ismael's.
> >
> > I have some questions about Ismael's suggestion to not add
> > "use_first_dns_ip" at all and instead just deprecate "default".
> >
> > The PR would be much cleaner if we just deprecate "default" as you
> > suggested. But we will need to update some core code. And I will need to
> > update the KIP to reflect this.
> >
> > ClientDnsLookup.DEFAULT is used in few places in core (server):
> >
> >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L520
> >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ReplicaFetcherBlockingSend.scala#L92
> >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L156
> >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/transaction/TransactionMarkerChannelManager.scala#L82
> >
> > And a couple of tools:
> >
> >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala#L482
> >
> >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/BrokerApiVersionsCommand.scala#L299
> >
> > And some tests.
> >
> > What do you think?
> >
> > Thanks
> > Badai
> >
> > On Fri, May 22, 2020 at 6:45 PM Badai Aqrandista 
> > wrote:
> >
> > > Voting thread has been posted.
> > >
> > > KIP-602 page has been updated with suggestions from Rajini.
> > >
> > > Thanks
> > > Badai
> > >
> > > On Fri, May 22, 2020 at 6:00 AM Ismael Juma  wrote:
> > >
> > >> Badai, would you like to start a vote on this KIP?
> > >>
> > >> Ismael
> > >>
> > >> On Wed, May 20, 2020 at 7:45 AM Rajini Sivaram <
> rajinisiva...@gmail.com
> > >
> > >> wrote:
> > >>
> > >> > Deprecating for removal in 3.0 sounds good.
> > >> >
> > >> > On Wed, May 20, 2020 at 3:33 PM Ismael Juma 
> > wrote:
> > >> >
> > >> > > Is there any reason to use "use_first_dns_ip"? Should we remove it
> > >> > > completely? Or at least deprecate it for removal in 3.0?
> > >> > >
> > >> > > Ismael
> > >> > >
> > >> > >
> > >> > > On Wed, May 20, 2020, 1:39 AM Rajini Sivaram <
> > rajinisiva...@gmail.com
> > >> >
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Badai,
> > >> > > >
> > >> > > > Thanks for the KIP, sounds like a useful change. Perhaps we
> should
> > >> call
> > >> > > the
> > >> > > > new option `use_first_dns_ip` (not `_ips` since it refers to
> one).
> > >> We
> > >> > > > should also mention in the KIP that only one type of address
> (ipv4
> > >> or
> > >> > > ipv6,
> > >> > > > based on the first one) will be used - that is the current
> > behaviour
> > >> > for
> > >> > > > `use_all_dns_ips`.  Since we are changing `default` to be
> exactly
> > >> the
> > >> > > same
> > >> > > > as `use_all_dns_ips`, it will be good to mention that explicitly
> > >> under
> > >> > > > Public Interfaces.
> > >> > > >
> > >> > > > Regards,
> > >> > > >
> > >> > > > Rajini
> > >> > > >
> > >> > > >
> > >> > > > On Mon, May 18, 2020 at 1:44 AM Badai Aqrandista <
> > >> ba...@confluent.io>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Ismael
> > >> > > > >
> > >> > > > > What do you think of the PR and the explanation regarding the
> > >> issue
> > >> > > > raised
> > >> > > > > in KIP-235?
> > >> > > > >
> > >> > > > > Should I go ahead and build a proper PR?
> > >> > > > >
> > >> > > > > Thanks
> > >> > > > > Badai
> > >> > > > >
> > >> > > > > On Mon, May 11, 2020 at 8:53 AM Badai Aqrandista <
> > >> ba...@confluent.io
> > >> > >
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Ismael
> > >> > > > > >
> > >> > > > > > PR created: https://github.com/apache/kafka/pull/8644/files
> > >> > > > > >
> > >> > > > > > Also, as this is my first PR, please let me know if I missed
> > >> > > anything.
> > >> > > > > >
> > >> > > > > > Thanks
> > >> > > > > > Badai
> > >> > > > > >
> > >> > > > > > On Mon, May 11, 2020 at 8:19 AM Badai Aqrandista <
> > >> > ba...@confluent.io
> > >> > > >
> > >> > > > > > wrote:
> > >> > > > > >
> > >> > > > > >> Ismael
> > >> > > > > >>
> > >> > > > > >> Thank you for responding.
> > >> > > > > >>
> > >> > > > > >> KIP-235 modified ClientUtils#parseAndValidateAddresses [1]
> to
> > >> > > resolve
> > >> > > > an
> > >> > > > > >> address alias (i.e. bootstrap server) into multiple
> > addresses.
> > >> > This
> > >> > > 

Re: [DISCUSS] KIP-602 - Change default value for client.dns.lookup

2020-05-28 Thread Rajini Sivaram
It is a bit confusing to have a `default` value that is not the default and
in hindsight it wasn't a good choice of name. But agree that changing the
config default and avoiding the temporary `use_first_dns_ip` option makes
sense.

Regards,

Rajini


On Thu, May 28, 2020 at 4:49 PM Badai Aqrandista  wrote:

> Ismael/Rajini
>
> I have put some comments in the PR in response to Ismael's.
>
> I have some questions about Ismael's suggestion to not add
> "use_first_dns_ip" at all and instead just deprecate "default".
>
> The PR would be much cleaner if we just deprecate "default" as you
> suggested. But we will need to update some core code. And I will need to
> update the KIP to reflect this.
>
> ClientDnsLookup.DEFAULT is used in few places in core (server):
>
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L520
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ReplicaFetcherBlockingSend.scala#L92
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L156
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/transaction/TransactionMarkerChannelManager.scala#L82
>
> And a couple of tools:
>
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala#L482
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/BrokerApiVersionsCommand.scala#L299
>
> And some tests.
>
> What do you think?
>
> Thanks
> Badai
>
> On Fri, May 22, 2020 at 6:45 PM Badai Aqrandista 
> wrote:
>
> > Voting thread has been posted.
> >
> > KIP-602 page has been updated with suggestions from Rajini.
> >
> > Thanks
> > Badai
> >
> > On Fri, May 22, 2020 at 6:00 AM Ismael Juma  wrote:
> >
> >> Badai, would you like to start a vote on this KIP?
> >>
> >> Ismael
> >>
> >> On Wed, May 20, 2020 at 7:45 AM Rajini Sivaram  >
> >> wrote:
> >>
> >> > Deprecating for removal in 3.0 sounds good.
> >> >
> >> > On Wed, May 20, 2020 at 3:33 PM Ismael Juma 
> wrote:
> >> >
> >> > > Is there any reason to use "use_first_dns_ip"? Should we remove it
> >> > > completely? Or at least deprecate it for removal in 3.0?
> >> > >
> >> > > Ismael
> >> > >
> >> > >
> >> > > On Wed, May 20, 2020, 1:39 AM Rajini Sivaram <
> rajinisiva...@gmail.com
> >> >
> >> > > wrote:
> >> > >
> >> > > > Hi Badai,
> >> > > >
> >> > > > Thanks for the KIP, sounds like a useful change. Perhaps we should
> >> call
> >> > > the
> >> > > > new option `use_first_dns_ip` (not `_ips` since it refers to one).
> >> We
> >> > > > should also mention in the KIP that only one type of address (ipv4
> >> or
> >> > > ipv6,
> >> > > > based on the first one) will be used - that is the current
> behaviour
> >> > for
> >> > > > `use_all_dns_ips`.  Since we are changing `default` to be exactly
> >> the
> >> > > same
> >> > > > as `use_all_dns_ips`, it will be good to mention that explicitly
> >> under
> >> > > > Public Interfaces.
> >> > > >
> >> > > > Regards,
> >> > > >
> >> > > > Rajini
> >> > > >
> >> > > >
> >> > > > On Mon, May 18, 2020 at 1:44 AM Badai Aqrandista <
> >> ba...@confluent.io>
> >> > > > wrote:
> >> > > >
> >> > > > > Ismael
> >> > > > >
> >> > > > > What do you think of the PR and the explanation regarding the
> >> issue
> >> > > > raised
> >> > > > > in KIP-235?
> >> > > > >
> >> > > > > Should I go ahead and build a proper PR?
> >> > > > >
> >> > > > > Thanks
> >> > > > > Badai
> >> > > > >
> >> > > > > On Mon, May 11, 2020 at 8:53 AM Badai Aqrandista <
> >> ba...@confluent.io
> >> > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Ismael
> >> > > > > >
> >> > > > > > PR created: https://github.com/apache/kafka/pull/8644/files
> >> > > > > >
> >> > > > > > Also, as this is my first PR, please let me know if I missed
> >> > > anything.
> >> > > > > >
> >> > > > > > Thanks
> >> > > > > > Badai
> >> > > > > >
> >> > > > > > On Mon, May 11, 2020 at 8:19 AM Badai Aqrandista <
> >> > ba...@confluent.io
> >> > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > >> Ismael
> >> > > > > >>
> >> > > > > >> Thank you for responding.
> >> > > > > >>
> >> > > > > >> KIP-235 modified ClientUtils#parseAndValidateAddresses [1] to
> >> > > resolve
> >> > > > an
> >> > > > > >> address alias (i.e. bootstrap server) into multiple
> addresses.
> >> > This
> >> > > is
> >> > > > > why
> >> > > > > >> it would break SSL hostname verification when the bootstrap
> >> server
> >> > > is
> >> > > > > an IP
> >> > > > > >> address, i.e. it will resolve the IP address to an FQDN and
> use
> >> > that
> >> > > > > FQDN
> >> > > > > >> in the SSL handshake.
> >> > > > > >>
> >> > > > > >> However, what I am proposing is to modify ClientUtils#resolve
> >> [2],
> >> > > > which
> >> > > > > >> is only used in ClusterConnectionStates#currentAddress [3],
> to
> >> get
> >> > > the
> >> > > > > >> resolved InetAddress of the address to connect to. And
> >> > > > > >> 

[jira] [Created] (KAFKA-10059) KafkaAdminClient returns null OffsetAndMetadata value when there is no committed offset for a partition

2020-05-28 Thread Bob Barrett (Jira)
Bob Barrett created KAFKA-10059:
---

 Summary: KafkaAdminClient returns null OffsetAndMetadata value 
when there is no committed offset for a partition
 Key: KAFKA-10059
 URL: https://issues.apache.org/jira/browse/KAFKA-10059
 Project: Kafka
  Issue Type: Improvement
Reporter: Bob Barrett


When listing consumer group offsets through the admin client, the map that we 
return has a null `OffsetAndMetadata` value if the partition has no committed 
offset. It would be better to return a non-null value that indicates the lack 
of an offset, such as `OffsetAndMetadata(-1, Optional.empty(), metadata)`. 
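
Until that change lands, callers have to defend against the null values themselves. A minimal sketch of the current workaround (the broker address and group id are placeholders):

{code:java}
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommittedOffsets {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            Map<TopicPartition, OffsetAndMetadata> offsets =
                admin.listConsumerGroupOffsets("my-group") // placeholder group id
                     .partitionsToOffsetAndMetadata()
                     .get();
            // Today, partitions without a committed offset map to null; drop them here.
            Map<TopicPartition, OffsetAndMetadata> committedOnly = offsets.entrySet().stream()
                .filter(e -> e.getValue() != null)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
            System.out.println(committedOnly);
        }
    }
}
{code}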



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-589: Add API to update Replica state in Controller

2020-05-28 Thread Guozhang Wang
David, thanks for the KIP. I'm +1 on it as well.

One note is that in the post-ZK world, we would need a different way to get the
broker epoch, since today it is derived from the ZK version. I believe we will
have this discussion in a different KIP though.


Guozhang

On Wed, May 27, 2020 at 8:26 PM Colin McCabe  wrote:

> Thanks, David.  +1 (binding).
>
> cheers,
> Colin
>
> On Wed, May 27, 2020, at 18:21, David Arthur wrote:
> > Colin, thanks for the feedback. Good points. I've updated the KIP with
> your
> > suggestions.
> >
> > -David
> >
> > On Wed, May 27, 2020 at 4:05 PM Colin McCabe  wrote:
> >
> > > Hi David,
> > >
> > > Thanks for the KIP!
> > >
> > > The KIP refers to "the KIP-500 bridge release (version 2.6.0 as of the
> > > time of this proposal)".  This is out of date-- the bridge release
> will be
> > > one of the 3.x releases.  We should either update this to 3.0, or
> perhaps
> > > just take out the reference to a specific version, since it's not
> necessary
> > > to understand the rest of the KIP.
> > >
> > > > ... and potentially could replace the existing controlled shutdown
> RPC.
> > > Since this RPC
> > > > is somewhat generic, it could also be used to mark a replicas a
> "online"
> > > following some
> > > > kind of log dir recovery procedure (out of scope for this proposal).
> > >
> > > I think it would be good to move this part into the "Future Work"
> section.
> > >
> > > > The Reason field is an optional textual description of why the event
> is
> > > being sent
> > >
> > > Since we implemented optional fields in KIP-482, describing this field
> as
> > > "optional" might be confusing.  Probably better to avoid describing it
> that
> > > way, unless it's a tagged field.
> > >
> > > > - If no Topic is given, it is implied that all topics on this broker
> are
> > > being indicated
> > > > - If a Topic and no partitions are given, it is implied that all
> > > partitions of this topic are being indicated
> > >
> > > I would prefer to leave out these "shortcuts" since they seem likely to
> > > lead to confusion and bugs.
> > >
> > > For example, suppose that  the controller has just created a new
> partition
> > > for topic "foo" and put it on broker 3.  But then, before broker 3
> gets the
> > > LeaderAndIsrRequest from the controller, broker 3 get a bad log
> directory.
> > > So it sends an AlterReplicaStateRequest to the controller specifying
> topic
> > > foo and leaving out the partition list (using the first "shortcut".)
> The
> > > new partition will get marked as offline even though it hasn't even
> been
> > > created, much less assigned to the bad log directory.
> > >
> > > Since log directory failures are rare, spelling out the full set of
> > > affected partitions when one happens doesn't seem like that much of a
> > > burden.  This is also consistent with what we currently do.  In fact,
> it's
> > > much more efficient than what we currently do, since with KIP-589, we
> won't
> > > have to enumerate partitions that aren't on the failed log directory.
> > >
> > > In the future work section: If we eventually want to replace
> > > ControlledShutdownRequest with this RPC, we'll need some additional
> > > functionality.  Specifically, we'll need the ability to tell the
> controller
> > > to stop putting new partitions on the broker that sent the request.
> That
> > > could be done with a separate request or possibly additional flags on
> this
> > > request.  In any case, we don't have to solve that problem now.
> > >
> > > Thanks again for the KIP... great to see this moving forward.
> > >
> > > regards,
> > > Colin
> > >
> > >
> > > On Wed, May 20, 2020, at 12:22, David Arthur wrote:
> > > > Hello, all. I'd like to start the vote for KIP-589 which proposes to
> add
> > > a
> > > > new AlterReplicaState RPC.
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller
> > > >
> > > > Cheers,
> > > > David
> > > >
> > >
> >
> >
> > --
> > -David
> >
>


-- 
-- Guozhang


Re: Streams, Kafka windows

2020-05-28 Thread John Roesler
Setting a new record for procrastination, I've just created this ticket:
https://issues.apache.org/jira/browse/KAFKA-10058
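
For posterity, the window-membership rule discussed below can be checked with a few lines of arithmetic: a record's timestamp lands in every hopping window whose start is a multiple of the advance and whose span covers it. A minimal sketch in plain Java, with the constants taken from the example further down the thread:

```java
long size = 30_000L, advance = 10_000L, ts = 20_000L; // 30s windows advancing by 10s
long lastStart = ts - ts % advance;                   // newest window start containing ts
for (long start = Math.max(0, lastStart - size + advance); start <= lastStart; start += advance) {
    // prints [0,30000) [10000,40000) [20000,50000): three windows, hence three outputs
    System.out.printf("[%d,%d)%n", start, start + size);
}
```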

On Sat, Jan 18, 2020, at 22:03, John Roesler wrote:
> Good idea! I’ll make a note to do it when I’m at a computer. 
> 
> On Sat, Jan 18, 2020, at 21:51, Guozhang Wang wrote:
> > Hey John,
> > 
> > Since this is a common question and I've seen many users asking about
> > window semantics like this, could you file a JIRA ticket for creating a
> > wiki page like Join Semantics (
> > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Join+Semantics)
> > to summarize the windowing operations like this too?
> > 
> > Guozhang
> > 
> > On Sat, Jan 18, 2020 at 3:22 PM John Roesler  wrote:
> > 
> > > Glad it helped!
> > > -John
> > >
> > > On Sat, Jan 18, 2020, at 12:27, Viktor Markvardt wrote:
> > > > Hi John,
> > > >
> > > > Thank you for your assistance!
> > > > Your example helped me a lot, and now I understand kafka-streams more
> > > > clearly.
> > > > Have a nice weekend :)
> > > >
> > > > Best regards,
> > > > Viktor Markvardt
> > > >
> > > > On Thu, 16 Jan 2020 at 19:29, John Roesler wrote:
> > > >
> > > > > Hi Viktor,
> > > > >
> > > > > I’m starting to wonder what exactly “duplicate” means in this context.
> > > Can
> > > > > you elaborate?
> > > > >
> > > > > In case it helps, with your window definition, if I send a record with
> > > > > timestamp 20, it would actually belong to three different windows:
> > > > > [0,30)
> > > > > [10,40)
> > > > > [20,50)
> > > > >
> > > > > Because of this, you would (correctly) see three output records for
> > > that
> > > > > one input, but the outputs wouldn’t be “duplicates” properly, because
> > > > > they’d have different keys:
> > > > >
> > > > > Input:
> > > > > Key1: Val1 @ timestamp:20
> > > > >
> > > > > Output:
> > > > > Windowed: 1
> > > > > Windowed: 1
> > > > > Windowed: 1
> > > > >
> > > > > Any chance that explains your observation?
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jan 16, 2020, at 10:18, Viktor Markvardt wrote:
> > > > > > Hi John,
> > > > > >
> > > > > > Thanks for answering my questions!
> > > > > > I observe behavior which I cannot understand.
> > > > > > The code works, but when the delay between records is larger than the
> > > > > > window duration, I receive duplicated records.
> > > > > > With the code below I received duplicated records in the output KStream.
> > > > > > The count of duplicate records is always 3. If I change the
> > > > > > duration/advanceBy, the count of duplicated records changes too.
> > > > > > Do you have any ideas why duplicated records appear in the output
> > > > > > KStream?
> > > > > >
> > > > > > KStream windowedStream = source
> > > > > > .groupByKey()
> > > > > >
> > > > > >
> > > > >
> > > .windowedBy(TimeWindows.of(Duration.ofSeconds(30)).grace(Duration.ZERO).advanceBy(Duration.ofSeconds(10)))
> > > > > > .count()
> > > > > >
> > > > > >
> > > > >
> > > .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
> > > > > > .toStream();
> > > > > >
> > > > > >
> > > > > > Best regards,
> > > > > > Viktor Markvardt
> > > > > >
> > > > > > On Thu, 16 Jan 2020 at 04:59, John Roesler wrote:
> > > > > >
> > > > > > > Hi Viktor,
> > > > > > >
> > > > > > > I’m not sure why you get two identical outputs in response to a
> > > single
> > > > > > > record. Regardless, since you say that you want to get a single,
> > > final
> > > > > > > result for the window and you expect multiple inputs to the
> > > windows,
> > > > > you
> > > > > > > need Suppression.
> > > > > > >
> > > > > > > My guess is that you just sent one record to try it out and didn’t
> > > see
> > > > > any
> > > > > > > output? This is expected. Just as the window boundaries are
> > > defined by
> > > > > the
> > > > > > > time stamps of the records, not by the current system time,
> > > > > suppression is
> > > > > > > governed by the timestamp of the records. I.e., a thirty-second
> > > window
> > > > > is
> > > > > > > not actually closed until you see a new record with a timestamp
> > > thirty
> > > > > > > seconds later.
> > > > > > >
> > > > > > >  Maybe try just sending a sequence of updates with incrementing
> > > > > > > timestamps. If the first record has timestamp T, then you should
> > > see an
> > > > > > > output when you pass in a record with timestamp T+30.
> > > > > > >
> > > > > > > Important note: there is a built-in grace period that delays the
> > > > > output of
> > > > > > > final results after the window ends. For complicated reasons, the
> > > > > default
> > > > > > > is 24 hours! So you would actually not see an output until you
> > > send a
> > > > > > > record with timestamp T+30+(24 hours) ! I strongly recommend you
> > > set
> > > > > the
> > > > > > > grace period on TimeWindows to zero for your testing. You can
> > > increase
> > > > > it
> > > > > > > later if you want to tolerate some late-arriving 

[jira] [Created] (KAFKA-10058) Create Windowing Operations overview documentation

2020-05-28 Thread John Roesler (Jira)
John Roesler created KAFKA-10058:


 Summary: Create Windowing Operations overview documentation 
 Key: KAFKA-10058
 URL: https://issues.apache.org/jira/browse/KAFKA-10058
 Project: Kafka
  Issue Type: Improvement
  Components: documentation, streams
Reporter: John Roesler


It would improve usability if we added a cross-cutting overview of windowing 
operations and semantics, similar to our join semantics page 
(https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Join+Semantics)

 

See the original mailing list thread for details: 
https://lists.apache.org/thread.html/r36cac1234bfee46e1bfe447c2f1d70ccfe58f3ca23ce9f13f7eb7be3%40%3Cdev.kafka.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-602 - Change default value for client.dns.lookup

2020-05-28 Thread Badai Aqrandista
Ismael/Rajini

I have put some comments in the PR in response to Ismael's.

I have some questions about Ismael's suggestion to not add
"use_first_dns_ip" at all and instead just deprecate "default".

The PR would be much cleaner if we just deprecate "default" as you
suggested. But we will need to update some core code. And I will need to
update the KIP to reflect this.

ClientDnsLookup.DEFAULT is used in few places in core (server):

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L520
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ReplicaFetcherBlockingSend.scala#L92
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L156
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/transaction/TransactionMarkerChannelManager.scala#L82

And a couple of tools:

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala#L482
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/BrokerApiVersionsCommand.scala#L299

And some tests.
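
For anyone following along, opting in to the proposed behaviour requires no code change today, since the value already exists and only the default would change. A minimal client-side sketch (the broker address is a placeholder):

```java
// import java.util.Properties;
// import org.apache.kafka.clients.CommonClientConfigs;
Properties props = new Properties();
props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
// Explicitly selecting what KIP-602 proposes as the new default:
props.put(CommonClientConfigs.CLIENT_DNS_LOOKUP_CONFIG, "use_all_dns_ips");
```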

What do you think?

Thanks
Badai

On Fri, May 22, 2020 at 6:45 PM Badai Aqrandista  wrote:

> Voting thread has been posted.
>
> KIP-602 page has been updated with suggestions from Rajini.
>
> Thanks
> Badai
>
> On Fri, May 22, 2020 at 6:00 AM Ismael Juma  wrote:
>
>> Badai, would you like to start a vote on this KIP?
>>
>> Ismael
>>
>> On Wed, May 20, 2020 at 7:45 AM Rajini Sivaram 
>> wrote:
>>
>> > Deprecating for removal in 3.0 sounds good.
>> >
>> > On Wed, May 20, 2020 at 3:33 PM Ismael Juma  wrote:
>> >
>> > > Is there any reason to use "use_first_dns_ip"? Should we remove it
>> > > completely? Or at least deprecate it for removal in 3.0?
>> > >
>> > > Ismael
>> > >
>> > >
>> > > On Wed, May 20, 2020, 1:39 AM Rajini Sivaram > >
>> > > wrote:
>> > >
>> > > > Hi Badai,
>> > > >
>> > > > Thanks for the KIP, sounds like a useful change. Perhaps we should
>> call
>> > > the
>> > > > new option `use_first_dns_ip` (not `_ips` since it refers to one).
>> We
>> > > > should also mention in the KIP that only one type of address (ipv4
>> or
>> > > ipv6,
>> > > > based on the first one) will be used - that is the current behaviour
>> > for
>> > > > `use_all_dns_ips`.  Since we are changing `default` to be exactly
>> the
>> > > same
>> > > > as `use_all_dns_ips`, it will be good to mention that explicitly
>> under
>> > > > Public Interfaces.
>> > > >
>> > > > Regards,
>> > > >
>> > > > Rajini
>> > > >
>> > > >
>> > > > On Mon, May 18, 2020 at 1:44 AM Badai Aqrandista <
>> ba...@confluent.io>
>> > > > wrote:
>> > > >
>> > > > > Ismael
>> > > > >
>> > > > > What do you think of the PR and the explanation regarding the
>> issue
>> > > > raised
>> > > > > in KIP-235?
>> > > > >
>> > > > > Should I go ahead and build a proper PR?
>> > > > >
>> > > > > Thanks
>> > > > > Badai
>> > > > >
>> > > > > On Mon, May 11, 2020 at 8:53 AM Badai Aqrandista <
>> ba...@confluent.io
>> > >
>> > > > > wrote:
>> > > > >
>> > > > > > Ismael
>> > > > > >
>> > > > > > PR created: https://github.com/apache/kafka/pull/8644/files
>> > > > > >
>> > > > > > Also, as this is my first PR, please let me know if I missed
>> > > anything.
>> > > > > >
>> > > > > > Thanks
>> > > > > > Badai
>> > > > > >
>> > > > > > On Mon, May 11, 2020 at 8:19 AM Badai Aqrandista <
>> > ba...@confluent.io
>> > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > >> Ismael
>> > > > > >>
>> > > > > >> Thank you for responding.
>> > > > > >>
>> > > > > >> KIP-235 modified ClientUtils#parseAndValidateAddresses [1] to
>> > > resolve
>> > > > an
>> > > > > >> address alias (i.e. bootstrap server) into multiple addresses.
>> > This
>> > > is
>> > > > > why
>> > > > > >> it would break SSL hostname verification when the bootstrap
>> server
>> > > is
>> > > > > an IP
>> > > > > >> address, i.e. it will resolve the IP address to an FQDN and use
>> > that
>> > > > > FQDN
>> > > > > >> in the SSL handshake.
>> > > > > >>
>> > > > > >> However, what I am proposing is to modify ClientUtils#resolve
>> [2],
>> > > > which
>> > > > > >> is only used in ClusterConnectionStates#currentAddress [3], to
>> get
>> > > the
>> > > > > >> resolved InetAddress of the address to connect to. And
>> > > > > >> ClusterConnectionStates#currentAddress is only used by
>> > > > > >> NetworkClient#initiateConnect [4] to create InetSocketAddress
>> to
>> > > > > establish
>> > > > > >> the socket connection to the broker.
>> > > > > >>
>> > > > > >> Therefore, as far as I know, this change will not affect higher
>> > > level
>> > > > > >> protocol like SSL or SASL.
>> > > > > >>
>> > > > > >> PR coming after this.
>> > > > > >>
>> > > > > >> Thanks
>> > > > > >> Badai
>> > > > > >>
>> > > > > >> [1]
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/2.5.0/clients/src/main/java/org/apache/kafka/clients/ClientUtils.java#L51
>> > 

Re: [DISCUSS] KIP-616: Rename implicit Serdes instances in kafka-streams-scala

2020-05-28 Thread John Roesler
Hi Yuriy,

Sounds good to me! I had a feeling we were bringing different context
to the discussion; thanks for sticking with the conversation until we got
it hashed out.

I'm glad you prefer Serde*s*, since having multiple different classes with
the same name leads to all kinds of trouble. "Serdes" seems relatively
safe because people in the Scala lib won't be using the Java Serdes class,
and they won't be using the deprecated and non-deprecated one at the
same time.

Thank again,
-John

On Thu, May 28, 2020, at 02:21, Yuriy Badalyantc wrote:
> Ok, I understood you, John. I wasn't sure about kafka deprecation policy
> and thought that the full cycle could be done with 2.7 version. Waiting for
> 3.0 is too much, I agree with it.
> 
> So, I think creating one more `Serdes` in another package is our way. I
> suggest one of the following:
> 1. `org.apache.kafka.streams.scala.serde.Serdes`
> 2. `org.apache.kafka.streams.scala.serialization.Serdes`
> 
> About `Serde` vs `Serdes`. I'm strongly against `Serde` because it would
> lead to a new name clash with the
> `org.apache.kafka.common.serialization.Serde`.
> 
> - Yuriy
> 
> On Thu, May 28, 2020 at 11:12 AM John Roesler  wrote:
> 
> > Hi Yuriy,
> >
> > Thanks for the clarification.
> >
> > I guess my concern is twofold:
> > 1. We typically leave deprecated methods in place for at least a major
> > release cycle before removing them, so it would seem abrupt to have a
> > deprecation period of only one minor release. If we follow the same pattern
> > here, it would take over a year to finish this KIP.
> > 2. It doesn’t seem like there is a nonbreaking deprecation path at all if
> > people enumerate their imports (if they don’t use a wildcard). In that
> > case, they would have no path to implicitly use the newly named serdes, and
> > therefore they would have no way to avoid continuing to use the deprecated
> > ones.
> >
> > Since you mentioned that your reason is mainly the preference for the name
> > “Serde” or “Serdes”, can we explore just using one of those? Would it cause
> > some kind of conflict to use org.apache.kafka.streams.scala.Serde or to use
> > Serdes in a different package, like
> > org.apache.kafka.streams.scala.implicit.Serdes?
> >
> > I empathize with this desire. I faced the same dilemma when I wanted to
> > replace Processor but keep the class name in KIP-478. I wound up creating a
> > new package for the new Processor.
> >
> > Thanks,
> > John
> >
> > On Wed, May 27, 2020, at 22:20, Yuriy Badalyantc wrote:
> > > Hi John,
> > >
> > > I'm stick with the `org.apache.kafka.streams.scala.Serdes` because it's
> > > sort of conventional in the scala community. If you have a typeclass
> > `Foo`,
> > > you probably will search `Foo` related stuff in the `Foo` or maybe `Foos`
> > > (plural). All other places are far less discoverable for the developers.
> > >
> > > I agree that the migration path is a bit complex for such change. But I
> > > think it's more important to provide good developer experience than to
> > > simplify migration. Also, I think it's debatable which migration path is
> > > better for library users. If we would create, for example, `Serdes2`,
> > > library users will have to modify their code if they used any part of the
> > > old `Serde`. With my approach, most of the old code will still work
> > without
> > > changes. Only explicit usage of implicits will need to be fixed (because
> > > names will be changed, and old names will be deprecated). Wildcard
> > imports
> > > will work without changes and will not lead to a name clash. Moreover,
> > many
> > > users may not notice name clash problems. And with my migration path,
> > they
> > > will not notice any changes at all.
> > >
> > > - Yuriy
> > >
> > > On Thu, May 28, 2020 at 7:48 AM John Roesler 
> > wrote:
> > >
> > > > Hi Yuriy,
> > > >
> > > > Thanks for the reply. I guess I've been out of the Scala game for a
> > > > while; all this summoner business is totally new to me.
> > > >
> > > > I think I followed the rationale you provided, but I still don't see
> > > > why you can't implement your whole plan in a new class. What
> > > > is special about the existing Serdes class?
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > > > On Tue, May 19, 2020, at 01:18, Yuriy Badalyantc wrote:
> > > > > Hi John,
> > > > >
> > > > > Your suggestion looks interesting. I think it's technically doable.
> > But
> > > > I'm
> > > > > not sure that this is the better solution. I will try to explain.
> > From
> > > > the
> > > > > scala developers' perspective, `Serde` looks really like a typeclass.
> > > > > Typical typeclass in pure scala will look like this:
> > > > >
> > > > > ```
> > > > > trait Serde[A] {
> > > > >   def serialize(data: A): Array[Byte]
> > > > >   def deserialize(data: Array[Byte]): A
> > > > > }
> > > > > object Serde extends DefaultSerdes {
> > > > >   // "summoner" function. With this I can write `Serde[A]` and this
> > serde
> > > > > will be implicitly summonned.

Jenkins build is back to normal : kafka-trunk-jdk14 #137

2020-05-28 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-9673) Conditionally apply SMTs

2020-05-28 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch resolved KAFKA-9673.
--
Fix Version/s: 2.6.0
 Reviewer: Konstantine Karantasis
   Resolution: Fixed

KIP-585 was approved by the 2.6.0 KIP freeze, and the PR was approved and 
merged to `trunk` before 2.6.0 feature freeze.
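
As merged, the feature uses named predicates rather than the `condition:` syntax sketched in the description below. A rough configuration sketch from memory of KIP-585 (the KIP itself is authoritative):

{noformat}
transforms: extractInt
transforms.extractInt.type: org.apache.kafka.connect.transforms.ExtractField$Key
transforms.extractInt.field: c1
transforms.extractInt.predicate: isMyPrefix
predicates: isMyPrefix
predicates.isMyPrefix.type: org.apache.kafka.connect.transforms.predicates.TopicNameMatches
predicates.isMyPrefix.pattern: my-prefix-.*
{noformat}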

> Conditionally apply SMTs
> 
>
> Key: KAFKA-9673
> URL: https://issues.apache.org/jira/browse/KAFKA-9673
> Project: Kafka
>  Issue Type: New Feature
>  Components: KafkaConnect
>Reporter: Tom Bentley
>Assignee: Tom Bentley
>Priority: Major
> Fix For: 2.6.0
>
>
> KAFKA-7052 ended up using IAE with a message, rather than NPE in the case of 
> a SMT being applied to a record lacking a given field. It's still not 
> possible to apply a SMT conditionally, which is what things like Debezium 
> really need in order to apply transformations only to non-schema change 
> events.
> [~rhauch] suggested a mechanism to conditionally apply any SMT but was 
> concerned about the possibility of a naming collision (assuming it was 
> configured by a simple config)
> I'd like to propose something which would solve this problem without the 
> possibility of such collisions. The idea is to have a higher-level condition, 
> which applies an arbitrary transformation (or transformation chain) according 
> to some predicate on the record. 
> More concretely, it might be configured like this:
> {noformat}
>   transforms.conditionalExtract.type: Conditional
>   transforms.conditionalExtract.transforms: extractInt
>   transforms.conditionalExtract.transforms.extractInt.type: 
> org.apache.kafka.connect.transforms.ExtractField$Key
>   transforms.conditionalExtract.transforms.extractInt.field: c1
>   transforms.conditionalExtract.condition: topic-matches:<pattern>
> {noformat}
> * The {{Conditional}} SMT is configured with its own list of transforms 
> ({{transforms.conditionalExtract.transforms}}) to apply. This would work just 
> like the top level {{transforms}} config, so subkeys can be used to configure 
> these transforms in the usual way.
> The {{condition}} config defines the predicate for when the transforms are 
> applied to a record, using a {{<type>:<value>}} syntax.
> We could initially support three condition types:
> *{{topic-matches:<pattern>}}*: The transformation would be applied if the 
> record's topic name matched the given regular expression pattern. For 
> example, the following would apply the transformation on records being sent 
> to any topic with a name beginning with "my-prefix-":
> {noformat}
>transforms.conditionalExtract.condition: topic-matches:my-prefix-.*
> {noformat}
>
> *{{has-header:<name>}}*: The transformation would be applied if the 
> record had at least one header with the given name. For example, the 
> following will apply the transformation on records with at least one header 
> with the name "my-header":
> {noformat}
>transforms.conditionalExtract.condition: has-header:my-header
> {noformat}
>
> *{{not:<name>}}*: This would negate the result of another named 
> condition using the condition config prefix. For example, the following will 
> apply the transformation on records which lack any header with the name 
> my-header:
> {noformat}
>   transforms.conditionalExtract.condition: not:hasMyHeader
>   transforms.conditionalExtract.condition.hasMyHeader: 
> has-header:my-header
> {noformat}
> I foresee one implementation concern with this approach, which is that 
> currently {{Transformation}} has to return a fixed {{ConfigDef}}, and this 
> proposal would require something more flexible in order to allow the config 
> parameters to depend on the listed transform aliases (and similarly for named 
> predicate used for the {{not:}} predicate). I think this could be done by 
> adding a {{default}} method to {{Transformation}} for getting the ConfigDef 
> given the config, for example.
> Obviously this would require a KIP, but before I spend any more time on this 
> I'd be interested in your thoughts [~rhauch], [~rmoff], [~gunnar.morling].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9971) Error Reporting in Sink Connectors

2020-05-28 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch resolved KAFKA-9971.
--
Fix Version/s: 2.6.0
 Reviewer: Randall Hauch
 Assignee: Aakash Shah
   Resolution: Fixed

Merged to `trunk` for inclusion in the upcoming 2.6.0 release. This was 
approved and merged before feature freeze.
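
For connector authors, the new reporter is obtained from the task context. A minimal sketch of using it inside {{put(...)}} on 2.6+ workers (the {{process}} call stands in for connector-specific logic):

{code:java}
// Assumes org.apache.kafka.connect.sink.{SinkRecord, ErrantRecordReporter}
// and org.apache.kafka.connect.errors.ConnectException are imported.
@Override
public void put(Collection<SinkRecord> records) {
    ErrantRecordReporter reporter = context.errantRecordReporter(); // null on older runtimes
    for (SinkRecord record : records) {
        try {
            process(record); // hypothetical connector-specific handling
        } catch (Exception e) {
            if (reporter != null) {
                reporter.report(record, e); // framework applies the errors.* settings
            } else {
                throw new ConnectException("Failed to process record", e);
            }
        }
    }
}
{code}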

> Error Reporting in Sink Connectors
> --
>
> Key: KAFKA-9971
> URL: https://issues.apache.org/jira/browse/KAFKA-9971
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 2.6.0
>Reporter: Aakash Shah
>Assignee: Aakash Shah
>Priority: Critical
> Fix For: 2.6.0
>
>
> Currently, 
> [KIP-298|https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect]
>  provides error handling in Kafka Connect that includes functionality such as 
> retrying, logging, and sending errant records to a dead letter queue. 
> However, the dead letter queue functionality from KIP-298 only supports error 
> reporting within contexts of the transform operation, and key, value, and 
> header converter operation. Within the context of the {{put(...)}} method in 
> sink connectors, there is no support for dead letter queue/error reporting 
> functionality. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Broker side round robin on topic partitions when receiving messages

2020-05-28 Thread Vinicius Scheidegger
Hi,

I'm trying to understand a little bit more about how Kafka works.
I have a design with multiple producers writing to a single topic and
multiple consumers in a single Consumer Group consuming messages from this
topic.

My idea is to distribute the messages from all producers equally. From
reading the documentation I understood that the partition is always
selected by the producer. Is that correct?

I'd also like to know if there is an out-of-the-box option to assign the
partition via round robin *on the broker side* to guarantee equal
distribution of the load - if possible to each consumer, but if not
possible, at least to each partition.

If my understanding is correct, it looks like in a multiple-producer
scenario there is a lack of support from Kafka for load balancing, and
customers have to either stick to the hash of the key (random distribution,
although it guarantees the same key goes to the same partition) or create
their own logic on the producer side (e.g. by sharing memory).
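
(I did find that since 2.4 the Java client ships a per-producer round-robin
partitioner, though if I read it right it balances each producer independently
rather than across producers. A minimal sketch, broker address as a placeholder:)

```java
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG,
          "org.apache.kafka.clients.producer.RoundRobinPartitioner");
```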

Am I missing something?

Thank you,

Vinicius Scheidegger


Re: [DISCUSS] KIP-405: Kafka Tiered Storage

2020-05-28 Thread Satish Duggana
Hi Jun,
Gentle reminder. Please go through the updated wiki and let us know your
comments.

Thanks,
Satish.

On Tue, May 19, 2020 at 3:50 PM Satish Duggana 
wrote:

> Hi Jun,
> Please go through the wiki which has the latest updates. Google doc is
> updated frequently to be in sync with wiki.
>
> Thanks,
> Satish.
>
> On Tue, May 19, 2020 at 12:30 AM Jun Rao  wrote:
>
>> Hi, Satish,
>>
>> Thanks for the update. Just to clarify. Which doc has the latest updates,
>> the wiki or the google doc?
>>
>> Jun
>>
>> On Thu, May 14, 2020 at 10:38 AM Satish Duggana 
>> wrote:
>>
>>> Hi Jun,
>>> Thanks for your comments.  We updated the KIP with more details.
>>>
>>> >100. For each of the operations related to tiering, it would be useful
>>> to provide a description on how it works with the new API. These include
>>> things like consumer fetch, replica fetch, offsetForTimestamp, retention
>>> (remote and local) by size, time and logStartOffset, topic deletion, etc.
>>> This will tell us if the proposed APIs are sufficient.
>>>
>>> We addressed most of these APIs in the KIP. We can add more details if
>>> needed.
>>>
>>> >101. For the default implementation based on internal topic, is it
>>> meant as a proof of concept or for production usage? I assume that it's the
>>> former. However, if it's the latter, then the KIP needs to describe the
>>> design in more detail.
>>>
>>> It is meant for production usage, as was mentioned in an earlier mail. We
>>> plan to update this section in the next few days.
>>>
>>> >102. When tiering a segment, the segment is first written to the object
>>> store and then its metadata is written to RLMM using the api "void 
>>> putRemoteLogSegmentData()".
>>> One potential issue with this approach is that if the system fails after
>>> the first operation, it leaves a garbage in the object store that's never
>>> reclaimed. One way to improve this is to have two separate APIs, sth like
>>> preparePutRemoteLogSegmentData() and commitPutRemoteLogSegmentData().
>>>
>>> That is a good point. We currently have a different way using markers in
>>> the segment but your suggestion is much better.
>>>
>>> >103. It seems that the transactional support and the ability to read
>>> from follower are missing.
>>>
>>> KIP is updated with transactional support, follower fetch semantics, and
>>> reading from a follower.
>>>
>>> >104. It would be useful to provide a testing plan for this KIP.
>>>
>>> We added a few tests by introducing test util for tiered storage in the
>>> PR. We will provide the testing plan in the next few days.
>>>
>>> Thanks,
>>> Satish.
>>>
>>>
>>> On Wed, Feb 26, 2020 at 9:43 PM Harsha Chintalapani 
>>> wrote:
>>>




 On Tue, Feb 25, 2020 at 12:46 PM, Jun Rao  wrote:

> Hi, Satish,
>
> Thanks for the updated doc. The new API seems to be an improvement
> overall. A few more comments below.
>
> 100. For each of the operations related to tiering, it would be useful
> to provide a description on how it works with the new API. These include
> things like consumer fetch, replica fetch, offsetForTimestamp, retention
> (remote and local) by size, time and logStartOffset, topic deletion,
> etc. This will tell us if the proposed APIs are sufficient.
>

 Thanks for the feedback Jun. We will add more details around this.


> 101. For the default implementation based on internal topic, is it
> meant as a proof of concept or for production usage? I assume that it's 
> the
> former. However, if it's the latter, then the KIP needs to describe the
> design in more detail.
>

 Yes, it is meant for production use.  Ideally it would be good to
 merge this in as the default implementation for the metadata service. We can
 add more details around design and testing.

 102. When tiering a segment, the segment is first written to the object
> store and then its metadata is written to RLMM using the api "void
> putRemoteLogSegmentData()".
> One potential issue with this approach is that if the system fails
> after the first operation, it leaves a garbage in the object store that's
> never reclaimed. One way to improve this is to have two separate APIs, sth
> like preparePutRemoteLogSegmentData() and commitPutRemoteLogSegmentData().
>
> 103. It seems that the transactional support and the ability to read
> from follower are missing.
>
> 104. It would be useful to provide a testing plan for this KIP.
>

 We are working on adding more details around transactional support and
 coming up with test plan.
 Add system tests and integration tests.

 Thanks,
>
> Jun
>
> On Mon, Feb 24, 2020 at 8:10 AM Satish Duggana <
> satish.dugg...@gmail.com> wrote:
>
> Hi Jun,
> Please look at the earlier reply and let us know your comments.
>
> Thanks,
> Satish.
>
> On Wed, Feb 12, 2020 at 

[jira] [Created] (KAFKA-10057) optimize class ConfigCommand method alterConfig parameters

2020-05-28 Thread qianrui (Jira)
qianrui created KAFKA-10057:
---

 Summary: optimize class ConfigCommand method  alterConfig 
parameters
 Key: KAFKA-10057
 URL: https://issues.apache.org/jira/browse/KAFKA-10057
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 1.1.1
Reporter: qianrui


The first parameter (zkClient) of the alterConfig method in class ConfigCommand 
is not used; it should be deleted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Jenkins build is back to normal : kafka-2.5-jdk8 #135

2020-05-28 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk14 #136

2020-05-28 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-9960: implement KIP-606 to add metadata context to 
MetricsReporter

[github] KAFKA-9146: Add option to force delete active members in 
StreamsResetter


--
[...truncated 4.94 MB...]
kafka.api.SaslSslConsumerTest > testSimpleConsumption STARTED

kafka.api.UserQuotaTest > testProducerConsumerOverrideLowerQuota PASSED

kafka.api.UserQuotaTest > testProducerConsumerOverrideUnthrottled STARTED

kafka.api.SaslSslConsumerTest > testSimpleConsumption PASSED

kafka.api.TransactionsExpirationTest > 
testBumpTransactionalEpochAfterInvalidProducerIdMapping STARTED

kafka.api.UserQuotaTest > testProducerConsumerOverrideUnthrottled PASSED

kafka.api.UserQuotaTest > testThrottledProducerConsumer STARTED

kafka.api.TransactionsExpirationTest > 
testBumpTransactionalEpochAfterInvalidProducerIdMapping PASSED

kafka.api.TransactionsBounceTest > testWithGroupMetadata STARTED

kafka.api.TransactionsBounceTest > testWithGroupMetadata PASSED

kafka.api.TransactionsBounceTest > testWithGroupId STARTED

kafka.api.UserQuotaTest > testThrottledProducerConsumer PASSED

kafka.api.UserQuotaTest > testQuotaOverrideDelete STARTED

kafka.api.TransactionsBounceTest > testWithGroupId PASSED

kafka.api.ProducerFailureHandlingTest > testCannotSendToInternalTopic STARTED

kafka.api.ProducerFailureHandlingTest > testCannotSendToInternalTopic PASSED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne STARTED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne PASSED

kafka.api.ProducerFailureHandlingTest > testWrongBrokerList STARTED

kafka.api.ProducerFailureHandlingTest > testWrongBrokerList PASSED

kafka.api.ProducerFailureHandlingTest > testNotEnoughReplicas STARTED

kafka.api.ProducerFailureHandlingTest > testNotEnoughReplicas PASSED

kafka.api.ProducerFailureHandlingTest > 
testResponseTooLargeForReplicationWithAckAll STARTED

kafka.api.ProducerFailureHandlingTest > 
testResponseTooLargeForReplicationWithAckAll PASSED

kafka.api.ProducerFailureHandlingTest > testNonExistentTopic STARTED

kafka.api.UserQuotaTest > testQuotaOverrideDelete PASSED

kafka.api.UserQuotaTest > testThrottledRequest STARTED

kafka.api.ProducerFailureHandlingTest > testNonExistentTopic PASSED

kafka.api.ProducerFailureHandlingTest > testInvalidPartition STARTED

kafka.api.ProducerFailureHandlingTest > testInvalidPartition PASSED

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed STARTED

kafka.api.UserQuotaTest > testThrottledRequest PASSED

kafka.api.ConsumerBounceTest > testCloseDuringRebalance STARTED

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed PASSED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero STARTED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero PASSED

kafka.api.ProducerFailureHandlingTest > 
testPartitionTooLargeForReplicationWithAckAll STARTED

kafka.api.ProducerFailureHandlingTest > 
testPartitionTooLargeForReplicationWithAckAll PASSED

kafka.api.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown STARTED

kafka.api.ConsumerBounceTest > testCloseDuringRebalance PASSED

kafka.api.ConsumerBounceTest > testClose STARTED

kafka.api.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown PASSED

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe STARTED

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe PASSED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeWithPrefixedAcls 
STARTED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeWithPrefixedAcls 
PASSED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaAssign STARTED

kafka.api.ConsumerBounceTest > testClose PASSED

kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures STARTED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaAssign PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoConsumeWithDescribeAclViaAssign 
STARTED
[2146.082s][warning][os,thread] Failed to start thread - pthread_create failed 
(EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
[2146.235s][warning][os,thread] Failed to start thread - pthread_create failed 
(EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
[2146.239s][warning][os,thread] Failed to start thread - pthread_create failed 
(EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
ERROR: No tool found matching GRADLE_4_10_2_HOME
ERROR: Could not install GRADLE_4_10_3_HOME
java.lang.NullPointerException
at 
hudson.plugins.toolenv.ToolEnvBuildWrapper$1.buildEnvVars(ToolEnvBuildWrapper.java:46)
at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:873)
at hudson.plugins.git.GitSCM.getParamExpandedRepos(GitSCM.java:484)
at 

[jira] [Created] (KAFKA-10056) Consumer metadata may use outdated groupSubscription that doesn't contain newly subscribed topics

2020-05-28 Thread Rajini Sivaram (Jira)
Rajini Sivaram created KAFKA-10056:
--

 Summary: Consumer metadata may use outdated groupSubscription that 
doesn't contain newly subscribed topics
 Key: KAFKA-10056
 URL: https://issues.apache.org/jira/browse/KAFKA-10056
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 2.5.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 2.6.0, 2.5.1


From [~hai_lin] in KAFKA-9181:

I did notice an issue after this patch; here is what I observe.

Consumer metadata might skip the first metadata update, because groupSubscription 
is not reset. In my case, the consumer coordinator thread hijacks the update by 
calling newMetadataRequestAndVersion with an outdated groupSubscription before 
joinPrepare() happens. The groupSubscription will get reset later and will 
eventually get updated, and this won't be an issue for the initial consumer 
subscribe (since the groupSubscription is empty anyway), but it might happen on a 
subsequent subscribe when groupSubscription is not empty. This creates a 
discrepancy between subscription and groupSubscription; if any new metadata 
request happens in between, metadataTopics will return outdated group 
information. 

 
h4. The happy path
 * Consumer calls subscribe -> {{needUpdated}} is set, {{requestVersion}} is 
bumped, and {{subscription}} is updated in {{SubscriptionState}} -> 
{{joinPrepare()}} is called in the first {{poll()}} to reset 
{{groupSubscription}} -> the next time a metadata update is requested, 
{{metadataTopics()}} returns {{subscription}} since {{groupSubscription}} is 
empty -> an update call is issued to the broker to fetch partition information 
for the new topic

h4. In our case
 * Consumer calls subscribe -> {{needUpdated}} is set, {{requestVersion}} is 
bumped, and {{subscription}} (but not {{groupSubscription}}) is updated in 
{{SubscriptionState}} -> the consumer coordinator's heartbeat thread issues a 
metadata request, and {{SubscriptionState}} gives away the current 
{{requestVersion}} and the outdated {{groupSubscription}} -> a metadata update 
request goes out with the outdated subscription -> the response comes back to 
the client, and since its {{requestVersion}} is up to date, it resets the 
{{needUpdated}} flag -> {{joinPrepare()}} is called and resets 
{{groupSubscription}} -> no new metadata update request follows because 
{{needUpdated}} was already reset -> the next metadata request only happens 
when {{metadata.max.age.ms}} is reached (see the sketch below).
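
A minimal, hypothetical sketch of this race (the names only mirror the 
description above; the real {{SubscriptionState}} and consumer metadata code 
are more involved):

{code:scala}
// Simplified stand-in for the consumer internals described above -- not the
// actual Kafka code. It shows why one stale snapshot suppresses the follow-up
// metadata request.
object MetadataRaceSketch extends App {
  final class SubscriptionState {
    var requestVersion = 0
    var needUpdated = false
    var subscription: Set[String] = Set.empty
    var groupSubscription: Set[String] = Set.empty

    def subscribe(topics: Set[String]): Unit = {
      subscription = topics
      requestVersion += 1 // bump so stale responses can be told apart
      needUpdated = true
    }

    // What the heartbeat thread snapshots: the current version together with
    // the (possibly outdated) group subscription.
    def metadataTopicsAndVersion(): (Set[String], Int) =
      (if (groupSubscription.isEmpty) subscription else groupSubscription,
       requestVersion)

    // The flag is only cleared when the response version still matches.
    def handleResponse(version: Int): Unit =
      if (version == requestVersion) needUpdated = false
  }

  val state = new SubscriptionState
  state.groupSubscription = Set("old-topic") // left over from last generation

  state.subscribe(Set("old-topic", "new-topic"))
  // The heartbeat thread races in before joinPrepare() resets groupSubscription:
  val (topics, version) = state.metadataTopicsAndVersion()
  println(s"metadata requested for $topics") // misses new-topic
  state.handleResponse(version)              // version matches, flag is cleared
  state.groupSubscription = Set.empty        // joinPrepare() runs too late
  println(s"needUpdated = ${state.needUpdated}") // false: no follow-up request
}
{code}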



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: kafka-trunk-jdk11 #1508

2020-05-28 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10050: kafka_log4j_appender.py fixed for JDK11 (#8731)

[github] KAFKA-9802; Increase transaction timeout in system tests to reduce

[github] MINOR: Slight MetadataCache tweaks to avoid unnecessary work (#8728)

[github] KAFKA-10052: Harden assertion of topic settings in Connect integration

[github] KAFKA-9971: Error Reporting in Sink Connectors (KIP-610) (#8720)


--
[...truncated 3.14 MB...]

org.apache.kafka.streams.TestTopicsTest > testTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders STARTED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValue STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValue PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullHeaders PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithDefaultTimestamp PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecord STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecord PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecord STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecord PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicName STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordWithOtherTopicName PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > shouldAdvanceTime 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > shouldAdvanceTime 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairs STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithKeyValuePairsAndCustomTimestamps
 STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 

Jenkins build is back to normal : kafka-trunk-jdk8 #4580

2020-05-28 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-616: Rename implicit Serdes instances in kafka-streams-scala

2020-05-28 Thread Yuriy Badalyantc
Ok, I understood you, John. I wasn't sure about Kafka's deprecation policy
and thought that the full cycle could be done by the 2.7 release. Waiting for
3.0 is too long; I agree with that.

So, I think creating one more `Serdes` in another package is the way to go. I
suggest one of the following:
1. `org.apache.kafka.streams.scala.serde.Serdes`
2. `org.apache.kafka.streams.scala.serialization.Serdes`

About `Serde` vs `Serdes`: I'm strongly against `Serde` because it would
lead to a new name clash with
`org.apache.kafka.common.serialization.Serde`.
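
For illustration, a minimal sketch of that clash (the second import is
hypothetical; only the Java interface exists under that name today):

```
import org.apache.kafka.common.serialization.Serde // existing Java interface
// import org.apache.kafka.streams.scala.Serde    // hypothetical renamed object

// With both imports active, every bare reference to `Serde` would be ambiguous
// and need an alias, e.g.:
// import org.apache.kafka.streams.scala.{Serde => ScalaSerde}
object ClashExample {
  def describe[A](serde: Serde[A]): String = serde.getClass.getSimpleName
}
```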

- Yuriy

On Thu, May 28, 2020 at 11:12 AM John Roesler  wrote:

> Hi Yuriy,
>
> Thanks for the clarification.
>
> I guess my concern is twofold:
> 1. We typically leave deprecated methods in place for at least a major
> release cycle before removing them, so it would seem abrupt to have a
> deprecation period of only one minor release. If we follow the same pattern
> here, it would take over a year to finish this KIP.
> 2. It doesn’t seem like there is a nonbreaking deprecation path at all if
> people enumerate their imports (if they don’t use a wildcard). In that
> case, they would have no path to implicitly use the newly named serdes, and
> therefore they would have no way to avoid continuing to use the deprecated
> ones.
>
> Since you mentioned that your reason is mainly the preference for the name
> “Serde” or “Serdes”, can we explore just using one of those? Would it cause
> some kind of conflict to use org.apache.kafka.streams.scala.Serde or to use
> Serdes in a different package, like
> org.apache.kafka.streams.scala.implicit.Serdes?
>
> I empathize with this desire. I faced the same dilemma when I wanted to
> replace Processor but keep the class name in KIP-478. I wound up creating a
> new package for the new Processor.
>
> Thanks,
> John
>
> On Wed, May 27, 2020, at 22:20, Yuriy Badalyantc wrote:
> > Hi John,
> >
> > I'm sticking with `org.apache.kafka.streams.scala.Serdes` because it's
> > sort of conventional in the Scala community. If you have a typeclass
> > `Foo`, you will probably look for `Foo`-related stuff in `Foo` or maybe
> > `Foos` (plural). All other places are far less discoverable for
> > developers.
> >
> > I agree that the migration path is a bit complex for such a change. But I
> > think it's more important to provide a good developer experience than to
> > simplify migration. Also, I think it's debatable which migration path is
> > better for library users. If we created, for example, `Serdes2`,
> > library users would have to modify their code if they used any part of
> > the old `Serdes`. With my approach, most of the old code will still work
> > without changes. Only explicit usage of the implicits will need to be
> > fixed (because the names will change, and the old names will be
> > deprecated). Wildcard imports will work without changes and will not
> > lead to a name clash. Moreover, many users may not notice the name clash
> > problems, and with my migration path they will not notice any changes
> > at all.
> >
> > - Yuriy
> >
> > On Thu, May 28, 2020 at 7:48 AM John Roesler 
> wrote:
> >
> > > Hi Yuriy,
> > >
> > > Thanks for the reply. I guess I've been out of the Scala game for a
> > > while; all this summoner business is totally new to me.
> > >
> > > I think I followed the rationale you provided, but I still don't see
> > > why you can't implement your whole plan in a new class. What
> > > is special about the existing Serdes class?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Tue, May 19, 2020, at 01:18, Yuriy Badalyantc wrote:
> > > > Hi John,
> > > >
> > > > Your suggestion looks interesting. I think it's technically doable,
> > > > but I'm not sure it's the best solution. I will try to explain. From
> > > > the Scala developers' perspective, `Serde` really looks like a
> > > > typeclass. A typical typeclass in pure Scala looks like this:
> > > >
> > > > ```
> > > > trait Serde[A] {
> > > >   def serialize(data: A): Array[Byte]
> > > >   def deserialize(data: Array[Byte]): A
> > > > }
> > > > object Serde extends DefaultSerdes {
> > > >   // "summoner" function. With this I can write `Serde[A]` and this
> serde
> > > > will be implicitly summonned.
> > > >   def apply[A](implicit ev: Serde[A]): Serde[A] = ev
> > > > }
> > > >
> > > > trait DefaultSerdes {
> > > >   // default instances here
> > > > }
> > > > ```
> > > >
> > > > Usage example (note that there are no wildcard imports here):
> > > > ```
> > > > object Main extends App {
> > > >   import Serde // not wildcard import
> > > >
> > > >   // explicit summoning:
> > > >   val stringSerde = Serde[String] // using summoner
> > > >   stringSerde.serialize(???)
> > > >
> > > >   // implicit summoning
> > > >   def serialize[A: Serde](a: A) = {
> > > >     Serde[A].serialize(a) // summoner again
> > > >   }
> > > >   serialize("foo")
> > > > }
> > > > ```
> > > >
> > > > Examples are pretty silly, but I just want to show common patterns of
> 

Jenkins build is back to normal : kafka-trunk-jdk11 #1507

2020-05-28 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-10052) Flaky Test InternalTopicsIntegrationTest.testCreateInternalTopicsWithFewerReplicasThanBrokers

2020-05-28 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-10052.

Resolution: Fixed

This should have been improved by now, but please reopen if you notice failures 
happening again. 
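
For context, a hedged sketch of the kind of hardening that avoids such NPEs in 
integration tests: poll until the probed value is actually available instead of 
asserting on a possibly-null intermediate result (the names are illustrative, 
not the Connect test API):

{code:scala}
import java.time.Duration

object WaitUntil {
  // Poll a probe until it yields a value or the timeout expires. Asserting
  // before the embedded cluster is ready is the classic source of the NPE in
  // the stack trace below.
  def waitFor[T](timeout: Duration, pollMs: Long = 250L)(probe: () => Option[T]): T = {
    val deadline = System.nanoTime() + timeout.toNanos
    var result = probe()
    while (result.isEmpty && System.nanoTime() < deadline) {
      Thread.sleep(pollMs)
      result = probe()
    }
    result.getOrElse(throw new AssertionError(s"condition not met within $timeout"))
  }
}
{code}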

> Flaky Test 
> InternalTopicsIntegrationTest.testCreateInternalTopicsWithFewerReplicasThanBrokers
> -
>
> Key: KAFKA-10052
> URL: https://issues.apache.org/jira/browse/KAFKA-10052
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Sophie Blee-Goldman
>Assignee: Konstantine Karantasis
>Priority: Critical
>  Labels: flaky-test, integration-test
> Fix For: 2.6.0
>
>
> h3. Stacktrace
>  
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.kafka.connect.util.clusters.EmbeddedConnectClusterAssertions.assertTopicSettings(EmbeddedConnectClusterAssertions.java:207)
>  
>   at 
> org.apache.kafka.connect.integration.InternalTopicsIntegrationTest.assertInternalTopicSettings(InternalTopicsIntegrationTest.java:148)
>  
>   at
> org.apache.kafka.connect.integration.InternalTopicsIntegrationTest.testCreateInternalTopicsWithFewerReplicasThanBrokers(InternalTopicsIntegrationTest.java:118){code}
>  
>  
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/6539/testReport/junit/org.apache.kafka.connect.integration/InternalTopicsIntegrationTest/testCreateInternalTopicsWithFewerReplicasThanBrokers/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)