Jenkins build is back to normal : kafka-trunk-jdk11 #690

2019-07-11 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-482: The Kafka Protocol should Support Optional Fields

2019-07-11 Thread Colin McCabe
On Tue, Jul 9, 2019, at 15:29, Jose Armando Garcia Sancio wrote:
> Thanks Colin for the KIP. For my own edification why are we doing this
> "Optional fields can have any type, except for an array of structures."?
> Why can't we have an array of structures?

Optional fields are serialized starting with their total length.  This is 
straightforward to calculate for primitive fields like INT32 (or even an array 
of INT32), but more difficult to calculate for an array of structures.  
Basically, we'd have to do a two-pass serialization where we first calculate 
the lengths of everything, and then write it out.
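
To make the asymmetry concrete, here is a minimal sketch of the idea (the helper 
names and MyStruct type are hypothetical, not the actual Kafka serialization code):

    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.util.List;

    class OptionalFieldSketch {
        // An optional INT32 can be written in a single pass, because its
        // payload length is known up front.
        static void writeOptionalInt32(DataOutputStream out, int tag, int value)
                throws IOException {
            out.writeShort(tag);  // field tag
            out.writeInt(4);      // total length: an INT32 is always 4 bytes
            out.writeInt(value);  // payload
        }

        interface MyStruct {  // hypothetical nested structure
            int serializedSize();
            void writeTo(DataOutputStream out) throws IOException;
        }

        // An array of structures needs a sizing pass before the length prefix
        // can be written -- the two-pass serialization mentioned above.
        static void writeOptionalStructArray(DataOutputStream out, int tag,
                                             List<MyStruct> items) throws IOException {
            int totalLength = 0;
            for (MyStruct s : items)
                totalLength += s.serializedSize();  // pass 1: compute sizes
            out.writeShort(tag);
            out.writeInt(totalLength);
            for (MyStruct s : items)
                s.writeTo(out);                     // pass 2: write payload
        }
    }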

The nice thing about this KIP is that there's nothing in the protocol stopping 
us from adding support for this feature in the future.  We wouldn't have to 
really change the protocol at all to add support.  But we'd have to change a 
lot of serialization code.  Given that almost all of our use-cases for optional 
fields are adding an extra field here or there, it seems reasonable not to 
support it right now.

best,
Colin

> 
> -- 
> -Jose
>


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-07-11 Thread Jun Rao
Hi, Justine,

Thanks for the KIP. Nice writeup and great results. Just one comment.

100. To add a record to the accumulator, the producer needs to know the
partition id. The decision of whether the record can be added to the
current batch is only made after the accumulator.append() call. So, when a
batch is full, it seems that the KIP will try to append the next record to
the same partition, which will trigger the creation of a new batch with a
single record. After that, new records will be routed to a new partition.
If the producer doesn't come back to the first partition in time, the
producer will send a single record batch. In the worst case, it can be that
every other batch has only a single record. Is this correct? If so, could
we avoid that?
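
To make the sequence concrete, here is a toy model of the interaction described
above (simplified stand-ins only; the real producer's RecordAccumulator is far
more involved):

    // Toy model of the append sequence described above, for two partitions.
    class StickySequenceSketch {
        static final int BATCH_CAPACITY = 3;          // records per batch (toy value)
        final int[] recordsInOpenBatch = new int[2];  // one open batch per partition
        int stickyPartition = 0;

        void send(int numRecords) {
            for (int i = 0; i < numRecords; i++) {
                int p = stickyPartition;            // partition chosen before append
                if (recordsInOpenBatch[p] < BATCH_CAPACITY) {
                    recordsInOpenBatch[p]++;        // fits in the current batch
                } else {
                    recordsInOpenBatch[p] = 1;      // full: a new batch opens with ONE record
                    stickyPartition = (p + 1) % 2;  // only now does the partition switch,
                                                    // possibly stranding a 1-record batch on p
                }
            }
        }
    }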

Jun

On Thu, Jul 11, 2019 at 5:23 PM Colin McCabe  wrote:

> Hi Justine,
>
> I agree that we shouldn't change RoundRobinPartitioner, since its behavior
> is already specified.
>
> However, we could add a new, separate StickyRoundRobinPartitioner class to
> KIP-480 which just implemented the sticky behavior regardless of whether
> the key was null.  That seems pretty easy to add (and it wouldn't have to be
> implemented right away in the first PR, of course.)  It would be an option
> for people who wanted to configure this behavior.
>
> best,
> Colin
>
>
> On Wed, Jul 10, 2019, at 08:48, Justine Olshan wrote:
> > Hi M,
> >
> > I'm a little confused by what you mean by extending the behavior on to
> the
> > RoundRobinPartitioner.
> > The sticky partitioner plans to remove the round-robin behavior from
> > records with no keys. Instead of sending them to each partition in order,
> > it sends them all to the same partition until the batch is sent.
> > I don't think you can have both round-robin and sticky partition
> behavior.
> >
> > Thank you,
> > Justine Olshan
> >
> > On Wed, Jul 10, 2019 at 1:54 AM M. Manna  wrote:
> >
> > > Thanks for the comments Colin.
> > >
> > > My only concern is that this KIP is addressing a good feature and
> having
> > > that extended to RoundRobinPartitioner means 1 less KIP in the future.
> > >
> > > Would it be appropriate to extend the support to RoundRobinPartitioner
> too?
> > >
> > > Thanks,
> > >
> > > On Tue, 9 Jul 2019 at 17:24, Colin McCabe  wrote:
> > >
> > > > Hi M,
> > > >
> > > > The RoundRobinPartitioner added by KIP-369 doesn't interact with this
> > > > KIP.  If you configure your producer to use RoundRobinPartitioner,
> then
> > > the
> > > > DefaultPartitioner will not be used.  And the "sticky" behavior is
> > > > implemented only in the DefaultPartitioner.
> > > >
> > > > regards,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Jul 9, 2019, at 05:12, M. Manna wrote:
> > > > > Hello Justine,
> > > > >
> > > > > I have one item I wanted to discuss.
> > > > >
> > > > > We are currently in review stage for KAFKA- where we can choose
> > > > always
> > > > > RoundRobin regardless of null/usable key.
> > > > >
> > > > > If I understood this KIP motivation correctly, you are still
> honouring
> > > > how
> > > > > the hashing of key works for DefaultPartitioner. Would you say that
> > > > having
> > > > > an always "Round-Robin" partitioning with "Sticky" assignment
> > > (efficient
> > > > > batching of records for a partition) doesn't deviate from your
> original
> > > > > intention?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > On Tue, 9 Jul 2019 at 01:00, Justine Olshan 
> > > > wrote:
> > > > >
> > > > > > Hello all,
> > > > > >
> > > > > > If there are no more comments or concerns, I would like to start
> the
> > > > vote
> > > > > > on this tomorrow afternoon.
> > > > > >
> > > > > > However, if there are still topics to discuss, feel free to bring
> > > them
> > > > up
> > > > > > now.
> > > > > >
> > > > > > Thank you,
> > > > > > Justine
> > > > > >
> > > > > > On Tue, Jul 2, 2019 at 4:25 PM Justine Olshan <
> jols...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > Hello again,
> > > > > > >
> > > > > > > Another update to the interface has been made to the KIP.
> > > > > > > Please let me know if you have any feedback!
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Justine
> > > > > > >
> > > > > > > On Fri, Jun 28, 2019 at 2:52 PM Justine Olshan <
> > > jols...@confluent.io
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi all,
> > > > > > >> I made some changes to the KIP.
> > > > > > >> The idea is to clean up the code, make behavior more explicit,
> > > > provide
> > > > > > >> more flexibility, and to keep default behavior the same.
> > > > > > >>
> > > > > > >> Now we will change the partition in onNewBatch, and specify
> the
> > > > > > >> conditions for this function call (non-keyed values, no
> explicit
> > > > > > >> partitions) in willCallOnNewBatch.
> > > > > > >> This clears up some of the issues with the implementation. I'm
> > > > happy to
> > > > > > >> hear further opinions and discuss this change!
> > > > > > >>
> > > > > > >> Thank you,
> > > > > > >> Justine

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-07-11 Thread Colin McCabe
Hi Justine,

Thanks for the KIP.  This seems like a good step towards removing server-side 
topic auto-creation.

We should include "client-side" in the title of the KIP somewhere, to make 
it clear that we're talking about client-side auto-creation.

The KIP says:
> In order to automatically create topics with the producer, the producer's 
> auto.create.topics.enable config must be set to true and the broker config 
> should be set to false

From a user's point of view, this seems counter-intuitive.  In order to 
auto-create topics, the broker's auto.create.topics.enable config should be set 
to false?  It seems like the server-side auto-create is unrelated to the 
client-side auto-create.  We could have both turned on (and I'm sure that in 
the real world, people will try this configuration...)  There's no reason not 
to support this, I think.

We should add some documentation explaining the difference between server-side 
and client-side auto-creation.  Without documentation, an admin might think 
that they had disabled all forms of auto-creation by setting the server-side 
setting to false -- but this is not the case, of course.
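
For illustration, the two flags could then be set independently of each other,
along these lines (the producer-side property name is the one proposed in the
KIP draft and may still change):

    import java.util.Properties;

    class Kip487ConfigSketch {
        public static void main(String[] args) {
            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "localhost:9092");
            // Proposed client-side flag (KIP-487 draft name):
            producerProps.put("auto.create.topics.enable", "true");

            // Meanwhile the broker's server.properties may independently contain:
            //   auto.create.topics.enable=false   (the existing server-side flag)
            System.out.println(producerProps);
        }
    }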

best,
Colin


On Thu, Jul 11, 2019, at 16:22, Justine Olshan wrote:
> Hi Dhruvil,
> 
> Thanks for reading the KIP!
> That was the general idea for deprecation. We would log a warning when the
> config is enabled on the broker.
> I also believe that there would be a change to documentation.
> If there is anything else that should be done, please let me know!
> 
> Justine
> 
> On Thu, Jul 11, 2019 at 4:17 PM Dhruvil Shah  wrote:
> 
> > Hi Justine,
> >
> > Thanks for the KIP, this is great!
> >
> > Could you add some more information about what deprecating the broker
> > configuration means? Would we log a warning in the logs when auto topic
> > creation is enabled on the broker, for example?
> >
> > Thanks,
> > Dhruvil
> >
> > On Thu, Jul 11, 2019 at 10:28 AM Justine Olshan 
> > wrote:
> >
> > > Hello all,
> > >
> > > I'd like to start a discussion thread for KIP-487.
> > > This KIP plans to deprecate the current system of auto-creating topics
> > > through requests to the metadata and give the producer the ability to
> > > automatically create topics instead.
> > >
> > > More information can be found here:
> > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-487%3A+Automatic+Topic+Creation+on+Producer
> > >
> > > Thank you,
> > > Justine Olshan
> > >
> >
>


Re: [VOTE] KIP-480 : Sticky Partitioner

2019-07-11 Thread Colin McCabe
+1 (binding).  Thanks, Justine!

ComputedPartition#get probably should be ComputedPartition#partition or 
something.  We typically name accessors the same as the variables that are 
being accessed.
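
For illustration, the convention looks like this (sketch only):

    // Illustrative: accessors are named after the field, without a "get" prefix.
    class ComputedPartition {
        private final int partition;

        ComputedPartition(int partition) {
            this.partition = partition;
        }

        int partition() {  // rather than get()
            return partition;
        }
    }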

As we discussed in the other thread, one minor addition that might make this 
KIP even better is a StickyRoundRobinPartitioner class that just implements the 
sticky behavior regardless of whether the key is null or not.  It would just be 
a standalone custom partitioner class that could be configured if people wanted 
this.

best,
Colin


On Tue, Jul 9, 2019, at 17:15, Justine Olshan wrote:
> Hello all,
> 
> I'd like to start the vote for KIP-480 : Sticky Partitioner.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> 
> Thank you,
> Justine Olshan
>


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-07-11 Thread Colin McCabe
Hi Justine,

I agree that we shouldn't change RoundRobinPartitioner, since its behavior is 
already specified.

However, we could add a new, separate StickyRoundRobinPartitioner class to 
KIP-480 which just implemented the sticky behavior regardless of whether the 
key was null.  That seems pretty easy to add (and it wouldn't have to be 
implemented right away in the first PR, of course.)  It would be an option for 
people who wanted to configure this behavior.
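
A rough sketch of what such a class could look like, assuming the onNewBatch()
hook that KIP-480 proposes (illustrative only, not committed code):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ThreadLocalRandom;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;

    // Stick to one partition per topic until a batch completes,
    // ignoring the key entirely.
    public class StickyRoundRobinPartitioner implements Partitioner {
        private final Map<String, Integer> current = new ConcurrentHashMap<>();

        @Override
        public void configure(Map<String, ?> configs) {}

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            return current.computeIfAbsent(topic, t ->
                ThreadLocalRandom.current().nextInt(cluster.partitionsForTopic(t).size()));
        }

        // Hook proposed by KIP-480: pick a new sticky partition when a batch fills.
        public void onNewBatch(String topic, Cluster cluster, int prevPartition) {
            current.put(topic, ThreadLocalRandom.current().nextInt(
                cluster.partitionsForTopic(topic).size()));
        }

        @Override
        public void close() {}
    }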

best,
Colin


On Wed, Jul 10, 2019, at 08:48, Justine Olshan wrote:
> Hi M,
> 
> I'm a little confused by what you mean by extending the behavior on to the
> RoundRobinPartitioner.
> The sticky partitioner plans to remove the round-robin behavior from
> records with no keys. Instead of sending them to each partition in order,
> it sends them all to the same partition until the batch is sent.
> I don't think you can have both round-robin and sticky partition behavior.
> 
> Thank you,
> Justine Olshan
> 
> On Wed, Jul 10, 2019 at 1:54 AM M. Manna  wrote:
> 
> > Thanks for the comments Colin.
> >
> > My only concern is that this KIP is addressing a good feature and having
> > that extended to RoundRobinPartitioner means 1 less KIP in the future.
> >
> > Would it be appropriate to extend the support to RoundRobinPartitioner too?
> >
> > Thanks,
> >
> > On Tue, 9 Jul 2019 at 17:24, Colin McCabe  wrote:
> >
> > > Hi M,
> > >
> > > The RoundRobinPartitioner added by KIP-369 doesn't interact with this
> > > KIP.  If you configure your producer to use RoundRobinPartitioner, then
> > the
> > > DefaultPartitioner will not be used.  And the "sticky" behavior is
> > > implemented only in the DefaultPartitioner.
> > >
> > > regards,
> > > Colin
> > >
> > >
> > > On Tue, Jul 9, 2019, at 05:12, M. Manna wrote:
> > > > Hello Justine,
> > > >
> > > > I have one item I wanted to discuss.
> > > >
> > > > We are currently in review stage for KAFKA- where we can choose
> > > always
> > > > RoundRobin regardless of null/usable key.
> > > >
> > > > If I understood this KIP motivation correctly, you are still honouring
> > > how
> > > > the hashing of key works for DefaultPartitioner. Would you say that
> > > having
> > > > an always "Round-Robin" partitioning with "Sticky" assignment
> > (efficient
> > > > batching of records for a partition) doesn't deviate from your original
> > > > intention?
> > > >
> > > > Thanks,
> > > >
> > > > On Tue, 9 Jul 2019 at 01:00, Justine Olshan 
> > > wrote:
> > > >
> > > > > Hello all,
> > > > >
> > > > > If there are no more comments or concerns, I would like to start the
> > > vote
> > > > > on this tomorrow afternoon.
> > > > >
> > > > > However, if there are still topics to discuss, feel free to bring
> > them
> > > up
> > > > > now.
> > > > >
> > > > > Thank you,
> > > > > Justine
> > > > >
> > > > > On Tue, Jul 2, 2019 at 4:25 PM Justine Olshan 
> > > > > wrote:
> > > > >
> > > > > > Hello again,
> > > > > >
> > > > > > Another update to the interface has been made to the KIP.
> > > > > > Please let me know if you have any feedback!
> > > > > >
> > > > > > Thank you,
> > > > > > Justine
> > > > > >
> > > > > > On Fri, Jun 28, 2019 at 2:52 PM Justine Olshan <
> > jols...@confluent.io
> > > >
> > > > > > wrote:
> > > > > >
> > > > > >> Hi all,
> > > > > >> I made some changes to the KIP.
> > > > > >> The idea is to clean up the code, make behavior more explicit,
> > > provide
> > > > > >> more flexibility, and to keep default behavior the same.
> > > > > >>
> > > > > >> Now we will change the partition in onNewBatch, and specify the
> > > > > >> conditions for this function call (non-keyed values, no explicit
> > > > > >> partitions) in willCallOnNewBatch.
> > > > > >> This clears up some of the issues with the implementation. I'm
> > > happy to
> > > > > >> hear further opinions and discuss this change!
> > > > > >>
> > > > > >> Thank you,
> > > > > >> Justine
> > > > > >>
> > > > > >> On Thu, Jun 27, 2019 at 2:53 PM Colin McCabe 
> > > > > wrote:
> > > > > >>
> > > > > >>> On Thu, Jun 27, 2019, at 01:31, Ismael Juma wrote:
> > > > > >>> > Thanks for the KIP Justine. It looks pretty good. A few
> > comments:
> > > > > >>> >
> > > > > >>> > 1. Should we favor partitions that are not under replicated?
> > > This is
> > > > > >>> > something that Netflix did too.
> > > > > >>>
> > > > > >>> This seems like it could lead to cascading failures, right?  If a
> > > > > >>> partition becomes under-replicated because there is too much
> > > traffic,
> > > > > the
> > > > > >>> producer stops sending to it, which puts even more load on the
> > > > > remaining
> > > > > >>> partitions, which are even more likely to fail then, etc.  It
> > also
> > > will
> > > > > >>> create unbalanced load patterns on the consumers.
> > > > > >>>
> > > > > >>> >
> > > > > >>> > 2. If there's no measurable performance difference, I agree
> > with
> > > > > >>> Stanislav
> > > > > >>> > that Optional would be better than Integer.
> > > > > >>> >
> > > > > >>> 

Re: [DISCUSS]KIP-216: IQ should throw different exceptions for different errors

2019-07-11 Thread Matthias J. Sax
Thanks Vito! I think the KIP is shaping up nicely!


To answer the open questions you raised (I also adjusted my answers based
on the latest KIP update):



About `StreamThreadNotStartedException`: I understand what you pointed
out. However, I think we can consider the following: If a thread is not
started yet and `KafkaStreams#store()` throws this exception, we would
not return a `CompositeReadOnlyXxxStore` to the user. Hence, `get()`
cannot be called. And if we do return a `CompositeReadOnlyXxxStore`, the
thread was started and `get()` would never hit the condition to throw
the exception. Or am I missing something (this part of the logic is a
little tricky...)?

However, thinking about it, what could happen IMHO is that the
corresponding thread crashes after we hand out the store handle. In that
case, it would make sense to throw an exception from `get()`, but it
would be a different one IMHO. Maybe we need a new type
(`StreamThreadDeadException` or similar?) or we should reuse
`StoreMigratedException`, because if a thread dies we would migrate the
store to another thread. (The tricky part might be to detect this
condition correctly -- not 100% sure atm how we could do this.)

What do you think about this?



About `KafkaStreamsNotRunningException` vs
`StreamThreadNotRunningException` -- I see your point. Atm, I think we
don't allow querying at all if KafkaStreams is not in state RUNNING
(correct me if I am wrong). Hence, if there is an instance with 2
threads, and 1 thread is actually up and ready but the other thread is
not, you cannot query anything. Only if both threads are in state
RUNNING do we allow querying. It might be possible to change the code to
allow querying if a thread is ready, independent of the other threads.
In that case, the name you suggest would make more sense. But I _think_
that the current behavior is different, and thus
`KafkaStreamsNotRunningException` seems to reflect the current behavior
better? -- I also want to add that we are talking about a fatal
exception -- if a thread crashes, we would migrate the store to another
thread and it would not be fatal, but the store can be re-discovered.
Only if all threads died would it be fatal -- however, in that case
KafkaStreams would transition to DEAD anyway.



> When the user passes a store name to `KafkaStreams#store()`, does there
> have a way that distinguish the store name is "a wrong name" or "migrated"
> during `QueryableStoreProvider#getStore()`?
> From my current understanding, I cannot distinguish these two.

This should be possible. In the private KafkaStreams constructor, we
have access to the `InternalTopologyBuilder`, which can give us all known
store names. Hence, we can get a set of all known store names, keep them
as a member variable, and use it in `KafkaStreams#store()` for an initial
check of whether the store name is valid or not.
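
A minimal sketch of that initial check (the exception name follows the KIP
draft, and the known-name set is assumed to be captured at construction time):

    import java.util.Set;

    class StoreNameCheckSketch {
        // Name per the KIP draft; illustrative only.
        static class UnknownStateStoreException extends RuntimeException {
            UnknownStateStoreException(String message) {
                super(message);
            }
        }

        private final Set<String> validStoreNames;  // collected from InternalTopologyBuilder

        StoreNameCheckSketch(Set<String> validStoreNames) {
            this.validStoreNames = validStoreNames;
        }

        void checkStoreName(String storeName) {
            if (!validStoreNames.contains(storeName)) {
                // "wrong name": the store never existed in this topology
                throw new UnknownStateStoreException(
                    "Store " + storeName + " is not known in this topology");
            }
            // A known name that fails the later provider lookup can then be
            // treated as "migrated" rather than "wrong".
        }
    }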



> Should we remove `StreamThreadNotRunningException` and throw
> `FatalStateStoreException` directly ?

I would keep both, because `FatalStateStoreException` is not very
descriptive. Also, shouldn't we still have the fatal exception
`StateStoreNotAvailableException`? Not sure why you removed it.



Glad you found a way to avoid
`QueryableStoreType#setStreams(KafkaStreams streams)`.



-Matthias


On 7/5/19 8:03 AM, Vito Jeng wrote:
> Hi, Matthias,
> 
> Just completed the modification of KIP, please take a look when you are
> available.
> 
> ---
> Vito
> 
> 
> On Wed, Jul 3, 2019 at 9:07 PM Vito Jeng  wrote:
> 
>> Hi, Matthias,
>>
>> This is second part.
>>
>>> For the internal exceptions:
>>>
>>> `StateStoreClosedException` -- why can it be wrapped as
>>> `StreamThreadNotStartedException` ? It seems that the later would only
>>> be thrown by `KafkaStreams#store()` and thus would be throw directly.
>>
>> Both `StateStoreClosedException` and `EmptyStateStoreException` cannot be
>> wrapped as `StreamThreadNotStartedException`.
>> This was a mistake in the previous KIP. Thank you for pointing this out.
>>
>>> A closed-exception should only happen after a store was successfully
>>> retrieved but cannot be queried any longer? Hence, converting/wrapping
>>> it into a `StateStoreMigratedException` make sense. I am also not sure,
>>> when a closed-exception would be wrapped by a
>>> `StateStoreNotAvailableException` (implying my understanding as describe
>>> above)?
>>>
>>> Same questions about `EmptyStateStoreException`.
>>>
>>> Thinking about both internal exceptions twice, I am wondering if it
>>> makes sense to have both internal exceptions at all? I have the
>>> impression that it make only sense to wrap them with a
>>> `StateStoreMigragedException`, but if they are wrapped into the same
>>> exception all the time, we can just remove both and throw
>>> `StateStoreMigratedException` directly?
>>
>> After deeper thinking, I think you are right. It seems we can throw
>> `StateStoreMigratedException` directly.
>> So that we can remove `StateStoreClosedException`,
>> `EmptyStateStoreException` and 

[jira] [Resolved] (KAFKA-5635) KIP-181 Kafka-Connect integrate with kafka ReST Proxy

2019-07-11 Thread Konstantine Karantasis (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-5635.
---
Resolution: Won't Fix

> KIP-181 Kafka-Connect integrate with kafka ReST Proxy
> -
>
> Key: KAFKA-5635
> URL: https://issues.apache.org/jira/browse/KAFKA-5635
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Dhananjay Patkar
>Priority: Major
>  Labels: features, newbie
>
> Kafka connect currently uses kafka clients which directly connect to kafka 
> brokers. 
> In a use case wherein I have many kafka connect [producers] running remotely, 
> it's a challenge to configure broker information on every connect agent.
> Also, in case of IP change [upgrade or cluster re-creation], we need to 
> update every remote connect configuration.
> If kafka connect source connectors talk to a ReST endpoint, then the client is 
> unaware of broker details. This way we can transparently upgrade / re-create 
> the kafka cluster as long as the ReST endpoint remains the same.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-07-11 Thread Justine Olshan
Hi Dhruvil,

Thanks for reading the KIP!
That was the general idea for deprecation. We would log a warning when the
config is enabled on the broker.
I also believe that there would be a change to documentation.
If there is anything else that should be done, please let me know!

Justine

On Thu, Jul 11, 2019 at 4:17 PM Dhruvil Shah  wrote:

> Hi Justine,
>
> Thanks for the KIP, this is great!
>
> Could you add some more information about what deprecating the broker
> configuration means? Would we log a warning in the logs when auto topic
> creation is enabled on the broker, for example?
>
> Thanks,
> Dhruvil
>
> On Thu, Jul 11, 2019 at 10:28 AM Justine Olshan 
> wrote:
>
> > Hello all,
> >
> > I'd like to start a discussion thread for KIP-487.
> > This KIP plans to deprecate the current system of auto-creating topics
> > through requests to the metadata and give the producer the ability to
> > automatically create topics instead.
> >
> > More information can be found here:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-487%3A+Automatic+Topic+Creation+on+Producer
> >
> > Thank you,
> > Justine Olshan
> >
>


Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-07-11 Thread Dhruvil Shah
Hi Justine,

Thanks for the KIP, this is great!

Could you add some more information about what deprecating the broker
configuration means? Would we log a warning in the logs when auto topic
creation is enabled on the broker, for example?

Thanks,
Dhruvil

On Thu, Jul 11, 2019 at 10:28 AM Justine Olshan 
wrote:

> Hello all,
>
> I'd like to start a discussion thread for KIP-487.
> This KIP plans to deprecate the current system of auto-creating topics
> through requests to the metadata and give the producer the ability to
> automatically create topics instead.
>
> More information can be found here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-487%3A+Automatic+Topic+Creation+on+Producer
>
> Thank you,
> Justine Olshan
>


[jira] [Resolved] (KAFKA-8424) Replace ListGroups request/response with automated protocol

2019-07-11 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8424.

Resolution: Fixed

> Replace ListGroups request/response with automated protocol
> ---
>
> Key: KAFKA-8424
> URL: https://issues.apache.org/jira/browse/KAFKA-8424
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSSION] KIP-418: A method-chaining way to branch KStream

2019-07-11 Thread Matthias J. Sax
Ivan,

did you see my last reply? What do you think about my proposal to mix
both approaches and try to get the best of both worlds?


-Matthias
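
For reference, the blended proposal discussed below would read roughly like
this (all signatures illustrative, not a final API; the Order type and its
methods are made up for the example):

    // A Consumer variant chains processing inline and is omitted from the
    // returned Map; a Function variant lands in the Map under its required name.
    Map<String, KStream<String, Order>> branches = stream
        .split()
        .branch((k, v) -> v.isBulk(), s -> s.to("bulk-orders"))  // Consumer: inline, no name
        .branch((k, v) -> v.isPriority(),
                s -> s.mapValues(Order::escalate),
                Named.as("priority"))                            // Function: name required
        .defaultBranch();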

On 6/11/19 3:56 PM, Matthias J. Sax wrote:
> Thanks for the input John!
> 
>> under your suggestion, it seems that the name is required
> 
> If you want to get the `KStream` as part of the `Map` back using a
> `Function`, yes. If you follow the "embedded chaining" pattern using a
> `Consumer`, no.
> 
> Allowing for a default name via `split()` can of course be done.
> Similarly, using `Named` instead of `String` is possible.
> 
> I wanted to sketch out a high level proposal to merge both patterns
> only. Your suggestions to align the new API with the existing API make
> totally sense.
> 
> 
> 
> One follow up question: Would `Named` be optional or required in
> `split()` and `branch()`? It's unclear from your example.
> 
> If both are mandatory, what do we gain by it? The returned `Map` only
> contains the corresponding branches, so why should we prefix all of
> them? If only `Named` is mandatory in `branch()`, but optional in
> `split()`, the same question raises?
> 
> Requiring `Named` in `split()` only seems to make sense if `Named` is
> optional in `branch()` and we generate a `-X` suffix using a counter for
> the different branch names. However, this might lead to the problem of
> changing names if branches are added/removed. Also, how would the names
> be generated if `Consumer` is mixed in (ie, not all branches are
> returned in the `Map`).
> 
> If `Named` is optional for both, it could happen that a user forgets to
> specify a name for a branch, which would lead to runtime issues.
> 
> 
> Hence, I am actually in favor of not allowing a default name, but keeping
> `split()` without a parameter and making `Named` in `branch()` required if a
> `Function` is used. This makes it explicit to the user that specifying a
> name is required if a `Function` is used.
> 
> 
> 
> About
> 
>> KBranchedStream#branch(BranchConfig)
> 
> I don't think that the branching predicate is a configuration and hence
> would not include it in a configuration object.
> 
>> withChain(...);
> 
> Similarly, `withChain()` (which would only take a `Consumer`?) does not
> seem to be a configuration. We also cannot prevent a user from calling
> `withName()` in combination with `withChain()`, which does not make sense
> IMHO. We could of course throw an RTE, but not having a compile-time check
> seems less appealing. Also, it could happen that neither `withChain()`
> nor `withName()` is called and the branch is missing from the returned
> `Map`, which leads to runtime issues, too.
> 
> Hence, I don't think that we should add `BranchConfig`. A config object
> is helpful if each configuration can be set independently of all others,
> but this seems not to be the case here. If we add new configuration
> later, we can also just move forward by deprecating the methods that
> accept `Named` and adding new methods that accept `BranchConfig` (which
> would of course implement `Named`).
> 
> 
> Thoughts?
> 
> 
> @Ivan, what do you think about the general idea to blend the two main
> approaches of returning a `Map` plus an "embedded chaining"?
> 
> 
> 
> -Matthias
> 
> 
> 
> On 6/4/19 10:33 AM, John Roesler wrote:
>> Thanks for the idea, Matthias, it does seem like this would satisfy
>> everyone. Returning the map from the terminal operations also solves
>> the problem of merging/joining the branched streams, if we want to add
>> support for the compliment later on.
>>
>> Under your suggestion, it seems that the name is required. Otherwise,
>> we wouldn't have keys for the map to return. I think this is actually
>> not too bad, since experience has taught us that, although names for
>> operations are not required to define stream processing logic, it does
>> significantly improve the operational experience when you can map the
>> topology, logs, metrics, etc. back to the source code. Since you
>> wouldn't (have to) reference the name to chain extra processing onto
>> the branch (thanks to the second argument), you can avoid the
>> "unchecked name" problem that Ivan pointed out.
>>
>> In the current implementation of Branch, you can name the branch
>> operator itself, and then all the branches get index-suffixed names
>> built from the branch operator name. I guess under this proposal, we
>> could naturally append the branch name to the branching operator name,
>> like this:
>>
>>stream.split(Named.withName("mysplit")) //creates node "mysplit"
>>   .branch(..., ..., "abranch") // creates node "mysplit-abranch"
>>   .defaultBranch(...) // creates node "mysplit-default"
>>
>> It does make me wonder about the DSL syntax itself, though.
>>
>> We don't have a defined grammar, so there's plenty of room to debate
>> the "best" syntax in the context of each operation, but in general,
>> the KStream DSL operators follow this pattern:
>>
>> operator(function, config_object?) OR operator(config_object)
>>
>> where 

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread Matthias J. Sax
`Sum` is an existing name, for the "sampled sum" metric, that gets
deprecated. Hence, we cannot use it.

If we cannot use `Sum` and use `TotalSum`, we should also not use
`Count` but `TotalCount` for consistency.


-Matthias



On 7/11/19 12:58 PM, Bruno Cadonna wrote:
> Hi John,
> 
> Thank you for the KIP.
> 
> LGTM
> 
> I also do not like CumulativeSum/Count so much. I propose to just call
> it Sum and Count.
> 
> I understand that you want to unequivocally distinguish the two metric
> functions by their names, but I have the feeling the names become
> artificially complex. The exact semantics can also be documented in
> the javadocs, which btw could also be improved in those classes.
> 
> Best,
> Bruno
> 
> 
> 
> On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax  wrote:
>>
>> Thanks for the KIP. Overall LGTM.
>>
>> The only thought I have is, if we may want to use `TotalSum` and
>> `TotalCount` instead of `CumulativeSum/Count` as names?
>>
>>
>> -Matthias
>>
>>
>> On 7/11/19 9:31 AM, John Roesler wrote:
>>> Hi Kafka devs,
>>>
>>> I'd like to propose KIP-488 as a minor cleanup of some of our metric
>>> implementations.
>>>
>>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw
>>>
>>> Over time, iterative updates to these metrics have resulted in a pretty
>>> confusing little collection of classes, and I've personally been
>>> involved in three separate moderately time-consuming iterations of me
>>> or someone else trying to work out which metrics are available, and
>>> which ones are desired for a given use case. One of these was actually
>>> a long-running bug in Kafka Streams' metrics, so not only has this
>>> confusion been a time sink, but it has also led to bugs.
>>>
>>> I'm hoping this change won't be too controversial.
>>>
>>> Thanks,
>>> -John
>>>
>>





Re: request access to create KIP

2019-07-11 Thread Jun Rao
Hi, Jose,

Thanks for your interest. Just gave you the wiki permission.

Jun

On Thu, Jul 11, 2019 at 3:39 PM Jose M  wrote:

> Hello,
>
> Id like permisssions to create a new KIP on the wiki.
>
> Wiki Id: jose.moralesaragon
>
>
> Thanks a lot,
>
> Jose M
>


request access to create KIP

2019-07-11 Thread Jose M
Hello,

Id like permisssions to create a new KIP on the wiki.

Wiki Id: jose.moralesaragon


Thanks a lot,

Jose M


Build failed in Jenkins: kafka-trunk-jdk8 #3780

2019-07-11 Thread Apache Jenkins Server
See 


Changes:

[cshapi] KAFKA-8644; The Kafka protocol generator should allow null defaults for

--
[...truncated 2.55 MB...]


[jira] [Resolved] (KAFKA-7688) Allow byte array class for Decimal Logical Types to fix Debezium Issues

2019-07-11 Thread Konstantine Karantasis (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-7688.
---
   Resolution: Fixed
Fix Version/s: (was: 1.1.1)

This issue was resolved outside the Apache Kafka repo with this fix: 
[https://github.com/confluentinc/schema-registry/pull/1020]

Feel free to re-open if there's still missing functionality that belongs 
strictly to the Kafka Connect framework.

> Allow byte array class for Decimal Logical Types to fix Debezium Issues
> ---
>
> Key: KAFKA-7688
> URL: https://issues.apache.org/jira/browse/KAFKA-7688
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 1.1.1
>Reporter: Eric C Abis
>Priority: Blocker
>
> Decimal Logical Type fields are failing with Kafka Connect sink tasks and 
> showing this error:
> {code:java}
> Invalid Java object for schema type BYTES: class [B for field: "null"{code}
> There is an issue tracker for the problem here in the Confluent Schema 
> Registry tracker (it's all related):  
> [https://github.com/confluentinc/schema-registry/issues/833]
> I've created a fix for this issue and tested and verified it in our CF4 
> cluster here at Shutterstock.
> The issue boils down to the fact that Avro Decimal Logical types store values 
> as Byte Arrays. Debezium sets the Default Value as Base64-encoded Byte 
> Arrays and record values as Big Integer Byte Arrays. I'd like to submit a 
> PR that changes the SCHEMA_TYPE_CLASSES hash map in 
> org.apache.kafka.connect.data.ConnectSchema to allow Byte Arrays for Decimal 
> fields.
> I reached out to us...@kafka.apache.org to ask for 
> GitHub permissions, but if there is somewhere else I need to reach out to, 
> please let me know.
> My GitHub user is TheGreatAbyss
> Thank You!
> Eric



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Build failed in Jenkins: kafka-trunk-jdk11 #689

2019-07-11 Thread Apache Jenkins Server
See 


Changes:

[cshapi] KAFKA-8644; The Kafka protocol generator should allow null defaults for

--
[...truncated 2.56 MB...]


Re: [jira] [Created] (KAFKA-8326) Add List Serde

2019-07-11 Thread Jan Filipiak
I think this encourages bad decisions.
Let's just have people define repeated fields in Thrift, Avro, JSON, or
Protobuf. It's gonna look nasty if you've got your 11th layer of lists.

If you really want to add lists, please do Map as well in one shot

Best Jan

On 06.05.2019 17:59, Daniyar Yeralin (JIRA) wrote:
> Daniyar Yeralin created KAFKA-8326:
> --
> 
>  Summary: Add List Serde
>  Key: KAFKA-8326
>  URL: https://issues.apache.org/jira/browse/KAFKA-8326
>  Project: Kafka
>   Issue Type: Improvement
>   Components: clients, streams
> Reporter: Daniyar Yeralin
> 
> 
> I propose adding serializers and deserializers for the java.util.List class.
> 
> I have many use cases where I want to set the key of a Kafka message to be a 
> UUID. Currently, I need to turn UUIDs into strings or byte arrays and use 
> their associated Serdes, but it would be more convenient to serialize and 
> deserialize UUIDs directly.
> 
> I believe there are many use cases where one would want to have a List serde. 
> Ex. 
> [https://stackoverflow.com/questions/41427174/aggregate-java-objects-in-a-list-with-kafka-streams-dsl-windows],
>  
> [https://stackoverflow.com/questions/46365884/issue-with-arraylist-serde-in-kafka-streams-api]
> 
>  
> 
> KIP Link: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization]
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
> 


Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread Bruno Cadonna
Hi John,

Thank you for the KIP.

LGTM

I also do not like CumulativeSum/Count so much. I propose to just call
it Sum and Count.

I understand that you want to unequivocally distinguish the two metric
functions by their names, but I have the feeling the names become
artificially complex. The exact semantics can also be documented in
the javadocs, which btw could also be improved in those classes.

Best,
Bruno



On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax  wrote:
>
> Thanks for the KIP. Overall LGTM.
>
> The only thought I have is, if we may want to use `TotalSum` and
> `TotalCount` instead of `CumulativeSum/Count` as names?
>
>
> -Matthias
>
>
> On 7/11/19 9:31 AM, John Roesler wrote:
> > Hi Kafka devs,
> >
> > I'd like to propose KIP-488 as a minor cleanup of some of our metric
> > implementations.
> >
> > KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw
> >
> > Over time, iterative updates to these metrics have resulted in a pretty
> > confusing little collection of classes, and I've personally been
> > involved in three separate moderately time-consuming iterations of me
> > or someone else trying to work out which metrics are available, and
> > which ones are desired for a given use case. One of these was actually
> > a long-running bug in Kafka Streams' metrics, so not only has this
> > confusion been a time sink, but it has also led to bugs.
> >
> > I'm hoping this change won't be too controversial.
> >
> > Thanks,
> > -John
> >
>


Re: [DISCUSS] KIP-479: Add Materialized to Join

2019-07-11 Thread Bill Bejeck
Thanks all for the great discussion so far.

Everyone has made excellent points, and I appreciate the detail everyone
has put into their arguments.

However, after carefully evaluating all the points made so far, creating an
overload with Materialized is still my #1 option.
My reasoning for saying so is two-fold:

   1. It's a small change, and IMHO since it's consistent with our current
   API concerning state store usage, the cognitive load on users will be
   minimal.
   2. It achieves the most important goal of this KIP, namely to close the
   gap of naming state stores independently of the join operator name.

Additionally, I agree with the points made by Matthias earlier (I realize
there is some overlap here).

>  - the main purpose of this KIP is to close the naming gap what we achieve
>  - we can allow people to use the new in-memory store
>  - we allow people to enable/disable caching
>  - we unify the API
>  - we decouple querying from naming
>  - it's a small API change

Although it's not a perfect solution, IMHO the positives of using
Materialized far outweigh the negatives, and from what we've discussed so
far, anything we implement seems to involve an additional change down the
road.

If others are still strongly opposed to using Materialized, my other
preferences would be

   1. Add a "withStoreName" to Joined.  Although I agree with Guozhang that
   having a parameter that only applies to one use-case would be clumsy.
   2. Add a String overload for naming the store, but this would be my
   least favorite option as IMHO this seems to be a step backward from why we
   introduced configuration objects in the first place.
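
Concretely, the overload in option #1 would be used roughly like this (a sketch
assuming left and right are KStream<String, String>; the five-argument overload
is the proposal here, not an existing API, and the exact signature belongs to
the KIP):

    // Naming the join's state stores via Materialized, independent of the
    // operator name carried by Joined.
    KStream<String, String> joined = left.join(
        right,
        (leftValue, rightValue) -> leftValue + rightValue,
        JoinWindows.of(Duration.ofMinutes(5)),
        Joined.with(Serdes.String(), Serdes.String(), Serdes.String()),
        Materialized.as("my-join-store"));  // proposed addition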

Thanks,
Bill

On Thu, Jun 27, 2019 at 4:45 PM Matthias J. Sax 
wrote:

> Thanks for the KIP Bill!
>
> Great discussion to far.
>
> About John's idea about querying upstream stores and don't materialize a
> store: I agree with Bill that this seems to be an orthogonal question,
> and it might be better to treat it as an independent optimization and
> exclude from this KIP.
>
> > What should be the behavior if there is no store
> > configured (e.g., if Materialized with only serdes) and querying is
> > enabled?
>
> IMHO, this could be an error case. If one wants to query a store, they
> need to provide a name -- if you don't know the name, how would you
> actually query the store (even if it would be possible to get the name
> from the `TopologyDescription`, it seems clumsy).
>
> If we don't want to throw an error, materializing seems to be the right
> option, to exclude "query optimization" from this KIP. I would be ok
> with this option, even if it's clumsy to get the name from
> `TopologyDescription`; hence, I would prefer to treat it as an error.
>
> > To get back to the current behavior, users would have to
> > add a "bytes store supplier" to the Materialized to indicate that,
> > yes, they really want a state store there.
>
> This sound like a quite subtle semantic difference on how to use the
> API. Might be hard to explain to users. I would prefer to not introduce it.
>
>
>
> About Guozhang's points:
>
> 1a) That is actually a good point. However, I believe we cannot get
> around this issue easily, and it seems ok to me, to expose the actual
> store type we are using. (More thoughts later.)
>
> 1b) I don't see an issue with allowing users to query all stores? What
> is the rational behind it? What do we gain by not allowing it?
>
> 2) While I understand what you are saying, we also want/need to have a
> way in the PAPI to allow users adding "internal/private" non-queryable
> stores to a topology. That's possible via
> `Materialized#withQueryingDisabled()`. We could also update
> `Topology#addStateStore(StoreBuilder, boolean isQueryable, String...)`
> to address this. Again, I agree with Bill that the current API is built
> in a certain way, and if we want to change it, it should be a separate
> KIP, as it seems to be an orthogonal concern.
>
> > Instead, we just restrict KIP-307 to NOT
> > use the Joined.name for state store names and always use internal names
> as
> > well, which admittedly indeed leaves a hole of not being able to cover
> all
> > internal names here
>
> I think it's important to close this gap. Naming entities seems to be a
> binary feature: if there is a gap, the feature is more or less useless,
> rendering KIP-307 void.
>
>
>
> I like John's detailed list of required features and what
> Materialized/WindowByteStoreSuppliers offers. My take is, that adding
> Materialized including the required run-time checks is the best option
> we have, for the following reasons:
>
>  - the main purpose of this KIP is to close the naming gap what we achieve
>  - we can allow people to use the new in-memory store
>  - we allow people to enable/disable caching
>  - we unify the API
>  - we decouple querying from naming
>  - it's a small API change
>
> Adding an overload and only passing in a name, would address the main
> purpose of the KIP. However, it falls short on 

Re: [DISCUSS] KIP-478 Strongly Typed Processor API

2019-07-11 Thread Matthias J. Sax
Side remark:

> Now that "flat transform" is a specific
>> part of the API it seems okay to steer folks in that direction (to never
>> use context.forward in a transformer), but it should be called out
>> explicitly in javadocs.  Currently Transformer (which is used for both
>> transform() and flatTransform() ) doesn't really call out the ambiguity:

Would you want to do a PR to address this? We are always eager to
improve the JavaDocs!


-Matthias
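
For context, the strongly-typed shape the KIP is driving at looks roughly like
this (generics per the KIP draft, not the final API; Order and ValidatedOrder
are made-up types):

    // Sketch of the proposed strongly-typed Processor (KIP-478 draft shape).
    public class OrderValidator implements Processor<String, Order, String, ValidatedOrder> {
        private ProcessorContext<String, ValidatedOrder> context;

        @Override
        public void init(ProcessorContext<String, ValidatedOrder> context) {
            this.context = context;
        }

        @Override
        public void process(String key, Order value) {
            // forward() is now type-checked: only (String, ValidatedOrder) compiles.
            context.forward(key, ValidatedOrder.from(value));
        }

        @Override
        public void close() {}
    }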

On 7/7/19 11:26 AM, Paul Whalen wrote:
> First of all, +1 on the whole idea, my team has run into (admittedly minor,
> but definitely annoying) issues because of the weaker typing.  We're heavy
> users of the PAPI and have Processors that, while not hundreds of lines
> long, are certainly quite hefty and call context.forward() in many places.
> 
> After reading the KIP and discussion a few times, I've convinced myself
> that any initial concerns I had aren't really concerns at all (state store
> types, for one).  One thing I will mention:  changing *Transformer* to have
> ProcessorContext gave me pause, because I have code that does
> context.forward in transformers.  Now that "flat transform" is a specific
> part of the API it seems okay to steer folks in that direction (to never
> use context.forward in a transformer), but it should be called out
> explicitly in javadocs.  Currently Transformer (which is used for both
> transform() and flatTransform() ) doesn't really call out the ambiguity:
> https://github.com/apache/kafka/blob/ca641b3e2e48c14ff308181c775775408f5f35f7/streams/src/main/java/org/apache/kafka/streams/kstream/Transformer.java#L75-L77,
> and for migrating users (from before flatTransform) it could be confusing.
> 
> Side note, I'd like to plug KIP-401 (there is a discussion thread and a
> voting thread) which also relates to using the PAPI.  It seems like there
> is some interest and it is in a votable state with the majority of
> implementation complete.
> 
> Paul
> 
> On Fri, Jun 28, 2019 at 2:02 PM Bill Bejeck  wrote:
> 
>> Sorry for coming late to the party.
>>
>> As for the naming I'm in favor of RecordProcessor as well.
>>
>> I agree that we should not take on doing all of the package movements as
>> part of this KIP, especially as John has pointed out, it will be an
>> opportunity to discuss some clean-up on individual classes which I envision
>> becoming another somewhat involved process.
>>
>> For the end goal, if possible, here's what I propose.
>>
>>1. We keep the scope of the KIP the same, *but we only implement* *it in
>>phases*
>>2. Phase one could include what Guozhang had proposed earlier namely
>>1. > 1.a) modifying ProcessorContext only with the output types on
>>   forward.
>>   > 1.b) modifying Transformer signature to have generics of
>>   ProcessorContext,
>>   > and then lift the restricting of not using punctuate: if user did
>>   not
>>   > follow the enforced typing and just code without generics, they
>>   will get
>>   > warning at compile time and get run-time error if they forward
>>   wrong-typed
>>   > records, which I think would be acceptable.
>>3. Then we could tackle other pieces in an incremental manner as we see
>>what makes sense
>>
>> Just my 2cents
>>
>> -Bill
>>
>> On Mon, Jun 24, 2019 at 10:22 PM Guozhang Wang  wrote:
>>
>>> Hi John,
>>>
>>> Yeah I think we should not do all the repackaging as part of this KIP as
>>> well (we can just do the movement of the Processor / ProcessorSupplier),
>>> but I think we need to discuss the end goal here since otherwise we may
>> do
>>> the repackaging of Processor in this KIP, but only later on realizing
>> that
>>> other re-packagings are not our favorite solutions.
>>>
>>>
>>> Guozhang
>>>
>>> On Mon, Jun 24, 2019 at 3:06 PM John Roesler  wrote:
>>>
 Hey Guozhang,

 Thanks for the idea! I'm wondering if we could take a middle ground
 and take your proposed layout as a "roadmap", while only actually
 moving the classes that are already involved in this KIP.

 The reason I ask is not just to control the scope of this KIP, but
 also, I think that if we move other classes to new packages, we might
 also want to take the opportunity to clean up other things about them.
 But each one of those would become a discussion point of its own, so
 it seems the discussion would become intractable. FWIW, I do like your
 idea for precisely this reason, it creates opportunities for us to
 consider other changes that we are simply not able to make without
 breaking source compatibility.

 If the others feel "kind of favorable" with this overall vision, maybe
 we can make one or more Jira tickets to capture it, and then just
 alter _this_ proposal to `processor.api.Processor` (etc).

 WDYT?
 -John

 On Sun, Jun 23, 2019 at 7:17 PM Guozhang Wang 
>>> wrote:
>
> Hello John,
>
> Thanks for your detailed explanation, 

Re: [DISCUSS] KIP-466: Add support for List serialization and deserialization

2019-07-11 Thread Matthias J. Sax
Daniyar,

thanks for the update to the KIP. It's in really good shape and well
written.

About the default constructor question:

All Serdes/Serializer/Deserializer classes need a default constructor to
create them easily via reflection when specified in a config. I
understand that it is not super user friendly, but all existing code
works this way. Hence, it seems best to stick with the established pattern.

We have a similar issue with `TimeWindowedSerde` and
`SessionWindowedSerde`, and I just recently did a PR to improve user
experience that address the exact issue John raised. (cf
https://github.com/apache/kafka/pull/7067)

Note that if a user instantiates the Serde manually, they would also need
to call `configure()` to set up the inner serdes. Kafka Streams would not
set those up automatically, and one would most likely end up with an NPE.
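
To make this concrete, here is a minimal sketch of the two creation paths,
using the `Serdes.ListSerde(...)` factory proposed in this KIP (not yet a
released API; the empty config map is a placeholder for the keys discussed
below):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;

// Source-code-driven path: the inner serde is supplied directly, so no
// configure() call is needed to wire it up.
Serde<List<Integer>> explicit =
    Serdes.ListSerde(ArrayList.class, Serdes.Integer());

// Config-driven path: created via the default constructor, as reflection
// would do it. Forgetting configure() leaves the inner serde null, and
// the first (de)serialization attempt throws an NPE.
Serde<List<Integer>> reflective = new Serdes.ListSerde<>();
reflective.configure(Map.of(/* keys as proposed below */), true /* isKey */);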


Coming back to the KIP and the parameter names: `WindowedSerdes` are
similar to `ListSerde` as they wrap another Serde. For `WindowedSerdes`,
we use the following parameter names:

- default.windowed.key.serde.inner
- default.windowed.value.serde.inner


It might be good to align the naming pattern. I would also suggest using
`type` instead of `impl`:


default.key.list.serde.impl  ->  default.list.key.serde.type
default.value.list.serde.impl  ->  default.list.value.serde.type
default.key.list.serde.element  ->  default.list.key.serde.inner
default.value.list.serde.element  ->  default.list.value.serde.inner
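
For illustration, with the renamed keys a default list serde would be
configured roughly like this (a sketch of the proposal, not a released
config; `Serdes.ListSerde` and the two `default.list.*` keys are the
proposed additions):

Properties props = new Properties();
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
          Serdes.ListSerde.class.getName());
props.put("default.list.key.serde.type", ArrayList.class.getName());
props.put("default.list.key.serde.inner",
          Serdes.IntegerSerde.class.getName());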



-Matthias


On 7/10/19 8:52 AM, Development wrote:
> Hi John,
> 
> Yes, I do agree. That totally makes sense. The only thing is that it goes 
> against what Matthias suggested earlier:
> "I think that ... `ListSerde` should have an default constructor and it 
> should be possible to pass in the `Class listClass` information via a 
> configuration. Otherwise, KafkaStreams cannot use it as default serde.”
> 
> What do you think about that? I hope I’m not confusing anything.
> 
> Best,
> Daniyar Yeralin
> 
>> On Jul 9, 2019, at 5:56 PM, John Roesler  wrote:
>>
>> Ah, my apologies, I must have just overlooked it. Thanks for the update, too.
>>
>> Just one more super-small question, do we need this variant: 
>>
>>> New method public static  Serde> ListSerde() in 
>>> org.apache.kafka.common.serialization.Serdes class (infers list 
>>> implementation and inner serde from config file)
>>
>> It seems like this situation implies my config file is already set up for 
>> the list serde, so passing this serde (e.g., in Produced) would have the 
>> same effect as not specifying it. 
>>
>> I guess that it could be the case that you have the 
>> `default.key/value.serde` set to something else, like StringSerde, but you 
>> still have the `default.key/value.list.serde.impl/element` set. This seems 
>> like it would result in more confusion than convenience, so my gut instinct 
>> is maybe we shouldn't introduce the `ListSerde()` variant until people 
>> actually request it later on.
>>
>> Thus, we'd just stick with fully config-driven or fully source-code-driven, 
>> not half/half.
>>
>> What do you think?
>>
>> Thanks,
>> -John
>>
>>
>> On Tue, Jul 9, 2019 at 9:58 AM Development > > wrote:
>>>
>>> Hi John,
>>>
>>> I hope everyone had a great long weekend.
>>>
>>> Regarding Java interfaces, I may not understand you correctly, but I think 
>>> I already listed them:
>>>
>>> So for Produced, you would use it in the following fashion, for example: 
>>> Produced.keySerde(Serdes.ListSerde(ArrayList.class, Serdes.Integer()))
>>>
>>> I also updated the KIP, and added a section “Serialization Strategy” where 
>>> I describe our logic of conditional serialization based on the type of an 
>>> inner serde.
>>>
>>> Thank you!
>>>
>>> Best,
>>> Daniyar Yeralin
>>>
>>> On Jun 26, 2019, at 11:44 AM, John Roesler >> > wrote:
>>>
>>> Thanks for the update, Daniyar!
>>>
>>> In addition to specifying the config interface, can you also specify
>>> the Java interface? Namely, if I need to pass an instance of this
>>> serde in to the DSL directly, as in Produced, Materialized, etc., what
>>> constructor(s) would I have available? Likewise with the Serializer
>>> and Deserailizer. I don't think you need to specify the implementation
>>> logic, since we've already discussed it here.
>>>
>>> If you also want to specify the serialized format of the data records
>>> in the KIP, it could be useful documentation, as well as letting us
>>> verify the schema for forward/backward compatibility concerns, etc.
>>>
>>> Thanks,
>>> John
>>>
>>> On Wed, Jun 26, 2019 at 10:33 AM Development >> > wrote:
>>>
>>>
>>> Hey,
>>>
>>> Finally made updates to the KIP: 
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>  
>>> 
>>>  
>>> 

Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread Matthias J. Sax
Thanks for the KIP. Overall LGTM.

The only thought I have is whether we may want to use `TotalSum` and
`TotalCount` instead of `CumulativeSum/Count` as names?
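
Either way, a call site would read roughly like this (a sketch against the
proposed API; it assumes a `Metrics` registry named `metrics` from
org.apache.kafka.common.metrics, and the sensor and metric names are made
up):

Sensor sensor = metrics.sensor("bytes-fetched");
sensor.add(metrics.metricName("bytes-fetched-total", "consumer-metrics"),
           new CumulativeSum());  // or TotalSum, whichever name wins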


-Matthias


On 7/11/19 9:31 AM, John Roesler wrote:
> Hi Kafka devs,
> 
> I'd like to propose KIP-488 as a minor cleanup of some of our metric
> implementations.
> 
> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw
> 
> Over time, iterative updates to these metrics have resulted in a pretty
> confusing little collection of classes, and I've personally been
> involved in three separate moderately time-consuming iterations of me
> or someone else trying to work out which metrics are available, and
> which ones are desired for a given use case. One of these was actually
> a long-running bug in Kafka Streams' metrics, so not only has this
> confusion been a time sink, but it has also led to bugs.
> 
> I'm hoping this change won't be too controversial.
> 
> Thanks,
> -John
> 





[DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-07-11 Thread Justine Olshan
Hello all,

I'd like to start a discussion thread for KIP-487.
This KIP plans to deprecate the current system of auto-creating topics
through metadata requests and to give the producer the ability to create
topics automatically instead.

More information can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-487%3A+Automatic+Topic+Creation+on+Producer

Thank you,
Justine Olshan


[jira] [Resolved] (KAFKA-8653) Regression in JoinGroup v0 rebalance timeout handling

2019-07-11 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-8653.

   Resolution: Fixed
Fix Version/s: 2.3.1

> Regression in JoinGroup v0 rebalance timeout handling
> -
>
> Key: KAFKA-8653
> URL: https://issues.apache.org/jira/browse/KAFKA-8653
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 2.3.1
>
>
> The rebalance timeout was added to the JoinGroup protocol in version 1. Prior 
> to 2.3, we handled version 0 JoinGroup requests by setting the rebalance 
> timeout to be equal to the session timeout. We lost this logic when we 
> converted the API to use the generated schema definition which uses the 
> default value of -1. The impact of this is that the group rebalance timeout 
> becomes 0, so rebalances finish immediately after we enter the 
> PrepareRebalance state and kick out all old members. This causes consumer 
> groups to enter an endless rebalance loop.
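>
> A minimal sketch of the restored handling (illustrative only, not the
> actual broker code; the accessor names are assumptions based on the
> generated schema classes):
>
> int rebalanceTimeoutMs = joinGroupRequest.version() == 0
>         ? joinGroupRequest.data().sessionTimeoutMs()    // v0: fall back
>         : joinGroupRequest.data().rebalanceTimeoutMs(); // v1+: as sent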



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] KIP-435: Internal Partition Reassignment Batching

2019-07-11 Thread Viktor Somogyi-Vass
Hi Stan,

I meant the following (maybe rare) scenario - we have replicas [1, 2, 3] on
a lot of partitions and the user runs a massive rebalance to change them
all to [3, 2, 1]. In the old behavior, I think that this would not do
anything but simply change the replica set in ZK.
Then, the user could run kafka-preferred-replica-election.sh on a given set
of partitions to make sure the new leader 3 gets elected.

I thought the old algorithm would elect 3 as the leader right away at the
end in this case, but I have to double-check this. In any case, I think it
would make sense in the new algorithm to elect the new preferred leader
right away, regardless of whether the new leader is chosen from the
existing replicas or not. If the whole reassignment is in fact just
changing the replica order, then either way it is a simple (trivial)
operation, and doing it batched wouldn't slow it down much, as there is no
data movement involved. If the reassignment is mixed, meaning it contains
both reordering and real movement, then electing right away would in fact
help to even out the load faster, as data movements can take long. For
instance, with a reassignment batch of two partitions running
concurrently, P1: (1,2,3) -> (3,2,1) and P2: (4,5,6) -> (7,8,9), the P2
reassignment would elect a new leader but P1 wouldn't, which wouldn't help
the goal of normalizing traffic on broker 1 that much. Again, I'll have to
check how the current algorithm works and whether what I sketched above
has any unknown drawbacks.

As for generic preferred leader election, when called from the admin API
or the auto-leader-balance feature, I think you're right that we should
leave it as it is. It doesn't involve any data movement, so it's fairly
fast, and it normalizes the cluster state quickly.
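
For reference, this is how a client triggers that election through the
2.3-era admin API (a minimal, illustrative sketch; the bootstrap address,
the partition, and the result handling are assumptions for the example):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.TopicPartition;

Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
TopicPartition tp = new TopicPartition("a", 0);
try (AdminClient admin = AdminClient.create(props)) {
    // Asks the controller to move leadership of a-0 back to the preferred
    // (first) replica, e.g. broker 3 after a (1,2,3) -> (3,2,1) reorder.
    admin.electPreferredLeaders(Collections.singleton(tp))
         .partitionResult(tp)
         .get();
}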

Viktor

On Tue, Jul 9, 2019 at 9:04 PM Stanislav Kozlovski 
wrote:

> Hey Viktor,
>
> I think it is intuitive if they are on a global level... If we applied
> > them on every batch then we
> > couldn't really guarantee any limits as the user would be able to get
> > around it with submitting lots of reassignments.
>
>
> Agreed. Could we mention this explicitly in the KIP?
>
> Also if I understand correctly, AlterPartitionAssignmentsRequest would be a
> > partition level batching, isn't it? So if you submit 10 partitions at
> once
> > then they'll all be started by the controller immediately as per my
> > understanding.
>
>
> Yep, absolutely
>
> I've raised the ordering problem on the discussion of KIP-455 in a bit
> > different form and as I remember the verdict there was that we shouldn't
> > expose ordering as an API. It might not be easy as you say and there
> might
> > be much better strategies to follow (like disk or network utilization
> > goals). Therefore I'll remove this section from the KIP.
>
>
> Sounds good to me.
>
> I'm not sure I get this scenario. So are you saying that after they
> > submitted a reassignment they also submit a preferred leader change?
> > In my mind what I would do is:
> > i) make auto.leader.rebalance.enable to obey the leader movement limit as
> > this way it will be easier to calculate the reassignment batches.
> >
>
> Sorry, this is my fault -- I should have been more clear.
> First, I didn't think through this well enough at the time, I don't think.
> If we have replicas=[1, 2, 3] and we reassign them to [4, 5, 6], it is
> obvious that a leader shift will happen. Your KIP proposes we shift
> replicas 1 and 4 first.
>
> I meant the following (maybe rare) scenario - we have replicas [1, 2, 3] on
> a lot of partitions and the user runs a massive rebalance to change them
> all to [3, 2, 1]. In the old behavior, I think that this would not do
> anything but simply change the replica set in ZK.
> Then, the user could run kafka-preferred-replica-election.sh on a given set
> of partitions to make sure the new leader 3 gets elected.
>
> ii) the user could submit preferred leader election but it's basically a
> > form of reassignment so it would fall under the batching criterias. If
> the
> > batch they submit is smaller than the internal, then it'd be executed
> > immediately but otherwise it would be split into more batches. It might
> be
> > a different behavior as it may not be executed it in one batch but I
> think
> > it isn't a problem because we'll default to Int.MAX with the batches and
> > otherwise because since it's a form of reassignment I think it makes
> sense
> > to apply the same logic.
>
>
> The AdminAPI for that is "electPreferredLeaders(Collection
> partitions)" and the old zNode is "{"partitions": [{"topic": "a",
> "partition": 0}]}" so it is a bit less explicit than our other reassignment
> API, but the functionality is the same.
> You're 100% right that it is a form of reassignment, but I hadn't thought
> of it like that and I some other people wouldn't have either.
> If I understand correctly, you're suggesting that we count the
> "reassignment.parallel.leader.movements" config against such preferred
> elections. I 

[jira] [Created] (KAFKA-8657) Automatic Topic Creation on Producer

2019-07-11 Thread Justine Olshan (JIRA)
Justine Olshan created KAFKA-8657:
-

 Summary: Automatic Topic Creation on Producer
 Key: KAFKA-8657
 URL: https://issues.apache.org/jira/browse/KAFKA-8657
 Project: Kafka
  Issue Type: Improvement
Reporter: Justine Olshan
Assignee: Justine Olshan


Kafka has a feature that allows for the auto-creation of topics. Usually this
is done through a metadata request to the broker. KIP-487 aims to give the
producer this functionality.

See 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-487%3A+Automatic+Topic+Creation+on+Producer]
 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (KAFKA-8644) The Kafka protocol generator should allow null defaults for bytes and array fields

2019-07-11 Thread Gwen Shapira (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-8644.
-
   Resolution: Fixed
Fix Version/s: 2.4.0

> The Kafka protocol generator should allow null defaults for bytes and array 
> fields
> --
>
> Key: KAFKA-8644
> URL: https://issues.apache.org/jira/browse/KAFKA-8644
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>Priority: Major
> Fix For: 2.4.0
>
>
> The Kafka protocol generator should allow null defaults for bytes and array 
> fields.  Currently, null defaults are only allowed for string fields.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[DISCUSS] KIP-489 Kafka Consumer Record Latency Metric

2019-07-11 Thread Sean Glover
Hi kafka-dev,

I've created KIP-489 as a proposal for adding latency metrics to the Kafka
Consumer in a similar way as record-lag metrics are implemented.

https://cwiki.apache.org/confluence/display/KAFKA/489%3A+Kafka+Consumer+Record+Latency+Metric

Regards,
Sean

-- 
Principal Engineer, Lightbend, Inc.



@seg1o , in/seanaglover



[DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics

2019-07-11 Thread John Roesler
Hi Kafka devs,

I'd like to propose KIP-488 as a minor cleanup of some of our metric
implementations.

KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw

Over time, iterative updates to these metrics have resulted in a pretty
confusing little collection of classes, and I've personally been
involved in three separate moderately time-consuming iterations of me
or someone else trying to work out which metrics are available, and
which ones are desired for a given use case. One of these was actually
a long-running bug in Kafka Streams' metrics, so not only has this
confusion been a time sink, but it has also led to bugs.

I'm hoping this change won't be too controversial.

Thanks,
-John


[jira] [Created] (KAFKA-8656) Kafka Consumer Record Latency Metric

2019-07-11 Thread Sean Glover (JIRA)
Sean Glover created KAFKA-8656:
--

 Summary: Kafka Consumer Record Latency Metric
 Key: KAFKA-8656
 URL: https://issues.apache.org/jira/browse/KAFKA-8656
 Project: Kafka
  Issue Type: New Feature
  Components: metrics
Reporter: Sean Glover
Assignee: Sean Glover


Consumer lag is a useful metric to monitor how many records are queued to be
processed. We can look at individual lag per partition, or we may aggregate
metrics. For example, we may want to monitor the maximum lag of any
particular partition in our consumer subscription so we can identify hot
partitions caused by an insufficient producer partitioning strategy. We may
want to monitor the sum of lag across all partitions so we have a sense of
our total backlog of messages to consume. Lag in offsets is useful when you
have a good understanding of your messages and processing characteristics,
but it doesn't tell us how far behind _in time_ we are. This is known as
wait time in queueing theory, or more informally it's referred to as latency.

The latency of a message can be defined as the difference between when that
message was first produced and when it is received by a consumer. The
latency of records in a partition correlates with lag, but a larger lag
doesn't necessarily mean a larger latency. For example, a topic consumed by
two separate application consumer groups A and B may have similar lag but
different latency per partition. Application A is a consumer which performs
CPU-intensive business logic on each message it receives. It's distributed
across many consumer group members to handle the load quickly enough, but
since its processing time is slower, it takes longer to process each message
per partition. Meanwhile, Application B is a consumer which performs a
simple ETL operation to land streaming data in another system, such as HDFS.
It may have similar lag to Application A, but because it has a faster
processing time, its latency per partition is significantly less.

If the Kafka Consumer reported a latency metric, it would be easier to build
Service Level Agreements (SLAs) based on non-functional requirements of the
streaming system. For example: the system must never have a latency greater
than 10 minutes. This SLA could be used in monitoring alerts or as input to
automatic scaling solutions.
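
As a sketch of that definition (illustrative only, not the proposed metric
implementation; it assumes create-time record timestamps, an already
subscribed `consumer`, and a `maxLatencyMs` variable feeding a max gauge):

import java.time.Duration;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;

ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
for (ConsumerRecord<String, String> record : records) {
    // receive time minus produce (create) time = per-record latency
    long latencyMs = System.currentTimeMillis() - record.timestamp();
    maxLatencyMs = Math.max(maxLatencyMs, latencyMs);
}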

[KIP-489|https://cwiki.apache.org/confluence/display/KAFKA/489%3A+Kafka+Consumer+Record+Latency+Metric]

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: PR builds stopped since 07/10 11AM

2019-07-11 Thread Bill Bejeck
Thanks Ismael!

On Thu, Jul 11, 2019 at 10:41 AM Ismael Juma  wrote:

> Hi all,
>
> After updating some jobs to the "modern" PR builder, I noticed several
> regressions and shared the details in the Apache Builds mailing list.
> Thankfully, the team has re-enabled the old PR builder. I have reverted the
> changes to the jobs so they should work as they were 2-3 days ago.
>
> Ismael
>
> On Wed, Jul 10, 2019 at 10:37 PM Boyang Chen 
> wrote:
>
> > Thanks a lot Ismael for the help here!
> >
> > On Wed, Jul 10, 2019 at 7:30 PM Ismael Juma  wrote:
> >
> > > I was reading the Apache Builds mailing list and:
> > >
> > > as part of some cleanup and consolidation (essentially we don't want to
> > > > maintain two different plugins that do the same thing), we have
> removed
> > > > support for the old GitHub PR Builder on Jenkins, and are focusing on
> > the
> > > > modern variant. If your build previously made use of the old one
> (It's
> > > > called "GitHub Pull Request Builder" in the job configuration), we
> ask
> > > that
> > > > you please switch to the newer one (called "Build pull requests to
> the
> > > > repository" in the same config). There should be no other changes,
> but
> > if
> > > > your builds start acting up, do let infra know :).
> > > > As an added bonus, you no longer need to contact infra about webhooks
> > > when
> > > > setting up PR builds for new repositories, it should just work(tm).
> > >
> > >
> > > I'll see if I can update the jobs.
> > >
> > > Ismael
> > >
> > > On Wed, Jul 10, 2019 at 6:09 PM Ismael Juma  wrote:
> > >
> > > > Hi Boyang,
> > > >
> > > > An INFRA JIRA is the right way to proceed here. Unfortunately none of
> > the
> > > > JIRAs below are for the right project.
> > > >
> > > > Ismael
> > > >
> > > > On Wed, Jul 10, 2019, 5:09 PM Boyang Chen <
> reluctanthero...@gmail.com>
> > > > wrote:
> > > >
> > > >> Hey there,
> > > >>
> > > >> just want you to be aware that all the Jenkins PR builds are stopped
> > > since
> > > >> today's morning:
> > > >> https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/
> > > >> https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/
> > > >>
> > > >> I have filed two JIRAs
> > > >> https://issues.apache.org/jira/browse/INFRATEST1-10
> > > >>  and https://issues.apache.org/jira/browse/KAFKA-8652 but not sure
> > > about
> > > >> the next steps to fix it.
> > > >>
> > > >> Let me know your thoughts.
> > > >>
> > > >> Boyang
> > > >>
> > > >
> > >
> >
>


Re: PR builds stopped since 07/10 11AM

2019-07-11 Thread Ismael Juma
Hi all,

After updating some jobs to the "modern" PR builder, I noticed several
regressions and shared the details in the Apache Builds mailing list.
Thankfully, the team has re-enabled the old PR builder. I have reverted the
changes to the jobs so they should work as they were 2-3 days ago.

Ismael

On Wed, Jul 10, 2019 at 10:37 PM Boyang Chen 
wrote:

> Thanks a lot Ismael for the help here!
>
> On Wed, Jul 10, 2019 at 7:30 PM Ismael Juma  wrote:
>
> > I was reading the Apache Builds mailing list and:
> >
> > as part of some cleanup and consolidation (essentially we don't want to
> > > maintain two different plugins that do the same thing), we have removed
> > > support for the old GitHub PR Builder on Jenkins, and are focusing on
> the
> > > modern variant. If your build previously made use of the old one (It's
> > > called "GitHub Pull Request Builder" in the job configuration), we ask
> > that
> > > you please switch to the newer one (called "Build pull requests to the
> > > repository" in the same config). There should be no other changes, but
> if
> > > your builds start acting up, do let infra know :).
> > > As an added bonus, you no longer need to contact infra about webhooks
> > when
> > > setting up PR builds for new repositories, it should just work(tm).
> >
> >
> > I'll see if I can update the jobs.
> >
> > Ismael
> >
> > On Wed, Jul 10, 2019 at 6:09 PM Ismael Juma  wrote:
> >
> > > Hi Boyang,
> > >
> > > An INFRA JIRA is the right way to proceed here. Unfortunately none of
> the
> > > JIRAs below are for the right project.
> > >
> > > Ismael
> > >
> > > On Wed, Jul 10, 2019, 5:09 PM Boyang Chen 
> > > wrote:
> > >
> > >> Hey there,
> > >>
> > >> just want you to be aware that all the Jenkins PR builds are stopped
> > since
> > >> today's morning:
> > >> https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/
> > >> https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/
> > >>
> > >> I have filed two JIRAs
> > >> https://issues.apache.org/jira/browse/INFRATEST1-10
> > >>  and https://issues.apache.org/jira/browse/KAFKA-8652 but not sure
> > about
> > >> the next steps to fix it.
> > >>
> > >> Let me know your thoughts.
> > >>
> > >> Boyang
> > >>
> > >
> >
>


Re: [DISCUSS] KIP-213: Second follow-up on Foreign Key Joins

2019-07-11 Thread Jan Filipiak


On 10.07.2019 06:25, Adam Bellemare wrote:
> In my experience (obviously empirical) it seems that many people just want
> the ability to join on foreign keys for the sake of handling all the
> relational data in their event streams and extra tombstones don't matter at
> all. This has been my own experience from our usage of our internal
> implementation at my company, and that of many others who have reached out
> to me.

backing this.



Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-11 Thread Rajini Sivaram
+1 (binding)

Thanks for the KIP, Andy!

Regards,

Rajini


On Thu, Jul 11, 2019 at 1:18 PM Gwen Shapira  wrote:

> +1 (binding)
>
> Thank you for the improvement.
>
> On Thu, Jul 11, 2019, 3:53 AM Andy Coates  wrote:
>
> > Hi All,
> >
> > So voting currently stands on:
> >
> > Binding:
> > +1 Matthias,
> > +1 Colin
> >
> > Non-binding:
> > +1  Thomas Becker
> > +1 Satish Guggana
> > +1 Ryan Dolan
> >
> > So we're still 1 binding vote short. :(
> >
> >
> > On Wed, 3 Jul 2019 at 23:08, Matthias J. Sax 
> > wrote:
> >
> > > Thanks for the details Colin and Andy.
> > >
> > > My intent was not to block the KIP, but it seems to be a fair question
> > > to ask.
> > >
> > > I talked to Ismael offline about it and understand his reasoning better
> > > now. If we don't deprecate `abstract AdminClient` class, it seems
> > > reasonable to not deprecate the corresponding factory methods either.
> > >
> > >
> > > +1 (binding) on the current proposal
> > >
> > >
> > >
> > > -Matthias
> > >
> > > On 7/3/19 5:03 AM, Andy Coates wrote:
> > > > Matthias,
> > > >
> > > > I was referring to platforms such as spark or flink that support
> > multiple
> > > > versions of the Kafka clients. Ismael mentioned this higher up on the
> > > > thread.
> > > >
> > > > I'd prefer this KIP didn't get held up over somewhat unrelated
> change,
> > > i.e.
> > > > should the factory method be on the interface or utility class.
> > Surely,
> > > > now would be a great time to change this if we wanted, but we can
> also
> > > > change this later if we need to.  In the interest of moving forward,
> > can
> > > I
> > > > propose we leave the factory methods as they are in the KIP?
> > > >
> > > > Thanks,
> > > >
> > > > Andy
> > > >
> > > > On Tue, 2 Jul 2019 at 17:14, Colin McCabe 
> wrote:
> > > >
> > > >> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
> > > >>> On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
> > >  Not sure, if I understand the argument?
> > > 
> > >  Why would anyone need to support multiple client side versions?
> > >  Clients/brokers are forward/backward compatible anyway.
> > > >>>
> > > >>> When you're using many different libraries, it is helpful if they
> > don't
> > > >>> impose tight constraints on what versions their dependencies are.
> > > >>> Otherwise you can easily get in a situation where the constraints
> > can't
> > > >>> be satisfied.
> > > >>>
> > > 
> > >  Also, if one really supports multiple client side versions, won't
> > they
> > >  use multiple shaded dependencies for different versions?
> > > >>>
> > > >>> Shading the Kafka client doesn't really work, because of how we use
> > > >> reflection.
> > > >>>
> > > 
> > >  Last, it's possible to suppress warnings (at least in Java).
> > > >>>
> > > >>> But not in Scala.  So that does not help (for example), Scala
> users.
> > > >>
> > > >> I meant to write "Spark users" here.
> > > >>
> > > >> C.
> > > >>
> > > >>>
> > > >>> I agree that in general we should be using deprecation when
> > > >>> appropriate, regardless of the potential annoyances to users.  But
> > I'm
> > > >>> not sure deprecating this method is really worth it.
> > > >>>
> > > >>> best,
> > > >>> Colin
> > > >>>
> > > >>>
> > > 
> > >  Can you elaborate?
> > > 
> > >  IMHO, just adding a statement to JavaDocs is a little weak, and at
> > > some
> > >  point, we need to deprecate those methods anyway if we ever want
> to
> > >  remove them. The earlier we deprecate them, the earlier we can
> > remove
> > > >> them.
> > > 
> > > 
> > >  -Matthias
> > > 
> > >  On 7/1/19 4:22 AM, Andy Coates wrote:
> > > > Done. I've not deprecated the factory methods on the AdminClient
> > for
> > > >> the
> > > > same reason the AdminClient itself is not deprecated, i.e. this
> > > >> would cause
> > > > unavoidable warnings for libraries / platforms that support
> > multiple
> > > > versions of Kafka. However, I think we add a note to the Java
> docs
> > of
> > > > `AdminClient` to indicate that its use, going forward, is
> > > >> discouraged in
> > > > favour of the new `Admin` interface and explain why it's not been
> > > > deprecated, but that it may/will be removed in a future version.
> > > >
> > > > Regarding factory methods on interfaces: there seems to be some
> > > >> difference
> > > > of opinion here. I'm not sure of the best approach to resolve this.
> this.
> > > >> At the
> > > > moment the KIP has factory methods on the new `Admin` interface,
> > > >> rather
> > > > than some utility class. I prefer the utility class, but this
> isn't
> > > >> inline
> > > > with the patterns in the code base and some of the core team have
> > > >> expressed
> > > > they'd prefer to continue to have the factory methods on the
> > > >> interface.
> > > > I'm happy with this if others are.
> > > >
> > > > Thanks,
> > > >
> > > > Andy
> > > >
> > > 

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-11 Thread Gwen Shapira
+1 (binding)

Thank you for the improvement.

On Thu, Jul 11, 2019, 3:53 AM Andy Coates  wrote:

> Hi All,
>
> So voting currently stands on:
>
> Binding:
> +1 Matthias,
> +1 Colin
>
> Non-binding:
> +1  Thomas Becker
> +1 Satish Guggana
> +1 Ryan Dolan
>
> So we're still 1 binding vote short. :(
>
>
> On Wed, 3 Jul 2019 at 23:08, Matthias J. Sax 
> wrote:
>
> > Thanks for the details Colin and Andy.
> >
> > My intent was not to block the KIP, but it seems to be a fair question
> > to ask.
> >
> > I talked to Ismael offline about it and understand his reasoning better
> > now. If we don't deprecate `abstract AdminClient` class, it seems
> > reasonable to not deprecate the corresponding factory methods either.
> >
> >
> > +1 (binding) on the current proposal
> >
> >
> >
> > -Matthias
> >
> > On 7/3/19 5:03 AM, Andy Coates wrote:
> > > Matthias,
> > >
> > > I was referring to platforms such as spark or flink that support
> multiple
> > > versions of the Kafka clients. Ismael mentioned this higher up on the
> > > thread.
> > >
> > > I'd prefer this KIP didn't get held up over somewhat unrelated change,
> > i.e.
> > > should the factory method be on the interface or utility class.
> Surely,
> > > now would be a great time to change this if we wanted, but we can also
> > > change this later if we need to.  In the interest of moving forward,
> can
> > I
> > > propose we leave the factory methods as they are in the KIP?
> > >
> > > Thanks,
> > >
> > > Andy
> > >
> > > On Tue, 2 Jul 2019 at 17:14, Colin McCabe  wrote:
> > >
> > >> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
> > >>> On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
> >  Not sure, if I understand the argument?
> > 
> >  Why would anyone need to support multiple client side versions?
> >  Clients/brokers are forward/backward compatible anyway.
> > >>>
> > >>> When you're using many different libraries, it is helpful if they
> don't
> > >>> impose tight constraints on what versions their dependencies are.
> > >>> Otherwise you can easily get in a situation where the constraints
> can't
> > >>> be satisfied.
> > >>>
> > 
> >  Also, if one really supports multiple client side versions, won't
> they
> >  use multiple shaded dependencies for different versions?
> > >>>
> > >>> Shading the Kafka client doesn't really work, because of how we use
> > >> reflection.
> > >>>
> > 
> >  Last, it's possible to suppress warnings (at least in Java).
> > >>>
> > >>> But not in Scala.  So that does not help (for example), Scala users.
> > >>
> > >> I meant to write "Spark users" here.
> > >>
> > >> C.
> > >>
> > >>>
> > >>> I agree that in general we should be using deprecation when
> > >>> appropriate, regardless of the potential annoyances to users.  But
> I'm
> > >>> not sure deprecating this method is really worth it.
> > >>>
> > >>> best,
> > >>> Colin
> > >>>
> > >>>
> > 
> >  Can you elaborate?
> > 
> >  IMHO, just adding a statement to JavaDocs is a little weak, and at
> > some
> >  point, we need to deprecate those methods anyway if we ever want to
> >  remove them. The earlier we deprecate them, the earlier we can
> remove
> > >> them.
> > 
> > 
> >  -Matthias
> > 
> >  On 7/1/19 4:22 AM, Andy Coates wrote:
> > > Done. I've not deprecated the factory methods on the AdminClient
> for
> > >> the
> > > same reason the AdminClient itself is not deprecated, i.e. this
> > >> would cause
> > > unavoidable warnings for libraries / platforms that support
> multiple
> > > versions of Kafka. However, I think we add a note to the Java docs
> of
> > > `AdminClient` to indicate that its use, going forward, is
> > >> discouraged in
> > > favour of the new `Admin` interface and explain why it's not been
> > > deprecated, but that it may/will be removed in a future version.
> > >
> > > Regarding factory methods on interfaces: there seems to be some
> > >> difference
> > > of opinion here. I'm not sure of the best approach to resolve this.
> > >> At the
> > > moment the KIP has factory methods on the new `Admin` interface,
> > >> rather
> > > than some utility class. I prefer the utility class, but this isn't
> > >> inline
> > > with the patterns in the code base and some of the core team have
> > >> expressed
> > > they'd prefer to continue to have the factory methods on the
> > >> interface.
> > > I'm happy with this if others are.
> > >
> > > Thanks,
> > >
> > > Andy
> > >
> > > On Thu, 27 Jun 2019 at 23:21, Matthias J. Sax <
> matth...@confluent.io
> > >
> > >> wrote:
> > >
> > >> @Andy:
> > >>
> > >> What about the factory methods of `AdminClient` class? Should they
> > >> be
> > >> deprecated?
> > >>
> > >> One nit about the KIP: can you maybe insert "code blocks" to
> > >> highlight
> > >> the API changes? Code blocks would make the KIP a lot easier to read.
> > 

Build failed in Jenkins: kafka-trunk-jdk8 #3779

2019-07-11 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-8653; Default rebalance timeout to session timeout for JoinGroup

--
[...truncated 2.63 MB...]
org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testOrderOfPluginUrlsWithJars STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testOrderOfPluginUrlsWithJars PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithClasses STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithClasses PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testConnectFrameworkClasses STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testConnectFrameworkClasses PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testThirdPartyClasses STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testThirdPartyClasses PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testClientConfigProvider STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testClientConfigProvider PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testAllowedConnectFrameworkClasses STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testAllowedConnectFrameworkClasses PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testJavaLibraryClasses STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testJavaLibraryClasses PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testEmptyPluginUrls STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testEmptyPluginUrls PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testConnectorClientConfigOverridePolicy STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testConnectorClientConfigOverridePolicy PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithJars STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithJars PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithZips STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithZips PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithRelativeSymlinkBackwards STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithRelativeSymlinkBackwards PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithAbsoluteSymlink STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testPluginUrlsWithAbsoluteSymlink PASSED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testEmptyStructurePluginUrls STARTED

org.apache.kafka.connect.runtime.isolation.PluginUtilsTest > 
testEmptyStructurePluginUrls PASSED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureConnectRestExtension STARTED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureConnectRestExtension PASSED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureConverters STARTED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureConverters PASSED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureInternalConverters STARTED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureInternalConverters PASSED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureDefaultHeaderConverter STARTED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureDefaultHeaderConverter PASSED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureExplicitlySetHeaderConverterWithCurrentClassLoader 
STARTED

org.apache.kafka.connect.runtime.isolation.PluginsTest > 
shouldInstantiateAndConfigureExplicitlySetHeaderConverterWithCurrentClassLoader 
PASSED

org.apache.kafka.connect.runtime.ErrorHandlingTaskTest > 
testErrorHandlingInSinkTasks STARTED

org.apache.kafka.connect.runtime.ErrorHandlingTaskTest > 
testErrorHandlingInSinkTasks PASSED

org.apache.kafka.connect.runtime.ErrorHandlingTaskTest > 
testErrorHandlingInSourceTasks STARTED

org.apache.kafka.connect.runtime.ErrorHandlingTaskTest > 
testErrorHandlingInSourceTasks PASSED

org.apache.kafka.connect.runtime.ErrorHandlingTaskTest > 
testErrorHandlingInSourceTasksWthBadConverter STARTED

org.apache.kafka.connect.runtime.ErrorHandlingTaskTest > 
testErrorHandlingInSourceTasksWthBadConverter PASSED

org.apache.kafka.connect.util.ConnectUtilsTest > testLookupNullKafkaClusterId 
STARTED

org.apache.kafka.connect.util.ConnectUtilsTest > 

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-11 Thread Andy Coates
Hi All,

So voting currently stands on:

Binding:
+1 Matthias,
+1 Colin

Non-binding:
+1  Thomas Becker
+1 Satish Guggana
+1 Ryan Dolan

So we're still 1 binding vote short. :(


On Wed, 3 Jul 2019 at 23:08, Matthias J. Sax  wrote:

> Thanks for the details Colin and Andy.
>
> My intent was not to block the KIP, but it seems to be a fair question
> to ask.
>
> I talked to Ismael offline about it and understand his reasoning better
> now. If we don't deprecate `abstract AdminClient` class, it seems
> reasonable to not deprecate the corresponding factory methods either.
>
>
> +1 (binding) on the current proposal
>
>
>
> -Matthias
>
> On 7/3/19 5:03 AM, Andy Coates wrote:
> > Matthias,
> >
> > I was referring to platforms such as spark or flink that support multiple
> > versions of the Kafka clients. Ismael mentioned this higher up on the
> > thread.
> >
> > I'd prefer this KIP didn't get held up over somewhat unrelated change,
> i.e.
> > should the factory method be on the interface or utility class.  Surely,
> > now would be a great time to change this if we wanted, but we can also
> > change this later if we need to.  In the interest of moving forward, can
> I
> > propose we leave the factory methods as they are in the KIP?
> >
> > Thanks,
> >
> > Andy
> >
> > On Tue, 2 Jul 2019 at 17:14, Colin McCabe  wrote:
> >
> >> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
> >>> On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
>  Not sure, if I understand the argument?
> 
>  Why would anyone need to support multiple client side versions?
>  Clients/brokers are forward/backward compatible anyway.
> >>>
> >>> When you're using many different libraries, it is helpful if they don't
> >>> impose tight constraints on what versions their dependencies are.
> >>> Otherwise you can easily get in a situation where the constraints can't
> >>> be satisfied.
> >>>
> 
>  Also, if one really supports multiple client side versions, won't they
>  use multiple shaded dependencies for different versions?
> >>>
> >>> Shading the Kafka client doesn't really work, because of how we use
> >> reflection.
> >>>
> 
>  Last, it's possible to suppress warnings (at least in Java).
> >>>
> >>> But not in Scala.  So that does not help (for example), Scala users.
> >>
> >> I meant to write "Spark users" here.
> >>
> >> C.
> >>
> >>>
> >>> I agree that in general we should be using deprecation when
> >>> appropriate, regardless of the potential annoyances to users.  But I'm
> >>> not sure deprecating this method is really worth it.
> >>>
> >>> best,
> >>> Colin
> >>>
> >>>
> 
>  Can you elaborate?
> 
>  IMHO, just adding a statement to JavaDocs is a little weak, and at
> some
>  point, we need to deprecate those methods anyway if we ever want to
>  remove them. The earlier we deprecate them, the earlier we can remove
> >> them.
> 
> 
>  -Matthias
> 
>  On 7/1/19 4:22 AM, Andy Coates wrote:
> > Done. I've not deprecated the factory methods on the AdminClient for
> >> the
> > same reason the AdminClient itself is not deprecated, i.e. this
> >> would cause
> > unavoidable warnings for libraries / platforms that support multiple
> > versions of Kafka. However, I think we add a note to the Java docs of
> > `AdminClient` to indicate that its use, going forward, is
> >> discouraged in
> > favour of the new `Admin` interface and explain why it's not been
> > deprecated, but that it may/will be removed in a future version.
> >
> > Regarding factory methods on interfaces: there seems to be some
> >> difference
> > of opinion here. I'm not sure of the best approach to resolve this.
> >> At the
> > moment the KIP has factory methods on the new `Admin` interface,
> >> rather
> > than some utility class. I prefer the utility class, but this isn't
> >> inline
> > with the patterns in the code base and some of the core team have
> >> expressed
> > they'd prefer to continue to have the factory methods on the
> >> interface.
> > I'm happy with this if others are.
> >
> > Thanks,
> >
> > Andy
> >
> > On Thu, 27 Jun 2019 at 23:21, Matthias J. Sax  >
> >> wrote:
> >
> >> @Andy:
> >>
> >> What about the factory methods of `AdminClient` class? Should they
> >> be
> >> deprecated?
> >>
> >> One nit about the KIP: can you maybe insert "code blocks" to
> >> highlight
> >> the API changes? Code blocks would make the KIP a lot easier to read.
> >>
> >>
> >> -Matthias
> >>
> >> On 6/26/19 6:56 AM, Ryanne Dolan wrote:
> >>> +1 (non-binding)
> >>>
> >>> Thanks.
> >>> Ryanne
> >>>
> >>> On Tue, Jun 25, 2019 at 10:21 PM Satish Duggana <
> >> satish.dugg...@gmail.com>
> >>> wrote:
> >>>
>  +1 (non-binding)
> 
>  On Wed, Jun 26, 2019 at 8:37 AM Satish Duggana <
> >> satish.dugg...@gmail.com>
> 

[jira] [Created] (KAFKA-8655) Failed to start systemctl kafka (code=exited, status=1/FAILURE)

2019-07-11 Thread Ben (JIRA)
Ben created KAFKA-8655:
--

 Summary: Failed to start systemctl kafka (code=exited, 
status=1/FAILURE)
 Key: KAFKA-8655
 URL: https://issues.apache.org/jira/browse/KAFKA-8655
 Project: Kafka
  Issue Type: Bug
Reporter: Ben


Here is what I did:

sudo systemctl enable confluent-zookeeper
sudo systemctl enable confluent-kafka
sudo systemctl start confluent-zookeeper

This failed with a file-access error; after a chmod, ZooKeeper now starts
fine.

sudo systemctl start confluent-kafka

This still fails with an error I couldn't fix. This is the output:

    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.j
    at java.nio.channels.FileChannel.open(FileChannel.java:287)
    at java.nio.channels.FileChannel.open(FileChannel.java:335)
    at org.apache.kafka.common.record.FileRecords.openChannel(FileRecords.java:4
    at org.apache.kafka.common.record.FileRecords.open(FileRecords.java:410)
    at org.apache.kafka.common.record.FileRecords.open(FileRecords.java:419)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Jenkins build is back to normal : kafka-trunk-jdk8 #3778

2019-07-11 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-8654) Cant restart heartbeatThread if encountered unexpected exception in heartbeatloop。

2019-07-11 Thread nick allen (JIRA)
nick allen created KAFKA-8654:
-

 Summary: Cant restart heartbeatThread if encountered unexpected 
exception in heartbeatloop。
 Key: KAFKA-8654
 URL: https://issues.apache.org/jira/browse/KAFKA-8654
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 2.1.0
Reporter: nick allen


There is a consumer in our cluster which has had relatively high CPU usage
for several days, caused by the Kafka poll thread. We dug in and found that
this was because
org.apache.kafka.clients.consumer.internals.AbstractCoordinator#timeToNextHeartbeat
returned zero, leading to a non-blocking select, which in turn led to
pollForFetches returning immediately. But the actual poll timeout is set to
1s, so pollForFetches was called thousands of times per poll/second.

We used a tool to inspect in-memory variables, which shows the
corresponding heartbeatTimer's attributes:

@Timer[
 time=@SystemTime[org.apache.kafka.common.utils.SystemTime@4d806627],
 startMs=@Long[1562075783801], // Jul 02 2019 13:56:23
 currentTimeMs=@Long[1562823681506], // Thu Jul 11 2019 05:41:21
 deadlineMs=@Long[1562075793801], // Tue Jul 02 2019 13:56:33
 ]
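
Plugging the dumped values in makes the busy loop visible (illustrative
arithmetic, mirroring what the timer's remaining-time computation
effectively does):

long currentTimeMs = 1562823681506L;  // Thu Jul 11 2019 05:41:21
long deadlineMs    = 1562075793801L;  // Tue Jul 02 2019 13:56:33
// The deadline is ~10 days in the past, so the remaining time clamps to 0,
// timeToNextHeartbeat() returns 0, and poll() stops blocking entirely.
long remainingMs = Math.max(0L, deadlineMs - currentTimeMs);  // == 0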

That shows that the heartbeat hasn't been happening for about 10 days, and
jstack shows the corresponding heartbeat thread is dead. Unfortunately, we
don't keep logs for that long, so I can't figure out what happened then.

IMO the heartbeat thread is too important to be left dead; there should at
least be some way to revive it, but it seems that
startHeartbeatThreadIfNeeded can only be triggered by restarting or by the
heartbeat itself.

It's also confusing that almost everything in
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.HeartbeatThread#run
is async, so it seems impossible for any exception to happen there; why are
there so many catch clauses?

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)