Re: [DISCUSS] KIP-326: Schedulable KTable as Graph source

2018-07-04 Thread flaviostutz
John, that was fantastic, man!
Have you built any custom implementation of your KIP on your machine so that I
could test it out here? I'd love to try it.
If you need any help implementing this feature, please tell me.

Thanks.

-Flávio Stutz




On 2018/07/03 18:04:52, John Roesler  wrote: 
> Hi Flávio,
> Thanks! I think that we can actually do this, but the API could be better.
> I've included Java code below, but I'll copy and modify your example so
> we're on the same page.
> 
> EXERCISE 1:
>   - The case is "total counting of events for a huge website"
>   - Tasks from Application A will have something like:
>  .stream(/site-events)
>  .transform( re-key s.t. the new key is the partition id)
>  .groupByKey() // you have to do this before count
>  .count()
>   // you explicitly published to a one-partition topic here, but it's
>   // actually sufficient just to re-group onto one key. You could name and
>   // pre-create the intermediate topic here, but you don't need a separate
>   // application for the final aggregation.
>  .groupBy((partitionId, partialCount) -> new KeyValue<>("ALL", partialCount))
>  .aggregate(sum up the partialCounts)
>  .publish(/counter-total)
> 
> I've left out the suppressions, but they would go right after the count()
> and the aggregate().
> 
> With this program, you don't have to worry about the double-aggregation you
> mentioned in the last email. The KTable produced by the first count() will
> maintain the correct count per partition. If the value changes for any
> partition, it'll emit a retraction of the old value and then the new value
> downstream, so that the final aggregation can update itself properly.
> 
> I think we can optimize both the execution and the programmability by adding
> a "global aggregation" concept. But in principle, it seems like this usage
> of the current API will support your use case.
> 
> Once again, though, this is just to present an alternative. I haven't done
> the math on whether your proposal would be more efficient.
> 
> Thanks,
> -John
> 
> Here's the same algorithm written in Java:
> 
> final KStream<String, String> siteEvents = builder.stream("/site-events");
> 
> // here we re-key the events so that the key is actually the partition id.
> // we don't need the value to do a count, so I just set it to "1".
> final KStream<Integer, Integer> keyedByPartition = siteEvents.transform(()
> -> new Transformer<String, String, KeyValue<Integer, Integer>>() {
>     private ProcessorContext context;
> 
>     @Override
>     public void init(final ProcessorContext context) {
>         this.context = context;
>     }
> 
>     @Override
>     public KeyValue<Integer, Integer> transform(final String key, final String value) {
>         return new KeyValue<>(context.partition(), 1);
>     }
> 
>     @Override
>     public void close() {} // nothing to clean up
> });
> 
> // Note that we can't do "count()" on a KStream, we have to group it first.
> // I'm grouping by the key, so it will produce the count for each key.
> // Since the key is actually the partition id, it will produce the
> // pre-aggregated count per partition.
> // Note that the result is a KTable<Integer, Long>. It'll always
> // contain the most recent count for each partition.
> final KTable<Integer, Long> countsByPartition =
>     keyedByPartition.groupByKey().count();
> 
> // Now we get ready for the final roll-up. We re-group all the constituent
> // counts onto a single key.
> final KGroupedTable<String, Long> singlePartition =
>     countsByPartition.groupBy((key, value) -> new KeyValue<>("ALL", value));
> 
> final KTable<String, Long> totalCount =
>     singlePartition.reduce((l, r) -> l + r, (l, r) -> l - r);
> 
> totalCount.toStream().foreach((k, v) -> {
>     // k is always "ALL"
>     // v is always the most recent total value
>     System.out.println("The total event count is: " + v);
> });
> 
> 
> On Tue, Jul 3, 2018 at 9:21 AM flaviost...@gmail.com 
> wrote:
> 
> > Great feature you have there!
> >
> > I'll try to exercise here how we would achieve the same functional
> > objectives using your KIP:
> >
> > EXERCISE 1:
> >   - The case is "total counting of events for a huge website"
> >   - Tasks from Application A will have something like:
> >  .stream(/site-events)
> >  .count()
> >  .publish(/single-partitioned-topic-with-count-partials)
> >   - The published messages will be, for example:
> >   ["counter-task1", 2345]
> >   ["counter-task2", 8495]
> >   ["counter-task3", 4839]
> >   - Single Task from Application B will have something like:
> >  .stream(/single-partitioned-topic-with-count-partials)
> >  .aggregate(by messages whose key starts with "counter")
> >  .publish(/counter-total)
> >   - FAIL HERE. How would I know the overall number of partitions? Two
> > partials from the same task may arrive before the other tasks' and be
> > aggregated twice.
> >
> > I tried to think about using GlobalKTables, but I couldn't find an easy way
> > to aggregate the keys from that table. Do you have any clue?
> >
> > Thanks.
> >
> > -Flávio Stutz
> >
> >
> >
> >
> >
> >
> > 

Re: [VOTE] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-07-04 Thread Nishanth Pradeep
Hello Ted,

That's my fault. I replied in the PR because, at the time John made his
comment, I hadn't yet subscribed to the dev mailing list. You can see
my comment here.

https://github.com/apache/kafka/pull/5284#discussion_r199360544

There is still ongoing discussion about the purpose of TopologyDescription.
Some parts of TopologyDescription might need to be reworked so that the
return types are not strings, but TopologyDescription will continue to
print the human-readable description of the Topology by implicitly calling
the toString method of the underlying object. Please see the discussion
thread for more details.

However, I believe this KIP is finished and the other changes might need to
be part of another KIP.

Best,
Nishanth Pradeep

On Tue, Jul 3, 2018 at 10:54 PM, Ted Yu  wrote:

> Hi,
> I don't seem to find response to John's comment :
>
> http://search-hadoop.com/m/Kafka/uyzND11alrn1G5N3Y1?subj=
> Re+Discuss+KIP+321+Add+method+to+get+TopicNameExtractor+in+
> TopologyDescription
>
> On Tue, Jul 3, 2018 at 7:38 PM, Nishanth Pradeep 
> wrote:
>
> > Hello,
> >
> > I would like to start the vote on extending the TopologyDescription.Sink
> > interface to return the class of the TopicNameExtractor in cases where
> > dynamic routing is used.
> >
> > The user can override the toString method of the TopicNameExtractor class
> > in order to provide a better textual description if he or she chooses.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> >
> > Best,
> > Nishanth Pradeep
> >
>


Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

2018-07-04 Thread Matthias J. Sax
Thanks for the discussion. I am just catching up.

In general, I think we have different use cases, and the non-windowed and
windowed cases are quite different. For the non-windowed case, suppress() has
no (useful) close or retention time, no final-result semantics, and also no
business-logic impact.

On the other hand, for windowed aggregations, close time and final
result do have a meaning. IMHO, `close()` is part of business logic
while retention time is not. Also, suppression of intermediate results is
not a business rule, and there might be use cases for which only "early
intermediates" (before window end time) are suppressed, or all
intermediates are suppressed (or maybe something in the middle, i.e.,
just reducing the load of intermediate updates). Thus, window suppression
is much richer.

IMHO, a generic `suppress()` operator that can be inserted into the data
flow at any point is useful. Maybe we should keep it as generic as
possible. However, it might be difficult to use with regard to
windowing, as the mental effort to use it is high.

With regard to Guozhang's comment:

> we will actually
> process data as old as 30 days as well, while most of the late updates
> beyond 5 minutes would be discarded anyways.

If we use `suppress()` as a standalone operator, this is correct and
intended IMHO. To address the issue when this behavior is unwanted, I would
suggest adding a "suppress option" directly to the
`count()/reduce()/aggregate()` window operators, similar to
`Materialized`. This would be an "embedded suppress" and would avoid the
issue. It would also address the concern about mental effort for the "single
final window result" use case.
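
For illustration only (the `Suppressed` parameter below is a made-up name,
not an existing API), such an embedded option could look roughly like:

    builder.stream("input")
           .groupByKey()
           .windowedBy(TimeWindows.of(60_000))
           // hypothetical overload taking a suppression config
           .count(Materialized.as("counts"), Suppressed.untilWindowCloses())
           .toStream()
           .to("output");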

I also think that a shorter close-time than retention time is useful for
window aggregation. If we add close() to the window definition and
until() to `Materialized`, we can separate both correctly IMHO.

About setting `close = min(close, retention)` I am not sure. We might
rather throw an exception than reduce the close time automatically.
Otherwise, I foresee many user questions like "I set close to X but it does
not get updated for some data that arrives with a delay of X".

The tricky question might be to design the API in a backward compatible
way though.



-Matthias

On 7/3/18 5:38 AM, John Roesler wrote:
> Hi Guozhang,
> 
> I see. It seems like if we want to decouple 1) and 2), we need to alter the
> definition of the window. Do you think it would close the gap if we added a
> "window close" time to the window definition?
> 
> Such as:
> 
> builder.stream("input")
> .groupByKey()
> .windowedBy(
>   TimeWindows
> .of(60_000)
> .closeAfter(10 * 60)
> .until(30L * 24 * 60 * 60 * 1000)
> )
> .count()
> .suppress(Suppression.finalResultsOnly());
> 
> Possibly called "finalResultsAtWindowClose" or something?
> 
> Thanks,
> -John
> 
> On Mon, Jul 2, 2018 at 6:50 PM Guozhang Wang  wrote:
> 
>> Hey John,
>>
>> Obviously I'm not as diligent about email replies as you are :)
>> I'll try to reply to them separately:
>>
>>
>> -
>>
>> To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
>>
>> I'm aware of this use case, but again, the concern is that in this setting,
>> in order to let the window be queryable for 30 days, we will actually
>> process data as old as 30 days as well, while most of the late updates
>> beyond 5 minutes would be discarded anyway. Personally I think for the
>> final update scenario, the ideal situation users would want is "do not
>> process any data that is more than 5 minutes late, and of course send no
>> update records downstream later than 5 minutes either; but retain the
>> window to be queryable for 30 days". And by doing that, the final window
>> snapshot would also be aligned with the update stream as well. In other
>> words, among these three periods:
>>
>> 1) the retention length of the window / table.
>> 2) the late records acceptance for updating the window.
>> 3) the late records update to be sent downstream.
>>
>> Final-update use cases would naturally want 2) = 3), while 1) may be
>> different and larger. What we provide now is 1) = 2), which could be
>> different from and in practice larger than 3), hence not the most
>> intuitive for their needs.
>>
>>
>>
>> -
>>
>> To reply your email on "Mon, Jul 2, 2018 at 10:27 AM":
>>
>> I like option 2) better than option 1) as well, from a programming point of
>> view. But I'm wondering if option 2) would provide the above semantics, or
>> whether it still couples 1) with 2)?
>>
>>
>>
>> Guozhang
>>
>>
>>
>>
>> On Mon, Jul 2, 2018 at 1:08 PM, John Roesler  wrote:
>>
>>> In fact, to push the idea further (which IIRC is what Matthias originally
>>> proposed), if we can accept "Suppression#finalResultsOnly" in my last
>>> email, then we could also consider whether to eliminate
>>> "suppressLateEvents" entirely.
>>>
>>> We could always add it later, but you've both 

Re: [DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-07-04 Thread Anna Povzner
Hi Jason and Dong,


I’ve been thinking about your suggestions and discussion regarding
position(), seek(), and new proposed API.


Here is my thought process why we should keep position() and seek() API
unchanged.


I think we should separate {offset, leader epoch} that uniquely identifies
a message from an offset that is a position. In some cases, offsets
returned from position() could be actual consumed messages by this consumer
identified by {offset, leader epoch}. In other cases, position() returns
offset that was not actually consumed. Suppose the user calls position()
for the last offset. Suppose we return {offset, leader epoch} of the
message currently in the log. Then, the message gets truncated before
consumer’s first poll(). It does not make sense for poll() to fail in this
case, because the log truncation did not actually happen from the consumer
perspective. On the other hand, as the KIP proposes, it makes sense for the
committed() method to return {offset, leader epoch} because those offsets
represent actual consumed messages.


The same argument applies to the seek() method — we are not seeking to a
message, we are seeking to a position.


I like the proposal to add KafkaConsumer#findOffsets() API. I am assuming
something like:

Map<TopicPartition, Long> findOffsets(Map<TopicPartition, OffsetAndEpoch>
offsetsToSearch)

Similar to seek() and position(), I think findOffsets() should return an
offset without a leader epoch, because what we want is the offset we
think is closest to the last non-divergent message before the given consumed
message. Until the consumer actually fetches a message, we should not let
the consumer store the leader epoch for a message it did not consume.


So, the workflow will be:

1) The user gets LogTruncationException with {offset, leader epoch of the
previous message} (whatever we send with new FetchRecords request).

2) offset = findOffsets(tp -> {offset, leader epoch})

3) seek(offset)
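
In code, the recovery path might look roughly like this (note that
LogTruncationException, findOffsets(), and the divergentOffsets() accessor
are all proposed names, used here purely for illustration):

    try {
        final ConsumerRecords<String, String> records = consumer.poll(timeout);
        // ... process records ...
    } catch (LogTruncationException e) {
        // proposed: per-partition {offset, leader epoch} at which the logs diverged
        final Map<TopicPartition, Long> safeOffsets =
            consumer.findOffsets(e.divergentOffsets());
        safeOffsets.forEach(consumer::seek);
    }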


For the use-case where the users store committed offsets externally:

1) Such users would have to track the leader epoch together with an offset.
Otherwise, there is no way to detect later what leader epoch was associated
with the message. I think it’s reasonable to ask that from users if they
want to detect log truncation. Otherwise, they will get the current
behavior.


If the users currently get an offset to be stored using position(), I see
two possibilities. First, they save the offset returned from position()
called before poll(). In that case, it would not be correct to store
{offset, leader epoch} if we changed position() to return {offset, leader
epoch}, since the actual fetched message could be different (as in the
example I described earlier). So, it would be more correct to call
position() after poll(). However, the user already has the ConsumerRecords
at this point, from which they can extract the {offset, leader epoch} of
the last message.


So, I like the idea of adding a helper method to ConsumerRecords, as Jason
proposed, something like:

public OffsetAndEpoch lastOffsetWithLeaderEpoch(), where OffsetAndEpoch is
a data struct holding {offset, leader epoch}.


In this case, we would advise the user to follow the workflow: poll(), get
{offset, leader epoch} from ConsumerRecords#lastOffsetWithLeaderEpoch(),
save offset and leader epoch, process records.
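
In code, roughly (again, lastOffsetWithLeaderEpoch() is only proposed here,
and saveExternally()/process() stand in for user code):

    final ConsumerRecords<String, String> records = consumer.poll(timeout);
    final OffsetAndEpoch last = records.lastOffsetWithLeaderEpoch();
    saveExternally(last.offset(), last.leaderEpoch()); // persist before processing
    process(records);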


2) When the user needs to seek to the last committed offset, they call new
findOffsets(saved offset, leader epoch), and then seek(offset).


What do you think?


Thanks,

Anna


On Tue, Jul 3, 2018 at 4:06 PM Dong Lin  wrote:

> Hey Jason,
>
> Thanks much for your thoughtful explanation.
>
> Yes the solution using findOffsets(offset, leaderEpoch) also works. The
> advantage of this solution is that it adds only one API instead of two. The
> concern is that its usage seems a bit more clumsy for advanced users. More
> specifically, advanced users who store offsets externally will always need
> to call findOffsets() before calling seek(offset) during consumer
> initialization. And those advanced users will need to manually keep track
> of the leaderEpoch of the last ConsumerRecord.
>
> The other solution, which may be more user-friendly for advanced users, is
> to add two APIs, `void seek(offset, leaderEpoch)` and `(offset, epoch) =
> offsetEpochs(topicPartition)`.
>
> I kind of prefer the second solution because it is easier to use for
> advanced users. If we need to expose leaderEpoch anyway to safely identify
> a message, it may be conceptually simpler to expose it directly in
> seek(...) rather than requiring one more translation using
> findOffsets(...). But I am also OK with the first solution if other
> developers also favor that one :)
>
> Thanks,
> Dong
>
>
> On Mon, Jul 2, 2018 at 11:10 AM, Jason Gustafson 
> wrote:
>
> > Hi Dong,
> >
> > Thanks, I've been thinking about your suggestions a bit. It is
> challenging
> > to make this work given the current APIs. One of the difficulties is that
> > we don't have an API to find the leader epoch for a given offset at the

Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-07-04 Thread Matthias J. Sax
Sounds good to me.

-Matthias

On 7/4/18 10:53 AM, Guozhang Wang wrote:
> After looking through the current TopologyDescription I think I'd want to
> combine the suggestions from John and Matthias on the API proposal. The
> motivation is that we have two relatively different functionalities
> provided by the APIs today:
> 
> 1. Each interface's public functions, like
> SourceNode#topics() and GlobalStore#source(), which return non-String-typed
> data. The hope was to let users programmatically leverage those APIs for
> runtime checking.
> 2. Each interface's impl class also has an implicit toString() override
> to print the necessary information. This was designed for debugging
> purposes only during development cycles.
> 
> What we've observed so far, though, is that users leverage 2) much more
> than 1) in practice, since it is more convenient to parse strings than to
> recursively call the APIs to get non-string fields. On the other hand,
> the discussion controversy is more around 1), not 2). As for 2), people seem
> to be on the same page anyway: print the topic list if it is not
> dynamic, or print the extractor's string format otherwise. For 1) above we
> should probably have all three `Set<String> topics()`, `Pattern topicPattern()`
> and `TopicNameExtractor topicExtractor()`; while for 2) I feel comfortable
> relying on the TopicNameExtractor#toString() in the `Source#toString()` impl
> since even if users do not override this function, the default value
> `className@hashcode` still looks fine to me.
> 
> 
> Guozhang
> 
> 
> 
> On Tue, Jul 3, 2018 at 11:22 PM, Matthias J. Sax 
> wrote:
> 
>> I just double-checked the discussion thread of KIP-120, which introduced
>> `TopologyDescription`. Back then the argument was that using the
>> simplest option might be sufficient because the description is mostly
>> used for debugging.
>>
>> Not sure if this argument holds. It seems that people have since built more
>> sophisticated tools using TopologyDescription.
>>
>> Final note: if we really want to add `topicPattern()` we might want to
>> deprecate `topic()` and replace it with `Set<String> topics()`, because a
>> source node can take multiple topics, too.
>>
>> Just adding this for completeness of context to the discussion.
>>
>>
>> -Matthias
>>
>> On 7/3/18 11:09 PM, Matthias J. Sax wrote:
>>> John,
>>>
>>> I am a little bit on the fence. In retrospect, it might have been
>>> better to add `topic()` and `topicPattern()` to the source node and return
>>> a proper `Pattern` object instead of the pattern as a String.
>>>
>>> All other "payload" is just names and thus naturally String. From my
>>> point of view, `TopologyDescription` should represent the `Topology` in a
>>> "machine readable" form plus a default "human readable" form via
>>> `toString()` -- this does not imply that all return types should be
>>> String.
>>>
>>> Let me know what you think. If you agree, we could even add
>>> `Source#topicPattern()` in another KIP.
>>>
>>>
>>> -Matthias
>>>
>>> On 6/26/18 3:45 PM, John Roesler wrote:
 Sorry for the late comment,

 Looking at the other pieces of TopologyDescription, I noticed that pretty
 much all of the "payload" of these description nodes are strings. Should we
 consider returning a string from `topicNameExtractor()` instead?

 In fact, if we did that, we could consider calling `toString()` on the
 extractor instead of returning the class name. This would allow authors of
 the extractors to provide more information about the extractor than just
 its name. This might be especially useful in the case of anonymous
 implementations.

 Thanks for the KIP,
 -John

 On Mon, Jun 25, 2018 at 11:52 PM Ted Yu  wrote:

> My previous response was talking about the new method in
> InternalTopologyBuilder.
> The exception just means there is no uniform extractor for all the sinks.
>
> On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax <
>> matth...@confluent.io>
> wrote:
>
>> Ted,
>>
>> Why? Each sink can have a different TopicNameExtractor.
>>
>>
>> -Matthias
>>
>> On 6/25/18 5:19 PM, Ted Yu wrote:
>>> If there are different TopicNameExtractor classes from multiple sink nodes,
>>> the new method should throw an exception alerting the user to that scenario.
>>>
>>>
>>> On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck 
> wrote:
>>>
 Thanks for the KIP!

 Overall I'm +1 on the KIP.   I have one question.

 The KIP states that the method "topicNameExtractor()" is added to the
 InternalTopologyBuilder.java.

 It could be that I'm missing something, but how does this work if a user
 has provided different TopicNameExtractor instances to multiple sink nodes?

 Thanks,
 Bill



 On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang 
>> wrote:

RE: [VOTE] KIP-280: Enhanced log compaction

2018-07-04 Thread Luís Cabral
Hi Jun,

-:  1. I guess both new configurations will be at the topic level?

They will exist in the global configuration, at the very least.
I would like to have them on the topic level as well, but there is an 
inconsistency between the cleanup/compaction properties that exist “only 
globally” vs “globally + per topic”.
I haven’t gotten around to investigating why, and if that reason would then 
also impact the properties I’m suggesting. At first glance they seem to belong 
with the properties that are "only globally” configured, but Guozhang has 
written earlier with a suggestion of a compaction property that works for both 
(though I haven’t had the time to look into it yet, unfortunately).

-:  2. Since the log cleaner now needs to keep both the offset and another long 
(say timestamp) in the de-dup map, it reduces the number of keys that we can 
keep in the map and thus may require more rounds of cleaning. This is probably 
not a big issue, but it will be useful to document this impact in the KIP.

As a reader, I tend to prefer brief documentation on new features (there tend 
to be too many of them for me to find the willpower to read a 200-page essay 
about each one), so that inclines me to avoid writing about every micro-impact 
that may exist and simply leave it to be inferred (as you have just done).
But I also don’t feel strongly enough about it to argue either way. So, after 
reading my argument, if you still insist, I’ll happily add this there.

-: 3. With the new cleaning strategy, we want to be a bit careful with removing 
the last message in a partition (which is possible now). We need to preserve 
the offset of the last message so that we don't reuse the offset for a 
different message. One way is to simply never remove the last message. Another way 
(suggested by Jason) is to create an empty message batch.

That is a good point, but isn’t the last message always kept regardless? In all 
of my tests with this approach, never have I seen it being removed. This is not 
because I made it so while changing the code, it was simply like this before, 
which made me smile!
Given these results, I just *assumed* (oops) that these scenarios represented 
the reality, so the compaction would only affect the "tail", while the "head" 
remained untouched. Now that you say it's possible for the last message to 
actually be removed, I guess a new bullet point will have to be added to the 
KIP for this (after I've found the time to review the portion of the code that 
enacts this behaviour).

Kind Regards,
Luís Cabral

From: Jun Rao
Sent: 03 July 2018 23:58
To: dev
Subject: Re: [VOTE] KIP-280: Enhanced log compaction

Hi, Luis,

Thanks for the KIP. Overall, this seems a useful KIP. A few comments below.

1. I guess both new configurations will be at the topic level?
2. Since the log cleaner now needs to keep both the offset and another long
(say timestamp) in the de-dup map, it reduces the number of keys that we
can keep in the map and thus may require more rounds of cleaning. This is
probably not a big issue, but it will be useful to document this impact in
the KIP.
3. With the new cleaning strategy, we want to be a bit careful with
removing the last message in a partition (which is possible now). We need
to preserve the offset of the last message so that we don't reuse the
offset for a different message. One way is to simply never remove the last
message. Another way (suggested by Jason) is to create an empty message
batch.

Jun

On Sat, Jun 9, 2018 at 12:39 AM, Luís Cabral 
wrote:

> Hi all,
>
> Any takers on having a look at this KIP and voting on it?
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 280%3A+Enhanced+log+compaction
>
> Cheers,
> Luis
>



Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-07-04 Thread Guozhang Wang
After looking through the current TopologyDescription I think I'd want to
combine the suggestions from John and Matthias on the API proposal. The
motivation is that we have two relatively different functionalities
provided by the APIs today:

1. Each interface's public functions, like
SourceNode#topics() and GlobalStore#source(), which return non-String-typed
data. The hope was to let users programmatically leverage those APIs for
runtime checking.
2. Each interface's impl class also has an implicit toString() override
to print the necessary information. This was designed for debugging
purposes only during development cycles.

What we've observed so far, though, is that users leverage 2) much more
than 1) in practice, since it is more convenient to parse strings than to
recursively call the APIs to get non-string fields. On the other hand,
the discussion controversy is more around 1), not 2). As for 2), people seem
to be on the same page anyway: print the topic list if it is not
dynamic, or print the extractor's string format otherwise. For 1) above we
should probably have all three `Set<String> topics()`, `Pattern topicPattern()`
and `TopicNameExtractor topicExtractor()`; while for 2) I feel comfortable
relying on the TopicNameExtractor#toString() in the `Source#toString()` impl
since even if users do not override this function, the default value
`className@hashcode` still looks fine to me.
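
For instance, an extractor author could override toString() along these
lines (a sketch only; the Order type and the routing logic are made up):

    final TopicNameExtractor<String, Order> extractor =
        new TopicNameExtractor<String, Order>() {
            @Override
            public String extract(final String key, final Order order,
                                  final RecordContext recordContext) {
                return "orders-" + order.region(); // route by region
            }

            @Override
            public String toString() {
                return "route by order region: orders-<region>";
            }
        };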


Guozhang



On Tue, Jul 3, 2018 at 11:22 PM, Matthias J. Sax 
wrote:

> I just double-checked the discussion thread of KIP-120, which introduced
> `TopologyDescription`. Back then the argument was that using the
> simplest option might be sufficient because the description is mostly
> used for debugging.
>
> Not sure if this argument holds. It seems that people have since built more
> sophisticated tools using TopologyDescription.
>
> Final note: if we really want to add `topicPattern()` we might want to
> deprecate `topic()` and replace it with `Set<String> topics()`, because a
> source node can take multiple topics, too.
>
> Just adding this for completeness of context to the discussion.
>
>
> -Matthias
>
> On 7/3/18 11:09 PM, Matthias J. Sax wrote:
> > John,
> >
> > I am a little bit on the fence. In retrospect, it might have been
> > better to add `topic()` and `topicPattern()` to the source node and return
> > a proper `Pattern` object instead of the pattern as a String.
> >
> > All other "payload" is just names and thus naturally String. From my
> > point of view, `TopologyDescription` should represent the `Topology` in a
> > "machine readable" form plus a default "human readable" form via
> > `toString()` -- this does not imply that all return types should be
> > String.
> >
> > Let me know what you think. If you agree, we could even add
> > `Source#topicPattern()` in another KIP.
> >
> >
> > -Matthias
> >
> > On 6/26/18 3:45 PM, John Roesler wrote:
> >> Sorry for the late comment,
> >>
> >> Looking at the other pieces of TopologyDescription, I noticed that pretty
> >> much all of the "payload" of these description nodes are strings. Should we
> >> consider returning a string from `topicNameExtractor()` instead?
> >>
> >> In fact, if we did that, we could consider calling `toString()` on the
> >> extractor instead of returning the class name. This would allow authors of
> >> the extractors to provide more information about the extractor than just
> >> its name. This might be especially useful in the case of anonymous
> >> implementations.
> >>
> >> Thanks for the KIP,
> >> -John
> >>
> >> On Mon, Jun 25, 2018 at 11:52 PM Ted Yu  wrote:
> >>
> >>> My previous response was talking about the new method in
> >>> InternalTopologyBuilder.
> >>> The exception just means there is no uniform extractor for all the sinks.
> >>>
> >>> On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax <
> matth...@confluent.io>
> >>> wrote:
> >>>
>  Ted,
> 
>  Why? Each sink can have a different TopicNameExtractor.
> 
> 
>  -Matthias
> 
>  On 6/25/18 5:19 PM, Ted Yu wrote:
> > > If there are different TopicNameExtractor classes from multiple sink nodes,
> > > the new method should throw an exception alerting the user to that scenario.
> >
> >
> > On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck 
> >>> wrote:
> >
> >> Thanks for the KIP!
> >>
> >> Overall I'm +1 on the KIP.   I have one question.
> >>
> >> The KIP states that the method "topicNameExtractor()" is added to the
> >> InternalTopologyBuilder.java.
> >>
> >> It could be that I'm missing something, but how does this work if a user
> >> has provided different TopicNameExtractor instances to multiple sink nodes?
> >>
> >> Thanks,
> >> Bill
> >>
> >>
> >>
> >> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang 
>  wrote:
> >>
> >>> Yup I agree, generally speaking the `toString()` output is not
> >> recommended
> >>> to be relied on programmatically in user's code, 

RE: [VOTE] KIP-280: Enhanced log compaction

2018-07-04 Thread Luís Cabral
Hi Jason,

There’s a “Motivation” chapter in the KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction#KIP-280:Enhancedlogcompaction-Motivation

Is it still unclear after reading that?

Kind Regards,
Luís Cabral


From: Jason Gustafson
Sent: 03 July 2018 23:45
To: dev
Subject: Re: [VOTE] KIP-280: Enhanced log compaction

Sorry to join the discussion late. Can you add to the motivation the
use cases for header-based compaction? This seems not very clear to me.

Thanks,
Jason

On Mon, Jul 2, 2018 at 11:00 AM, Guozhang Wang  wrote:

> Hi Luis,
>
> I believe that compaction property is indeed overridable at per-topic
> level, as in
>
> https://github.com/apache/kafka/blob/0cacbcf30e0a90ab9fad7bc310e547
> 7cf959f1fd/clients/src/main/java/org/apache/kafka/common/
> config/TopicConfig.java#L116
>
> And also documented in https://kafka.apache.org/
> documentation/#topicconfigs
>
> Is that right?
>
>
>
> Guozhang
>
> On Mon, Jul 2, 2018 at 7:41 AM, Luís Cabral  >
> wrote:
>
> >  Hi Guozhang,
> >
> > You are right that it is not straightforward to add a dependent property
> > validation.
> > Though it is possible to re-design it to allow for this, that effort
> would
> > be better placed under its own KIP, if it really becomes useful for other
> > properties as well.
> > Given this, the fallback-to-offset behaviour currently documented will be
> > used.
> >
> > Also, while analyzing this, I noticed that the existing compaction
> > properties only exist globally, and not per topic.
> > I don't understand why this is, but it again feels like something out of
> > scope for this KIP.
> > Given this, the KIP was updated to only include the global configuration
> > properties, removing the per-topic configs.
> >
> > I'll soon update the PR according to the documentation, but I trust the
> > KIP doesn't need that to close, right?
> >
> > Cheers,
> > Luis
> >
> > On Monday, July 2, 2018, 2:00:08 PM GMT+2, Luís Cabral
> >  wrote:
> >
> >   Hi Guozhang,
> >
> > At the moment the KIP has your vote, Matthias' and Ted's.
> > Should I ask someone else to have a look?
> >
> > Cheers,
> > Luis
> >
> > On Monday, July 2, 2018, 12:16:48 PM GMT+2, Mickael Maison <
> > mickael.mai...@gmail.com> wrote:
> >
> >  +1 (non binding). Thanks for the KIP!
> >
> > On Sat, Jun 30, 2018 at 12:26 AM, Guozhang Wang 
> > wrote:
> > > Hi Luis,
> > >
> > > Regarding the minor suggestion, I agree it would be better to make it
> > > mandatory, but it might be a bit tricky because it is conditionally
> > > mandatory depending on the other config's value. Would like to see
> > your
> > > updated PR.
> > >
> > > Regarding the KIP itself, both Matthias and myself can recast our votes
> > to
> > > the updated wiki, while we still need one more committer to vote
> > according
> > > to the bylaws.
> > >
> > >
> > > Guozhang
> > >
> > > On Thu, Jun 28, 2018 at 5:38 AM, Luís Cabral
> > 
> > > wrote:
> > >
> > >>  Hi,
> > >>
> > >> Thank you all for having a look!
> > >>
> > >> The KIP is now updated with the result of these late discussions,
> > though I
> > >> did take some liberty with this part:
> > >>
> > >>
> > >>- If the "compaction.strategy.header" configuration is not set (or
> is
> > >> blank), then the compaction strategy will fallback to "offset";
> > >>
> > >>
> > >> Alternatively, we can also set it to be a mandatory property when the
> > >> strategy is "header" and fail the application to start via a config
> > >> validation (I would honestly prefer this, but its up to your taste).
> > >>
> > >> Anyway, this is now a minute detail that can be adapted during the
> final
> > >> stage of this KIP, so are you all alright with me changing the status
> to
> > >> [ACCEPTED]?
> > >>
> > >> Cheers,
> > >> Luis
> > >>
> > >>
> > >>On Thursday, June 28, 2018, 2:08:11 PM GMT+2, Ted Yu <
> > >> yuzhih...@gmail.com> wrote:
> > >>
> > >>  +1
> > >>
> > >> On Thu, Jun 28, 2018 at 4:56 AM, Luís Cabral
> >  > >> >
> > >> wrote:
> > >>
> > >> > Hi Ted,
> > >> > Can I also get your input on this?
> > >> >
> > >> > bq. +1 from my side for using `compaction.strategy` with values
> > >> > "offset","timestamp" and "header" and `compaction.strategy.header`
> > >> > -Matthias
> > >> >
> > >> > bq. +1 from me as well.
> > >> > -Guozhang
> > >> >
> > >> >
> > >> > Cheers,
> > >> > Luis
> > >> >
> > >> >
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> >
>
>
>
> --
> -- Guozhang
>



Re: Contributing: KIP permissions

2018-07-04 Thread Rajini Sivaram
Hi Stanislav,

I have added access to your account `stanislav`.

Regards,

Rajini

On Wed, Jul 4, 2018 at 5:35 PM, Stanislav Kozlovski 
wrote:

> Hey everybody,
>
> I've already implemented some JIRA tickets and would like to further my
> contributions into this great project by creating KIPs. Can I receive
> permission to do so?
> My e-mail: stanis...@confluent.io
>
> --
> Best,
> Stanislav
>


Contributing: KIP permissions

2018-07-04 Thread Stanislav Kozlovski
Hey everybody,

I've already implemented some JIRA tickets and would like to further my
contributions into this great project by creating KIPs. Can I receive
permission to do so?
My e-mail: stanis...@confluent.io

-- 
Best,
Stanislav


Re: [DISCUSS] KIP-331 Add default implementation to close() and configure() for Serializer, Deserializer and Serde

2018-07-04 Thread Chia-Ping Tsai
hi John

> Just really scraping my mind for concerns to investigate... This change
> won't break source compatibility, but will it affect binary compatibility?
> For example, if I compile my application against Kafka 2.0
> and then swap in the Kafka jar containing your change on my classpath at
> run time, will it still work?
> I think it should be binary compatible, assuming Oracle didn't do anything
> crazy, since all the method references would be the same. It might be worth
> an experiment, though.

A possible side effect of this kind of change is an IncompatibleClassChangeError 
caused by a conflict of multiple default implementations (see 
https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html#jls-13.5.6). But 
this change won't introduce such an error, since there was no default impl before.
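
To make it concrete, here is a sketch of the proposed interface shape (only
serialize() remains abstract):

    public interface Serializer<T> extends Closeable {

        default void configure(Map<String, ?> configs, boolean isKey) {
            // intentionally a no-op
        }

        byte[] serialize(String topic, T data);

        @Override
        default void close() {
            // intentionally a no-op
        }
    }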

I will start the vote on 7/6.

Cheers,
chia-ping

On 2018/07/03 18:17:28, John Roesler  wrote: 
> Hi again, Chia-Ping,
> 
> Thanks! I took a look at it, and it seems like a good change to me.
> 
> I don't know if it will generate much discussion, so I recommend just
> letting it steep for another day or two and then moving on to a vote if no
> one else chimes in.
> 
> Just really scraping my mind for concerns to investigate... This change
> won't break source compatibility, but will it affect binary compatibility?
> For example, if I compile my application against Kafka 2.0
> and then swap in the Kafka jar containing your change on my classpath at
> run time, will it still work?
> 
> I think it should be binary compatible, assuming Oracle didn't do anything
> crazy, since all the method references would be the same. It might be worth
> an experiment, though.
> 
> Thanks,
> -John
> 
> On Mon, Jul 2, 2018 at 9:29 PM Chia-Ping Tsai  wrote:
> 
> > > Can you provide a link, please?
> >
> > Pardon me. I put the page in the incorrect location.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-331+Add+default+implementation+to+close%28%29+and+configure%28%29+for+Serializer%2C+Deserializer+and+Serde
> >
> > Cheers,
> > Chia-Ping
> >
> > On 2018/07/02 19:45:19, John Roesler  wrote:
> > > Hi Chia-Ping,
> > >
> > > I couldn't find KIP-331 in the list of KIPs (
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> > > ).
> > >
> > > Can you provide a link, please?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Sun, Jul 1, 2018 at 11:33 AM Chia-Ping Tsai 
> > wrote:
> > >
> > > > hi folks,
> > > >
> > > > KIP-331 is waiting for any suggestions, feedback and reviews. The main
> > > > purpose of the KIP-331 is to add empty implementations to the close()
> > and
> > > > configure() so as to user can write less code to develop custom
> > Serialzier,
> > > > Deserializer and Serde.
> > > >
> > > > Cheers,
> > > > Chia-Ping
> > > >
> > >
> >
> 


[jira] [Created] (KAFKA-7133) DisconnectException every 5 minutes in single restore consumer thread

2018-07-04 Thread Chris Schwarzfischer (JIRA)
Chris Schwarzfischer created KAFKA-7133:
---

 Summary: DisconnectException every 5 minutes in single restore 
consumer thread
 Key: KAFKA-7133
 URL: https://issues.apache.org/jira/browse/KAFKA-7133
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 1.1.0
 Environment: Kafka Streams application in Kubernetes.
Kafka Server in Docker on machine in host mode
Reporter: Chris Schwarzfischer


One of our streams applications (and only this one) gets a 
{{org.apache.kafka.common.errors.DisconnectException}} almost exactly every 5 
minutes.
The application has two aggregations of the form
KStream -> KGroupedStream -> KTable -> KGroupedTable -> KTable.

Relevant config is in Streams:
{code:java}
this.properties.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, 
StreamsConfig.AT_LEAST_ONCE);
//...
this.properties.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2);
this.properties.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 1024 * 1024 
* 500 /* 500 MB */ );
this.properties.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1024 * 
1024 * 100 /* 100 MB */);
this.properties.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 1024 * 1024 * 50 /* 
50 MB */);
{code}

On the broker:
{noformat}
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
KAFKA_OFFSETS_RETENTION_MINUTES: 108000
KAFKA_MIN_INSYNC_REPLICAS: 2
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 2
KAFKA_TRANSACTIONAL_ID_EXPIRATION_MS: 2147483000
KAFKA_LOG_RETENTION_HOURS: 2688
KAFKA_OFFSETS_RETENTION_CHECK_INTERVAL_MS: 120
KAFKA_ZOOKEEPER_SESSION_TIMEOUT_MS: 12000
{noformat}

Logging gives us a single restore consumer thread that throws exceptions every 
5 mins:
 
{noformat}
July 4th 2018, 15:38:51.560 dockertest032018-07-04T13:38:51,559Z INFO  
: 
[testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]:
 FetchSessionHandler::handleError:440 - [Consumer 
clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer,
 groupId=] Error sending fetch request (sessionId=317141939, epoch=INITIAL) to 
node 1: org.apache.kafka.common.errors.DisconnectException.
July 4th 2018, 15:37:54.833 dockertest032018-07-04T13:37:54,832Z INFO  
: 
[testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]:
 FetchSessionHandler::handleError:440 - [Consumer 
clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer,
 groupId=] Error sending fetch request (sessionId=2064325970, epoch=INITIAL) to 
node 3: org.apache.kafka.common.errors.DisconnectException.
July 4th 2018, 15:37:54.833 dockertest032018-07-04T13:37:54,832Z INFO  
: 
[testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]:
 FetchSessionHandler::handleError:440 - [Consumer 
clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer,
 groupId=] Error sending fetch request (sessionId=1735432619, epoch=INITIAL) to 
node 2: org.apache.kafka.common.errors.DisconnectException.
July 4th 2018, 15:32:26.379 dockertest032018-07-04T13:32:26,378Z INFO  
: 
[testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]:
 FetchSessionHandler::handleError:440 - [Consumer 
clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer,
 groupId=] Error sending fetch request (sessionId=317141939, epoch=INITIAL) to 
node 1: org.apache.kafka.common.errors.DisconnectException.
July 4th 2018, 15:32:01.926 dockertest032018-07-04T13:32:01,925Z INFO  
: 
[testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]:
 FetchSessionHandler::handleError:440 - [Consumer 
clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer,
 groupId=] Error sending fetch request (sessionId=1735432619, epoch=INITIAL) to 
node 2: org.apache.kafka.common.errors.DisconnectException.
July 4th 2018, 15:32:01.926 dockertest032018-07-04T13:32:01,925Z INFO  
: 
[testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]:
 FetchSessionHandler::handleError:440 - [Consumer 
clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer,
 groupId=] Error sending fetch request (sessionId=2064325970, epoch=INITIAL) to 
node 3: org.apache.kafka.common.errors.DisconnectException.
July 4th 2018, 15:26:53.886 dockertest032018-07-04T13:26:53,886Z INFO  
: 
[testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]:
 FetchSessionHandler::handleError:440 - [Consumer 

Builder Pattern for kafka-clients in 2.x ?

2018-07-04 Thread Matthias Wessendorf
Hi,

I was filing KAFKA-7059 ([1]) and sent a PR adding a new ctor:
--
public ProducerRecord(String topic, K key, V value, Iterable<Header> headers)
---

One reasonable comment on the PR was: instead of doing constructor
overloading, why not work on a builder for the ProducerRecord class?

I think this is generally a nice idea, and I was wondering whether there is
much interest in it.

Sample:
---
final ProducerRecord<String, String> myRecord =
    ProducerRecord.<String, String>builder() // or an exposed builder class
        .topic("mytopic")
        .key("Key")
        .value("the-val")
        .headers(myHeaderIterable)
        .build();
---

While at it - instead of just offering a builder for the "ProducerRecord"
class, why not add builders for the "KafkaProducer" and "KafkaConsumer"
clazzes too.

---
final KafkaProducer<String, String> myProducer =
    KafkaProducer.<String, String>builder() // or an exposed builder clazz
        .config(myProducerConfig)
        .keySerializer(myStringSerializer)
        .valueSerializer(myStringSerializer)
        .build();

To make the above even nicer, I think the "ProducerConfig" (and, analogously,
the ConsumerConfig) configuration options could also be made accessible w/ this
fluent API - instead of the properties/map that currently dominates the
creation of consumers/producers.
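
For example, something like (purely illustrative; none of these builder
methods exist today):

    final KafkaProducer<String, String> producer =
        KafkaProducer.<String, String>builder()
            .bootstrapServers("localhost:9092") // fluent stand-in for the config map
            .acks("all")
            .keySerializer(new StringSerializer())
            .valueSerializer(new StringSerializer())
            .build();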


Any thoughts?   If there is interest, I am happy to start a KIP w/ a first
draft of the suggested API!

Cheers,
Matthias

[1] https://issues.apache.org/jira/browse/KAFKA-7059



-- 
Matthias Wessendorf

github: https://github.com/matzew
twitter: http://twitter.com/mwessendorf


Re: [VOTE] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-07-04 Thread Eno Thereska
+1 (non binding)

On Wed, Jul 4, 2018 at 1:19 PM, Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> +1 (non-binding)
>
> On Wed, Jul 4, 2018 at 5:22 PM Magnus Edenhill  wrote:
>
> > +1 (non-binding)
> >
> > 2018-07-04 13:40 GMT+02:00 Satish Duggana :
> >
> > > +1
> > >
> > > Thanks,
> > > Satish.
> > >
> > > On Wed, Jul 4, 2018 at 4:11 PM, Daniele Ascione 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Thanks,
> > > > Daniele
> > > >
> > > > Il giorno mar 3 lug 2018 alle ore 23:55 Harsha  ha
> > > > scritto:
> > > >
> > > > > +1.
> > > > >
> > > > > Thanks,
> > > > > Harsha
> > > > >
> > > > > On Tue, Jul 3rd, 2018 at 9:22 AM, Ted Yu 
> > wrote:
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > +1
> > > > > >
> > > > > > On Tue, Jul 3, 2018 at 9:05 AM, Mickael Maison <
> > > > > mickael.mai...@gmail.com >
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > +1 (non binding)
> > > > > > > Thanks for the KIP
> > > > > > >
> > > > > > > On Tue, Jul 3, 2018 at 4:59 PM, Vahid S Hashemian
> > > > > > > < vahidhashem...@us.ibm.com > wrote:
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > --Vahid
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > From: Gwen Shapira < g...@confluent.io >
> > > > > > > > To: dev < dev@kafka.apache.org >
> > > > > > > > Date: 07/03/2018 08:49 AM
> > > > > > > > Subject: Re: [VOTE] KIP-322: Return new error code for
> > > > > > > DeleteTopics
> > > > > > > > API when topic deletion disabled.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > On Tue, Jul 3, 2018 at 8:24 AM, Manikumar <
> > > > > manikumar.re...@gmail.com >
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Manikumar < manikumar.re...@gmail.com >
> > > > > > > >> Fri, Jun 29, 7:59 PM (4 days ago)
> > > > > > > >> to dev
> > > > > > > >> Hi All,
> > > > > > > >>
> > > > > > > >> I would like to start voting on KIP-322 which would return
> new
> > > > error
> > > > > > > > code
> > > > > > > >> for DeleteTopics API when topic deletion disabled.
> > > > > > > >>
> > > > > > > >>
> > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > > > > action?pageId=87295558
> > > > > > > >
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > *Gwen Shapira*
> > > > > > > > Product Manager | Confluent
> > > > > > > > 650.450.2760 | @gwenshap
> > > > > > > > Follow us: Twitter <
> > > > > > > > https://twitter.com/ConfluentInc
> > > > > > > >> | blog
> > > > > > > > <
> > > > > > > > http://www.confluent.io/blog
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > >
> > >
> >
>


Re: [VOTE] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-07-04 Thread Kamal Chandraprakash
+1 (non-binding)

On Wed, Jul 4, 2018 at 5:22 PM Magnus Edenhill  wrote:

> +1 (non-binding)
>
> 2018-07-04 13:40 GMT+02:00 Satish Duggana :
>
> > +1
> >
> > Thanks,
> > Satish.
> >
> > On Wed, Jul 4, 2018 at 4:11 PM, Daniele Ascione 
> > wrote:
> >
> > > +1
> > >
> > > Thanks,
> > > Daniele
> > >
> > > Il giorno mar 3 lug 2018 alle ore 23:55 Harsha  ha
> > > scritto:
> > >
> > > > +1.
> > > >
> > > > Thanks,
> > > > Harsha
> > > >
> > > > On Tue, Jul 3rd, 2018 at 9:22 AM, Ted Yu 
> wrote:
> > > >
> > > > >
> > > > >
> > > > >
> > > > > +1
> > > > >
> > > > > On Tue, Jul 3, 2018 at 9:05 AM, Mickael Maison <
> > > > mickael.mai...@gmail.com >
> > > > >
> > > > > wrote:
> > > > >
> > > > > > +1 (non binding)
> > > > > > Thanks for the KIP
> > > > > >
> > > > > > On Tue, Jul 3, 2018 at 4:59 PM, Vahid S Hashemian
> > > > > > < vahidhashem...@us.ibm.com > wrote:
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > --Vahid
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > From: Gwen Shapira < g...@confluent.io >
> > > > > > > To: dev < dev@kafka.apache.org >
> > > > > > > Date: 07/03/2018 08:49 AM
> > > > > > > Subject: Re: [VOTE] KIP-322: Return new error code for
> > > > > > DeleteTopics
> > > > > > > API when topic deletion disabled.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > On Tue, Jul 3, 2018 at 8:24 AM, Manikumar <
> > > > manikumar.re...@gmail.com >
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Manikumar < manikumar.re...@gmail.com >
> > > > > > >> Fri, Jun 29, 7:59 PM (4 days ago)
> > > > > > >> to dev
> > > > > > >> Hi All,
> > > > > > >>
> > > > > > >> I would like to start voting on KIP-322 which would return new
> > > error
> > > > > > > code
> > > > > > >> for DeleteTopics API when topic deletion disabled.
> > > > > > >>
> > > > > > >>
> > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > > > action?pageId=87295558
> > > > > > >
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > *Gwen Shapira*
> > > > > > > Product Manager | Confluent
> > > > > > > 650.450.2760 | @gwenshap
> > > > > > > Follow us: Twitter <
> > > > > > > https://twitter.com/ConfluentInc
> > > > > > >> | blog
> > > > > > > <
> > > > > > > http://www.confluent.io/blog
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > >
> >
>


Re: [VOTE] 2.0.0 RC1

2018-07-04 Thread Mickael Maison
+1 (non-binding)
Ran tests and quickstart using kafka_2.12-2.0.0.tgz with Java 8

Thanks

On Wed, Jul 4, 2018 at 10:24 AM, Manikumar  wrote:
> +1 (non-binding)  Verified the release notes, src, binary artifacts,  Ran
> the test suite,
> Verified quick start, Ran producer/consumer perf test, log compaction tests
>
> Thanks
>
>
> On Wed, Jul 4, 2018 at 8:33 AM Brett Rann  wrote:
>
>> +1 tentative
>> rolling upgrade of a tiny shared staging multitenancy (200+ consumer groups)
>> cluster from 1.1 to 2.0.0-rc1. cluster looks healthy. Will monitor.
>>
>> On Tue, Jul 3, 2018 at 8:18 AM Harsha  wrote:
>>
>> > +1.
>> >
>> > 1) Ran unit tests
>> > 2) 3 node cluster , tested basic operations.
>> >
>> > Thanks,
>> > Harsha
>> >
>> > On Mon, Jul 2nd, 2018 at 11:13 AM, "Vahid S Hashemian" <
>> > vahidhashem...@us.ibm.com> wrote:
>> >
>> > >
>> > >
>> > >
>> > > +1 (non-binding)
>> > >
>> > > Built from source and ran quickstart successfully on Ubuntu (with Java
>> > 8).
>> > >
>> > >
>> > > Minor: It seems this doc update PR is not included in the RC:
>> > > https://github.com/apache/kafka/pull/5280
>> > 
>> > > Guozhang seems to have wanted to cherry-pick it to 2.0.
>> > >
>> > > Thanks Rajini!
>> > > --Vahid
>> > >
>> > >
>> > >
>> > >
>> > > From: Rajini Sivaram < rajinisiva...@gmail.com >
>> > > To: dev < dev@kafka.apache.org >, Users < us...@kafka.apache.org >,
>> > > kafka-clients < kafka-clie...@googlegroups.com >
>> > > Date: 06/29/2018 11:36 AM
>> > > Subject: [VOTE] 2.0.0 RC1
>> > >
>> > >
>> > >
>> > > Hello Kafka users, developers and client-developers,
>> > >
>> > >
>> > > This is the second candidate for release of Apache Kafka 2.0.0.
>> > >
>> > >
>> > > This is a major version release of Apache Kafka. It includes 40 new
>> KIPs
>> > > and
>> > >
>> > > several critical bug fixes. Please see the 2.0.0 release plan for more
>> > > details:
>> > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
>> > >
>> > >
>> > >
>> > > A few notable highlights:
>> > >
>> > > - Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for CreateTopics
>> > > (KIP-277)
>> > > - SASL/OAUTHBEARER implementation (KIP-255)
>> > > - Improved quota communication and customization of quotas (KIP-219,
>> > > KIP-257)
>> > > - Efficient memory usage for down conversion (KIP-283)
>> > > - Fix log divergence between leader and follower during fast leader
>> > > failover (KIP-279)
>> > > - Drop support for Java 7 and remove deprecated code including old
>> > > scala
>> > > clients
>> > > - Connect REST extension plugin, support for externalizing secrets and
>> > > improved error handling (KIP-285, KIP-297, KIP-298 etc.)
>> > > - Scala API for Kafka Streams and other Streams API improvements
>> > > (KIP-270, KIP-150, KIP-245, KIP-251 etc.)
>> > >
>> > > Release notes for the 2.0.0 release:
>> > >
>> > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/RELEASE_NOTES.html
>> > 
>> > >
>> > >
>> > >
>> > >
>> > > *** Please download, test and vote by Tuesday, July 3rd, 4pm PT
>> > >
>> > >
>> > > Kafka's KEYS file containing PGP keys we use to sign the release:
>> > >
>> > > http://kafka.apache.org/KEYS
>> > 
>> > >
>> > >
>> > >
>> > > * Release artifacts to be voted upon (source and binary):
>> > >
>> > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/
>> > 
>> > >
>> > >
>> > >
>> > > * Maven artifacts to be voted upon:
>> > >
>> > > https://repository.apache.org/content/groups/staging/
>> > 
>> > >
>> > >
>> > >
>> > > * Javadoc:
>> > >
>> > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/javadoc/
>> > 
>> > >
>> > >
>> > >
>> > > * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
>> > >
>> > > https://github.com/apache/kafka/tree/2.0.0-rc1
>> > 
>> > >
>> > >
>> > >
>> > > * Documentation:
>> > >
>> > > http://kafka.apache.org/20/documentation.html
>> > 
>> > >
>> > >
>> > >
>> > > * Protocol:
>> > >
>> > > http://kafka.apache.org/20/protocol.html
>> > 
>> > >
>> > >
>> > >
>> > > * Successful Jenkins builds for the 2.0 branch:
>> > >
>> > > Unit/integration tests:
>> > > https://builds.apache.org/job/kafka-2.0-jdk8/66/
>> > 
>> > >
>> > >
>> > > System tests:
>> > > https://jenkins.confluent.io/job/system-test-kafka/job/2.0/15/
>> > 
>> > >
>> > >
>> > >
>> > >
>> > > Please test and verify the release artifacts and submit a vote for this
>> > RC
>> > >
>> > 

Re: [VOTE] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-07-04 Thread Magnus Edenhill
+1 (non-binding)

2018-07-04 13:40 GMT+02:00 Satish Duggana :

> +1
>
> Thanks,
> Satish.
>
> On Wed, Jul 4, 2018 at 4:11 PM, Daniele Ascione 
> wrote:
>
> > +1
> >
> > Thanks,
> > Daniele
> >
> > Il giorno mar 3 lug 2018 alle ore 23:55 Harsha  ha
> > scritto:
> >
> > > +1.
> > >
> > > Thanks,
> > > Harsha
> > >
> > > On Tue, Jul 3rd, 2018 at 9:22 AM, Ted Yu  wrote:
> > >
> > > >
> > > >
> > > >
> > > > +1
> > > >
> > > > On Tue, Jul 3, 2018 at 9:05 AM, Mickael Maison <
> > > mickael.mai...@gmail.com >
> > > >
> > > > wrote:
> > > >
> > > > > +1 (non binding)
> > > > > Thanks for the KIP
> > > > >
> > > > > On Tue, Jul 3, 2018 at 4:59 PM, Vahid S Hashemian
> > > > > < vahidhashem...@us.ibm.com > wrote:
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > --Vahid
> > > > > >
> > > > > >
> > > > > >
> > > > > > From: Gwen Shapira < g...@confluent.io >
> > > > > > To: dev < dev@kafka.apache.org >
> > > > > > Date: 07/03/2018 08:49 AM
> > > > > > Subject: Re: [VOTE] KIP-322: Return new error code for
> > > > > DeleteTopics
> > > > > > API when topic deletion disabled.
> > > > > >
> > > > > >
> > > > > >
> > > > > > +1
> > > > > >
> > > > > > On Tue, Jul 3, 2018 at 8:24 AM, Manikumar <
> > > manikumar.re...@gmail.com >
> > > >
> > > > > > wrote:
> > > > > >
> > > > > >> Manikumar < manikumar.re...@gmail.com >
> > > > > >> Fri, Jun 29, 7:59 PM (4 days ago)
> > > > > >> to dev
> > > > > >> Hi All,
> > > > > >>
> > > > > >> I would like to start voting on KIP-322 which would return new
> > error
> > > > > > code
> > > > > >> for DeleteTopics API when topic deletion disabled.
> > > > > >>
> > > > > >>
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > > action?pageId=87295558
> > > > > >
> > > > > >>
> > > > > >> Thanks,
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Gwen Shapira*
> > > > > > Product Manager | Confluent
> > > > > > 650.450.2760 | @gwenshap
> > > > > > Follow us: Twitter <
> > > > > > https://twitter.com/ConfluentInc
> > > > > >> | blog
> > > > > > <
> > > > > > http://www.confluent.io/blog
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> >
>


Re: [VOTE] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-07-04 Thread Satish Duggana
+1

Thanks,
Satish.

On Wed, Jul 4, 2018 at 4:11 PM, Daniele Ascione  wrote:

> +1
>
> Thanks,
> Daniele
>
> Il giorno mar 3 lug 2018 alle ore 23:55 Harsha  ha
> scritto:
>
> > +1.
> >
> > Thanks,
> > Harsha
> >
> > On Tue, Jul 3rd, 2018 at 9:22 AM, Ted Yu  wrote:
> >
> > >
> > >
> > >
> > > +1
> > >
> > > On Tue, Jul 3, 2018 at 9:05 AM, Mickael Maison <
> > mickael.mai...@gmail.com >
> > >
> > > wrote:
> > >
> > > > +1 (non binding)
> > > > Thanks for the KIP
> > > >
> > > > On Tue, Jul 3, 2018 at 4:59 PM, Vahid S Hashemian
> > > > < vahidhashem...@us.ibm.com > wrote:
> > > > > +1 (non-binding)
> > > > >
> > > > > --Vahid
> > > > >
> > > > >
> > > > >
> > > > > From: Gwen Shapira < g...@confluent.io >
> > > > > To: dev < dev@kafka.apache.org >
> > > > > Date: 07/03/2018 08:49 AM
> > > > > Subject: Re: [VOTE] KIP-322: Return new error code for
> > > > DeleteTopics
> > > > > API when topic deletion disabled.
> > > > >
> > > > >
> > > > >
> > > > > +1
> > > > >
> > > > > On Tue, Jul 3, 2018 at 8:24 AM, Manikumar <
> > manikumar.re...@gmail.com >
> > >
> > > > > wrote:
> > > > >
> > > > >> Manikumar < manikumar.re...@gmail.com >
> > > > >> Fri, Jun 29, 7:59 PM (4 days ago)
> > > > >> to dev
> > > > >> Hi All,
> > > > >>
> > > > >> I would like to start voting on KIP-322 which would return new
> error
> > > > > code
> > > > >> for DeleteTopics API when topic deletion disabled.
> > > > >>
> > > > >>
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > > action?pageId=87295558
> > > > >
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Gwen Shapira*
> > > > > Product Manager | Confluent
> > > > > 650.450.2760 | @gwenshap
> > > > > Follow us: Twitter <
> > > > > https://twitter.com/ConfluentInc
> > > > >> | blog
> > > > > <
> > > > > http://www.confluent.io/blog
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > >
>


Re: [VOTE] #2 KIP-248: Create New ConfigCommand That Uses The New AdminClient

2018-07-04 Thread Magnus Edenhill
There are some concerns about the incremental option that need to be
discussed further.

I believe everyone agrees on the need for incremental updates, allowing a
client
to only alter the configuration it provides in an atomic fashion.
The proposal adds a request-level incremental bool for this purpose, which
is good.

But I suspect this might not be enough, and thus suggest that we should
extend
the per-config-entry struct with a mode field that tells the broker how to
alter
the given configuration entry:
 - append - append value to entry (if no previous value it acts like set)
 - set - overwrite value
 - delete - delete configuration entry / revert to default.

If we don't do this, the incremental mode can only be used in "append" mode,
and a client that wishes to overwrite property A, delete B, and append to C,
will need to issue three requests:
 - 1. DescribeConfigs to get the current config.
 - 2. AlterConfigs(incremental=False) to overwrite config property A and
delete B.
 - 3. AlterConfigs(incremental=True) to append to config property C.

This makes the configuration update non-atomic, which is exactly what
incremental mode set out to fix: any configuration changes made by another
client between steps 1 and 2 would be lost at step 2.


This also needs to be exposed in the Admin API to make the user intention
clear: ConfigEntry should be extended with a new constructor that takes the
mode parameter (append, set, or delete).
The existing constructor should default to set/overwrite (as in the
existing pre-incremental case).
If an application issues an AlterConfigs() request with append or delete
ConfigEntry instances and the broker does not support KIP-248,
the request should fail locally in the client.

For reference, this is how it is exposed in the corresponding C API:
https://github.com/edenhill/librdkafka/blob/master/src/rdkafka.h#L5200
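
To make this concrete on the Java side, here is a rough sketch of what the
extended ConfigEntry could look like. The AlterOperation enum and the
three-argument constructor are hypothetical names used to illustrate the
proposal, not existing AdminClient API:

// Hypothetical sketch of the proposed per-entry alteration mode.
public enum AlterOperation {
    SET,     // overwrite the value (default; matches pre-incremental behavior)
    APPEND,  // append to the existing value; acts like SET if there is none
    DELETE   // delete the entry / revert to the default
}

// With the proposed constructor, a single atomic request covers all three
// intents from the example above:
Config changes = new Config(Arrays.asList(
    new ConfigEntry("property.a", "new-value", AlterOperation.SET),
    new ConfigEntry("property.b", null, AlterOperation.DELETE),
    new ConfigEntry("property.c", "extra-value", AlterOperation.APPEND)));

adminClient.alterConfigs(Collections.singletonMap(
    new ConfigResource(ConfigResource.Type.TOPIC, "my-topic"), changes));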



2018-07-04 11:28 GMT+02:00 Rajini Sivaram :

> Hi Viktor,
>
> Where are we with this KIP? Is it just waiting for votes? We should try and
> get this in earlier in the release cycle this time.
>
> Thank you,
>
> Rajini
>
> On Mon, May 21, 2018 at 7:44 AM, Viktor Somogyi 
> wrote:
>
> > Hi All,
> >
> > I'd like to ask the community to please vote for this as the KIP
> > freeze is tomorrow.
> >
> > Thank you very much,
> > Viktor
> >
> > On Mon, May 21, 2018 at 9:39 AM, Viktor Somogyi  >
> > wrote:
> > > Hi Colin,
> > >
> > > Sure, I'll add a note.
> > > Thanks for your vote.
> > >
> > > Viktor
> > >
> > > On Sat, May 19, 2018 at 1:01 AM, Colin McCabe 
> > wrote:
> > >> Hi Viktor,
> > >>
> > >> Thanks, this looks good.
> > >>
> > >> The boolean should default to false if not set, to ensure that
> existing
> > clients continue to work as-is, right?  Might be good to add a note
> > specifying that.
> > >>
> > >> +1 (non-binding)
> > >>
> > >> best,
> > >> Colin
> > >>
> > >> On Fri, May 18, 2018, at 08:16, Viktor Somogyi wrote:
> > >>> Updated KIP-248:
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 248+-+Create+New+ConfigCommand+That+Uses+The+New+AdminClient
> > >>>
> > >>> I'd like to ask project members, committers and contributors to vote
> > >>> as this would be a useful improvement in Kafka.
> > >>>
> > >>> Sections changed:
> > >>> - Public interfaces: added the bin/scram-credentials.sh command that
> > >>> we discussed with Colin.
> > >>> - Wire format types: removed AlterOperations. As discussed with
> Colin,
> > >>> we don't actually need this: we should behave in an incremental way
> in
> > >>> AlterQuotas. For AlterConfig we'll implement this behavior with an
> > >>> extra flag on the protocol (and incrementing the version).
> > >>> - AlterQuotas protocol: removed AlterOperations. Added some more
> > >>> description to the behavior of the protocol. Removing quotas will
> > >>> happen by sending a NaN instead of the AlterOperations. (Since IEEE
> > >>> 754 covers NaNs and it is not a valid config for any quota, I think
> it
> > >>> is a good notation.)
> > >>> - SCRAM: so it will be done by the scram-credentials command that
> uses
> > >>> direct zookeeper connection. I think further modes, like doing it
> > >>> through the broker is not necessary. The idea here is that zookeeper
> > >>> in this case acts as a credentials store. This should be decoupled
> > >>> from the broker as we manage broker credentials as well. The new
> > >>> command acts as a client to the store.
> > >>> - AlterConfigs will have an incremental_update flag as discussed. By
> > >>> default it is false to provide the backward compatible behavior. When
> > >>> it is true it will merge the configs with what's there in the node.
> > >>> Deletion in incremental mode is done by sending an empty string as
> > >>> config value.
> > >>> - Other compatibility changes: this KIP doesn't scope listing
> multiple
> > >>> users and client's quotas. As per a conversation with Rajini, it is
> > >>> not a common use case and we can add it back later if it is needed.
> If
> > >>> this functionality is needed, the old code should be still available
> > >>> through run-kafka-class. (Removed the USE_OLD_KAFKA_CONFIG_COMMAND as
> > >>> it doesn't make sense anymore.)

Re: [VOTE] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-07-04 Thread Daniele Ascione
+1

Thanks,
Daniele

Il giorno mar 3 lug 2018 alle ore 23:55 Harsha  ha scritto:

> +1.
>
> Thanks,
> Harsha
>
> On Tue, Jul 3rd, 2018 at 9:22 AM, Ted Yu  wrote:
>
> >
> >
> >
> > +1
> >
> > On Tue, Jul 3, 2018 at 9:05 AM, Mickael Maison <
> mickael.mai...@gmail.com >
> >
> > wrote:
> >
> > > +1 (non binding)
> > > Thanks for the KIP
> > >
> > > On Tue, Jul 3, 2018 at 4:59 PM, Vahid S Hashemian
> > > < vahidhashem...@us.ibm.com > wrote:
> > > > +1 (non-binding)
> > > >
> > > > --Vahid
> > > >
> > > >
> > > >
> > > > From: Gwen Shapira < g...@confluent.io >
> > > > To: dev < dev@kafka.apache.org >
> > > > Date: 07/03/2018 08:49 AM
> > > > Subject: Re: [VOTE] KIP-322: Return new error code for
> > > DeleteTopics
> > > > API when topic deletion disabled.
> > > >
> > > >
> > > >
> > > > +1
> > > >
> > > > On Tue, Jul 3, 2018 at 8:24 AM, Manikumar <
> manikumar.re...@gmail.com >
> >
> > > > wrote:
> > > >
> > > >> Manikumar < manikumar.re...@gmail.com >
> > > >> Fri, Jun 29, 7:59 PM (4 days ago)
> > > >> to dev
> > > >> Hi All,
> > > >>
> > > >> I would like to start voting on KIP-322 which would return new error
> > > > code
> > > >> for DeleteTopics API when topic deletion disabled.
> > > >>
> > > >>
> > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > action?pageId=87295558
> > > >
> > > >>
> > > >> Thanks,
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > *Gwen Shapira*
> > > > Product Manager | Confluent
> > > > 650.450.2760 | @gwenshap
> > > > Follow us: Twitter <
> > > > https://twitter.com/ConfluentInc
> > > >> | blog
> > > > <
> > > > http://www.confluent.io/blog
> > > >>
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >


Re: [VOTE] KIP-308: Support dynamic update of max.connections.per.ip/max.connections.per.ip.overrides configs

2018-07-04 Thread Manikumar
Hi All,

The vote has passed with 3 binding votes (Dong Lin, Rajini, Jason) and one
non-binding vote (Ted). Thanks everyone for the votes.

Thanks,
Manikumar
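
For anyone who wants to try it out once it ships: a minimal sketch of a
dynamic update via the AdminClient, assuming the cluster-wide default broker
resource (empty resource name) and an illustrative override value; error
handling is omitted:

// Dynamically update the connection limits on all brokers via AlterConfigs
// (the configs made dynamically updatable by KIP-308).
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

try (AdminClient admin = AdminClient.create(props)) {
    // "" targets the cluster-wide default broker config.
    ConfigResource brokers = new ConfigResource(ConfigResource.Type.BROKER, "");
    Config update = new Config(Arrays.asList(
        new ConfigEntry("max.connections.per.ip", "200"),
        new ConfigEntry("max.connections.per.ip.overrides", "10.0.0.5:500")));
    admin.alterConfigs(Collections.singletonMap(brokers, update)).all().get();
}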

On Fri, Jun 29, 2018 at 9:34 PM Jason Gustafson  wrote:

> +1
>
> Thanks Manikumar!
>
> On Fri, Jun 29, 2018 at 8:37 AM, Rajini Sivaram 
> wrote:
>
> > +1 (binding)
> >
> > Thanks for the KIP, Manikumar!
> >
> > On Fri, Jun 29, 2018 at 4:23 PM, Dong Lin  wrote:
> >
> > > +1
> > >
> > > Thanks!
> > >
> > > On Fri, 29 Jun 2018 at 7:36 AM Ted Yu  wrote:
> > >
> > > > +1
> > > >
> > > > On Fri, Jun 29, 2018 at 7:29 AM, Manikumar <
> manikumar.re...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I would like to start voting on KIP-308 which would add support for
> > > > dynamic
> > > > > update of max.connections.per.ip/max.connections.per.ip.overrides
> > > configs
> > > > >
> > > > >
> > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > action?pageId=85474993
> > > > >
> > > > > Thanks,
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] #2 KIP-248: Create New ConfigCommand That Uses The New AdminClient

2018-07-04 Thread Rajini Sivaram
Hi Viktor,

Where are we with this KIP? Is it just waiting for votes? We should try and
get this in earlier in the release cycle this time.

Thank you,

Rajini

On Mon, May 21, 2018 at 7:44 AM, Viktor Somogyi 
wrote:

> Hi All,
>
> I'd like to ask the community to please vote for this as the KIP
> freeze is tomorrow.
>
> Thank you very much,
> Viktor
>
> On Mon, May 21, 2018 at 9:39 AM, Viktor Somogyi 
> wrote:
> > Hi Colin,
> >
> > Sure, I'll add a note.
> > Thanks for your vote.
> >
> > Viktor
> >
> > On Sat, May 19, 2018 at 1:01 AM, Colin McCabe 
> wrote:
> >> Hi Viktor,
> >>
> >> Thanks, this looks good.
> >>
> >> The boolean should default to false if not set, to ensure that existing
> clients continue to work as-is, right?  Might be good to add a note
> specifying that.
> >>
> >> +1 (non-binding)
> >>
> >> best,
> >> Colin
> >>
> >> On Fri, May 18, 2018, at 08:16, Viktor Somogyi wrote:
> >>> Updated KIP-248:
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 248+-+Create+New+ConfigCommand+That+Uses+The+New+AdminClient
> >>>
> >>> I'd like to ask project members, committers and contributors to vote
> >>> as this would be a useful improvement in Kafka.
> >>>
> >>> Sections changed:
> >>> - Public interfaces: added the bin/scram-credentials.sh command that
> >>> we discussed with Colin.
> >>> - Wire format types: removed AlterOperations. As discussed with Colin,
> >>> we don't actually need this: we should behave in an incremental way in
> >>> AlterQuotas. For AlterConfig we'll implement this behavior with an
> >>> extra flag on the protocol (and incrementing the version).
> >>> - AlterQuotas protocol: removed AlterOperations. Added some more
> >>> description to the behavior of the protocol. Removing quotas will
> >>> happen by sending a NaN instead of the AlterOperations. (Since IEEE
> >>> 754 covers NaNs and it is not a valid config for any quota, I think it
> >>> is a good notation.)
> >>> - SCRAM: so it will be done by the scram-credentials command that uses
> >>> direct zookeeper connection. I think further modes, like doing it
> >>> through the broker is not necessary. The idea here is that zookeeper
> >>> in this case acts as a credentials store. This should be decoupled
> >>> from the broker as we manage broker credentials as well. The new
> >>> command acts as a client to the store.
> >>> - AlterConfigs will have an incremental_update flag as discussed. By
> >>> default it is false to provide the backward compatible behavior. When
> >>> it is true it will merge the configs with what's there in the node.
> >>> Deletion in incremental mode is done by sending an empty string as
> >>> config value.
> >>> - Other compatibility changes: this KIP doesn't scope listing multiple
> >>> users and client's quotas. As per a conversation with Rajini, it is
> >>> not a common use case and we can add it back later if it is needed. If
> >>> this functionality is needed, the old code should be still available
> >>> through run-kafka-class. (Removed the USE_OLD_KAFKA_CONFIG_COMMAND as
> >>> it doesn't make sense anymore.)
> >>>
> >>> On Fri, May 18, 2018 at 12:33 PM, Viktor Somogyi
> >>>  wrote:
> >>> > Ok, ignore my previous mail (except the last sentence), gmail didn't
> >>> > update me about your last email :/.
> >>> >
> >>> >> I think we should probably just create a flag for alterConfigs
> which marks it as incremental, like we discussed earlier, and do this as a
> compatible change that is needed for the shell command.
> >>> >
> >>> > Alright, I missed that about the sensitive configs too, so in this
> >>> > case I can agree with this. I'll update the KIP this afternoon and
> >>> > update this thread.
> >>> > Thanks again for your contribution.
> >>> >
> >>> > Viktor
> >>> >
> >>> > On Fri, May 18, 2018 at 2:34 AM, Colin McCabe 
> wrote:
> >>> >> Actually, I just realized that this won't work.  The AlterConfigs
> API is kind of broken right now.  DescribeConfigs won't return the
> "sensitive" configurations like passwords.  So doing describe + edit +
> alter will wipe out all sensitive configs. :(
> >>> >>
> >>> >> I think we should probably just create a flag for alterConfigs
> which marks it as incremental, like we discussed earlier, and do this as a
> compatible change that is needed for the shell command.
> >>> >>
> >>> >> best,
> >>> >> Colin
> >>> >>
> >>> >>
> >>> >> On Thu, May 17, 2018, at 09:32, Colin McCabe wrote:
> >>> >>> Hi Viktor,
> >>> >>>
> >>> >>> Since the KIP freeze is coming up really soon, maybe we should
> just drop
> >>> >>> the section about changes to AlterConfigs from KIP-248.  We don't
> really
> >>> >>> need it here, since ConfigCommand can use AlterConfigs as-is.
> >>> >>>
> >>> >>> We can pick up the discussion about improving AlterConfigs in a
> future KIP.
> >>> >>>
> >>> >>> cheers,
> >>> >>> Colin
> >>> >>>
> >>> >>> On Wed, May 16, 2018, at 22:06, Colin McCabe wrote:
> >>> >>> > Hi Viktor,
> >>> >>> >
> >>> >>> > The shell command isn’t that easy to 

Re: [VOTE] 2.0.0 RC1

2018-07-04 Thread Manikumar
+1 (non-binding)

Verified the release notes, src and binary artifacts; ran the test suite;
verified the quick start; ran producer/consumer perf tests and log
compaction tests.

Thanks


On Wed, Jul 4, 2018 at 8:33 AM Brett Rann  wrote:

> +1 tentative
> Rolling upgrade of a tiny shared staging multitenancy (200+ consumer groups)
> cluster from 1.1 to 2.0.0-rc1. Cluster looks healthy. Will monitor.
>
> On Tue, Jul 3, 2018 at 8:18 AM Harsha  wrote:
>
> > +1.
> >
> > 1) Ran unit tests
> > 2) 3 node cluster , tested basic operations.
> >
> > Thanks,
> > Harsha
> >
> > On Mon, Jul 2nd, 2018 at 11:13 AM, "Vahid S Hashemian" <
> > vahidhashem...@us.ibm.com> wrote:
> >
> > >
> > >
> > >
> > > +1 (non-binding)
> > >
> > > Built from source and ran quickstart successfully on Ubuntu (with Java
> > 8).
> > >
> > >
> > > Minor: It seems this doc update PR is not included in the RC:
> > > https://github.com/apache/kafka/pull/5280
> > 
> > > Guozhang seems to have wanted to cherry-pick it to 2.0.
> > >
> > > Thanks Rajini!
> > > --Vahid
> > >
> > >
> > >
> > >
> > > From: Rajini Sivaram < rajinisiva...@gmail.com >
> > > To: dev < dev@kafka.apache.org >, Users < us...@kafka.apache.org >,
> > > kafka-clients < kafka-clie...@googlegroups.com >
> > > Date: 06/29/2018 11:36 AM
> > > Subject: [VOTE] 2.0.0 RC1
> > >
> > >
> > >
> > > Hello Kafka users, developers and client-developers,
> > >
> > >
> > > This is the second candidate for release of Apache Kafka 2.0.0.
> > >
> > >
> > > This is a major version release of Apache Kafka. It includes 40 new
> KIPs
> > > and
> > >
> > > several critical bug fixes. Please see the 2.0.0 release plan for more
> > > details:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
> > <
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820>
> > >
> > >
> > >
> > > A few notable highlights:
> > >
> > > - Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for CreateTopics
> > > (KIP-277)
> > > - SASL/OAUTHBEARER implementation (KIP-255)
> > > - Improved quota communication and customization of quotas (KIP-219,
> > > KIP-257)
> > > - Efficient memory usage for down conversion (KIP-283)
> > > - Fix log divergence between leader and follower during fast leader
> > > failover (KIP-279)
> > > - Drop support for Java 7 and remove deprecated code including old
> > > scala
> > > clients
> > > - Connect REST extension plugin, support for externalizing secrets and
> > > improved error handling (KIP-285, KIP-297, KIP-298 etc.)
> > > - Scala API for Kafka Streams and other Streams API improvements
> > > (KIP-270, KIP-150, KIP-245, KIP-251 etc.)
> > >
> > > Release notes for the 2.0.0 release:
> > >
> > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/RELEASE_NOTES.html
> > 
> > >
> > >
> > >
> > >
> > > *** Please download, test and vote by Tuesday, July 3rd, 4pm PT
> > >
> > >
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > >
> > > http://kafka.apache.org/KEYS
> > 
> > >
> > >
> > >
> > > * Release artifacts to be voted upon (source and binary):
> > >
> > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/
> > 
> > >
> > >
> > >
> > > * Maven artifacts to be voted upon:
> > >
> > > https://repository.apache.org/content/groups/staging/
> > 
> > >
> > >
> > >
> > > * Javadoc:
> > >
> > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/javadoc/
> > 
> > >
> > >
> > >
> > > * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
> > >
> > > https://github.com/apache/kafka/tree/2.0.0-rc1
> > 
> > >
> > >
> > >
> > > * Documentation:
> > >
> > > http://kafka.apache.org/20/documentation.html
> > 
> > >
> > >
> > >
> > > * Protocol:
> > >
> > > http://kafka.apache.org/20/protocol.html
> > 
> > >
> > >
> > >
> > > * Successful Jenkins builds for the 2.0 branch:
> > >
> > > Unit/integration tests:
> > > https://builds.apache.org/job/kafka-2.0-jdk8/66/
> > 
> > >
> > >
> > > System tests:
> > > https://jenkins.confluent.io/job/system-test-kafka/job/2.0/15/
> > 
> > >
> > >
> > >
> > >
> > > Please test and verify the release artifacts and submit a vote for this
> > RC
> > >
> > > or report any issues so that we can fix them and roll out a new RC
> ASAP!
> > >
> > > Although this release vote requires PMC votes to pass, testing, votes,
> > and
> > >
> > > bug
> > > reports are valuable and appreciated from everyone.
> > >
> > >
> > > Thanks,
> > >
> > >
> > > Rajini
> > >
> > >
> > >
> > >

Re: [ANNOUNCE] Apache Kafka 0.10.2.2 Released

2018-07-04 Thread Rajini Sivaram
Thanks for driving the release, Matthias!

On Tue, Jul 3, 2018 at 8:48 PM, Matthias J. Sax  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA512
>
> The Apache Kafka community is pleased to announce the release for
> Apache Kafka 0.10.2.2.
>
>
> This is a bug fix release and it includes fixes and improvements from
> 29 JIRAs, including a few critical bugs.
>
>
> All of the changes in this release can be found in the release notes:
>
>
> https://dist.apache.org/repos/dist/release/kafka/0.10.2.2/RELEASE_NOTES.
> html
>
>
>
> You can download the source release from:
>
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.2/kafka-0.10.2.
> 2-src.tgz
>
>
> and binary releases from:
>
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.2/kafka_2.11-0.
> 10.2.2.tgz
> (Scala 2.11)
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.2/kafka_2.12-0.
> 10.2.2.tgz
> (Scala 2.12)
>
>
> ---
>
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
>
> ** The Producer API allows an application to publish a stream of records
> to one or more Kafka topics.
>
>
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
>
>
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming
> the input streams to output streams.
>
>
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
>
>
>
> With these APIs, Kafka can be used for two broad classes of application:
>
>
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
>
>
> ** Building real-time streaming applications that transform or react
> to the streams of data.
>
>
>
> Apache Kafka is in use at large and small companies worldwide,
> including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
> Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
> Zalando, among others.
>
>
>
> A big thank you for the following 30 contributors to this release!
>
>
> Ewen Cheslack-Postava, Matthias J. Sax, Randall Hauch, Eno Thereska,
> Damian Guy, Rajini Sivaram, Colin P. Mccabe, Kelvin Rutt, Kyle
> Winkelman, Max Zheng, Guozhang Wang, Xavier Léauté, Konstantine
> Karantasis, Paolo Patierno, Robert Yokota, Tommy Becker, Arjun Satish,
> Xi Hu, Armin Braun, Edoardo Comar, Gunnar Morling, Gwen Shapira,
> Hooman Broujerdi, Ismael Juma, Jaikiran Pai, Jarek Rudzinski, Jason
> Gustafson, Jun Rao, Manikumar Reddy, Maytee Chinavanichkit
>
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> http://kafka.apache.org/
>
>
> Thank you!
>
>
> Regards,
>  -Matthias
>
>
> -BEGIN PGP SIGNATURE-
> Comment: GPGTools - https://gpgtools.org
>
> iQIzBAEBCgAdFiEEeiQdEa0SVXokodP3DccxaWtLg18FAls70woACgkQDccxaWtL
> g1+Xzw//Rb7K691p0R2qPOixZfllEuO926C9dIjiq9XA+dZrabgC4tMgAtE07Pf4
> i6ZUeIqVLH3IDYIKji92K+JUIWpu6fdmCc999bJUOJG+zABMbO0uRYm7/4LwfMPR
> kfjxRhxu31ewvafs3crE4Kfkekw4FLFIwHiaz3i/mKC1Ty6V4oiJcwHP4PZizE2r
> rTNbt0ZHzviiBH3klOoDh+ZZFwbDZn7EHUXm8o9fiiC52o/7TIqVWwmNzZJlNGRc
> bxC3boGXAXjgBwm7iqxBgkPku/kTTWpxj6jkHbS2NQfCZE5V7INQC2HlnynPHc7j
> m2F2plSvKOm4gi54q6SSiXkjcXA2dBJDe3y/jNpckXSQ31sNXsTi6vbRMkMPj8dJ
> j0SKhFoSCDpWejgLkUMg6hZgepgz7G1uYHA9K8SfCyCooqxsEY4I3dClNOySORly
> 4brdjZWpclhCn+zpekqBFZ9Sn3ipG4MOvH64chPEvYnysHkRH26FqXNPOK185V0Z
> Czl0dL0aEoJWZ3LxLTSoFkncKgqrcE00q4VknK3zGW65tlQ1DqTXtK3Ta1q8vX98
> PCCR4Tjhu0RcBAV2L4o43itKzIaLCp9lElA1341oQUB+tiPRA0GvWGg36EomehzF
> 1qdbjBug91CLyefZVVeEfTiqmNAYNyR1Zmx99rryx+Fp+5Ek9YI=
> =yjnJ
> -END PGP SIGNATURE-
>


Re: [ANNOUNCE] Apache Kafka 0.11.0.3 Released

2018-07-04 Thread Rajini Sivaram
Thanks for driving the release, Matthias!

On Tue, Jul 3, 2018 at 10:08 PM, Jason Gustafson  wrote:

> Awesome. Thanks Matthias!
>
> On Tue, Jul 3, 2018 at 12:44 PM, Yishun Guan  wrote:
>
> > Nice! Thanks~
> >
> > On Tue, Jul 3, 2018, 12:16 PM Ismael Juma  wrote:
> >
> > > Thanks Matthias!
> > >
> > > On Tue, 3 Jul 2018, 11:31 Matthias J. Sax,  wrote:
> > >
> > > > -BEGIN PGP SIGNED MESSAGE-
> > > > Hash: SHA512
> > > >
> > > > The Apache Kafka community is pleased to announce the release for
> > > > Apache Kafka 0.11.0.3.
> > > >
> > > >
> > > > This is a bug fix release and it includes fixes and improvements from
> > > > 27 JIRAs, including a few critical bugs.
> > > >
> > > >
> > > > All of the changes in this release can be found in the release notes:
> > > >
> > > >
> > > > https://dist.apache.org/repos/dist/release/kafka/0.11.0.3/
> > RELEASE_NOTES.
> > > > html
> > > > <
> > > https://dist.apache.org/repos/dist/release/kafka/0.11.0.3/
> > RELEASE_NOTES.html
> > > >
> > > >
> > > >
> > > >
> > > > You can download the source release from:
> > > >
> > > >
> > > > https://www.apache.org/dyn/closer.cgi?path=/kafka/0.11.0.
> > 3/kafka-0.11.0.
> > > > 3-src.tgz
> > > > <
> > > https://www.apache.org/dyn/closer.cgi?path=/kafka/0.11.0.
> > 3/kafka-0.11.0.3-src.tgz
> > > >
> > > >
> > > >
> > > > and binary releases from:
> > > >
> > > >
> > > > https://www.apache.org/dyn/closer.cgi?path=/kafka/0.11.0.
> > 3/kafka_2.11-0.
> > > > 11.0.3.tgz
> > > > (Scala 2.11)
> > > >
> > > > https://www.apache.org/dyn/closer.cgi?path=/kafka/0.11.0.
> > 3/kafka_2.12-0.
> > > > 11.0.3.tgz
> > > > (Scala 2.12)
> > > >
> > > >
> > > > ---
> > > >
> > > >
> > > > Apache Kafka is a distributed streaming platform with four core APIs:
> > > >
> > > >
> > > > ** The Producer API allows an application to publish a stream of records
> > > > to one or more Kafka topics.
> > > >
> > > >
> > > > ** The Consumer API allows an application to subscribe to one or more
> > > > topics and process the stream of records produced to them.
> > > >
> > > >
> > > > ** The Streams API allows an application to act as a stream
> processor,
> > > > consuming an input stream from one or more topics and producing an
> > > > output stream to one or more output topics, effectively transforming
> > > > the input streams to output streams.
> > > >
> > > >
> > > > ** The Connector API allows building and running reusable producers
> or
> > > > consumers that connect Kafka topics to existing applications or data
> > > > systems. For example, a connector to a relational database might
> > > > capture every change to a table.
> > > >
> > > >
> > > >
> > > > With these APIs, Kafka can be used for two broad classes of
> > application:
> > > >
> > > >
> > > > ** Building real-time streaming data pipelines that reliably get data
> > > > between systems or applications.
> > > >
> > > >
> > > > ** Building real-time streaming applications that transform or react
> > > > to the streams of data.
> > > >
> > > >
> > > >
> > > > Apache Kafka is in use at large and small companies worldwide,
> > > > including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
> > > > Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
> > > > Zalando, among others.
> > > >
> > > >
> > > >
> > > > A big thank you for the following 26 contributors to this release!
> > > >
> > > >
> > > > Matthias J. Sax, Ewen Cheslack-Postava, Konstantine Karantasis,
> > > > Guozhang Wang, Rajini Sivaram, Randall Hauch, tedyu, Jagadesh
> > > > Adireddi, Jarek Rudzinski, Jason Gustafson, Jeremy Custenborder, Anna
> > > > Povzner, Joel Hamill, John Roesler, Max Zheng, Mickael Maison, Robert
> > > > Yokota, Yaswanth Kumar, parafiend, Jiangjie (Becket) Qin, Arjun
> > > > Satish, Bill Bejeck, Damian Guy, Gitomain, Gunnar Morling, Ismael
> Juma
> > > >
> > > >
> > > > We welcome your help and feedback. For more information on how to
> > > > report problems, and to get involved, visit the project website at
> > > > http://kafka.apache.org/
> > > >
> > > >
> > > > Thank you!
> > > >
> > > >
> > > > Regards,
> > > >  -Matthias
> > > > -BEGIN PGP SIGNATURE-
> > > > Comment: GPGTools - https://gpgtools.org
> > > >
> > > > iQIzBAEBCgAdFiEEeiQdEa0SVXokodP3DccxaWtLg18FAls7wQAACgkQDccxaWtL
> > > > g1+b/g/+LjM5gh8u2wCVz7dhOstwvtaajRG7cG1QhZH3H9QquVs19aKiE9ZcvEcK
> > > > eJkX0S7rWopXs2qQxy5fVCTWGw5yO4eFNWuWxSIffuxH8/3K2sKahPi/4IDgd5Tj
> > > > ksmsxyXxWtGv/vEosJr+ZD7s1urPpkQ7DG6CT9wG9wj2ASq7sur/Eg7jfAnuIoTQ
> > > > UvQenKXU0T+D+BZKpUiZs5e6VGya6bUzboAbPGiwsMH4/xj2IlOEjVAyf3ppnuiu
> > > > /AW2LLqkFnbDB0IbveOu2+73CvVlahkaZ6nhPjkVpdpFw/SCAZHdkGdCafo8DKP8
> > > > DKcmzta/QCEJ1uQUe7Rh8ndzYLzTaU0rqilA2WZUZvTx0gkviDGvQv/S97XP8lRJ
> > > > SLn2xk166dxw0zpuIfzo0rr3S2Mz5PmAhrxiVxDG9ihaqBnABePspjp+cTXLhGhX
> > > > 

[jira] [Created] (KAFKA-7132) Consider adding multithreaded form of recovery

2018-07-04 Thread Richard Yu (JIRA)
Richard Yu created KAFKA-7132:
-

 Summary: Consider adding multithreaded form of recovery
 Key: KAFKA-7132
 URL: https://issues.apache.org/jira/browse/KAFKA-7132
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Reporter: Richard Yu


Currently, when a consumer falls out of a consumer group, it will restart
processing from the last checkpointed offset. However, this design could result
in a lag which some users cannot afford. For example, let's say a consumer
crashed at offset 100, with the last checkpointed offset being 70. When it
recovers at a later offset (say, 120), it will be behind by an offset range of
50 (120 - 70), because it restarted at 70 and was forced to reprocess old data.
To prevent this, one option would be to let the current consumer start
processing not from the last checkpointed offset (70 in the example) but from
offset 120, where it recovers. Meanwhile, a new KafkaConsumer would be
instantiated to read from offset 70 concurrently with the old process, and
would be terminated once it reaches 120. In this manner, a considerable amount
of lag can be avoided, particularly since the old consumer can proceed as if
nothing had happened.
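
A rough sketch of the idea with the plain consumer API; the topic, offsets,
consumerProps/backfillProps, and process() below are placeholders taken from
the example above, not part of any proposed API:

// Primary consumer resumes at the recovery offset, while a short-lived
// "backfill" consumer replays the gap [checkpoint, recovery) in parallel.
TopicPartition tp = new TopicPartition("events", 0);
long checkpointedOffset = 70L;  // last committed offset
long recoveryOffset = 120L;     // offset at which the consumer came back

KafkaConsumer<String, String> primary = new KafkaConsumer<>(consumerProps);
primary.assign(Collections.singletonList(tp));
primary.seek(tp, recoveryOffset);  // skip straight to live data

new Thread(() -> {
    KafkaConsumer<String, String> backfill = new KafkaConsumer<>(backfillProps);
    backfill.assign(Collections.singletonList(tp));
    backfill.seek(tp, checkpointedOffset);
    long position = checkpointedOffset;
    while (position < recoveryOffset) {
        for (ConsumerRecord<String, String> record : backfill.poll(Duration.ofMillis(100))) {
            if (record.offset() >= recoveryOffset) {
                position = recoveryOffset;  // gap fully replayed
                break;
            }
            process(record);  // user-supplied processing, same as the primary path
            position = record.offset() + 1;
        }
    }
    backfill.close();  // terminate once caught up
}).start();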



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-07-04 Thread Matthias J. Sax
I just double-checked the discussion thread of KIP-120, which introduced
`TopologyDescription`. Back then the argument was that using the
simplest option might be sufficient because the description is mostly
used for debugging.

Not sure if this argument holds. It seems that people have already built
more sophisticated tools using TopologyDescription.

Final note: if we really want to add `topicPattern()`, we might want to
deprecate `topic()` and replace it with `Set<String> topics()`, because a
source node can take multiple topics, too.

Just adding this for completeness of context to the discussion.
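
For illustration, the deprecation sketched above could look roughly like this
on TopologyDescription.Source (a sketch of the idea only, not an agreed API;
imports for java.util.Set and java.util.regex.Pattern assumed):

// Hypothetical sketch of the suggested Source changes.
public interface Source extends TopologyDescription.Node {

    // Existing accessor; would be deprecated because a source node
    // can subscribe to multiple topics.
    @Deprecated
    String topic();

    // Proposed replacements: the statically named topics, or the
    // subscription pattern as a proper Pattern rather than a String.
    Set<String> topics();

    Pattern topicPattern();  // null if the source is not pattern-subscribed
}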


-Matthias

On 7/3/18 11:09 PM, Matthias J. Sax wrote:
> John,
> 
> I am a little bit on the fence. In retrospect, it might have been
> better to add `topic()` and `topicPattern()` to source node and return a
> proper `Pattern` object instead of the pattern as a String.
> 
> All other "payload" is just names and thus String naturally. From my
> point of view `TopologyDescription` should represent the `Topology` in a
> "machine readable" form plus a default "human readable" from via
> `toString()` -- this does not imply that all return types should be String.
> 
> Let me know what you think. If you agree, we could even add
> `Source#topicPattern()` in another KIP.
> 
> 
> -Matthias
> 
> On 6/26/18 3:45 PM, John Roesler wrote:
>> Sorry for the late comment,
>>
>> Looking at the other pieces of TopologyDescription, I noticed that pretty
>> much all of the "payload" of these description nodes are strings. Should we
>> consider returning a string from `topicNameExtractor()` instead?
>>
>> In fact, if we did that, we could consider calling `toString()` on the
>> extractor instead of returning the class name. This would allow authors of
>> the extractors to provide more information about the extractor than just
>> its name. This might be especially useful in the case of anonymous
>> implementations.
>>
>> Thanks for the KIP,
>> -John
>>
>> On Mon, Jun 25, 2018 at 11:52 PM Ted Yu  wrote:
>>
>>> My previous response was talking about the new method in
>>> InternalTopologyBuilder.
>>> The exception just means there is no uniform extractor for all the sinks.
>>>
>>> On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax 
>>> wrote:
>>>
 Ted,

 Why? Each sink can have a different TopicNameExtractor.


 -Matthias

 On 6/25/18 5:19 PM, Ted Yu wrote:
> If there are different TopicNameExtractor classes from multiple sink
 nodes,
> the new method should throw exception alerting user of such scenario.
>
>
> On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck 
>>> wrote:
>
>> Thanks for the KIP!
>>
>> Overall I'm +1 on the KIP.   I have one question.
>>
>> The KIP states that the method "topicNameExtractor()" is added to the
>> InternalTopologyBuilder.java.
>>
>> It could be that I'm missing something, but how does this work if a user
>> has provided different TopicNameExtractor instances to multiple sink
>> nodes?
>>
>> Thanks,
>> Bill
>>
>>
>>
>> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang 
 wrote:
>>
>>> Yup I agree, generally speaking the `toString()` output is not recommended
>>> to be relied on programmatically in user's code, but we've observed
>>> convenience-beats-any-other-reasons again and again in development,
>>> unfortunately. I think we should still not claim that it is part of the
>>> public APIs that would not be changed anyhow in the future, but just
>>> mentioning it in the wiki for people to be aware is fine.
>>>
>>>
>>> Guozhang
>>>
>>> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax <
 matth...@confluent.io>
>>> wrote:
>>>
 Thanks for the KIP!

 I don't have any further comments.

 For Guozhang's comment: if we mention anything about `toString()`, we
 should make explicit that `toString()` output is still not public
 contract and users should not rely on the output.

 Furthermore, for the actual output, I would replace "topic:" with
 "extractor class:" to make the difference obvious.

 I am just hoping that people actually do not rely on `toString()`, which
 defeats the purpose of the `TopologyDescription` class that was
 introduced to avoid the dependency... (Just a side comment, not really
 related to this KIP proposal itself.)


 If there are no further comments in the next days, feel free to
>>> start
 the VOTE and open a PR.




 -Matthias

 On 6/22/18 6:04 PM, Guozhang Wang wrote:
> Thanks for writing the KIP!
>
> I'm +1 on the proposed changes overall. One minor suggestion: we should
> also mention that the `Sink#toString` will also be updated, in a way
>>> 

Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-07-04 Thread Matthias J. Sax
John,

I am a little bit on the fence. In retrospect, it might have been
better to add `topic()` and `topicPattern()` to source node and return a
proper `Pattern` object instead of the pattern as a String.

All other "payload" is just names and thus String naturally. From my
point of view `TopologyDescription` should represent the `Topology` in a
"machine readable" form plus a default "human readable" from via
`toString()` -- this does not imply that all return types should be String.

Let me know what you think. If you agree, we could even add
`Source#topicPattern()` in another KIP.


-Matthias

On 6/26/18 3:45 PM, John Roesler wrote:
> Sorry for the late comment,
> 
> Looking at the other pieces of TopologyDescription, I noticed that pretty
> much all of the "payload" of these description nodes are strings. Should we
> consider returning a string from `topicNameExtractor()` instead?
> 
> In fact, if we did that, we could consider calling `toString()` on the
> extractor instead of returning the class name. This would allow authors of
> the extractors to provide more information about the extractor than just
> its name. This might be especially useful in the case of anonymous
> implementations.
> 
> Thanks for the KIP,
> -John
> 
> On Mon, Jun 25, 2018 at 11:52 PM Ted Yu  wrote:
> 
>> My previous response was talking about the new method in
>> InternalTopologyBuilder.
>> The exception just means there is no uniform extractor for all the sinks.
>>
>> On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax 
>> wrote:
>>
>>> Ted,
>>>
>>> Why? Each sink can have a different TopicNameExtractor.
>>>
>>>
>>> -Matthias
>>>
>>> On 6/25/18 5:19 PM, Ted Yu wrote:
 If there are different TopicNameExtractor classes from multiple sink
>>> nodes,
 the new method should throw exception alerting user of such scenario.


 On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck 
>> wrote:

> Thanks for the KIP!
>
> Overall I'm +1 on the KIP.   I have one question.
>
> The KIP states that the method "topicNameExtractor()" is added to the
> InternalTopologyBuilder.java.
>
> It could be that I'm missing something, but how does this work if a user
> has provided different TopicNameExtractor instances to multiple sink
> nodes?
>
> Thanks,
> Bill
>
>
>
> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang 
>>> wrote:
>
>> Yup I agree, generally speaking the `toString()` output is not recommended
>> to be relied on programmatically in user's code, but we've observed
>> convenience-beats-any-other-reasons again and again in development,
>> unfortunately. I think we should still not claim that it is part of the
>> public APIs that would not be changed anyhow in the future, but just
>> mentioning it in the wiki for people to be aware is fine.
>>
>>
>> Guozhang
>>
>> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax <
>>> matth...@confluent.io>
>> wrote:
>>
>>> Thanks for the KIP!
>>>
>>> I don't have any further comments.
>>>
>>> For Guozhang's comment: if we mention anything about `toString()`, we
>>> should make explicit that `toString()` output is still not public
>>> contract and users should not rely on the output.
>>>
>>> Furthermore, for the actual output, I would replace "topic:" with
>>> "extractor class:" to make the difference obvious.
>>>
>>> I am just hoping that people actually do not rely on `toString()`, which
>>> defeats the purpose of the `TopologyDescription` class that was
>>> introduced to avoid the dependency... (Just a side comment, not really
>>> related to this KIP proposal itself.)
>>>
>>>
>>> If there are no further comments in the next days, feel free to
>> start
>>> the VOTE and open a PR.
>>>
>>>
>>>
>>>
>>> -Matthias
>>>
>>> On 6/22/18 6:04 PM, Guozhang Wang wrote:
 Thanks for writing the KIP!

 I'm +1 on the proposed changes overall. One minor suggestion: we should
 also mention that the `Sink#toString` will also be updated, in a way that
 if `topic()` returns null, use the other call, etc. This is because
 although we do not explicitly state the following logic as public
 protocols:

 ```

 "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
 nodeNames(predecessors);


 ```

 There are already some users that rely on `topology.describe().toString()`
 to have runtime checks on the returned string values, so changing this
 means that their apps will break and hence they need to make code changes.

 Guozhang

 On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
>> nishanth...@gmail.com

 wrote:

> Hello