Re: [KAFKA-557] Add emit on change support for Kafka Streams

2020-02-05 Thread Ted Yu
Thanks for the comments, Matthias.

w.r.t. the requirement of an `equals()` implementation: every type used as a
generic type parameter has an equals() method (inherited from Object). We can
use the following code to determine whether it is the default provided by the
JVM or overridden by the user.

boolean customEquals = false;
try {
    Class<?> cls = value.getClass().getMethod("equals",
        Object.class).getDeclaringClass();
    if (!Object.class.equals(cls)) {
        customEquals = true;
    }
} catch (NoSuchMethodException nsme) {
    // equals(Object) is always declared on Object, so this cannot happen
}

The next question is: what if the user doesn't provide an equals() method?
Would we automatically fall back to emit-on-update?
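For illustration, the check could be packaged as a helper along these lines (the helper and its name are hypothetical, not existing Kafka code):

```java
// Hypothetical helper, not part of Kafka; sketches the detection described above.
final class EqualsSupport {

    private EqualsSupport() { }

    // Returns true if the class (or a superclass other than Object)
    // overrides Object#equals.
    static boolean hasCustomEquals(final Class<?> cls) {
        try {
            return cls.getMethod("equals", Object.class)
                      .getDeclaringClass() != Object.class;
        } catch (final NoSuchMethodException nsme) {
            // equals(Object) is always declared on Object, so this cannot happen
            throw new AssertionError(nsme);
        }
    }
}
```

E.g. hasCustomEquals(String.class) is true, while it is false for a value class that never overrides equals -- which is exactly where the emit-on-update fallback question arises.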

Cheers

On Tue, Feb 4, 2020 at 1:37 PM Matthias J. Sax  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA512
>
> First a high level comment:
>
> Overall, I would like to take one step back, and make sure we are
> discussing on the same level. Originally, I understood this KIP as a
> proposed change of _semantics_; however, given the latest discussion
> it seems it's actually not -- it's more an _optimization_ proposal.
> Hence, we only need to make sure that this optimization does not break
> existing semantics. Is this the right way to think about it?
>
> If yes, then it might actually be ok to have different behavior
> depending on whether there is a materialized KTable or not. So far, we never
> defined a public contract about our emit strategy, and it seems this
> KIP does not define one either.
>
> Hence, I no longer have as strong an opinion about sending oldValues,
> for example. I guess the question is really: what can we
> implement in a reasonable way?
>
>
>
> Other comments:
>
>
> @Richard:
>
> Can you please add the KIP to the KIP overview table: It's missing
> (https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals).
>
>
> @Bruno:
>
> You mentioned caching. I think it's irrelevant (orthogonal) and we can
> discuss this KIP without considering it.
>
>
> @John:
>
> > Even in the source table, we forward the updated record with the
> > higher of the two timestamps. So the example is more like:
>
> That is not correct. Currently, we forward with the smaller
> out-of-order timestamp (changing the timestamp would corrupt the data
> -- we don't know, because we don't check, whether the value is the same
> or a different one; hence, we must emit the out-of-order record as-is).
>
> If we start to do emit-on-change, we also need to emit a new record if
> the timestamp changes due to out-of-order data; hence, we would still
> need to emit  because that gives us correct semantics: assume
> you have a filter() and afterward use the filtered KTable in a
> stream-table join -- the lower T1 timestamp must be propagated to the
> filtered KTable to ensure that the stream-table join computes the
> correct result.
>
>
>
> Your point about requiring an `equals()` implementation is actually a
> quite interesting one and boils down to my statement from above about
> "what can we actually implement". What I don't understand is:
>
> > This way, we still don't have to rely on the existence of an
> > equals() method, but if it is there, we can benefit from it.
>
> Your bullet point (2) says it uses `equals()` -- hence, it seems we
> actually do rely on it? Also, how can we detect if there is an
> `equals()` method to do the comparison? Would we fail if we have
> neither `equals()` nor corresponding serializers to do the comparison?
>
>
>
> > Wow, really good catch! Yes, we absolutely need metrics and logs if
> > we're going to drop any records. And, yes, we should propose
> > metrics and logs that are similar to the existing ones when we drop
> > records for other reasons.
>
> I am not sure about this point. In fact, we already have some no-ops
> in Kafka Streams in our join operators and don't report any of those
> either. Emit-on-change is operator semantics, and I don't see why we
> would need to have a metric for it. It seems to be quite different
> from dropping late or malformed records.
>
>
> -Matthias
>
>
>
> On 2/4/20 7:13 AM, Thomas Becker wrote:
> > Thanks John for your thoughtful reply. Some comments inline.
> >
> >
> > On Mon, 2020-02-03 at 11:51 -0600, John Roesler wrote:
> >>
> >>
> >> Hi Tommy,
> >>
> >> Thanks for the context. I can see the attraction of considering
> >> these use cases together.
> >>
> >> To answer your question, if a part of the record is not relevant
> >> to downstream consumers, I was thinking you could just use a
> >> mapValue to remove it.
> >>
> >> E.g., suppose you wanted to do a join between two tables.
> >>
> >> employeeInfo.join( employeePayroll, (info, payroll) -> new
> >> Result(info.name(), payroll.salary()) )
> >>
> >> We only care about one attribute from the Info table (name), and
> >> one from the Payroll table (salary), and 

Re: [KAFKA-557] Add emit on change support for Kafka Streams

2020-02-02 Thread Ted Yu
w.r.t. the new metric: there is already droppedRecordsSensor, which logs:

"Skipping record due to null key. topic=[{}] partition=[{}] offset=[{}]",

It seems we should introduce another metric which records the skipped
(duplicate) values.

This way, it is easier to observe the effect when this feature is in production.
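As a sketch of the idea (plain counters only; the real implementation would go through the Streams metrics/Sensor machinery, and the class name here is made up):

```java
import java.util.concurrent.atomic.AtomicLong;

// Made-up illustration of the proposed metric: count dropped no-op updates
// alongside total updates so the effect of the feature is observable.
class NoOpUpdateStats {
    private final AtomicLong totalUpdates = new AtomicLong();
    private final AtomicLong droppedNoOps = new AtomicLong();

    void onUpdate(final boolean droppedAsNoOp) {
        totalUpdates.incrementAndGet();
        if (droppedAsNoOp) {
            droppedNoOps.incrementAndGet();
        }
    }

    // Fraction of updates that were dropped as duplicates.
    double droppedRatio() {
        final long total = totalUpdates.get();
        return total == 0 ? 0.0 : (double) droppedNoOps.get() / total;
    }
}
```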

Cheers


>
> -- Forwarded message -
> From: Richard Yu 
> Date: Sun, Feb 2, 2020 at 10:21 AM
> Subject: Re: [KAFKA-557] Add emit on change support for Kafka Streams
> To: 
>
>
> Hi Bruno,
>
> Thanks for the reply!
>
> I've included some basic description on the reporting strategies in the
> KIP (I might include more information on that later). I've also worked to
> add some more details on behavior changes as well as rejected alternatives.
> Hope it will help facilitate the process. :)
>
> I just want to add something on a relevant topic: we need metrics. I think
> this should also be included with this change for a number of reasons. Some
> users already know that their Streams application is
> experiencing a lot of no-op traffic, but that doesn't mean other users are
> aware of the same problem. Also, if we are dropping no-ops, then we might
> as well record exactly how many we have dropped out of how many total
> operations we've done. Therefore, I argue that we also include some metric
> which records this data and reports it to the user.
>
> Beyond that, let me know if we might need to address anything else. :)
>
> Cheers,
> Richard
>
>
>
> On Sun, Feb 2, 2020 at 3:57 AM Bruno Cadonna  wrote:
>
>> Hi,
>>
>> Richard, thank you for the updated KIP.
>>
>> Regarding your question about the survey, IMO the survey should
>> contain a brief description of the emit (report) strategy of each
>> system and a list of pros and cons. I personally would be interested
>> in what emit strategy Flink uses.
>>
>> I have a few comments about the KIP and its documentation:
>>
>> KIP-specific:
>>
>> 1. I agree with Matthias that we should also include aggregations
>> where neither the value nor the timestamp change.
>>
>> 2. Regarding Matthias' concerns about the dependency of the result of
>> a stateless operation on the materialization, I have two
>> questions/observations:
>> a) Is the result not already dependent on the materialization, since in
>> case of materialized results the cache would not emit all records for
>> the same key downstream?
>> b) Emitting more records from a non-materialized operation should not
>> affect the semantics because we are emitting changelog records. The
>> next time these changelog records are materialized the result should
>> be correct. Right? However, I see the issue when a KTable is
>> transformed to a KStream with `toStream()`. The stream would then
>> differ depending on the materialization. But that seems to me an issue
>> that is not specific to the emit strategy and that we already have
>> when we use a cache, don't we? Is it even an issue?
>>
>> 3. With out-of-order records we would emit more records. Let's assume
>> the following records
>> K, V, T3
>> K, V, T1
>> K, V, T2
>> with T1 < T2 < T3
>>
>> A KTable that reads these records in this order would emit (assuming no
>> cache):
>> K, V, T3
>> K, V, T1
>>
>> The record at T3 is emitted because it is the first.
>> The record at T1 is emitted because T1 < T3.
>> The record at T2 is not emitted because T2 >= T1
>> Correct?
>>
>> Richard, it would be good to add a section about out-of-order records
>> to the KIP.
>>
>>
>> Documentation-specific:
>>
>> 1. I agree with John on his feedback on the KIP document. It is really
>> important to clearly state what this KIP will improve and what not,
>> otherwise it becomes hard to vote on the KIP and to decide whether the
>> KIP is fully implemented or not.
>>
>> 2. Could you please already state in the "Motivation" section of the
>> KIP where you list the current emit strategies that the emit strategy
>> only applies to operations that involve a KTable? Probably for most it
>> will be clear what you mean, but IMO KIPs should be easily
>> approachable and it doesn't cost much to add this information.
>>
>> 3. Could you please list the rejected suppress extension in the
>> "Rejected Alternatives" section?
>>
>> 4. In the discussion about materializing results of stateless
>> operations, could you please add that those stateless operations are
>> on KTables? IMO adding this information makes the KIP more easily
>> approachable for people who are not that familiar with the matter.
>> Best,
>> Bruno
>>
>> On Sat, Feb 1, 2020 at 11:33 PM Richard Yu 
>> wrote:
>> >
>> > Hi all,
>> >
>> > You all have good points!
>> >
>> > I took a look, and I thought it over. After some thinking, it appears
>> the
>> > main point of contention is whether or not we can support emit on change
>> > for stateless operations. I agree with John in that we probably should
>> > restrict ourselves to materialized KTables:
>> >
>> >1. Loading any prior results would incur performance 

Re: [VOTE] KIP-526: Reduce Producer Metadata Lookups for Large Number of Topics

2020-01-28 Thread Ted Yu
+1

On Tue, Jan 28, 2020 at 10:52 AM Rajini Sivaram 
wrote:

> +1 (binding)
>
> Thanks for the KIP, Brian!
>
> Regards,
>
> Rajini
>
> On Thu, Jan 23, 2020 at 7:34 PM Jason Gustafson 
> wrote:
>
> > Sounds good. +1 from me.
> >
> > On Thu, Jan 23, 2020 at 9:00 AM Brian Byrne  wrote:
> >
> > > Thanks Jason,
> > >
> > > I'm in favor of the latter: metadata.max.idle.ms. I agree that
> > describing
> > > it as a "period" is inaccurate. With metadata.max.idle.ms, it also
> > aligns
> > > with metadata.max.age.ms for determining refresh period (which is an
> > > actual
> > > period).
> > >
> > > I've updated the docs.
> > >
> > > Thanks,
> > > Brian
> > >
> > > On Wed, Jan 22, 2020 at 6:19 PM Jason Gustafson 
> > > wrote:
> > >
> > > > Thanks for the proposal. Looks good overall. I wanted to suggest a
> > > possible
> > > > name change. I was considering something like `
> > > idle.metadata.expiration.ms
> > > > `
> > > > or maybe `metadata.max.idle.ms`. Thoughts?
> > > >
> > > > -Jason
> > > >
> > > >
> > > > On Tue, Jan 21, 2020 at 11:38 AM Guozhang Wang 
> > > wrote:
> > > >
> > > > > Got it.
> > > > >
> > > > > I was proposing that we do the "delayed async batch" but I think
> your
> > > > > argument for complexity and pushing it out of the scope is
> > convincing,
> > > so
> > > > > instead I propose we do the synchronous mini batching still but
> > > obviously
> > > > > it is already there :)  I'm +1 on the current proposal scope.
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Tue, Jan 21, 2020 at 10:16 AM Brian Byrne 
> > > > wrote:
> > > > >
> > > > > > Hi Guozhang,
> > > > > >
> > > > > > Ah, sorry, I misunderstood. Actually, this is solved for us
> today.
> > > How
> > > > > the
> > > > > > producer works is that it maintains at most one inflight metadata
> > > fetch
> > > > > > request at any time, where each request is tagged with the
> current
> > > > > > (monotonically increasing) request version. This version is
> bumped
> > > > > whenever
> > > > > > a new topic is encountered, and metadata fetching will continue
> to
> > > > > process
> > > > > > while the latest metadata response's version is below the current
> > > > > version.
> > > > > >
> > > > > > So if a metadata request is in flight, and a number of threads
> > > produce
> > > > to
> > > > > > new topics, they'll be added to the working set but the next
> > metadata
> > > > > > request won't take place until the outstanding one returns. So
> > their
> > > > > > updates will be batched together. As you suggest, we can have a
> > > simple
> > > > > list
> > > > > > that tracks unknown topics to isolate new vs. old topics.
> > > > > >
> > > > > > Thanks,
> > > > > > Brian
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Jan 21, 2020 at 10:04 AM Guozhang Wang <
> wangg...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi Brian,
> > > > > > >
> > > > > > > I think I buy the complexity and extra end-to-end-latency
> > argument
> > > :)
> > > > > I'm
> > > > > > > fine with delaying the asynchronous tech fetching to future
> works
> > > and
> > > > > > keep
> > > > > > > the current KIP's scope as-is for now. Under that case can we
> > > > consider
> > > > > > just
> > > > > > > a minor implementation detail (since it is not affecting public
> > > APIs
> > > > we
> > > > > > > probably do not even need to list it, but just thinking loud
> > here):
> > > > > > >
> > > > > > > In your proposal when we request for a topic of unknown
> metadata,
> > > we
> > > > > are
> > > > > > > going to directly set the topic name as that singleton in the
> > > > request.
> > > > > > I'm
> > > > > > > wondering for the scenario that KAFKA-8904 described, if the
> > > > > > producer#send
> > > > > > > for thousands of new topics are triggered sequentially by a
> > single
> > > > > thread
> > > > > > > or concurrent threads? If it's the latter, and we expect in
> such
> > > > > > scenarios
> > > > > > > we may have multiple topics being requests within a very short
> > > time,
> > > > > then
> > > > > > > we can probably do sth. like this internally in a synchronized
> > > > manner:
> > > > > > >
> > > > > > > 1) put the topic name into a list, as "unknown topics", then
> > > > > > > 2) exhaust the list, and put all topics from that list to the
> > > > request;
> > > > > if
> > > > > > > the list is empty, it means it has been emptied by another
> thread
> > > so
> > > > we
> > > > > > > skip sending a new request and just wait for the returned
> > metadata
> > > > > > refresh.
> > > > > > >
> > > > > > > In most cases the list would just be a singleton with the one
> > that
> > > > > thread
> > > > > > > has just enqueued, but under extreme scenarios it can help
> > > batching a
> > > > > few
> > > > > > > topic names probably (of course, I'm thinking about very
> extreme
> > > > cases
> > > > > > > here, assuming that's was what we've seen in 8904). Since these
> > two
> > > > > steps
> > > > > > > are very light-weighted, doing that in 

[jira] [Resolved] (KAFKA-9471) Throw exception for DEAD StreamThread.State

2020-01-24 Thread Ted Yu (Jira)


 [ https://issues.apache.org/jira/browse/KAFKA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved KAFKA-9471.
---
Resolution: Duplicate

> Throw exception for DEAD StreamThread.State
> ---
>
> Key: KAFKA-9471
> URL: https://issues.apache.org/jira/browse/KAFKA-9471
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> In StreamThreadStateStoreProvider we have:
> {code}
> if (streamThread.state() == StreamThread.State.DEAD) {
> return Collections.emptyList();
> {code}
> If the user cannot retry anymore, we should throw an exception, which is
> handled in the else block.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9471) Return empty collection for PENDING_SHUTDOWN

2020-01-23 Thread Ted Yu (Jira)
Ted Yu created KAFKA-9471:
-

 Summary: Return empty collection for PENDING_SHUTDOWN
 Key: KAFKA-9471
 URL: https://issues.apache.org/jira/browse/KAFKA-9471
 Project: Kafka
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu


In StreamThreadStateStoreProvider we have:
{code}
if (streamThread.state() == StreamThread.State.DEAD) {
return Collections.emptyList();
{code}
PENDING_SHUTDOWN should be treated the same way as DEAD.





[jira] [Resolved] (KAFKA-9464) Close the producer in completeShutdown

2020-01-23 Thread Ted Yu (Jira)


 [ https://issues.apache.org/jira/browse/KAFKA-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved KAFKA-9464.
---
Resolution: Not A Problem

> Close the producer in completeShutdown
> --
>
> Key: KAFKA-9464
> URL: https://issues.apache.org/jira/browse/KAFKA-9464
> Project: Kafka
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Minor
>
> In StreamThread#completeShutdown, the producer (if not null) should be closed.





[jira] [Created] (KAFKA-9465) Enclose consumer call with catching InvalidOffsetException

2020-01-22 Thread Ted Yu (Jira)
Ted Yu created KAFKA-9465:
-

 Summary: Enclose consumer call with catching InvalidOffsetException
 Key: KAFKA-9465
 URL: https://issues.apache.org/jira/browse/KAFKA-9465
 Project: Kafka
  Issue Type: Improvement
Reporter: Ted Yu


In maybeUpdateStandbyTasks, the try block currently encloses the record
handling.
Since InvalidOffsetException is thrown by restoreConsumer.poll, we should
enclose this call in the try block as well.
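A rough sketch of the proposed shape, using stand-in types since the real code lives in StreamThread and uses the Kafka consumer classes:

```java
import java.time.Duration;
import java.util.List;

// Stand-in for org.apache.kafka.clients.consumer.InvalidOffsetException.
class InvalidOffsetException extends RuntimeException { }

// Stand-in for the restore consumer's poll surface.
interface RestoreConsumer {
    List<byte[]> poll(Duration timeout);
}

class StandbyUpdater {
    // Returns true if recovery was triggered; illustrative only.
    boolean maybeUpdateStandbyTasks(final RestoreConsumer restoreConsumer) {
        try {
            // poll() itself may throw InvalidOffsetException, so it belongs
            // inside the try block together with the record handling.
            final List<byte[]> records = restoreConsumer.poll(Duration.ZERO);
            for (final byte[] record : records) {
                // ... apply record to the standby state store ...
            }
            return false;
        } catch (final InvalidOffsetException e) {
            // ... reinitialize the standby task from a valid offset ...
            return true;
        }
    }
}
```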





[jira] [Created] (KAFKA-9464) Close the producer in completeShutdown

2020-01-22 Thread Ted Yu (Jira)
Ted Yu created KAFKA-9464:
-

 Summary: Close the producer in completeShutdown
 Key: KAFKA-9464
 URL: https://issues.apache.org/jira/browse/KAFKA-9464
 Project: Kafka
  Issue Type: Bug
Reporter: Ted Yu


In StreamThread#completeShutdown, the producer (if not null) should be closed.





[jira] [Created] (KAFKA-9463) Transient failure in KafkaAdminClientTest.testListOffsets

2020-01-21 Thread Ted Yu (Jira)
Ted Yu created KAFKA-9463:
-

 Summary: Transient failure in KafkaAdminClientTest.testListOffsets
 Key: KAFKA-9463
 URL: https://issues.apache.org/jira/browse/KAFKA-9463
 Project: Kafka
  Issue Type: Test
Reporter: Ted Yu


When running tests with Java 11, I got the following test failure:
{code}
java.util.concurrent.ExecutionException: 
org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node 
assignment.
at 
org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at 
org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
at 
org.apache.kafka.clients.admin.KafkaAdminClientTest.testListOffsets(KafkaAdminClientTest.java:2336)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting 
for a node assignment.
{code}
KafkaAdminClientTest.testListOffsets passes when it is run alone.





[jira] [Created] (KAFKA-9462) Correct exception message in DistributedHerder

2020-01-21 Thread Ted Yu (Jira)
Ted Yu created KAFKA-9462:
-

 Summary: Correct exception message in DistributedHerder
 Key: KAFKA-9462
 URL: https://issues.apache.org/jira/browse/KAFKA-9462
 Project: Kafka
  Issue Type: Bug
Reporter: Ted Yu


There are a few exception messages in DistributedHerder which were copied from
other exception messages.

This task corrects the messages to reflect the actual conditions.





Re: [VOTE] KIP-553: Disable all SSL protocols except TLSV1.2 by default.

2020-01-21 Thread Ted Yu
+1

On Tue, Jan 21, 2020 at 8:24 AM Rajini Sivaram 
wrote:

> +1 (binding)
>
> Thanks for the KIP!
>
> Regards,
>
> Rajini
>
>
> On Tue, Jan 21, 2020 at 3:43 PM Николай Ижиков 
> wrote:
>
> > Hello.
> >
> > I would like to start vote for KIP-553: Disable all SSL protocols except
> > TLSV1.2 by default.
> >
> > KIP -
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=142641956
> > Discussion thread -
> >
> https://lists.apache.org/thread.html/9c6201fe403a24f84fc3aa27f47dd06b718c1d80de0ee3412b9b877c%40%3Cdev.kafka.apache.org%3E
>


Re: [DISCUSS] : KIP-562: Allow fetching a key from a single partition rather than iterating over all the stores on an instance

2020-01-18 Thread Ted Yu
Looking at the current KIP-562:

bq. Create a taskId from the combination of store name and partition
provided by the user

I wonder if a single taskId would be used for the “all partitions” case.
If so, we need to choose a numerical value for the partition portion of the
taskId.
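To make the trade-off concrete, here is a made-up sketch of the two getter shapes being discussed (this is not the actual KIP-562 class):

```java
// Made-up sketch, not the real StoreQueryParams from KIP-562.
class StoreQueryParams {
    private final Integer partition; // null means "all local partitions"

    private StoreQueryParams(final Integer partition) {
        this.partition = partition;
    }

    static StoreQueryParams withPartition(final int partition) {
        return new StoreQueryParams(partition);
    }

    static StoreQueryParams withAllLocalPartitions() {
        return new StoreQueryParams(null);
    }

    // Explicit two-method shape: a boolean flag plus a nullable partition.
    boolean getAllLocalPartitions() {
        return partition == null;
    }

    Integer getPartition() {
        return partition; // null if "all partitions" was requested
    }
}
```

Collapsing these into a single getPartition() that returns -1 for "all partitions" avoids the second method, but encodes the special case in a magic value rather than an explicit API.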

On Sat, Jan 18, 2020 at 10:27 AM John Roesler  wrote:

> Thanks, Ted!
>
> This makes sense, but it seems like we should lean towards explicit
> semantics in the public API. ‘-1’ meaning “all partitions” is reasonable,
> but not explicit. That’s why I suggested the Boolean for “all partitions”.
> I guess this also means getPartition() should either throw an exception or
> return null if the partition is unspecified.
>
> Thanks,
> John
>
> On Sat, Jan 18, 2020, at 08:43, Ted Yu wrote:
> > I wonder if the following two methods can be combined:
> >
> > Integer getPartition() // would be null if unset or if "all partitions"
> > boolean getAllLocalPartitions() // true/false if "all partitions"
> requested
> >
> > into:
> >
> > Integer getPartition() // would be null if unset or -1 if "all
> partitions"
> >
> > Cheers
> >
> > On Fri, Jan 17, 2020 at 9:56 PM John Roesler 
> wrote:
> >
> > > Thanks, Navinder!
> > >
> > > I took a look at the KIP.
> > >
> > > We tend to use static factory methods instead of public constructors,
> and
> > > also builders for optional parameters.
> > >
> > > Given that, I think it would be more typical to have a factory method:
> > > storeQueryParams()
> > >
> > > and also builders for setting the optional parameters, like:
> > > withPartitions(List<Integer> partitions)
> > > withStaleStoresEnabled()
> > > withStaleStoresDisabled()
> > >
> > >
> > > I was also thinking this over today, and it really seems like there are
> > > two main cases for specifying partitions,
> > > 1. you know exactly what partition you want. In this case, you'll only
> > > pass in a single number.
> > > 2. you want to get a handle on all the stores for this instance (the
> > > current behavior). In this case, it's not clear how to use
> withPartitions
> > > to achieve the goal, unless you want to apply a-priori knowledge of the
> > > number of partitions in the store. We could consider an empty list, or
> a
> > > null, to indicate "all", but that seems a little complicated.
> > >
> > > Thus, maybe it would actually be better to eschew withPartitions for
> now
> > > and instead just offer:
> > > withPartition(int partition)
> > > withAllLocalPartitions()
> > >
> > > and the getters:
> > > Integer getPartition() // would be null if unset or if "all partitions"
> > > boolean getAllLocalPartitions() // true/false if "all partitions"
> requested
> > >
> > > Sorry, I know I'm stirring the pot, but what do you think about this?
> > >
> > > Oh, also, the KIP is missing the method signature for the new
> > > KafkaStreams#store overload.
> > >
> > > Thanks!
> > > -John
> > >
> > > On Fri, Jan 17, 2020, at 08:07, Navinder Brar wrote:
> > > > Hi all,
> > > > I have created a new
> > > > KIP:
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-562%3A+Allow+fetching+a+key+from+a+single+partition+rather+than+iterating+over+all+the+stores+on+an+instance
> > > > Please take a look if you get a chance.
> > > > ~Navinder
> > >
> >
>


Re: [DISCUSS] : KIP-562: Allow fetching a key from a single partition rather than iterating over all the stores on an instance

2020-01-18 Thread Ted Yu
I wonder if the following two methods can be combined:

Integer getPartition() // would be null if unset or if "all partitions"
boolean getAllLocalPartitions() // true/false if "all partitions" requested

into:

Integer getPartition() // would be null if unset or -1 if "all partitions"

Cheers

On Fri, Jan 17, 2020 at 9:56 PM John Roesler  wrote:

> Thanks, Navinder!
>
> I took a look at the KIP.
>
> We tend to use static factory methods instead of public constructors, and
> also builders for optional parameters.
>
> Given that, I think it would be more typical to have a factory method:
> storeQueryParams()
>
> and also builders for setting the optional parameters, like:
> withPartitions(List<Integer> partitions)
> withStaleStoresEnabled()
> withStaleStoresDisabled()
>
>
> I was also thinking this over today, and it really seems like there are
> two main cases for specifying partitions,
> 1. you know exactly what partition you want. In this case, you'll only
> pass in a single number.
> 2. you want to get a handle on all the stores for this instance (the
> current behavior). In this case, it's not clear how to use withPartitions
> to achieve the goal, unless you want to apply a-priori knowledge of the
> number of partitions in the store. We could consider an empty list, or a
> null, to indicate "all", but that seems a little complicated.
>
> Thus, maybe it would actually be better to eschew withPartitions for now
> and instead just offer:
> withPartition(int partition)
> withAllLocalPartitions()
>
> and the getters:
> Integer getPartition() // would be null if unset or if "all partitions"
> boolean getAllLocalPartitions() // true/false if "all partitions" requested
>
> Sorry, I know I'm stirring the pot, but what do you think about this?
>
> Oh, also, the KIP is missing the method signature for the new
> KafkaStreams#store overload.
>
> Thanks!
> -John
>
> On Fri, Jan 17, 2020, at 08:07, Navinder Brar wrote:
> > Hi all,
> > I have created a new
> > KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-562%3A+Allow+fetching+a+key+from+a+single+partition+rather+than+iterating+over+all+the+stores+on+an+instance
> > Please take a look if you get a chance.
> > ~Navinder
>


Re: [VOTE] 2.1.0 RC0

2018-10-24 Thread Ted Yu
+1

InternalTopicIntegrationTest failed during the test suite run but passed on
rerun.

On Wed, Oct 24, 2018 at 3:48 AM Andras Beni 
wrote:

> +1 (non-binding)
>
> Verified signatures and checksums of release artifacts
> Performed quickstart steps on rc artifacts (both scala 2.11 and 2.12) and
> one built from tag 2.1.0-rc0
>
> Andras
>
> On Wed, Oct 24, 2018 at 10:17 AM Dong Lin  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the first candidate for feature release of Apache Kafka 2.1.0.
> >
> > This is a major version release of Apache Kafka. It includes 28 new KIPs
> > and
> >
> > critical bug fixes. Please see the Kafka 2.1.0 release plan for more
> > details:
> >
> > *
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=91554044*
> > <
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=91554044
> > >
> >
> > Here are a few notable highlights:
> >
> > - Java 11 support
> > - Support for Zstandard, which achieves compression comparable to gzip
> with
> > higher compression and especially decompression speeds (KIP-110)
> > - Avoid expiring committed offsets for active consumer group (KIP-211)
> > - Provide Intuitive User Timeouts in The Producer (KIP-91)
> > - Kafka's replication protocol now supports improved fencing of zombies.
> > Previously, under certain rare conditions, if a broker became partitioned
> > from Zookeeper but not the rest of the cluster, then the logs of
> replicated
> > partitions could diverge and cause data loss in the worst case (KIP-320)
> > - Streams API improvements (KIP-319, KIP-321, KIP-330, KIP-353, KIP-356)
> > - Admin script and admin client API improvements to simplify admin
> > operation (KIP-231, KIP-308, KIP-322, KIP-324, KIP-338, KIP-340)
> > - DNS handling improvements (KIP-235, KIP-302)
> >
> > Release notes for the 2.1.0 release:
> > http://home.apache.org/~lindong/kafka-2.1.0-rc0/RELEASE_NOTES.html
> >
> > *** Please download, test and vote ***
> >
> > * Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~lindong/kafka-2.1.0-rc0/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/
> >
> > * Javadoc:
> > http://home.apache.org/~lindong/kafka-2.1.0-rc0/javadoc/
> >
> > * Tag to be voted upon (off 2.1 branch) is the 2.1.0-rc0 tag:
> > https://github.com/apache/kafka/tree/2.1.0-rc0
> >
> > * Documentation:
> > *http://kafka.apache.org/21/documentation.html*
> > 
> >
> > * Protocol:
> > http://kafka.apache.org/21/protocol.html
> >
> > * Successful Jenkins builds for the 2.1 branch:
> > Unit/integration tests: *
> https://builds.apache.org/job/kafka-2.1-jdk8/38/
> > *
> >
> > Please test and verify the release artifacts and submit a vote for this
> RC,
> > or report any issues so we can fix them and get a new RC out ASAP.
> Although
> > this release vote requires PMC votes to pass, testing, votes, and bug
> > reports are valuable and appreciated from everyone.
> >
> > Cheers,
> > Dong
> >
>


Re: [DISCUSS] Replacing EasyMock with Mockito in Kafka

2018-10-05 Thread Ted Yu
+1 to moving to Mockito

On Fri, Oct 5, 2018 at 12:11 PM Ron Dagostino  wrote:

> I have used Mockito and am a big fan -- I had never used EasyMock until
> recently.  The concept of record vs. replay mode in EasyMock really annoyed
> me -- I'm a fan of the "NO MODES" idea (
> https://en.wikipedia.org/wiki/Larry_Tesler), and when I encountered it in
> EasyMock in the context of Kafka development I just gritted my teeth and
> learned/used it as required.  I'm very pleased that Mockito is being
> seriously considered as a replacement.  Mockito  started as an EasyMock
> fork, I guess, and the project readily admits it is standing on the
> shoulders of giants in many respects, but I think one of the best things
> they ever did was jettison the concept of modes.
>
> +1 from me.
>
> Ron
>
> On Fri, Oct 5, 2018 at 10:00 AM Ismael Juma  wrote:
>
> > So far, it seems like people are in favour. If no objections are
> presented
> > in the next couple of days, we will go ahead with the change.
> >
> > Ismael
> >
> > On Sun, 30 Sep 2018, 20:32 Ismael Juma,  wrote:
> >
> > > Hi all,
> > >
> > > As described in KAFKA-7438
> > > , EasyMock's
> > > development has stagnated. This presents a number of issues:
> > >
> > > 1. Blocks us from running tests with newer Java versions, which is a
> > > frequent occurrence given the new Java release cadence. It is the main
> > > blocker in switching Jenkins from Java 10 to Java 11 at the moment.
> > > 2. Integration with newer testing libraries like JUnit 5 is slow to
> > appear
> > > (if it appears at all).
> > > 3. No API improvements. Mockito started as an EasyMock fork, but has
> > > continued to evolve and, in my opinion, it's more intuitive now.
> > >
> > > I think we should switch to Mockito for new tests and to incrementally
> > > migrate the existing ones as time allows. To make the proposal
> concrete,
> > I
> > > went ahead and converted all the tests in the `clients` module:
> > >
> > > https://github.com/apache/kafka/pull/5691
> > >
> > > I think the updated tests are nicely readable. I also removed PowerMock
> > > from the `clients` tests as we didn't really need it and its
> development
> > > has also stagnated a few months ago. I think we can easily remove
> > PowerMock
> > > elsewhere with the exception of `Connect` where we may need to keep it
> > for
> > > a while.
> > >
> > > Let me know your thoughts. Aside from the general future direction, I'd
> > > like to get the PR for KAFKA-7439 reviewed and merged soonish as merge
> > > conflicts will creep in quickly.
> > >
> > > Ismael
> > >
> >
>


Re: Re: [VOTE] KIP-347: Enable batching in FindCoordinatorRequest

2018-08-27 Thread Ted Yu
Looking at the code for solution #1:
} else if (builder.build(version) instanceof List) {

wouldn't AbstractRequest be gone due to type erasure ?

Which solution do you favor ?

Cheers
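(A self-contained illustration of the erasure point — at runtime only the raw type can be checked, so a generic type argument such as `AbstractRequest` in a hypothetical `List<AbstractRequest>` check would indeed be gone. The class below is just a demo, not Kafka code.)

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        Object o = strings;
        // Only the raw type survives erasure: this check passes for any List,
        // regardless of its (erased) element type.
        System.out.println(o instanceof List);  // prints: true
        // By contrast, `o instanceof List<String>` is rejected by javac
        // (prior to Java 16's pattern matching), because the JVM cannot
        // verify the type argument at runtime.
    }
}
```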

On Mon, Aug 27, 2018 at 4:20 PM Yishun Guan  wrote:

> Sorry for the delay, I have been struggling to come up with a nice
> solution:
>
> I have updated the KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-347%3A++Enable+batching+in+FindCoordinatorRequest
>
> In summary, to solve the question Guozhang raised:
>
> "One tricky question is, how do we know if a higher version API has a
> batching optimization...
>
> a) One solution is to let the request's builder.build() return either
> ONE request or a LIST of requests. This is backward compatible. We can
> have a list of one single element.
>
> b) An alternative solution is to add extra fields in
> AbstractRequest.java including Boolean isBatchingEnable() and
> List buildFromBatch(). This way will decouple the two
> different build functions.
>
> Then we update the send logic in doSend() correspondingly."
>
>
> You can read about these solutions in more details in this KIP.
>
> Thanks,
> Yishun
>
> On Fri, Aug 17, 2018 at 12:12 PM Yishun Guan  wrote:
> >
> > Thanks for the clarification. I will address this in my KIP.
> >
> > On Fri, Aug 17, 2018, 12:06 PM Guozhang Wang  wrote:
> >>
> >> Today we do have logic for auto down-conversion, but it is assuming a
> one-to-one mapping. The actual logic is here:
> >>
> >>
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L775
> >>
> >> As you can see, NetworkClient maintains a "apiVersions" map that keeps
> which node supports which version. And then when sending the request to the
> node it will use this map to form the supported version of the request.
> >>
> >> But the current logic does not consider that we may need multiple lower
> versioned requests to substitute a single higher versioned request, and
> that would be the logic your PR need to address.
> >>
> >>
> >> Guozhang
> >>
> >> On Fri, Aug 17, 2018 at 11:59 AM, Yishun Guan 
> wrote:
> >>>
> >>> @Guozhang Wang One thing that I remain confused about (and forgive me
> if you have explained this to me before), is that if we don't have any
> transformation helpers (between versions) implemented before, how do we
> deal with other incompatibility issues when we try to update requests and
> bump up their versions? Or we never have this problem until this version
> update and now that's why we need to implement a converter from V3 to V2?
> >>>
> >>> On Fri, Aug 17, 2018 at 10:18 AM Guozhang Wang 
> wrote:
> 
>  Yishun, some more comments:
> 
>  1. "All the coordinator ids " + "for this request": it should be "all
> the
>  requested group ids looking for their coordinators" right?
> 
>  2. I just thought about this a bit more, regarding "*Compatibility
> issues
>  between old and new versions need to be considered, we should think
> about
>  how to convert requests from a newer version to a old version."*
> 
> 
>  One thing I realized is that for FindCoordinatorRequest, today both
>  consumer / admin client would need it. I.e. to complete the KIP for
>  compatibility, you'll have to implement this function along with this
> KIP,
>  since otherwise a consumer talking to an old broker will fail, too.
> 
>  So I'd suggest you update the `Compatibility` section with a detailed
>  proposal on how to let new versioned clients to talk to old versioned
>  brokers. We've talked about some high-level implementation guidelines
> in
>  the DISCUSS thread, which you can try it out and see if it works:
> i.e. by
>  starting a Kafka broker with version 2.0, and then starting a consumer
>  client with trunk (it will have a new version), and the added logic
> should
>  make sure the consumer still proceeds normally with the compatibility
> logic
>  that we are going to add.
> 
> 
>  Guozhang
> 
>  On Thu, Aug 16, 2018 at 5:46 PM, Hu Xi  wrote:
> 
>  > +1 (non-binding)
>  >
>  > 
>  > From: Yishun Guan 
>  > Sent: August 17, 2018 8:14
>  > To: dev@kafka.apache.org
>  > Subject: [VOTE] KIP-347: Enable batching in FindCoordinatorRequest
>  >
>  > Hi all,
>  >
>  > I want to start a vote on this KIP:
>  > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>  > 347%3A++Enable+batching+in+FindCoordinatorRequest
>  >
>  > Here is the discussion thread:
>  > https://lists.apache.org/thread.html/fd727cc7d5b0956d64255c35d5ed46
>  > 403c3495a7052ba8ffbc55e084@%3Cdev.kafka.apache.org%3E
>  >
>  > Thanks everyone for your input!
>  >
>  > Best,
>  > Yishun
>  >
> 
> 
> 
>  --
>  -- Guozhang
> >>
> >>
> >>
> >>
> >> --

Re: [VOTE] KIP-365: Materialized, Serialized, Joined, Consumed and Produced with implicit Serde

2018-08-27 Thread Ted Yu
+1

On Mon, Aug 27, 2018 at 12:18 PM John Roesler  wrote:

> +1 (non-binding)
>
> On Sat, Aug 25, 2018 at 1:16 PM Joan Goyeau  wrote:
>
> > Hi,
> >
> > We want to make sure that we always have a serde for all Materialized,
> > Serialized, Joined, Consumed and Produced.
> > For that we can make use of the implicit parameters in Scala.
> >
> > KIP:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-365%3A+Materialized%2C+Serialized%2C+Joined%2C+Consumed+and+Produced+with+implicit+Serde
> >
> > Github PR: https://github.com/apache/kafka/pull/5551
> >
> > Please make your votes.
> > Thanks
> >
>


Re: [VOTE] KIP-357: Add support to list ACLs per principal

2018-08-27 Thread Ted Yu
+1

On Mon, Aug 27, 2018 at 11:42 AM Priyank Shah  wrote:

> +1 (Non-binding)
>
> On 8/27/18, 9:24 AM, "Manikumar"  wrote:
>
> Hi All,
>
> I would like to start voting on KIP-357 which allows to list ACLs per
> principal using AclCommand (kafka-acls.sh)
>
> KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-357%3A++Add+support+to+list+ACLs+per+principal
>
> Discussion Thread:
>
> https://lists.apache.org/thread.html/dc7f6005845a372a0a48a40872a32d9ece03807a4fb1bb89d3645afb@%3Cdev.kafka.apache.org%3E
>
> Thanks,
> Manikumar
>
>
>


[jira] [Created] (KAFKA-7345) Potentially unclosed FileChannel in StateDirectory#unlock

2018-08-26 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7345:
-

 Summary: Potentially unclosed FileChannel in StateDirectory#unlock
 Key: KAFKA-7345
 URL: https://issues.apache.org/jira/browse/KAFKA-7345
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Ted Yu


{code}
lockAndOwner.lock.release();
log.debug("{} Released state dir lock for task {}", logPrefix(), 
taskId);

final FileChannel fileChannel = channels.remove(taskId);
if (fileChannel != null) {
fileChannel.close();
{code}
If {{lockAndOwner.lock.release()}} throws IOE, the closing of the FileChannel 
would be skipped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7344) Return early when all tasks are assigned in StickyTaskAssignor#assignActive

2018-08-25 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7344:
-

 Summary: Return early when all tasks are assigned in 
StickyTaskAssignor#assignActive
 Key: KAFKA-7344
 URL: https://issues.apache.org/jira/browse/KAFKA-7344
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Ted Yu


After re-assigning existing active tasks to clients that previously had the 
same active task, there is a chance that {{taskIds.size() == assigned.size()}}, 
i.e. all tasks are assigned.
The method continues with:
{code}
final Set unassigned = new HashSet<>(taskIds);
unassigned.removeAll(assigned);
{code}
We can check the above condition and return early before allocating HashSet.

Similar optimization can be done before the following (around line 112):
{code}
// assign any remaining unassigned tasks
final List sortedTasks = new ArrayList<>(unassigned);
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: unable to build schema registry

2018-08-24 Thread Ted Yu
The first error was complaining about this class from Kafka clients:

clients/src/main/java/org/apache/kafka/common/utils/Timer.java

It seems you haven't installed the kafka clients jar into your local maven repo.
You can run the following command under the kafka working dir:

./gradlew installAll

I just built schema-registry successfully (by first installing common and
rest-utils)

FYI

On Fri, Aug 24, 2018 at 10:30 AM Simon Nunn  wrote:

>
> Getting the following error when trying to build the schema registry.  I
> have tried various versions of kafka, but not sure what I need to do.  Any
> help would be appreciated.
>
>
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile
> (default-compile) on project kafka-schema-registry: Compilation failure:
> Compilation failure:
> [ERROR]
> /C:/projects/schema-registry/core/src/main/java/io/confluent/kafka/schemaregistry/masterelector/kafka/SchemaRegistryCoordinator.java:[25,37]
> cannot find symbol
> [ERROR]   symbol:   class Timer
> [ERROR]   location: package org.apache.kafka.common.utils
> [ERROR]
> /C:/projects/schema-registry/core/src/main/java/io/confluent/kafka/schemaregistry/masterelector/kafka/SchemaRegistryCoordinator.java:[207,57]
> cannot find symbol
> [ERROR]   symbol:   class Timer
> [ERROR]   location: class
> io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator
> [ERROR]
> /C:/projects/schema-registry/core/src/main/java/io/confluent/kafka/schemaregistry/masterelector/kafka/SchemaRegistryCoordinator.java:[98,36]
> cannot find symbol
> [ERROR]   symbol:   method timer(long)
> [ERROR]   location: variable time of type
> org.apache.kafka.common.utils.Time
> [ERROR]
> /C:/projects/schema-registry/core/src/main/java/io/confluent/kafka/schemaregistry/masterelector/kafka/SchemaRegistryCoordinator.java:[114,23]
> cannot find symbol
> [ERROR]   symbol:   method timer(long)
> [ERROR]   location: variable time of type
> org.apache.kafka.common.utils.Time
> [ERROR] -> [Help 1]
>


Re: [VOTE] KIP-363: Make FunctionConversions private

2018-08-24 Thread Ted Yu
+1

On Fri, Aug 24, 2018 at 5:17 PM Attila Sasvári  wrote:

> Hi there,
>
> There is a conflicting KIP with the same number, see
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-363%3A+Allow+performance+tools+to+print+final+results+to+output+file
>
> Its discussion was started earlier, on August 23
> https://www.mail-archive.com/dev@kafka.apache.org/msg91132.html and KIP
> page already includes it:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> Please update KIP number to resolve the conflict.
>
> Apart from this, +1 (non-binding) and thanks for the KIP!
>
> Regards,
> - Attila
>
>
> Guozhang Wang  (időpont: 2018. aug. 24., P, 20:26) ezt
> írta:
>
> > +1 from me (binding).
> >
> > On Fri, Aug 24, 2018 at 11:24 AM, Joan Goyeau  wrote:
> >
> > > Hi,
> > >
> > > As pointed out in this comment #5539 (comment)
> > > <https://github.com/apache/kafka/pull/5539#discussion_r212380648>
> "This
> > > class was already defaulted to public visibility, and we can't retract
> it
> > > now, without a KIP.", the object FunctionConversions is only of
> internal
> > > use and therefore should be private to the lib only so that we can do
> > > changes without going through KIP like this one.
> > >
> > > Please make your vote.
> > >
> > > On Fri, 24 Aug 2018 at 19:14 John Roesler  wrote:
> > >
> > > > I'm also in favor of this. I don't think it's controversial either.
> > > Should
> > > > we just move to a vote?
> > > >
> > > > On Thu, Aug 23, 2018 at 7:01 PM Guozhang Wang 
> > > wrote:
> > > >
> > > > > +1.
> > > > >
> > > > > On Thu, Aug 23, 2018 at 12:47 PM, Ted Yu 
> > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > In the Motivation section, you can quote the comment from pull
> > > request
> > > > so
> > > > > > that reader doesn't have to click through.
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Thu, Aug 23, 2018 at 12:13 PM Joan Goyeau 
> > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > As pointed out in this comment #5539 (comment)
> > > > > > > <
> https://github.com/apache/kafka/pull/5539#discussion_r212380648
> > >
> > > > the
> > > > > > > object FunctionConversions is only of internal use and
> therefore
> > > > should
> > > > > > be
> > > > > > > private to the lib only so that we can do changes without going
> > > > through
> > > > > > KIP
> > > > > > > like this one.
> > > > > > >
> > > > > > > KIP:
> > > > > > >
> > > > > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-363%3A+Make+
> > > > > > FunctionConversions+private
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: Build failed in Jenkins: kafka-trunk-jdk10 #429

2018-08-23 Thread Ted Yu
I ran streams unit tests as of
commit 4156ea0a9bcca67d209fd3b43d2268c9abd5a0b5 .

All tests passed locally.

FYI

On Thu, Aug 23, 2018 at 12:23 PM Joan Goyeau  wrote:

> I'm looking into this one.
>
> On Thu, 23 Aug 2018 at 20:19 Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
> > See <
> >
> https://builds.apache.org/job/kafka-trunk-jdk10/429/display/redirect?page=changes
> > >
> >
> > Changes:
> >
> > [wangguoz] KAFKA-7316: Fix Streams Scala filter recursive call #5538
> >
> > --
> > [...truncated 1.98 MB...]
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldNotHaveSameAssignmentOnAnyTwoHosts PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldRebalanceTasksToClientsBasedOnCapacity STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldRebalanceTasksToClientsBasedOnCapacity PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > >
> shouldNotHaveSameAssignmentOnAnyTwoHostsWhenThereArePreviousStandbyTasks
> > STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > >
> shouldNotHaveSameAssignmentOnAnyTwoHostsWhenThereArePreviousStandbyTasks
> > PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > >
> >
> shouldNotAssignStandbyTaskReplicasWhenNoClientAvailableWithoutHavingTheTaskAssigned
> > STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > >
> >
> shouldNotAssignStandbyTaskReplicasWhenNoClientAvailableWithoutHavingTheTaskAssigned
> > PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignEachActiveTaskToOneClientWhenMoreClientsThanTasks STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignEachActiveTaskToOneClientWhenMoreClientsThanTasks PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignTopicGroupIdEvenlyAcrossClientsWithStandByTasks STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignTopicGroupIdEvenlyAcrossClientsWithStandByTasks PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignTasksNotPreviouslyActiveToNewClient STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignTasksNotPreviouslyActiveToNewClient PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignOneActiveTaskToEachProcessWhenTaskCountSameAsProcessCount
> > STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignOneActiveTaskToEachProcessWhenTaskCountSameAsProcessCount
> > PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldReBalanceTasksAcrossClientsWhenCapacityLessThanTaskCount STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldReBalanceTasksAcrossClientsWhenCapacityLessThanTaskCount PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldBalanceActiveAndStandbyTasksAcrossAvailableClients STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldBalanceActiveAndStandbyTasksAcrossAvailableClients PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignActiveAndStandbyTasks STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignActiveAndStandbyTasks PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldNotHaveSameAssignmentOnAnyTwoHostsWhenThereArePreviousActiveTasks
> > STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldNotHaveSameAssignmentOnAnyTwoHostsWhenThereArePreviousActiveTasks
> > PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignTasksToClientWithPreviousStandbyTasks STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldAssignTasksToClientWithPreviousStandbyTasks PASSED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldReBalanceTasksAcrossAllClientsWhenCapacityAndTaskCountTheSame
> > STARTED
> >
> >
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignorTest
> > > shouldReBalanceTasksAcrossAllClientsWhenCapacityAndTaskCountTheSame
> 

Re: [DISCUSS] KIP-363: Make FunctionConversions private

2018-08-23 Thread Ted Yu
+1

In the Motivation section, you can quote the comment from the pull request so
that the reader doesn't have to click through.

Cheers

On Thu, Aug 23, 2018 at 12:13 PM Joan Goyeau  wrote:

> Hi,
>
> As pointed out in this comment #5539 (comment)
>  the
> object FunctionConversions is only of internal use and therefore should be
> private to the lib only so that we can do changes without going through KIP
> like this one.
>
> KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-363%3A+Make+FunctionConversions+private
>
> Thanks
>


[jira] [Created] (KAFKA-7334) Suggest changing config for state.dir in case of FileNotFoundException

2018-08-23 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7334:
-

 Summary: Suggest changing config for state.dir in case of 
FileNotFoundException
 Key: KAFKA-7334
 URL: https://issues.apache.org/jira/browse/KAFKA-7334
 Project: Kafka
  Issue Type: Improvement
Reporter: Ted Yu


Quoting stack trace from KAFKA-5998 :
{code}
WARN [2018-08-22 03:17:03,745] org.apache.kafka.streams.processor.internals.ProcessorStateManager: task [0_45] Failed to write offset checkpoint file to /tmp/kafka-streams//0_45/.checkpoint: {}
! java.nio.file.NoSuchFileException: /tmp/kafka-streams//0_45/.checkpoint.tmp
! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
{code}
When state.dir is left at the default configuration, there is a chance that certain 
files under the state directory are cleaned by the OS.

[~mjsax] and I proposed suggesting to the user, through the exception message, to change 
the location for state.dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-361: Add Consumer Configuration to Disable Auto Topic Creation

2018-08-22 Thread Ted Yu
We may also consider :

"suppress.auto.topic.creation"

or

"allow.auto.topic.creation"

w.r.t. suppress or allow, I don't have a strong opinion either. It's just a
matter of choosing the proper default value.

Cheers

On Wed, Aug 22, 2018 at 6:00 PM Dhruvil Shah  wrote:

> Hi Matthias,
>
> Do you mean something like "suppress.auto.create.topic"? I am leaning a bit
> towards "allow.auto.create.topics" but I don't have a strong preference
> either. Let's wait to hear if anyone else has an opinion on this.
>
> Thanks,
> Dhruvil
>
> On Tue, Aug 21, 2018 at 5:28 PM Matthias J. Sax 
> wrote:
>
> > Thanks for the KIP Dhruvil!
> >
> > I agree with Jason's comment. An alternative might be to use "suppress"
> > what would revert the logic of "allow". Not sure which one is more
> > intuitive and I am fine with both (no personal preference). Just wanted
> > to mention it as an alternative.
> >
> > Don't have any further comments/question so far.
> >
> >
> > -Matthias
> >
> >
> >
> > On 8/21/18 4:42 PM, Jason Gustafson wrote:
> > > Hey Dhruvil,
> > >
> > > I would suggest using the verb "allow" rather than "enable. The
> consumer
> > > cannot enable auto topic creation because it is configured on the
> broker.
> > > All it can do is prevent it from happening if it is enabled.
> > >
> > > -Jason
> > >
> > > On Tue, Aug 21, 2018 at 3:56 PM, Dhruvil Shah 
> > wrote:
> > >
> > >> Hi,
> > >>
> > >> I would like to start discussion on KIP-361 that proposes we add a
> > consumer
> > >> configuration to disable auto topic creation.
> > >>
> > >> Link to the KIP:
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-361%3A+Add+Consumer+
> > >> Configuration+to+Disable+Auto+Topic+Creation
> > >>
> > >> Suggestions and feedback are welcome!
> > >>
> > >> Thanks,
> > >> Dhruvil
> > >>
> > >
> >
> >
>


Re: [VOTE] KIP-336: Consolidate ExtendedSerializer/Serializer and ExtendedDeserializer/Deserializer

2018-08-22 Thread Ted Yu
+1
 Original message From: Kamal Chandraprakash 
 Date: 8/22/18  3:19 AM  (GMT-08:00) To: 
dev@kafka.apache.org Subject: Re: [VOTE] KIP-336: Consolidate 
ExtendedSerializer/Serializer and ExtendedDeserializer/Deserializer 
+1

Thanks for the KIP!

On Wed, Aug 22, 2018 at 2:48 PM Viktor Somogyi-Vass 
wrote:

> Hi All,
>
> I'd like to start a vote on this KIP (
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=87298242)
> which aims to refactor ExtendedSerializer/Serializer and
> ExtendedDeserializer/Deserializer.
>
> To summarize what's the motivation:
>
> When headers were introduced by KIP-82 the ExtendedSerializer and
> ExtendedDeserializer classes were created in order to keep interface
> compatibility but still add `T deserialize(String topic, Headers headers,
> byte[] data);` and `byte[] serialize(String topic, Headers headers, T
> data);` methods that consume the headers for serialization/deserialization.
> The reason for doing so was that Kafka at that time needed to be compatible
> with Java 7. Since we're not compiling on Java 7 anymore (KAFKA-4423) we'll
> try consolidate the way we're using these in a backward compatible fashion:
> deprecating the Extended* classes and moving the aforementioned methods up
> in the class hierarchy.
>
> I'd be happy to get votes or additional feedback on this.
>
> Viktor
>


Re: [ANNOUNCE] New Kafka PMC member: Dong Lin

2018-08-21 Thread Ted Yu
Congratulation Dong!

On Tue, Aug 21, 2018 at 1:59 AM Viktor Somogyi-Vass 
wrote:

> Congrats Dong! :)
>
> On Tue, Aug 21, 2018 at 10:09 AM James Cheng  wrote:
>
> > Congrats Dong!
> >
> > -James
> >
> > > On Aug 20, 2018, at 3:54 AM, Ismael Juma  wrote:
> > >
> > > Hi everyone,
> > >
> > > Dong Lin became a committer in March 2018. Since then, he has remained
> > > active in the community and contributed a number of patches, reviewed
> > > several pull requests and participated in numerous KIP discussions. I
> am
> > > happy to announce that Dong is now a member of the
> > > Apache Kafka PMC.
> > >
> > > Congratulation Dong! Looking forward to your future contributions.
> > >
> > > Ismael, on behalf of the Apache Kafka PMC
> >
> >
>


[jira] [Created] (KAFKA-7316) Use of filter method in KTable.scala may result in StackOverflowError

2018-08-20 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7316:
-

 Summary: Use of filter method in KTable.scala may result in 
StackOverflowError
 Key: KAFKA-7316
 URL: https://issues.apache.org/jira/browse/KAFKA-7316
 Project: Kafka
  Issue Type: Bug
Reporter: Ted Yu


In this thread:

http://search-hadoop.com/m/Kafka/uyzND1dNbYKXzC4F1?subj=Issue+in+Kafka+2+0+0+

Druhin reported seeing StackOverflowError when using filter method from 
KTable.scala

This can be reproduced with the following change:
{code}
diff --git 
a/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
 b/streams/streams-scala/src/test/scala
index 3d1bab5..e0a06f2 100644
--- 
a/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
+++ 
b/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
@@ -58,6 +58,7 @@ class StreamToTableJoinScalaIntegrationTestImplicitSerdes 
extends StreamToTableJ
 val userClicksStream: KStream[String, Long] = 
builder.stream(userClicksTopic)

 val userRegionsTable: KTable[String, String] = 
builder.table(userRegionsTopic)
+userRegionsTable.filter { case (_, count) => true }

 // Compute the total per region by summing the individual click counts per 
region.
 val clicksPerRegion: KTable[String, Long] =
{code}
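The underlying pattern — a wrapper method that should delegate to an inner implementation but ends up calling itself — can be reduced to a self-contained illustration. All names below are invented for the demo; this is not the KTable.scala code:

```java
public class RecursiveDelegateDemo {
    interface Table {
        Table filter();
    }

    // Buggy wrapper: filter() should delegate to inner.filter(), but the
    // unqualified call resolves to itself, giving unbounded recursion.
    static class BuggyWrapper implements Table {
        final Table inner;
        BuggyWrapper(Table inner) { this.inner = inner; }
        public Table filter() {
            return filter();  // BUG: should be inner.filter()
        }
    }

    public static void main(String[] args) {
        Table inner = new Table() { public Table filter() { return this; } };
        try {
            new BuggyWrapper(inner).filter();
        } catch (StackOverflowError e) {
            // The recursion exhausts the stack, matching the reported error.
            System.out.println("StackOverflowError");
        }
    }
}
```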



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-08-19 Thread Ted Yu
bq. // or just Duration windowSize();

+1 to the above choice.
The duration is obvious from the return type. For getter methods, we don't
use get as prefix (as least for new code).

Cheers

On Sun, Aug 19, 2018 at 8:03 AM Nikolay Izhikov  wrote:

> Hello, John.
>
> Thank you very much for your feedback!
> I've addressed all your comments.
> Please, see my answers and let my know is anything in KIP [1] needs to be
> improved.
>
> > The correct choice is actually "Instant", not> "LocalDateTime"
>
> I've changed the methods proposed in KIP [1] to use Instant.
>
> > I noticed some recent APIs are> missing (see KIP-328)
> > those APIs were just added and have never been released... you can just
> replace them.
>
> I've added new methods to KIP [1].
> Not released methods marked for remove.
>
> > any existing method that's already deprecated, don't bother
> transitioning it to Duration.
>
> Fixed.
>
> > IllegalArgumentException... we should plan to mention this in the
> javadoc for those methods.
>
> Got it.
>
> > In Stores, windowSize and segmentInterval should also be durations.
>
> Fixed.
>
> > StreamsMetrics, recordLatency ... this one is better left alone.
>
> OK. I removed this method from KIP [1].
>
> Two more questions question about implementation:
>
> 1. We have several methods without parameters.
> In Java we can't have two no-arg methods with the same name that differ only
> in return type. It wouldn't compile.
> So we have to rename the new methods. Please, see suggested names and share
> your thoughts:
>
> Windows {
> long size() -> Duration windowSize();
> }
>
> Window {
> long start() -> Instant startTime();
> long end() -> Instant endTime();
> }
>
> SessionWindows {
> long inactivityGap() -> Duration inactivityGapDuration();
> }
>
> TimeWindowedDeserializer {
> Long getWindowSize() -> Duration getWindowSizeDuration(); // or just
> Duration windowSize();
> }
>
> SessionBytesStoreSupplier {
> long retentionPeriod() -> Duration retentionPeriodDuration();
> }
>
> WindowBytesStoreSupplier {
> long windowSize() -> Duration windowSizeDuration();
> long retentionPeriod() -> Duration retentionPeriodDuration();
> }
>
> 2. Do we want to use Duration and Instant inside API implementations?
>
> IGNITE-7277: "Durations potentially worsen memory pressure and gc
> performance, so internally, we will still use longMs as the representation."
> IGNITE-7222: Duration used to store retention.
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times
> [2]
> https://github.com/apache/kafka/commit/b3771ba22acad7870e38ff7f58820c5b50946787#diff-47289575d3e3e2449f27b3a7b6788e1aR64
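A minimal sketch of the no-arg renaming pattern from question 1 — keeping a released `long` getter while adding a Duration-returning getter under a new name, and mapping Duration overflow to IllegalArgumentException as discussed. Class, field, and method names here are assumptions for illustration, not the actual Kafka Streams API:

```java
import java.time.Duration;

public class WindowsSketch {
    private final long sizeMs;

    public WindowsSketch(long sizeMs) {
        this.sizeMs = sizeMs;
    }

    // Existing API, kept for compatibility (to be deprecated). Overloading
    // by return type alone is illegal in Java, hence the rename below.
    public long size() {
        return sizeMs;
    }

    // Proposed Duration-based getter under a new name.
    public Duration windowSize() {
        return Duration.ofMillis(sizeMs);
    }

    // Setter-side pattern: validate a caller-supplied Duration, converting
    // the overflow ArithmeticException into IllegalArgumentException.
    static long validateMillis(Duration d) {
        try {
            return d.toMillis();
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException("duration out of range", e);
        }
    }

    public static void main(String[] args) {
        WindowsSketch w = new WindowsSketch(5_000L);
        System.out.println(w.size());        // 5000
        System.out.println(w.windowSize());  // PT5S
    }
}
```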
>
> On Fri, 17/08/2018 at 14:46 -0500, John Roesler wrote:
> > Hi Nikolay,
> >
> > Thanks for this very nice KIP!
> >
> > To answer your questions:
> > 1. Correct, we should not delete existing methods that have been
> released,
> > but ...
> >
> > 2. Yes, we should deprecate the 'long' variants so that we can drop them
> > later on. Personally, I like to mention which version deprecated the
> method
> > so everyone can see later on how long it's been deprecated, but this may
> be
> > controversial, so let's let other weigh in.
> >
> > 3. I think you're asking whether it's appropriate to drop the "Ms"
> suffix,
> > and I think yes. So "long inactivityGapMs" would become "Duration
> > inactivityGap".
> > In the places where the parameter's name is just "duration", I think we
> can
> > pick something more descriptive (I realize it was already "durationMs";
> > this is just a good time to improve it).
> > Also, you're correct that we shouldn't use a Duration to represent a
> moment
> > in time, like "startTime". The correct choice is actually "Instant", not
> > "LocalDateTime", though.
> >
> https://stackoverflow.com/questions/32437550/whats-the-difference-between-instant-and-localdatetime
> > explains why.
> >
> > I also had a few notes on the KIP itself:
> > 4. You might want to pull trunk again. I noticed some recent APIs are
> > missing (see KIP-328).
> >
> > 5. Speaking of KIP-328: those APIs were just added and have never been
> > released, so there's no need to deprecate the methods, you can just replace
> > them.
> >
> > 6. For any existing method that's already deprecated, don't bother
> > transitioning it to Duration. I think the examples I noticed were
> > deprecated in KIP-328, so you'll see what I'm talking about when you pull
> > trunk again.
> >
> > 7. Any method taking a Duration argument may throw an
> > IllegalArgumentException (we choose to convert ArithmeticException to
> > IllegalArgumentException, as I mentioned in the Jira ticket). We don't need
> > a "throws" declaration, but we should plan to mention this in the javadoc
> > for those methods.
> >
> > 8. In Stores, windowSize and segmentInterval should also be durations.
> >
> > 9. In StreamsMetrics, recordLatency could be just a Duration, but I
> > actually think this one is better left alone.
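Points 3 and 7 above can be sketched together as follows. The helper name `validateMillis` is hypothetical and for illustration only, not an actual Kafka Streams method:

```java
import java.time.Duration;

public class ApiUtils {
    // Hypothetical helper: the public API accepts a Duration (point 3),
    // converts it to millis internally, and translates the ArithmeticException
    // thrown on overflow into an IllegalArgumentException (point 7).
    public static long validateMillis(String name, Duration duration) {
        if (duration == null) {
            throw new IllegalArgumentException(name + " cannot be null");
        }
        try {
            return duration.toMillis(); // throws ArithmeticException on overflow
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException(name + " is too large", e);
        }
    }

    public static void main(String[] args) {
        // "long inactivityGapMs" becomes "Duration inactivityGap" at the API surface.
        System.out.println(validateMillis("inactivityGap", Duration.ofSeconds(30)));
    }
}
```

Callers pass `Duration.ofSeconds(30)` instead of a bare `30_000L`, and no `throws` clause is needed since IllegalArgumentException is unchecked.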

Re: subscribe

2018-08-18 Thread Ted Yu
Please see instructions here:

http://kafka.apache.org/contact

On Sat, Aug 18, 2018 at 8:18 AM Aegeaner 
wrote:

>
>
>


Re: [VOTE] KIP-347: Enable batching in FindCoordinatorRequest

2018-08-16 Thread Ted Yu
+1

On Thu, Aug 16, 2018 at 5:15 PM Yishun Guan  wrote:

> Hi all,
>
> I want to start a vote on this KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-347%3A++Enable+batching+in+FindCoordinatorRequest
>
> Here is the discussion thread:
>
> https://lists.apache.org/thread.html/fd727cc7d5b0956d64255c35d5ed46403c3495a7052ba8ffbc55e084@%3Cdev.kafka.apache.org%3E
>
> Thanks everyone for your input!
>
> Best,
> Yishun
>


Re: [VOTE] KIP-325: Extend Consumer Group Command to Show Beginning Offsets

2018-08-16 Thread Ted Yu
+1

On Thu, Aug 16, 2018 at 12:05 PM Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> I would like to start a vote on KIP-325 which aims at adding a beginning
> offset column to consumer group command describe output.
>
> The KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-325%3A+Extend+Consumer+Group+Command+to+Show+Beginning+Offsets
> Discussion thread:
> https://www.mail-archive.com/dev@kafka.apache.org/msg89203.html
>
> Thanks!
> --Vahid
>
>
>


Re: [VOTE] KIP-353: Allow Users to Configure Multi-Streams Timestamp Synchronization Behavior

2018-08-14 Thread Ted Yu
+1
-------- Original message --------
From: Bill Bejeck
Date: 8/14/18 11:09 AM (GMT-08:00)
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-353: Allow Users to Configure Multi-Streams Timestamp Synchronization Behavior
+1

Thanks,
Bill

On Thu, Aug 9, 2018 at 4:20 PM John Roesler  wrote:

> +1 non-binding
>
> On Thu, Aug 9, 2018 at 3:14 PM Matthias J. Sax 
> wrote:
>
> > +1 (binding)
> >
> > On 8/9/18 11:57 AM, Guozhang Wang wrote:
> > > Hello all,
> > >
> > > I would like to start the voting processing on the following KIP, to
> > allow
> > > users control when a task can be processed based on its buffered
> records,
> > > and how the stream time of a task be advanced.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 353%3A+Improve+Kafka+Streams+Timestamp+Synchronization
> > >
> > >
> > >
> > > Thanks,
> > > -- Guozhang
> > >
> >
> >
>


Re: [VOTE] KIP-356: Add withCachingDisabled() to StoreBuilder

2018-08-14 Thread Ted Yu
+1

On Tue, Aug 14, 2018 at 10:42 AM Guozhang Wang  wrote:

> Hello folks,
>
> I'd like to start a voting thread on the following KIP:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-356%3A+Add+withCachingDisabled%28%29+to+StoreBuilder
>
> It is a pretty straightforward one, adding a missing API to StoreBuilder
> which should actually be added at the very beginning but somehow was lost.
> Hence I skipped the DISCUSS process of it. But if you have any feedbacks
> please feel free to share as well.
>
>
>
> -- Guozhang
>


Re: [DISCUSS] KIP-347: Enable batching in FindCoordinatorRequest

2018-08-10 Thread Ted Yu
bq. this is the foundation of some later possible optimizations (enable
batching in describeConsumerGroups ...)

Can you say more about why this change lays the foundation for the future
optimizations ?

You mentioned FIND_COORDINATOR_REQUEST_V3 in the wiki but I don't see it
in the PR. I assume you would add that later.

Please read over your wiki and fix grammatical errors such as the following:

bq. that need to be make

Thanks

On Wed, Aug 8, 2018 at 3:55 PM Yishun Guan  wrote:

> Hi all,
>
> I would like to start a discussion on:
>
> KIP-347: Enable batching in FindCoordinatorRequest
> https://cwiki.apache.org/confluence/x/CgZPBQ
>
> Thanks @Guozhang Wang  for his help and patience!
>
> Thanks,
> Yishun
>


[jira] [Created] (KAFKA-7276) Consider using re2j to speed up regex operations

2018-08-10 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7276:
-

 Summary: Consider using re2j to speed up regex operations
 Key: KAFKA-7276
 URL: https://issues.apache.org/jira/browse/KAFKA-7276
 Project: Kafka
  Issue Type: Task
Reporter: Ted Yu


https://github.com/google/re2j

re2j claims to do linear time regular expression matching in Java.

Its benefit is most obvious for deeply nested regex (such as a | b | c | d).

We should consider using re2j to speed up regex operations.
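As a minimal illustration of how small the migration surface would be: re2j's `com.google.re2j.Pattern` mirrors the `java.util.regex.Pattern` API used below, so for simple call sites only the import changes. The topic pattern here is made up for illustration:

```java
import java.util.regex.Pattern;

public class RegexSketch {
    public static void main(String[] args) {
        // java.util.regex shown here; with re2j the import would be
        // com.google.re2j.Pattern and these calls stay the same, while
        // matching runs in linear time even for pathological patterns.
        Pattern topicPattern = Pattern.compile("metrics-.*|logs-.*|events-.*");
        System.out.println(topicPattern.matcher("logs-broker-0").matches());
        System.out.println(topicPattern.matcher("internal-topic").matches());
    }
}
```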



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-08 Thread Ted Yu
+1

On Wed, Aug 8, 2018 at 4:09 AM Mickael Maison 
wrote:

> +1 (non-binding)
> Thanks Vahid
> On Wed, Aug 8, 2018 at 11:26 AM Kamal Chandraprakash
>  wrote:
> >
> > +1 (non-binding)
> >
> > Thanks for the KIP.
> >
> > On Wed, Aug 8, 2018 at 3:11 PM Stanislav Kozlovski <
> stanis...@confluent.io>
> > wrote:
> >
> > > +1 (non-binding)
> > > Thanks!
> > >
> > > On Tue, Aug 7, 2018 at 11:47 PM Jason Gustafson 
> > > wrote:
> > >
> > > > +1 Thanks Vahid.
> > > >
> > > > On Tue, Aug 7, 2018 at 11:14 AM, Vahid S Hashemian <
> > > > vahidhashem...@us.ibm.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'd like to start a vote on KIP-289 to modify the default group id
> of
> > > > > KafkaConsumer.
> > > > > The KIP:
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer
> > > > > The discussion thread:
> > > > > https://www.mail-archive.com/dev@kafka.apache.org/msg87379.html
> > > > >
> > > > > Thanks!
> > > > > --Vahid
> > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best,
> > > Stanislav
> > >
>


Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Ted Yu
+1

On Tue, Aug 7, 2018 at 5:25 AM Thomas Becker  wrote:

> +1 (non-binding)
>
> We've hit issues with the log cleaner in the past, and this would be a
> great improvement.
> On Tue, 2018-08-07 at 12:19 +0100, Stanislav Kozlovski wrote:
>
> Hey everybody,
>
> I'm starting a vote on KIP-346
>
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error
> >
>
>
>
> 
>
>


Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-03 Thread Ted Yu
Guozhang:
I agree.

Probably note this on your KIP.

Thanks

On Fri, Aug 3, 2018 at 6:08 PM Guozhang Wang  wrote:

> Hello Ted,
>
> I think dynamic configuration itself would worth an independent KIP, if you
> meant to allow users changing the config on-the-fly while not bouncing the
> instance.
>
> Guozhang
>
>
> On Fri, Aug 3, 2018 at 3:33 PM, Ted Yu  wrote:
>
> > Guozhang:
> >
> > Do you plan to support dynamic config for the new config entry ?
> >
> > Cheers
> >
> > On Fri, Aug 3, 2018 at 2:00 PM Guozhang Wang  wrote:
> >
> > > Hello all,
> > >
> > > I would like to kick off a discussion on the following KIP, to allow
> > users
> > > control when a task can be processed based on its buffered records, and
> > how
> > > the stream time of a task be advanced.
> > >
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 353%3A+Improve+Kafka+Streams+Timestamp+Synchronization
> > >
> > > This is related to one of the root causes of out-of-ordering data in
> > Kafka
> > > Streams. Any thoughts / comments on this topic is more than welcomed.
> > >
> > >
> > > Thanks,
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-03 Thread Ted Yu
Guozhang:

Do you plan to support dynamic config for the new config entry ?

Cheers

On Fri, Aug 3, 2018 at 2:00 PM Guozhang Wang  wrote:

> Hello all,
>
> I would like to kick off a discussion on the following KIP, to allow users
> control when a task can be processed based on its buffered records, and how
> the stream time of a task be advanced.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-353%3A+Improve+Kafka+Streams+Timestamp+Synchronization
>
> This is related to one of the root causes of out-of-ordering data in Kafka
> Streams. Any thoughts / comments on this topic is more than welcomed.
>
>
> Thanks,
> -- Guozhang
>


Moving build job Was: Build failed in Jenkins: kafka-trunk-jdk10 #364

2018-08-03 Thread Ted Yu
Should the build job run on nodes with the {{Hadoop}} label ?

Based on recent builds@a.o discussion about how small the disk capacity is
on the general {{ubuntu}} label, I think we should try {{Hadoop}} labeled
machines.

Cheers

On Fri, Aug 3, 2018 at 3:08 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See 
>
> --
> Started by an SCM change
> [EnvInject] - Loading node environment variables.
> Building remotely on H31 (ubuntu xenial) in workspace <
> https://builds.apache.org/job/kafka-trunk-jdk10/ws/>
>  > git rev-parse --is-inside-work-tree # timeout=10
> Fetching changes from the remote Git repository
>  > git config remote.origin.url https://github.com/apache/kafka.git #
> timeout=10
> Fetching upstream changes from https://github.com/apache/kafka.git
>  > git --version # timeout=10
>  > git fetch --tags --progress https://github.com/apache/kafka.git
> +refs/heads/*:refs/remotes/origin/*
> ERROR: Error fetching remote repo 'origin'
> hudson.plugins.git.GitException: Failed to fetch from
> https://github.com/apache/kafka.git
> at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
> at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
> at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
> at hudson.scm.SCM.checkout(SCM.java:504)
> at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
> at
> hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
> at
> jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
> at
> hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
> at hudson.model.Run.execute(Run.java:1794)
> at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
> at
> hudson.model.ResourceController.execute(ResourceController.java:97)
> at hudson.model.Executor.run(Executor.java:429)
> Caused by: hudson.plugins.git.GitException: Command "git fetch --tags
> --progress https://github.com/apache/kafka.git
> +refs/heads/*:refs/remotes/origin/*" returned status code 128:
> stdout:
>
>


Re: [VOTE] KIP-341: Update Sticky Assignor's User Data Protocol

2018-08-03 Thread Ted Yu
+1

On Fri, Aug 3, 2018 at 10:40 AM Jason Gustafson  wrote:

> +1 Thanks Vahid.
>
> On Thu, Aug 2, 2018 at 1:27 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com
> > wrote:
>
> > Hi everyone,
> >
> > I believe the feedback on this KIP has been addressed so far. So I'd like
> > to start a vote.
> > The KIP:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 341%3A+Update+Sticky+Assignor%27s+User+Data+Protocol
> > Discussion thread:
> > https://www.mail-archive.com/dev@kafka.apache.org/msg89733.html
> >
> > Thanks!
> > --Vahid
> >
> >
>


Re: [VOTE] KIP-332: Update AclCommand to use AdminClient API

2018-08-02 Thread Ted Yu
+1

On Thu, Aug 2, 2018 at 7:33 PM Manikumar  wrote:

> Hi All,
>
> I would like to start voting on KIP-332 which allows AclCommand to use
> AdminClient API for acl management.
>
> KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-332%3A+Update+AclCommand+to+use+AdminClient+API
>
> Discussion Thread:
> https://www.mail-archive.com/dev@kafka.apache.org/msg90049.html
>
> Thanks,
> Manikumar
>


Re: [VOTE] KIP-334 Include partitions in exceptions raised during consumer record deserialization/validation

2018-08-02 Thread Ted Yu
+1
-------- Original message --------
From: Stanislav Kozlovski
Date: 8/2/18 2:41 AM (GMT-08:00)
To: dev@kafka.apache.org
Subject: [VOTE] KIP-334 Include partitions in exceptions raised during consumer record deserialization/validation
Hey everybody,

I'd like to start a vote thread for KIP-334 Include partitions in
exceptions raised during consumer record deserialization/validation


-- 
Best,
Stanislav


Re: [DISCUSS] KIP-332: Update AclCommand to use AdminClient API

2018-07-30 Thread Ted Yu
Look good to me.

On Mon, Jul 23, 2018 at 7:30 AM Manikumar  wrote:

> Hi all,
>
> I have created a KIP to use AdminClient API in AclCommand (kafka-acls.sh)
>
> *
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-332%3A+Update+AclCommand+to+use+AdminClient+API*
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-332%3A+Update+AclCommand+to+use+AdminClient+API
> >
>
> Please take a look.
>
> Thanks,
>


Re: [VOTE] KIP-328: Ability to suppress updates for KTables

2018-07-30 Thread Ted Yu
+1

On Mon, Jul 30, 2018 at 11:46 AM John Roesler  wrote:

> Hello devs,
>
> The discussion of KIP-328 has gone some time with no new comments, so I am
> calling for a vote!
>
> Here's the KIP: https://cwiki.apache.org/confluence/x/sQU0BQ
>
> The basic idea is to provide:
> * more usable control over update rate (vs the current state store caches)
> * the final-result-for-windowed-computations feature which several people
> have requested
>
> Thanks,
> -John
>


Re: [Vote] KIP-321: Update TopologyDescription to better represent Source and Sink Nodes

2018-07-25 Thread Ted Yu
+1

On Wed, Jul 25, 2018 at 7:24 PM Nishanth Pradeep 
wrote:

> Hello,
>
> I'm calling a vote for KIP-321:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-321%3A+Update+TopologyDescription+to+better+represent+Source+and+Sink+Nodes
>
> Best,
> Nishanth Pradeep
>


Re: [VOTE] KIP-344: The auto-generated client id should be passed to MetricsReporter

2018-07-25 Thread Ted Yu
+1

On Wed, Jul 25, 2018 at 9:49 AM Kevin Lu  wrote:

> Hi All,
>
> I am calling a vote for KIP-344
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-344%3A+The+auto-generated+client+id+should+be+passed+to+MetricsReporter
> .
>
> If any committer feels that this minor change does not need to go through
> the KIP process, then feel free to leave a code review at
> https://github.com/apache/kafka/pull/5383 and I will cancel the KIP.
>
> Regards,
> Kevin
>


Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-24 Thread Ted Yu
As James pointed out in his reply, topic-partition name can be long.
It is not necessary to repeat the topic name for each of its partitions.
How about the following format:

topic-name1-{partition1, partition2, etc}

That is, topic name only appears once.

Cheers
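A sketch of the grouped format, assuming the uncleanable partitions are available as a map from topic name to partition ids. The helper below is illustrative only, not code from the KIP:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class UncleanablePartitionsFormat {
    // Render each topic name once, followed by its uncleanable partitions,
    // e.g. "topic-name1-{0, 1, 2}", instead of repeating the (possibly long)
    // topic name for every partition.
    public static String format(Map<String, List<Integer>> byTopic) {
        return byTopic.entrySet().stream()
                .map(e -> e.getKey() + "-{" + e.getValue().stream()
                        .map(String::valueOf)
                        .collect(Collectors.joining(", ")) + "}")
                .collect(Collectors.joining("; "));
    }

    public static void main(String[] args) {
        System.out.println(format(Map.of("topic-name1", List.of(0, 1, 2))));
    }
}
```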

On Mon, Jul 23, 2018 at 9:08 PM Stanislav Kozlovski 
wrote:

> Hi Ted,
>
> Yes, absolutely. Thanks for pointing that out!
>
> On Mon, Jul 23, 2018 at 6:12 PM Ted Yu  wrote:
>
> > For `uncleanable-partitions`, should the example include topic name(s) ?
> >
> > Cheers
> >
> > On Mon, Jul 23, 2018 at 5:46 PM Stanislav Kozlovski <
> > stanis...@confluent.io>
> > wrote:
> >
> > > I renamed the KIP and that changed the link. Sorry about that. Here is
> > > the new link:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error
> > >
> > > On Mon, Jul 23, 2018 at 5:11 PM Stanislav Kozlovski <
> > > stanis...@confluent.io>
> > > wrote:
> > >
> > > > Hey group,
> > > >
> > > > I created a new KIP about making log compaction more fault-tolerant.
> > > > Please give it a look here and please share what you think, especially
> > > > in regards to the points in the "Needs Discussion" paragraph.
> > > >
> > > > KIP: KIP-346
> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Limit+blast+radius+of+log+compaction+failure>
> > > >
> > > > --
> > > > Best,
> > > > Stanislav
> > > >
> > >
> > >
> > > --
> > > Best,
> > > Stanislav
> > >
> >
>
>
> --
> Best,
> Stanislav
>


Re: Seeing old tombstones in compacted topic

2018-07-23 Thread Ted Yu
Looking at some recent JIRAs, such as KAFKA-6568, which came in after the
release of 0.11.0

Would that possibly be related to what you observed ?

Cheers

On Mon, Jul 23, 2018 at 6:23 PM Mitch Seymour 
wrote:

> Hi all,
>
> We're using version 0.11.0 of Kafka (broker and client), and our Kafka
> Streams app uses a compacted topic for storing it's state. Here's the
> output of kafka-topics.sh --describe:
>
> Topic:mytopic
> PartitionCount:32
> ReplicationFactor:2
> Configs:retention.ms=43200,cleanup.policy=compact
>
> The app will write tombstones to this topic when it's finished with a
> certain key. I can see the tombstones using kafkacat
>
> kafkacat -q -b ... -t mytopic -o beginning -c 2 -f"Time: %T, Key: %k,
> Message: %s\n" -Z
>
> Output:
>
> Time: 1530559667357, Key: key1, Message: NULL
> Time: 1530559667466, Key: key2, Message: NULL
>
> Note: the -Z flag in kafkacat prints null values as NULL to make it easier
> to see the tombstones. Anyways, the timestamps on these topics are from
> GMT: Monday, July 2, 2018, so I'm not sure why the tombstones still exist
> in this topic.
>
> Furthermore, it looks like compaction is being triggered because I'm seeing
> this in the logs:
>
> discarding tombstones prior to Thu Jul 19 14:15:17 GMT 2018.
>
> However, I still see tombstones in this topic from Jul 2, so it doesn't add
> up. Another side note: I'm not explicitly setting delete.retention.ms, and
> since the default value is 86400000 ms, or 1 day, I'm not too sure why the
> tombstones are sticking around.
>
> Anyways, has anyone experienced this before? I'm not sure if this is a
> known bug, or if there's something peculiar with our own setup. Thanks,
>
> Mitch
>


Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-23 Thread Ted Yu
For `uncleanable-partitions`, should the example include topic name(s) ?

Cheers

On Mon, Jul 23, 2018 at 5:46 PM Stanislav Kozlovski 
wrote:

> I renamed the KIP and that changed the link. Sorry about that. Here is the
> new link:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error
>
> On Mon, Jul 23, 2018 at 5:11 PM Stanislav Kozlovski <
> stanis...@confluent.io>
> wrote:
>
> > Hey group,
> >
> > I created a new KIP about making log compaction more fault-tolerant.
> > Please give it a look here and please share what you think, especially in
> > regards to the points in the "Needs Discussion" paragraph.
> >
> > KIP: KIP-346
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Limit+blast+radius+of+log+compaction+failure
> >
> > --
> > Best,
> > Stanislav
> >
>
>
> --
> Best,
> Stanislav
>


[jira] [Created] (KAFKA-7195) StreamStreamJoinIntegrationTest fails in 2.0 Jenkins

2018-07-23 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7195:
-

 Summary: StreamStreamJoinIntegrationTest fails in 2.0 Jenkins
 Key: KAFKA-7195
 URL: https://issues.apache.org/jira/browse/KAFKA-7195
 Project: Kafka
  Issue Type: Test
Reporter: Ted Yu


From
https://builds.apache.org/job/kafka-2.0-jdk8/87/testReport/junit/org.apache.kafka.streams.integration/StreamStreamJoinIntegrationTest/testOuter_caching_enabled___false_/ :
{code}
java.lang.AssertionError: 
Expected: is <[A-null]>
 but: was <[A-a, A-b, A-c, A-d]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at 
org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.checkResult(AbstractJoinIntegrationTest.java:171)
at 
org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:212)
at 
org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:184)
at 
org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest.testOuter(StreamStreamJoinIntegrationTest.java:198)
{code}
However, some test output was missing:
{code}
[2018-07-23 20:51:36,363] INFO Socket c
...[truncated 1627692 chars]...
671)
{code}
I ran the test locally which passed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-334 Include partitions in exceptions raised during consumer record deserialization/validation

2018-07-20 Thread Ted Yu
Hi, Stanislav:
Do you plan to start VOTE thread ?

Cheers

On Fri, Jul 20, 2018 at 6:11 PM Stanislav Kozlovski 
wrote:

> Hey group,
>
> I added a Pull Request for this KIP - here it is
> https://github.com/apache/kafka/pull/5410
> Please take a look.
>
> Best,
> Stanislav
>
> On Thu, Jul 5, 2018 at 11:06 AM Ismael Juma  wrote:
>
> > Yes, the Scala consumers have been removed in 2.0.0, which simplifies some
> > of this. The following commit was an initial step in unifying the exception
> > handling:
> >
> > https://github.com/apache/kafka/commit/96bcfdfc7c9aac075635b2034e65e412a725672e
> >
> > But more can be done as you mentioned.
> >
> > Ismael
> >
> > On 5 Jul 2018 9:36 am, "Stanislav Kozlovski" 
> > wrote:
> >
> > Hey Ismael,
> >
> > It is only slightly related - my PR would attach two new attributes and
> > also touch upon deserialization exceptions.
> >
> > But this PR did provide me with some insight:
> > Maybe the best approach would be to make `InvalidRecordException` a public
> > exception instead of introducing a new one - I did not realize it was not
> > publicly exposed.
> > Does the following:
> >
> >  InvalidMessageException extends CorruptRecordException for temporary
> > compatibility with the old Scala clients.
> >  * We want to update the server side code to use and catch the new
> > CorruptRecordException.
> >  * Because ByteBufferMessageSet.scala and Message.scala are used in
> > both server and client code having
> >  * InvalidMessageException extend CorruptRecordException allows us to
> > change server code without affecting the client.
> >
> > still apply? I can see that the `ByteBufferMessageSet` and `Message`
> scala
> > classes are not present in the codebase anymore. AFAIA the old scala
> > clients were removed with 2.0.0 and we can thus update the server side
> code
> > to use the `CorruptRecordException` while changing and exposing
> > `InvalidRecordException` to the public. WDYT?
> >
> > I will also make sure to not expose the cause of the exception when not
> > needed, maybe I'll outright remove the `cause` attribute
> >
> >
> > On Thu, Jul 5, 2018 at 4:55 PM Ismael Juma  wrote:
> >
> > > Thanks for the KIP, Stanislav. The following PR looks related:
> > >
> > > https://github.com/apache/kafka/pull/4093/files
> > >
> > > Ismael
> > >
> > > On Thu, Jul 5, 2018 at 8:44 AM Stanislav Kozlovski <
> > stanis...@confluent.io
> > > >
> > > wrote:
> > >
> > > > Hey everybody,
> > > >
> > > > I just created a new KIP about exposing more information in exceptions
> > > > caused by consumer record deserialization/validation. Please have a
> > > > look at it, it is a very short page.
> > > >
> > > > I am working under the assumption that all invalid record or
> > > > deserialization exceptions in the consumer pass through the `Fetcher`
> > > > class. Please confirm if that is true, otherwise I might miss some
> > > > places where the exceptions are raised in my implementation
> > > >
> > > > One concern I have is the name of the second exception -
> > > > `InoperativeRecordException`. I would have named it
> > > > `InvalidRecordException` but that is taken. The `Fetcher` class catches
> > > > `InvalidRecordException` (here
> > > > <https://github.com/apache/kafka/blob/c5b00d20d3703b7fc4358b7258d5d6adb890136f/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1081>
> > > > and here
> > > > <https://github.com/apache/kafka/blob/c5b00d20d3703b7fc4358b7258d5d6adb890136f/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1092>)
> > > > and re-raises it as `KafkaException`, which exposes it as a
> > > > non-retriable exception to the user (`InvalidRecordException` extends
> > > > `RetriableException`, but `KafkaException` doesn't).
> > > > A suggestion I got for an alternative name was
> > > > `InvalidFetchRecordException`. Please chime in if you have ideas
> > > >
> > > > Confluence page: KIP-334
> > > > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=87297793>
> > > > JIRA Issue: KAFKA-5682 <https://issues.apache.org/jira/browse/KAFKA-5682>
> > > >
> > > > --
> > > > Best,
> > > > Stanislav
> > > >
> > >
> >
> >
> > --
> > Best,
> > Stanislav
> >
>
>
> --
> Best,
> Stanislav
>


Re: [DISCUSS] KIP-344: The auto-generated client id should be passed to MetricsReporter

2018-07-20 Thread Ted Yu
Looks good to me.

On Fri, Jul 20, 2018 at 12:17 AM Kevin Lu 
wrote:

> Hi All,
>
> This is the discussion thread for KIP-344 (
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-344%3A+The+auto-generated+client+id+should+be+passed+to+MetricsReporter
> ).
>
> Thanks!
>
> Regards,
> Kevin
>


[jira] [Created] (KAFKA-7175) Make version checking logic more flexible in streams_upgrade_test.py

2018-07-17 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7175:
-

 Summary: Make version checking logic more flexible in 
streams_upgrade_test.py
 Key: KAFKA-7175
 URL: https://issues.apache.org/jira/browse/KAFKA-7175
 Project: Kafka
  Issue Type: Improvement
Reporter: Ted Yu


During debugging of system test failure for KAFKA-5037, it was re-discovered 
that the version numbers inside version probing related messages are hard coded 
in streams_upgrade_test.py

This is in-flexible.

We should correlate latest version from Java class with the expected version 
numbers.

Matthias made the following suggestion:

We should also make this more generic and test upgrades from 3 -> 4, 3 -> 5 and 
4 -> 5. The current code does only go from latest version to future version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7174) Improve version probing of subscription info

2018-07-17 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7174:
-

 Summary: Improve version probing of subscription info
 Key: KAFKA-7174
 URL: https://issues.apache.org/jira/browse/KAFKA-7174
 Project: Kafka
  Issue Type: Improvement
Reporter: Ted Yu


During code review for KAFKA-5037, [~guozhang] made the following suggestion:

Currently the version probing works as the following:

when leader receives the subscription info encoded with a higher version that 
it can understand (e.g. the leader is on version 3, while one of the 
subscription received is encode with version 4), it will send back an empty 
assignment with the assignment encoded with version 3, and also 
latestSupportedVersion set to 3.

when the member receives the assignment, it checks if latestSupportedVersion is 
smaller than the version it used for encoding the sent subscription (i.e. the 
above logic). If it is smaller, then it means that leader cannot understand, in 
this case, version 4. It will then set the flag and then re-subscribe but with 
a down-graded encoding format of version 3.

NOW with PR #5322, we can let leader to clearly communicate this error via the 
error code, and upon receiving the assignment, if the error code is 
VERSION_PROBING, then the member can immediately know what happens, and hence 
can simplify the above logic. 
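The two member-side decision paths described above can be sketched as below. This is a hypothetical illustration only, not the actual StreamsPartitionAssignor code:

```java
public class VersionProbing {
    enum Action { PROCESS_ASSIGNMENT, RESUBSCRIBE_DOWNGRADED }

    // Current behavior: the member infers a downgrade by comparing the
    // leader's latestSupportedVersion with the version it encoded with.
    static Action beforeImprovement(int usedVersion, int latestSupportedVersion) {
        return latestSupportedVersion < usedVersion
                ? Action.RESUBSCRIBE_DOWNGRADED
                : Action.PROCESS_ASSIGNMENT;
    }

    // Proposed behavior: the leader communicates the situation explicitly via
    // a VERSION_PROBING error code, so no inference is needed.
    static Action afterImprovement(boolean versionProbingError) {
        return versionProbingError ? Action.RESUBSCRIBE_DOWNGRADED : Action.PROCESS_ASSIGNMENT;
    }

    public static void main(String[] args) {
        // Member encoded with version 4, leader only understands version 3.
        System.out.println(beforeImprovement(4, 3));
        System.out.println(afterImprovement(true));
    }
}
```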



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-340: Allow kafka-reassign-partitions.sh and kafka-log-dirs.sh to take admin client property file

2018-07-17 Thread Ted Yu
+1

On Tue, Jul 17, 2018 at 1:40 PM Jason Gustafson  wrote:

> +1. This is useful (though the naming inconsistencies in the tools are
> vexing, as always).
>
> -Jason
>
> On Tue, Jul 17, 2018 at 12:24 PM, Dong Lin  wrote:
>
> > Hi all,
> >
> > It seems that there is no further concern with the KIP-304. I guess it is
> > time to start the voting thread.
> >
> > The KIP can be found at
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-340%
> > 3A+Allow+kafka-reassign-partitions.sh+and+kafka-log-dirs.sh+
> > to+take+admin+client+property+file
> > . This KIP provides a way to allow kafka-reassign-partitions.sh and
> > kafka-log-dirs.sh to talk to broker over SSL.
> >
> > Cheers,
> > Dong
> >
>


Re: KIP-327: Add describe all topics API to AdminClient

2018-07-14 Thread Ted Yu
What if broker crashes before all the pages can be returned ?

Cheers

On Sat, Jul 14, 2018 at 1:07 AM Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Why not paginate ? Then one can retrieve as many topics as desired ?
>
> On Sat., 14 Jul. 2018, 4:15 pm Colin McCabe,  wrote:
>
> > Good point.  We should probably have a maximum number of results like
> > 1000 or something.  That can go in the request RPC as well...
> > Cheers,
> > Colin
> >
> > On Fri, Jul 13, 2018, at 18:15, Ted Yu wrote:
> > > bq. describe topics by a regular expression on the server side
> > >
> > > Should caution be taken if the regex doesn't filter ("*") ?
> > >
> > > Cheers
> > >
> > > On Fri, Jul 13, 2018 at 6:02 PM Colin McCabe
> > >  wrote:>
> > > > As Jason wrote, this won't scale as the number of partitions
> > > > increases.> > We already have users who have tens of thousands of
> > topics, or
> > > > more.  If> > you multiply that by 100x over the next few years, you
> > end up with
> > > > this API> > returning full information about millions of topics,
> which
> > clearly
> > > > doesn't> > work.
> > > >
> > > > We discussed this a lot in the original KIP-117 DISCUSS thread
> > > > which added> > the Java AdminClient.  ListTopics and DescribeTopics
> > were
> > > > deliberately kept> > separate because we understood that eventually a
> > single RPC would
> > > > not be> > able to return information about all the topics in the
> > cluster.  So

Re: [DISCUSS]: KIP-339: Create a new ModifyConfigs API

2018-07-13 Thread Ted Yu
I experimented with putting null value into ConcurrentHashMap which led me
to this code:

final V putVal(K key, V value, boolean onlyIfAbsent) {

if (key == null || value == null) throw new NullPointerException();

I agree that getting NPE this way is not user friendly.
Using Java 8's Optional as the value type, the intent of null-value support
would be conveyed to the user in a friendlier manner.
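To make the point concrete, here is a small self-contained sketch (the config keys and values are made up for illustration) contrasting ConcurrentHashMap's null-value rejection with an Optional-valued map:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class NullValueDemo {
    public static void main(String[] args) {
        // ConcurrentHashMap rejects null values up front, per the putVal check above
        Map<String, String> chm = new ConcurrentHashMap<>();
        try {
            chm.put("retention.ms", null);
        } catch (NullPointerException npe) {
            System.out.println("ConcurrentHashMap threw NPE on null value");
        }
        // An Optional-valued map makes the "removal" intent explicit and
        // works with any Map implementation
        Map<String, Optional<String>> changes = new HashMap<>();
        changes.put("retention.ms", Optional.empty());          // remove this config
        changes.put("cleanup.policy", Optional.of("compact"));  // set this config
        changes.forEach((key, value) -> System.out.println(
                key + " -> " + value.map(v -> "set to " + v).orElse("remove")));
    }
}
```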

Cheers

On Fri, Jul 13, 2018 at 5:51 PM Colin McCabe  wrote:

> On Fri, Jul 13, 2018, at 17:45, Ted Yu wrote:
>
> Looking at modifyConfigs API, it doesn't seem that ConcurrentHashMap
> should be used as the underlying parameter type.
>
>
> I agree that there are other types of maps that do support null values.
> However, the fact that some official map implementations from the standard
> library don't support null values makes this a questionable feature to rely
> on.  Imagine being a new user of this API who created a ConcurrentHashMap,
> tried to insert some null values, and passed it to the API.  It would compile,
> but not work.  It would certainly be confusing.
>
> Anyway, to signify that null value is supported, value type can be
> declared as Optional.
>
> FYI
>
>
> Yeah, now that we're on Java 8, Optional could be a good choice here.
>
> best,
> Colin
>
>
> On Fri, Jul 13, 2018 at 5:35 PM Colin McCabe  wrote:
>
>
> Hi Ted,
>
> That’s a fair question.  I think the main reason I didn’t propose that
> originally is that many people find null values in maps confusing.  Also,
> some newer Java maps don’t support null values, such as ConcurrentHashMap.
> I’m curious what others think about this.
>
> Best,
> Colin
>
> On Wed, Jul 11, 2018, at 21:28, Ted Yu wrote:
> > bq. Map changes, Set
> removals,
> >
> > Is it possible to combine the two parameters into one Map where null
> Config
> > value signifies removal of config ?
> > This way, the following wouldn't occur (reducing un-intended config
> > removal):
> >
> > bq. If a configuration key is specified in both *changes* and *removals*
> >
> > *Cheers*
> >
> > On Wed, Jul 11, 2018 at 10:54 AM Colin McCabe 
> wrote:
> >
> > > Hi all,
> > >
> > > Previously, we discussed some issues with alterConfigs here on the
> mailing
> > > list, and eventually came to the conclusion that the RPC as implemented
> > > can't be used for a shell command modifying configurations.
> > >
> > > I wrote up a small KIP to fix the issues with the RPC.  Please take a
> look
> > > at
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+ModifyConfigs+API
> > >
> > > best,
> > > Colin
> > >
>
>
>


Re: KIP-327: Add describe all topics API to AdminClient

2018-07-13 Thread Ted Yu
bq. describe topics by a regular expression on the server side

Should caution be taken if the regex doesn't filter anything out (e.g. ".*")?

Cheers

On Fri, Jul 13, 2018 at 6:02 PM Colin McCabe  wrote:

> As Jason wrote, this won't scale as the number of partitions increases.
> We already have users who have tens of thousands of topics, or more.  If
> you multiply that by 100x over the next few years, you end up with this API
> returning full information about millions of topics, which clearly doesn't
> work.
>
> We discussed this a lot in the original KIP-117 DISCUSS thread which added
> the Java AdminClient.  ListTopics and DescribeTopics were deliberately kept
> separate because we understood that eventually a single RPC would not be
> able to return information about all the topics in the cluster.  So I have
> to vote -1 for this proposal as it stands.
>
> I do agree that adding a way to describe topics by a regular expression on
> the server side would be very useful.  This would also fix a major
> scalability problem we have now, which is that when subscribing via a
> regular expression, clients need to fetch the full list of all topics in
> the cluster and filter locally.
>
> I think a regular expression library like re2 would be ideal for this
> purpose.  re2 is standardized and language-agnostic (it's not tied only to
> Java).  In contrast, Java regular expressions change with different releases
> of the JDK (there were some changes in java 8, for example).  Also, re2
> regular expressions are linear time, never exponential time.  See
> https://github.com/google/re2j
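The server-side filtering Colin describes could be sketched as below. re2j mirrors the java.util.regex API, so this example uses java.util.regex.Pattern to stay self-contained; swapping the import for com.google.re2j.Pattern would give the linear-time guarantee. The topic names and the expression are made-up examples, not part of any proposed API.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TopicFilterDemo {
    public static void main(String[] args) {
        // Server-side filtering sketch: return only topics matching the
        // caller-supplied expression instead of the full topic list
        Pattern pattern = Pattern.compile("orders\\..*");
        List<String> allTopics = List.of("orders.eu", "orders.us", "payments.eu");
        List<String> matched = allTopics.stream()
                .filter(t -> pattern.matcher(t).matches())
                .collect(Collectors.toList());
        System.out.println(matched);  // [orders.eu, orders.us]
    }
}
```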
>
> regards,
> Colin
>
>
> On Fri, Jul 13, 2018, at 05:00, Andras Beni wrote:
> > The KIP looks good to me.
> > However, if there is willingness in the community to work on metadata
> > request with patterns, the feature proposed here and filtering by '*' or
> > '.*' would be redundant.
> >
> > Andras
> >
> >
> >
> > On Fri, Jul 13, 2018 at 12:38 AM Jason Gustafson 
> wrote:
> >
> > > Hey Manikumar,
> > >
> > > As Kafka begins to scale to larger and larger numbers of
> topics/partitions,
> > > I'm a little concerned about the scalability of APIs such as this. The
> API
> > > looks benign, but imagine you have a few million partitions. We
> > > already expose similar APIs in the producer and consumer, so probably
> not
> > > much additional harm to expose it in the AdminClient, but it would be
> nice
> > > to put a little thought into some longer term options. We should be
> giving
> > > users an efficient way to select a smaller set of the topics they are
> > > interested in. We have always discussed adding some filtering support
> to
> > > the Metadata API. Perhaps now is a good time to reconsider this? We now
> > > have a convention for wildcard ACLs, so perhaps we can do something
> > > similar. Full regex support might be ideal given the consumer's
> > > subscription API, but that is more challenging. What do you think?
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Thu, Jul 12, 2018 at 2:35 PM, Harsha  wrote:
> > >
> > > > Very useful. LGTM.
> > > >
> > > > Thanks,
> > > > Harsha
> > > >
> > > > On Thu, Jul 12, 2018, at 9:56 AM, Manikumar wrote:
> > > > > Hi all,
> > > > >
> > > > > I have created a KIP to add describe all topics API to AdminClient
> .
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 327%3A+Add+describe+all+topics+API+to+AdminClient
> > > > >
> > > > > Please take a look.
> > > > >
> > > > > Thanks,
> > > >
> > >
>


Re: [DISCUSS]: KIP-339: Create a new ModifyConfigs API

2018-07-13 Thread Ted Yu
Looking at modifyConfigs API, it doesn't seem that ConcurrentHashMap should
be used as the underlying parameter type.

Anyway, to signify that null value is supported, value type can be declared
as Optional.

FYI

On Fri, Jul 13, 2018 at 5:35 PM Colin McCabe  wrote:

> Hi Ted,
>
> That’s a fair question.  I think the main reason I didn’t propose that
> originally is that many people find null values in maps confusing.  Also,
> some newer Java maps don’t support null values, such as ConcurrentHashMap.
> I’m curious what others think about this.
>
> Best,
> Colin
>
> On Wed, Jul 11, 2018, at 21:28, Ted Yu wrote:
> > bq. Map changes, Set
> removals,
> >
> > Is it possible to combine the two parameters into one Map where null
> Config
> > value signifies removal of config ?
> > This way, the following wouldn't occur (reducing un-intended config
> > removal):
> >
> > bq. If a configuration key is specified in both *changes* and *removals*
> >
> > *Cheers*
> >
> > On Wed, Jul 11, 2018 at 10:54 AM Colin McCabe 
> wrote:
> >
> > > Hi all,
> > >
> > > Previously, we discussed some issues with alterConfigs here on the
> mailing
> > > list, and eventually came to the conclusion that the RPC as implemented
> > > can't be used for a shell command modifying configurations.
> > >
> > > I wrote up a small KIP to fix the issues with the RPC.  Please take a
> look
> > > at
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+ModifyConfigs+API
> > >
> > > best,
> > > Colin
> > >
>
>


Re: [DISCUSS]: KIP-339: Create a new ModifyConfigs API

2018-07-11 Thread Ted Yu
bq. Map changes, Set removals,

Is it possible to combine the two parameters into one Map where null Config
value signifies removal of config ?
This way, the following wouldn't occur (reducing un-intended config
removal):

bq. If a configuration key is specified in both *changes* and *removals*

*Cheers*

On Wed, Jul 11, 2018 at 10:54 AM Colin McCabe  wrote:

> Hi all,
>
> Previously, we discussed some issues with alterConfigs here on the mailing
> list, and eventually came to the conclusion that the RPC as implemented
> can't be used for a shell command modifying configurations.
>
> I wrote up a small KIP to fix the issues with the RPC.  Please take a look
> at
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+ModifyConfigs+API
>
> best,
> Colin
>


Re: [DISCUSS] KIP-340: Allow kafka-reassign-partitions.sh and kafka-log-dirs.sh to take admin client property file

2018-07-11 Thread Ted Yu
bq. the same approach used by "--config-file" in ConfigCommand.

I should have copied more from the KIP.
What I meant was that ConfigCommand doesn't use the "--config-file" option, so
'same approach' refers to the StreamsResetter class, not ConfigCommand.

I didn't mean to change ConfigCommand w.r.t. name of the option.

Cheers

On Wed, Jul 11, 2018 at 8:06 PM Dong Lin  wrote:

> Do you mean we should replace "--command-config" with "--config-file" in
> ConfigCommand? There is backward compatibility concern with the change. I
> am not sure the benefit of this change is worth the effort to deprecate the
> old config. Maybe we should do it separately if more people thing it is
> necessary?
>
> On Wed, Jul 11, 2018 at 8:01 PM, Ted Yu  wrote:
>
> > bq. "--config-file" in ConfigCommand.
> >
> > Please update the above - it should be StreamsResetter
> >
> >
> > On Wed, Jul 11, 2018 at 7:59 PM Dong Lin  wrote:
> >
> > > Hey Ted,
> > >
> > > Thanks much for the suggestion. Yeah "config-file" looks better than
> > > "command-config". I have updated the KIP as suggested.
> > >
> > > Thanks,
> > > Dong
> > >
> > > On Wed, Jul 11, 2018 at 5:57 PM, Ted Yu  wrote:
> > >
> > > > Looking at StreamsResetter.java :
> > > >
> > > >commandConfigOption = optionParser.accepts("config-file",
> > > "Property
> > > > file containing configs to be passed to admin cl
> > > >
> > > > Not sure you have considered naming the option in the above fashion.
> > > >
> > > > Probably add the above to Alternative section.
> > > >
> > > > Cheers
> > > >
> > > > On Wed, Jul 11, 2018 at 2:04 PM Dong Lin 
> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I have created KIP-340: Allow kafka-reassign-partitions.sh and
> > > > > kafka-log-dirs.sh to take admin client property file. See
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 340%3A+Allow+kafka-reassign-partitions.sh+and+kafka-log-
> > > > dirs.sh+to+take+admin+client+property+file
> > > > > .
> > > > >
> > > > > This KIP provides a way to allow kafka-reassign-partitions.sh and
> > > > > kafka-log-dirs.sh to talk to broker over SSL. Please review the KIP
> > if
> > > > you
> > > > > have time.
> > > > >
> > > > >
> > > > > Thanks!
> > > > > Dong
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-340: Allow kafka-reassign-partitions.sh and kafka-log-dirs.sh to take admin client property file

2018-07-11 Thread Ted Yu
bq. "--config-file" in ConfigCommand.

Please update the above - it should be StreamsResetter


On Wed, Jul 11, 2018 at 7:59 PM Dong Lin  wrote:

> Hey Ted,
>
> Thanks much for the suggestion. Yeah "config-file" looks better than
> "command-config". I have updated the KIP as suggested.
>
> Thanks,
> Dong
>
> On Wed, Jul 11, 2018 at 5:57 PM, Ted Yu  wrote:
>
> > Looking at StreamsResetter.java :
> >
> >commandConfigOption = optionParser.accepts("config-file",
> "Property
> > file containing configs to be passed to admin cl
> >
> > Not sure you have considered naming the option in the above fashion.
> >
> > Probably add the above to Alternative section.
> >
> > Cheers
> >
> > On Wed, Jul 11, 2018 at 2:04 PM Dong Lin  wrote:
> >
> > > Hi all,
> > >
> > > I have created KIP-340: Allow kafka-reassign-partitions.sh and
> > > kafka-log-dirs.sh to take admin client property file. See
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 340%3A+Allow+kafka-reassign-partitions.sh+and+kafka-log-
> > dirs.sh+to+take+admin+client+property+file
> > > .
> > >
> > > This KIP provides a way to allow kafka-reassign-partitions.sh and
> > > kafka-log-dirs.sh to talk to broker over SSL. Please review the KIP if
> > you
> > > have time.
> > >
> > >
> > > Thanks!
> > > Dong
> > >
> >
>


Re: [DISCUSS] KIP-340: Allow kafka-reassign-partitions.sh and kafka-log-dirs.sh to take admin client property file

2018-07-11 Thread Ted Yu
Looking at StreamsResetter.java :

   commandConfigOption = optionParser.accepts("config-file", "Property
file containing configs to be passed to admin cl

Not sure you have considered naming the option in the above fashion.

Probably add the above to Alternative section.

Cheers

On Wed, Jul 11, 2018 at 2:04 PM Dong Lin  wrote:

> Hi all,
>
> I have created KIP-340: Allow kafka-reassign-partitions.sh and
> kafka-log-dirs.sh to take admin client property file. See
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-340%3A+Allow+kafka-reassign-partitions.sh+and+kafka-log-dirs.sh+to+take+admin+client+property+file
> .
>
> This KIP provides a way to allow kafka-reassign-partitions.sh and
> kafka-log-dirs.sh to talk to broker over SSL. Please review the KIP if you
> have time.
>
>
> Thanks!
> Dong
>
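The property file these tools would take follows the standard java.util.Properties format. The sketch below shows what loading it might look like; the keys are standard client SSL configs, but the values and the in-memory string (used in place of a real file so the example is self-contained) are made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class LoadAdminProps {
    public static void main(String[] args) throws IOException {
        // Stand-in for the file passed via the tool's property-file option
        String fileContents =
                "security.protocol=SSL\n" +
                "ssl.truststore.location=/etc/kafka/client.truststore.jks\n" +
                "ssl.truststore.password=changeit\n";
        Properties props = new Properties();
        props.load(new StringReader(fileContents));
        // These properties would then be handed to the admin client
        System.out.println(props.getProperty("security.protocol"));  // SSL
    }
}
```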


Re: [VOTE] 2.0.0 RC2

2018-07-10 Thread Ted Yu
+1

Ran thru test suite.

Checked signatures.

On Tue, Jul 10, 2018 at 10:17 AM Rajini Sivaram 
wrote:

> Hello Kafka users, developers and client-developers,
>
>
> This is the third candidate for release of Apache Kafka 2.0.0.
>
>
> This is a major version release of Apache Kafka. It includes 40 new KIPs
> and
>
> several critical bug fixes. Please see the 2.0.0 release plan for more
> details:
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
>
>
> A few notable highlights:
>
>- Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for CreateTopics
>(KIP-277)
>- SASL/OAUTHBEARER implementation (KIP-255)
>- Improved quota communication and customization of quotas (KIP-219,
>KIP-257)
>- Efficient memory usage for down conversion (KIP-283)
>- Fix log divergence between leader and follower during fast leader
>failover (KIP-279)
>- Drop support for Java 7 and remove deprecated code including old scala
>clients
>- Connect REST extension plugin, support for externalizing secrets and
>improved error handling (KIP-285, KIP-297, KIP-298 etc.)
>- Scala API for Kafka Streams and other Streams API improvements
>(KIP-270, KIP-150, KIP-245, KIP-251 etc.)
>
>
> Release notes for the 2.0.0 release:
>
> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/RELEASE_NOTES.html
>
>
> *** Please download, test and vote by Friday, July 13, 4pm PT
>
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
>
> http://kafka.apache.org/KEYS
>
>
> * Release artifacts to be voted upon (source and binary):
>
> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/
>
>
> * Maven artifacts to be voted upon:
>
> https://repository.apache.org/content/groups/staging/
>
>
> * Javadoc:
>
> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/javadoc/
>
>
> * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
>
> https://github.com/apache/kafka/tree/2.0.0-rc2
>
>
>
> * Documentation:
>
> http://kafka.apache.org/20/documentation.html
>
>
> * Protocol:
>
> http://kafka.apache.org/20/protocol.html
>
>
> * Successful Jenkins builds for the 2.0 branch:
>
> Unit/integration tests: https://builds.apache.org/job/kafka-2.0-jdk8/72/
>
> System tests:
> https://jenkins.confluent.io/job/system-test-kafka/job/2.0/27/
>
>
> /**
>
>
> Thanks,
>
>
> Rajini
>


Re: [VOTE] 1.1.1 RC3

2018-07-09 Thread Ted Yu
+1

Ran test suite.

Checked signatures.

On Sun, Jul 8, 2018 at 3:36 PM Dong Lin  wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the fourth candidate for release of Apache Kafka 1.1.1.
>
> Apache Kafka 1.1.1 is a bug-fix release for the 1.1 branch that was first
> released with 1.1.0 about 3 months ago. We have fixed about 25 issues since
> that release. A few of the more significant fixes include:
>
> KAFKA-6925  - Fix memory
> leak in StreamsMetricsThreadImpl
> KAFKA-6937  - In-sync
> replica delayed during fetch if replica throttle is exceeded
> KAFKA-6917  - Process
> txn
> completion asynchronously to avoid deadlock
> KAFKA-6893  - Create
> processors before starting acceptor to avoid ArithmeticException
> KAFKA-6870  -
> Fix ConcurrentModificationException in SampledStat
> KAFKA-6878  - Fix
> NullPointerException when querying global state store
> KAFKA-6879  - Invoke
> session init callbacks outside lock to avoid Controller deadlock
> KAFKA-6857  - Prevent
> follower from truncating to the wrong offset if undefined leader epoch is
> requested
> KAFKA-6854  - Log
> cleaner
> fails with transaction markers that are deleted during clean
> KAFKA-6747  - Check
> whether there is in-flight transaction before aborting transaction
> KAFKA-6748  - Double
> check before scheduling a new task after the punctuate call
> KAFKA-6739  -
> Fix IllegalArgumentException when down-converting from V2 to V0/V1
> KAFKA-6728  -
> Fix NullPointerException when instantiating the HeaderConverter
>
> Kafka 1.1.1 release plan:
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.1.1
>
> Release notes for the 1.1.1 release:
> http://home.apache.org/~lindong/kafka-1.1.1-rc3/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, July 12, 12pm PT ***
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~lindong/kafka-1.1.1-rc3/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~lindong/kafka-1.1.1-rc3/javadoc/
>
> * Tag to be voted upon (off 1.1 branch) is the 1.1.1-rc3 tag:
> https://github.com/apache/kafka/tree/1.1.1-rc3
>
> * Documentation:
> http://kafka.apache.org/11/documentation.html
>
> * Protocol:
> http://kafka.apache.org/11/protocol.html
>
> * Successful Jenkins builds for the 1.1 branch:
> Unit/integration tests: https://builds.apache.org/job/kafka-1.1-jdk7/162/
> System tests:
> https://jenkins.confluent.io/job/system-test-kafka/job/1.1/156/
>
> Please test and verify the release artifacts and submit a vote for this RC,
> or report any issues so we can fix them and get a new RC out ASAP. Although
> this release vote requires PMC votes to pass, testing, votes, and bug
> reports are valuable and appreciated from everyone.
>
>
> Regards,
> Dong
>


[jira] [Resolved] (KAFKA-6335) SimpleAclAuthorizerTest#testHighConcurrencyModificationOfResourceAcls fails intermittently

2018-07-08 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved KAFKA-6335.
---
Resolution: Cannot Reproduce

> SimpleAclAuthorizerTest#testHighConcurrencyModificationOfResourceAcls fails 
> intermittently
> --
>
> Key: KAFKA-6335
> URL: https://issues.apache.org/jira/browse/KAFKA-6335
> Project: Kafka
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Manikumar
>Priority: Major
>
> From 
> https://builds.apache.org/job/kafka-pr-jdk9-scala2.12/3045/testReport/junit/kafka.security.auth/SimpleAclAuthorizerTest/testHighConcurrencyModificationOfResourceAcls/
>  :
> {code}
> java.lang.AssertionError: expected acls Set(User:36 has Allow permission for 
> operations: Read from hosts: *, User:7 has Allow permission for operations: 
> Read from hosts: *, User:21 has Allow permission for operations: Read from 
> hosts: *, User:39 has Allow permission for operations: Read from hosts: *, 
> User:43 has Allow permission for operations: Read from hosts: *, User:3 has 
> Allow permission for operations: Read from hosts: *, User:35 has Allow 
> permission for operations: Read from hosts: *, User:15 has Allow permission 
> for operations: Read from hosts: *, User:16 has Allow permission for 
> operations: Read from hosts: *, User:22 has Allow permission for operations: 
> Read from hosts: *, User:26 has Allow permission for operations: Read from 
> hosts: *, User:11 has Allow permission for operations: Read from hosts: *, 
> User:38 has Allow permission for operations: Read from hosts: *, User:8 has 
> Allow permission for operations: Read from hosts: *, User:28 has Allow 
> permission for operations: Read from hosts: *, User:32 has Allow permission 
> for operations: Read from hosts: *, User:25 has Allow permission for 
> operations: Read from hosts: *, User:41 has Allow permission for operations: 
> Read from hosts: *, User:44 has Allow permission for operations: Read from 
> hosts: *, User:48 has Allow permission for operations: Read from hosts: *, 
> User:2 has Allow permission for operations: Read from hosts: *, User:9 has 
> Allow permission for operations: Read from hosts: *, User:14 has Allow 
> permission for operations: Read from hosts: *, User:46 has Allow permission 
> for operations: Read from hosts: *, User:13 has Allow permission for 
> operations: Read from hosts: *, User:5 has Allow permission for operations: 
> Read from hosts: *, User:29 has Allow permission for operations: Read from 
> hosts: *, User:45 has Allow permission for operations: Read from hosts: *, 
> User:6 has Allow permission for operations: Read from hosts: *, User:37 has 
> Allow permission for operations: Read from hosts: *, User:23 has Allow 
> permission for operations: Read from hosts: *, User:19 has Allow permission 
> for operations: Read from hosts: *, User:24 has Allow permission for 
> operations: Read from hosts: *, User:17 has Allow permission for operations: 
> Read from hosts: *, User:34 has Allow permission for operations: Read from 
> hosts: *, User:12 has Allow permission for operations: Read from hosts: *, 
> User:42 has Allow permission for operations: Read from hosts: *, User:4 has 
> Allow permission for operations: Read from hosts: *, User:47 has Allow 
> permission for operations: Read from hosts: *, User:18 has Allow permission 
> for operations: Read from hosts: *, User:31 has Allow permission for 
> operations: Read from hosts: *, User:49 has Allow permission for operations: 
> Read from hosts: *, User:33 has Allow permission for operations: Read from 
> hosts: *, User:1 has Allow permission for operations: Read from hosts: *, 
> User:27 has Allow permission for operations: Read from hosts: *) but got 
> Set(User:36 has Allow permission for operations: Read from hosts: *, User:7 
> has Allow permission for operations: Read from hosts: *, User:21 has Allow 
> permission for operations: Read from hosts: *, User:39 has Allow permission 
> for operations: Read from hosts: *, User:43 has Allow permission for 
> operations: Read from hosts: *, User:3 has Allow permission for operations: 
> Read from hosts: *, User:35 has Allow permission for operations: Read from 
> hosts: *, User:15 has Allow permission for operations: Read from hosts: *, 
> User:16 has Allow permission for operations: Read from hosts: *, User:22 has 
> Allow permission for operations: Read from hosts: *, User:26 has Allow 
> permission for operations: Read from hosts: *, User:11 has Allow permission 
> for operations: Read from hosts: *, User:38 has Allow permission f

Re: [VOTE] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-07-03 Thread Ted Yu
Hi,
I don't seem to find a response to John's comment:

http://search-hadoop.com/m/Kafka/uyzND11alrn1G5N3Y1?subj=Re+Discuss+KIP+321+Add+method+to+get+TopicNameExtractor+in+TopologyDescription

On Tue, Jul 3, 2018 at 7:38 PM, Nishanth Pradeep 
wrote:

> Hello,
>
> I would like to start the vote on extending the TopologyDescription.Sink
> interface to return the class of the TopicNameExtractor in cases where
> dynamic routing is used.
>
> The user can override the toString method of the TopicNameExtractor class
> in order to provide a better textual description if he or she chooses.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
>
> Best,
> Nishanth Pradeep
>
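The toString() override described in the proposal could look like the sketch below. The single-method interface is a simplified stand-in for Kafka Streams' TopicNameExtractor (the real interface also receives a record context), inlined here so the example runs on its own; the class and topic names are hypothetical.

```java
// Simplified stand-in for the Kafka Streams TopicNameExtractor interface
interface SimpleTopicNameExtractor<K, V> {
    String extract(K key, V value);
}

public class OrderRoutingExtractor implements SimpleTopicNameExtractor<String, String> {
    @Override
    public String extract(String key, String value) {
        // Route records dynamically based on the payload
        return value.contains("priority") ? "orders-priority" : "orders-normal";
    }

    @Override
    public String toString() {
        // What TopologyDescription.Sink could print instead of ClassName@hashcode
        return "OrderRoutingExtractor[orders-priority | orders-normal]";
    }

    public static void main(String[] args) {
        OrderRoutingExtractor extractor = new OrderRoutingExtractor();
        System.out.println(extractor.extract("k1", "priority order"));  // orders-priority
        System.out.println(extractor);
    }
}
```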


Re: [VOTE] KIP-291: Have separate queues for control requests and data requests

2018-07-03 Thread Ted Yu
For #1, I agree that obtaining good default is not trivial. We can revisit
in the future.

For #2, the table is readable.

Thanks

On Tue, Jul 3, 2018 at 4:23 PM, Lucas Wang  wrote:

> @Ted
> For #1, it's probably hard to predict M since it also depends on the
> hardware.
> I'm not sure how to use the suggested formula for the default value if we
> don't know M.
> Also TO is the default timeout we want to figure out, and the formula seems
> to be recursive.
> I'd suggest we stay with the current default value of 300 milliseconds, and
> address it separately
> if it turns out to be a problem. What do you think?
>
> #2, please try this link and see if it works now:
> https://drive.google.com/file/d/1QbPDqfT59A2X4To2p3OfD5YeJR8aW
> DK7/view?usp=sharing
>
> Regards,
> Lucas
>
>
> On Mon, Jul 2, 2018 at 5:52 PM, Ted Yu  wrote:
>
> > For #1, I don't know what would be good approximation for M.
> > Maybe use max((TO / 2) / N, M / N) as default value for poll timeout ?
> >
> > For #2, I don't see the picture in email :-)
> > Can you use third party website ?
> >
> > Thanks
> >
> > On Mon, Jul 2, 2018 at 5:17 PM, Lucas Wang 
> wrote:
> >
> > > Hi Ted,
> > >
> > > 1. I'm neutral on making the poll timeout parameter configurable.
> > > Mainly because as a config, it could be confusing for operators who try
> > to
> > > choose a value for it.
> > >
> > > To understand the implication of this value better,
> > > let's use TO to represent the timeout value under discussion,
> > > M to denote the processing time of data requests,
> > > and N to be the number of io threads.
> > >
> > > - If the data request queue is empty and there is no incoming data
> > > requests,
> > >   all io threads should be blocked on the data request queue, and
> > >   the average delay for a controller request is (TO / 2) / N, and the
> > > worst case delay is TO.
> > > - If all IO threads are busy processing data requests, then the average
> > > latency for a controller request is M / N.
> > > - In the worst case, a controller request can just miss the train, and
> IO
> > > threads get blocked on data request queue
> > >   for TO, at the end of which they all receive a new incoming data
> > > request, the latency for the
> > >   controller request can be TO + M.
> > >
> > > Given the intricacies, what do you think about choosing a relatively
> > > meaningful value and stick with it,
> > > rather than exposing it as a config?
> > >
> > > 2. Sorry for losing the format of the table, I've attached it below as
> a
> > > picture
> > >
> > >
> > > Regards,
> > > Lucas
> > >
> > > On Fri, Jun 29, 2018 at 5:28 PM, Ted Yu  wrote:
> > >
> > >> bq. which is hard coded to be 300 milliseconds
> > >>
> > >> Have you considered making the duration configurable ?
> > >>
> > >> The comparison at the end of your email seems to be copied where
> tabular
> > >> form is lost.
> > >> Do you mind posting that part again ?
> > >>
> > >> Thanks
> > >>
> > >> On Fri, Jun 29, 2018 at 4:53 PM, Lucas Wang 
> > >> wrote:
> > >>
> > >> > Hi Jun,
> > >> >
> > >> > Thanks for your comments.
> > >> > 1. I just replied in the discussion thread about the positive change
> > >> this
> > >> > KIP can still bring
> > >> > if implemented on the latest trunk, which includes the async ZK
> > >> operations
> > >> > for KAFKA-5642.
> > >> > The evaluation is done using an integration test.
> > >> > In production, we have not upgraded to Kafka 1.1 yet, and the code
> we
> > >> are
> > >> > currently running does
> > >> > not include async ZK operations, therefore I don't have any real
> usage
> > >> > result.
> > >> >
> > >> > 2. Thanks for bringing this up. I haven't considered this setting,
> and
> > >> the
> > >> > existing proposal in this KIP
> > >> > would make data requests and controller requests share a memory poll
> > of
> > >> > size specified by the config
> > >> > queued.max.request.bytes. The downside is that if there is memory
> > >> pressure,
> > >> > controller requests ma

Re: [VOTE] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-07-03 Thread Ted Yu
+1

On Tue, Jul 3, 2018 at 9:05 AM, Mickael Maison 
wrote:

> +1 (non binding)
> Thanks for the KIP
>
> On Tue, Jul 3, 2018 at 4:59 PM, Vahid S Hashemian
>  wrote:
> > +1 (non-binding)
> >
> > --Vahid
> >
> >
> >
> > From:   Gwen Shapira 
> > To: dev 
> > Date:   07/03/2018 08:49 AM
> > Subject:Re: [VOTE] KIP-322: Return new error code for
> DeleteTopics
> > API when topic deletion disabled.
> >
> >
> >
> > +1
> >
> > On Tue, Jul 3, 2018 at 8:24 AM, Manikumar 
> > wrote:
> >
> >> Manikumar 
> >> Fri, Jun 29, 7:59 PM (4 days ago)
> >> to dev
> >> Hi All,
> >>
> >> I would like to start voting on KIP-322 which would return new error
> > code
> >> for DeleteTopics API when topic deletion disabled.
> >>
> >>
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=87295558
> >
> >>
> >> Thanks,
> >>
> >
> >
> >
> > --
> > *Gwen Shapira*
> > Product Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter <
> > https://twitter.com/ConfluentInc
> >> | blog
> > <
> > http://www.confluent.io/blog
> >>
> >
> >
> >
> >
>


Re: [VOTE] KIP-291: Have separate queues for control requests and data requests

2018-07-02 Thread Ted Yu
For #1, I don't know what would be good approximation for M.
Maybe use max((TO / 2) / N, M / N) as default value for poll timeout ?

For #2, I don't see the picture in email :-)
Can you use third party website ?

Thanks

On Mon, Jul 2, 2018 at 5:17 PM, Lucas Wang  wrote:

> Hi Ted,
>
> 1. I'm neutral on making the poll timeout parameter configurable.
> Mainly because as a config, it could be confusing for operators who try to
> choose a value for it.
>
> To understand the implication of this value better,
> let's use TO to represent the timeout value under discussion,
> M to denote the processing time of data requests,
> and N to be the number of io threads.
>
> - If the data request queue is empty and there is no incoming data
> requests,
>   all io threads should be blocked on the data request queue, and
>   the average delay for a controller request is (TO / 2) / N, and the
> worst case delay is TO.
> - If all IO threads are busy processing data requests, then the average
> latency for a controller request is M / N.
> - In the worst case, a controller request can just miss the train, and IO
> threads get blocked on data request queue
>   for TO, at the end of which they all receive a new incoming data
> request, the latency for the
>   controller request can be TO + M.
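For concreteness, the three cases above can be plugged into a quick back-of-the-envelope script. The values chosen for TO, M, and N below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope check of the delay analysis.
# TO = poll timeout, M = data request processing time, N = io threads.

def idle_avg_delay_ms(to_ms, n):
    # Empty data queue, all N io threads blocked on it:
    # average controller delay is (TO / 2) / N, worst case TO.
    return (to_ms / 2) / n

def busy_avg_delay_ms(m_ms, n):
    # All N io threads busy with data requests: average delay is M / N.
    return m_ms / n

def worst_case_delay_ms(to_ms, m_ms):
    # A controller request "just misses the train": threads block for TO,
    # then each picks up a fresh data request of duration M.
    return to_ms + m_ms

# Illustrative (assumed) values: TO = 300 ms, M = 10 ms, N = 8 io threads.
print(idle_avg_delay_ms(300, 8))     # 18.75 ms average when idle
print(busy_avg_delay_ms(10, 8))      # 1.25 ms average when busy
print(worst_case_delay_ms(300, 10))  # 310 ms in the worst case
```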
>
> Given the intricacies, what do you think about choosing a relatively
> meaningful value and sticking with it,
> rather than exposing it as a config?
>
> 2. Sorry for losing the format of the table, I've attached it below as a
> picture
>
>
> Regards,
> Lucas
>
> On Fri, Jun 29, 2018 at 5:28 PM, Ted Yu  wrote:
>
>> bq. which is hard coded to be 300 milliseconds
>>
>> Have you considered making the duration configurable ?
>>
>> The comparison at the end of your email seems to be copied where tabular
>> form is lost.
>> Do you mind posting that part again ?
>>
>> Thanks
>>
>> On Fri, Jun 29, 2018 at 4:53 PM, Lucas Wang 
>> wrote:
>>
>> > Hi Jun,
>> >
>> > Thanks for your comments.
>> > 1. I just replied in the discussion thread about the positive change
>> this
>> > KIP can still bring
>> > if implemented on the latest trunk, which includes the async ZK
>> operations
>> > for KAFKA-5642.
>> > The evaluation is done using an integration test.
>> > In production, we have not upgraded to Kafka 1.1 yet, and the code we
>> are
>> > currently running does
>> > not include async ZK operations, therefore I don't have any real usage
>> > result.
>> >
>> > 2. Thanks for bringing this up. I haven't considered this setting, and
>> the
>> > existing proposal in this KIP
> would make data requests and controller requests share a memory pool of
>> > size specified by the config
>> > queued.max.request.bytes. The downside is that if there is memory
>> pressure,
>> > controller requests may be blocked
> from being read from a socket and do not get prioritized at the socket
>> > layer.
>> >
>> > If we have a separate bytes limit for the controller requests, I imagine
>> > there would be a separate memory pool
>> > dedicated to controller requests. Also it requires the processors to
>> tell
>> > connections from a controller apart
>> > from connections from other brokers or clients, which would probably
>> > require a dedicated port for the controller?
>> > IMO, this change is mainly driven by the memory pressure, kind of an
>> > orthogonal issue, and we can address it with a separate KIP
>> > if desired. Please let me know what you think.
>> >
> >> > 3. I plan to change the implementation of the method
>> > receiveRequest(timeout: Long) in the RequestChannel class as follows:
>> >
>> > val controllerRequest = controllerRequestQueue.poll()
>> > if (controllerRequest != null) {
>> >   controllerRequest
>> > } else {
>> >   dataRequestQueue.poll(timeout, TimeUnit.MILLISECONDS)
>> > }
>> >
>> > with this implementation, there is no need to explicitly choose a
>> request
>> > handler thread to wake up depending on
>> > the types of request enqueued, and if a controller request arrives while
>> > some request handler threads are blocked on an empty data request queue,
>> > they will simply timeout and call the receiveRequest method again.
>> >
>> > In terms of performance, it means that in the worst case, for a
>> controller
> >> > request that just missed the receiveRequest call, it can be delayed
> >> > for as long as the timeout parameter, which is hard coded to be 300
> >> > milliseconds.

[jira] [Resolved] (KAFKA-7124) Number of AnyLogDir should match the length of the replicas list

2018-07-01 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved KAFKA-7124.
---
Resolution: Not A Problem

> Number of AnyLogDir should match the length of the replicas list
> 
>
> Key: KAFKA-7124
> URL: https://issues.apache.org/jira/browse/KAFKA-7124
> Project: Kafka
>  Issue Type: Bug
>    Reporter: Ted Yu
>Priority: Major
>
> See discussion under 'Partitions reassignment is failing in Kafka 1.1.0' 
> thread reported by Debraj Manna.
> Here is snippet from generated json file:
> {code}
> {"topic": "Topic3", "partition": 7, "log_dirs": ["any"], "replicas": [3, 0, 
> 2]}
> {code}
> Code snippet from ReassignPartitionsCommand.scala :
> {code}
>   "log_dirs" -> replicas.map(r => 
> replicaLogDirAssignment.getOrElse(new TopicPartitionReplica(tp.topic, 
> tp.partition, r), AnyLogDir)).asJava
> {code}
> We know that the appearance of "any" comes from the getOrElse fallback.
> There is a bug in the above code: the number of AnyLogDir entries should
> match the length of the replicas list, or "log_dirs" should be omitted in
> such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7124) Number of AnyLogDir should match the length of the replicas list

2018-07-01 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7124:
-

 Summary: Number of AnyLogDir should match the length of the 
replicas list
 Key: KAFKA-7124
 URL: https://issues.apache.org/jira/browse/KAFKA-7124
 Project: Kafka
  Issue Type: Bug
Reporter: Ted Yu


See discussion under 'Partitions reassignment is failing in Kafka 1.1.0' thread.

Here is snippet from generated json file:
{code}
{"topic": "Topic3", "partition": 7, "log_dirs": ["any"], "replicas": [3, 0, 2]}
{code}
Code snippet from ReassignPartitionsCommand.scala :
{code}
  "log_dirs" -> replicas.map(r => replicaLogDirAssignment.getOrElse(new 
TopicPartitionReplica(tp.topic, tp.partition, r), AnyLogDir)).asJava
{code}
We know that the appearance of "any" comes from the getOrElse fallback.
There is a bug in the above code: the number of AnyLogDir entries should match
the length of the replicas list, or "log_dirs" should be omitted in such cases.
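The invariant described in the report can be stated as a small standalone check. This is an illustrative sketch (the function name and error message are hypothetical, not the actual ReassignPartitionsCommand code):

```python
def validate_reassignment_entry(entry):
    """Check one partition entry of a reassignment JSON file.

    When "log_dirs" is present it must contain exactly one entry per
    replica; otherwise "log_dirs" should be omitted entirely.
    """
    replicas = entry["replicas"]
    log_dirs = entry.get("log_dirs")
    if log_dirs is not None and len(log_dirs) != len(replicas):
        raise ValueError(
            "log_dirs size %d does not match replicas size %d for %s-%d"
            % (len(log_dirs), len(replicas), entry["topic"], entry["partition"])
        )

# The entry from the report: three replicas but a single "any" log dir.
bad = {"topic": "Topic3", "partition": 7, "log_dirs": ["any"], "replicas": [3, 0, 2]}
# A consistent entry: one log dir per replica.
good = {"topic": "Topic3", "partition": 7, "log_dirs": ["any", "any", "any"], "replicas": [3, 0, 2]}
```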






Re: [kafka-clients] [VOTE] 2.0.0 RC1

2018-06-30 Thread Ted Yu
+1

Checked signatures
Ran test suite

On Sat, Jun 30, 2018 at 2:37 AM, Rajini Sivaram 
wrote:

> Hi Manikumar,
>
> Thank you for pointing that out, I had forgotten to drop the old artifacts.
> New artifacts should be there now.
>
> Regards,
>
> Rajini
>
> On Sat, Jun 30, 2018 at 7:44 AM, Manikumar 
> wrote:
>
> > looks like maven artifacts are not updated in the staging repo. They are
> > still at old timestamp.
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka_2.11/2.0.0/
> >
> > On Sat, Jun 30, 2018 at 12:06 AM Rajini Sivaram  >
> > wrote:
> >
> >> Hello Kafka users, developers and client-developers,
> >>
> >>
> >> This is the second candidate for release of Apache Kafka 2.0.0.
> >>
> >>
> >> This is a major version release of Apache Kafka. It includes 40 new
> KIPs
> >> and
> >>
> >> several critical bug fixes. Please see the 2.0.0 release plan for more
> >> details:
> >>
> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
> >>
> >>
> >> A few notable highlights:
> >>
> >>- Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for
> >>CreateTopics (KIP-277)
> >>- SASL/OAUTHBEARER implementation (KIP-255)
> >>- Improved quota communication and customization of quotas (KIP-219,
> >>KIP-257)
> >>- Efficient memory usage for down conversion (KIP-283)
> >>- Fix log divergence between leader and follower during fast leader
> >>failover (KIP-279)
> >>- Drop support for Java 7 and remove deprecated code including old
> >>scala clients
> >>- Connect REST extension plugin, support for externalizing secrets
> >>and improved error handling (KIP-285, KIP-297, KIP-298 etc.)
> >>- Scala API for Kafka Streams and other Streams API improvements
> >>(KIP-270, KIP-150, KIP-245, KIP-251 etc.)
> >>
> >> Release notes for the 2.0.0 release:
> >>
> >> http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/RELEASE_NOTES.html
> >>
> >>
> >>
> >> *** Please download, test and vote by Tuesday, July 3rd, 4pm PT
> >>
> >>
> >> Kafka's KEYS file containing PGP keys we use to sign the release:
> >>
> >> http://kafka.apache.org/KEYS
> >>
> >>
> >> * Release artifacts to be voted upon (source and binary):
> >>
> >> http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/
> >>
> >>
> >> * Maven artifacts to be voted upon:
> >>
> >> https://repository.apache.org/content/groups/staging/
> >>
> >>
> >> * Javadoc:
> >>
> >> http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/javadoc/
> >>
> >>
> >> * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
> >>
> >> https://github.com/apache/kafka/tree/2.0.0-rc1
> >>
> >>
> >> * Documentation:
> >>
> >> http://kafka.apache.org/20/documentation.html
> >>
> >>
> >> * Protocol:
> >>
> >> http://kafka.apache.org/20/protocol.html
> >>
> >>
> >> * Successful Jenkins builds for the 2.0 branch:
> >>
> >> Unit/integration tests: https://builds.apache.org/job/kafka-2.0-jdk8/66/
> >>
> >> System tests: https://jenkins.confluent.io/job/system-test-kafka/job/2.0/15/
> >>
> >>
> >>
> >> Please test and verify the release artifacts and submit a vote for this
> RC
> >> or report any issues so that we can fix them and roll out a new RC ASAP!
> >>
> >> Although this release vote requires PMC votes to pass, testing, votes,
> >> and bug
> >> reports are valuable and appreciated from everyone.
> >>
> >>
> >> Thanks,
> >>
> >>
> >> Rajini
> >>
> >>
> >
>


Re: [VOTE] KIP-291: Have separate queues for control requests and data requests

2018-06-29 Thread Ted Yu
bq. which is hard coded to be 300 milliseconds

Have you considered making the duration configurable ?

The comparison at the end of your email seems to be copied where tabular
form is lost.
Do you mind posting that part again ?

Thanks

On Fri, Jun 29, 2018 at 4:53 PM, Lucas Wang  wrote:

> Hi Jun,
>
> Thanks for your comments.
> 1. I just replied in the discussion thread about the positive change this
> KIP can still bring
> if implemented on the latest trunk, which includes the async ZK operations
> for KAFKA-5642.
> The evaluation is done using an integration test.
> In production, we have not upgraded to Kafka 1.1 yet, and the code we are
> currently running does
> not include async ZK operations, therefore I don't have any real usage
> result.
>
> 2. Thanks for bringing this up. I haven't considered this setting, and the
> existing proposal in this KIP
> would make data requests and controller requests share a memory pool of
> size specified by the config
> queued.max.request.bytes. The downside is that if there is memory pressure,
> controller requests may be blocked
> from being read from a socket and do not get prioritized at the socket
> layer.
>
> If we have a separate bytes limit for the controller requests, I imagine
> there would be a separate memory pool
> dedicated to controller requests. Also it requires the processors to tell
> connections from a controller apart
> from connections from other brokers or clients, which would probably
> require a dedicated port for the controller?
> IMO, this change is mainly driven by the memory pressure, kind of an
> orthogonal issue, and we can address it with a separate KIP
> if desired. Please let me know what you think.
>
> 3. I plan to change the implementation of the method
> receiveRequest(timeout: Long) in the RequestChannel class as follows:
>
> val controllerRequest = controllerRequestQueue.poll()
> if (controllerRequest != null) {
>   controllerRequest
> } else {
>   dataRequestQueue.poll(timeout, TimeUnit.MILLISECONDS)
> }
>
> with this implementation, there is no need to explicitly choose a request
> handler thread to wake up depending on
> the types of request enqueued, and if a controller request arrives while
> some request handler threads are blocked on an empty data request queue,
> they will simply timeout and call the receiveRequest method again.
>
> In terms of performance, it means that in the worst case, for a controller
> request that just missed the receiveRequest call, it can be delayed for as
> long as
> the timeout parameter, which is hard coded to be 300 milliseconds. If there
> is just one request handler thread, the average delay is
> 150 milliseconds assuming the chance of a controller request arriving at
> any particular time is the same. With N request handler threads,
> the average delay is 150/N milliseconds, which does not seem to be a
> problem.
>
> We have considered waking up request handler threads based on which
> queue they are blocked on,
> and that design was turned down because of its complexity. The design can
> be found here (controller request queue design).
>
> If you mean a general purpose priority queue such as the
> java.util.PriorityQueue, we also have considered it and turned down the
> design because
> - The readily available class java.util.PriorityQueue is unbounded and
> we'll need to implement a bounded version
> - We would still like to have the FIFO semantics on both the controller
> request queue and data request queue, which conceptually does not fit very
> well
> with a general purpose priority queue, e.g. we would probably need to use
> the enqueue time to enforce FIFO semantics.
> - A typical operation on the priority queue is O(log n), whereas the sample
> implementation above gives O(1) performance regardless of the size of both
> queues.
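The two-queue polling logic above can be mimicked outside Kafka to see the intended behavior. The sketch below is a Python illustration of the same idea (the real RequestChannel is Scala); it keeps the O(1) controller-first check and the FIFO order of both queues:

```python
import queue

POLL_TIMEOUT_SECONDS = 0.3  # mirrors the hard-coded 300 ms timeout

controller_queue = queue.Queue()  # FIFO, controller requests only
data_queue = queue.Queue()        # FIFO, everything else

def receive_request(timeout=POLL_TIMEOUT_SECONDS):
    # O(1) non-blocking check: a queued controller request always wins.
    try:
        return controller_queue.get_nowait()
    except queue.Empty:
        pass
    # Otherwise block on the data queue for at most `timeout` seconds.
    try:
        return data_queue.get(timeout=timeout)
    except queue.Empty:
        return None  # caller loops and calls receive_request() again
```

A handler thread that times out on the data queue simply loops and calls receive_request() again, which is when any newly arrived controller request gets picked up, matching the worst-case delay analysis of roughly TO + M.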
>
> 4. For the two APIs sendRequest and receiveRequest, since we are only
> changing their implementation, not the API itself
> the two metrics will support two queues and the meaning of "Idle" still
> holds:
>
> Metric                         | Before this KIP                   | After this KIP
> -------------------------------+-----------------------------------+----------------------------------------
> NetworkProcessorAvgIdlePercent | idle = blocked on select;         | idle = blocked on select;
>                                | not idle includes being blocked   | not idle includes being blocked on
>                                | on requestQueue                   | either controller request queue or
>                                |                                   | data request queue
> RequestHandlerAvgIdlePercent   | idle = blocked on reading from    | idle = taking a request from the
>                                | requestQueue                      | controller request queue, or blocked
>                                |                                   | on reading from the data request queue
>
> Regards,
> Lucas
>
> On Fri, Jun 29, 2018 at 11:22 AM, Jun Rao  wrote:
>
> > Hi, Lucas,
> >
> > Thanks for the KIP. A few comments below.
> >
> > 1. As Eno mentioned in the discussion thread, I am wondering how much of
> > this is still an issue after KAFKA-5642. With that fix, the requests from
> > the controller to the brokers are batched in all the common cases. Have
> you
> > deployed Kafka 1.1? What's the 

Re: [VOTE] 0.10.2.2 RC1

2018-06-29 Thread Ted Yu
+1

Ran test suite.

Checked signatures.

On Fri, Jun 29, 2018 at 10:21 AM, Jason Gustafson 
wrote:

> +1 (binding). I checked release notes, documentation, and went through the
> quickstart.
>
> Thanks Matthias!
>
> On Fri, Jun 22, 2018 at 6:43 PM, Matthias J. Sax 
> wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the second candidate for release of Apache Kafka 0.10.2.2.
> >
> > Note, that RC0 was created before the upgrade to Gradle 4.8.1 and thus,
> > we discarded it in favor of RC1 (without sending out a email for RC0).
> >
> > This is a bug fix release closing 29 tickets:
> > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.10.2.2
> >
> > Release notes for the 0.10.2.2 release:
> > http://home.apache.org/~mjsax/kafka-0.10.2.2-rc1/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Tuesday, 6/26/18 end-of-day, so we
> > can close the vote on Wednesday.
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~mjsax/kafka-0.10.2.2-rc1/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/
> >
> > * Javadoc:
> > http://home.apache.org/~mjsax/kafka-0.10.2.2-rc1/javadoc/
> >
> > * Tag to be voted upon (off 0.10.2 branch) is the 0.10.2.2 tag:
> > https://github.com/apache/kafka/releases/tag/0.10.2.2-rc1
> >
> > * Documentation:
> > http://kafka.apache.org/0102/documentation.html
> >
> > * Protocol:
> > http://kafka.apache.org/0102/protocol.html
> >
> > * Successful Jenkins builds for the 0.10.2 branch:
> > Unit/integration tests: https://builds.apache.org/job/kafka-0.10.2-jdk7/220/
> >
> > /**
> >
> > Thanks,
> >   -Matthias
> >
> >
>


Re: [VOTE] KIP-308: Support dynamic update of max.connections.per.ip/max.connections.per.ip.overrides configs

2018-06-29 Thread Ted Yu
+1

On Fri, Jun 29, 2018 at 7:29 AM, Manikumar 
wrote:

> Hi All,
>
> I would like to start voting on KIP-308 which would add support for dynamic
> update of max.connections.per.ip/max.connections.per.ip.overrides configs
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=85474993
>
> Thanks,
>


Re: [VOTE] 1.1.1 RC2

2018-06-28 Thread Ted Yu
+1

Ran test suite which passed.

On Thu, Jun 28, 2018 at 6:12 PM, Dong Lin  wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the second candidate for release of Apache Kafka 1.1.1.
>
> Apache Kafka 1.1.1 is a bug-fix release for the 1.1 branch that was first
> released with 1.1.0 about 3 months ago. We have fixed about 25 issues since
> that release. A few of the more significant fixes include:
>
> KAFKA-6925  - Fix memory
> leak in StreamsMetricsThreadImpl
> KAFKA-6937  - In-sync
> replica delayed during fetch if replica throttle is exceeded
> KAFKA-6917  - Process
> txn
> completion asynchronously to avoid deadlock
> KAFKA-6893  - Create
> processors before starting acceptor to avoid ArithmeticException
> KAFKA-6870  -
> Fix ConcurrentModificationException in SampledStat
> KAFKA-6878  - Fix
> NullPointerException when querying global state store
> KAFKA-6879  - Invoke
> session init callbacks outside lock to avoid Controller deadlock
> KAFKA-6857  - Prevent
> follower from truncating to the wrong offset if undefined leader epoch is
> requested
> KAFKA-6854  - Log
> cleaner
> fails with transaction markers that are deleted during clean
> KAFKA-6747  - Check
> whether there is in-flight transaction before aborting transaction
> KAFKA-6748  - Double
> check before scheduling a new task after the punctuate call
> KAFKA-6739  -
> Fix IllegalArgumentException when down-converting from V2 to V0/V1
> KAFKA-6728  -
> Fix NullPointerException when instantiating the HeaderConverter
>
> Kafka 1.1.1 release plan:
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.1.1
>
> Release notes for the 1.1.1 release:
> http://home.apache.org/~lindong/kafka-1.1.1-rc2/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, July 3, 12pm PT ***
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~lindong/kafka-1.1.1-rc2/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~lindong/kafka-1.1.1-rc2/javadoc/
>
> * Tag to be voted upon (off 1.1 branch) is the 1.1.1-rc2 tag:
> https://github.com/apache/kafka/tree/1.1.1-rc2
>
> * Documentation:
> http://kafka.apache.org/11/documentation.html
>
> * Protocol:
> http://kafka.apache.org/11/protocol.html
>
> * Successful Jenkins builds for the 1.1 branch:
> Unit/integration tests: https://builds.apache.org/job/kafka-1.1-jdk7/157/
> System tests: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1817
>
>
> Please test and verify the release artifacts and submit a vote for this RC,
> or report any issues so we can fix them and get a new RC out ASAP. Although
> this release vote requires PMC votes to pass, testing, votes, and bug
> reports are valuable and appreciated from everyone.
>
>
> Regards,
> Dong
>


Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-28 Thread Ted Yu
+1

On Thu, Jun 28, 2018 at 4:56 AM, Luís Cabral 
wrote:

> Hi Ted,
> Can I also get your input on this?
>
> bq. +1 from my side for using `compaction.strategy` with values
> "offset","timestamp" and "header" and `compaction.strategy.header`
> -Matthias
>
> bq. +1 from me as well.
> -Guozhang
>
>
> Cheers,
> Luis
>
>
>


Re: [VOTE] KIP-330: Add retentionPeriod in SessionBytesStoreSupplier

2018-06-27 Thread Ted Yu
+1

On Wed, Jun 27, 2018 at 4:40 PM, Bill Bejeck  wrote:

> +1
>
> -Bill
>
> On Wed, Jun 27, 2018 at 7:39 PM Guozhang Wang  wrote:
>
> > Hello folks,
> >
> > I'd like to start a voting thread on KIP-330. I've intentionally skipped
> > the discuss phase since it is a pretty straight-forward public API change
> > and should actually be added since day one. The bug fix of KAFKA-7071
> > helped us to discover this overlook.
> >
> >
> > --
> > -- Guozhang
> >
>


Re: [VOTE] 2.0.0 RC0

2018-06-27 Thread Ted Yu
+1

Checked signatures
Ran test suite

On Mon, Jun 25, 2018 at 11:12 AM, Thomas Crayford 
wrote:

> +1 (non-binding) Heroku has run our usual set of upgrade and performance
> tests, and we haven't found any notable issues through that.
>
> On Sat, Jun 23, 2018 at 12:30 AM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > +1 (non-binding)
> >
> > Built from source and ran quickstart successfully on Ubuntu (with Java 8
> > and Java 9).
> >
> > Thanks Rajini!
> > --Vahid
> >
> >
>


Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

2018-06-27 Thread Ted Yu
I noticed this (lack of primary parameter) as well.

What you gave as new example is semantically the same as what I suggested.
So it is good by me.

Thanks

On Wed, Jun 27, 2018 at 7:31 AM, John Roesler  wrote:

> Thanks for taking look, Ted,
>
> I agree this is a departure from the conventions of Streams DSL.
>
> Most of our config objects have one or two "required" parameters, which fit
> naturally with the static factory method approach. TimeWindow, for example,
> requires a size parameter, so we can naturally say TimeWindows.of(size).
>
> I think in the case of a suppression, there's really no "core" parameter,
> and "Suppression.of()" seems sillier than "new Suppression()". I think that
> Suppression.of(duration) would be ambiguous, since there are many durations
> that we can configure.
>
> However, thinking about it again, I suppose that I can give each
> configuration method a static version, which would let you replace "new
> Suppression()." with "Suppression." in all the examples. Basically, instead
> of "of()", we'd support any of the methods I listed.
>
> For example:
>
> windowCounts
> .suppress(
> Suppression
> .suppressLateEvents(Duration.ofMinutes(10))
> .suppressIntermediateEvents(
> IntermediateSuppression.emitAfter(Duration.ofMinutes(10))
> )
> );
>
>
> Does that seem better?
>
> Thanks,
> -John
>
>
> On Wed, Jun 27, 2018 at 12:44 AM Ted Yu  wrote:
>
> > I started to read this KIP which contains a lot of materials.
> >
> > One suggestion:
> >
> > .suppress(
> > new Suppression()
> >
> >
> > Do you think it would be more consistent with the rest of Streams data
> > structures by supporting `of` ?
> >
> > Suppression.of(Duration.ofMinutes(10))
> >
> >
> > Cheers
> >
> >
> >
> > On Tue, Jun 26, 2018 at 1:11 PM, John Roesler  wrote:
> >
> > > Hello devs and users,
> > >
> > > Please take some time to consider this proposal for Kafka Streams:
> > >
> > > KIP-328: Ability to suppress updates for KTables
> > >
> > > link: https://cwiki.apache.org/confluence/x/sQU0BQ
> > >
> > > The basic idea is to provide:
> > > * more usable control over update rate (vs the current state store
> > caches)
> > > * the final-result-for-windowed-computations feature which several
> people
> > > have requested
> > >
> > > I look forward to your feedback!
> > >
> > > Thanks,
> > > -John
> > >
> >
>


Re: [DISCUSS] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-06-27 Thread Ted Yu
bq. set the serialVersionUID = 1L in Kafka exception classes

Indeed. Should have checked earlier.

On Wed, Jun 27, 2018 at 1:38 AM, Manikumar 
wrote:

>  Hi,
>
> Thanks for the review.
>
>
> > nit: you are going to fill in an actual value below in your PR, right ?
> > private static final long serialVersionUID = 1L;
> >
> >
>  We normally set the serialVersionUID = 1L in Kafka exception classes.
>
> In Motivation, please mention the IllegalStateException scenario.
> >
>
>  The IllegalStateException mentioned in the JIRA is specific to the user
> implementation.
>  The main issue is we won't get a response from the server. In this case,
> the admin client times out the requests.
>


Re: [DISCUSS] KIP-328: Ability to suppress updates for KTables

2018-06-26 Thread Ted Yu
I started to read this KIP which contains a lot of materials.

One suggestion:

.suppress(
new Suppression()


Do you think it would be more consistent with the rest of Streams data
structures by supporting `of` ?

Suppression.of(Duration.ofMinutes(10))


Cheers



On Tue, Jun 26, 2018 at 1:11 PM, John Roesler  wrote:

> Hello devs and users,
>
> Please take some time to consider this proposal for Kafka Streams:
>
> KIP-328: Ability to suppress updates for KTables
>
> link: https://cwiki.apache.org/confluence/x/sQU0BQ
>
> The basic idea is to provide:
> * more usable control over update rate (vs the current state store caches)
> * the final-result-for-windowed-computations feature which several people
> have requested
>
> I look forward to your feedback!
>
> Thanks,
> -John
>


Re: [DISCUSS] KIP-325: Extend Consumer Group Command to Show Beginning Offsets

2018-06-26 Thread Ted Yu
nit:

bq. leaving this empty for compacted topics

Some user(s) may be confused by empty partition size. How about emitting
'compacted' for compacted topics ?

Cheers

On Tue, Jun 26, 2018 at 4:42 PM, Gwen Shapira  wrote:

> It will be. In my experience most topics aren't compacted, so it will still
> be valuable. If not difficult, leaving this empty for compacted topics to
> avoid confusion will also be nice.
>
> On Tue, Jun 26, 2018 at 4:29 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Hi Gwen,
> >
> > Thanks for the feedback.
> > Regarding the partition size, couldn't "end offset - start offset" be
> > misleading for compacted topics?
> >
> > --Vahid
> >
> >
> >
> >
> > From:   Gwen Shapira 
> > To: dev 
> > Date:   06/26/2018 02:36 PM
> > Subject:Re: [DISCUSS] KIP-325: Extend Consumer Group Command to
> > Show Beginning Offsets
> >
> >
> >
> > Small suggestion: you can also add a "partition size" column - difference
> > between log-end and log-start. We've had users ask for this.
> >
> > On Tue, Jun 26, 2018 at 2:34 PM, Gwen Shapira  wrote:
> >
> > > This will be useful! Thank you :)
> > >
> > > On Tue, Jun 26, 2018 at 11:23 AM, Vahid S Hashemian <
> > > vahidhashem...@us.ibm.com> wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> I have created a trivial KIP to improve the offset reporting of the
> > >> consumer group command:
> > >>
> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-325%3A+Extend+Consumer+Group+Command+to+Show+Beginning+Offsets
> > >> Looking forward to your feedback!
> > >>
> > >> Thanks.
> > >> --Vahid
> > >>
> > >>
> > >>
> > >
> > >
> >
> >
> >
> >
> >
> >
> >
>
>
>


Re: [DISCUSS] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.

2018-06-26 Thread Ted Yu
Looks good overall.

nit: you are going to fill in an actual value below in your PR, right ?

private static final long serialVersionUID = 1L;


In Motivation, please mention the IllegalStateException scenario.

On Tue, Jun 26, 2018 at 9:34 AM, Manikumar 
wrote:

> Hi all,
>
> I have created a minor KIP to return new error code for DeleteTopics API
> when topic deletion disabled.
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=87295558
>
> Please take a look.
>
> Thanks,
>


Re: [DISCUSS] KIP-326: Schedulable KTable as Graph source

2018-06-26 Thread Ted Yu
What's the relationship between this KIP and KIP-323 ?

Thanks

On Tue, Jun 26, 2018 at 11:22 AM, Flávio Stutz 
wrote:

> Hey, guys, I've just created a new KIP about creating a new DSL graph
> source for realtime partitioned consolidations.
>
> We have faced the following scenario/problem in a lot of situations with
> KStreams:
>- Huge incoming data being processed by numerous application instances
>- Need to aggregate different fields whose records span all topic
> partitions (something like “total amount spent by people aged > 30 yrs”
> when processing a topic partitioned by userid).
>
> The challenge here is to manage this kind of situation without any
> bottlenecks. We don't need the “global aggregation” to be processed at each
> incoming message. On a scenario of 500 instances, each handling 1k
> messages/s, any single point of aggregation (single partitioned topics,
> global tables or external databases) would create a bottleneck of 500k
> messages/s for single threaded/CPU elements.
>
> For this scenario, it is possible to store the partial aggregations on
> local stores and, from time to time, query those states and aggregate them
> as a single value, avoiding bottlenecks. This is a way to create a "timed
> aggregation barrier”.
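The "timed aggregation barrier" can be sketched generically: each instance accumulates a local partial aggregate and emits it downstream at most once per interval, so the single aggregation point sees one record per instance per interval rather than one per message. Everything below (names, the flush policy) is an illustrative assumption, not a proposed Streams API:

```python
import time

class TimedAggregationBarrier:
    """Keep a local partial aggregate; emit it at most once per interval."""

    def __init__(self, flush_interval_s, emit, clock=time.monotonic):
        self.flush_interval_s = flush_interval_s
        self.emit = emit            # callback toward the single aggregation point
        self.clock = clock          # injectable for testing
        self.partial = 0
        self.last_flush = clock()

    def on_record(self, value):
        # Aggregate locally on every record; no downstream traffic yet.
        self.partial += value
        now = self.clock()
        if now - self.last_flush >= self.flush_interval_s:
            # One emitted record per interval, regardless of input rate.
            self.emit(self.partial)
            self.partial = 0
            self.last_flush = now
```

With 500 instances flushing every 10 seconds, the downstream aggregator handles 50 messages/s instead of 500k messages/s, at the cost of bounded staleness.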
>
> If we leverage this kind of built-in feature we could greatly enhance the
> ability of KStreams to better handle the CAP Theorem characteristics, so
> that one could choose to have Consistency over Availability when needed.
>
> We started this discussion with Matthias J. Sax here:
> https://issues.apache.org/jira/browse/KAFKA-6953
>
> If you want to see more, go to KIP-326 at:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 326%3A+Schedulable+KTable+as+Graph+source
>
> -Flávio Stutz
>


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Ted Yu
My previous response was talking about the new method in
InternalTopologyBuilder.
The exception just means there is no uniform extractor for all the sinks.
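The behavior described here — return the extractor when all sinks agree, throw otherwise — can be sketched as follows. This is a hypothetical illustration, not the actual InternalTopologyBuilder change:

```python
def topic_name_extractor(sink_extractors):
    """Return the single extractor class shared by all sinks.

    Raises if the sinks use different TopicNameExtractor classes,
    i.e. there is no uniform extractor for the topology.
    """
    classes = {type(e) for e in sink_extractors}
    if len(classes) > 1:
        raise RuntimeError(
            "sinks use different TopicNameExtractor classes: %s"
            % sorted(c.__name__ for c in classes)
        )
    return next(iter(classes)) if classes else None

# Hypothetical extractor classes, standing in for user implementations.
class StaticTopicNameExtractor: pass
class RecordFieldTopicNameExtractor: pass
```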

On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax 
wrote:

> Ted,
>
> Why? Each sink can have a different TopicNameExtractor.
>
>
> -Matthias
>
> On 6/25/18 5:19 PM, Ted Yu wrote:
> > If there are different TopicNameExtractor classes from multiple sink
> nodes,
> > the new method should throw exception alerting user of such scenario.
> >
> >
> > On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck  wrote:
> >
> >> Thanks for the KIP!
> >>
> >> Overall I'm +1 on the KIP.   I have one question.
> >>
> >> The KIP states that the method "topicNameExtractor()" is added to the
> >> InternalTopologyBuilder.java.
> >>
> >> It could be that I'm missing something, but how does this work if a user
> >> has provided different TopicNameExtractor instances to multiple sink
> nodes?
> >>
> >> Thanks,
> >> Bill
> >>
> >>
> >>
> >> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang 
> wrote:
> >>
> >>> Yup I agree, generally speaking the `toString()` output is not
> >> recommended
> >>> to be relied on programmatically in user's code, but we've observed
> >>> convenience-beats-any-other-reasons again and again in development
> >>> unfortunately. I think we should still not claim it is part of the
> >>> public APIs that would never be changed in the future, but just
> >>> mentioning it in the wiki so people are aware is fine.
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax <
> matth...@confluent.io>
> >>> wrote:
> >>>
> >>>> Thanks for the KIP!
> >>>>
> >>>> I don't have any further comments.
> >>>>
> >>>> For Guozhang's comment: if we mention anything about `toString()`, we
> >>>> should make explicit that `toString()` output is still not public
> >>>> contract and users should not rely on the output.
> >>>>
> >>>> Furthermore, for the actual `toString()` output, I would replace "topic:"
> >>>> with "extractor class:" to make the difference obvious.
> >>>>
> >>>> I am just hoping that people actually do not rely on `toString()`, as that
> >>>> would defeat the purpose of the `TopologyDescription` class, which was
> >>>> introduced to avoid that dependency... (Just a side comment, not really
> >>>> related to this KIP proposal itself).
> >>>>
> >>>>
> >>>> If there are no further comments in the next days, feel free to start
> >>>> the VOTE and open a PR.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 6/22/18 6:04 PM, Guozhang Wang wrote:
> >>>>> Thanks for writing the KIP!
> >>>>>
> >>>>> I'm +1 on the proposed changes over all. One minor suggestion: we
> >>> should
> >>>>> also mention that the `Sink#toString` will also be updated, in a way
> >>> that
> >>>>> if `topic()` returns null, use the other call, etc. This is because
> >>>>> although we do not explicitly state the following logic as public
> >>>> protocols:
> >>>>>
> >>>>> ```
> >>>>>
> >>>>> "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> >>>>> nodeNames(predecessors);
> >>>>>
> >>>>>
> >>>>> ```
> >>>>>
> >>>>> There are already some users that rely on
> >>> `topology.describe().toString(
> >>>> )`
> >>>>> to have runtime checks on the returned string values, so changing
> >> this
> >>>>> means that their apps will break and they will hence need to make code changes.
> >>>>>
> >>>>> Guozhang
> >>>>>
> >>>>> On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
> >>> nishanth...@gmail.com
> >>>>>
> >>>>> wrote:
> >>>>>
> >>>>>> Hello Everyone,
> >>>>>>
> >>>>>> I have created a new KIP to discuss extending TopologyDescription.
> >> You
> >>>> can
> >>>>>> find it here:
> >>>>>>
> >>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> >>>>>>
> >>>>>> Please provide any feedback that you might have.
> >>>>>>
> >>>>>> Best,
> >>>>>> Nishanth Pradeep
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> -- Guozhang
> >>>
> >>
> >
>
>


Re: SASL Unit test failing

2018-06-25 Thread Ted Yu
I ran the test on Linux as well.

cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

Java version: 1.8.0_161, vendor: Oracle Corporation
Java home: /jdk1.8.0_161/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-327.28.3.el7.x86_64", arch: "amd64",
family: "unix"

On Mon, Jun 25, 2018 at 5:42 PM, Ted Yu  wrote:

> Here was the command I used:
>
> ./gradlew -Dtest.single=SaslAuthenticatorTest clients:test
>
> On Mon, Jun 25, 2018 at 5:39 PM, Ahmed A  wrote:
>
>> I ran test with -i option as follows - "./gradlew  -i test".  The same set
>> of three tests failed.
>>
>> My environment:
>> $ java -version
>> java version "1.8.0_121"
>> Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
>> Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>>
>> $ cat /etc/redhat-release
>> Red Hat Enterprise Linux Workstation release 7.3 (Maipo)
>> $ uname -a
>> Linux  ahmed  3.10.0-514.36.5.el7.x86_64 #1 SMP Thu Dec 28 21:42:18 EST
>> 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>> Can you please let me know how I can run an individual unit test, what
>> options do I provide?
>>
>>
>> Thank you,
>> Ahmed.
>>
>>
>>
>> On Mon, Jun 25, 2018 at 2:47 PM, Ted Yu  wrote:
>>
>> > I ran the test alone which passed.
>> >
>> > Can you include -i on the command line to see if there is some clue from
>> > the output ?
>> >
>> > Here is my environment:
>> >
>> > Java version: 1.8.0_151, vendor: Oracle Corporation
>> > Java home:
>> > /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre
>> > Default locale: en_US, platform encoding: UTF-8
>> > OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"
>> >
>> > FYI
>> >
>> > On Mon, Jun 25, 2018 at 12:59 PM, Ahmed A  wrote:
>> >
>> > > Hello,
>> > >
>> > > I did a fresh clone of the kafka src code, and the following SASL unit
>> > > tests have been failing consistently:
>> > > - testMechanismPluggability
>> > > - testMechanismPluggability
>> > > - testMultipleServerMechanisms
>> > >
>> > > All three tests have similar stack trace:
>> > > at org.junit.Assert.assertTrue(Assert.java:52)
>> > > at
>> > > org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(
>> > > NetworkTestUtils.java:79)
>> > > at
>> > > org.apache.kafka.common.network.NetworkTestUtils.checkClient
>> Connection(
>> > > NetworkTestUtils.java:52)
>> > >
>> > > I also noticed, the three tests are using digest-md5.
>> > >
>> > > Has anyone else run into a similar issue or have any ideas for the
>> > failure?
>> > >
>> > > Thank you,
>> > > Ahmed.
>> > >
>> >
>>
>
>


Re: SASL Unit test failing

2018-06-25 Thread Ted Yu
Here was the command I used:

./gradlew -Dtest.single=SaslAuthenticatorTest clients:test
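Depending on the Gradle version in use, the equivalent filter may instead be `--tests`, which can also select a single test method. The invocations below are sketches and may need adjusting for a given Kafka branch:

```shell
# Older Gradle: select a single test class via a system property
./gradlew -Dtest.single=SaslAuthenticatorTest clients:test

# Newer Gradle: the --tests filter, by class or by fully qualified method
./gradlew clients:test --tests SaslAuthenticatorTest
./gradlew clients:test \
  --tests "org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.testMechanismPluggability"
```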

On Mon, Jun 25, 2018 at 5:39 PM, Ahmed A  wrote:

> I ran test with -i option as follows - "./gradlew  -i test".  The same set
> of three tests failed.
>
> My environment:
> $ java -version
> java version "1.8.0_121"
> Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>
> $ cat /etc/redhat-release
> Red Hat Enterprise Linux Workstation release 7.3 (Maipo)
> $ uname -a
> Linux  ahmed  3.10.0-514.36.5.el7.x86_64 #1 SMP Thu Dec 28 21:42:18 EST
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Can you please let me know how I can run an individual unit test, what
> options do I provide?
>
>
> Thank you,
> Ahmed.
>
>
>
> On Mon, Jun 25, 2018 at 2:47 PM, Ted Yu  wrote:
>
> > I ran the test alone which passed.
> >
> > Can you include -i on the command line to see if there is some clue from
> > the output ?
> >
> > Here is my environment:
> >
> > Java version: 1.8.0_151, vendor: Oracle Corporation
> > Java home:
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre
> > Default locale: en_US, platform encoding: UTF-8
> > OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"
> >
> > FYI
> >
> > On Mon, Jun 25, 2018 at 12:59 PM, Ahmed A  wrote:
> >
> > > Hello,
> > >
> > > I did a fresh clone of the kafka src code, and the following SASL unit
> > > tests have been failing consistently:
> > > - testMechanismPluggability
> > > - testMechanismPluggability
> > > - testMultipleServerMechanisms
> > >
> > > All three tests have similar stack trace:
> > > at org.junit.Assert.assertTrue(Assert.java:52)
> > > at
> > > org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(
> > > NetworkTestUtils.java:79)
> > > at
> > > org.apache.kafka.common.network.NetworkTestUtils.
> checkClientConnection(
> > > NetworkTestUtils.java:52)
> > >
> > > I also noticed, the three tests are using digest-md5.
> > >
> > > Has anyone else run into a similar issue or have any ideas for the
> > failure?
> > >
> > > Thank you,
> > > Ahmed.
> > >
> >
>


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Ted Yu
If there are different TopicNameExtractor classes from multiple sink nodes,
the new method should throw exception alerting user of such scenario.


On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck  wrote:

> Thanks for the KIP!
>
> Overall I'm +1 on the KIP.   I have one question.
>
> The KIP states that the method "topicNameExtractor()" is added to the
> InternalTopologyBuilder.java.
>
> It could be that I'm missing something, but how does this work if a user
> has provided different TopicNameExtractor instances to multiple sink nodes?
>
> Thanks,
> Bill
>
>
>
> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang  wrote:
>
> > Yup I agree, generally speaking the `toString()` output is not
> recommended
> > to be relied on programmatically in user's code, but we've observed
> > convenience-beats-any-other-reasons again and again in development
> > unfortunately. I think we should still not claim it is part of the
> > public APIs that would never be changed in the future, but just
> > mentioning it in the wiki so people are aware is fine.
> >
> >
> > Guozhang
> >
> > On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax 
> > wrote:
> >
> > > Thanks for the KIP!
> > >
> > > I don't have any further comments.
> > >
> > > For Guozhang's comment: if we mention anything about `toString()`, we
> > > should make explicit that `toString()` output is still not public
> > > contract and users should not rely on the output.
> > >
> > > Furthermore, for the actual `toString()` output, I would replace "topic:"
> > > with "extractor class:" to make the difference obvious.
> > >
> > > I am just hoping that people actually do not rely on `toString()`, as that
> > > would defeat the purpose of the `TopologyDescription` class, which was
> > > introduced to avoid that dependency... (Just a side comment, not really
> > > related to this KIP proposal itself).
> > >
> > >
> > > If there are no further comments in the next days, feel free to start
> > > the VOTE and open a PR.
> > >
> > >
> > >
> > >
> > > -Matthias
> > >
> > > On 6/22/18 6:04 PM, Guozhang Wang wrote:
> > > > Thanks for writing the KIP!
> > > >
> > > > I'm +1 on the proposed changes over all. One minor suggestion: we
> > should
> > > > also mention that the `Sink#toString` will also be updated, in a way
> > that
> > > > if `topic()` returns null, use the other call, etc. This is because
> > > > although we do not explicitly state the following logic as public
> > > protocols:
> > > >
> > > > ```
> > > >
> > > > "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> > > > nodeNames(predecessors);
> > > >
> > > >
> > > > ```
> > > >
> > > > There are already some users that rely on
> > `topology.describe().toString(
> > > )`
> > > > to have runtime checks on the returned string values, so changing
> this
> > > > means that their apps will break and they will hence need to make code changes.
> > > >
> > > > Guozhang
> > > >
> > > > On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
> > nishanth...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Hello Everyone,
> > > >>
> > > >> I have created a new KIP to discuss extending TopologyDescription.
> You
> > > can
> > > >> find it here:
> > > >>
> > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> > > >>
> > > >> Please provide any feedback that you might have.
> > > >>
> > > >> Best,
> > > >> Nishanth Pradeep
> > > >>
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: [DISCUSS] KIP-323: Schedulable KTable as Graph source

2018-06-25 Thread Ted Yu
This would be useful feature.

In the Public Interfaces section, the new method lacks a closing
parenthesis.

In the Proposed Changes section, if the order of the 3 bullets can match
the order of the parameters of the new method, it would be easier to read.

For Rejected Alternatives #2, can you add a sentence saying why it was
rejected ?

Cheers

On Mon, Jun 25, 2018 at 10:13 AM, Flávio Stutz 
wrote:

> Hey, guys, I've just started a KIP discussion here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 323%3A+Schedulable+KTable+as+Graph+source
>


Re: [VOTE] KIP-319: Replace numSegments to segmentInterval in Streams window configurations

2018-06-25 Thread Ted Yu
+1

On Mon, Jun 25, 2018 at 2:56 PM, John Roesler  wrote:

> Hello All,
>
> Thanks for the discussion on KIP-319. I'd now like to start the voting.
>
> As a reminder, KIP-319 proposes a fix to an issue I identified in
> KAFKA-7080. Specifically, the issue is that we're creating
> CachingWindowStore with the *number of segments* instead of the *segment
> size*.
>
> Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
>
> Additionally, here's a draft PR for clarity:
> https://github.com/apache/kafka/pull/5257
>
> Thanks,
> -John
>


Re: SASL Unit test failing

2018-06-25 Thread Ted Yu
I ran the test alone which passed.

Can you include -i on the command line to see if there is some clue from
the output ?

Here is my environment:

Java version: 1.8.0_151, vendor: Oracle Corporation
Java home:
/Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"

FYI

On Mon, Jun 25, 2018 at 12:59 PM, Ahmed A  wrote:

> Hello,
>
> I did a fresh clone of the kafka src code, and the following SASL unit
> tests have been failing consistently:
> - testMechanismPluggability
> - testMechanismPluggability
> - testMultipleServerMechanisms
>
> All three tests have similar stack trace:
> at org.junit.Assert.assertTrue(Assert.java:52)
> at
> org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(
> NetworkTestUtils.java:79)
> at
> org.apache.kafka.common.network.NetworkTestUtils.checkClientConnection(
> NetworkTestUtils.java:52)
>
> I also noticed, the three tests are using digest-md5.
>
> Has anyone else run into a similar issue or have any ideas for the failure?
>
> Thank you,
> Ahmed.
>


Re: [VOTE] KIP-312: Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2018-06-25 Thread Ted Yu
+1

On Mon, Jun 25, 2018 at 9:45 AM, Guozhang Wang  wrote:

> +1.
>
> On Mon, Jun 25, 2018 at 8:12 AM, Matthias J. Sax 
> wrote:
>
> > +1 (binding)
> >
> > On 6/25/18 6:11 AM, Bill Bejeck wrote:
> > > All,
> > > I'd like to start a vote for this KIP now.
> > >
> > > Thanks,
> > > Bill
> > >
> >
> >
>
>
> --
> -- Guozhang
>


Re: [VOTE] 1.0.2 RC0

2018-06-22 Thread Ted Yu
+1

Ran test suite.

Checked signatures.

On Fri, Jun 22, 2018 at 11:42 AM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> +1 (non-binding)
>
> Built from source and ran quickstart successfully on Ubuntu (with Java 8).
>
> Thanks for running the release Matthias!
> --Vahid
>
>
>
>
> From:   "Matthias J. Sax" 
> To: dev@kafka.apache.org, us...@kafka.apache.org,
> kafka-clie...@googlegroups.com
> Date:   06/22/2018 10:42 AM
> Subject:[VOTE] 1.0.2 RC0
>
>
>
> Hello Kafka users, developers and client-developers,
>
> This is the first candidate for release of Apache Kafka 1.0.2.
>
> This is a bug fix release closing 26 tickets:
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.0.2
>
> Release notes for the 1.0.2 release:
> http://home.apache.org/~mjsax/kafka-1.0.2-rc0/RELEASE_NOTES.html
>
> *** Please download, test and vote by Tuesday, 6/26/18 end-of-day, so we
> can close the vote on Wednesday.
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~mjsax/kafka-1.0.2-rc0/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~mjsax/kafka-1.0.2-rc0/javadoc/
>
> * Tag to be voted upon (off 1.0 branch) is the 1.0.2 tag:
> https://github.com/apache/kafka/releases/tag/1.0.2-rc0
>
> * Documentation:
> http://kafka.apache.org/10/documentation.html
>
> * Protocol:
> http://kafka.apache.org/10/protocol.html
>
> * Successful Jenkins builds for the 1.0 branch:
> Unit/integration tests: https://builds.apache.org/job/kafka-1.0-jdk7/211/
> System tests:
> https://jenkins.confluent.io/job/system-test-kafka/job/1.0/217/
>
> /**
>
> Thanks,
>   -Matthias
>
> [attachment "signature.asc" deleted by Vahid S Hashemian/Silicon
> Valley/IBM]
>
>
>
>


Re: [VOTE] KIP-313: Add KStream.flatTransform and KStream.flatTransformValues

2018-06-22 Thread Ted Yu
+1

On Fri, Jun 22, 2018 at 2:50 AM, Bruno Cadonna  wrote:

> Hi list,
>
> I would like to voting on this KIP.
>
> I created a first PR[1] that adds flatTransform. Once I get some
> feedback, I will start work on flatTransformValues.
>
> Best regards,
> Bruno
>
> [1] https://github.com/apache/kafka/pull/5273
>


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-21 Thread Ted Yu
bq. propose just to make it private

+1
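
To make the exposure concrete, a minimal sketch: the class below is hypothetical and only mimics the `segments` field/method pair under discussion, not the real `Windows` class:

```java
public class WindowsSketch {
    // Mimics the problematic exposure: a public mutable field with a
    // documented invariant (segments >= 2) that nothing enforces on
    // direct assignment.
    public int segments = 3;

    // The bounds-checked path callers are supposed to use.
    public WindowsSketch segments(final int segments) {
        if (segments < 2) {
            throw new IllegalArgumentException("segments must be at least 2");
        }
        this.segments = segments;
        return this;
    }

    public static void main(String[] args) {
        WindowsSketch w = new WindowsSketch();
        w.segments = -5; // compiles and runs: the bounds check is bypassed
        System.out.println(w.segments); // prints -5
    }
}
```

Making the field private behind the checked setter closes that hole, at the cost of breaking any code that assigned the field directly.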

On Thu, Jun 21, 2018 at 12:03 PM, John Roesler  wrote:

> Interesting... I did not initially consider it because I didn't want to
> have an impact on anyone's Streams apps, but now I see that unless
> developers have subclassed `Windows`, the number of segments would always
> be 3!
>
> There's one caveat to this, which I think was a mistake. The field
> `segments` in Windows is public, which means that anyone can actually set
> it directly on any Window instance like:
>
> TimeWindows tw = TimeWindows.of(100);
> tw.segments = 12345;
>
> Bypassing the bounds check and contradicting the javadoc in Windows that
> says users can't directly set it. Sadly there's no way to "deprecate" this
> exposure, so I propose just to make it private.
>
> With this new knowledge, I agree, I think we can switch to
> "segmentInterval" throughout the interface.
>
> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang  wrote:
>
> > Hello John,
> >
> > Thanks for the KIP.
> >
> > Should we consider making the change on `Stores#persistentWindowStore`
> > parameters as well?
> >
> >
> > Guozhang
> >
> >
> > On Wed, Jun 20, 2018 at 1:31 PM, John Roesler  wrote:
> >
> > > Hi Ted,
> > >
> > > Ah, when you made that comment to me before, I thought you meant as
> > opposed
> > > to "segments". Now it makes sense that you meant as opposed to
> > > "segmentSize".
> > >
> > > I named it that way to match the peer method "windowSize", which is
> also
> > a
> > > quantity of milliseconds.
> > >
> > > I agree that "interval" is more intuitive, but I think I favor
> > consistency
> > > in this case. Does that seem reasonable?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Wed, Jun 20, 2018 at 1:06 PM Ted Yu  wrote:
> > >
> > > > Normally size is not measured in time units, such as milliseconds.
> > > > How about naming the new method segmentInterval ?
> > > > Thanks
> > > >  Original message From: John Roesler <
> > j...@confluent.io>
> > > > Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> Subject:
> > > > [DISCUSS] KIP-319: Replace segments with segmentSize in
> > > > WindowBytesStoreSupplier
> > > > Hello All,
> > > >
> > > > I'd like to propose KIP-319 to fix an issue I identified in
> KAFKA-7080.
> > > > Specifically, we're creating CachingWindowStore with the *number of
> > > > segments* instead of the *segment size*.
> > > >
> > > > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > > > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > > >
> > > > additionally, here's a draft PR for clarity:
> > > > https://github.com/apache/kafka/pull/5257
> > > >
> > > > Please let me know what you think!
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>

