Re: [DISCUSS] KIP-341: Update Sticky Assignor's User Data Protocol

2018-07-13 Thread Vahid S Hashemian
Hi Mike,

Thanks a lot for reviewing the KIP and sharing your feedback.
I agree that such an issue could surface with option 1, but the existing 
PR (which currently implements this option) checks for such duplicate 
assignments and ignores one in favor of the other. So in the end there 
will be valid (non-duplicate) assignments to consumers, but they might 
deviate a bit from the ideal assignment.
If rare scenarios like this are deemed troublesome and we want to avoid 
them, option 2 would probably be the way to go. In that case, and in my 
opinion, option 2 would be a better solution than introducing another 
field (e.g. a timestamp).
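
As a rough illustration (all names here are hypothetical; this is not the 
actual PR code), the duplicate check amounts to keeping, for each partition, 
the claim with the highest generation:

static Map<TopicPartition, String> resolveClaims(List<Claim> claims) {
    Map<TopicPartition, Integer> bestGeneration = new HashMap<>();
    Map<TopicPartition, String> owner = new HashMap<>();
    for (Claim claim : claims) {                     // one Claim per consumer
        for (TopicPartition tp : claim.partitions) {
            Integer seen = bestGeneration.get(tp);
            if (seen == null || claim.generation > seen) {
                bestGeneration.put(tp, claim.generation);
                owner.put(tp, claim.consumerId);
            }
        }
    }
    return owner;   // at most one owner per partition
}
// Claim is a hypothetical holder: String consumerId; int generation;
// List<TopicPartition> partitions.

Note that in the scenario you describe below, both claims carry the same 
generation, so a tie-break like this has to pick one arbitrarily.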

Regards.
--Vahid



From:   Mike Freyberger 
To: "dev@kafka.apache.org" 
Date:   07/13/2018 08:42 PM
Subject: Re: [DISCUSS] KIP-341: Update Sticky Assignor's User Data 
Protocol



This is great!

For the client side implementation, I think it’s still possible for there 
to be a duplication. I’ll try to walk through the example here. 

Let’s say there are 2 consumers and 1 topic with 2 partitions. 

After the initial rebalance, generation 0:
Consumer A has partition 0
Consumer B has partition 1

Let’s say consumer B leaves the group (e.g. a long debugging session or a GC 
pause). This leads to another rebalance. This rebalance will be considered 
generation 1 and will result in:

Generation 1, Consumer A owns partition 0,1

Now let’s say Consumer B is still out of the group and then Consumer A 
leaves as well. While Consumer A is out of the group, Consumer B rejoins 
the group. During this rebalance, the only previous state would be the 
initial generation 0 assignment. So this assignment would be considered 
generation 1 as well and would result in:

Generation 1, Consumer B owns partition 0,1

When A rejoins the group, both consumers would claim ownership of both 
partitions and they would report the assignment was from generation 1. 
This gets us back into the same issue as before because the generation 
number cannot help at all. You could add a timestamp in addition to the 
generation marker, but that’d still be vulnerable to clock skew.

Would hooking into the existing generation marker protect the assignor 
against this kind of situation? We need to make sure the selected 
implementation is protected against the kind of failure described above. 

Also, I have been working on KIP-315, which is another Sticky Assignor, 
which also requires some kind of epoch/generation marker to be protected 
against zombies. So, I’d be in favor of a generic solution here that other 
assignors can leverage. 

Best,

Mike Freyberger

> On Jul 13, 2018, at 6:15 PM, Vahid S Hashemian 
 wrote:
> 
> Hi all,
> 
> I created a short KIP to address an issue in Sticky Assignor assignment 
> logic: 
> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-341%3A+Update+Sticky+Assignor%27s+User+Data+Protocol

> Please take a look and share your feedback / comments.
> 
> In particular, there is a Generation Marker section (
> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-341%3A+Update+Sticky+Assignor%27s+User+Data+Protocol#KIP-341:UpdateStickyAssignor'sUserDataProtocol-GenerationMarker

> ) that provides two methods for implementing the improvement to the 
> protocol. I'd like to know which option is more popular.
> 
> Thanks!
> --Vahid
> 


Contribution request

2018-07-13 Thread Jan Paulovcak
Greetings,

I'd like to become a contributor for Apache Kafka project. Could you please
add me to the list of contributors in JIRA? My username is "jpact"

Thanks in advance!

Regards,
J


Re: [DISCUSS] KIP-341: Update Sticky Assignor's User Data Protocol

2018-07-13 Thread Mike Freyberger
This is great!

For the client side implementation, I think it’s still possible for there to be 
a duplication. I’ll try to walk through the example here. 

Let’s say there are 2 consumers and 1 topic with 2 partitions. 

After the initial rebalance, generation 0:
Consumer A has partition 0
Consumer B has partition 1

Let’s say consumer B leaves the group (e.g. a long debugging session or a GC 
pause). This leads to another rebalance. This rebalance will be considered 
generation 1 and will result in:

Generation 1, Consumer A owns partition 0,1

Now let’s say Consumer B is still out of the group and then Consumer A leaves 
as well. While Consumer A is out of the group, Consumer B rejoins the group. 
During this rebalance, the only previous state would be the initial generation 
0 assignment. So this assignment would be considered generation 1 as well and 
would result in:

Generation 1, Consumer B owns partition 0,1

When A rejoins the group, both consumers would claim ownership of both 
partitions and they would report the assignment was from generation 1. This 
gets us back into the same issue as before because the generation number cannot 
help at all. You could add a timestamp in addition to the generation marker, 
but that’d still be vulnerable to clock skew.

Would hooking into the existing generation marker protect the assignor against 
this kind of situation? We need to make sure the selected implementation is 
protected against the kind of failure described above. 

Also, I have been working on KIP-315, which is another Sticky Assignor, which 
also requires some kind of epoch/generation marker to be protected against 
zombies. So, I’d be in favor of a generic solution here that other assignors 
can leverage. 

Best,

Mike Freyberger

> On Jul 13, 2018, at 6:15 PM, Vahid S Hashemian  
> wrote:
> 
> Hi all,
> 
> I created a short KIP to address an issue in Sticky Assignor assignment 
> logic: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-341%3A+Update+Sticky+Assignor%27s+User+Data+Protocol
> Please take a look and share your feedback / comments.
> 
> In particular, there is a Generation Marker section (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-341%3A+Update+Sticky+Assignor%27s+User+Data+Protocol#KIP-341:UpdateStickyAssignor'sUserDataProtocol-GenerationMarker
> ) that provides two methods for implementing the improvement to the 
> protocol. I'd like to know which option is more popular.
> 
> Thanks!
> --Vahid
> 
> 


Re: [DISCUSS]: KIP-339: Create a new ModifyConfigs API

2018-07-13 Thread Ted Yu
I experimented with putting a null value into ConcurrentHashMap, which led
me to this code:

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();

I agree that getting an NPE this way is not user friendly.
Using Java 8, the notion of null-value support would be conveyed to the user
in a friendly manner.
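
As a rough illustration (the helpers are hypothetical, not the proposed API),
a single Optional-valued map can express both updates and removals without any
null entries, so it works with ConcurrentHashMap too:

Map<String, Optional<String>> updates = new HashMap<>();
updates.put("retention.ms", Optional.of("86400000")); // change a value
updates.put("cleanup.policy", Optional.empty());      // request a removal

for (Map.Entry<String, Optional<String>> e : updates.entrySet()) {
    if (e.getValue().isPresent()) {
        applyChange(e.getKey(), e.getValue().get());  // hypothetical helper
    } else {
        removeConfig(e.getKey());                     // hypothetical helper
    }
}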

Cheers

On Fri, Jul 13, 2018 at 5:51 PM Colin McCabe  wrote:

> On Fri, Jul 13, 2018, at 17:45, Ted Yu wrote:
>
> Looking at modifyConfigs API, it doesn't seem that ConcurrentHashMap
> should be used as the underlying parameter type.
>
>
> I agree that there are other types of maps that do support null values.
> However, the fact that some official map implementations from the standard
> library don't support null values makes this a questionable feature to rely
> on.  Imagine being a new user of this API who created a ConcurrentHashMap,
> tried to insert some null values, and passed it to the API.  It would compile,
> but not work.  It would certainly be confusing.
>
> Anyway, to signify that null value is supported, value type can be
> declared as Optional.
>
> FYI
>
>
> Yeah, now that we're on Java 8, Optional could be a good choice here.
>
> best,
> Colin
>
>
> On Fri, Jul 13, 2018 at 5:35 PM Colin McCabe  wrote:
>
>
> Hi Ted,
>
> That’s a fair question.  I think the main reason I didn’t propose that
> originally is that many people find null values in maps confusing.  Also,
> some newer java maps don’t support null values, such as ConcurrentHashMap.
> I’m curious what others think about this.
>
> Best,
> Colin
>
> On Wed, Jul 11, 2018, at 21:28, Ted Yu wrote:
> > bq. Map changes, Set
> removals,
> >
> > Is it possible to combine the two parameters into one Map where null
> Config
> > value signifies removal of config ?
> > This way, the following wouldn't occur (reducing un-intended config
> > removal):
> >
> > bq. If a configuration key is specified in both *changes* and *removals*
> >
> > *Cheers*
> >
> > On Wed, Jul 11, 2018 at 10:54 AM Colin McCabe 
> wrote:
> >
> > > Hi all,
> > >
> > > Previously, we discussed some issues with alterConfigs here on the
> mailing
> > > list, and eventually came to the conclusion that the RPC as implemented
> > > can't be used for a shell command modifying configurations.
> > >
> > > I wrote up a small KIP to fix the issues with the RPC.  Please take a
> look
> > > at
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+ModifyConfigs+API
> > >
> > > best,
> > > Colin
> > >
>
>
>


[jira] [Created] (KAFKA-7162) Flaky unit tests caused by record creation timestamps differ from validation time by more than timestampDiffMaxMs

2018-07-13 Thread Lucas Wang (JIRA)
Lucas Wang created KAFKA-7162:
-

 Summary: Flaky unit tests caused by record creation timestamps 
differ from validation time by more than timestampDiffMaxMs
 Key: KAFKA-7162
 URL: https://issues.apache.org/jira/browse/KAFKA-7162
 Project: Kafka
  Issue Type: Bug
Reporter: Lucas Wang


While running gradle unit tests, we found the test method 
LogValidatorTest.testCompressedV1 can fail sometimes. Upon investigation, it 
turns out the test method uses one set of timestamps, say t0, t1 and t2, for 
the records, while using a separate timestamp, say t3, for the "now" parameter 
when invoking the LogValidator.validateMessagesAndAssignOffsets method. The 
validateMessagesAndAssignOffsets validation method also takes a parameter 
timestampDiffMaxMs=1 second, which specifies the maximum allowed time 
difference between t3 and the timestamps in the records, i.e. t0, t1, and t2. 
While running unit tests, especially when multiple tests run simultaneously, 
there is no guarantee that the time difference between t3 and t0 is within 1 
second, causing the test method to be flaky. Many other test methods in 
LogValidatorTest can suffer from the same problem.
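
A possible fix, sketched below (not the committed patch; names follow the 
description above), is to derive the record timestamps from the same clock 
reading that is later passed as "now", so their distance can never exceed 
timestampDiffMaxMs no matter how slowly the test runs:

{code:java}
long now = System.currentTimeMillis();
long t0 = now - 300, t1 = now - 200, t2 = now - 100; // all within 1 second of now
// build the records with t0/t1/t2, then pass the same `now` to
// LogValidator.validateMessagesAndAssignOffsets(...)
{code}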



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: KIP-327: Add describe all topics API to AdminClient

2018-07-13 Thread Ted Yu
bq. describe topics by a regular expression on the server side

Should caution be taken if the regex doesn't filter anything ("*")?

Cheers

On Fri, Jul 13, 2018 at 6:02 PM Colin McCabe  wrote:

> As Jason wrote, this won't scale as the number of partitions increases.
> We already have users who have tens of thousands of topics, or more.  If
> you multiply that by 100x over the next few years, you end up with this API
> returning full information about millions of topics, which clearly doesn't
> work.
>
> We discussed this a lot in the original KIP-117 DISCUSS thread which added
> the Java AdminClient.  ListTopics and DescribeTopics were deliberately kept
> separate because we understood that eventually a single RPC would not be
> able to return information about all the topics in the cluster.  So I have
> to vote -1 for this proposal as it stands.
>
> I do agree that adding a way to describe topics by a regular expression on
> the server side would be very useful.  This would also fix a major
> scalability problem we have now, which is that when subscribing via a
> regular expression, clients need to fetch the full list of all topics in
> the cluster and filter locally.
>
> I think a regular expression library like re2 would be ideal for this
> purpose.  re2 is standardized and language-agnostic (it's not tied only to
> Java).  In contrast, Java regular expressions change with different releases
> of the JDK (there were some changes in java 8, for example).  Also, re2
> regular expressions are linear time, never exponential time.  See
> https://github.com/google/re2j
>
> regards,
> Colin
>
>
> On Fri, Jul 13, 2018, at 05:00, Andras Beni wrote:
> > The KIP looks good to me.
> > However, if there is willingness in the community to work on metadata
> > request with patterns, the feature proposed here and filtering by '*' or
> > '.*' would be redundant.
> >
> > Andras
> >
> >
> >
> > On Fri, Jul 13, 2018 at 12:38 AM Jason Gustafson 
> wrote:
> >
> > > Hey Manikumar,
> > >
> > > As Kafka begins to scale to larger and larger numbers of
> topics/partitions,
> > > I'm a little concerned about the scalability of APIs such as this. The
> API
> > > looks benign, but imagine you have a few million partitions. We
> > > already expose similar APIs in the producer and consumer, so probably
> not
> > > much additional harm to expose it in the AdminClient, but it would be
> nice
> > > to put a little thought into some longer term options. We should be
> giving
> > > users an efficient way to select a smaller set of the topics they are
> > > interested in. We have always discussed adding some filtering support
> to
> > > the Metadata API. Perhaps now is a good time to reconsider this? We now
> > > have a convention for wildcard ACLs, so perhaps we can do something
> > > similar. Full regex support might be ideal given the consumer's
> > > subscription API, but that is more challenging. What do you think?
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Thu, Jul 12, 2018 at 2:35 PM, Harsha  wrote:
> > >
> > > > Very useful. LGTM.
> > > >
> > > > Thanks,
> > > > Harsha
> > > >
> > > > On Thu, Jul 12, 2018, at 9:56 AM, Manikumar wrote:
> > > > > Hi all,
> > > > >
> > > > > I have created a KIP to add describe all topics API to AdminClient
> .
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 327%3A+Add+describe+all+topics+API+to+AdminClient
> > > > >
> > > > > Please take a look.
> > > > >
> > > > > Thanks,
> > > >
> > >
>


Re: KIP-327: Add describe all topics API to AdminClient

2018-07-13 Thread Colin McCabe
As Jason wrote, this won't scale as the number of partitions increases.  We 
already have users who have tens of thousands of topics, or more.  If you 
multiply that by 100x over the next few years, you end up with this API 
returning full information about millions of topics, which clearly doesn't work.

We discussed this a lot in the original KIP-117 DISCUSS thread which added the 
Java AdminClient.  ListTopics and DescribeTopics were deliberately kept 
separate because we understood that eventually a single RPC would not be able 
to return information about all the topics in the cluster.  So I have to vote 
-1 for this proposal as it stands.

I do agree that adding a way to describe topics by a regular expression on the 
server side would be very useful.  This would also fix a major scalability 
problem we have now, which is that when subscribing via a regular expression, 
clients need to fetch the full list of all topics in the cluster and filter 
locally.

I think a regular expression library like re2 would be ideal for this purpose.  
re2 is standardized and language-agnostic (it's not tied only to Java).  In 
contrast, Java regular expressions change with different releases of the JDK 
(there were some changes in java 8, for example).  Also, re2 regular 
expressions are linear time, never exponential time.  See 
https://github.com/google/re2j
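
As a sketch only (not part of the KIP), server-side filtering with re2j could
look like the following; the topic list is a stand-in for whatever the broker
holds in its metadata cache:

import com.google.re2j.Pattern;

Pattern filter = Pattern.compile("metrics\\..*");  // pattern from the request
List<String> matching = new ArrayList<>();
for (String topic : allTopicNames) {               // hypothetical topic list
    if (filter.matcher(topic).matches()) {
        matching.add(topic);
    }
}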

regards,
Colin


On Fri, Jul 13, 2018, at 05:00, Andras Beni wrote:
> The KIP looks good to me.
> However, if there is willingness in the community to work on metadata
> request with patterns, the feature proposed here and filtering by '*' or
> '.*' would be redundant.
> 
> Andras
> 
> 
> 
> On Fri, Jul 13, 2018 at 12:38 AM Jason Gustafson  wrote:
> 
> > Hey Manikumar,
> >
> > As Kafka begins to scale to larger and larger numbers of topics/partitions,
> > I'm a little concerned about the scalability of APIs such as this. The API
> > looks benign, but imagine you have a few million partitions. We
> > already expose similar APIs in the producer and consumer, so probably not
> > much additional harm to expose it in the AdminClient, but it would be nice
> > to put a little thought into some longer term options. We should be giving
> > users an efficient way to select a smaller set of the topics they are
> > interested in. We have always discussed adding some filtering support to
> > the Metadata API. Perhaps now is a good time to reconsider this? We now
> > have a convention for wildcard ACLs, so perhaps we can do something
> > similar. Full regex support might be ideal given the consumer's
> > subscription API, but that is more challenging. What do you think?
> >
> > Thanks,
> > Jason
> >
> > On Thu, Jul 12, 2018 at 2:35 PM, Harsha  wrote:
> >
> > > Very useful. LGTM.
> > >
> > > Thanks,
> > > Harsha
> > >
> > > On Thu, Jul 12, 2018, at 9:56 AM, Manikumar wrote:
> > > > Hi all,
> > > >
> > > > I have created a KIP to add describe all topics API to AdminClient .
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 327%3A+Add+describe+all+topics+API+to+AdminClient
> > > >
> > > > Please take a look.
> > > >
> > > > Thanks,
> > >
> >


Re: [DISCUSS]: KIP-339: Create a new ModifyConfigs API

2018-07-13 Thread Colin McCabe
On Fri, Jul 13, 2018, at 17:45, Ted Yu wrote:
> Looking at modifyConfigs API, it doesn't seem that ConcurrentHashMap
> should be used as the underlying parameter type.

I agree that there are other types of maps that do support null values.
However, the fact that some official map implementations from the
standard library don't support null values makes this a questionable
feature to rely on.  Imagine being a new user of this API who created a
ConcurrentHashMap, tried to insert some null values, and passed it to the
API.  It would compile, but not work.  It would certainly be confusing.

> Anyway, to signify that null value is supported, value type can be
> declared as Optional.
>
> FYI

Yeah, now that we're on Java 8, Optional could be a good choice here.

best,
Colin


> On Fri, Jul 13, 2018 at 5:35 PM Colin McCabe  wrote:
>
>> Hi Ted,
>>
>> That’s a fair question.  I think the main reason I didn’t propose
>> that originally is that many people find null values in maps
>> confusing.  Also, some newer java maps don’t support null values,
>> such as ConcurrentHashMap.  I’m curious what others think about this.
>>
>> Best,
>> Colin
>>
>> On Wed, Jul 11, 2018, at 21:28, Ted Yu wrote:
>> > bq. Map changes, Set removals,
>> >
>> > Is it possible to combine the two parameters into one Map where
>> > null Config value signifies removal of config ?
>> > This way, the following wouldn't occur (reducing un-intended config
>> > removal):
>> >
>> > bq. If a configuration key is specified in both *changes* and
>> > *removals*
>> >
>> > *Cheers*
>> >
>> > On Wed, Jul 11, 2018 at 10:54 AM Colin McCabe  wrote:
>> >
>> > > Hi all,
>> > >
>> > > Previously, we discussed some issues with alterConfigs here on
>> > > the mailing list, and eventually came to the conclusion that the
>> > > RPC as implemented can't be used for a shell command modifying
>> > > configurations.
>> > >
>> > > I wrote up a small KIP to fix the issues with the RPC.  Please
>> > > take a look at
>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+ModifyConfigs+API
>> > >
>> > > best,
>> > > Colin
>> > >



Re: [DISCUSS]: KIP-339: Create a new ModifyConfigs API

2018-07-13 Thread Ted Yu
Looking at modifyConfigs API, it doesn't seem that ConcurrentHashMap should
be used as the underlying parameter type.

Anyway, to signify that null value is supported, value type can be declared
as Optional.

FYI

On Fri, Jul 13, 2018 at 5:35 PM Colin McCabe  wrote:

> Hi Ted,
>
> That’s a fair question.  I think the main reason I didn’t propose that
> originally is that many people find null values in maps confusing.  Also,
> some newer java maps don’t support null values, such as ConcurrentHashMap.
> I’m curious what others think about this.
>
> Best,
> Colin
>
> On Wed, Jul 11, 2018, at 21:28, Ted Yu wrote:
> > bq. Map changes, Set
> removals,
> >
> > Is it possible to combine the two parameters into one Map where null
> Config
> > value signifies removal of config ?
> > This way, the following wouldn't occur (reducing un-intended config
> > removal):
> >
> > bq. If a configuration key is specified in both *changes* and *removals*
> >
> > *Cheers*
> >
> > On Wed, Jul 11, 2018 at 10:54 AM Colin McCabe 
> wrote:
> >
> > > Hi all,
> > >
> > > Previously, we discussed some issues with alterConfigs here on the
> mailing
> > > list, and eventually came to the conclusion that the RPC as implemented
> > > can't be used for a shell command modifying configurations.
> > >
> > > I wrote up a small KIP to fix the issues with the RPC.  Please take a
> look
> > > at
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+ModifyConfigs+API
> > >
> > > best,
> > > Colin
> > >
>
>


Re: [DISCUSS]: KIP-339: Create a new ModifyConfigs API

2018-07-13 Thread Colin McCabe
Hi Ted,

That’s a fair question.  I think the main reason I didn’t propose that
originally is that many people find null values in maps confusing.
Also, some newer java maps don’t support null values, such as
ConcurrentHashMap.  I’m curious what others think about this.
Best,
Colin

On Wed, Jul 11, 2018, at 21:28, Ted Yu wrote:
> bq. Map changes, Set removals,
>
> Is it possible to combine the two parameters into one Map where
> null Config value signifies removal of config ?
> This way, the following wouldn't occur (reducing un-intended config
> removal):
>
> bq. If a configuration key is specified in both *changes* and
> *removals*
>
> *Cheers*
>
> On Wed, Jul 11, 2018 at 10:54 AM Colin McCabe  wrote:
>
> > Hi all,
> >
> > Previously, we discussed some issues with alterConfigs here on the
> > mailing list, and eventually came to the conclusion that the RPC as
> > implemented can't be used for a shell command modifying configurations.
> >
> > I wrote up a small KIP to fix the issues with the RPC.  Please take
> > a look at
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+ModifyConfigs+API
> >
> > best,
> > Colin
> >



[DISCUSS] KIP-341: Update Sticky Assignor's User Data Protocol

2018-07-13 Thread Vahid S Hashemian
Hi all,

I created a short KIP to address an issue in Sticky Assignor assignment 
logic: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-341%3A+Update+Sticky+Assignor%27s+User+Data+Protocol
Please take a look and share your feedback / comments.

In particular, there is a Generation Marker section (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-341%3A+Update+Sticky+Assignor%27s+User+Data+Protocol#KIP-341:UpdateStickyAssignor'sUserDataProtocol-GenerationMarker
) that provides two methods for implementing the improvement to the 
protocol. I'd like to know which option is more popular.

Thanks!
--Vahid




[jira] [Created] (KAFKA-7161) KTable Reduce should check for invalid conditions

2018-07-13 Thread John Roesler (JIRA)
John Roesler created KAFKA-7161:
---

 Summary: KTable Reduce should check for invalid conditions
 Key: KAFKA-7161
 URL: https://issues.apache.org/jira/browse/KAFKA-7161
 Project: Kafka
  Issue Type: Improvement
Reporter: John Roesler
Assignee: John Roesler


KTableReduce has the opportunity to explicitly check whether the state is 
inconsistent with the oldValues arriving from the stream. If it did so, it 
could help detect topology changes that require an app reset and fail fast 
before any data corruption occurs.
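
A hypothetical sketch of such a check (not an actual patch) inside 
KTableReduce.process(), where oldValue is the subtractor input carried by the 
stream and store is the reducer's state store:

{code:java}
V stored = store.get(key);
if (oldValue != null && !oldValue.equals(stored)) {
    throw new StreamsException("oldValue for key " + key + " does not match " +
        "the current state; the topology may have changed and an application " +
        "reset may be required");
}
{code}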



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7156) Deleting topics with long names can bring all brokers to unrecoverable state

2018-07-13 Thread Vahid Hashemian (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian resolved KAFKA-7156.

Resolution: Duplicate

> Deleting topics with long names can bring all brokers to unrecoverable state
> 
>
> Key: KAFKA-7156
> URL: https://issues.apache.org/jira/browse/KAFKA-7156
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Petr Pchelko
>Priority: Major
>
> Kafka's limit for topic names is 249 characters, so creating a topic with a 
> name 248 characters long is possible. However, when deleting the topic, Kafka 
> tries to rename the topic's data directory to add a hash and a `-delete` 
> suffix, so the resulting file name exceeds the 255-character file name limit 
> of most Unix file systems. This provokes a 
> java.nio.file.FileSystemException which in turn immediately shuts down all 
> the brokers. Further attempts to restart the broker fail with the same 
> exception. The only way to resurrect the cluster is to manually delete the 
> affected topic from ZooKeeper and from the disk on all the broker machines.
> Steps to reproduce:
> (Note: delete.topic.enable=true must be set in the config)
> {code:java}
> > kafka-topics.sh --zookeeper localhost:2181 --create --topic 
> > 
> >  --partitions 1 --replication-factor 1
> > kafka-topics.sh --zookeeper localhost:2181 --delete --topic 
> > aaa
>  {code}
> After these 2 commands are executed, all the brokers where this topic is 
> replicated immediately shut down with the following logs:
> {code:java}
> ERROR Error while renaming dir for 
> -0
>  in log dir /tmp/kafka-logs (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: 
> /tmp/kafka-logs/-0
>  -> 
> /tmp/kafka-logs/-0.093fd1e1728f438ea990cbad8a514b9f-delete:
>  File name too long
> at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:457)
> at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> at java.nio.file.Files.move(Files.java:1395)
> ...
> Suppressed: java.nio.file.FileSystemException: 
> /tmp/kafka-logs/-0
>  -> 
> /tmp/kafka-logs/-0.093fd1e1728f438ea990cbad8a514b9f-delete:
>  File name too long
> at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
> at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> at java.nio.file.Files.move(Files.java:1395)
> at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694)
> ... 23 more
> [2018-07-12 13:34:45,847] INFO [ReplicaManager broker=0] Stopping serving 
> replicas in dir /tmp/kafka-logs (kafka.server.ReplicaManager)
> [2018-07-12 13:34:45,848] INFO [ReplicaFetcherManager on broker 0] Removed 
> fetcher for partitions  
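
For reference, a quick length check (suffix format taken from the log above) 
shows why the rename overflows the usual 255-character file name limit:

{code:java}
// Rename target has the form "<topic>-<partition>.<32-hex-uuid>-delete"
int topicName = 248;                                  // just under the 249 limit
int partitionSuffix = "-0".length();                  // 2
int deleteSuffix = ".093fd1e1728f438ea990cbad8a514b9f-delete".length(); // 40
System.out.println(topicName + partitionSuffix + deleteSuffix);  // 290 > 255
{code}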

Re: [VOTE] 2.0.0 RC2

2018-07-13 Thread Ismael Juma
Hi Rajini,

Thanks for generating the RC. It seems like the kafka-streams-scala 2.12
artifact is missing from the Maven repository:

https://repository.apache.org/content/groups/staging/org/apache/kafka/

Since this is the first time we are publishing this artifact, it is
possible that this never worked properly.

Ismael

On Tue, Jul 10, 2018 at 10:17 AM Rajini Sivaram 
wrote:

> Hello Kafka users, developers and client-developers,
>
>
> This is the third candidate for release of Apache Kafka 2.0.0.
>
>
> This is a major version release of Apache Kafka. It includes 40 new KIPs
> and several critical bug fixes. Please see the 2.0.0 release plan for more
> details:
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
>
>
> A few notable highlights:
>
>- Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for CreateTopics
>(KIP-277)
>- SASL/OAUTHBEARER implementation (KIP-255)
>- Improved quota communication and customization of quotas (KIP-219,
>KIP-257)
>- Efficient memory usage for down conversion (KIP-283)
>- Fix log divergence between leader and follower during fast leader
>failover (KIP-279)
>- Drop support for Java 7 and remove deprecated code including old scala
>clients
>- Connect REST extension plugin, support for externalizing secrets and
>improved error handling (KIP-285, KIP-297, KIP-298 etc.)
>- Scala API for Kafka Streams and other Streams API improvements
>(KIP-270, KIP-150, KIP-245, KIP-251 etc.)
>
>
> Release notes for the 2.0.0 release:
>
> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/RELEASE_NOTES.html
>
>
> *** Please download, test and vote by Friday, July 13, 4pm PT
>
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
>
> http://kafka.apache.org/KEYS
>
>
> * Release artifacts to be voted upon (source and binary):
>
> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/
>
>
> * Maven artifacts to be voted upon:
>
> https://repository.apache.org/content/groups/staging/
>
>
> * Javadoc:
>
> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/javadoc/
>
>
> * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
>
> https://github.com/apache/kafka/tree/2.0.0-rc2
>
>
>
> * Documentation:
>
> http://kafka.apache.org/20/documentation.html
>
>
> * Protocol:
>
> http://kafka.apache.org/20/protocol.html
>
>
> * Successful Jenkins builds for the 2.0 branch:
>
> Unit/integration tests: https://builds.apache.org/job/kafka-2.0-jdk8/72/
>
> System tests:
> https://jenkins.confluent.io/job/system-test-kafka/job/2.0/27/
>
>
> /**
>
>
> Thanks,
>
>
> Rajini
>


Request permission to assign JIRA

2018-07-13 Thread 关宇卫
 Hi,
I'm trying to claim a bug about producing to Kafka in JIRA. Could you please
help me get the necessary permissions?
my JIRA username is williamguan.
Thanks for your time.


-- 
williamguan


[jira] [Created] (KAFKA-7160) Add check for group ID length

2018-07-13 Thread lambdaliu (JIRA)
lambdaliu created KAFKA-7160:


 Summary: Add check for group ID length
 Key: KAFKA-7160
 URL: https://issues.apache.org/jira/browse/KAFKA-7160
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: lambdaliu


We should limit the length of the group ID, because other systems (such as 
monitoring systems) use the group ID when we run Kafka in production 
environments.
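
A minimal sketch of such a validation (the limit value and exception choice 
are assumptions, not a committed design):

{code:java}
private static final int MAX_GROUP_ID_LENGTH = 255;  // assumed limit

if (groupId.length() > MAX_GROUP_ID_LENGTH) {
    throw new InvalidGroupIdException("Group ID exceeds the maximum length of "
        + MAX_GROUP_ID_LENGTH + " characters");
}
{code}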



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: KIP-327: Add describe all topics API to AdminClient

2018-07-13 Thread Andras Beni
The KIP looks good to me.
However, if there is willingness in the community to work on metadata
request with patterns, the feature proposed here and filtering by '*' or
'.*' would be redundant.

Andras



On Fri, Jul 13, 2018 at 12:38 AM Jason Gustafson  wrote:

> Hey Manikumar,
>
> As Kafka begins to scale to larger and larger numbers of topics/partitions,
> I'm a little concerned about the scalability of APIs such as this. The API
> > looks benign, but imagine you have a few million partitions. We
> already expose similar APIs in the producer and consumer, so probably not
> much additional harm to expose it in the AdminClient, but it would be nice
> to put a little thought into some longer term options. We should be giving
> users an efficient way to select a smaller set of the topics they are
> interested in. We have always discussed adding some filtering support to
> the Metadata API. Perhaps now is a good time to reconsider this? We now
> have a convention for wildcard ACLs, so perhaps we can do something
> similar. Full regex support might be ideal given the consumer's
> subscription API, but that is more challenging. What do you think?
>
> Thanks,
> Jason
>
> On Thu, Jul 12, 2018 at 2:35 PM, Harsha  wrote:
>
> > Very useful. LGTM.
> >
> > Thanks,
> > Harsha
> >
> > On Thu, Jul 12, 2018, at 9:56 AM, Manikumar wrote:
> > > Hi all,
> > >
> > > I have created a KIP to add describe all topics API to AdminClient .
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 327%3A+Add+describe+all+topics+API+to+AdminClient
> > >
> > > Please take a look.
> > >
> > > Thanks,
> >
>


Kafka Namespaces

2018-07-13 Thread Andras Beni
Hi All,

At Kafka Summit London 2018, Neha presented a thought experiment
about namespaces in Apache Kafka. I'd like to know if work on this vision
has started and if so, where I can find more information on it.
KIP-37 seems to be related, but was abandoned way before the talk.

Thanks,
Andras


Re: Builder Pattern for kafka-clients in 2.x ?

2018-07-13 Thread Matthias Wessendorf
Hi, Tommy!

thanks for the feedback - I've reopened the PR adding the ctor overload.

I've drafted a ProducerBuilder, which can be used like:

final Producer<Integer, String> producer = new ProducerBuilder<Integer, String>()
    .partitionerClass(DefaultPartitioner.class)
    .keySerializer(IntegerSerializer.class)
    .valueSerializer(StringSerializer.class)
    .bootstrapServers("localhost:9092")
    .build();

producer.send(new ProducerRecord<>("mytopic", myKey, "the-val",
    myHeaderIterable));

Internally the builder populates a property and on build() it delegates it
to the KafkaProducer ctor.
code: https://gist.github.com/matzew/b98dbacf7f40f27c9f666b736a2428d3
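
A minimal sketch of that delegation (the shape here is assumed; the actual
draft lives in the gist above):

public class ProducerBuilder<K, V> {
    private final Properties props = new Properties();

    public ProducerBuilder<K, V> bootstrapServers(String servers) {
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        return this;
    }

    public ProducerBuilder<K, V> keySerializer(Class<? extends Serializer<K>> clazz) {
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, clazz.getName());
        return this;
    }

    public ProducerBuilder<K, V> valueSerializer(Class<? extends Serializer<V>> clazz) {
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, clazz.getName());
        return this;
    }

    public Producer<K, V> build() {
        return new KafkaProducer<>(props);  // all options end up in the ctor
    }
}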

I think this reads much nicer than directly dealing with the current way of
creating consumers and producers.

How do people feel about this?

thanks,
Matthias



On Thu, Jul 5, 2018 at 3:11 PM Thomas Becker  wrote:

> Personally, I like the idea of builders for the producer/consumer
> themselves, but I'm less enthusiastic about one for ProducerRecord. Mostly
> because I think the following is overly verbose/reads poorly:
>
> producer.send(ProducerRecord.builder()
> .topic("mytopic")
> .key("Key")
> .value("the-val")
> .headers(myHeaderIterable)
> .build());
>
> as compared to:
>
> producer.send(new ProducerRecord("mytopic", "Key", "the-val",
> myHeaderIterable));
>
> I think constructor overloads are fine for small data classes like this.
> The producer/consumer clietns themselves have a lot of options represented
> by various configuration keys, and a builder pattern makes these easily
> discoverable in code.
>
> -Tommy
>
> On Wed, 2018-07-04 at 15:42 +0200, Matthias Wessendorf wrote:
>
> Hi,
>
>
> I was filing KAFKA-7059 ([1]) and sent a PR adding a new ctor:
>
> --
>
> public ProducerRecord(String topic, K key, V value, Iterable
>
> headers)
>
> ---
>
>
> One reasonable comment on the PR was instead of doing constructor
>
> overloading, why not working on a builder for the ProducerRecord class.
>
>
> I think this is generally a nice idea. I was wondering if there is much
>
> interest in it?
>
>
> Sample:
>
> ---
>
> final ProducerRecord myRecord = ProducerRecord.builder() //
>
> or an exposed builder
>
> .topic("mytopic")
>
> .key("Key")
>
> .value("the-val")
>
> .headers(myHeaderIterable)
>
> .build();
>
> ---
>
>
> While at it - instead of just offering a builder for the "ProducerRecord"
>
> class, why not adding a builder for the "KafkaProducer" and "KafkaConsumer"
>
> clazzes.
>
>
> ---
>
> final KafkaProducer myProducer = KafkaProducer.builder() //
>
> or an exposed builder clazz
>
> .config(myProducerConfig)
>
> .keySerializer(myStringSerializer)
>
> .valueSerializer(myStringSerializer)
>
> .build();
>
> ---
>
>
> To make the above even nicer, I think the "ProducerConfig" (and analogously
>
> the "ConsumerConfig") configuration options could also be made accessible
>
> with this fluent API - instead of the properties/map, which is what now
>
> dominates the creation of the Consumers/Producers.
>
>
>
> Any thoughts?   If there is interest, I am happy to start a KIP w/ a first
>
> draft of the suggested API!
>
>
> Cheers,
>
> Matthias
>
>
> [1] https://issues.apache.org/jira/browse/KAFKA-7059
>
>
>
>
>
> 
>
> This email and any attachments may contain confidential and privileged
> material for the sole use of the intended recipient. Any review, copying,
> or distribution of this email (or any attachments) by others is prohibited.
> If you are not the intended recipient, please contact the sender
> immediately and permanently delete this email and any attachments. No
> employee or agent of TiVo Inc. is authorized to conclude any binding
> agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo
> Inc. may only be made by a signed written agreement.
>


-- 
Matthias Wessendorf

github: https://github.com/matzew
twitter: http://twitter.com/mwessendorf


[jira] [Created] (KAFKA-7159) mark configuration files in confluent-kafka RPM SPEC file

2018-07-13 Thread Robert (JIRA)
Robert created KAFKA-7159:
-

 Summary: mark configuration files in confluent-kafka RPM SPEC file
 Key: KAFKA-7159
 URL: https://issues.apache.org/jira/browse/KAFKA-7159
 Project: Kafka
  Issue Type: Improvement
  Components: packaging
Affects Versions: 1.1.0
 Environment: RHEL7
Reporter: Robert


All configuration files in the confluent-kafka- RPM SPEC file should be 
marked with the %config prefix in the %files section.

This would prevent overwrites during install/upgrade and uninstall operations.

[https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/rpm_packaging_guide/index#files]

It's especially important to save configuration during package upgrades.

Section to change in SPEC file:
{code:java}
%files
%config(noreplace) %{_sysconfdir}/kafka/*.conf
%config(noreplace) %{_sysconfdir}/kafka/*.properties
{code}
It would also be good to mark documentation files with %doc.
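
For example (the path is an assumption about the package layout):

{code:java}
%files
%doc %{_docdir}/%{name}/
{code}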



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)