Re: Java high level consumer providing duplicate messages when auto commit is off

2015-10-23 Thread Cliff Rhyne
Thanks, Jiangjie.  Understanding more about the auto-commit behavior and
why it's resilient to these is a big help.

We're going to do some deeper investigation and testing.  I'll report back
when I have more information.

Thanks,
Cliff

On Thu, Oct 22, 2015 at 11:48 PM, Jiangjie Qin 
wrote:

> Hi Cliff,
>
> If auto.offset.commit is set to true, the offset will be committed in the
> following cases in addition to the periodic offset commit:
>
> 1. During consumer rebalance, before releasing partition ownership.
> If consumer A owns partition P before the rebalance, it will commit the offset
> for partition P during the rebalance. If consumer B becomes the new owner of
> partition P after the rebalance, it will start from the committed offset, so
> there will be no duplicate messages.
> 2. When consumer closes.
>
> Rebalance will be triggered in the following cases:
> 1. A consumer joins/leaves the group.
> 2. A topic/partition change occurs for the subscribed topics (e.g.
> partition expansion for a topic, or a new topic is created and the consumer
> is using a wildcard that matches the new topic name).
>
> To answer your question:
> The simple consumer should not interfere with the high level consumer because
> it does not have any group management embedded.
>
> Typically a single high level consumer group will not rebalance unless
> there is a topic/partition change. However, it is possible that the consumer
> itself dropped out of the group and rejoined. This typically happens when
> you have a ZK session timeout. In that case, you should see "ZK expired" in
> your log. You can search for that and see if that is the problem.
>
> Jiangjie (Becket) Qin
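
For reference, the knobs discussed above look roughly like this on the old high
level consumer (a minimal sketch with hypothetical hosts and values; the keys
are the standard 0.8.x consumer properties):

    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class AutoCommitExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "zk1:2181");   // hypothetical ensemble
            props.put("group.id", "my-group");            // hypothetical group
            // With auto commit on, offsets are committed periodically and also
            // on rebalance (before releasing partition ownership) and on close.
            props.put("auto.commit.enable", "true");
            props.put("auto.commit.interval.ms", "60000");
            // A short session timeout makes "ZK expired" (and the rebalance that
            // follows the consumer rejoining) more likely under GC pauses.
            props.put("zookeeper.session.timeout.ms", "6000");

            ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            // ... create message streams and consume; consumer.shutdown() also
            // triggers a final offset commit when auto commit is on.
        }
    }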
>
>
> On Thu, Oct 22, 2015 at 1:14 PM, Cliff Rhyne  wrote:
>
> > We did some more testing with logging turned on (I figured out why it
> > wasn't working).  We tried increasing the JVM memory capacity on our test
> > server (it's lower than in production) and increasing the zookeeper
> > timeouts.  Neither changed the results.  With trace logging enabled, we
> > saw that we were getting rebalances even though only one high level
> > consumer was running (there previously was a simple consumer that was told
> > to disconnect, but that consumer only checked the offsets and never
> > consumed data).
> >
> > - Is there possibly a race condition where the simple consumer still holds
> > a partition and shutdown is called before starting the high level consumer,
> > but the shutdown completes asynchronously?
> > - What are the various things that can cause a consumer rebalance other
> > than adding / removing high level consumers?
> >
> > Thanks,
> > Cliff
> >
> > On Wed, Oct 21, 2015 at 4:20 PM, Cliff Rhyne  wrote:
> >
> > > Hi Kris,
> > >
> > > Thanks for the tip.  I'm going to investigate this further.  I checked
> > > and we have fairly short zk timeouts and run with a smaller memory
> > > allocation in the two environments where we encounter this issue.  I'll
> > > let you all know what I find.
> > >
> > > I saw this ticket https://issues.apache.org/jira/browse/KAFKA-2049 that
> > > seems to be related to the problem (but it would only report that an issue
> > > occurred).  Are there any other open issues that could be worked on to
> > > improve Kafka's handling of this situation?
> > >
> > > Thanks,
> > > Cliff
> > >
> > > On Wed, Oct 21, 2015 at 2:53 PM, Kris K 
> wrote:
> > >
> > >> Hi Cliff,
> > >>
> > >> One other case I observed in my environment is when there were GC
> > >> pauses on one of our high level consumers in the group.
> > >>
> > >> Thanks,
> > >> Kris
> > >>
> > >> On Wed, Oct 21, 2015 at 10:12 AM, Cliff Rhyne 
> wrote:
> > >>
> > >> > Hi James,
> > >> >
> > >> > There are two scenarios we run:
> > >> >
> > >> > 1. Multiple partitions with one consumer per partition.  Consumers are
> > >> > rarely started or stopped, so the pool is very static.  There is a
> > >> > configured consumer timeout, which causes a ConsumerTimeoutException to
> > >> > be thrown before the test starts.  We handle this exception and then
> > >> > resume consuming.
> > >> > 2. Single partition with one consumer.  This consumer is started by a
> > >> > triggered condition (the number of messages pending in the kafka topic,
> > >> > or a schedule).  The consumer is stopped after processing is completed.
> > >> >
> > >> > In both cases, based on my understanding there shouldn't be a rebalance,
> > >> > as either a) all consumers are running or b) there's only one consumer per
> > >> > partition.  Also, the same consumer group is used by all consumers in
> > >> > scenarios 1 and 2.  Is there a good way to investigate whether rebalances
> > >> > are occurring?
> > >> >
> > >> > Thanks,
> > >> > Cliff
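
The consumer-timeout handling Cliff describes in scenario 1 above would look
roughly like this (a sketch against the 0.8 high level consumer API; stream
setup is omitted and process() is a hypothetical handler):

    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.ConsumerTimeoutException;
    import kafka.consumer.KafkaStream;
    import kafka.message.MessageAndMetadata;

    // 'stream' comes from ConsumerConnector.createMessageStreams(...).
    void consume(KafkaStream<byte[], byte[]> stream) {
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (true) {
            try {
                while (it.hasNext()) {
                    MessageAndMetadata<byte[], byte[]> record = it.next();
                    process(record);  // hypothetical handler
                }
            } catch (ConsumerTimeoutException e) {
                // consumer.timeout.ms elapsed with no messages; this is not
                // fatal, so resume consuming from the same iterator.
            }
        }
    }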
> > >> >
> > >> > On Wed, Oct 21, 2015 at 11:37 AM, James Cheng 
> > wrote:
> > >> >
> > >> > > Do you 

[jira] [Commented] (KAFKA-2675) SASL/Kerberos follow-up

2015-10-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971271#comment-14971271
 ] 

Ismael Juma commented on KAFKA-2675:


Thanks [~harsha_ch].

You mentioned that many projects use `serviceName` in the JAAS file (even 
though it's a non-standard JAAS config and causes the IBM JVM to fail). It 
would be good if you could elaborate on this some more for our benefit. 
ZooKeeper uses a system property (zookeeper.sasl.client.username) and there is 
no mention of serviceName in the Hadoop Security book (which makes sense since 
it doesn't use JAAS for configuration). I did see serviceName mentioned in the 
HDP documentation for a few projects (including Kafka), but it wasn't clear to 
me whether this was an HDP enhancement or an upstream feature (it is an enhancement 
for Kafka, of course).
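
For reference, the usage in question is a JAAS entry along these lines (a sketch 
with placeholder principal and keytab values):

{code}
KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="/path/to/client.keytab"
    principal="kafka-client@EXAMPLE.COM"
    serviceName="kafka";
};
{code}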

Regarding SASL_KAFKA_SERVER_REALM, that makes sense. I think I know where this 
needs to be changed and will include the change in the PR for this JIRA.

> SASL/Kerberos follow-up
> ---
>
> Key: KAFKA-2675
> URL: https://issues.apache.org/jira/browse/KAFKA-2675
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.9.0.0
>
>
> This is a follow-up to KAFKA-1686. 
> 1. Decide on `serviceName` configuration: do we want to keep it in two places?
> 2. auth.to.local config name is a bit opaque, is there a better one?
> 3. Implement or remove SASL_KAFKA_SERVER_REALM config
> 4. Consider making Login's thread a daemon thread
> 5. Write test that shows authentication failure due to invalid user
> 6. Write test that shows authentication failure due to wrong password
> 7. Write test that shows authentication failure due ticket expiring



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: (User) Document/guide on new Kafka features on security in upcoming 0.9.0?

2015-10-23 Thread Harsha
Jaikiran,
 Here is the SSL setup doc:
 https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka
I am going to write one for SASL (Kerberos) soon, and Parth is working
on pushing one for the Authorizer as well.

Thanks,
Harsha
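
For a quick sense of what that deployment doc covers, the broker and client
settings look roughly like this (illustrative paths and passwords only):

    # Broker (server.properties)
    listeners=PLAINTEXT://broker1:9092,SSL://broker1:9093
    ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
    ssl.keystore.password=changeit
    ssl.key.password=changeit
    ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
    ssl.truststore.password=changeit

    # Client (producer/consumer properties)
    security.protocol=SSL
    ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
    ssl.truststore.password=changeit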

On Fri, Oct 23, 2015, at 04:42 PM, Jaikiran Pai wrote:
> Given that 0.9.0 is planned to be released sometime in November, our team 
> has started looking into what's coming in the newer version and what we need 
> to do to start using the features. We are mainly interested in its security 
> features for our upcoming releases. We have been watching some of the KIP 
> discussion threads and the KIP articles around security [1][2]. However, 
> those read mostly like design discussions and rationale. Is there a document 
> that explains to end users of Kafka which security features have been 
> implemented for this release and what needs to be done to start using them?
> 
> The reason I ask for this now is so that some of us can start testing these 
> features before the release and raise any issues that we might run into.
> 
> [1] 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
> [2] 
> https://cwiki.apache.org/confluence/display/KAFKA/Security#Security-Authorization
> 
> -Jaikiran


[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-23 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972339#comment-14972339
 ] 

Sriharsha Chintalapani commented on KAFKA-2644:
---

[~junrao] If possible we should use Kerberos instead of MiniKDC. I am not sure 
how the ducktape tests run; if they are using Vagrant, I have Vagrant Kerberos 
setup details here: https://github.com/harshach/kafka-vagrant

> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2687) Allow GroupMetadataRequest to return member metadata when received by group coordinator

2015-10-23 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-2687:
--

 Summary: Allow GroupMetadataRequest to return member metadata when 
received by group coordinator
 Key: KAFKA-2687
 URL: https://issues.apache.org/jira/browse/KAFKA-2687
 Project: Kafka
  Issue Type: Improvement
Reporter: Jason Gustafson
Assignee: Jason Gustafson


Since the new consumer currently has no persistence in Zookeeper (pending 
outcome of KAFKA-2017), there is no way for administrators to investigate group 
status including getting the list of members in the group and their partition 
assignments. We therefore propose to modify GroupMetadataRequest (previously 
known as ConsumerMetadataRequest) to return group metadata when received by the 
respective group's coordinator. When received by another broker, the request 
will be handled as before: by only returning coordinator host and port 
information.

{code}
GroupMetadataRequest => GroupId IncludeMetadata
  GroupId => String
  IncludeMetadata => Boolean

GroupMetadataResponse => ErrorCode Coordinator GroupMetadata
  ErrorCode => int16
  Coordinator => Id Host Port
Id => int32
Host => string
Port => int32
  GroupMetadata => State ProtocolType Generation Protocol Leader  Members
State => String
ProtocolType => String
Generation => int32
Protocol => String
Leader => String
Members => [Member MemberMetadata MemberAssignment]
  Member => MemberIp ClientId
MemberIp => String
ClientId => String
  MemberMetadata => Bytes
  MemberAssignment => Bytes
{code}

The request schema includes a flag to indicate whether metadata is needed, 
which saves clients from having to read all group metadata when they are just 
trying to find the coordinator. This is important to reduce group overhead for 
use cases which involve a large number of topic subscriptions (e.g. mirror 
maker).

Tools will use the protocol type to determine how to parse metadata. For 
example, when the protocolType is "consumer", the tool can use ConsumerProtocol 
to parse the member metadata as topic subscriptions and partition assignments. 
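
For a sense of what tools would consume, the response above maps naturally onto a 
plain data holder along these lines (a hypothetical Java sketch mirroring the 
schema, not an existing Kafka class):

{code}
// Hypothetical mirror of GroupMetadataResponse for tooling; field names follow
// the schema above.
public class GroupMetadataInfo {
    public short errorCode;
    public int coordinatorId;
    public String coordinatorHost;
    public int coordinatorPort;

    // Only populated when IncludeMetadata was set and the coordinator answered.
    public String state;
    public String protocolType;   // e.g. "consumer"
    public int generation;
    public String protocol;       // the group's selected protocol
    public String leader;
    public java.util.List<MemberInfo> members;

    public static class MemberInfo {
        public String memberIp;
        public String clientId;
        public byte[] metadata;    // parse with ConsumerProtocol when protocolType is "consumer"
        public byte[] assignment;
    }
}
{code}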



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2235) LogCleaner offset map overflow

2015-10-23 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971728#comment-14971728
 ] 

Joel Koshy commented on KAFKA-2235:
---

While I think the change is well-motivated, I'm not sure this is the right fix 
for this issue, as the check is too conservative; i.e., especially with highly 
compressible messages, the message count in the segment may be extremely high 
but the unique-key count may be low.

> LogCleaner offset map overflow
> --
>
> Key: KAFKA-2235
> URL: https://issues.apache.org/jira/browse/KAFKA-2235
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.8.1, 0.8.2.0
>Reporter: Ivan Simoneko
>Assignee: Ivan Simoneko
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2235_v1.patch, KAFKA-2235_v2.patch
>
>
> We've seen log cleaning generate an error for a topic with lots of small 
> messages. It seems that a cleanup map overflow is possible if a log segment 
> contains more unique keys than there are empty slots in the offsetMap. Checking 
> baseOffset and map utilization before processing a segment seems to be 
> insufficient because it doesn't take the segment size (the number of unique 
> messages in the segment) into account.
> I suggest estimating the upper bound of keys in a segment as the number of 
> messages in the segment and comparing it with the number of available slots in 
> the map (keeping the desired load factor in mind). This should work in cases 
> where an empty map is capable of holding all the keys for a single segment. If 
> even a single segment is not able to fit into an empty map, the cleanup process 
> will still fail. Perhaps there should be a limit on the number of entries per 
> log segment?
> Here is the stack trace for this error:
> 2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] 
> kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
> java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
> entry to a full offset map.
>at scala.Predef$.require(Predef.scala:233)
>at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
>at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>at kafka.message.MessageSet.foreach(MessageSet.scala:67)
>at 
> kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
>at scala.collection.immutable.Stream.foreach(Stream.scala:547)
>at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
>at kafka.log.Cleaner.clean(LogCleaner.scala:307)
>at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
>at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
>at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2687) Allow GroupMetadataRequest to return member metadata when received by group coordinator

2015-10-23 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971732#comment-14971732
 ] 

Gwen Shapira commented on KAFKA-2687:
-

If we allow a regexp as the GroupId string, then it will handle most cases for 
getting lists of groups.
Agreed about the IncludeMetadata flag to keep responses to a reasonable size.

And yeah, having to query all brokers is a reasonable limitation that we can 
build tools to hide from admins.

> Allow GroupMetadataRequest to return member metadata when received by group 
> coordinator
> ---
>
> Key: KAFKA-2687
> URL: https://issues.apache.org/jira/browse/KAFKA-2687
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Since the new consumer currently has no persistence in Zookeeper (pending 
> outcome of KAFKA-2017), there is no way for administrators to investigate 
> group status including getting the list of members in the group and their 
> partition assignments. We therefore propose to modify GroupMetadataRequest 
> (previously known as ConsumerMetadataRequest) to return group metadata when 
> received by the respective group's coordinator. When received by another 
> broker, the request will be handled as before: by only returning coordinator 
> host and port information.
> {code}
> GroupMetadataRequest => GroupId IncludeMetadata
>   GroupId => String
>   IncludeMetadata => Boolean
> GroupMetadataResponse => ErrorCode Coordinator GroupMetadata
>   ErrorCode => int16
>   Coordinator => Id Host Port
> Id => int32
> Host => string
> Port => int32
>   GroupMetadata => State ProtocolType Generation Protocol Leader  Members
> State => String
> ProtocolType => String
> Generation => int32
> Protocol => String
> Leader => String
> Members => [Member MemberMetadata MemberAssignment]
>   Member => MemberIp ClientId
> MemberIp => String
> ClientId => String
>   MemberMetadata => Bytes
>   MemberAssignment => Bytes
> {code}
> The request schema includes a flag to indicate whether metadata is needed, 
> which saves clients from having to read all group metadata when they are just 
> trying to find the coordinator. This is important to reduce group overhead 
> for use cases which involve a large number of topic subscriptions (e.g. 
> mirror maker).
> Tools will use the protocol type to determine how to parse metadata. For 
> example, when the protocolType is "consumer", the tool can use 
> ConsumerProtocol to parse the member metadata as topic subscriptions and 
> partition assignments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2687) Allow GroupMetadataRequest to return member metadata when received by group coordinator

2015-10-23 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971770#comment-14971770
 ] 

Jason Gustafson commented on KAFKA-2687:


[~gwenshap] That's an interesting idea. I wonder if there would be any problems 
with regex compatibility across languages? 

> Allow GroupMetadataRequest to return member metadata when received by group 
> coordinator
> ---
>
> Key: KAFKA-2687
> URL: https://issues.apache.org/jira/browse/KAFKA-2687
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Since the new consumer currently has no persistence in Zookeeper (pending 
> outcome of KAFKA-2017), there is no way for administrators to investigate 
> group status including getting the list of members in the group and their 
> partition assignments. We therefore propose to modify GroupMetadataRequest 
> (previously known as ConsumerMetadataRequest) to return group metadata when 
> received by the respective group's coordinator. When received by another 
> broker, the request will be handled as before: by only returning coordinator 
> host and port information.
> {code}
> GroupMetadataRequest => GroupId IncludeMetadata
>   GroupId => String
>   IncludeMetadata => Boolean
> GroupMetadataResponse => ErrorCode Coordinator GroupMetadata
>   ErrorCode => int16
>   Coordinator => Id Host Port
> Id => int32
> Host => string
> Port => int32
>   GroupMetadata => State ProtocolType Generation Protocol Leader  Members
> State => String
> ProtocolType => String
> Generation => int32
> Protocol => String
> Leader => String
> Members => [Member MemberMetadata MemberAssignment]
>   Member => MemberIp ClientId
> MemberIp => String
> ClientId => String
>   MemberMetadata => Bytes
>   MemberAssignment => Bytes
> {code}
> The request schema includes a flag to indicate whether metadata is needed, 
> which saves clients from having to read all group metadata when they are just 
> trying to find the coordinator. This is important to reduce group overhead 
> for use cases which involve a large number of topic subscriptions (e.g. 
> mirror maker).
> Tools will use the protocol type to determine how to parse metadata. For 
> example, when the protocolType is "consumer", the tool can use 
> ConsumerProtocol to parse the member metadata as topic subscriptions and 
> partition assignments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2235) LogCleaner offset map overflow

2015-10-23 Thread Todd Palino (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971806#comment-14971806
 ] 

Todd Palino commented on KAFKA-2235:


I don't think we can. I have already increased it from 512MB to 1GB, and we 
still hit the same problems. That only provides a 2x increase in the size of 
the map, and I would need almost a 10x increase to solve the problem.
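
To make the sizing concrete, a rough back-of-the-envelope (assuming 
SkimpyOffsetMap's 24 bytes per entry, i.e. a 16-byte MD5 hash plus an 8-byte 
offset, and the default 0.9 load factor) lines up with the numbers above:

{code}
public class OffsetMapCapacity {
    public static void main(String[] args) {
        // Assumption: 24 bytes per map entry (16-byte MD5 hash + 8-byte offset)
        // and the default 0.9 buffer load factor.
        long dedupeBufferBytes = 1L << 30;   // log.cleaner.dedupe.buffer.size = 1 GB
        int bytesPerEntry = 16 + 8;
        double loadFactor = 0.9;
        long maxEntries = (long) (dedupeBufferBytes * loadFactor / bytesPerEntry);
        System.out.println(maxEntries);      // ~40 million unique keys
        // Doubling the buffer (512 MB -> 1 GB) only doubles this figure, which
        // is why a ~10x increase would be needed here.
    }
}
{code}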

> LogCleaner offset map overflow
> --
>
> Key: KAFKA-2235
> URL: https://issues.apache.org/jira/browse/KAFKA-2235
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.8.1, 0.8.2.0
>Reporter: Ivan Simoneko
>Assignee: Ivan Simoneko
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2235_v1.patch, KAFKA-2235_v2.patch
>
>
> We've seen log cleaning generate an error for a topic with lots of small 
> messages. It seems that a cleanup map overflow is possible if a log segment 
> contains more unique keys than there are empty slots in the offsetMap. Checking 
> baseOffset and map utilization before processing a segment seems to be 
> insufficient because it doesn't take the segment size (the number of unique 
> messages in the segment) into account.
> I suggest estimating the upper bound of keys in a segment as the number of 
> messages in the segment and comparing it with the number of available slots in 
> the map (keeping the desired load factor in mind). This should work in cases 
> where an empty map is capable of holding all the keys for a single segment. If 
> even a single segment is not able to fit into an empty map, the cleanup process 
> will still fail. Perhaps there should be a limit on the number of entries per 
> log segment?
> Here is the stack trace for this error:
> 2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] 
> kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
> java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
> entry to a full offset map.
>at scala.Predef$.require(Predef.scala:233)
>at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
>at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>at kafka.message.MessageSet.foreach(MessageSet.scala:67)
>at 
> kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
>at scala.collection.immutable.Stream.foreach(Stream.scala:547)
>at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
>at kafka.log.Cleaner.clean(LogCleaner.scala:307)
>at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
>at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
>at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2687) Allow GroupMetadataRequest to return member metadata when received by group coordinator

2015-10-23 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971660#comment-14971660
 ] 

Jason Gustafson commented on KAFKA-2687:


[~junrao] [~guozhang] I can write up a KIP for this if you think it's 
necessary. I think we need this for the 0.9 release if we decide to delay 
implementation of group persistence in KAFKA-2017. The consequence of not 
having it is that users will have to inspect consumer/broker logs to find group 
metadata.

> Allow GroupMetadataRequest to return member metadata when received by group 
> coordinator
> ---
>
> Key: KAFKA-2687
> URL: https://issues.apache.org/jira/browse/KAFKA-2687
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Since the new consumer currently has no persistence in Zookeeper (pending 
> outcome of KAFKA-2017), there is no way for administrators to investigate 
> group status including getting the list of members in the group and their 
> partition assignments. We therefore propose to modify GroupMetadataRequest 
> (previously known as ConsumerMetadataRequest) to return group metadata when 
> received by the respective group's coordinator. When received by another 
> broker, the request will be handled as before: by only returning coordinator 
> host and port information.
> {code}
> GroupMetadataRequest => GroupId IncludeMetadata
>   GroupId => String
>   IncludeMetadata => Boolean
> GroupMetadataResponse => ErrorCode Coordinator GroupMetadata
>   ErrorCode => int16
>   Coordinator => Id Host Port
> Id => int32
> Host => string
> Port => int32
>   GroupMetadata => State ProtocolType Generation Protocol Leader  Members
> State => String
> ProtocolType => String
> Generation => int32
> Protocol => String
> Leader => String
> Members => [Member MemberMetadata MemberAssignment]
>   Member => MemberIp ClientId
> MemberIp => String
> ClientId => String
>   MemberMetadata => Bytes
>   MemberAssignment => Bytes
> {code}
> The request schema includes a flag to indicate whether metadata is needed, 
> which saves clients from having to read all group metadata when they are just 
> trying to find the coordinator. This is important to reduce group overhead 
> for use cases which involve a large number of topic subscriptions (e.g. 
> mirror maker).
> Tools will use the protocol type to determine how to parse metadata. For 
> example, when the protocolType is "consumer", the tool can use 
> ConsumerProtocol to parse the member metadata as topic subscriptions and 
> partition assignments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2687) Allow GroupMetadataRequest to return member metadata when received by group coordinator

2015-10-23 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971677#comment-14971677
 ] 

Gwen Shapira commented on KAFKA-2687:
-

This seems like a per-group request. Will there be a way to find a list of 
current groups? 

> Allow GroupMetadataRequest to return member metadata when received by group 
> coordinator
> ---
>
> Key: KAFKA-2687
> URL: https://issues.apache.org/jira/browse/KAFKA-2687
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Since the new consumer currently has no persistence in Zookeeper (pending 
> outcome of KAFKA-2017), there is no way for administrators to investigate 
> group status including getting the list of members in the group and their 
> partition assignments. We therefore propose to modify GroupMetadataRequest 
> (previously known as ConsumerMetadataRequest) to return group metadata when 
> received by the respective group's coordinator. When received by another 
> broker, the request will be handled as before: by only returning coordinator 
> host and port information.
> {code}
> GroupMetadataRequest => GroupId IncludeMetadata
>   GroupId => String
>   IncludeMetadata => Boolean
> GroupMetadataResponse => ErrorCode Coordinator GroupMetadata
>   ErrorCode => int16
>   Coordinator => Id Host Port
> Id => int32
> Host => string
> Port => int32
>   GroupMetadata => State ProtocolType Generation Protocol Leader  Members
> State => String
> ProtocolType => String
> Generation => int32
> Protocol => String
> Leader => String
> Members => [Member MemberMetadata MemberAssignment]
>   Member => MemberIp ClientId
> MemberIp => String
> ClientId => String
>   MemberMetadata => Bytes
>   MemberAssignment => Bytes
> {code}
> The request schema includes a flag to indicate whether metadata is needed, 
> which saves clients from having to read all group metadata when they are just 
> trying to find the coordinator. This is important to reduce group overhead 
> for use cases which involve a large number of topic subscriptions (e.g. 
> mirror maker).
> Tools will use the protocol type to determine how to parse metadata. For 
> example, when the protocolType is "consumer", the tool can use 
> ConsumerProtocol to parse the member metadata as topic subscriptions and 
> partition assignments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2235) LogCleaner offset map overflow

2015-10-23 Thread Todd Palino (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971740#comment-14971740
 ] 

Todd Palino commented on KAFKA-2235:


I'm sure [~jjkoshy] will follow along with more detail on this, but we've run 
into a serious problem with this check. Basically, it's impossible to perform 
this kind of check accurately before the offset map is built. We now have 
partitions that should be able to be compacted as the total number of unique 
keys is far below the size of the offset map (currently at ~39 million for our 
configuration) but the messages are very frequent and very small. Even at a 
segment size of 64 MB, we have over 300 million messages in those segments. So 
this check creates a situation where log compaction should succeed, but fails 
because of a speculative check.

While I can play the game of trying to walk back segment sizes, there's no way 
to size segments by number of messages, so it's a guessing game. In addition, 
the check is clearly wrong in that case, so I shouldn't have to config around 
it. Lastly, the check causes the log cleaner thread to exit, which means log 
compaction on the broker fails entirely, rather than just skipping that 
partition.

A better way to handle this would be to cleanly catch the original error you 
are seeing, generate a clear error message in the logs as to what the failure 
is, and allow the log cleaner to continue and handle other partitions. You 
could also maintain a blacklist of partitions in memory in the log cleaner to 
make sure you don't come back around and try to compact the partition again.

> LogCleaner offset map overflow
> --
>
> Key: KAFKA-2235
> URL: https://issues.apache.org/jira/browse/KAFKA-2235
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.8.1, 0.8.2.0
>Reporter: Ivan Simoneko
>Assignee: Ivan Simoneko
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2235_v1.patch, KAFKA-2235_v2.patch
>
>
> We've seen log cleaning generate an error for a topic with lots of small 
> messages. It seems that a cleanup map overflow is possible if a log segment 
> contains more unique keys than there are empty slots in the offsetMap. Checking 
> baseOffset and map utilization before processing a segment seems to be 
> insufficient because it doesn't take the segment size (the number of unique 
> messages in the segment) into account.
> I suggest estimating the upper bound of keys in a segment as the number of 
> messages in the segment and comparing it with the number of available slots in 
> the map (keeping the desired load factor in mind). This should work in cases 
> where an empty map is capable of holding all the keys for a single segment. If 
> even a single segment is not able to fit into an empty map, the cleanup process 
> will still fail. Perhaps there should be a limit on the number of entries per 
> log segment?
> Here is the stack trace for this error:
> 2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] 
> kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
> java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
> entry to a full offset map.
>at scala.Predef$.require(Predef.scala:233)
>at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
>at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>at kafka.message.MessageSet.foreach(MessageSet.scala:67)
>at 
> kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
>at scala.collection.immutable.Stream.foreach(Stream.scala:547)
>at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
>at kafka.log.Cleaner.clean(LogCleaner.scala:307)
>at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
>at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
>at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2235) LogCleaner offset map overflow

2015-10-23 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971804#comment-14971804
 ] 

Jun Rao commented on KAFKA-2235:


[~toddpalino], thanks for reporting this issue. I agree that it's better to log 
an error and then continue. For your use case, it seems that you can just 
increase log.cleaner.dedupe.buffer.size.

> LogCleaner offset map overflow
> --
>
> Key: KAFKA-2235
> URL: https://issues.apache.org/jira/browse/KAFKA-2235
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.8.1, 0.8.2.0
>Reporter: Ivan Simoneko
>Assignee: Ivan Simoneko
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2235_v1.patch, KAFKA-2235_v2.patch
>
>
> We've seen log cleaning generate an error for a topic with lots of small 
> messages. It seems that a cleanup map overflow is possible if a log segment 
> contains more unique keys than there are empty slots in the offsetMap. Checking 
> baseOffset and map utilization before processing a segment seems to be 
> insufficient because it doesn't take the segment size (the number of unique 
> messages in the segment) into account.
> I suggest estimating the upper bound of keys in a segment as the number of 
> messages in the segment and comparing it with the number of available slots in 
> the map (keeping the desired load factor in mind). This should work in cases 
> where an empty map is capable of holding all the keys for a single segment. If 
> even a single segment is not able to fit into an empty map, the cleanup process 
> will still fail. Perhaps there should be a limit on the number of entries per 
> log segment?
> Here is the stack trace for this error:
> 2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] 
> kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
> java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
> entry to a full offset map.
>at scala.Predef$.require(Predef.scala:233)
>at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
>at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>at kafka.message.MessageSet.foreach(MessageSet.scala:67)
>at 
> kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
>at scala.collection.immutable.Stream.foreach(Stream.scala:547)
>at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
>at kafka.log.Cleaner.clean(LogCleaner.scala:307)
>at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
>at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
>at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2687) Allow GroupMetadataRequest to return member metadata when received by group coordinator

2015-10-23 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971799#comment-14971799
 ] 

Gwen Shapira commented on KAFKA-2687:
-

Is it any different from the consumer subscribe() taking a regexp?

> Allow GroupMetadataRequest to return member metadata when received by group 
> coordinator
> ---
>
> Key: KAFKA-2687
> URL: https://issues.apache.org/jira/browse/KAFKA-2687
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Since the new consumer currently has no persistence in Zookeeper (pending 
> outcome of KAFKA-2017), there is no way for administrators to investigate 
> group status including getting the list of members in the group and their 
> partition assignments. We therefore propose to modify GroupMetadataRequest 
> (previously known as ConsumerMetadataRequest) to return group metadata when 
> received by the respective group's coordinator. When received by another 
> broker, the request will be handled as before: by only returning coordinator 
> host and port information.
> {code}
> GroupMetadataRequest => GroupId IncludeMetadata
>   GroupId => String
>   IncludeMetadata => Boolean
> GroupMetadataResponse => ErrorCode Coordinator GroupMetadata
>   ErrorCode => int16
>   Coordinator => Id Host Port
> Id => int32
> Host => string
> Port => int32
>   GroupMetadata => State ProtocolType Generation Protocol Leader  Members
> State => String
> ProtocolType => String
> Generation => int32
> Protocol => String
> Leader => String
> Members => [Member MemberMetadata MemberAssignment]
>   Member => MemberIp ClientId
> MemberIp => String
> ClientId => String
>   MemberMetadata => Bytes
>   MemberAssignment => Bytes
> {code}
> The request schema includes a flag to indicate whether metadata is needed, 
> which saves clients from having to read all group metadata when they are just 
> trying to find the coordinator. This is important to reduce group overhead 
> for use cases which involve a large number of topic subscriptions (e.g. 
> mirror maker).
> Tools will use the protocol type to determine how to parse metadata. For 
> example, when the protocolType is "consumer", the tool can use 
> ConsumerProtocol to parse the member metadata as topic subscriptions and 
> partition assignments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : kafka-trunk-jdk7 #716

2015-10-23 Thread Apache Jenkins Server
See 



Build failed in Jenkins: kafka-trunk-jdk8 #54

2015-10-23 Thread Apache Jenkins Server
See 

Changes:

[cshapi] KAFKA-2371: Add distributed support for Copycat.

[cshapi] KAFKA-2626: Handle null keys and value validation properly in

--
[...truncated 5007 lines...]

kafka.coordinator.MemberMetadataTest > testVoteRaisesOnNoSupportedProtocols 
PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[17] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.log.LogManagerTest > testCleanupSegmentsToMaintainSize PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithRelativeDirectory 
PASSED

kafka.log.LogManagerTest > testGetNonExistentLog PASSED

kafka.log.LogManagerTest > testTwoLogManagersUsingSameDirFails PASSED

kafka.log.LogManagerTest > testLeastLoadedAssignment PASSED

kafka.log.LogManagerTest > testCleanupExpiredSegments PASSED

kafka.log.LogManagerTest > testCheckpointRecoveryPoints PASSED

kafka.log.LogManagerTest > testTimeBasedFlush PASSED

kafka.log.LogManagerTest > testCreateLog PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithTrailingSlash PASSED

kafka.log.LogSegmentTest > testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest > testReadFromGap PASSED

kafka.log.LogSegmentTest > testTruncate PASSED

kafka.log.LogSegmentTest > testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.FileMessageSetTest > testTruncate PASSED

kafka.log.FileMessageSetTest > testIterationOverPartialAndTruncation PASSED

kafka.log.FileMessageSetTest > testRead PASSED

kafka.log.FileMessageSetTest > testFileSize PASSED

kafka.log.FileMessageSetTest > testIteratorWithLimits PASSED

kafka.log.FileMessageSetTest > testPreallocateTrue PASSED

kafka.log.FileMessageSetTest > testIteratorIsConsistent PASSED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition PASSED

kafka.log.FileMessageSetTest > testWrittenEqualsRead PASSED

kafka.log.FileMessageSetTest > testWriteTo PASSED

kafka.log.FileMessageSetTest > testPreallocateFalse PASSED

kafka.log.FileMessageSetTest > testPreallocateClearShutdown PASSED

kafka.log.FileMessageSetTest > testSearch PASSED

kafka.log.FileMessageSetTest > testSizeInBytes PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingTopic PASSED

kafka.log.LogTest > testIndexRebuild PASSED

kafka.log.LogTest > testLogRolls PASSED

kafka.log.LogTest > testMessageSizeCheck PASSED

kafka.log.LogTest > testAsyncDelete PASSED

kafka.log.LogTest > testReadOutOfRange PASSED

kafka.log.LogTest > testReadAtLogGap PASSED

kafka.log.LogTest > testTimeBasedLogRoll PASSED

kafka.log.LogTest > testLoadEmptyLog PASSED

kafka.log.LogTest > testMessageSetSizeCheck PASSED

kafka.log.LogTest > testIndexResizingAtTruncation PASSED

kafka.log.LogTest > testCompactedTopicConstraints PASSED


Build failed in Jenkins: kafka-trunk-jdk7 #717

2015-10-23 Thread Apache Jenkins Server
See 

Changes:

[cshapi] KAFKA-2626: Handle null keys and value validation properly in

--
[...truncated 1797 lines...]
kafka.log.CleanerTest > testCleaningWithDeletes PASSED

kafka.log.CleanerTest > testCleanSegments PASSED

kafka.log.CleanerTest > testCleaningWithUnkeyedMessages PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatIllegalGeneration 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaderFailureInSyncGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testGenerationIdIncrementsOnRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupFromIllegalGeneration PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatDuringRebalanceCausesRebalanceInProgress PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupInconsistentGroupProtocol PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooLarge PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooSmall PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testCommitOffsetWithDefaultGeneration PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupFromUnchangedLeaderShouldRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatRebalanceInProgress PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaveGroupUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupFollowerAfterLeader PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testCommitOffsetInAwaitingSync 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testJoinGroupWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupFromUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupInconsistentProtocolType PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testCommitOffsetFromUnknownGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaveGroupWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testLeaveGroupUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupUnknownConsumerNewGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupFromUnchangedFollowerDoesNotRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidJoinGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupLeaderAfterFollower PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupFromUnknownMember 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidLeaveGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupNotCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidHeartbeat PASSED

kafka.coordinator.MemberMetadataTest > testMatchesSupportedProtocols PASSED

kafka.coordinator.MemberMetadataTest > testMetadata PASSED

kafka.coordinator.MemberMetadataTest > testMetadataRaisesOnUnsupportedProtocol 
PASSED

kafka.coordinator.MemberMetadataTest > testVoteForPreferredProtocol PASSED

kafka.coordinator.MemberMetadataTest > testVoteRaisesOnNoSupportedProtocols 
PASSED

kafka.coordinator.GroupMetadataTest > testDeadToAwaitingSyncIllegalTransition 
PASSED

kafka.coordinator.GroupMetadataTest > 
testPreparingRebalanceToStableIllegalTransition PASSED

kafka.coordinator.GroupMetadataTest > testCannotRebalanceWhenDead PASSED

kafka.coordinator.GroupMetadataTest > testStableToDeadIllegalTransition PASSED

kafka.coordinator.GroupMetadataTest > testSelectProtocol PASSED

kafka.coordinator.GroupMetadataTest > testCannotRebalanceWhenPreparingRebalance 
PASSED

kafka.coordinator.GroupMetadataTest > 
testDeadToPreparingRebalanceIllegalTransition PASSED

kafka.coordinator.GroupMetadataTest > testCanRebalanceWhenAwaitingSync PASSED

kafka.coordinator.GroupMetadataTest > 
testAwaitingSyncToPreparingRebalanceTransition PASSED

kafka.coordinator.GroupMetadataTest > testStableToAwaitingSyncIllegalTransition 
PASSED

kafka.coordinator.GroupMetadataTest > testDeadToDeadIllegalTransition PASSED

kafka.coordinator.GroupMetadataTest > testSelectProtocolRaisesIfNoMembers PASSED

kafka.coordinator.GroupMetadataTest > testStableToPreparingRebalanceTransition 
PASSED

kafka.coordinator.GroupMetadataTest > testPreparingRebalanceToDeadTransition 
PASSED


[jira] [Commented] (KAFKA-2675) SASL/Kerberos follow-up

2015-10-23 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971107#comment-14971107
 ] 

Sriharsha Chintalapani commented on KAFKA-2675:
---

1. Decide on `serviceName` configuration: do we want to keep it in two places?
We should keep this in two places. Configuring serviceName in the JAAS file has 
been the way to go in all other projects. We only kept it in two places because 
of the IBM JDK.
3. Implement or remove SASL_KAFKA_SERVER_REALM config
  This is required on the client side. It is a very common scenario for the 
server/broker to be in one realm and the clients to be in another. In that case 
the clients need to configure the server realm. By default we use the client's 
realm to connect to the server.

> SASL/Kerberos follow-up
> ---
>
> Key: KAFKA-2675
> URL: https://issues.apache.org/jira/browse/KAFKA-2675
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.9.0.0
>
>
> This is a follow-up to KAFKA-1686. 
> 1. Decide on `serviceName` configuration: do we want to keep it in two places?
> 2. auth.to.local config name is a bit opaque, is there a better one?
> 3. Implement or remove SASL_KAFKA_SERVER_REALM config
> 4. Consider making Login's thread a daemon thread
> 5. Write test that shows authentication failure due to invalid user
> 6. Write test that shows authentication failure due to wrong password
> 7. Write test that shows authentication failure due ticket expiring



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka_system_tests #116

2015-10-23 Thread ewen
See 

--
Started by timer
Building in workspace 
[kafka_system_tests] $ /bin/bash /tmp/hudson3879296377443603922.sh
Running command: git pull && ./gradlew clean jar

Running command: which virtualenv

Running command: . 

 pip uninstall ducktape -y

Running command: . 

 pip uninstall ducktape -y

Running command: . 

 pip install ducktape==0.3.2

Running command: vagrant destroy -f

Running command: vagrant up --provider=aws --no-provision --no-parallel

Running command: vagrant destroy -f

Build step 'Execute shell' marked build as failure


[jira] [Commented] (KAFKA-2235) LogCleaner offset map overflow

2015-10-23 Thread Ivan Simoneko (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971848#comment-14971848
 ] 

Ivan Simoneko commented on KAFKA-2235:
--

The core problem is that the offset map is limited by number of messages while 
a segment is limited by size in bytes, and this patch doesn't fix that. But the 
way it handles the problem (it just throws an exception and stops compacting) is 
consistent with how other errors are handled in the compactor.

> LogCleaner offset map overflow
> --
>
> Key: KAFKA-2235
> URL: https://issues.apache.org/jira/browse/KAFKA-2235
> Project: Kafka
>  Issue Type: Bug
>  Components: core, log
>Affects Versions: 0.8.1, 0.8.2.0
>Reporter: Ivan Simoneko
>Assignee: Ivan Simoneko
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2235_v1.patch, KAFKA-2235_v2.patch
>
>
> We've seen log cleaning generate an error for a topic with lots of small 
> messages. It seems that a cleanup map overflow is possible if a log segment 
> contains more unique keys than there are empty slots in the offsetMap. Checking 
> baseOffset and map utilization before processing a segment seems to be 
> insufficient because it doesn't take the segment size (the number of unique 
> messages in the segment) into account.
> I suggest estimating the upper bound of keys in a segment as the number of 
> messages in the segment and comparing it with the number of available slots in 
> the map (keeping the desired load factor in mind). This should work in cases 
> where an empty map is capable of holding all the keys for a single segment. If 
> even a single segment is not able to fit into an empty map, the cleanup process 
> will still fail. Perhaps there should be a limit on the number of entries per 
> log segment?
> Here is the stack trace for this error:
> 2015-05-19 16:52:48,758 ERROR [kafka-log-cleaner-thread-0] 
> kafka.log.LogCleaner - [kafka-log-cleaner-thread-0], Error due to
> java.lang.IllegalArgumentException: requirement failed: Attempt to add a new 
> entry to a full offset map.
>at scala.Predef$.require(Predef.scala:233)
>at kafka.log.SkimpyOffsetMap.put(OffsetMap.scala:79)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:543)
>at 
> kafka.log.Cleaner$$anonfun$kafka$log$Cleaner$$buildOffsetMapForSegment$1.apply(LogCleaner.scala:538)
>at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>at kafka.message.MessageSet.foreach(MessageSet.scala:67)
>at 
> kafka.log.Cleaner.kafka$log$Cleaner$$buildOffsetMapForSegment(LogCleaner.scala:538)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:515)
>at 
> kafka.log.Cleaner$$anonfun$buildOffsetMap$3.apply(LogCleaner.scala:512)
>at scala.collection.immutable.Stream.foreach(Stream.scala:547)
>at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:512)
>at kafka.log.Cleaner.clean(LogCleaner.scala:307)
>at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:221)
>at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:199)
>at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2641) Upgrade path for ZK authentication

2015-10-23 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-2641.

   Resolution: Fixed
Fix Version/s: 0.9.0.0

Issue resolved by pull request 313
[https://github.com/apache/kafka/pull/313]

> Upgrade path for ZK authentication
> --
>
> Key: KAFKA-2641
> URL: https://issues.apache.org/jira/browse/KAFKA-2641
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 0.9.0.0
>
>
> If a cluster isn't secure and wants to migrate to secure, then it will need 
> to execute the following steps with respect to ZK:
> # Perform a rolling restart passing the appropriate JAAS file to enable the 
> ZK client to authenticate against the server 
> # Run a ZkSecurityMigrationTool that will set the appropriate ACLs 
> # Perform a second rolling restart to set `zookeeper.enable.secure.acls`
> This recipe assumes that the ZK ensemble has security on. 
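
For step 1 of the recipe above, the JAAS file passed to the brokers would contain 
a ZooKeeper client login section along these lines (a sketch with placeholder 
principal and keytab values):

{code}
Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="/path/to/kafka.keytab"
    principal="kafka/broker1.example.com@EXAMPLE.COM";
};
{code}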



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2641) Upgrade path for ZK authentication

2015-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971981#comment-14971981
 ] 

ASF GitHub Bot commented on KAFKA-2641:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/313


> Upgrade path for ZK authentication
> --
>
> Key: KAFKA-2641
> URL: https://issues.apache.org/jira/browse/KAFKA-2641
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 0.9.0.0
>
>
> If a cluster isn't secure and wants to migrate to a secure setup, then it will 
> need to execute the following steps with respect to ZK:
> # Perform a rolling restart, passing the appropriate JAAS file to enable the 
> ZK client to authenticate against the server 
> # Run a ZkSecurityMigrationTool that will set the appropriate ACLs 
> # Perform a second rolling restart to set `zookeeper.enable.secure.acls`
> This recipe assumes that the ZK ensemble already has security enabled. 





[GitHub] kafka pull request: KAFKA-2626: Handle null keys and value validat...

2015-10-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/345




[jira] [Commented] (KAFKA-2626) Null offsets in copycat causes exception in OffsetStorageWriter

2015-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972153#comment-14972153
 ] 

ASF GitHub Bot commented on KAFKA-2626:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/345


> Null offsets in copycat causes exception in OffsetStorageWriter
> ---
>
> Key: KAFKA-2626
> URL: https://issues.apache.org/jira/browse/KAFKA-2626
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.0
>
>
> {quote}
> [2015-10-07 16:20:39,052] ERROR CRITICAL: Failed to serialize offset data, 
> making it impossible to commit offsets under namespace wikipedia-irc-source. 
> This likely won't recover unless the unserializable partition or offset 
> information is overwritten. 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:152)
> [2015-10-07 16:20:39,053] ERROR Cause of serialization failure: 
> (org.apache.kafka.copycat.storage.OffsetStorageWriter:155)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] ERROR Failed to flush 
> org.apache.kafka.copycat.runtime.WorkerSourceTask$2@12782f6 offsets to 
> storage:  (org.apache.kafka.copycat.runtime.WorkerSourceTask:227)
> java.lang.NullPointerException
> at 
> org.apache.kafka.copycat.storage.OffsetUtils.validateFormat(OffsetUtils.java:34)
> at 
> org.apache.kafka.copycat.storage.OffsetStorageWriter.doFlush(OffsetStorageWriter.java:141)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:223)
> at 
> org.apache.kafka.copycat.runtime.WorkerSourceTask.stop(WorkerSourceTask.java:100)
> at org.apache.kafka.copycat.runtime.Worker.stopTask(Worker.java:188)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.removeConnectorTasks(StandaloneHerder.java:210)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stopConnector(StandaloneHerder.java:155)
> at 
> org.apache.kafka.copycat.runtime.standalone.StandaloneHerder.stop(StandaloneHerder.java:60)
> at org.apache.kafka.copycat.runtime.Copycat.stop(Copycat.java:66)
> at 
> org.apache.kafka.copycat.runtime.Copycat$ShutdownHook.run(Copycat.java:88)
> [2015-10-07 16:20:39,055] INFO Starting graceful shutdown of thread 
> WorkerSourceTask-wikipedia-irc-source-0 
> (org.apache.kafka.copycat.util.ShutdownableThread:119)
> [2015-10-07 16:20:39,056] INFO Herder stopped 
> (org.apache.kafka.copycat.runtime.standalone.StandaloneHerder:64)
> [2015-10-07 16:20:39,056] INFO Worker stopping 
> (org.apache.kafka.copycat.runtime.Worker:104)
> [2015-10-07 16:20:39,056] INFO Stopped FileOffsetBackingStore 
> (org.apache.kafka.copycat.storage.FileOffsetBackingStore:61)
> [2015-10-07 16:20:39,056] INFO Worker stopped 
> (org.apache.kafka.copycat.runtime.Worker:133)
> [2015-10-07 16:20:39,057] INFO Copycat stopped 
> (org.apache.kafka.copycat.runtime.Copycat:69)
> {quote}
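For readers skimming the stack traces above, the failure boils down to `OffsetUtils.validateFormat` being handed a null offsets map. The snippet below is only a sketch of the null-tolerant behaviour the fix aims for, written against hypothetical names; it is not the patch that was merged.

{code}
import java.util.Map;

// Hedged sketch: treat a null offsets map as "nothing to commit yet" instead of
// dereferencing it, which is what triggers the NullPointerException above.
public final class OffsetFormatCheckSketch {
    public static void validateFormat(Map<String, Object> offsetData) {
        if (offsetData == null) {
            return; // a source task that has not produced any offsets yet is legal
        }
        for (Map.Entry<String, Object> entry : offsetData.entrySet()) {
            Object value = entry.getValue();
            if (value != null && !(value instanceof Number) && !(value instanceof String)
                    && !(value instanceof Boolean)) {
                throw new IllegalArgumentException(
                        "Unsupported offset value type: " + value.getClass().getName());
            }
        }
    }

    public static void main(String[] args) {
        validateFormat(null); // no longer throws
        System.out.println("null offsets handled gracefully");
    }
}
{code}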





[jira] [Updated] (KAFKA-2688) Avoid forcing reload of `Configuration`

2015-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2688:
---
Summary: Avoid forcing reload of `Configuration`  (was: Improve)

> Avoid forcing reload of `Configuration`
> ---
>
> Key: KAFKA-2688
> URL: https://issues.apache.org/jira/browse/KAFKA-2688
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Reporter: Ismael Juma
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We currently call `Configuration.setConfiguration(null)` from a couple of 
> places in our codebase (`Login` and `JaasUtils`) to force `Configuration` to 
> be reloaded. If this code is removed, some tests can fail depending on the 
> test execution order.
> Ideally we would not need to call `setConfiguration(null)` outside of tests. 
> Investigate if this is possible. If not, we should at least ensure that 
> reloads are done in a safe way within our codebase (perhaps using a lock).





[jira] [Commented] (KAFKA-2687) Allow GroupMetadataRequest to return member metadata when received by group coordinator

2015-10-23 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972063#comment-14972063
 ] 

Jun Rao commented on KAFKA-2687:


[~gwenshap], the regex in the consumer is only used in the java client. The 
regex is not part of the request protocol.

[~hachikuji], yes, it would be good if you can submit a KIP on this.

> Allow GroupMetadataRequest to return member metadata when received by group 
> coordinator
> ---
>
> Key: KAFKA-2687
> URL: https://issues.apache.org/jira/browse/KAFKA-2687
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Since the new consumer currently has no persistence in Zookeeper (pending 
> outcome of KAFKA-2017), there is no way for administrators to investigate 
> group status including getting the list of members in the group and their 
> partition assignments. We therefore propose to modify GroupMetadataRequest 
> (previously known as ConsumerMetadataRequest) to return group metadata when 
> received by the respective group's coordinator. When received by another 
> broker, the request will be handled as before: by only returning coordinator 
> host and port information.
> {code}
> GroupMetadataRequest => GroupId IncludeMetadata
>   GroupId => String
>   IncludeMetadata => Boolean
> GroupMetadataResponse => ErrorCode Coordinator GroupMetadata
>   ErrorCode => int16
>   Coordinator => Id Host Port
> Id => int32
> Host => string
> Port => int32
>   GroupMetadata => State ProtocolType Generation Protocol Leader Members
> State => String
> ProtocolType => String
> Generation => int32
> Protocol => String
> Leader => String
> Members => [Member MemberMetadata MemberAssignment]
>   Member => MemberIp ClientId
> MemberIp => String
> ClientId => String
>   MemberMetadata => Bytes
>   MemberAssignment => Bytes
> {code}
> The request schema includes a flag to indicate whether metadata is needed, 
> which saves clients from having to read all group metadata when they are just 
> trying to find the coordinator. This is important to reduce group overhead 
> for use cases which involve a large number of topic subscriptions (e.g. 
> mirror maker).
> Tools will use the protocol type to determine how to parse metadata. For 
> example, when the protocolType is "consumer", the tool can use 
> ConsumerProtocol to parse the member metadata as topic subscriptions and 
> partition assignments. 
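To make the proposed schema easier to picture from client code, here is a plain-Java mock of the response shape. The class and field names simply mirror the BNF in the {code} block above and do not exist in the Kafka codebase; treat this purely as an illustration of the proposal.

{code}
import java.nio.ByteBuffer;
import java.util.List;

// Illustration only: a POJO mirroring the proposed GroupMetadataResponse fields.
public final class GroupMetadataResponseSketch {
    public short errorCode;

    // Coordinator => Id Host Port
    public int coordinatorId;
    public String coordinatorHost;
    public int coordinatorPort;

    // GroupMetadata => State ProtocolType Generation Protocol Leader Members
    public String state;            // e.g. "Stable"
    public String protocolType;     // "consumer" for the new Java consumer
    public int generation;
    public String protocol;         // assignment strategy name
    public String leader;
    public List<Member> members;

    public static final class Member {
        public String memberIp;
        public String clientId;
        public ByteBuffer metadata;    // parsed with ConsumerProtocol when protocolType is "consumer"
        public ByteBuffer assignment;
    }
}
{code}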





Build failed in Jenkins: kafka-trunk-jdk7 #715

2015-10-23 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-2641; Upgrade path for ZK authentication

--
[...truncated 324 lines...]
:kafka-trunk-jdk7:clients:createVersionFile
:kafka-trunk-jdk7:clients:jar UP-TO-DATE
:kafka-trunk-jdk7:clients:javadoc UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:compileJava UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:processResources UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:classes UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:jar UP-TO-DATE
:kafka-trunk-jdk7:core:compileJava UP-TO-DATE
:kafka-trunk-jdk7:core:compileScala UP-TO-DATE
:kafka-trunk-jdk7:core:processResources UP-TO-DATE
:kafka-trunk-jdk7:core:classes UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:javadoc
:kafka-trunk-jdk7:core:javadoc
:kafka-trunk-jdk7:core:javadocJar
:kafka-trunk-jdk7:core:scaladoc
[ant:scaladoc] Element 
' 
does not exist.
[ant:scaladoc] 
:293:
 warning: a pure expression does nothing in statement position; you may be 
omitting necessary parentheses
[ant:scaladoc] ControllerStats.uncleanLeaderElectionRate
[ant:scaladoc] ^
[ant:scaladoc] 
:294:
 warning: a pure expression does nothing in statement position; you may be 
omitting necessary parentheses
[ant:scaladoc] ControllerStats.leaderElectionTimer
[ant:scaladoc] ^
[ant:scaladoc] warning: there were 15 feature warning(s); re-run with -feature 
for details
[ant:scaladoc] 
:72:
 warning: Could not find any member to link for 
"java.util.concurrent.BlockingQueue#offer".
[ant:scaladoc]   /**
[ant:scaladoc]   ^
[ant:scaladoc] 
:32:
 warning: Could not find any member to link for 
"java.util.concurrent.BlockingQueue#offer".
[ant:scaladoc]   /**
[ant:scaladoc]   ^
[ant:scaladoc] 
:137:
 warning: Could not find any member to link for 
"java.util.concurrent.BlockingQueue#poll".
[ant:scaladoc]   /**
[ant:scaladoc]   ^
[ant:scaladoc] 
:120:
 warning: Could not find any member to link for 
"java.util.concurrent.BlockingQueue#poll".
[ant:scaladoc]   /**
[ant:scaladoc]   ^
[ant:scaladoc] 
:97:
 warning: Could not find any member to link for 
"java.util.concurrent.BlockingQueue#put".
[ant:scaladoc]   /**
[ant:scaladoc]   ^
[ant:scaladoc] 
:152:
 warning: Could not find any member to link for 
"java.util.concurrent.BlockingQueue#take".
[ant:scaladoc]   /**
[ant:scaladoc]   ^
[ant:scaladoc] 9 warnings found
:kafka-trunk-jdk7:core:scaladocJar
:kafka-trunk-jdk7:core:docsJar
:docsJar_2_11_7
Building project 'core' with Scala version 2.11.7
:kafka-trunk-jdk7:clients:compileJava UP-TO-DATE
:kafka-trunk-jdk7:clients:processResources UP-TO-DATE
:kafka-trunk-jdk7:clients:classes UP-TO-DATE
:kafka-trunk-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk7:clients:createVersionFile
:kafka-trunk-jdk7:clients:jar UP-TO-DATE
:kafka-trunk-jdk7:clients:javadoc UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:compileJava UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:processResources UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:classes UP-TO-DATE
:kafka-trunk-jdk7:log4j-appender:jar UP-TO-DATE
:kafka-trunk-jdk7:core:compileJava UP-TO-DATE
:kafka-trunk-jdk7:core:compileScala
:78:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^

Build failed in Jenkins: kafka-trunk-jdk8 #53

2015-10-23 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-2641; Upgrade path for ZK authentication

--
[...truncated 4611 lines...]

org.apache.kafka.copycat.file.FileStreamSinkConnectorTest > testSinkTasks PASSED

org.apache.kafka.copycat.file.FileStreamSinkConnectorTest > testTaskClass PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testSourceTasks 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testSourceTasksStdin PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > testTaskClass 
PASSED

org.apache.kafka.copycat.file.FileStreamSourceConnectorTest > 
testMultipleSourcesInvalid PASSED
:copycat:json:checkstyleMain
:copycat:json:compileTestJavawarning: [options] bootstrap class path not set in 
conjunction with -source 1.7
Note: 

 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:copycat:json:processTestResources UP-TO-DATE
:copycat:json:testClasses
:copycat:json:checkstyleTest
:copycat:json:test

org.apache.kafka.copycat.json.JsonConverterTest > longToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToJsonConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
nullSchemaAndMapNonStringKeysToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndMapToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCopycatSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timestampToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testCacheSchemaToCopycatConversion PASSED

org.apache.kafka.copycat.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaPrimitiveToCopycat 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > intToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToJsonStringKeys PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > nullToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > structToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > shortToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > dateToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > timeToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > floatToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > decimalToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > arrayToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > booleanToJson PASSED

org.apache.kafka.copycat.json.JsonConverterTest > mapToCopycatNonStringKeys 
PASSED

org.apache.kafka.copycat.json.JsonConverterTest > bytesToCopycat PASSED

org.apache.kafka.copycat.json.JsonConverterTest > doubleToCopycat PASSED
:copycat:runtime:checkstyleMain
:copycat:runtime:compileTestJavawarning: [options] bootstrap class path not set 
in conjunction with -source 1.7
Note: Some input files use unchecked 

[GitHub] kafka pull request: KAFKA-2371: Add distributed support for Copyca...

2015-10-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/321




[jira] [Commented] (KAFKA-2371) Add distributed coordinator implementation for Copycat

2015-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972113#comment-14972113
 ] 

ASF GitHub Bot commented on KAFKA-2371:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/321


> Add distributed coordinator implementation for Copycat
> --
>
> Key: KAFKA-2371
> URL: https://issues.apache.org/jira/browse/KAFKA-2371
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.0
>
>
> Copycat needs a Coordinator implementation that handles multiple Workers that 
> automatically manage the distribution of connectors and tasks across them. To 
> start, this implementation should handle any connectors that have been 
> registered via either a CLI or REST interface for starting/stopping 
> connectors.





[jira] [Resolved] (KAFKA-2371) Add distributed coordinator implementation for Copycat

2015-10-23 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-2371.
-
Resolution: Fixed

Issue resolved by pull request 321
[https://github.com/apache/kafka/pull/321]

> Add distributed coordinator implementation for Copycat
> --
>
> Key: KAFKA-2371
> URL: https://issues.apache.org/jira/browse/KAFKA-2371
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.0
>
>
> Copycat needs a Coordinator implementation that handles multiple Workers that 
> automatically manage the distribution of connectors and tasks across them. To 
> start, this implementation should handle any connectors that have been 
> registered via either a CLI or REST interface for starting/stopping 
> connectors.





(User) Document/guide on new Kafka features on security in upcoming 0.9.0?

2015-10-23 Thread Jaikiran Pai
Given that 0.9.0 is planned to be released sometime soon in November, 
our team has started looking into what's coming in the newer version and 
what we need to do to start using its features. We are mainly interested 
in the security features for the upcoming releases. We have been watching 
some of the KIP discussion threads and the KIP articles around 
security[1][2]. However, those are largely design discussions/rationale. 
Is there some document which explains to end users of Kafka what security 
features have been implemented for this release and what needs to be done 
to start using them?


The reason I ask now, before the release, is so that some of us can 
start testing these features out and raise any issues that we might 
run into.


[1] 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
[2] 
https://cwiki.apache.org/confluence/display/KAFKA/Security#Security-Authorization


-Jaikiran


Re: [DISCUSS] KIP-37 - Add namespaces in Kafka

2015-10-23 Thread Gwen Shapira
What about topic-level metrics? Are we going to report metrics at all
levels now? Or maybe just at the partition level and use the monitoring app
to aggregate them at different levels (i.e. remove topic metrics
completely)?




On Wed, Oct 21, 2015 at 3:47 PM, Ashish Singh  wrote:
> On Wed, Oct 21, 2015 at 2:22 PM, Jay Kreps  wrote:
>
>> Gwen, It's a good question of what the producer semantics are--would we
>> only allow you to produce to a partition or first level directory or would
>> we hash over whatever subtree you supply? Actually not sure which makes
>> more sense...
>>
>> Ashish, here are some thoughts:
>> 1. I think we can do this online. There is a question of what happens to
>> readers and writers, but presumably it would be the same thing as if that topic
>> weren't there. There would be no guarantee this would happen atomically over
>> different brokers or clients, though.
>> 2. ACLs should work like unix perms, right?
>
>
> Are you suggesting we should move allowed operations to the R, W, X model of
> Unix? Currently, we support these operations
> 
> .
>
> I think configs would override
>> hierarchically, so we would have a full set of configs for each partition
>> computed by walking up the tree from the root and taking the first
>> override. I think this is what you're describing, right?
>>
>
> Yes.
>
> 3. Totally agree no reason to have an arbitrary limit.
>> 4. I actually don't think the physical layout on disk should be at all
>> connected to the logical directory hierarchy we present.
>
>
> I think it will be useful to have that connection as that will enable users
> to encrypt different namespaces with different keys. Thus, one more step
> towards a completely multi-tenant system.
>
>
>> That is, whether
>> you use RAID or not shouldn't impact the location of a topic in your
>> directory structure.
>
>
> Even if we make the physical layout on disk representative of the directory
> hierarchy, I think this will not be a concern. Correct me if I am missing
> something.
>
> Not sure if this is what you are saying or not. This
>> does raise the question of how to do the disk layout. The simplest thing
>> would be to keep the flat data directories but make the names of the
>> partitions on disk just be logical inode numbers and then have a separate
>> mapping of these inodes to logical names stored in ZK with a cache. I think
>> this would make things like rename fast and atomic. The downside of this is
>> that the 'ls' command will no longer tell you much about the data on a
>> broker.
>>
>
> Enabling renaming of topics is definitely something that will be nice to
> have; however, with the flat structure we won't be able to encrypt
> different directories/namespaces with different keys. A directory
> hierarchy on disk can still be achieved with logical names, though each
> directory will need a logical name of its own.
>
>
>> -Jay
>>
>> On Wed, Oct 21, 2015 at 12:43 PM, Ashish Singh 
>> wrote:
>>
>> > In last KIP hangout following questions were raised.
>> >
>> >1.
>> >
>> >*Whether or not to support move command? If yes, how do we support
>> it.*
>> >I think a *move* command will be essential once we start supporting
>> >directories. However, the implementation might be a bit convoluted. A few
>> >things required for it will be the ability to mark a topic unavailable
>> >during the move and to update the brokers’ metadata cache to reflect the
>> >move.
>> >2.
>> >
>> >*How will acls/ configs inheritance work?*
>> >Say we have /dc/ns/topic.
>> >dc has dc_acl and dc_config. Similarly for ns and topic.
>> >For being able to perform an action on /dc/ns/topic, the user must
>> have
>> >required perms on dc, ns and topic for that operation. For example,
>> > User1
>> >will need DESCRIBE permissions on dc, ns and topic to be able to
>> > describe
>> >/dc/ns/topic.
>> >For configs, the configs for /dc/ns/topic will be topic_config +
>> >ns_config + dc_config, in that order. So, if a config is specified for the
>> >topic then that will be used, else its parent (ns) will be checked for
>> >that config, and this goes on (see the resolution sketch following this
>> >list).
>> >3.
>> >
>> >*Will supporting n-deep hierarchy be a concern?*
>> >This can be a performance concern; however, it sounds more like a misuse
>> >of the functionality or bad organization of topics. We can have a depth
>> >limit, but I am not sure if it is required.
>> >4.
>> >
>> >*Will we continue to support multi-directory on disk, that was
>> proposed
>> >in KAFKA-188?*
>> >Yes, we should be able to support that. It is within those directories
>> >that namespaces will be created. The heuristics for choosing the least
>> >loaded disc/dir will remain the same.
>> >5.
>> >
>> >*Will it be required to move existing topics from default 
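The config-inheritance rule in item 2 above (topic_config, then ns_config, then dc_config) amounts to a nearest-ancestor-wins lookup. The following is a toy sketch under that assumption; none of these class or method names come from the KIP itself.

{code}
import java.util.Arrays;
import java.util.Map;

// Toy resolver for the proposed /dc/ns/topic hierarchy: the nearest level that
// defines a key wins, otherwise fall back to the broker default.
public final class NamespaceConfigSketch {
    public static String resolve(String key,
                                 Map<String, String> topicConfig,
                                 Map<String, String> nsConfig,
                                 Map<String, String> dcConfig,
                                 String brokerDefault) {
        for (Map<String, String> level : Arrays.asList(topicConfig, nsConfig, dcConfig)) {
            if (level.containsKey(key)) {
                return level.get(key); // nearest ancestor wins
            }
        }
        return brokerDefault;
    }
}
{code}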

[GitHub] kafka pull request: KAFKA-2641: Upgrade path for ZK authentication

2015-10-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/313




[jira] [Commented] (KAFKA-2644) Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL

2015-10-23 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972026#comment-14972026
 ] 

Jun Rao commented on KAFKA-2644:


[~sriharsha], is MiniKDC realistic enough for testing Kerberos? Are there 
things that are only covered in a full KDC, but not in MiniKDC? Thanks,

> Run relevant ducktape tests with SASL_PLAINTEXT and SASL_SSL
> 
>
> Key: KAFKA-2644
> URL: https://issues.apache.org/jira/browse/KAFKA-2644
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.0
>
>
> We need to define which of the existing ducktape tests are relevant. cc 
> [~rsivaram]





[jira] [Created] (KAFKA-2688) Improve

2015-10-23 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-2688:
--

 Summary: Improve
 Key: KAFKA-2688
 URL: https://issues.apache.org/jira/browse/KAFKA-2688
 Project: Kafka
  Issue Type: Bug
  Components: security
Reporter: Ismael Juma
Priority: Critical
 Fix For: 0.9.0.0


We currently call `Configuration.setConfiguration(null)` from a couple of 
places in our codebase (`Login` and `JaasUtils`) to force `Configuration` to be 
reloaded. If this code is removed, some tests can fail depending on the test 
execution order.

Ideally we would not need to call `setConfiguration(null)` outside of tests. 
Investigate if this is possible. If not, we should at least ensure that reloads 
are done in a safe way within our codebase (perhaps using a lock).
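The "reload in a safe way" idea can be pictured as a small wrapper that serialises the reset-and-reread sequence on the standard JAAS `Configuration` API. The sketch below is just one possible shape under the assumptions in this ticket; it is not the change that eventually landed.

{code}
import javax.security.auth.login.Configuration;

// Hedged sketch: force the JAAS Configuration to be re-read, but only ever from
// one thread at a time so concurrent callers never observe a half-reset state.
public final class JaasConfigReloader {
    private static final Object LOCK = new Object();

    public static Configuration reload() {
        synchronized (LOCK) {
            Configuration.setConfiguration(null);     // drop the cached instance
            return Configuration.getConfiguration();  // next access re-parses the login config
        }
    }
}
{code}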





Build failed in Jenkins: kafka-trunk-jdk8 #52

2015-10-23 Thread Apache Jenkins Server
See 

Changes:

[cshapi] KAFKA-2671: Enable starting Kafka server with a Properties object

--
[...truncated 3116 lines...]

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.SaslSslTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslSslTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslSslTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslSslTopicMetadataTest > testTopicMetadataRequest PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.SslTopicMetadataTest > testIsrAfterBrokerShutDownAndJoinsBack 
PASSED

kafka.integration.SslTopicMetadataTest > testAutoCreateTopicWithCollision PASSED

kafka.integration.SslTopicMetadataTest > testAliveBrokerListWithNoTopics PASSED

kafka.integration.SslTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SslTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.SslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SslTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SslTopicMetadataTest > testTopicMetadataRequest PASSED

kafka.integration.SslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testTopicMetadataRequest 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.metrics.KafkaTimerTest > testKafkaTimer PASSED

kafka.utils.UtilsTest > testAbs PASSED

kafka.utils.UtilsTest > testReplaceSuffix PASSED

kafka.utils.UtilsTest > testDoublyLinkedList PASSED

kafka.utils.UtilsTest > testCircularIterator PASSED

kafka.utils.UtilsTest > testReadBytes PASSED

kafka.utils.UtilsTest > testCsvList PASSED

kafka.utils.UtilsTest > testReadInt PASSED

kafka.utils.UtilsTest > testCsvMap PASSED

kafka.utils.UtilsTest > testInLock PASSED

kafka.utils.UtilsTest > testSwallow PASSED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.ByteBoundedBlockingQueueTest > testByteBoundedBlockingQueue PASSED

kafka.utils.CommandLineUtilsTest > testParseEmptyArg PASSED

kafka.utils.CommandLineUtilsTest > testParseSingleArg PASSED

kafka.utils.CommandLineUtilsTest > testParseArgs PASSED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr PASSED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.JsonTest > testJsonEncoding PASSED

kafka.utils.timer.TimerTaskListTest > testAll PASSED

kafka.utils.timer.TimerTest > testAlreadyExpiredTask PASSED

kafka.utils.timer.TimerTest > testTaskExpiration PASSED

kafka.message.MessageCompressionTest >