SASL security issue

2017-03-08 Thread Paweł Tomasik
Hi
I've found a security issue in the Kafka SASL implementation.
It seems that ticket refreshes are not required to keep the
client-broker connection up.

Test scenario:
- The client successfully connects to the broker using the SASL_SSL
security protocol. The Kerberos server is provided by Windows Server 2012
and Active Directory.
- The client principal account is disabled in Active Directory.
- When the ticket expires, the connection is still up and running
(although the client side is not able to refresh it, since the account
is disabled).
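
For reference, the client side was configured roughly like this (a
representative setup, not the exact files from my environment). Client
properties:

  security.protocol=SASL_SSL
  sasl.kerberos.service.name=kafka

And the JAAS login context:

  KafkaClient {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      storeKey=true
      keyTab="/etc/security/keytabs/client.keytab"
      principal="host/domain.com";
  };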

The root cause on the client side is located here:

org.apache.kafka.common.security.kerberos.KerberosLogin.java, lines 239-263
In my test scenario:
- Relogin fails.
- The thread sleeps for a hardcoded 10-second delay.
- The next relogin attempt is made but immediately skipped, because
hasSufficientTimeElapsed returns false (the default value of
minTimeBeforeRelogin is 60 seconds).
- The next attempt is scheduled for the next minute, but the connection
is not closed. The process repeats.
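
To illustrate, the loop behaves roughly like this (paraphrased pseudo-Java;
the names are approximate, not the exact KerberosLogin code):

  while (true) {
      sleepUntil(nextRefreshTime);
      try {
          reLogin();                 // fails: the account is disabled
      } catch (LoginException le) {
          Thread.sleep(10 * 1000L);  // hardcoded 10-second delay
          // The retry is then skipped because hasSufficientTimeElapsed()
          // returns false (minTimeBeforeRelogin defaults to 60 seconds),
          // a new attempt is scheduled a minute later, and the existing
          // connection is never closed.
      }
  }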

Application logs:
2017-03-06 12:06:30,709 INFO
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) Initiating re-login for
host/domain.com
2017-03-06 12:06:40,713 WARN
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) Not attempting to re-login since the
last re-login was attempted less than 60 seconds before.
2017-03-06 12:06:40,714 WARN
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) No TGT found: will try again at Mon
Mar 06 12:07:40 CET 2017
2017-03-06 12:06:40,714 INFO
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) TGT refresh sleeping until: Mon Mar 06
12:07:40 CET 2017

2017-03-06 12:07:40,714 INFO
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) Initiating logout for host/domain.com
2017-03-06 12:07:40,715 INFO
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) Initiating re-login for
host/domain.com
2017-03-06 12:07:50,717 WARN
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) Not attempting to re-login since the
last re-login was attempted less than 60 seconds before.
2017-03-06 12:07:50,717 WARN
[org.apache.kafka.common.security.kerberos.KerberosLogin]
(kafka-kerberos-refresh-thread) No TGT found: will try again at Mon
Mar 06 12:08:50 CET 2017

On the broker side the problem seems to be even more severe, as the
broker does not appear to verify the ticket expiration date at all.
So once a client provides a valid ticket, it is never challenged again
for a refreshed one.
It looks like authentication is performed only once, at connection
establishment, by the default Krb5LoginModule implementation. It is not
challenged later.

I'm new here, so forgive me if it is not a good place for such posts.


Best regards
Pawel


Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-03-08 Thread Jun Rao
Hi, Dong,

Thanks for the reply.

1.3 So the thread gets the lock, checks if it has caught up, and releases
the lock if not? Then, in the case where there is continuous incoming data,
the thread may never get a chance to swap. One way to address this is, when
the thread is getting really close to catching up, to just hold onto the
lock until the thread fully catches up.
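
(For illustration, the kind of synchronization I mean, with invented names
rather than actual Kafka code:)

  // Once nearly caught up, hold the lock, finish copying the small
  // remainder, then swap; appends take the same lock, so the source
  // log cannot grow between the final check and the swap.
  synchronized (partitionLock) {
      while (destLog.logEndOffset() < sourceLog.logEndOffset()) {
          copyNextBatch(sourceLog, destLog);
      }
      swap(sourceLog, destLog);
  }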

2.3 So, you are saying that the partition reassignment tool can first send
a ChangeReplicaDirRequest to relevant brokers to establish the log dir for
replicas not created yet, then trigger the partition movement across
brokers through the controller? That's actually a good idea. Then, we can
just leave LeaderAndIsrRequest as it is. Another thing related to
ChangeReplicaDirRequest: since this request may take a long time to
complete, I am not sure we should wait for the movement to complete before
responding. While waiting for the movement to complete, the idle connection
may be killed or the client may be gone already. An alternative is to
return immediately and add a new request like CheckReplicaDirRequest to
check whether the movement has completed. The tool can take advantage of
that to poll the status.
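
For illustration, the tool-side flow could look like this (a sketch only;
CheckReplicaDirRequest is a proposed, not an existing, request type, and
the helper names are invented):

  // pseudo-Java
  sendChangeReplicaDirRequest(broker, partition, destDir);  // returns at once
  while (!sendCheckReplicaDirRequest(broker, partition).movementCompleted()) {
      Thread.sleep(pollIntervalMs);  // poll until the movement finishes
  }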

Thanks,

Jun



On Wed, Mar 8, 2017 at 6:21 PM, Dong Lin  wrote:

> Hey Jun,
>
> Thanks for the detailed explanation. I will use the separate thread pool to
> move replica between log directories. I will let you know when the KIP has
> been updated to use a separate thread pool.
>
> Here is my response to your other questions:
>
> 1.3 My idea is that the ReplicaMoveThread that moves data should get the
> lock before checking whether the replica in the destination log directory
> has caught up. If the new replica has caught up, then the ReplicaMoveThread
> should swap the replicas while it is still holding the lock. The
> ReplicaFetcherThread or RequestHandlerThread will not be able to append
> data to the replica in the source log directory during this period because
> they cannot get the lock. Does this address the problem?
>
> 2.3 I get your point that we want to keep the controller simpler. If the
> admin tool can send ChangeReplicaDirRequest to move data within a broker,
> then the controller probably doesn't even need to include the log directory
> path in the LeaderAndIsrRequest. How about this: the controller will only
> deal with reassignment across brokers as it does now. If the user specified
> a destination disk for any replica, the admin tool will send
> ChangeReplicaDirRequest and wait for the response from the broker to
> confirm that all replicas have been moved to the destination log directory.
> The broker will put the ChangeReplicaDirRequest in a purgatory and respond
> either when the movement is completed or when the request has timed out.
>
> 4. I agree that we can expose these metrics via JMX. But I am not sure
> whether they can be obtained easily and with good performance using either
> existing tools or a new script in Kafka. I will ask our SREs for their
> opinion.
>
> Thanks,
> Dong
>
>
>
>
>
>
>
>
> On Wed, Mar 8, 2017 at 1:24 PM, Jun Rao  wrote:
>
> > Hi, Dong,
> >
> > Thanks for the updated KIP. A few more comments below.
> >
> > 1.1 and 1.2: I am still not sure there is enough benefit of reusing
> > ReplicaFetchThread
> > to move data across disks.
> > (a) A big part of ReplicaFetchThread is to deal with issuing and tracking
> > fetch requests. So, it doesn't feel that we get much from reusing
> > ReplicaFetchThread
> > only to disable the fetching part.
> > (b) The leader replica has no ReplicaFetchThread to start with. It feels
> > weird to start one just for intra broker data movement.
> > (c) The ReplicaFetchThread is per broker. Intuitively, the number of
> > threads doing intra broker data movement should be related to the number
> of
> > disks in the broker, not the number of brokers in the cluster.
> > (d) If the destination disk fails, we want to stop the intra broker data
> > movement, but want to continue inter broker replication. So, logically,
> it
> > seems it's better to separate out the two.
> > (e) I am also not sure if we should reuse the existing throttling for
> > replication. It's designed to handle traffic across brokers and the
> > delaying is done in the fetch request. So, if we are not doing
> > fetching in ReplicaFetchThread,
> > I am not sure the existing throttling is effective. Also, when specifying
> > the throttling of moving data across disks, it seems the user shouldn't
> > care about whether a replica is a leader or a follower. Reusing the
> > existing throttling config name will be awkward in this regard.
> > (f) It seems it's simpler and more consistent to use a separate thread
> pool
> > for local data movement (for both leader and follower replicas). This
> > process can then be configured (e.g. number of threads, etc) and
> throttled
> > independently.
> >
> > 1.3 Yes, we will need some synchronization there. So, if the movement
> > thread catches up, gets the lock to do the swap, but realizes that new
> data
> > 

[jira] [Created] (KAFKA-4872) Getting java.io.IOException in kafka 0.10.1.0 & 0.10.1.1

2017-03-08 Thread Khushboo Sangal (JIRA)
Khushboo Sangal created KAFKA-4872:
--

 Summary: Getting java.io.IOException in kafka 0.10.1.0 & 0.10.1.1
 Key: KAFKA-4872
 URL: https://issues.apache.org/jira/browse/KAFKA-4872
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect, network
Affects Versions: 0.10.1.1, 0.10.1.0
 Environment: Centos machine, 8GB RAM, 4 Core CPUs
Reporter: Khushboo Sangal


I am running a load test (20k msgs/sec) on a Kafka cluster with 3 nodes with 
the above-mentioned configuration. When I run the same amount of load on 
version 0.9 it works fine. Please look into it.

After 7-8 hours of running I get this exception on the broker:

[2017-03-09 02:57:55,989] WARN [ReplicaFetcherThread-0-1], Error in fetch 
kafka.server.ReplicaFetcherThread$FetchRequest@36201b29 
(kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 1 was disconnected before the response was 
read
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:115)
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:112)
at scala.Option.foreach(Option.scala:257)
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:112)
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:108)
at 
kafka.utils.NetworkClientBlockingOps$.recursivePoll$1(NetworkClientBlockingOps.scala:137)
at 
kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
at 
kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:108)
at 
kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:253)
at 
kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
at 
kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
at 
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4871) Kafka doesn't respect TTL on Zookeeper hostname - crash if zookeeper IP changes

2017-03-08 Thread Stephane Maarek (JIRA)
Stephane Maarek created KAFKA-4871:
--

 Summary: Kafka doesn't respect TTL on Zookeeper hostname - crash 
if zookeeper IP changes
 Key: KAFKA-4871
 URL: https://issues.apache.org/jira/browse/KAFKA-4871
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.2.0
Reporter: Stephane Maarek


I had a Zookeeper cluster whose machines automatically obtain a hostname so 
that the hostnames remain constant over time. I deleted my 3 Zookeeper 
machines and new machines came back online with the same hostnames, and they 
updated their CNAMEs.

Kafka then failed and couldn't reconnect to Zookeeper, as it didn't try to 
resolve the IP of Zookeeper again. See the log below:

[2017-03-09 05:49:57,302] INFO Client will use GSSAPI as SASL mechanism. 
(org.apache.zookeeper.client.ZooKeeperSaslClient)
[2017-03-09 05:49:57,302] INFO Opening socket connection to server 
zookeeper-3.example.com/10.12.79.43:2181. Will attempt to SASL-authenticate 
using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)

[ec2-user]$ dig +short zookeeper-3.example.com
10.12.79.36

As you can see, even though the machine is capable of finding the new 
hostname, Kafka somehow didn't respect the TTL (which was set to 60 seconds) 
and didn't get the new IP. I feel that on a failed Zookeeper connection, 
Kafka should at least try to resolve the new Zookeeper IP. That would allow 
Kafka to keep up with Zookeeper changes over time.
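
For illustration, re-resolving on reconnect is simple with the plain JDK 
(illustrative only, not Kafka/ZooKeeper client code):

{code}
import java.net.InetAddress;
import java.net.Socket;

// Re-resolving instead of reusing a cached InetAddress picks up the new
// IP once the DNS TTL (and the JVM's networkaddress.cache.ttl setting)
// allows it.
InetAddress fresh = InetAddress.getByName("zookeeper-3.example.com");
Socket socket = new Socket(fresh, 2181);
{code}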

What do you think? Is that expected behaviour or a bug?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2655: KAFKA-4864 added correct zookeeper nodes for secur...

2017-03-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2655


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4864) Kafka Secure Migrator tool doesn't secure all the nodes

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902532#comment-15902532
 ] 

ASF GitHub Bot commented on KAFKA-4864:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2655


> Kafka Secure Migrator tool doesn't secure all the nodes
> ---
>
> Key: KAFKA-4864
> URL: https://issues.apache.org/jira/browse/KAFKA-4864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0
>Reporter: Stephane Maarek
>Priority: Critical
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> It seems that the secure nodes as referred by ZkUtils.scala are the following:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L201
> A couple of things:
> - the list is highly outdated; for example, the most important nodes, such 
> as kafka-acls, don't get secured. That's a huge security risk. Would it be 
> better to just secure all the nodes recursively from the given root?
> - the roots of some nodes aren't secured, e.g. /brokers (and many others).
> The result is the following after running the tool:
> zookeeper-security-migration --zookeeper.acl secure --zookeeper.connect 
> zoo1:2181/kafka-test
> [zk: localhost:2181(CONNECTED) 9] getAcl /kafka-test/brokers
> 'world,'anyone
> : cdrwa
> [zk: localhost:2181(CONNECTED) 11] getAcl /kafka-test/brokers/ids
> 'world,'anyone
> : r
> 'sasl,'myzkcli...@example.com
> : cdrwa
> [zk: localhost:2181(CONNECTED) 16] getAcl /kafka-test/kafka-acl
> 'world,'anyone
> : cdrwa
> That seems pretty bad, to be honest... A fast enough ZkClient could delete 
> some root nodes and create the nodes it likes before the ACLs get set.
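>
> (A sketch of the recursive idea using the plain ZooKeeper API; illustrative 
> only, not the migrator tool's actual code:)
> {code}
> import java.util.List;
> import org.apache.zookeeper.KeeperException;
> import org.apache.zookeeper.ZooKeeper;
> import org.apache.zookeeper.data.ACL;
>
> // Apply the secure ACLs to every node from the given root down.
> static void setAclRecursively(ZooKeeper zk, String path, List<ACL> acls)
>         throws KeeperException, InterruptedException {
>     zk.setACL(path, acls, -1);  // -1 = any ACL version
>     for (String child : zk.getChildren(path, false)) {
>         setAclRecursively(zk, "/".equals(path) ? "/" + child : path + "/" + child, acls);
>     }
> }
> {code}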



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-4864) Kafka Secure Migrator tool doesn't secure all the nodes

2017-03-08 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-4864.

   Resolution: Fixed
Fix Version/s: 0.10.2.1
   0.11.0.0

Issue resolved by pull request 2655
[https://github.com/apache/kafka/pull/2655]

> Kafka Secure Migrator tool doesn't secure all the nodes
> ---
>
> Key: KAFKA-4864
> URL: https://issues.apache.org/jira/browse/KAFKA-4864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0
>Reporter: Stephane Maarek
>Priority: Critical
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> It seems that the secure nodes as referred by ZkUtils.scala are the following:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L201
> A couple of things:
> - the list is highly outdated; for example, the most important nodes, such 
> as kafka-acls, don't get secured. That's a huge security risk. Would it be 
> better to just secure all the nodes recursively from the given root?
> - the roots of some nodes aren't secured, e.g. /brokers (and many others).
> The result is the following after running the tool:
> zookeeper-security-migration --zookeeper.acl secure --zookeeper.connect 
> zoo1:2181/kafka-test
> [zk: localhost:2181(CONNECTED) 9] getAcl /kafka-test/brokers
> 'world,'anyone
> : cdrwa
> [zk: localhost:2181(CONNECTED) 11] getAcl /kafka-test/brokers/ids
> 'world,'anyone
> : r
> 'sasl,'myzkcli...@example.com
> : cdrwa
> [zk: localhost:2181(CONNECTED) 16] getAcl /kafka-test/kafka-acl
> 'world,'anyone
> : cdrwa
> That seems pretty bad, to be honest... A fast enough ZkClient could delete 
> some root nodes and create the nodes it likes before the ACLs get set.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4870) A question: when a broker goes down and the server is doing partition leader election, the producer may fail to send messages. How does the producer deal with this situation?

2017-03-08 Thread zhaoziyan (JIRA)
zhaoziyan created KAFKA-4870:


 Summary: A question: when a broker goes down and the server is 
doing partition leader election, the producer may fail to send messages. How 
does the producer deal with this situation?
 Key: KAFKA-4870
 URL: https://issues.apache.org/jira/browse/KAFKA-4870
 Project: Kafka
  Issue Type: Test
  Components: clients
 Environment: java client 
Reporter: zhaoziyan
Priority: Minor


A broker goes down and the Kafka cluster is doing partition leader election; 
a producer sending ordered or normal messages may fail to send. How does the 
client update its metadata and deal with the failed sends?
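
For reference, a common way to handle this on the client side with the Java 
producer (standard API; the config values are just illustrative):

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
// Retriable errors (e.g. NotLeaderForPartition during leader election)
// are retried automatically after the producer refreshes its metadata.
props.put(ProducerConfig.RETRIES_CONFIG, 3);
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

Producer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<>("my-topic", "key", "value"),
    (metadata, exception) -> {
        if (exception != null) {
            // Retries exhausted or a non-retriable error: the application
            // must decide whether to re-send, log, or fail.
        }
    });
{code}

Note that when retrying, message ordering is only preserved if 
max.in.flight.requests.per.connection is set to 1.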



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1338

2017-03-08 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Use ConcurrentMap for ConsumerNetworkClient UnsentRequests

[ismael] KAFKA-4745; Remove unnecessary flush in

--
[...truncated 95.53 KB...]
kafka.api.PlaintextConsumerTest > testFetchInvalidOffset PASSED

kafka.api.PlaintextConsumerTest > testAutoCommitIntercept STARTED

kafka.api.PlaintextConsumerTest > testAutoCommitIntercept PASSED

kafka.api.PlaintextConsumerTest > 
testFetchHonoursMaxPartitionFetchBytesIfLargeRecordNotFirst STARTED

kafka.api.PlaintextConsumerTest > 
testFetchHonoursMaxPartitionFetchBytesIfLargeRecordNotFirst PASSED

kafka.api.PlaintextConsumerTest > testCommitSpecifiedOffsets STARTED

kafka.api.PlaintextConsumerTest > testCommitSpecifiedOffsets PASSED

kafka.api.PlaintextConsumerTest > testCommitMetadata STARTED

kafka.api.PlaintextConsumerTest > testCommitMetadata PASSED

kafka.api.PlaintextConsumerTest > testRoundRobinAssignment STARTED

kafka.api.PlaintextConsumerTest > testRoundRobinAssignment PASSED

kafka.api.PlaintextConsumerTest > testPatternSubscription STARTED

kafka.api.PlaintextConsumerTest > testPatternSubscription PASSED

kafka.api.PlaintextConsumerTest > testCoordinatorFailover STARTED

kafka.api.PlaintextConsumerTest > testCoordinatorFailover PASSED

kafka.api.PlaintextConsumerTest > testSimpleConsumption STARTED

kafka.api.PlaintextConsumerTest > testSimpleConsumption PASSED

kafka.api.RequestResponseSerializationTest > 
testSerializationAndDeserialization STARTED

kafka.api.RequestResponseSerializationTest > 
testSerializationAndDeserialization PASSED

kafka.api.RequestResponseSerializationTest > testFetchResponseVersion STARTED

kafka.api.RequestResponseSerializationTest > testFetchResponseVersion PASSED

kafka.api.RequestResponseSerializationTest > testProduceResponseVersion STARTED

kafka.api.RequestResponseSerializationTest > testProduceResponseVersion PASSED

kafka.api.UserClientIdQuotaTest > testProducerConsumerOverrideUnthrottled 
STARTED

kafka.api.UserClientIdQuotaTest > testProducerConsumerOverrideUnthrottled PASSED

kafka.api.UserClientIdQuotaTest > testThrottledProducerConsumer STARTED

kafka.api.UserClientIdQuotaTest > testThrottledProducerConsumer PASSED

kafka.api.UserClientIdQuotaTest > testQuotaOverrideDelete STARTED

kafka.api.UserClientIdQuotaTest > testQuotaOverrideDelete PASSED

kafka.api.AdminClientTest > testDescribeConsumerGroup STARTED

kafka.api.AdminClientTest > testDescribeConsumerGroup PASSED

kafka.api.AdminClientTest > testListGroups STARTED

kafka.api.AdminClientTest > testListGroups PASSED

kafka.api.AdminClientTest > testListAllBrokerVersionInfo STARTED

kafka.api.AdminClientTest > testListAllBrokerVersionInfo PASSED

kafka.api.AdminClientTest > testDescribeConsumerGroupForNonExistentGroup STARTED

kafka.api.AdminClientTest > testDescribeConsumerGroupForNonExistentGroup PASSED

kafka.api.AdminClientTest > testGetConsumerGroupSummary STARTED

kafka.api.AdminClientTest > testGetConsumerGroupSummary PASSED

kafka.api.PlaintextProducerSendTest > 
testSendCompressedMessageWithLogAppendTime STARTED

kafka.api.PlaintextProducerSendTest > 
testSendCompressedMessageWithLogAppendTime PASSED

kafka.api.PlaintextProducerSendTest > testAutoCreateTopic STARTED

kafka.api.PlaintextProducerSendTest > testAutoCreateTopic PASSED

kafka.api.PlaintextProducerSendTest > testSendWithInvalidCreateTime STARTED

kafka.api.PlaintextProducerSendTest > testSendWithInvalidCreateTime PASSED

kafka.api.PlaintextProducerSendTest > testBatchSizeZero STARTED

kafka.api.PlaintextProducerSendTest > testBatchSizeZero PASSED

kafka.api.PlaintextProducerSendTest > testWrongSerializer STARTED

kafka.api.PlaintextProducerSendTest > testWrongSerializer PASSED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithLogAppendTime STARTED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithLogAppendTime PASSED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithCreateTime STARTED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithCreateTime PASSED

kafka.api.PlaintextProducerSendTest > testClose STARTED

kafka.api.PlaintextProducerSendTest > testClose PASSED

kafka.api.PlaintextProducerSendTest > testFlush STARTED

kafka.api.PlaintextProducerSendTest > testFlush PASSED

kafka.api.PlaintextProducerSendTest > testSendToPartition STARTED

kafka.api.PlaintextProducerSendTest > testSendToPartition PASSED

kafka.api.PlaintextProducerSendTest > testSendOffset STARTED

kafka.api.PlaintextProducerSendTest > testSendOffset PASSED

kafka.api.PlaintextProducerSendTest > testSendCompressedMessageWithCreateTime 
STARTED

kafka.api.PlaintextProducerSendTest > testSendCompressedMessageWithCreateTime 
PASSED

kafka.api.PlaintextProducerSendTest > testCloseWithZeroTimeoutFromCallerThread 
STARTED

kafka.api.PlaintextProducerSendTest > 

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-03-08 Thread Dong Lin
Hey Jun,

Thanks for the detailed explanation. I will use the separate thread pool to
move replica between log directories. I will let you know when the KIP has
been updated to use a separate thread pool.

Here is my response to your other questions:

1.3 My idea is that the ReplicaMoveThread that moves data should get the
lock before checking whether the replica in the destination log directory
has caught up. If the new replica has caught up, then the ReplicaMoveThread
should swap the replicas while it is still holding the lock. The
ReplicaFetcherThread or RequestHandlerThread will not be able to append
data to the replica in the source log directory during this period because
they cannot get the lock. Does this address the problem?

2.3 I get your point that we want to keep the controller simpler. If the
admin tool can send ChangeReplicaDirRequest to move data within a broker,
then the controller probably doesn't even need to include the log directory
path in the LeaderAndIsrRequest. How about this: the controller will only
deal with reassignment across brokers as it does now. If the user specified
a destination disk for any replica, the admin tool will send
ChangeReplicaDirRequest and wait for the response from the broker to
confirm that all replicas have been moved to the destination log directory.
The broker will put the ChangeReplicaDirRequest in a purgatory and respond
either when the movement is completed or when the request has timed out.

4. I agree that we can expose these metrics via JMX. But I am not sure
whether they can be obtained easily and with good performance using either
existing tools or a new script in Kafka. I will ask our SREs for their
opinion.

Thanks,
Dong








On Wed, Mar 8, 2017 at 1:24 PM, Jun Rao  wrote:

> Hi, Dong,
>
> Thanks for the updated KIP. A few more comments below.
>
> 1.1 and 1.2: I am still not sure there is enough benefit of reusing
> ReplicaFetchThread
> to move data across disks.
> (a) A big part of ReplicaFetchThread is to deal with issuing and tracking
> fetch requests. So, it doesn't feel that we get much from reusing
> ReplicaFetchThread
> only to disable the fetching part.
> (b) The leader replica has no ReplicaFetchThread to start with. It feels
> weird to start one just for intra broker data movement.
> (c) The ReplicaFetchThread is per broker. Intuitively, the number of
> threads doing intra broker data movement should be related to the number of
> disks in the broker, not the number of brokers in the cluster.
> (d) If the destination disk fails, we want to stop the intra broker data
> movement, but want to continue inter broker replication. So, logically, it
> seems it's better to separate out the two.
> (e) I am also not sure if we should reuse the existing throttling for
> replication. It's designed to handle traffic across brokers and the
> delaying is done in the fetch request. So, if we are not doing
> fetching in ReplicaFetchThread,
> I am not sure the existing throttling is effective. Also, when specifying
> the throttling of moving data across disks, it seems the user shouldn't
> care about whether a replica is a leader or a follower. Reusing the
> existing throttling config name will be awkward in this regard.
> (f) It seems it's simpler and more consistent to use a separate thread pool
> for local data movement (for both leader and follower replicas). This
> process can then be configured (e.g. number of threads, etc) and throttled
> independently.
>
> 1.3 Yes, we will need some synchronization there. So, if the movement
> thread catches up, gets the lock to do the swap, but realizes that new data
> is added, it has to continue catching up while holding the lock?
>
> 2.3 The benefit of including the desired log directory in
> LeaderAndIsrRequest
> during partition reassignment is that the controller doesn't need to track
> the progress for disk movement. So, you don't need the additional
> BrokerDirStateUpdateRequest. Then the controller never needs to issue
> ChangeReplicaDirRequest.
> Only the admin tool will issue ChangeReplicaDirRequest to move data within
> a broker. I agree that this makes LeaderAndIsrRequest more complicated, but
> that seems simpler than changing the controller to track additional states
> during partition reassignment.
>
> 4. We want to make a decision on how to expose the stats. So far, we are
> exposing stats like the individual log size as JMX. So, one way is to just
> add new jmx to expose the log directory of individual replicas.
>
> Thanks,
>
> Jun
>
>
> On Thu, Mar 2, 2017 at 11:18 PM, Dong Lin  wrote:
>
> > Hey Jun,
> >
> > Thanks for all the comments! Please see my answer below. I have updated
> the
> > KIP to address most of the questions and make the KIP easier to
> understand.
> >
> > Thanks,
> > Dong
> >
> > On Thu, Mar 2, 2017 at 9:35 AM, Jun Rao  wrote:
> >
> > > Hi, Dong,
> > >
> > > Thanks for the KIP. A few comments below.
> > >
> > > 1. For moving data across directories
> > > 1.1 I am not sure 

Re: [jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2017-03-08 Thread Marcos Juarez
Jun,

I see that line elsewhere in the cluster.  I don't see it happening on that
particular broker that ran into the problem.

On Mon, Mar 6, 2017 at 5:02 PM, Jun Rao (JIRA)  wrote:

>
> [ https://issues.apache.org/jira/browse/KAFKA-2729?page=
> com.atlassian.jira.plugin.system.issuetabpanels:comment-
> tabpanel=15898415#comment-15898415 ]
>
> Jun Rao commented on KAFKA-2729:
> 
>
> [~mjuarez], did you see ZK session expiration in the server.log in the
> controller around that time? The log will look like the following.
>
> INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient)
>
> > Cached zkVersion not equal to that in zookeeper, broker not recovering.
> > ---
> >
> > Key: KAFKA-2729
> > URL: https://issues.apache.org/jira/browse/KAFKA-2729
> > Project: Kafka
> >  Issue Type: Bug
> >Affects Versions: 0.8.2.1
> >Reporter: Danil Serdyuchenko
> >
> > After a small network wobble where zookeeper nodes couldn't reach each
> other, we started seeing a large number of undereplicated partitions. The
> zookeeper cluster recovered, however we continued to see a large number of
> undereplicated partitions. Two brokers in the kafka cluster were showing
> this in the logs:
> > {code}
> > [2015-10-27 11:36:00,888] INFO Partition 
> > [__samza_checkpoint_event-creation_1,3]
> on broker 5: Shrinking ISR for partition 
> [__samza_checkpoint_event-creation_1,3]
> from 6,5 to 5 (kafka.cluster.Partition)
> > [2015-10-27 11:36:00,891] INFO Partition 
> > [__samza_checkpoint_event-creation_1,3]
> on broker 5: Cached zkVersion [66] not equal to that in zookeeper, skip
> updating ISR (kafka.cluster.Partition)
> > {code}
> > For all of the topics on the effected brokers. Both brokers only
> recovered after a restart. Our own investigation yielded nothing, I was
> hoping you could shed some light on this issue. Possibly if it's related
> to: https://issues.apache.org/jira/browse/KAFKA-1382 , however we're
> using 0.8.2.1.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.15#6346)
>


[GitHub] kafka pull request #2661: kafka-4866: Kafka console consumer property is ign...

2017-03-08 Thread amethystic
GitHub user amethystic opened a pull request:

https://github.com/apache/kafka/pull/2661

kafka-4866: Kafka console consumer property is ignored

Added `print.value` config in ConsoleConsumer to match what the quickstart 
document says.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amethystic/kafka 
kafka4866_consoleconsumer_ignore_printvalue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2661.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2661


commit 1499c20c05e4152c0e082e0e370f2915c686eb3d
Author: huxi 
Date:   2017-03-09T01:19:49Z

kafka-4866: Kafka console consumer property is ignored

Added `print.value` config in ConsoleConsumer to match what the quickstart 
document says.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (KAFKA-4866) Kafka console consumer property is ignored

2017-03-08 Thread huxi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxi reassigned KAFKA-4866:
---

Assignee: huxi

> Kafka console consumer property is ignored
> --
>
> Key: KAFKA-4866
> URL: https://issues.apache.org/jira/browse/KAFKA-4866
> Project: Kafka
>  Issue Type: Bug
>  Components: core, tools
>Affects Versions: 0.10.2.0
> Environment: Java 8, Mac
>Reporter: Frank Lyaruu
>Assignee: huxi
>Priority: Trivial
>
> I'd like to read a topic using the console consumer, which prints the keys 
> but not the values:
> kafka-console-consumer --bootstrap-server someserver:9092 --from-beginning 
> --property print.key=true --property print.value=false --topic some_topic
> the print.value property seems to be completely ignored (it seems to be 
> missing in the source), but it is mentioned in the quickstart.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-03-08 Thread Colin McCabe
On Mon, Mar 6, 2017, at 11:55, Ismael Juma wrote:
> Thanks Colin.
> 
> I am familiar with the protocol semantics, but we need to document the
> API
> for users who don't know the protocol. I still think it would be valuable
> to have some examples of how the API would be used for common use cases.

Getting the version of all nodes in the cluster:
  Map nodesToVersions =
adminClient.listNodes().nodes().thenApply(
  nodes -> adminClient.apiVersions(nodes)).all().get();

Creating a topic:
  adminClient.createTopic(new NewTopic("myNewTopic", 3, (short)
  3)).all().get();

Validating that a topic can be created (but not creating it):
  adminClient.createTopic(new NewTopic("myNewTopic", 3, (short) 3),
new CreateTopicOptions().setValidateOnly(true)).all().get();

> For example, say someone creates a topic and then produces to it. What
> would be the recommended way to do that?

Once the future returned by createTopics has successfully completed, it
should be possible to produce to the topic.
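
For instance, creating a topic and then producing to it could be chained
like this (a sketch using the proposed API from the examples above and
CompletableFuture-style chaining; `producer` is an already-configured
KafkaProducer):

  adminClient.createTopic(new NewTopic("myNewTopic", 3, (short) 3))
      .all()
      .thenRun(() -> producer.send(
          new ProducerRecord<>("myNewTopic", "key", "value")))
      .get();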

There are a few warts that are definitely worth calling out.  These are
things that need to be fixed at the protocol layer, so they're outside
the scope of this KIP.  But you made a good point that we need to
document this well.  Here's my list (I wonder if anyone has more?):

* If auto.create.topics.enable is true on the brokers,
AdminClient#describeTopic(topicName) may create a topic named topicName.
 There are two workarounds: either use AdminClient#listTopics and ensure
that the topic is present before describing, or disable
auto.create.topics.enable.

* If delete.topic.enable is false on the brokers,
AdminClient#deleteTopic(topicName) will mark topicName for deletion, but
not actually delete it.  deleteTopic will return success in this case.

* It may take several seconds after AdminClient#deleteTopic returns
success for all the brokers to become aware that the topic is gone. 
During this time, AdminClient#listTopics and AdminClient#describeTopic
may continue to return information about the deleted topic.

best,
Colin


> 
> Ismael
> 
> On Mon, Mar 6, 2017 at 7:43 PM, Colin McCabe  wrote:
> 
> > On Mon, Mar 6, 2017, at 05:50, Ismael Juma wrote:
> > > Thanks Colin. It seems like you replied to me accidentally instead of the
> > > list, so leaving your reply below for the benefit of others.
> >
> > Thanks, Ismael.  I actually realized my mistake right after I sent to
> > you, and re-posted it to the mailing list instead of sending directly.
> > Sigh...
> >
> > >
> > > Regarding the disadvantage of having to hunt through the request class,
> > > don't people have to do that anyway with the Options classes?
> >
> > A lot of people will simply choose the default options, until they have
> > a reason to do otherwise (for example, they want a longer or shorter
> > timeout, etc.)
> >
> > >
> > > Aside from that, it would be great if the KIP included more detailed
> > > javadoc for each method including information about potential exceptions.
> >
> > That's a good question.  Because this is an asynchronous API, methods
> > never throw exceptions.  Instead, if you call get() / whenComplete() /
> > isCompletedExceptionally() / etc. on one of the CompletableFuture
> > objects, you will get the exception.  This is to allow Node.js-style
> > completion chaining.  I will add this explanation to the KIP.
> >
> > > I'm particularly interested in what a user can expect if a create topics
> > > succeeds versus what they can expect if a timeout exception is thrown. It
> > > would be good to consider partial failures as well.
> >
> > This is spelled out by KIP-4.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+
> > Command+line+and+centralized+administrative+operations
> >
> > Specifically,
> >
> > > If a timeout error occurs [in CreateTopic], the topic could still be
> > > created successfully at a later time. Its up to the client to query
> > > for the state at that point.
> >
> > Since we're specifically not changing the server as part of this KIP,
> > those semantics will still be in force.  Of course, there are plenty of
> > other exceptions that you can get from CreateTopics that are more
> > meaningful, such as permission-related or network-related ones.  But if
> > you get a timeout, the operation may or may not have succeeded.
> >
> > Could we fix the timeout problem?  Sort of.  We could implement
> > something like a retry cache.  The brokers would have to maintain a
> > cache of operations (and their results) which had succeeded or failed.
> > Then, if an RPC got interrupted after the server had performed it, but
> > before the client had received the response message, the client could
> > simply reconnect on another TCP session and ask the broker for the
> > result of the previous operation.  The broker could look up the result
> > in the cache and re-send it.
> >
> > This fix works, but it is very complex.  The cache requires space in
> > memory (and to 

[jira] [Commented] (KAFKA-4745) KafkaLZ4BlockOutputStream.java incorrectly finishes the last frame

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902249#comment-15902249
 ] 

ASF GitHub Bot commented on KAFKA-4745:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2528


> KafkaLZ4BlockOutputStream.java incorrectly finishes the last frame
> --
>
> Key: KAFKA-4745
> URL: https://issues.apache.org/jira/browse/KAFKA-4745
> Project: Kafka
>  Issue Type: Bug
>  Components: compression
>Affects Versions: 0.10.1.1
>Reporter: Will Droste
> Fix For: 0.10.1.1
>
>
> There is a scenario whereby, if the delegating OutputStream does not call 
> flush before close, there will be missing data in the stream. The reason for 
> this is that the stream is actually marked closed before it is actually 
> flushed.
> The end mark is written before the flush; also, writeEndMark was finishing 
> the stream, so it is redundant in this context to mark it finished. In my 
> fork, the 'finished = true' was removed from the 'writeEndMark' method.
> {code}
> @Override
> public void close() throws IOException {
> if (!finished) {
> writeEndMark();
> flush();
> finished = true;
> }
> if (out != null) {
> out.close();
> out = null;
> }
> }
> {code}
> should be
> {code}
> @Override
> public void close() throws IOException {
> if (!finished) {
> // finish any pending data
> writeBlock();
> // write out the end mark
> writeEndMark();
> // mark the stream as finished
> finished = true;
> }
> if (out != null) {
> out.close();
> out = null;
> }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2528: KAFKA-4745 -Optimize close to remove unnecessary f...

2017-03-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2528


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Features ideas - need feedback

2017-03-08 Thread Maarek, Stephane
Hi,

Two ideas that I would like to get feedback on before putting KIPs together.

1) Ability to have the Kafka client consumer “skip” data that can’t be 
de-serialized. It would be a consumer config such as 
“ignore.deserialization.errors” (got better naming?) that defaults to false so 
it’s backwards compatible, but if set to true, would produce a warning in the 
consumer client log and wouldn’t stop the processing - no errors thrown. The 
message would just be discarded. The use case is, for example, reading an 
avro topic where someone pushes a message that’s not avro; currently consumers 
would break. (A rough sketch of what I have in mind follows below.)

2) Ability to delete messages on a topic. I believe log compaction already has 
a mechanism to do that, so we would leverage that code. The idea would be to 
have an API to delete a message or a range of messages based on topic / 
partition / offset. It would come with a command line tool. This would allow 
deleting messages from a topic so that if some bad data is pushed, it doesn’t 
break downstream consumers.
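
As a strawman for 1), the behaviour I have in mind looks roughly like the
wrapping deserializer below (a sketch only; the proposed config would build
this into the consumer itself):

    import java.util.Map;
    import org.apache.kafka.common.serialization.Deserializer;

    public class SkippingDeserializer<T> implements Deserializer<T> {
        private final Deserializer<T> inner;

        public SkippingDeserializer(Deserializer<T> inner) {
            this.inner = inner;
        }

        @Override
        public void configure(Map<String, ?> configs, boolean isKey) {
            inner.configure(configs, isKey);
        }

        @Override
        public T deserialize(String topic, byte[] data) {
            try {
                return inner.deserialize(topic, data);
            } catch (RuntimeException e) {
                // Log a warning and discard the record instead of
                // breaking the consumer.
                return null;
            }
        }

        @Override
        public void close() {
            inner.close();
        }
    }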


Additionally, I may be able to write 1) by myself, but I believe I won’t have 
the capability to write 2), so I’d look for someone to help out there

Looking forward to feedback.


Best regards,
Stephane

This email, and any attachments, is confidential and may be covered by legal 
professional privilege or other legal rules. If you are not the intended 
recipient you must not disclose or use the information contained in it. If you 
have received this email in error please notify us immediately by return email 
or by calling our main switchboard on +613 9868 2100 and delete the email.


[GitHub] kafka pull request #2660: MINOR: Make ConfigDef safer by not using empty str...

2017-03-08 Thread ewencp
GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/2660

MINOR: Make ConfigDef safer by not using empty string for NO_DEFAULT_VALUE.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka minor-make-configdef-safer

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2660.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2660


commit cb00337c4b7da30f8ea1e3a1ec1e1578f77a13a3
Author: Ewen Cheslack-Postava 
Date:   2017-03-09T00:39:11Z

MINOR: Make ConfigDef safer by not using empty string for NO_DEFAULT_VALUE.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-trunk-jdk7 #2000

2017-03-08 Thread Apache Jenkins Server
See 




Features ideas - need feedback: message deletion and consumer improvements

2017-03-08 Thread Maarek, Stephane
Hi,

Two ideas that I would like to get feedback on before putting KIPs together.

1) Ability to have the Kafka client consumer “skip” data that can’t be 
de-serialized. It would be a consumer config such as 
“ignore.deserialization.errors” (got better naming?) that defaults to false so 
it’s backwards compatible, but if set to true, would produce a warning in the 
consumer client log and wouldn’t stop the processing - no errors thrown. The 
message would just be discarded. The use case is, for example, reading an 
avro topic where someone pushes a message that’s not avro; currently consumers 
would break.

2) Ability to delete messages on a topic. I believe log compaction already has 
a mechanism to do that, so we would leverage that code. The idea would be to 
have an API to delete a message or a range of messages based on topic / 
partition / offset. It would come with a command line tool. This would allow 
deleting messages from a topic so that if some bad data is pushed, it doesn’t 
break downstream consumers.


Additionally, I may be able to write 1) by myself, but I believe I won’t have 
the capability to write 2), so I’d look for someone to help out there

Looking forward to feedback.


Best regards,
Stephane

This email, and any attachments, is confidential and may be covered by legal 
professional privilege or other legal rules. If you are not the intended 
recipient you must not disclose or use the information contained in it. If you 
have received this email in error please notify us immediately by return email 
or by calling our main switchboard on +613 9868 2100 and delete the email.


Re: [DISCUSS] KIP 130: Expose states of active tasks to KafkaStreams public API

2017-03-08 Thread Florian Hussonnois
Hi Guozhang

Thank you for your feedback. I've started to look more deeply into the
code. As you mentioned, it would be cleverer to use the current
StreamsMetadata API to expose this information.

I think exposing metrics through JMX is great for building monitoring
dashboards using tools like jmxtrans and grafana.
But for our use case we would like to expose the states directly from the
application embedding the KStreams topologies.
So we expect to be able to retrieve the states in a programmatic way.

For instance, we could imagine producing those states into a dedicated
topic. That way a third application could automatically discover all
kafka-streams applications which could be monitored.
In a production environment, that could clearly be a solution for getting
a complete overview of a microservices architecture based on Kafka Streams.

The toString() method gives a lot of information, but it can only be used
for debugging purposes, not to build a topology visualization tool. We
could actually expose the same details about the stream topology from the
StreamsMetadata API? So the TaskMetadata class you have suggested could
contain information similar to what the toString() method of AbstractTask
returns?
I can update the KIP in that way.

Finally, I'm not sure I understand your last point: *"Note that the
task-level assignment information is static, i.e. it will not change during
the runtime"*

Does that mean when a rebalance occurs new tasks are created for the new
assignments and old ones just switch to a standby state ?

Thanks,

2017-03-05 7:04 GMT+01:00 Guozhang Wang :

> Hello Florian,
>
> Thanks for the KIP and your detailed explanation of your use case. I think
> there are two dimensions to discuss on how to improve Streams'
> debuggability (or more specifically state exposure for visualization).
>
> First question is "what information should we expose to the user". From
> your KIP I saw generally three categories:
>
> 1. The state of the thread within a process, as you mentioned currently we
> only expose the state of the process but not the finer grained per-thread
> state.
> 2. The state of the task. Currently the closest API to this is
> StreamsMetadata; however, it aggregates the tasks across all threads and
> only presents the aggregated set of the assigned partitions / state stores
> etc. We can
> consider extending this method to have a new StreamsMetadata#tasks() which
> returns a TaskMetadata with the similar fields, and the
> StreamsMetadata.stateStoreNames / etc would still be returning the
> aggregated results but users can still "drill down" if they want.
>
> The second question is "how should we expose them to the user". For
> example, you mentioned about consumedOffsetsByPartition in the activeTasks.
> We could add this as a JMX metric based on fetch positions inside the
> consumer layer (note that Streams is just embedding consumers) or we could
> consider adding it into TaskMetadata. Either case it can be visualized for
> monitoring. The reason we expose StreamsMetadata as well as State was that
> it is expected to be "polled" in a programmatic way for interactive queries
> and also for control flows (e.g. I would like to ONLY start running my
> other topology until the first topology has been up and running) while for
> your case it seems the main purpose is to continuously query them for
> monitoring etc. Personally I'd prefer to expose them as JMX only for such
> purposes only to have a simpler API.
>
> So given your current motivations I'd suggest expose the following
> information as newly added metrics in Streams:
>
> 1. Thread-level state metric.
> 2. Task-level hosted client identifier metric (e.g. host:port).
> 3. Consumer-level per-topic/partition position metric (
> https://kafka.apache.org/documentation/#topic_fetch_monitoring).
>
> Note that the task-level assignment information is static, i.e. it will not
> change during the runtime at all and can be accessed from the `toString()`
> function even before the instance starts running, so I think this
> piece of information does not need to be exposed through JMX anymore.
>
> WDYT?
>
> Guozhang
>
>
> On Thu, Mar 2, 2017 at 3:11 AM, Damian Guy  wrote:
>
> > Hi Florian,
> >
> > Thanks for the KIP.
> >
> > It seems there is some overlap here with what we already have in
> > KafkaStreams.allMetadata(). This currently returns a
> > Collection where each StreamsMetadata instance holds the
> > state stores and partition assignment for every instance of the
> > KafkaStreams application. I'm wondering if that is good enough for what
> you
> > are trying to achieve? If not could it be modified to include the per
> > Thread assignment?
> >
> > Thanks,
> > Damian
> >
> >
> >
> >
> >
> >
> > On Wed, 1 Mar 2017 at 22:49 Florian Hussonnois 
> > wrote:
> >
> > > Hi Matthias,
> > >
> > > First, I will answer to your last question.
> > >
> > > The main 

[GitHub] kafka pull request #2656: MINOR: Use ConcurrentMap for ConsumerNetworkClient...

2017-03-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2656


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk7 #1999

2017-03-08 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-4722: Add application.id to StreamThread name

--
[...truncated 848.58 KB...]
org.apache.kafka.streams.processor.internals.StateRestorerTest > 
shouldBeCompletedIfOffsetAndOffsetLimitAreZero PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHaveCompactionPropSetIfSupplied STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHaveCompactionPropSetIfSupplied PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldThrowIfNameIsNull STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldThrowIfNameIsNull PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldConfigureRetentionMsWithAdditionalRetentionWhenCompactAndDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldConfigureRetentionMsWithAdditionalRetentionWhenCompactAndDelete PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldBeCompactedIfCleanupPolicyCompactOrCompactAndDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldBeCompactedIfCleanupPolicyCompactOrCompactAndDelete PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotBeCompactedWhenCleanupPolicyIsDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotBeCompactedWhenCleanupPolicyIsDelete PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHavePropertiesSuppliedByUser STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHavePropertiesSuppliedByUser PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldUseCleanupPolicyFromConfigIfSupplied STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldUseCleanupPolicyFromConfigIfSupplied PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenCompact STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenCompact PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenDelete PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowOffsetResetSourceWithDuplicateSourceName STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowOffsetResetSourceWithDuplicateSourceName PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullProcessorSupplier STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullProcessorSupplier PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotSetApplicationIdToNull STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotSetApplicationIdToNull PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > testSourceTopics 
STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > testSourceTopics PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullNameWhenAddingSink STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullNameWhenAddingSink PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testNamedTopicMatchesAlreadyProvidedPattern STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testNamedTopicMatchesAlreadyProvidedPattern PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactAndDeleteSetForWindowStores STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactAndDeleteSetForWindowStores PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactForNonWindowStores STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactForNonWindowStores PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSameName STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSameName PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSelfParent STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSelfParent PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddProcessorWithSelfParent STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddProcessorWithSelfParent PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 

[jira] [Updated] (KAFKA-4840) There are still cases where producer buffer pool will not remove waiters.

2017-03-08 Thread Sean McCauliff (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean McCauliff updated KAFKA-4840:
--
Status: Patch Available  (was: Open)

> There are still cases where producer buffer pool will not remove waiters.
> -
>
> Key: KAFKA-4840
> URL: https://issues.apache.org/jira/browse/KAFKA-4840
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>Reporter: Sean McCauliff
>
> In BufferPool.allocate(int size, long maxTimeToBlockMs):
> If a Throwable other than InterruptedException is thrown out of await() for 
> some reason, or if there is an exception thrown in the corresponding finally 
> block around the await() (for example, if waitTime.record(...) throws an 
> exception), then the waiters are not removed from the waiters deque.
> The number of available bytes is also not restored when an exception happens.
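>
> A defensive pattern along these lines would avoid it (a sketch only, not
> the actual BufferPool code):
> {code}
> Condition moreMemory = this.lock.newCondition();
> this.waiters.addLast(moreMemory);
> try {
>     long start = time.nanoseconds();
>     try {
>         moreMemory.await(remainingTimeToBlockNs, TimeUnit.NANOSECONDS);
>     } finally {
>         // Even if the sensor throws here, control still reaches the
>         // outer finally below.
>         this.waitTime.record(time.nanoseconds() - start, time.milliseconds());
>     }
> } finally {
>     // Runs on success and on any Throwable, so no stale waiter is left.
>     this.waiters.remove(moreMemory);
>     // Any partially accumulated bytes should be returned to the pool here.
> }
> {code}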



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4840) There are still cases where producer buffer pool will not remove waiters.

2017-03-08 Thread Sean McCauliff (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean McCauliff updated KAFKA-4840:
--
Description: 
In BufferPool.allocate(int size, long maxTimeToBlockMs):
If a Throwable other than InterruptedException is thrown out of await() for 
some reason, or if there is an exception thrown in the corresponding finally 
block around the await() (for example, if waitTime.record(...) throws an 
exception), then the waiters are not removed from the waiters deque.

The number of available bytes is also not restored when an exception happens.

  was:
In BufferPool.allocate(int size, long maxTimeToBlockMs):
If a Throwable other than InterruptedException is thrown out of await() for 
some reason, or if there is an exception thrown in the corresponding finally 
block around the await() (for example, if waitTime.record(...) throws an 
exception), then the waiters are not removed from the waiters deque.


> There are still cases where producer buffer pool will not remove waiters.
> -
>
> Key: KAFKA-4840
> URL: https://issues.apache.org/jira/browse/KAFKA-4840
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>Reporter: Sean McCauliff
>
> In BufferPool.allocate(int size, long maxTimeToBlockMs):
> If a Throwable other than InterruptedException is thrown out of await() for 
> some reason, or if an exception is thrown in the corresponding finally 
> block around the await() (for example, if waitTime.record(...) throws an 
> exception), then the waiters are not removed from the waiters deque.
> The number of available bytes is also not restored when an exception happens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4840) There are are still cases where producer buffer pool will not remove waiters.

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902125#comment-15902125
 ] 

ASF GitHub Bot commented on KAFKA-4840:
---

GitHub user smccauliff opened a pull request:

https://github.com/apache/kafka/pull/2659

KAFKA-4840 : BufferPool errors can cause buffer pool to go into a bad state



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/smccauliff/kafka kafka-4840

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2659.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2659


commit fe201ae279861c534f2f2e3513898d7ce292d5fe
Author: Sean McCauliff 
Date:   2017-03-07T02:58:40Z

Remove more memory condition variable on throwable or sensor exception.

commit 36b9265216b1a916ddbb4a7bff8f26a910b0fe0b
Author: Sean McCauliff 
Date:   2017-03-07T06:37:55Z

Test sensor report exception.

commit 7e4997b308e29ce5a20df1c50e30eec3a275a74e
Author: Sean McCauliff 
Date:   2017-03-07T15:39:26Z

Fix checkstyle violations.
Readability.

commit f353afcf272e849b5f60ce0af653916240bebd7d
Author: Sean McCauliff 
Date:   2017-03-08T22:30:52Z

Restore available memory count on exception.




> There are are still cases where producer buffer pool will not remove waiters.
> -
>
> Key: KAFKA-4840
> URL: https://issues.apache.org/jira/browse/KAFKA-4840
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>Reporter: Sean McCauliff
>
> In BufferPool.allocate(int size, long maxTimeToBlockMs):
> If a Throwable other than InterruptedException is thrown out of await() for 
> some reason, or if an exception is thrown in the corresponding finally 
> block around the await() (for example, if waitTime.record(...) throws an 
> exception), then the waiters are not removed from the waiters deque.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2659: KAFKA-4840 : BufferPool errors can cause buffer po...

2017-03-08 Thread smccauliff
GitHub user smccauliff opened a pull request:

https://github.com/apache/kafka/pull/2659

KAFKA-4840 : BufferPool errors can cause buffer pool to go into a bad state



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/smccauliff/kafka kafka-4840

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2659.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2659


commit fe201ae279861c534f2f2e3513898d7ce292d5fe
Author: Sean McCauliff 
Date:   2017-03-07T02:58:40Z

Remove more memory condition variable on throwable or sensor exception.

commit 36b9265216b1a916ddbb4a7bff8f26a910b0fe0b
Author: Sean McCauliff 
Date:   2017-03-07T06:37:55Z

Test sensor report exception.

commit 7e4997b308e29ce5a20df1c50e30eec3a275a74e
Author: Sean McCauliff 
Date:   2017-03-07T15:39:26Z

Fix checkstyle violations.
Readability.

commit f353afcf272e849b5f60ce0af653916240bebd7d
Author: Sean McCauliff 
Date:   2017-03-08T22:30:52Z

Restore available memory count on exception.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-4222) Transient failure in QueryableStateIntegrationTest.queryOnRebalance

2017-03-08 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-4222.

Resolution: Resolved

> Transient failure in QueryableStateIntegrationTest.queryOnRebalance
> ---
>
> Key: KAFKA-4222
> URL: https://issues.apache.org/jira/browse/KAFKA-4222
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams, unit tests
>Reporter: Jason Gustafson
>Assignee: Matthias J. Sax
> Fix For: 0.10.1.0
>
>
> Seen here: https://builds.apache.org/job/kafka-trunk-jdk8/915/console
> {code}
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED
> java.lang.AssertionError: Condition not met within timeout 3. waiting 
> for metadata, store and value to be non null
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.verifyAllKVKeys(QueryableStateIntegrationTest.java:263)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:342)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-08 Thread Matthias J. Sax
Hi,

sorry for not replying earlier, and thanks for all your feedback. After
some more discussion I updated the KIP. The new proposal takes some
further design considerations into account, which I want to highlight
briefly. Those considerations automatically resolve the concerns raised.

First some answers:

> The PAPI processors I use in my KStreams app are all functioning on KTable
> internals.  I wouldn't be able to convert them to process()/transform().
> 
> What's the harm in permitting both APIs to be used in the same application?

It's not about "harm" but about design. We want to switch from an
"inheritance" to a "composition" pattern.

About the interface idea: using a shared interface would not help to get
a composition pattern.


Next I want to give the design considerations leading to the updated KIP:

1) Using KStreamBuilder in the constructor of KafkaStreams is unnatural.
The KafkaStreams client executes a `Topology`, and this execution should be
independent of the way the topology is "put together", i.e., low-level API
or DSL.

2) Thus, we don't want to have any changes to KafkaStreams class.

3) Thus, KStreamBuilder needs to have a method `build()` that returns a
`Topology` that can be passed into KafkaStreams.

4) Because `KStreamBuilder` should build a `Topology`, I suggest renaming
the new class to `StreamsTopologyBuilder` (the name TopologyBuilder would
actually be more natural, but would be easily confused with the old
low-level API TopologyBuilder).

Thus, PAPI and DSL can be mixed-and-matched with full power, as
StreamsTopologyBuilder returns the created Topology via #build().
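
For illustration, the proposed flow could look like the following sketch. The class and method names are taken from this thread (StreamsTopologyBuilder, Topology, build()); they are proposed, not yet in any release, and final signatures may differ:

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;

    // Hypothetical sketch of the proposed API; StreamsTopologyBuilder and
    // Topology are the names discussed in this thread, not shipped classes.
    public class ProposedApiSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("application.id", "kip-120-sketch");
            props.put("bootstrap.servers", "localhost:9092");

            StreamsTopologyBuilder builder = new StreamsTopologyBuilder();
            builder.stream("input-topic").to("output-topic"); // DSL as before

            Topology topology = builder.build();        // the builder yields a Topology...
            KafkaStreams streams = new KafkaStreams(topology, props); // ...that KafkaStreams executes
            streams.start();
        }
    }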

I also removed `final` for both builder classes.



With regard to the larger scope of the overall API redesign, I also want
to point to a summary of API issues:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Discussions

Thus, this KIP is only one building block of a larger improvement
effort, and we hope to get as much as possible done for 0.11. If you
have any API improvement ideas, please share them so we can come up with
a holistic, sound design (instead of uncoordinated local improvements
that might diverge).



Looking forward to your feedback on this KIP and the other API issues.



-Matthias




On 2/15/17 7:36 PM, Mathieu Fenniak wrote:
> On Wed, Feb 15, 2017 at 5:04 PM, Matthias J. Sax 
> wrote:
> 
>> - We also removed method #topologyBuilder() from KStreamBuilder because
>> we think #transform() should provide all functionality you need to
>> mix-and-match Processor API and DSL. If there is any further concern
>> about this, please let us know.
>>
> 
> Hi Matthias,
> 
> Yes, I'm sorry I didn't respond sooner, but I still have a lot of concerns
> about this.  You're correct to point out that transform() can be used for
> some of the output situations I pointed out; albeit it seems somewhat
> awkward to do so in a "transform" method; what do you do with the retval?
> 
> The PAPI processors I use in my KStreams app are all functioning on KTable
> internals.  I wouldn't be able to convert them to process()/transform().
> 
> What's the harm in permitting both APIs to be used in the same application?
> 
> Mathieu
> 



signature.asc
Description: OpenPGP digital signature


[GitHub] kafka pull request #2658: MINOR: Introduce NetworkClient.hasInFlightRequests

2017-03-08 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2658

MINOR: Introduce NetworkClient.hasInFlightRequests

It’s a minor optimisation, but simple enough.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka has-in-flight-requests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2658.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2658






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4861) log.message.timestamp.type=LogAppendTime breaks Kafka based consumers

2017-03-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4861:
---
Status: Patch Available  (was: Open)

> log.message.timestamp.type=LogAppendTime breaks Kafka based consumers
> -
>
> Key: KAFKA-4861
> URL: https://issues.apache.org/jira/browse/KAFKA-4861
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> Using 0.10.2 brokers with the property 
> `log.message.timestamp.type=LogAppendTime` breaks all Kafka-based consumers 
> for the cluster. The consumer will return:
> {code}
> [2017-03-07 15:25:10,215] ERROR Unknown error when running consumer:  
> (kafka.tools.ConsoleConsumer$)
> org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The 
> timestamp of the message is out of acceptable range.
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:535)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:508)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:764)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:745)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:186)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:149)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:116)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:493)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:322)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:253)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:172)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:334)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>   at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:55)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}
> On the broker side you see:
> {code}
> [2017-03-07 15:25:20,216] INFO [GroupCoordinator 0]: Group 
> console-consumer-73205 with generation 2 is now empty 
> (kafka.coordinator.GroupCoordinator)
> [2017-03-07 15:25:20,217] ERROR [Group Metadata Manager on Broker 0]: 
> Appending metadata message for group console-consumer-73205 generation 2 
> failed due to unexpected error: 
> org.apache.kafka.common.errors.InvalidTimestampException 
> (kafka.coordinator.GroupMetadataManager)
> [2017-03-07 15:25:20,218] WARN [GroupCoordinator 0]: Failed to write empty 
> metadata for group console-consumer-73205: The timestamp of the message is 
> out of acceptable range. (kafka.coordinator.GroupCoordinator)
> {code}
> Marking as a blocker since this appears to be a regression in that it doesn't 
> happen on 0.10.1.1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4861) log.message.timestamp.type=LogAppendTime breaks Kafka based consumers

2017-03-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4861:
---
Fix Version/s: 0.11.0.0

> log.message.timestamp.type=LogAppendTime breaks Kafka based consumers
> -
>
> Key: KAFKA-4861
> URL: https://issues.apache.org/jira/browse/KAFKA-4861
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> Using 0.10.2 brokers with the property 
> `log.message.timestamp.type=LogAppendTime` breaks all Kafka-based consumers 
> for the cluster. The consumer will return:
> {code}
> [2017-03-07 15:25:10,215] ERROR Unknown error when running consumer:  
> (kafka.tools.ConsoleConsumer$)
> org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The 
> timestamp of the message is out of acceptable range.
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:535)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:508)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:764)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:745)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:186)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:149)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:116)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:493)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:322)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:253)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:172)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:334)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>   at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:55)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}
> On the broker side you see:
> {code}
> [2017-03-07 15:25:20,216] INFO [GroupCoordinator 0]: Group 
> console-consumer-73205 with generation 2 is now empty 
> (kafka.coordinator.GroupCoordinator)
> [2017-03-07 15:25:20,217] ERROR [Group Metadata Manager on Broker 0]: 
> Appending metadata message for group console-consumer-73205 generation 2 
> failed due to unexpected error: 
> org.apache.kafka.common.errors.InvalidTimestampException 
> (kafka.coordinator.GroupMetadataManager)
> [2017-03-07 15:25:20,218] WARN [GroupCoordinator 0]: Failed to write empty 
> metadata for group console-consumer-73205: The timestamp of the message is 
> out of acceptable range. (kafka.coordinator.GroupCoordinator)
> {code}
> Marking as a blocker since this appears to be a regression in that it doesn't 
> happen on 0.10.1.1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-4861) log.message.timestamp.type=LogAppendTime breaks Kafka based consumers

2017-03-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-4861:
--

Assignee: Ismael Juma  (was: Jason Gustafson)

> log.message.timestamp.type=LogAppendTime breaks Kafka based consumers
> -
>
> Key: KAFKA-4861
> URL: https://issues.apache.org/jira/browse/KAFKA-4861
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> Using 0.10.2 brokers with the property 
> `log.message.timestamp.type=LogAppendTime` breaks all Kafka-based consumers 
> for the cluster. The consumer will return:
> {code}
> [2017-03-07 15:25:10,215] ERROR Unknown error when running consumer:  
> (kafka.tools.ConsoleConsumer$)
> org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The 
> timestamp of the message is out of acceptable range.
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:535)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:508)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:764)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:745)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:186)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:149)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:116)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:493)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:322)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:253)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:172)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:334)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>   at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:55)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}
> On the broker side you see:
> {code}
> [2017-03-07 15:25:20,216] INFO [GroupCoordinator 0]: Group 
> console-consumer-73205 with generation 2 is now empty 
> (kafka.coordinator.GroupCoordinator)
> [2017-03-07 15:25:20,217] ERROR [Group Metadata Manager on Broker 0]: 
> Appending metadata message for group console-consumer-73205 generation 2 
> failed due to unexpected error: 
> org.apache.kafka.common.errors.InvalidTimestampException 
> (kafka.coordinator.GroupMetadataManager)
> [2017-03-07 15:25:20,218] WARN [GroupCoordinator 0]: Failed to write empty 
> metadata for group console-consumer-73205: The timestamp of the message is 
> out of acceptable range. (kafka.coordinator.GroupCoordinator)
> {code}
> Marking as a blocker since this appears to be a regression in that it doesn't 
> happen on 0.10.1.1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2647: MINOR: Add varint serde utilities for new message ...

2017-03-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2647


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4861) log.message.timestamp.type=LogAppendTime breaks Kafka based consumers

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902079#comment-15902079
 ] 

ASF GitHub Bot commented on KAFKA-4861:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2657

KAFKA-4861; GroupMetadataManager record is rejected if broker configured 
with LogAppendTime

The record should be created with CreateTime (like in the producer). The 
conversion to
LogAppendTime happens automatically (if necessary).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-4861-log-append-time-breaks-group-data-manager

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2657.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2657


commit 245319d6ba977bb346cc51abe8d948cf42fc2e13
Author: Ismael Juma 
Date:   2017-03-08T22:18:02Z

KAFKA-4861; GroupMetadataManager record is rejected if broker configured 
with LogAppendTime

The record should be created with CreateTime (like in the producer). The 
conversion to
LogAppendTime happens automatically (if necessary).




> log.message.timestamp.type=LogAppendTime breaks Kafka based consumers
> -
>
> Key: KAFKA-4861
> URL: https://issues.apache.org/jira/browse/KAFKA-4861
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.10.2.1
>
>
> Using 0.10.2 brokers with the property 
> `log.message.timestamp.type=LogAppendTime` breaks all Kafka-based consumers 
> for the cluster. The consumer will return:
> {code}
> [2017-03-07 15:25:10,215] ERROR Unknown error when running consumer:  
> (kafka.tools.ConsoleConsumer$)
> org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The 
> timestamp of the message is out of acceptable range.
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:535)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:508)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:764)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:745)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:186)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:149)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:116)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:493)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:322)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:253)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:172)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:334)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>   at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:55)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}
> On the broker side you see:
> {code}
> [2017-03-07 15:25:20,216] INFO [GroupCoordinator 0]: Group 
> console-consumer-73205 with generation 2 is now empty 
> (kafka.coordinator.GroupCoordinator)
> [2017-03-07 15:25:20,217] ERROR [Group Metadata Manager on Broker 0]: 
> Appending metadata message for group console-consumer-73205 generation 2 
> failed due to unexpected error: 
> 

[GitHub] kafka pull request #2657: KAFKA-4861; GroupMetadataManager record is rejecte...

2017-03-08 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2657

KAFKA-4861; GroupMetadataManager record is rejected if broker configured 
with LogAppendTime

The record should be created with CreateTime (like in the producer). The 
conversion to
LogAppendTime happens automatically (if necessary).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-4861-log-append-time-breaks-group-data-manager

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2657.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2657


commit 245319d6ba977bb346cc51abe8d948cf42fc2e13
Author: Ismael Juma 
Date:   2017-03-08T22:18:02Z

KAFKA-4861; GroupMetadataManager record is rejected if broker configured 
with LogAppendTime

The record should be created with CreateTime (like in the producer). The 
conversion to
LogAppendTime happens automatically (if necessary).




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-124: Request rate quotas

2017-03-08 Thread Todd Palino
Rajini -

I understand what you’re saying, but the point I’m making is that I don’t
believe we need to take it into account directly. The CPU utilization of
the network threads is directly proportional to the number of bytes being
sent. The more bytes, the more CPU that is required for SSL (or other
tasks). This is opposed to the request handler threads, where there are a
number of factors that affect CPU utilization. This means that it’s not
necessary to separately quota network thread byte usage and CPU - if we
quota byte usage (which we already do), we have fixed the CPU usage at a
proportional amount.

Jun -

Thanks for the clarification there. I was thinking of the utilization
percentage as being fixed, not what the percentage reflects. I’m not tied
to either way of doing it, provided that we do not lock clients to a single
thread. For example, if I specify that a given client can use 10% of a
single thread, that should also mean they can use 1% on 10 threads.

-Todd



On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao  wrote:

> Hi, Todd,
>
> Thanks for the feedback.
>
> I just want to clarify your second point. If the limit percentage is per
> thread and the thread counts are changed, the absolute processing limit for
> existing users hasn't changed and there is no need to adjust it. On the
> other hand, if the limit percentage is of total thread pool capacity and
> the thread counts are changed, the effective processing limit for a user
> will change. So, to preserve the current processing limit, existing user
> limits have to be adjusted. If there is a hardware change, the effective
> processing limit for a user will change in either approach, and the existing
> limits may need to be adjusted. However, hardware changes are less common
> than thread pool configuration changes.
>
> Thanks,
>
> Jun
>
> On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino  wrote:
>
> > I’ve been following this one on and off, and overall it sounds good to
> me.
> >
> > - The SSL question is a good one. However, that type of overhead should
> be
> > proportional to the bytes rate, so I think that a bytes rate quota would
> > still be a suitable way to address it.
> >
> > - I think it’s better to make the quota percentage of total thread pool
> > capacity, and not percentage of an individual thread. That way you don’t
> > have to adjust it when you adjust thread counts (tuning, hardware
> changes,
> > etc.)
> >
> >
> > -Todd
> >
> >
> >
> > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin  wrote:
> >
> > > I see. Good point about SSL.
> > >
> > > I just asked Todd to take a look.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao  wrote:
> > >
> > > > Hi, Jiangjie,
> > > >
> > > > Yes, I agree that byte rate already protects the network threads
> > > > indirectly. I am not sure if byte rate fully captures the CPU
> overhead
> > in
> > > > network due to SSL. So, at the high level, we can use request time
> > limit
> > > to
> > > > protect CPU and use byte rate to protect storage and network.
> > > >
> > > > Also, do you think you can get Todd to comment on this KIP?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin 
> > > wrote:
> > > >
> > > > > Hi Rajini/Jun,
> > > > >
> > > > > The percentage based reasoning sounds good.
> > > > > One thing I am wondering is that if we assume the network thread
> are
> > > just
> > > > > doing the network IO, can we say bytes rate quota is already sort
> of
> > > > > network threads quota?
> > > > > If we take network threads into the consideration here, would that
> be
> > > > > somewhat overlapping with the bytes rate quota?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <
> > > rajinisiva...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Jun,
> > > > > >
> > > > > > Thank you for the explanation, I hadn't realized you meant
> > percentage
> > > > of
> > > > > > the total thread pool. If everyone is OK with Jun's suggestion, I
> > > will
> > > > > > update the KIP.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao 
> wrote:
> > > > > >
> > > > > > > Hi, Rajini,
> > > > > > >
> > > > > > > Let's take your example. Let's say a user sets the limit to
> 50%.
> > I
> > > am
> > > > > not
> > > > > > > sure if it's better to apply the same percentage separately to
> > > > network
> > > > > > and
> > > > > > > io thread pool. For example, for produce requests, most of the
> > time
> > > > > will
> > > > > > be
> > > > > > > spent in the io threads whereas for fetch requests, most of the
> > > time
> > > > > will
> > > > > > > be in the network threads. So, using the same percentage in
> both
> > > > thread
> > 

[jira] [Commented] (KAFKA-4474) Poor kafka-streams throughput

2017-03-08 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902026#comment-15902026
 ] 

Eno Thereska commented on KAFKA-4474:
-

[~agomez] I haven't forgotten about this and we're redesigning the runLoop. 
Have a look at this PR if you get a chance. Any feedback is welcome: 
https://github.com/apache/kafka/pull/2643.

> Poor kafka-streams throughput
> -
>
> Key: KAFKA-4474
> URL: https://issues.apache.org/jira/browse/KAFKA-4474
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Juan Chorro
>Assignee: Eno Thereska
> Attachments: hctop sreenshot.png, kafka-streams-bug-1.png, 
> kafka-streams-bug-2.png, Performance test results.png
>
>
> Hi! 
> I'm writing because I have a concern about kafka-streams throughput.
> I have a single kafka-streams application instance that consumes from the 
> 'input' topic, prints to the screen and produces to the 'output' topic. All 
> topics have 4 partitions. As can be observed, the topology is very simple.
> I produce 120K messages/second to the 'input' topic, but when I measure the 
> 'output' topic I detect that I'm receiving only ~4K messages/second. I had the 
> following configuration (remaining parameters at their defaults):
> application.id: myApp
> bootstrap.servers: localhost:9092
> zookeeper.connect: localhost:2181
> num.stream.threads: 1
> I tried various changes without success, but when I created a new 
> 'input' topic with 1 partition (keeping the 'output' topic with 4 partitions) 
> I got 120K messages/second in the 'output' topic.
> I have been running performance tests with the following cases (all 
> topics have 4 partitions in all cases):
> Case A - 1 Instance:
> - With num.stream.threads set to 1 I had ~3785 messages/second
> - With num.stream.threads set to 2 I had ~3938 messages/second
> - With num.stream.threads set to 4 I had ~120K messages/second
> Case B - 2 Instances:
> - With num.stream.threads set to 1 I had ~3930 messages/second for each 
> instance (And throughput ~8K messages/second)
> - With num.stream.threads set to 2 I had ~3945 messages/second for each 
> instance (And more or less same throughput that with num.stream.threads set 
> to 1)
> Case C - 4 Instances
> - With num.stream.threads set to 1 I had 3946 messages/seconds for each 
> instance (And throughput ~17K messages/second):
> As can be observed, I get the best results when num.stream.threads is set to 
> #partitions. I then have the following questions:
> - Why, when I have a topic with #partitions > 1 and num.stream.threads set 
> to 1, do I always get ~4K messages/second?
> - In case C, 4 instances with num.stream.threads set to 1 should perform 
> better than 1 instance with num.stream.threads set to 4. Is this supposition 
> correct?
> This is the kafka-streams application that I use: 
> https://gist.github.com/Chorro/5522ec4acd1a005eb8c9663da86f5a18
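
For reference, a minimal config sketch matching the settings described in the report (0.10.x-era StreamsConfig keys assumed; illustrative only):

{code:java}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

// Illustrative only: the report's configuration, with num.stream.threads
// raised to match the input topic's 4 partitions -- the setting that made
// throughput scale in case A.
public class ThroughputTestConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "myApp");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4); // one thread per input partition
        return props;
    }
}
{code}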



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4222) Transient failure in QueryableStateIntegrationTest.queryOnRebalance

2017-03-08 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902017#comment-15902017
 ] 

Eno Thereska commented on KAFKA-4222:
-

Sounds good.

> Transient failure in QueryableStateIntegrationTest.queryOnRebalance
> ---
>
> Key: KAFKA-4222
> URL: https://issues.apache.org/jira/browse/KAFKA-4222
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams, unit tests
>Reporter: Jason Gustafson
>Assignee: Matthias J. Sax
> Fix For: 0.10.1.0
>
>
> Seen here: https://builds.apache.org/job/kafka-trunk-jdk8/915/console
> {code}
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED
> java.lang.AssertionError: Condition not met within timeout 3. waiting 
> for metadata, store and value to be non null
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.verifyAllKVKeys(QueryableStateIntegrationTest.java:263)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:342)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4222) Transient failure in QueryableStateIntegrationTest.queryOnRebalance

2017-03-08 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902014#comment-15902014
 ] 

Matthias J. Sax commented on KAFKA-4222:


This has not occurred again. Can we close it for now?

> Transient failure in QueryableStateIntegrationTest.queryOnRebalance
> ---
>
> Key: KAFKA-4222
> URL: https://issues.apache.org/jira/browse/KAFKA-4222
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams, unit tests
>Reporter: Jason Gustafson
>Assignee: Matthias J. Sax
> Fix For: 0.10.1.0
>
>
> Seen here: https://builds.apache.org/job/kafka-trunk-jdk8/915/console
> {code}
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED
> java.lang.AssertionError: Condition not met within timeout 3. waiting 
> for metadata, store and value to be non null
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.verifyAllKVKeys(QueryableStateIntegrationTest.java:263)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:342)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-03-08 Thread Jun Rao
Hi, Dong,

Thanks for the updated KIP. A few more comments below.

1.1 and 1.2: I am still not sure there is enough benefit of reusing
ReplicaFetchThread
to move data across disks.
(a) A big part of ReplicaFetchThread is to deal with issuing and tracking
fetch requests. So, it doesn't feel that we get much from reusing
ReplicaFetchThread
only to disable the fetching part.
(b) The leader replica has no ReplicaFetchThread to start with. It feels
weird to start one just for intra broker data movement.
(c) The ReplicaFetchThread is per broker. Intuitively, the number of
threads doing intra broker data movement should be related to the number of
disks in the broker, not the number of brokers in the cluster.
(d) If the destination disk fails, we want to stop the intra broker data
movement, but want to continue inter broker replication. So, logically, it
seems it's better to separate out the two.
(e) I am also not sure if we should reuse the existing throttling for
replication. It's designed to handle traffic across brokers and the
delaying is done in the fetch request. So, if we are not doing
fetching in ReplicaFetchThread,
I am not sure the existing throttling is effective. Also, when specifying
the throttling of moving data across disks, it seems the user shouldn't
care about whether a replica is a leader or a follower. Reusing the
existing throttling config name will be awkward in this regard.
(f) It seems it's simpler and more consistent to use a separate thread pool
for local data movement (for both leader and follower replicas). This
process can then be configured (e.g., number of threads) and throttled
independently, along the lines of the sketch below.
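
To make (f) concrete, here is a rough, hypothetical sketch (not Kafka code) of such a pool: sized by the number of disks and throttled on bytes/sec independently of the inter-broker replication throttle:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Hypothetical sketch (not Kafka code) of point (f): a dedicated thread
    // pool for intra-broker replica movement.
    public class IntraBrokerMover {
        private final ExecutorService pool;
        private final long maxBytesPerSec;

        public IntraBrokerMover(int numDisks, long maxBytesPerSec) {
            this.pool = Executors.newFixedThreadPool(numDisks); // scales with disks, not brokers
            this.maxBytesPerSec = maxBytesPerSec;
        }

        public Future<?> move(final Path sourceSegment, final Path destDir) {
            return pool.submit(new Runnable() {
                @Override
                public void run() {
                    byte[] buf = new byte[1 << 20]; // copy in 1 MiB chunks
                    try (InputStream in = Files.newInputStream(sourceSegment);
                         OutputStream out = Files.newOutputStream(
                                 destDir.resolve(sourceSegment.getFileName()))) {
                        int n;
                        while ((n = in.read(buf)) > 0) {
                            out.write(buf, 0, n);
                            // crude throttle: sleep so the average rate stays <= maxBytesPerSec
                            Thread.sleep(Math.max(1L, n * 1000L / maxBytesPerSec));
                        }
                    } catch (IOException | InterruptedException e) {
                        // a failed destination disk stops this move only; inter-broker
                        // replication threads are unaffected (point (d))
                        Thread.currentThread().interrupt();
                        throw new RuntimeException(e);
                    }
                }
            });
        }

        public void shutdown() { pool.shutdown(); }
    }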

1.3 Yes, we will need some synchronization there. So, if the movement
thread catches up, gets the lock to do the swap, but realizes that new data
is added, it has to continue catching up while holding the lock?

2.3 The benefit of including the desired log directory in LeaderAndIsrRequest
during partition reassignment is that the controller doesn't need to track
the progress for disk movement. So, you don't need the additional
BrokerDirStateUpdateRequest. Then the controller never needs to issue
ChangeReplicaDirRequest.
Only the admin tool will issue ChangeReplicaDirRequest to move data within
a broker. I agree that this makes LeaderAndIsrRequest more complicated, but
that seems simpler than changing the controller to track additional states
during partition reassignment.

4. We want to make a decision on how to expose the stats. So far, we are
exposing stats like the individual log size as JMX metrics. So, one way is to
just add new JMX metrics to expose the log directory of individual replicas.

Thanks,

Jun


On Thu, Mar 2, 2017 at 11:18 PM, Dong Lin  wrote:

> Hey Jun,
>
> Thanks for all the comments! Please see my answer below. I have updated the
> KIP to address most of the questions and make the KIP easier to understand.
>
> Thanks,
> Dong
>
> On Thu, Mar 2, 2017 at 9:35 AM, Jun Rao  wrote:
>
> > Hi, Dong,
> >
> > Thanks for the KIP. A few comments below.
> >
> > 1. For moving data across directories
> > 1.1 I am not sure why we want to use ReplicaFetcherThread to move data
> > around in the leader. ReplicaFetchThread fetches data from socket. For
> > moving data locally, it seems that we want to avoid the socket overhead.
> >
>
> The purpose of using ReplicaFetchThread is to re-use an existing thread
> instead of creating more threads and making our thread model more complex. It
> seems like a natural choice for copying data between disks since it is
> similar to copying data between brokers. Another reason is that if the
> replica to be moved is a follower, we don't need a lock to swap replicas when
> the destination replica has caught up, since the same thread that is fetching
> data from the leader will swap the replica.
>
> The ReplicaFetchThread will not incur socket overhead while copying data
> between disks. It will read directly from source disk (as we do when
> processing FetchRequest) and write to destination disk (as we do when
> processing ProduceRequest).
>
>
> > 1.2 I am also not sure about moving data in the ReplicaFetcherThread in
> the
> > follower. For example, I am not sure setting replica.fetch.max.wait to 0
> >  is ideal. It may not always be effective since a fetch request in the
> > ReplicaFetcherThread could be arbitrarily delayed due to replication
> > throttling on the leader. In general, the data movement logic across
> disks
> > seems different from that in ReplicaFetcherThread. So, I am not sure why
> > they need to be coupled.
> >
>
> While it may not be the most efficient way to copy data between local
> disks, it will be at least as efficient as copying data from the leader to the
> destination disk. The expected goal of KIP-113 is to enable data movement
> between disks with no less efficiency than what we do now when moving data
> between brokers. I think we can optimize its performance using a separate
> thread if the performance is not 

[jira] [Updated] (KAFKA-4869) 0.10.2.0 release notes incorrectly include KIP-115

2017-03-08 Thread Yeva Byzek (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yeva Byzek updated KAFKA-4869:
--
Description: 
From http://kafka.apache.org/documentation.html :

bq. The offsets.topic.replication.factor broker config is now enforced upon 
auto topic creation. Internal auto topic creation will fail with a 
GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this 
replication factor requirement.

Even though this feature 
[KIP-115|https://cwiki.apache.org/confluence/display/KAFKA/KIP-115%3A+Enforce+offsets.topic.replication.factor+upon+__consumer_offsets+auto+topic+creation]
 did not make it into 0.10.2.0

  was:
From http://kafka.apache.org/documentation.html :

bq. The offsets.topic.replication.factor broker config is now enforced upon 
auto topic creation. Internal auto topic creation will fail with a 
GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this 
replication factor requirement.

Even though this feature (KIP-115) did not make it into 0.10.2.0


> 0.10.2.0 release notes incorrectly include KIP-115
> --
>
> Key: KAFKA-4869
> URL: https://issues.apache.org/jira/browse/KAFKA-4869
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.2.0
>Reporter: Yeva Byzek
>Priority: Minor
>
> From http://kafka.apache.org/documentation.html :
> bq. The offsets.topic.replication.factor broker config is now enforced upon 
> auto topic creation. Internal auto topic creation will fail with a 
> GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this 
> replication factor requirement.
> Even though this feature 
> [KIP-115|https://cwiki.apache.org/confluence/display/KAFKA/KIP-115%3A+Enforce+offsets.topic.replication.factor+upon+__consumer_offsets+auto+topic+creation]
>  did not make it into 0.10.2.0



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4869) 0.10.2.0 release notes incorrectly include KIP-115

2017-03-08 Thread Yeva Byzek (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yeva Byzek updated KAFKA-4869:
--
Description: 
From http://kafka.apache.org/documentation.html :

bq. The offsets.topic.replication.factor broker config is now enforced upon 
auto topic creation. Internal auto topic creation will fail with a 
GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this 
replication factor requirement.

Even though this feature (KIP-115) did not make it into 0.10.2.0

  was:
From http://kafka.apache.org/documentation.html show

bq. The offsets.topic.replication.factor broker config is now enforced upon 
auto topic creation. Internal auto topic creation will fail with a 
GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this 
replication factor requirement.

Even though this feature (KIP-115) did not make it into 0.10.2.0


> 0.10.2.0 release notes incorrectly include KIP-115
> --
>
> Key: KAFKA-4869
> URL: https://issues.apache.org/jira/browse/KAFKA-4869
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.2.0
>Reporter: Yeva Byzek
>Priority: Minor
>
> From http://kafka.apache.org/documentation.html :
> bq. The offsets.topic.replication.factor broker config is now enforced upon 
> auto topic creation. Internal auto topic creation will fail with a 
> GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this 
> replication factor requirement.
> Even though this feature (KIP-115) did not make it into 0.10.2.0



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4869) 0.10.2.0 release notes incorrectly include KIP-115

2017-03-08 Thread Yeva Byzek (JIRA)
Yeva Byzek created KAFKA-4869:
-

 Summary: 0.10.2.0 release notes incorrectly include KIP-115
 Key: KAFKA-4869
 URL: https://issues.apache.org/jira/browse/KAFKA-4869
 Project: Kafka
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.10.2.0
Reporter: Yeva Byzek
Priority: Minor


From http://kafka.apache.org/documentation.html show

bq. The offsets.topic.replication.factor broker config is now enforced upon 
auto topic creation. Internal auto topic creation will fail with a 
GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this 
replication factor requirement.

Even though this feature (KIP-115) did not make it into 0.10.2.0



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4722) Add application.id to StreamThread name

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901865#comment-15901865
 ] 

ASF GitHub Bot commented on KAFKA-4722:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2617


> Add application.id to StreamThread name
> ---
>
> Key: KAFKA-4722
> URL: https://issues.apache.org/jira/browse/KAFKA-4722
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.1
>Reporter: Steven Schlansker
>Assignee: Sharad
>Priority: Minor
>  Labels: beginner, easyfix, newbie
> Fix For: 0.11.0.0
>
>
> StreamThread currently sets its name thusly:
> {code}
> super("StreamThread-" + STREAM_THREAD_ID_SEQUENCE.getAndIncrement());
> {code}
> If you have multiple {{KafkaStreams}} instances within a single application, 
> it would help to add the application ID to the {{StreamThread}} name to 
> identify which thread belongs to which {{KafkaStreams}} instance.
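
One plausible shape of the change (a hypothetical sketch; the merged patch may differ):

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the requested change (the merged patch may differ):
// prefix the thread name with the application id so that threads from
// different KafkaStreams instances in one JVM can be told apart.
class StreamThreadNamingSketch extends Thread {
    private static final AtomicInteger STREAM_THREAD_ID_SEQUENCE = new AtomicInteger(1);

    StreamThreadNamingSketch(final String applicationId) {
        // before: "StreamThread-1"; after: "my-app-StreamThread-1"
        super(applicationId + "-StreamThread-" + STREAM_THREAD_ID_SEQUENCE.getAndIncrement());
    }
}
{code}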



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2617: KAFKA-4722: Add application.id to StreamThread nam...

2017-03-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2617


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4722) Add application.id to StreamThread name

2017-03-08 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4722:
-
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2617
[https://github.com/apache/kafka/pull/2617]

> Add application.id to StreamThread name
> ---
>
> Key: KAFKA-4722
> URL: https://issues.apache.org/jira/browse/KAFKA-4722
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.1
>Reporter: Steven Schlansker
>Assignee: Sharad
>Priority: Minor
>  Labels: beginner, easyfix, newbie
> Fix For: 0.11.0.0
>
>
> StreamThread currently sets its name thusly:
> {code}
> super("StreamThread-" + STREAM_THREAD_ID_SEQUENCE.getAndIncrement());
> {code}
> If you have multiple {{KafkaStreams}} instances within a single application, 
> it would help to add the application ID to the {{StreamThread}} name to 
> identify which thread belongs to which {{KafkaStreams}} instance.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4868) Optimize RocksDb config for fast recovery/bulk load

2017-03-08 Thread Eno Thereska (JIRA)
Eno Thereska created KAFKA-4868:
---

 Summary: Optimize RocksDb config for fast recovery/bulk load
 Key: KAFKA-4868
 URL: https://issues.apache.org/jira/browse/KAFKA-4868
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Eno Thereska
Priority: Minor
 Fix For: 0.11.0.0


RocksDB can be tuned to bulk-load data fast. Kafka Streams bulk-loads records 
during recovery. It is likely we can use a different config to make recovery 
faster, then revert to another config for normal operations like put/get. See 
https://github.com/facebook/rocksdb/wiki/performance-benchmarks for examples. 

It would be good to measure the performance gain as part of addressing this JIRA.
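
A sketch of the idea, assuming the RocksJava Options.prepareForBulkLoad() helper (illustrative only, not the Streams store code): open the store with bulk-load-tuned options while replaying the changelog, then reopen with the normal options for put/get traffic.

{code:java}
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

// Illustrative only: two-phase open, first tuned for bulk load, then normal.
public class BulkLoadRecoverySketch {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        String dir = "/tmp/state-store"; // hypothetical state directory

        try (Options bulk = new Options().setCreateIfMissing(true)) {
            bulk.prepareForBulkLoad(); // disables auto-compaction, enlarges write buffers, etc.
            try (RocksDB db = RocksDB.open(bulk, dir)) {
                // ... replay changelog records with db.put(key, value) ...
                db.compactRange(); // one full compaction after the bulk load
            }
        }

        try (Options normal = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(normal, dir)) {
            // ... serve regular put/get traffic ...
        }
    }
}
{code}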



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2656: MINOR: Use ConcurrentMap for ConsumerNetworkClient...

2017-03-08 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/2656

MINOR: Use ConcurrentMap for ConsumerNetworkClient UnsentRequests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka minor-cleanup-unsent-requests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2656.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2656


commit 5880c77087f0286db4aa962b189b0a0a46cea26e
Author: Jason Gustafson 
Date:   2017-03-08T18:39:22Z

MINOR: Use ConcurrentMap for ConsumerNetworkClient UnsentRequests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3908) Set SendBufferSize for socket used by Processor

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901702#comment-15901702
 ] 

ASF GitHub Bot commented on KAFKA-3908:
---

Github user lindong28 closed the pull request at:

https://github.com/apache/kafka/pull/1562


> Set SendBufferSize for socket used by Processor
> ---
>
> Key: KAFKA-3908
> URL: https://issues.apache.org/jira/browse/KAFKA-3908
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> SREs should be able to control the receive buffer size of the sockets that 
> are used to receive requests from clients, for the same reason we set the 
> receive buffer size for all other sockets in the server and client. However, 
> we currently only set the send buffer size of this socket.
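
For reference, a minimal sketch of setting the receive buffer on a server socket with the standard Java NIO API (illustrative only; not the Kafka SocketServer code, and the size is a hypothetical config value):

{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

// Illustrative only: apply a configurable receive buffer size to an
// accepting socket; accepted connections inherit this value.
public class ReceiveBufferSketch {
    public static void main(String[] args) throws IOException {
        int receiveBufferBytes = 100 * 1024; // hypothetical config value
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().setReceiveBufferSize(receiveBufferBytes); // set before bind
        server.socket().bind(new InetSocketAddress(9092));
        server.close();
    }
}
{code}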



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #1562: KAFKA-3908; Set ReceiveBufferSize for socket used ...

2017-03-08 Thread lindong28
Github user lindong28 closed the pull request at:

https://github.com/apache/kafka/pull/1562


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (KAFKA-4863) Querying window store may return unwanted keys

2017-03-08 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy reassigned KAFKA-4863:
-

Assignee: Damian Guy  (was: Eno Thereska)

> Querying window store may return unwanted keys
> --
>
> Key: KAFKA-4863
> URL: https://issues.apache.org/jira/browse/KAFKA-4863
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Xavier Léauté
>Assignee: Damian Guy
>Priority: Critical
>
> Using variable-length keys in a window store may cause unwanted results to be 
> returned when querying certain ranges.
> Below is a test case for {{RocksDBWindowStoreTest}} that shows the problem. 
> It fails, returning {{\[0001, 0003, 0002, 0004, 0005\]}} instead of {{\[0001, 
> 0003, 0005\]}}.
> {code:java}
> @Test
> public void testPutAndFetchSanity() throws IOException {
>     final RocksDBWindowStoreSupplier<String, String> supplier =
>         new RocksDBWindowStoreSupplier<>(
>             "window", 60 * 1000L * 2, 3,
>             true, Serdes.String(), Serdes.String(),
>             windowSize, true, Collections.emptyMap(), false
>         );
>     final WindowStore<String, String> store = supplier.get();
>     store.init(context, store);
>     try {
>         store.put("a", "0001", 0);
>         store.put("aa", "0002", 0);
>         store.put("a", "0003", 1);
>         store.put("aa", "0004", 1);
>         store.put("a", "0005", 6);
>         assertEquals(Utils.mkList("0001", "0003", "0005"),
>             toList(store.fetch("a", 0, Long.MAX_VALUE)));
>     } finally {
>         store.close();
>     }
> }
> {code}
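
To see why the "aa" entries can leak into the result, here is a self-contained sketch assuming a simplified composite-key layout of {key bytes}{8-byte timestamp} (the actual store key also carries a sequence number): the byte range scanned for key "a" lexicographically contains entries for key "aa".

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RangeOverlapSketch {
    // Hypothetical composite key: raw key bytes followed by the timestamp.
    static byte[] composite(String key, long timestamp) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(k.length + Long.BYTES)
                .put(k).putLong(timestamp).array();
    }

    // Lexicographic comparison of unsigned bytes, as a byte-ordered store would do.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] from = composite("a", 0L);
        byte[] to = composite("a", Long.MAX_VALUE);
        byte[] other = composite("aa", 0L);
        // "aa" sorts between composite("a", 0) and composite("a", MAX_VALUE):
        System.out.println(compareUnsigned(from, other) < 0
                && compareUnsigned(other, to) < 0); // prints true
    }
}
{code}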





[jira] [Created] (KAFKA-4867) zookeeper-security-migration.sh does not clear ACLs from all nodes

2017-03-08 Thread Stevo Slavic (JIRA)
Stevo Slavic created KAFKA-4867:
---

 Summary: zookeeper-security-migration.sh does not clear ACLs from 
all nodes
 Key: KAFKA-4867
 URL: https://issues.apache.org/jira/browse/KAFKA-4867
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.1.1
Reporter: Stevo Slavic
Priority: Minor


The zookeeper-security-migration.sh help for the --zookeeper.acl switch, with 
'secure'/'unsecure' as possible values, suggests that the command applies the 
change to all Kafka znodes. That doesn't seem to be the case, at least for 
'unsecure', i.e. the use case of clearing ACLs.

With ACLs set on Kafka znodes, I ran

{noformat}
bin/zookeeper-security-migration.sh --zookeeper.acl 'unsecure' 
--zookeeper.connect x.y.z.w:2181
{noformat}

and then checked the ACLs set on a few nodes with zookeeper-shell.sh getAcl. 
The node _/brokers/topics_ had its ACL cleared (only the default ACL allowing 
world to do anything remained). On the other hand, the node _/brokers_ still 
had the secure ACLs set: world can read and the owner can do everything. The 
nodes and respective subtrees of _/cluster_ and _/controller_ also still had 
the secure ACLs set.
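
For comparison, a small sketch (a hypothetical helper using the plain ZooKeeper client, not the migration tool's code) of what clearing ACLs recursively from a root would look like:

{code:java}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class AclClearerSketch {
    // Recursively reset the ACL on every znode under path to the open ACL.
    static void clearRecursively(ZooKeeper zk, String path)
            throws KeeperException, InterruptedException {
        zk.setACL(path, ZooDefs.Ids.OPEN_ACL_UNSAFE, -1);
        for (String child : zk.getChildren(path, false)) {
            String childPath = path.equals("/") ? "/" + child : path + "/" + child;
            clearRecursively(zk, childPath);
        }
    }

    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("x.y.z.w:2181", 30000, event -> { });
        try {
            clearRecursively(zk, "/brokers"); // repeat for /cluster, /controller, ...
        } finally {
            zk.close();
        }
    }
}
{code}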





Re: [DISCUSS] KIP-124: Request rate quotas

2017-03-08 Thread Jun Rao
Hi, Todd,

Thanks for the feedback.

I just want to clarify your second point. If the limit percentage is per
thread and the thread counts are changed, the absolute processing limit for
existing users hasn't changed and there is no need to adjust it. On the
other hand, if the limit percentage is of total thread pool capacity and
the thread counts are changed, the effective processing limit for a user
will change. So, to preserve the current processing limit, existing user
limits have to be adjusted. If there is a hardware change, the effective
processing limit for a user will change under either approach and the
existing limits may need to be adjusted. However, hardware changes are less
common than thread pool configuration changes.
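
Purely to make the arithmetic concrete, a small sketch of the total-pool interpretation discussed below (all numbers and names are made up):

{code:java}
public class QuotaMathSketch {
    public static void main(String[] args) {
        int ioThreads = 10, networkThreads = 5;
        double quotaPercent = 50.0;

        // Total capacity in "percent of one thread" units: 15 threads = 1500%.
        double totalCapacity = (ioThreads + networkThreads) * 100.0;

        // Total-pool interpretation: 50% of 1500% = 750% of processing power.
        double limit = quotaPercent / 100.0 * totalCapacity;

        // Time this user's requests spent on io + network threads, same units
        // (e.g. 4 io threads and 3.8 network threads kept busy on average).
        double measuredUsage = 400.0 + 380.0;

        System.out.printf("limit=%.0f%%, used=%.0f%%, throttle=%b%n",
                limit, measuredUsage, measuredUsage > limit);
    }
}
{code}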

Thanks,

Jun

On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino  wrote:

> I’ve been following this one on and off, and overall it sounds good to me.
>
> - The SSL question is a good one. However, that type of overhead should be
> proportional to the bytes rate, so I think that a bytes rate quota would
> still be a suitable way to address it.
>
> - I think it’s better to make the quota percentage of total thread pool
> capacity, and not percentage of an individual thread. That way you don’t
> have to adjust it when you adjust thread counts (tuning, hardware changes,
> etc.)
>
>
> -Todd
>
>
>
> On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin  wrote:
>
> > I see. Good point about SSL.
> >
> > I just asked Todd to take a look.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao  wrote:
> >
> > > Hi, Jiangjie,
> > >
> > > Yes, I agree that byte rate already protects the network threads
> > > indirectly. I am not sure if byte rate fully captures the CPU overhead
> in
> > > network due to SSL. So, at the high level, we can use request time
> limit
> > to
> > > protect CPU and use byte rate to protect storage and network.
> > >
> > > Also, do you think you can get Todd to comment on this KIP?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin 
> > wrote:
> > >
> > > > Hi Rajini/Jun,
> > > >
> > > > The percentage based reasoning sounds good.
> > > > One thing I am wondering is that if we assume the network thread are
> > just
> > > > doing the network IO, can we say bytes rate quota is already sort of
> > > > network threads quota?
> > > > If we take network threads into the consideration here, would that be
> > > > somewhat overlapping with the bytes rate quota?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <
> > rajinisiva...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > Thank you for the explanation, I hadn't realized you meant
> percentage
> > > of
> > > > > the total thread pool. If everyone is OK with Jun's suggestion, I
> > will
> > > > > update the KIP.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao  wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Let's take your example. Let's say a user sets the limit to 50%.
> I
> > am
> > > > not
> > > > > > sure if it's better to apply the same percentage separately to
> > > network
> > > > > and
> > > > > > io thread pool. For example, for produce requests, most of the
> time
> > > > will
> > > > > be
> > > > > > spent in the io threads whereas for fetch requests, most of the
> > time
> > > > will
> > > > > > be in the network threads. So, using the same percentage in both
> > > thread
> > > > > > pools means one of the pools' resource will be over allocated.
> > > > > >
> > > > > > An alternative way is to simply model network and io thread pool
> > > > > together.
> > > > > > If you get 10 io threads and 5 network threads, you get 1500%
> > request
> > > > > > processing power. A 50% limit means a total of 750% processing
> > power.
> > > > We
> > > > > > just add up the time a user request spent in either network or io
> > > > thread.
> > > > > > If that total exceeds 750% (doesn't matter whether it's spent
> more
> > in
> > > > > > network or io thread), the request will be throttled. This seems
> > more
> > > > > > general and is not sensitive to the current implementation detail
> > of
> > > > > having
> > > > > > a separate network and io thread pool. In the future, if the
> > > threading
> > > > > > model changes, the same concept of quota can still be applied.
> For
> > > now,
> > > > > > since it's a bit tricky to add the delay logic in the network
> > thread
> > > > > pool,
> > > > > > we could probably just do the delaying only in the io threads as
> > you
> > > > > > suggested earlier.
> > > > > >
> > > > > > There is still the orthogonal question of whether a quota of 50%
> is
> > > out
> > > > > of
> > > > > > 100% or 100% * #total processing threads. My 

[jira] [Commented] (KAFKA-4569) Transient failure in org.apache.kafka.clients.consumer.KafkaConsumerTest.testWakeupWithFetchDataAvailable

2017-03-08 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901556#comment-15901556
 ] 

Jason Gustafson commented on KAFKA-4569:


Sorry for the late response. If I understand correctly, the bug we've 
identified is that the consumer's wakeup will be postponed to a later poll() if 
we have fetch data available. The challenge in fixing this is ensuring that we 
don't increment the consumer's positions prior to raising the WakeupException, 
since that would effectively cause lost data. A straightforward option that 
comes to mind is adding a {{hasFetchedRecords()}} method to {{Fetcher}} to be 
able to check for fetched data without incrementing the position. We could also 
add a {{hasPendingWakeup()}} or something to {{ConsumerNetworkClient}}. If a 
wakeup is expected, then we can just call {{poll(0)}} to trigger it.
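
A rough sketch of the shape this could take; only the two proposed method names come from the comment above, and the surrounding types are simplified stand-ins, not the real consumer internals:

{code:java}
// Simplified stand-ins for the real classes; illustrative only.
class FetcherStub {
    boolean hasFetchedRecords() {
        // Would check completed fetches WITHOUT advancing positions.
        return false;
    }
}

class NetworkClientStub {
    boolean hasPendingWakeup() { return false; }
    void poll(long timeout) { /* would raise WakeupException if a wakeup is pending */ }
}

class ConsumerPollSketch {
    private final FetcherStub fetcher = new FetcherStub();
    private final NetworkClientStub client = new NetworkClientStub();

    void pollOnce() {
        // Raise a pending wakeup before returning buffered records, so
        // positions are never advanced past data the caller will not see.
        if (fetcher.hasFetchedRecords() && client.hasPendingWakeup()) {
            client.poll(0); // triggers the WakeupException
        }
        // ... normal fetch / return path ...
    }
}
{code}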

[~umesh9...@gmail.com] Are you still working on this?

> Transient failure in 
> org.apache.kafka.clients.consumer.KafkaConsumerTest.testWakeupWithFetchDataAvailable
> -
>
> Key: KAFKA-4569
> URL: https://issues.apache.org/jira/browse/KAFKA-4569
> Project: Kafka
>  Issue Type: Sub-task
>  Components: unit tests
>Reporter: Guozhang Wang
>Assignee: Umesh Chaudhary
>  Labels: newbie
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> One example is:
> https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/370/testReport/junit/org.apache.kafka.clients.consumer/KafkaConsumerTest/testWakeupWithFetchDataAvailable/
> {code}
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.fail(Assert.java:95)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumerTest.testWakeupWithFetchDataAvailable(KafkaConsumerTest.java:679)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>   at 

[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception

2017-03-08 Thread Edoardo Comar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901349#comment-15901349
 ] 

Edoardo Comar commented on KAFKA-4669:
--

thanks [~ijuma] yes we're planning to move to 0.10.2 very soon

> KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws 
> exception
> -
>
> Key: KAFKA-4669
> URL: https://issues.apache.org/jira/browse/KAFKA-4669
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Cheng Ju
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0
>
>
> There is no try catch in NetworkClient.handleCompletedReceives.  If an 
> exception is thrown after inFlightRequests.completeNext(source), then the 
> corresponding RecordBatch's done will never get called, and 
> KafkaProducer.flush will hang on this RecordBatch.
> I've checked 0.10 code and think this bug does exist in 0.10 versions.
> A real case.  First a correlateId not match exception happens:
> 13 Jan 2017 17:08:24,059 ERROR [kafka-producer-network-thread | producer-21] 
> (org.apache.kafka.clients.producer.internals.Sender.run:130)  - Uncaught 
> error in kafka producer I/O thread: 
> java.lang.IllegalStateException: Correlation id for response (703766) does 
> not match request (703764)
>   at 
> org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477)
>   at 
> org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
>   at java.lang.Thread.run(Thread.java:745)
> Then jstack shows the thread is hanging on:
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:57)
>   at 
> org.apache.kafka.clients.producer.internals.RecordAccumulator.awaitFlushCompletion(RecordAccumulator.java:425)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.flush(KafkaProducer.java:544)
>   at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:224)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
>   at java.lang.Thread.run(Thread.java:745)
> client code 
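
Not the actual fix, but a generic sketch (stand-in types, made-up names) of the guard this report implies: complete the in-flight request exceptionally when response handling throws, so that waiters such as flush() are released.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

class ResponseLoopSketch {
    // Stand-in for an in-flight request whose completion callback must always
    // run; otherwise anyone awaiting it (e.g. flush()) blocks forever.
    interface InFlightRequest {
        void onComplete(Object response);
        void onError(RuntimeException error);
    }

    private final Deque<InFlightRequest> inFlight = new ArrayDeque<>();

    void handleCompletedReceive(Object response) {
        InFlightRequest request = inFlight.poll();
        if (request == null) return;
        try {
            request.onComplete(response); // may throw, e.g. on a correlation-id mismatch
        } catch (RuntimeException e) {
            request.onError(e); // fail the request instead of leaving it hanging
        }
    }
}
{code}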





[jira] [Created] (KAFKA-4866) Kafka console consumer property is ignored

2017-03-08 Thread Frank Lyaruu (JIRA)
Frank Lyaruu created KAFKA-4866:
---

 Summary: Kafka console consumer property is ignored
 Key: KAFKA-4866
 URL: https://issues.apache.org/jira/browse/KAFKA-4866
 Project: Kafka
  Issue Type: Bug
  Components: core, tools
Affects Versions: 0.10.2.0
 Environment: Java 8, Mac
Reporter: Frank Lyaruu
Priority: Trivial


I'd like to read a topic using the console consumer, which prints the keys but 
not the values:

kafka-console-consumer --bootstrap-server someserver:9092 --from-beginning 
--property print.key=true --property print.value=false --topic some_topic

the print.value property seems to be completely ignored (it seems to be missing 
from the source), but it is mentioned in the quickstart.






[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception

2017-03-08 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901095#comment-15901095
 ] 

Ismael Juma commented on KAFKA-4669:


By the way, the code where the NPE was happening changed in 0.10.2.0 so that we 
check for `null` and call `closingChannel` in that case:

https://github.com/apache/kafka/commit/e53babab9cada20cc54a18c0fd63aa5ab84fd012#diff-24ec8055938739c40d2c11cafff35455R494

It's not clear to me if the issue is totally fixed though, as an NPE could 
arise if `closingChannel` returns `null` in that case (in theory, it should 
not, but we were never able to find out what could cause `channel` to return 
`null` previously).

[~ecomar], are you guys planning to upgrade to 0.10.2.0 soon?

> KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws 
> exception
> -
>
> Key: KAFKA-4669
> URL: https://issues.apache.org/jira/browse/KAFKA-4669
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Cheng Ju
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0
>
>
> There is no try catch in NetworkClient.handleCompletedReceives.  If an 
> exception is thrown after inFlightRequests.completeNext(source), then the 
> corresponding RecordBatch's done will never get called, and 
> KafkaProducer.flush will hang on this RecordBatch.
> I've checked 0.10 code and think this bug does exist in 0.10 versions.
> A real case.  First a correlateId not match exception happens:
> 13 Jan 2017 17:08:24,059 ERROR [kafka-producer-network-thread | producer-21] 
> (org.apache.kafka.clients.producer.internals.Sender.run:130)  - Uncaught 
> error in kafka producer I/O thread: 
> java.lang.IllegalStateException: Correlation id for response (703766) does 
> not match request (703764)
>   at 
> org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477)
>   at 
> org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
>   at java.lang.Thread.run(Thread.java:745)
> Then jstack shows the thread is hanging on:
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:57)
>   at 
> org.apache.kafka.clients.producer.internals.RecordAccumulator.awaitFlushCompletion(RecordAccumulator.java:425)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.flush(KafkaProducer.java:544)
>   at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:224)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
>   at java.lang.Thread.run(Thread.java:745)
> client code 





[jira] [Commented] (KAFKA-4864) Kafka Secure Migrator tool doesn't secure all the nodes

2017-03-08 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901088#comment-15901088
 ] 

Ismael Juma commented on KAFKA-4864:


Thanks for the report. I think the expectation was that you would secure ZK 
before you added ACLs to it, but I agree that this should be improved. I 
haven't looked at the details of the rest yet.

> Kafka Secure Migrator tool doesn't secure all the nodes
> ---
>
> Key: KAFKA-4864
> URL: https://issues.apache.org/jira/browse/KAFKA-4864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0
>Reporter: Stephane Maarek
>Priority: Critical
>
> It seems that the secure nodes as referred by ZkUtils.scala are the following:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L201
> A couple things:
> - the list is highly outdated, and for example the most important nodes such 
> as kafka-acls don't get secured. That's a huge security risk. Would it be 
> better to just secure all the nodes recursively from the given root?
> - the root of some nodes aren't secured. Ex: /brokers (but many others).
> The result is the following after running the tool:
> zookeeper-security-migration --zookeeper.acl secure --zookeeper.connect 
> zoo1:2181/kafka-test
> [zk: localhost:2181(CONNECTED) 9] getAcl /kafka-test/brokers
> 'world,'anyone
> : cdrwa
> [zk: localhost:2181(CONNECTED) 11] getAcl /kafka-test/brokers/ids
> 'world,'anyone
> : r
> 'sasl,'myzkcli...@example.com
> : cdrwa
> [zk: localhost:2181(CONNECTED) 16] getAcl /kafka-test/kafka-acl
> 'world,'anyone
> : cdrwa
> That seems pretty bad to be honest... A fast enough ZkClient could delete 
> some root nodes and create the nodes it likes before the ACLs get set. 





[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception

2017-03-08 Thread Edoardo Comar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901075#comment-15901075
 ] 

Edoardo Comar commented on KAFKA-4669:
--

The NPE moved from broker1 to another broker when the former was restarted. 
This suggests to me that it's caused by a 'special' client.

> KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws 
> exception
> -
>
> Key: KAFKA-4669
> URL: https://issues.apache.org/jira/browse/KAFKA-4669
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Cheng Ju
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0
>
>
> There is no try catch in NetworkClient.handleCompletedReceives.  If an 
> exception is thrown after inFlightRequests.completeNext(source), then the 
> corresponding RecordBatch's done will never get called, and 
> KafkaProducer.flush will hang on this RecordBatch.
> I've checked 0.10 code and think this bug does exist in 0.10 versions.
> A real case.  First a correlateId not match exception happens:
> 13 Jan 2017 17:08:24,059 ERROR [kafka-producer-network-thread | producer-21] 
> (org.apache.kafka.clients.producer.internals.Sender.run:130)  - Uncaught 
> error in kafka producer I/O thread: 
> java.lang.IllegalStateException: Correlation id for response (703766) does 
> not match request (703764)
>   at 
> org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477)
>   at 
> org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
>   at java.lang.Thread.run(Thread.java:745)
> Then jstack shows the thread is hanging on:
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:57)
>   at 
> org.apache.kafka.clients.producer.internals.RecordAccumulator.awaitFlushCompletion(RecordAccumulator.java:425)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.flush(KafkaProducer.java:544)
>   at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:224)
>   at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
>   at java.lang.Thread.run(Thread.java:745)
> client code 





Re: [DISCUSS] KIP-124: Request rate quotas

2017-03-08 Thread Rajini Sivaram
Hi Todd,

Thank you for the review.

For SSL, the case that is not covered is Scenario 6 in the KIP that Ismael
pointed out. For clusters with only SSL or PLAINTEXT, byte rate quotas work
well, but for clusters with both SSL and PLAINTEXT, network thread
utilization also needs to be taken into account.

For the percentage used in quota configuration, it looks like opinion is still
split between an overall percentage and a per-thread percentage. I will wait
for Jun to respond before updating the KIP either way.

Regards,

Rajini

On Wed, Mar 8, 2017 at 12:45 AM, Todd Palino  wrote:

> I’ve been following this one on and off, and overall it sounds good to me.
>
> - The SSL question is a good one. However, that type of overhead should be
> proportional to the bytes rate, so I think that a bytes rate quota would
> still be a suitable way to address it.
>
> - I think it’s better to make the quota percentage of total thread pool
> capacity, and not percentage of an individual thread. That way you don’t
> have to adjust it when you adjust thread counts (tuning, hardware changes,
> etc.)
>
>
> -Todd
>
>
>
> On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin  wrote:
>
> > I see. Good point about SSL.
> >
> > I just asked Todd to take a look.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao  wrote:
> >
> > > Hi, Jiangjie,
> > >
> > > Yes, I agree that byte rate already protects the network threads
> > > indirectly. I am not sure if byte rate fully captures the CPU overhead
> in
> > > network due to SSL. So, at the high level, we can use request time
> limit
> > to
> > > protect CPU and use byte rate to protect storage and network.
> > >
> > > Also, do you think you can get Todd to comment on this KIP?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin 
> > wrote:
> > >
> > > > Hi Rajini/Jun,
> > > >
> > > > The percentage based reasoning sounds good.
> > > > One thing I am wondering is that if we assume the network thread are
> > just
> > > > doing the network IO, can we say bytes rate quota is already sort of
> > > > network threads quota?
> > > > If we take network threads into the consideration here, would that be
> > > > somewhat overlapping with the bytes rate quota?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <
> > rajinisiva...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > Thank you for the explanation, I hadn't realized you meant
> percentage
> > > of
> > > > > the total thread pool. If everyone is OK with Jun's suggestion, I
> > will
> > > > > update the KIP.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao  wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Let's take your example. Let's say a user sets the limit to 50%.
> I
> > am
> > > > not
> > > > > > sure if it's better to apply the same percentage separately to
> > > network
> > > > > and
> > > > > > io thread pool. For example, for produce requests, most of the
> time
> > > > will
> > > > > be
> > > > > > spent in the io threads whereas for fetch requests, most of the
> > time
> > > > will
> > > > > > be in the network threads. So, using the same percentage in both
> > > thread
> > > > > > pools means one of the pools' resource will be over allocated.
> > > > > >
> > > > > > An alternative way is to simply model network and io thread pool
> > > > > together.
> > > > > > If you get 10 io threads and 5 network threads, you get 1500%
> > request
> > > > > > processing power. A 50% limit means a total of 750% processing
> > power.
> > > > We
> > > > > > just add up the time a user request spent in either network or io
> > > > thread.
> > > > > > If that total exceeds 750% (doesn't matter whether it's spent
> more
> > in
> > > > > > network or io thread), the request will be throttled. This seems
> > more
> > > > > > general and is not sensitive to the current implementation detail
> > of
> > > > > having
> > > > > > a separate network and io thread pool. In the future, if the
> > > threading
> > > > > > model changes, the same concept of quota can still be applied.
> For
> > > now,
> > > > > > since it's a bit tricky to add the delay logic in the network
> > thread
> > > > > pool,
> > > > > > we could probably just do the delaying only in the io threads as
> > you
> > > > > > suggested earlier.
> > > > > >
> > > > > > There is still the orthogonal question of whether a quota of 50%
> is
> > > out
> > > > > of
> > > > > > 100% or 100% * #total processing threads. My feeling is that the
> > > latter
> > > > > is
> > > > > > slightly better based on my explanation earlier. The way to
> > describe
> > > > this
> > > > > > quota to the users can be "share of elapsed request processing
> time
> > > on
> > 

Re: [VOTE] KIP-128: Add ByteArrayConverter for Kafka Connect

2017-03-08 Thread Dongjin Lee
 
 
Me too. +1.

--
Dongjin Lee

Software developer in Line+.
So interested in massive-scale machine learning.

facebook: www.facebook.com/dongjin.lee.kr
linkedin: kr.linkedin.com/in/dongjinleekr
github: github.com/dongjinleekr
twitter: www.twitter.com/dongjinleekr

> On Mar 8, 2017 at 3:45 PM, <r...@confluent.io> wrote:
>
> +1. Looks good
>
> On Tue, Mar 7, 2017 at 10:33 PM, Ewen Cheslack-Postava wrote:
>
> > Hah, I forgot to do it in the original email, but I suppose I should make
> > it explicit: +1 (binding)
> >
> > -Ewen
> >
> > On Mon, Mar 6, 2017 at 9:26 PM, Gwen Shapira wrote:
> >
> > > +1 (binding)
> > >
> > > On Mon, Mar 6, 2017 at 7:53 PM, Ewen Cheslack-Postava wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to kick off voting for KIP-128: Add ByteArrayConverter for
> > > > Kafka Connect:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-128%3A+Add+ByteArrayConverter+for+Kafka+Connect
> > > >
> > > > There was a small amount of discussion, see the original thread here:
> > > > https://lists.apache.org/thread.html/62fc2245285ac5d15ebb9b2ebed672b51e391c8dfe9a51be85f685f3@%3Cdev.kafka.apache.org%3E
> > > >
> > > > The vote will stay open for at least 72 hours.
> > > >
> > > > -Ewen
> > >
> > > --
> > > *Gwen Shapira*
> > > Product Manager | Confluent
> > > 650.450.2760 | @gwenshap
> > > Follow us: Twitter | blog