[jira] [Created] (KAFKA-5366) Add cases for concurrent transactional reads and writes in system tests

2017-06-02 Thread Apurva Mehta (JIRA)
Apurva Mehta created KAFKA-5366:
---

 Summary: Add cases for concurrent transactional reads and writes 
in system tests
 Key: KAFKA-5366
 URL: https://issues.apache.org/jira/browse/KAFKA-5366
 Project: Kafka
  Issue Type: Test
Affects Versions: 0.11.0.0
Reporter: Apurva Mehta
Assignee: Apurva Mehta
 Fix For: 0.11.1.0


Currently the transactions system test does a transactional copy while bouncing 
brokers and clients, and then does a verifying read on the output topic to 
ensure that it exactly matches the input. 

We should also have a transactional consumer reading the tail of the output 
topic as the writes are happening, and then assert that the values _it_ reads 
also exactly match the values in the source topics. 

This test really exercises the abort index, and we don't have any of them in 
the system or integration tests right now. 
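
To make the proposed check concrete, here is a minimal sketch of such a concurrent tail reader (assumptions: broker at localhost:9092, an output topic named "output-topic", string values). The isolation.level=read_committed setting is what filters out aborted transactional data:

{code}
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConcurrentTailReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "tail-verifier");
        // Only committed transactional data is returned; aborted batches are filtered out.
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("output-topic"));
            // Simplified: loop forever; the real test would stop once the copy job is done.
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    // In the test, each value read here would be checked against the
                    // set of values written to the source topic.
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}
{code}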



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5366) Add cases for concurrent transactional reads and writes in system tests

2017-06-02 Thread Apurva Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta updated KAFKA-5366:

Description: 
Currently the transactions system test does a transactional copy while bouncing 
brokers and clients, and then does a verifying read on the output topic to 
ensure that it exactly matches the input. 

We should also have a transactional consumer reading the tail of the output 
topic as the writes are happening, and then assert that the values _it_ reads 
also exactly match the values in the source topics. 

This test really exercises the abort index, and we don't have any of them in 
the system or integration tests right now. 

  was:
Currently the transactions system test does transactional copy while bouncing 
brokers and clients, and then does a verifying read on the output topic to 
ensure that it exactly matches the input. 

We should also have a transactional consumer reading the tail of the output 
topic as the writes are happening, and then assert that the values _it_ reads 
also exactly match the values in the source topics. 

This test really exercises the abort index, and we don't have any of them in 
the system or integration tests right now. 


> Add cases for concurrent transactional reads and writes in system tests
> ---
>
> Key: KAFKA-5366
> URL: https://issues.apache.org/jira/browse/KAFKA-5366
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 0.11.0.0
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
> Fix For: 0.11.1.0
>
>
> Currently the transactions system test does a transactional copy while 
> bouncing brokers and clients, and then does a verifying read on the output 
> topic to ensure that it exactly matches the input. 
> We should also have a transactional consumer reading the tail of the output 
> topic as the writes are happening, and then assert that the values _it_ reads 
> also exactly match the values in the source topics. 
> This test really exercises the abort index, and we don't have any of them in 
> the system or integration tests right now. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5284) Add tools and metrics to diagnose problems with the idempotent producer and transactions

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5284:
---
Fix Version/s: (was: 0.11.0.0)
   0.11.0.1

> Add tools and metrics to diagnose problems with the idempotent producer and 
> transactions
> 
>
> Key: KAFKA-5284
> URL: https://issues.apache.org/jira/browse/KAFKA-5284
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Apurva Mehta
>  Labels: exactly-once
> Fix For: 0.11.0.1
>
>
> The KIP mentions a number of metrics which we should add, but haven't yet 
> done so. It would also be good to have tools to help diagnose degenerate 
> situations like:
> # If a consumer is stuck, we should be able to find the LSO of the partition 
> it is blocked on, and which producer is holding up the advancement of the LSO.
> # We should be able to force abort any inflight transaction to free up 
> consumers
> # etc.
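
As a rough illustration of point 1, the sketch below estimates how far the LSO lags the log end offset by comparing endOffsets() under the two isolation levels. It assumes, per KIP-98, that a read_committed consumer's endOffsets() reflects the last stable offset; the topic, partition, and bootstrap address are placeholders, and this is no substitute for the proposed tooling:

{code}
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class LsoLagProbe {
    public static void main(String[] args) {
        Set<TopicPartition> partitions = Collections.singleton(new TopicPartition("my-topic", 0));

        Map<TopicPartition, Long> logEnd = endOffsets(partitions, "read_uncommitted");
        Map<TopicPartition, Long> lastStable = endOffsets(partitions, "read_committed");

        for (TopicPartition tp : partitions) {
            long lag = logEnd.get(tp) - lastStable.get(tp);
            // A persistent non-zero gap suggests an open (possibly hung) transaction is
            // holding back the LSO and blocking read_committed consumers.
            System.out.printf("%s: logEndOffset=%d lastStableOffset=%d lag=%d%n",
                    tp, logEnd.get(tp), lastStable.get(tp), lag);
        }
    }

    private static Map<TopicPartition, Long> endOffsets(Set<TopicPartition> partitions, String isolationLevel) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, isolationLevel);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            return consumer.endOffsets(partitions);
        }
    }
}
{code}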



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5284) Add tools and metrics to diagnose problems with the idempotent producer and transactions

2017-06-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034258#comment-16034258
 ] 

Ismael Juma commented on KAFKA-5284:


I'm moving this to 0.11.0.1 given the current timelines. If we somehow have 
time to do it before the release, we can consider whether to include it.

> Add tools and metrics to diagnose problems with the idempotent producer and 
> transactions
> 
>
> Key: KAFKA-5284
> URL: https://issues.apache.org/jira/browse/KAFKA-5284
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Apurva Mehta
>  Labels: exactly-once
> Fix For: 0.11.0.1
>
>
> The KIP mentions a number of metrics which we should add, but haven't yet 
> done so. It would also be good to have tools to help diagnose degenerate 
> situations like:
> # If a consumer is stuck, we should be able to find the LSO of the partition 
> it is blocked on, and which producer is holding up the advancement of the LSO.
> # We should be able to force abort any inflight transaction to free up 
> consumers
> # etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5026) DebuggingConsumerId and DebuggingMessageFormatter and message format v2

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5026:
---
Fix Version/s: (was: 0.11.0.0)
   0.11.0.1

> DebuggingConsumerId and DebuggingMessageFormatter and message format v2
> ---
>
> Key: KAFKA-5026
> URL: https://issues.apache.org/jira/browse/KAFKA-5026
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>  Labels: exactly-once, tooling
> Fix For: 0.11.0.1
>
>
> [~junrao] suggested the following:
> Currently, the broker supports a DebuggingConsumerId mode for the fetch 
> request. Should we extend that so that the consumer can read the control 
> message as well? Should we also have some kind of DebuggingMessageFormatter 
> so that ConsoleConsumer can show all the newly introduced fields in the new 
> message format (e.g., pid, epoch, etc) for debugging purpose?
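
For context, a bare-bones formatter of the kind discussed might look like the sketch below. It can only print what ConsumerRecord already exposes, which is exactly why fields such as pid and epoch would need the extensions proposed here; the kafka.common.MessageFormatter interface details are assumed from the 0.11 sources:

{code}
import java.io.PrintStream;
import java.util.Properties;

import kafka.common.MessageFormatter;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Sketch of a debugging formatter loadable via ConsoleConsumer's --formatter option.
public class DebuggingMessageFormatterSketch implements MessageFormatter {

    @Override
    public void init(Properties props) {
        // no configuration needed for this sketch
    }

    @Override
    public void writeTo(ConsumerRecord<byte[], byte[]> record, PrintStream output) {
        // Only fields already exposed on ConsumerRecord are reachable here;
        // pid/epoch from message format v2 would require new consumer-side APIs.
        output.printf("partition=%d offset=%d timestamp=%d keyBytes=%d valueBytes=%d%n",
                record.partition(), record.offset(), record.timestamp(),
                record.key() == null ? -1 : record.key().length,
                record.value() == null ? -1 : record.value().length);
    }

    @Override
    public void close() {
        // nothing to clean up
    }
}
{code}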



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5032) Think through implications of max.message.size affecting record batches in message format V2

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-5032:
--

Assignee: (was: Ismael Juma)

> Think through implications of max.message.size affecting record batches in 
> message format V2
> 
>
> Key: KAFKA-5032
> URL: https://issues.apache.org/jira/browse/KAFKA-5032
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Priority: Critical
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> It's worth noting that the new behaviour for uncompressed messages is the 
> same as the existing behaviour for compressed messages.
> A few things to think about:
> 1. Do the producer settings max.request.size and batch.size still make sense 
> and do we need to update the documentation? My conclusion is that things are 
> still fine, but we may need to revise the docs.
> 2. Consider changing default max message set size to include record batch 
> overhead. This is currently defined as:
> {code}
> val MessageMaxBytes = 100 + MessageSet.LogOverhead
> {code}
> We should consider changing it to (I haven't thought it through though):
> {code}
> val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
> {code}
> 3. When a record batch is too large, we throw RecordTooLargeException, which 
> is confusing because there's also a RecordBatchTooLargeException. We should 
> consider renaming these exceptions to make the behaviour clearer.
> 4. We should consider deprecating max.message.bytes (server config) and 
> message.max.bytes (topic config) in favour of configs that make it clear that 
> we are talking about record batches instead of individual messages.
> Part of the work in this JIRA is working out what should be done for 0.11.0.0 
> and what can be done later.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5026) DebuggingConsumerId and DebuggingMessageFormatter and message format v2

2017-06-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034259#comment-16034259
 ] 

Ismael Juma commented on KAFKA-5026:


Moving this to 0.11.0.1. If we somehow have time to do this before 0.11.0.0 is 
out, we can consider including it then.

> DebuggingConsumerId and DebuggingMessageFormatter and message format v2
> ---
>
> Key: KAFKA-5026
> URL: https://issues.apache.org/jira/browse/KAFKA-5026
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>  Labels: exactly-once, tooling
> Fix For: 0.11.0.1
>
>
> [~junrao] suggested the following:
> Currently, the broker supports a DebuggingConsumerId mode for the fetch 
> request. Should we extend that so that the consumer can read the control 
> message as well? Should we also have some kind of DebuggingMessageFormatter 
> so that ConsoleConsumer can show all the newly introduced fields in the new 
> message format (e.g., pid, epoch, etc) for debugging purpose?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5032) Think through implications of max.message.size affecting record batches in message format V2

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-5032:
--

Assignee: Ismael Juma

> Think through implications of max.message.size affecting record batches in 
> message format V2
> 
>
> Key: KAFKA-5032
> URL: https://issues.apache.org/jira/browse/KAFKA-5032
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> It's worth noting that the new behaviour for uncompressed messages is the 
> same as the existing behaviour for compressed messages.
> A few things to think about:
> 1. Do the producer settings max.request.size and batch.size still make sense 
> and do we need to update the documentation? My conclusion is that things are 
> still fine, but we may need to revise the docs.
> 2. Consider changing default max message set size to include record batch 
> overhead. This is currently defined as:
> {code}
> val MessageMaxBytes = 100 + MessageSet.LogOverhead
> {code}
> We should consider changing it to (I haven't thought it through though):
> {code}
> val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
> {code}
> 3. When a record batch is too large, we throw RecordTooLargeException, which 
> is confusing because there's also a RecordBatchTooLargeException. We should 
> consider renaming these exceptions to make the behaviour clearer.
> 4. We should consider deprecating max.message.bytes (server config) and 
> message.max.bytes (topic config) in favour of configs that make it clear that 
> we are talking about record batches instead of individual messages.
> Part of the work in this JIRA is working out what should be done for 0.11.0.0 
> and what can be done later.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5032) Think through implications of max.message.size affecting record batches in message format V2

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5032:
---
Labels: documentation exactly-once  (was: exactly-once)

> Think through implications of max.message.size affecting record batches in 
> message format V2
> 
>
> Key: KAFKA-5032
> URL: https://issues.apache.org/jira/browse/KAFKA-5032
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Priority: Critical
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> It's worth noting that the new behaviour for uncompressed messages is the 
> same as the existing behaviour for compressed messages.
> A few things to think about:
> 1. Do the producer settings max.request.size and batch.size still make sense 
> and do we need to update the documentation? My conclusion is that things are 
> still fine, but we may need to revise the docs.
> 2. Consider changing default max message set size to include record batch 
> overhead. This is currently defined as:
> {code}
> val MessageMaxBytes = 100 + MessageSet.LogOverhead
> {code}
> We should consider changing it to (I haven't thought it through though):
> {code}
> val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
> {code}
> 3. When a record batch is too large, we throw RecordTooLargeException, which 
> is confusing because there's also a RecordBatchTooLargeException. We should 
> consider renaming these exceptions to make the behaviour clearer.
> 4. We should consider deprecating max.message.bytes (server config) and 
> message.max.bytes (topic config) in favour of configs that make it clear that 
> we are talking about record batches instead of individual messages.
> Part of the work in this JIRA is working out what should be done for 0.11.0.0 
> and what can be done later.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4595) Controller send thread can't stop when broker change listener event trigger for dead brokers

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4595:
---
Fix Version/s: (was: 0.11.0.0)
   0.11.0.1

> Controller send thread can't stop when broker change listener event trigger 
> for  dead brokers
> -
>
> Key: KAFKA-4595
> URL: https://issues.apache.org/jira/browse/KAFKA-4595
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.1.1
>Reporter: Pengwei
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.1
>
>
> In our test env, we found the controller is not working after a delete topic 
> operation and a network issue; the stack is below:
> "ZkClient-EventThread-15-192.168.1.3:2184,192.168.1.4:2184,192.168.1.5:2184" 
> #15 daemon prio=5 os_prio=0 tid=0x7fb76416e000 nid=0x3019 waiting on 
> condition [0x7fb76b7c8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xc05497b8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> kafka.utils.ShutdownableThread.awaitShutdown(ShutdownableThread.scala:50)
>   at kafka.utils.ShutdownableThread.shutdown(ShutdownableThread.scala:32)
>   at 
> kafka.controller.ControllerChannelManager.kafka$controller$ControllerChannelManager$$removeExistingBroker(ControllerChannelManager.scala:128)
>   at 
> kafka.controller.ControllerChannelManager.removeBroker(ControllerChannelManager.scala:81)
>   - locked <0xc0258760> (a java.lang.Object)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply$mcVI$sp(ReplicaStateMachine.scala:369)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply(ReplicaStateMachine.scala:369)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply(ReplicaStateMachine.scala:369)
>   at scala.collection.immutable.Set$Set1.foreach(Set.scala:79)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:369)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356)
>   at org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:842)
>   at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
>Locked ownable synchronizers:
>   - <0xc02587f8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> "Controller-1001-to-broker-1003-send-thread" #88 prio=5 os_prio=0 
> tid=0x7fb778342000 nid=0x5a4c waiting on condition [0x7fb761de]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xc02587f8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.Ab

[jira] [Commented] (KAFKA-4595) Controller send thread can't stop when broker change listener event trigger for dead brokers

2017-06-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034264#comment-16034264
 ] 

Ismael Juma commented on KAFKA-4595:


Bumping the fix version until we get confirmation if this was fixed or not.

> Controller send thread can't stop when broker change listener event trigger 
> for  dead brokers
> -
>
> Key: KAFKA-4595
> URL: https://issues.apache.org/jira/browse/KAFKA-4595
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.1.1
>Reporter: Pengwei
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.1
>
>
> In our test env, we found the controller is not working after a delete topic 
> operation and a network issue; the stack is below:
> "ZkClient-EventThread-15-192.168.1.3:2184,192.168.1.4:2184,192.168.1.5:2184" 
> #15 daemon prio=5 os_prio=0 tid=0x7fb76416e000 nid=0x3019 waiting on 
> condition [0x7fb76b7c8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xc05497b8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> kafka.utils.ShutdownableThread.awaitShutdown(ShutdownableThread.scala:50)
>   at kafka.utils.ShutdownableThread.shutdown(ShutdownableThread.scala:32)
>   at 
> kafka.controller.ControllerChannelManager.kafka$controller$ControllerChannelManager$$removeExistingBroker(ControllerChannelManager.scala:128)
>   at 
> kafka.controller.ControllerChannelManager.removeBroker(ControllerChannelManager.scala:81)
>   - locked <0xc0258760> (a java.lang.Object)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply$mcVI$sp(ReplicaStateMachine.scala:369)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply(ReplicaStateMachine.scala:369)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply(ReplicaStateMachine.scala:369)
>   at scala.collection.immutable.Set$Set1.foreach(Set.scala:79)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:369)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
>   at 
> kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356)
>   at org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:842)
>   at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
>Locked ownable synchronizers:
>   - <0xc02587f8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> "Controller-1001-to-broker-1003-send-thread" #88 prio=5 os_prio=0 
> tid=0x7fb778342000 nid=0x5a4c waiting on condition [0x7fb761de]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xc02587f8> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueue

Re: [DISCUSS] KIP-147: Add missing type parameters to StateStoreSupplier factories and KGroupedStream/Table methods

2017-06-02 Thread Michal Borowiecki

Thanks Matthias,

I appreciate people are busy now preparing the 0.11 release.

One thing I would also appreciate input on is a better name for 
the new TypedStores class; I picked it quickly and don't really 
like it.


Perhaps StateStores would make for a better name?

Cheers,
Michal

On 02/06/17 07:18, Matthias J. Sax wrote:

Thanks for the update Michal.

I did skim over the PR. Looks good to me, as far as I can tell. Maybe
Damian, Xavier, or Ismael can comment on this. Would be good to get
confirmation that the change is backward compatible.


-Matthias


On 5/27/17 11:11 AM, Michal Borowiecki wrote:

Hi all,

I've updated the KIP to reflect the proposed backwards-compatible approach:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=69408481


Given the vast area of APIs affected, I think the PR is easier to read
than the code excerpts in the KIP itself:
https://github.com/apache/kafka/pull/2992/files

Thanks,
Michał

On 07/05/17 10:16, Eno Thereska wrote:

I like this KIP in general and I agree it’s needed. Perhaps Damian can comment 
on the session store issue?

Thanks
Eno

On May 6, 2017, at 10:32 PM, Michal Borowiecki  
wrote:

Hi Matthias,

Agreed. I tried your proposal and indeed it would work.

However, I think to maintain full backward compatibility we would also need to 
deprecate Stores.create() and leave it unchanged, while providing a new method 
that returns the more strongly typed Factories.

( This is because PersistentWindowFactory and PersistentSessionFactory cannot extend the existing 
PersistentKeyValueFactory interface, since their build() methods will be returning 
TypedStateStoreSupplier<WindowStore<K, V>> and TypedStateStoreSupplier<SessionStore<K, V>> 
respectively, which are NOT subclasses of TypedStateStoreSupplier<KeyValueStore<K, V>>. I do not see 
another way around it. Admittedly, my type covariance skills are rudimentary. Does anyone see a better way around 
this? )
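
To make the covariance problem concrete, here is a stripped-down sketch with hypothetical stand-in types (not the real Streams interfaces):

// Hypothetical, simplified stand-ins for the KIP-147 types.
interface TypedStateStoreSupplier<T> {
    T get();
}

interface KeyValueStore<K, V> { }
interface WindowStore<K, V> { }   // independent store type, not a KeyValueStore

interface PersistentKeyValueFactory<K, V> {
    TypedStateStoreSupplier<KeyValueStore<K, V>> build();
}

// This does not compile if we try "extends PersistentKeyValueFactory<K, V>":
// TypedStateStoreSupplier<WindowStore<K, V>> is not a subtype of
// TypedStateStoreSupplier<KeyValueStore<K, V>> because generics are invariant,
// so build() cannot covariantly narrow its return type across the two factories.
interface PersistentWindowFactory<K, V> {
    TypedStateStoreSupplier<WindowStore<K, V>> build();
}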

Since create() takes only the store name as argument, and I don't see what we 
could overload it with, the new method would need to have a different name.

Alternatively, since create(String) is the only method in Stores, we could 
deprecate the entire class and provide a new one. That would be my preference. 
Any ideas what to call it?



All comments and suggestions appreciated.



Cheers,

Michał


On 04/05/17 21:48, Matthias J. Sax wrote:

I had a quick look into this.

With regard to backward compatibility, I think it would be required to
introduce a new type `TypedStateStoreSupplier` (that extends
`StateStoreSupplier`) and to overload all methods that take a
`StateStoreSupplier` so that they also accept the new type.

This would allow `.build` to return a `TypedStateStoreSupplier` and
thus, would not break any code. At least if I did not miss anything with
regard to some magic of type inference using generics (I am not an
expert in this field).


-Matthias

On 5/4/17 11:32 AM, Matthias J. Sax wrote:

Did not have time to have a look. But backward compatibility is a must
from my point of view.

-Matthias


On 5/4/17 12:56 AM, Michal Borowiecki wrote:

Hello,

I've updated the KIP with missing information.

I would especially appreciate some comments on the compatibility aspects
of this as the proposed change is not fully backwards-compatible.

In the absence of comments I shall call for a vote in the next few days.

Thanks,

Michal


On 30/04/17 23:11, Michal Borowiecki wrote:

Hi community!

I have just drafted KIP-147: Add missing type parameters to
StateStoreSupplier factories and KGroupedStream/Table methods
 


Please let me know if this is a step in the right direction.

All comments welcome.

Thanks,
Michal

[jira] [Commented] (KAFKA-5032) Think through implications of max.message.size affecting record batches in message format V2

2017-06-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034279#comment-16034279
 ] 

Ismael Juma commented on KAFKA-5032:


I looked into the config change and it looks like we include 
MessageSet.LogOverhead and DefaultRecordBatch.RECORD_BATCH_OVERHEAD (depending 
on the message format version) in the size computation used by the producer and 
the default `MAX_REQUEST_SIZE_CONFIG` is 1 * 1024 * 1024. Given that, it seems 
that we don't need to change the default broker config. In fact, it looks like 
there is no reason to add `MessageSet.LogOverhead` either, but I suggest we 
just leave this as it is. cc [~junrao][~hachikuji] in case I am missing 
something.
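
A quick sanity check of the numbers above, as a sketch (it assumes DefaultRecordBatch.RECORD_BATCH_OVERHEAD is the public constant referenced in this thread and uses the 1 * 1024 * 1024 producer default quoted in the comment):

{code}
import org.apache.kafka.common.record.DefaultRecordBatch;

public class BatchOverheadCheck {
    public static void main(String[] args) {
        // Default producer max.request.size referenced in the comment: 1 * 1024 * 1024.
        int maxRequestSize = 1 * 1024 * 1024;
        // Per-batch overhead of message format v2 (record batch header).
        int batchOverhead = DefaultRecordBatch.RECORD_BATCH_OVERHEAD;

        // The producer's size estimate already includes the batch overhead, so the
        // payload that fits under max.request.size is reduced by this small constant
        // (plus per-record overhead).
        System.out.printf("batch overhead = %d bytes, usable payload ~ %d bytes%n",
                batchOverhead, maxRequestSize - batchOverhead);
    }
}
{code}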

> Think through implications of max.message.size affecting record batches in 
> message format V2
> 
>
> Key: KAFKA-5032
> URL: https://issues.apache.org/jira/browse/KAFKA-5032
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Priority: Critical
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> It's worth noting that the new behaviour for uncompressed messages is the 
> same as the existing behaviour for compressed messages.
> A few things to think about:
> 1. Do the producer settings max.request.size and batch.size still make sense 
> and do we need to update the documentation? My conclusion is that things are 
> still fine, but we may need to revise the docs.
> 2. Consider changing default max message set size to include record batch 
> overhead. This is currently defined as:
> {code}
> val MessageMaxBytes = 100 + MessageSet.LogOverhead
> {code}
> We should consider changing it to (I haven't thought it through though):
> {code}
> val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
> {code}
> 3. When a record batch is too large, we throw RecordTooLargeException, which 
> is confusing because there's also a RecordBatchTooLargeException. We should 
> consider renaming these exceptions to make the behaviour clearer.
> 4. We should consider deprecating max.message.bytes (server config) and 
> message.max.bytes (topic config) in favour of configs that make it clear that 
> we are talking about record batches instead of individual messages.
> Part of the work in this JIRA is working out what should be done for 0.11.0.0 
> and what can be done later.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5032) Think through implications of max.message.size affecting record batches in message format V2

2017-06-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034279#comment-16034279
 ] 

Ismael Juma edited comment on KAFKA-5032 at 6/2/17 7:35 AM:


I looked into the config change and it looks like we include 
MessageSet.LogOverhead and DefaultRecordBatch.RECORD_BATCH_OVERHEAD (depending 
on the message format version) in the size computation used by the producer and 
the default `MAX_REQUEST_SIZE_CONFIG` is 1 * 1024 * 1024. Given that, it seems 
that we don't need to change the default broker config. In fact, it looks like 
there is no reason to add `MessageSet.LogOverhead` either, but I suggest we 
just leave this as it is. cc [~junrao] [~hachikuji] in case I am missing 
something.


was (Author: ijuma):
I looked into the config change and it looks like we include 
MessageSet.LogOverhead and DefaultRecordBatch.RECORD_BATCH_OVERHEAD (depending 
on the message format version) in the size computation used by the producer and 
the default `MAX_REQUEST_SIZE_CONFIG` is 1 * 1024 * 1024. Given that, it seems 
that we don't need to change the default broker config. In fact, it looks like 
there is no reason to add `MessageSet.LogOverhead` either, but I suggest we 
just leave this as it is. cc [~junrao][~hachikuji] in case I am missing 
something.

> Think through implications of max.message.size affecting record batches in 
> message format V2
> 
>
> Key: KAFKA-5032
> URL: https://issues.apache.org/jira/browse/KAFKA-5032
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Priority: Critical
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> It's worth noting that the new behaviour for uncompressed messages is the 
> same as the existing behaviour for compressed messages.
> A few things to think about:
> 1. Do the producer settings max.request.size and batch.size still make sense 
> and do we need to update the documentation? My conclusion is that things are 
> still fine, but we may need to revise the docs.
> 2. Consider changing default max message set size to include record batch 
> overhead. This is currently defined as:
> {code}
> val MessageMaxBytes = 100 + MessageSet.LogOverhead
> {code}
> We should consider changing it to (I haven't thought it through though):
> {code}
> val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
> {code}
> 3. When a record batch is too large, we throw RecordTooLargeException, which 
> is confusing because there's also a RecordBatchTooLargeException. We should 
> consider renaming these exceptions to make the behaviour clearer.
> 4. We should consider deprecating max.message.bytes (server config) and 
> message.max.bytes (topic config) in favour of configs that make it clear that 
> we are talking about record batches instead of individual messages.
> Part of the work in this JIRA is working out what should be done for 0.11.0.0 
> and what can be done later.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5032) Think through implications of max.message.size affecting record batches in message format V2

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5032:
---
Description: 
It's worth noting that the new behaviour for uncompressed messages is the same 
as the existing behaviour for compressed messages.

A few things to think about:

1. Do the producer settings max.request.size and batch.size still make sense 
and do we need to update the documentation? My conclusion is that things are 
still fine, but we may need to revise the docs.

2. (Seems like we don't need to do this) Consider changing default max message 
set size to include record batch overhead. This is currently defined as:

{code}
val MessageMaxBytes = 100 + MessageSet.LogOverhead
{code}

We should consider changing it to (I haven't thought it through though):

{code}
val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
{code}

3. When a record batch is too large, we throw RecordTooLargeException, which is 
confusing because there's also a RecordBatchTooLargeException. We should 
consider renaming these exceptions to make the behaviour clearer.

4. We should consider deprecating max.message.bytes (server config) and 
message.max.bytes (topic config) in favour of configs that make it clear that 
we are talking about record batches instead of individual messages.

Part of the work in this JIRA is working out what should be done for 0.11.0.0 
and what can be done later.

  was:
It's worth noting that the new behaviour for uncompressed messages is the same 
as the existing behaviour for compressed messages.

A few things to think about:

1. Do the producer settings max.request.size and batch.size still make sense 
and do we need to update the documentation? My conclusion is that things are 
still fine, but we may need to revise the docs.

2. Consider changing default max message set size to include record batch 
overhead. This is currently defined as:

{code}
val MessageMaxBytes = 100 + MessageSet.LogOverhead
{code}

We should consider changing it to (I haven't thought it through though):

{code}
val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
{code}

3. When a record batch is too large, we throw RecordTooLargeException, which is 
confusing because there's also a RecordBatchTooLargeException. We should 
consider renaming these exceptions to make the behaviour clearer.

4. We should consider deprecating max.message.bytes (server config) and 
message.max.bytes (topic config) in favour of configs that make it clear that 
we are talking about record batches instead of individual messages.

Part of the work in this JIRA is working out what should be done for 0.11.0.0 
and what can be done later.


> Think through implications of max.message.size affecting record batches in 
> message format V2
> 
>
> Key: KAFKA-5032
> URL: https://issues.apache.org/jira/browse/KAFKA-5032
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Priority: Critical
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> It's worth noting that the new behaviour for uncompressed messages is the 
> same as the existing behaviour for compressed messages.
> A few things to think about:
> 1. Do the producer settings max.request.size and batch.size still make sense 
> and do we need to update the documentation? My conclusion is that things are 
> still fine, but we may need to revise the docs.
> 2. (Seems like we don't need to do this) Consider changing default max 
> message set size to include record batch overhead. This is currently defined 
> as:
> {code}
> val MessageMaxBytes = 100 + MessageSet.LogOverhead
> {code}
> We should consider changing it to (I haven't thought it through though):
> {code}
> val MessageMaxBytes = 100 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
> {code}
> 3. When a record batch is too large, we throw RecordTooLargeException, which 
> is confusing because there's also a RecordBatchTooLargeException. We should 
> consider renaming these exceptions to make the behaviour clearer.
> 4. We should consider deprecating max.message.bytes (server config) and 
> message.max.bytes (topic config) in favour of configs that make it clear that 
> we are talking about record batches instead of individual messages.
> Part of the work in this JIRA is working out what should be done for 0.11.0.0 
> and what can be done later.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5364) Producer attempts to send transactional messages before adding partitions to transaction

2017-06-02 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5364:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3202
[https://github.com/apache/kafka/pull/3202]

> Producer attempts to send transactional messages before adding partitions to 
> transaction
> 
>
> Key: KAFKA-5364
> URL: https://issues.apache.org/jira/browse/KAFKA-5364
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Affects Versions: 0.11.0.0
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Due to a race condition between the sender thread and the producer.send(), 
> the following is possible: 
> # In KafkaProducer.doSend(), we add partitions to the transaction and then do 
> accumulator.append. 
> # In Sender.run(), we check whether there are transactional requests. If there 
> are, we send them and wait for the response. 
> # If there aren't, we drain the accumulator queue and send the produce 
> requests.
> # The problem is that the sequence of steps 2, 1, 3 is entirely possible. This 
> means that we won't send the 'AddPartitions' request yet still try to send the 
> produce data, which results in a fatal error and requires the producer to 
> close. 
> The solution is that in the accumulator.drain, we should check again if there 
> are pending add partitions requests, and if so, don't drain anything.
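
A conceptual sketch of that guard is below; the type and method names (TransactionState, hasPendingTransactionalRequests, drainBatches) are hypothetical stand-ins for the real RecordAccumulator/TransactionManager internals and only illustrate the ordering check:

{code}
import java.util.Collections;
import java.util.List;

// Hypothetical types and method names; this only illustrates the ordering guard,
// not the actual Kafka producer code.
final class DrainGuardSketch {

    interface TransactionState {
        boolean isTransactional();
        boolean hasPendingTransactionalRequests(); // e.g. an AddPartitions request not yet sent
    }

    interface Accumulator {
        List<Object> drainBatches();
    }

    static List<Object> drain(TransactionState txn, Accumulator accumulator) {
        // If a transactional request (such as AddPartitions) is still pending, hand
        // nothing to the sender yet; otherwise produce data could reach the broker
        // before the partition is added to the transaction, which is a fatal error.
        if (txn != null && txn.isTransactional() && txn.hasPendingTransactionalRequests()) {
            return Collections.emptyList();
        }
        return accumulator.drainBatches();
    }
}
{code}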



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3202: KAFKA-5364: Don't fail producer if drained partiti...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3202


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5364) Producer attempts to send transactional messages before adding partitions to transaction

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034304#comment-16034304
 ] 

ASF GitHub Bot commented on KAFKA-5364:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3202


> Producer attempts to send transactional messages before adding partitions to 
> transaction
> 
>
> Key: KAFKA-5364
> URL: https://issues.apache.org/jira/browse/KAFKA-5364
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Affects Versions: 0.11.0.0
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Due to a race condition between the sender thread and the producer.send(), 
> the following is possible: 
> # In KafkaProducer.doSend(), we add partitions to the transaction and then do 
> accumulator.append. 
> # In Sender.run(), we check whether there are transactional requests. If there 
> are, we send them and wait for the response. 
> # If there aren't, we drain the accumulator queue and send the produce 
> requests.
> # The problem is that the sequence of steps 2, 1, 3 is entirely possible. This 
> means that we won't send the 'AddPartitions' request yet still try to send the 
> produce data, which results in a fatal error and requires the producer to 
> close. 
> The solution is that in the accumulator.drain, we should check again if there 
> are pending add partitions requests, and if so, don't drain anything.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5311) Support ExtendedDeserializer in Kafka Streams

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5311:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3199
[https://github.com/apache/kafka/pull/3199]

> Support ExtendedDeserializer in Kafka Streams
> -
>
> Key: KAFKA-5311
> URL: https://issues.apache.org/jira/browse/KAFKA-5311
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Dale Peakall
>Assignee: Dale Peakall
> Fix For: 0.11.0.0
>
>
> KIP-82 introduced the concept of message headers and introduced an 
> ExtendedDeserializer interface that allowed a Deserializer to access those 
> message headers.
> Change Kafka Streams to support the use of ExtendedDeserializer to provide 
> compatibility with Kafka Clients that use the new header functionality.
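
For illustration, a minimal ExtendedDeserializer that consults a header before falling back to the plain payload; the "encoding" header name is just an example, and this is not part of the change itself:

{code}
import java.nio.charset.StandardCharsets;
import java.util.Map;

import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.ExtendedDeserializer;

// Example only: deserializes values as strings, honouring an (assumed) "encoding" header
// when present. Streams supporting ExtendedDeserializer means this headers-aware path
// can also be reached from a topology.
public class HeaderAwareStringDeserializer implements ExtendedDeserializer<String> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public String deserialize(String topic, byte[] data) {
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public String deserialize(String topic, Headers headers, byte[] data) {
        if (data == null) {
            return null;
        }
        Header encoding = headers == null ? null : headers.lastHeader("encoding");
        if (encoding != null
                && "utf-16".equalsIgnoreCase(new String(encoding.value(), StandardCharsets.UTF_8))) {
            return new String(data, StandardCharsets.UTF_16);
        }
        return deserialize(topic, data);
    }

    @Override
    public void close() { }
}
{code}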



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3199: KAFKA-5311: Support ExtendedDeserializer in Kafka ...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3199


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5311) Support ExtendedDeserializer in Kafka Streams

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034325#comment-16034325
 ] 

ASF GitHub Bot commented on KAFKA-5311:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3199


> Support ExtendedDeserializer in Kafka Streams
> -
>
> Key: KAFKA-5311
> URL: https://issues.apache.org/jira/browse/KAFKA-5311
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Dale Peakall
>Assignee: Dale Peakall
> Fix For: 0.11.0.0
>
>
> KIP-82 introduced the concept of message headers and introduced an 
> ExtendedDeserializer interface that allowed a Deserializer to access those 
> message headers.
> Change Kafka Streams to support the use of ExtendedDeserializer to provide 
> compatibility with Kafka Clients that use the new header functionality.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5367) Producer should not expire topic from metadata cache if accumulator still has data for this topic

2017-06-02 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5367:
---

 Summary: Producer should not expire topic from metadata cache if 
accumulator still has data for this topic
 Key: KAFKA-5367
 URL: https://issues.apache.org/jira/browse/KAFKA-5367
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin


To be added.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5368) Kafka Streams skipped-records-rate sensor produces nonzero values when the timestamps are valid

2017-06-02 Thread Hamidreza Afzali (JIRA)
Hamidreza Afzali created KAFKA-5368:
---

 Summary: Kafka Streams skipped-records-rate sensor produces 
nonzero values when the timestamps are valid
 Key: KAFKA-5368
 URL: https://issues.apache.org/jira/browse/KAFKA-5368
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Hamidreza Afzali
Assignee: Hamidreza Afzali


Kafka Streams skipped-records-rate sensor produces nonzero values even when the 
timestamps are valid and records are processed. The values are equal to 
poll-rate.

Related issue: KAFKA-5055 




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3203: KAFKA-5365: Fix regression in compressed message i...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3203


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5365) Fix regression in compressed message iteration affecting magic v0 and v1

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034331#comment-16034331
 ] 

ASF GitHub Bot commented on KAFKA-5365:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3203


> Fix regression in compressed message iteration affecting magic v0 and v1
> 
>
> Key: KAFKA-5365
> URL: https://issues.apache.org/jira/browse/KAFKA-5365
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.11.0.0
>
>
> We added a shortcut to break iteration over compressed message sets for v0 
> and v1 if the inner offset matches the last offset in the wrapper. 
> Unfortunately this breaks older clients which may use offset 0 in the wrapper 
> record in records sent in produce requests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-5365) Fix regression in compressed message iteration affecting magic v0 and v1

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5365.

Resolution: Fixed

> Fix regression in compressed message iteration affecting magic v0 and v1
> 
>
> Key: KAFKA-5365
> URL: https://issues.apache.org/jira/browse/KAFKA-5365
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.11.0.0
>
>
> We added a shortcut to break iteration over compressed message sets for v0 
> and v1 if the inner offset matches the last offset in the wrapper. 
> Unfortunately this breaks older clients which may use offset 0 in the wrapper 
> record in records sent in produce requests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-06-02 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034356#comment-16034356
 ] 

Lukas Gemela commented on KAFKA-5154:
-

[~guozhang] no worries, thank you for the fix, we will update ASAP once a new 
version gets released

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Assignee: Damian Guy
> Attachments: 5154_problem.log, clio_afa596e9b809.gz, clio_reduced.gz, 
> clio.txt.gz
>
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]

Re: [VOTE] KIP-164 Add unavailablePartitionCount and per-partition Unavailable metrics

2017-06-02 Thread Mickael Maison
+1 (non binding)
Thanks for the KIP

On Thu, Jun 1, 2017 at 5:44 PM, Dong Lin  wrote:
> Hi all,
>
> Can you please vote for KIP-164? The KIP can be found at
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-164-+Add+UnderMinIsrPartitionCount+and+per-partition+UnderMinIsr+metrics
> .
>
> Thanks,
> Dong


[jira] [Commented] (KAFKA-5368) Kafka Streams skipped-records-rate sensor produces nonzero values when the timestamps are valid

2017-06-02 Thread Hamidreza Afzali (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034367#comment-16034367
 ] 

Hamidreza Afzali commented on KAFKA-5368:
-

*Problem:*

The skipped-records sensor uses a {{Rate}} backed by a {{Count}} stat for the 
skipped-records-rate metric.

In 
{{org.apache.kafka.streams.processor.internals.StreamThread#addRecordsToTasks}} 
the Count value is incremented by one regardless of the number of skipped 
records, i.e. the value increments even if no record is skipped.


{code}
skippedRecordsSensor.add(metrics.metricName("skipped-records-rate", 
this.groupName, "The average per-second number of skipped records.", 
this.tags), new Rate(new Count()));

...

private void addRecordsToTasks(final ConsumerRecords<byte[], byte[]> records) {
if (records != null && !records.isEmpty()) {
...
streamsMetrics.skippedRecordsSensor.record(records.count() - 
numAddedRecords, timerStartedMs);
}
}
{code}

{{org.apache.kafka.streams.processor.internals.StreamThread#addRecordsToTasks}} 
is called in 
{{org.apache.kafka.streams.processor.internals.StreamThread#runLoop}} after 
each successful poll request.

{code}

private void runLoop() {
...
while (stillRunning()) {
...
final ConsumerRecords<byte[], byte[]> records = pollRequests(pollTimeMs);
if (records != null && !records.isEmpty() && !activeTasks.isEmpty()) {
streamsMetrics.pollTimeSensor.record(computeLatency(), 
timerStartedMs);
addRecordsToTasks(records);
...
}
...
}
...
}
{code}

This can explain why skipped-records-rate is equal to poll-rate.

*Solution:*

The sensor should keep a sum of all skipped records.
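
A sketch of that direction (not the committed fix; it assumes org.apache.kafka.common.metrics.stats.Sum is available and reuses the field names from the snippets above): register the metric with a value-summing stat and only record when something was actually skipped.

{code}
// Sketch only. Sum here is assumed to be org.apache.kafka.common.metrics.stats.Sum.
skippedRecordsSensor.add(metrics.metricName("skipped-records-rate",
    this.groupName, "The average per-second number of skipped records.",
    this.tags), new Rate(new Sum())); // sum recorded values instead of counting record() calls

...

private void addRecordsToTasks(final ConsumerRecords<byte[], byte[]> records) {
    if (records != null && !records.isEmpty()) {
        ...
        final int skipped = records.count() - numAddedRecords;
        if (skipped > 0) { // do not record anything on polls where nothing was skipped
            streamsMetrics.skippedRecordsSensor.record(skipped, timerStartedMs);
        }
    }
}
{code}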



> Kafka Streams skipped-records-rate sensor produces nonzero values when the 
> timestamps are valid
> ---
>
> Key: KAFKA-5368
> URL: https://issues.apache.org/jira/browse/KAFKA-5368
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Hamidreza Afzali
>Assignee: Hamidreza Afzali
>
> Kafka Streams skipped-records-rate sensor produces nonzero values even when 
> the timestamps are valid and records are processed. The values are equal to 
> poll-rate.
> Related issue: KAFKA-5055 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-164 Add unavailablePartitionCount and per-partition Unavailable metrics

2017-06-02 Thread Michal Borowiecki

+1 (non binding)

Thanks,
Michał

On 02/06/17 10:18, Mickael Maison wrote:

+1 (non binding)
Thanks for the KIP

On Thu, Jun 1, 2017 at 5:44 PM, Dong Lin  wrote:

Hi all,

Can you please vote for KIP-164? The KIP can be found at
https://cwiki.apache.org/confluence/display/KAFKA/KIP-164-+Add+UnderMinIsrPartitionCount+and+per-partition+UnderMinIsr+metrics
.

Thanks,
Dong


--
Michal Borowiecki
Senior Software Engineer L4
T: +44 208 742 1600 / +44 203 249 8448
E: michal.borowie...@openbet.com
W: www.openbet.com

OpenBet Ltd, Chiswick Park Building 9, 566 Chiswick High Rd, London, W4 5XT, UK







[jira] [Updated] (KAFKA-5282) Transactions integration test: Use factory methods to keep track of open producers and consumers and close them all on tearDown

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5282:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Transactions integration test: Use factory methods to keep track of open 
> producers and consumers and close them all on tearDown
> ---
>
> Key: KAFKA-5282
> URL: https://issues.apache.org/jira/browse/KAFKA-5282
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Apurva Mehta
>Assignee: Vahid Hashemian
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> See: https://github.com/apache/kafka/pull/3093/files#r117354588
> The current transactions integration test creates individual producers and 
> consumers per test, and closes them independently. 
> It would be more robust to create them through a central factory method that 
> keeps track of each instance, and then close those instances on `tearDown`.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3129: KAFKA-5282: Use a factory method to create produce...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3129


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5282) Transactions integration test: Use factory methods to keep track of open producers and consumers and close them all on tearDown

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034376#comment-16034376
 ] 

ASF GitHub Bot commented on KAFKA-5282:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3129


> Transactions integration test: Use factory methods to keep track of open 
> producers and consumers and close them all on tearDown
> ---
>
> Key: KAFKA-5282
> URL: https://issues.apache.org/jira/browse/KAFKA-5282
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Apurva Mehta
>Assignee: Vahid Hashemian
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> See: https://github.com/apache/kafka/pull/3093/files#r117354588
> The current transactions integration test creates individual producers and 
> consumers per test, and closes them independently. 
> It would be more robust to create them through a central factory method that 
> keeps track of each instance, and then close those instances on `tearDown`.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-02 Thread Damian Guy
I agree with what Matthias has said w.r.t failing fast. There are plenty of
times when you don't want to fail-fast and must attempt to  make progress.
The dead-letter queue is exactly for these circumstances. Of course if
every record is failing, then you probably do want to give up.

On Fri, 2 Jun 2017 at 07:56 Matthias J. Sax  wrote:

> First a meta comment. KIP discussion should take place on the dev list
> -- if user list is cc'ed please make sure to reply to both lists. Thanks.
>
> Thanks for making the scope of the KIP clear. Makes a lot of sense to
> focus on deserialization exceptions for now.
>
> With regard to corrupted state stores, would it make sense to fail a
> task and wipe out the store to repair it via recreation from the
> changelog? That's of course quite an advanced pattern, but I want to bring
> it up to design the first step in a way such that we can get there (if
> we think it's a reasonable idea).
>
> I also want to comment about fail fast vs making progress. I think that
> fail-fast need not always be the best option. The scenario I have in
> mind is like this: you've got a bunch of producers that feed the Streams
> input topic. Most producers work fine, but maybe one producer misbehaves
> and the data it writes is corrupted. You might not even be able
> to recover this lost data at any point -- thus, there is no reason to
> stop processing; you just skip over those records. Of course, you
> need to fix the root cause, and thus you need to alert (either via logs
> or the exception handler directly) and you need to start to investigate
> to find the bad producer, shut it down and fix it.
>
> Here the dead letter queue comes into play. From my understanding, the
> purpose of this feature is solely to enable debugging after the fact. I don't think
> those records would be fed back at any point in time (so I don't see any
> ordering issue -- a skipped record, in this regard, is just "fully
> processed"). Thus, the dead letter queue should actually encode the
> original record's metadata (topic, partition, offset, etc.) to enable such
> debugging. I guess this might also be possible if you just log the bad
> records, but it would be harder to access (you first must find the
> Streams instance that did write the log and extract the information from
> there). Reading it from a topic is much simpler.
>
> I also want to mention the following. Assume you have such a topic with
> some bad records and some good records. If we always fail-fast, it's
> going to be super hard to process the good data. You would need to write
> an extra app that copies the data into a new topic, filtering out the bad
> records (or apply the map() workaround within Streams). So I don't think
> that failing fast is necessarily the best option in production.
>
> Or do you think there are scenarios for which you can recover the
> corrupted records successfully? And even if this is possible, it might
> be a case for reprocessing instead of failing the whole application?
> Also, if you think you can "repair" a corrupted record, should the
> handler be allowed to return a "fixed" record? This would solve the ordering
> problem.
>
>
>
> -Matthias
>
>
>
>
> On 5/30/17 1:47 AM, Michael Noll wrote:
> > Thanks for your work on this KIP, Eno -- much appreciated!
> >
> > - I think it would help to improve the KIP by adding an end-to-end code
> > example that demonstrates, with the DSL and with the Processor API, how
> the
> > user would write a simple application that would then be augmented with
> the
> > proposed KIP changes to handle exceptions.  It should also become much
> > clearer then that e.g. the KIP would lead to different code paths for the
> > happy case and any failure scenarios.
> >
> > - Do we have sufficient information available to make informed decisions
> on
> > what to do next?  For example, do we know in which part of the topology
> the
> > record failed? `ConsumerRecord` gives us access to topic, partition,
> > offset, timestamp, etc., but what about topology-related information
> (e.g.
> > what is the associated state store, if any)?
> >
> > - Only partly on-topic for the scope of this KIP, but this is about the
> > bigger picture: This KIP would give users the option to send corrupted
> > records to dead letter queue (quarantine topic).  But, what pattern would
> > we advocate to process such a dead letter queue then, e.g. how to allow
> for
> > retries with backoff ("If the first record in the dead letter queue fails
> > again, then try the second record for the time being and go back to the
> > first record at a later time").  Jay and Jan already alluded to ordering
> > problems that will be caused by dead letter queues. As I said, retries
> > might be out of scope but perhaps the implications should be considered
> if
> > possible?
> >
> > Also, I wrote the text below before reaching the point in the
> conversation
> > that this KIP's scope will be limited to exceptions in the category of
> > poiso
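
A minimal sketch of the dead-letter-queue pattern discussed in this thread, assuming 
the corrupted input is available as a raw ConsumerRecord<byte[], byte[]>; the class, 
method and topic names are hypothetical and this is not part of the KIP-161 proposal 
itself:

{code}
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class DeadLetterQueueWriter {
    private final KafkaProducer<String, byte[]> producer;
    private final String dlqTopic;

    public DeadLetterQueueWriter(Properties producerProps, String dlqTopic) {
        this.producer = new KafkaProducer<>(producerProps,
                new StringSerializer(), new ByteArraySerializer());
        this.dlqTopic = dlqTopic;
    }

    // Forward the raw bytes of a record that failed deserialization, keyed by its
    // origin (topic/partition/offset) so it can be located again while debugging.
    public void quarantine(ConsumerRecord<byte[], byte[]> corrupted) {
        String origin = corrupted.topic() + "-" + corrupted.partition() + "@" + corrupted.offset();
        producer.send(new ProducerRecord<>(dlqTopic, origin, corrupted.value()));
    }

    public void close() {
        producer.close();
    }
}
{code}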

Build failed in Jenkins: kafka-trunk-jdk8 #1639

2017-06-02 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-5364; Don't fail producer if drained partition is not yet in

[ismael] KAFKA-5311: Support ExtendedDeserializer in Kafka Streams

[ismael] KAFKA-5365; Fix regression in compressed message iteration affecting

--
[...truncated 2.48 MB...]
org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedCustomValidation STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedCustomValidation PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedBasicValidation STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedBasicValidation PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnector STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testDestroyConnector STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testDestroyConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorAlreadyExists STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorAlreadyExists PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTask STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTask PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testAccessors STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testAccessors PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorNameConflictsWithWorkerGroupId STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorNameConflictsWithWorkerGroupId PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToLeader STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToLeader PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToOwner STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToOwner PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartUnknownTask STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartUnknownTask PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRequestProcessingOrder STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRequestProcessingOrder PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToLeader STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToLeader PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToOwner STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToOwner PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigAdded STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigAdded PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigUpdate STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigUpdate PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorPaused STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorPaused PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorResumed STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorResumed PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testUnknownConnectorPaused STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testUnknownConnectorPaused PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorPausedRunningTaskOnly STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorPausedRunningTaskOnly PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorResumedRunningTaskOnly STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorResumedRunningTaskOnly PASS

[GitHub] kafka pull request #3205: KAFKA-5236; Increase the block/buffer size when co...

2017-06-02 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3205

KAFKA-5236; Increase the block/buffer size when compressing with Snappy and 
Gzip

We had originally increased Snappy’s block size as part of KAFKA-3704. 
However,
we had some issues with excessive memory usage in the producer and we 
reverted
it in 7c6ee8d5e.

After more investigation, we fixed the underlying reason why memory usage 
seemed
to grow much more than expected in KAFKA-3747 (included in 0.10.0.1).

In 0.10.2, we changed the broker to use the same classes as the producer 
and the
broker’s block size for Snappy was changed from 32 KB to 1KB. As reported 
in
KAFKA-5236, the on disk size is, in some cases, 50% larger when the data is 
compressed
with 1 KB instead of 32 KB as the block size.

As discussed in KAFKA-3704, it may be worth making this configurable and/or 
allocate
the compression buffers from the producer pool. However, for 0.11.0.0, I 
think the
simplest thing to do is to default to 32 KB for Snappy (the default if no 
block size
is provided).

I also increased the Gzip buffer size. 1 KB is too small and the default is 
smaller
still (512 bytes). 8 KB (which is the default buffer size for 
BufferedOutputStream)
seemed like a reasonable default.
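
As a small, self-contained illustration of the effect described above, the sketch 
below compresses the same repetitive payload with snappy-java's SnappyOutputStream 
using a 1 KB block and a 32 KB block; larger blocks let Snappy exploit redundancy 
across a wider window, so the output is smaller. The class name and payload are 
made up for illustration and this is not Kafka's own record-writing path:

{code}
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

import org.xerial.snappy.SnappyOutputStream;

public class SnappyBlockSizeSketch {
    static int compressedSize(byte[] payload, int blockSize) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The second constructor argument is the block size used for compression.
        try (SnappyOutputStream out = new SnappyOutputStream(sink, blockSize)) {
            out.write(payload);
        }
        return sink.size();
    }

    public static void main(String[] args) throws Exception {
        // Build roughly 1 MB of repetitive, log-like records.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20000; i++) {
            sb.append("2017-06-02T10:17:23Z INFO request-").append(i % 100).append(" served in 3ms\n");
        }
        byte[] payload = sb.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("1 KB blocks : " + compressedSize(payload, 1024) + " bytes");
        System.out.println("32 KB blocks: " + compressedSize(payload, 32 * 1024) + " bytes");
    }
}
{code}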

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-5236-snappy-block-size

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3205.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3205


commit ef4af6757575e694c109074b67e59704ff437b56
Author: Ismael Juma 
Date:   2017-06-02T10:17:23Z

KAFKA-5236; Increase the block/buffer size when compressing with Snappy and 
Gzip

We had originally increased Snappy’s block size as part of KAFKA-3704. 
However,
we had some issues with excessive memory usage in the producer and we 
reverted
it in 7c6ee8d5e.

After more investigation, we fixed the underlying reason why memory usage 
seemed
to grow much more than expected in KAFKA-3747 (included in 0.10.0.1).

In 0.10.2, we changed the broker to use the same classes as the producer 
and the
broker’s block size for Snappy was changed from 32 KB to 1KB. As reported 
in
KAFKA-5236, the on disk size is, in some cases, 50% larger when the data is 
compressed
with 1 KB instead of 32 KB as the block size.

As discussed in KAFKA-3704, it may be worth making this configurable and/or 
allocate
the compression buffers from the producer pool. However, for 0.11.0.0, I 
think the
simplest thing to do is to default to 32 KB for Snappy (the default if no 
block size
is provided).

I also increased the Gzip buffer size. 1 KB is too small and the default is 
smaller
still (512 bytes). 8 KB (which is the default buffer size for 
BufferedOutputStream)
seemed like a reasonable default.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5236) Regression in on-disk log size when using Snappy compression with 0.8.2 log message format

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034453#comment-16034453
 ] 

ASF GitHub Bot commented on KAFKA-5236:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3205

KAFKA-5236; Increase the block/buffer size when compressing with Snappy and 
Gzip

We had originally increased Snappy’s block size as part of KAFKA-3704. 
However,
we had some issues with excessive memory usage in the producer and we 
reverted
it in 7c6ee8d5e.

After more investigation, we fixed the underlying reason why memory usage 
seemed
to grow much more than expected in KAFKA-3747 (included in 0.10.0.1).

In 0.10.2, we changed the broker to use the same classes as the producer 
and the
broker’s block size for Snappy was changed from 32 KB to 1KB. As reported in
KAFKA-5236, the on disk size is, in some cases, 50% larger when the data is 
compressed
with 1 KB instead of 32 KB as the block size.

As discussed in KAFKA-3704, it may be worth making this configurable and/or 
allocate
the compression buffers from the producer pool. However, for 0.11.0.0, I 
think the
simplest thing to do is to default to 32 KB for Snappy (the default if no 
block size
is provided).

I also increased the Gzip buffer size. 1 KB is too small and the default is 
smaller
still (512 bytes). 8 KB (which is the default buffer size for 
BufferedOutputStream)
seemed like a reasonable default.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-5236-snappy-block-size

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3205.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3205


commit ef4af6757575e694c109074b67e59704ff437b56
Author: Ismael Juma 
Date:   2017-06-02T10:17:23Z

KAFKA-5236; Increase the block/buffer size when compressing with Snappy and 
Gzip

We had originally increased Snappy’s block size as part of KAFKA-3704. 
However,
we had some issues with excessive memory usage in the producer and we 
reverted
it in 7c6ee8d5e.

After more investigation, we fixed the underlying reason why memory usage 
seemed
to grow much more than expected in KAFKA-3747 (included in 0.10.0.1).

In 0.10.2, we changed the broker to use the same classes as the producer 
and the
broker’s block size for Snappy was changed from 32 KB to 1KB. As reported in
KAFKA-5236, the on disk size is, in some cases, 50% larger when the data is 
compressed
with 1 KB instead of 32 KB as the block size.

As discussed in KAFKA-3704, it may be worth making this configurable and/or 
allocate
the compression buffers from the producer pool. However, for 0.11.0.0, I 
think the
simplest thing to do is to default to 32 KB for Snappy (the default if no 
block size
is provided).

I also increased the Gzip buffer size. 1 KB is too small and the default is 
smaller
still (512 bytes). 8 KB (which is the default buffer size for 
BufferedOutputStream)
seemed like a reasonable default.




> Regression in on-disk log size when using Snappy compression with 0.8.2 log 
> message format
> --
>
> Key: KAFKA-5236
> URL: https://issues.apache.org/jira/browse/KAFKA-5236
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Nick Travers
>  Labels: regression
> Fix For: 0.11.0.0
>
>
> We recently upgraded our brokers in our production environments from 0.10.1.1 
> to 0.10.2.1 and we've noticed a sizable regression in the on-disk .log file 
> size. For some deployments the increase was as much as 50%.
> We run our brokers with the 0.8.2 log message format version. The majority of 
> our message volume comes from 0.10.x Java clients sending messages encoded 
> with the Snappy codec.
> Some initial testing only shows a regression between the two versions when 
> using Snappy compression with a log message format of 0.8.2.
> I also tested 0.10.x log message formats as well as Gzip compression. The log 
> sizes do not differ in this case, so the issue seems confined to 0.8.2 
> message format and Snappy compression.
> A git-bisect led me to this commit, which modified the server-side 
> implementation of `Record`:
> https://github.com/apache/kafka/commit/67f1e5b91bf073151ff57d5d656693e385726697
> Here's the PR, which has more context:
> https://github.com/apache/kafka/pull/2140
> Here is a link to the test I used to re-producer th

[jira] [Updated] (KAFKA-5236) Regression in on-disk log size when using Snappy compression with 0.8.2 log message format

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5236:
---
Assignee: Ismael Juma
  Status: Patch Available  (was: Open)

> Regression in on-disk log size when using Snappy compression with 0.8.2 log 
> message format
> --
>
> Key: KAFKA-5236
> URL: https://issues.apache.org/jira/browse/KAFKA-5236
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Nick Travers
>Assignee: Ismael Juma
>  Labels: regression
> Fix For: 0.11.0.0
>
>
> We recently upgraded our brokers in our production environments from 0.10.1.1 
> to 0.10.2.1 and we've noticed a sizable regression in the on-disk .log file 
> size. For some deployments the increase was as much as 50%.
> We run our brokers with the 0.8.2 log message format version. The majority of 
> our message volume comes from 0.10.x Java clients sending messages encoded 
> with the Snappy codec.
> Some initial testing only shows a regression between the two versions when 
> using Snappy compression with a log message format of 0.8.2.
> I also tested 0.10.x log message formats as well as Gzip compression. The log 
> sizes do not differ in this case, so the issue seems confined to 0.8.2 
> message format and Snappy compression.
> A git-bisect led me to this commit, which modified the server-side 
> implementation of `Record`:
> https://github.com/apache/kafka/commit/67f1e5b91bf073151ff57d5d656693e385726697
> Here's the PR, which has more context:
> https://github.com/apache/kafka/pull/2140
> Here is a link to the test I used to reproduce this issue:
> https://github.com/nicktrav/kafka/commit/68e8db4fa525e173651ac740edb270b0d90b8818
> cc: [~hachikuji] [~junrao] [~ijuma] [~guozhang] (tagged on original PR)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5368) Kafka Streams skipped-records-rate sensor produces nonzero values when the timestamps are valid

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034461#comment-16034461
 ] 

ASF GitHub Bot commented on KAFKA-5368:
---

GitHub user hrafzali opened a pull request:

https://github.com/apache/kafka/pull/3206

KAFKA-5368 Kafka Streams skipped-records-rate sensor bug

This resolves the issue with the Kafka Streams skipped-records sensor reporting 
wrong values.

Jira ticket: https://issues.apache.org/jira/browse/KAFKA-5368

The contribution is my original work and I license the work to the project 
under the project's open source license.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hrafzali/kafka 
KAFKA-5368_skipped-records-sensor-bug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3206.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3206


commit db4e424c55d63b9172de563fb559002f435b7f01
Author: Hamidreza Afzali 
Date:   2017-06-02T09:43:32Z

KAFKA-5368 Kafka Streams skipped-records-rate sensor bug




> Kafka Streams skipped-records-rate sensor produces nonzero values when the 
> timestamps are valid
> ---
>
> Key: KAFKA-5368
> URL: https://issues.apache.org/jira/browse/KAFKA-5368
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Hamidreza Afzali
>Assignee: Hamidreza Afzali
>
> Kafka Streams skipped-records-rate sensor produces nonzero values even when 
> the timestamps are valid and records are processed. The values are equal to 
> poll-rate.
> Related issue: KAFKA-5055 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3206: KAFKA-5368 Kafka Streams skipped-records-rate sens...

2017-06-02 Thread hrafzali
GitHub user hrafzali opened a pull request:

https://github.com/apache/kafka/pull/3206

KAFKA-5368 Kafka Streams skipped-records-rate sensor bug

This resolves the issue with the Kafka Streams skipped-records sensor reporting 
wrong values.

Jira ticket: https://issues.apache.org/jira/browse/KAFKA-5368

The contribution is my original work and I license the work to the project 
under the project's open source license.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hrafzali/kafka 
KAFKA-5368_skipped-records-sensor-bug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3206.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3206


commit db4e424c55d63b9172de563fb559002f435b7f01
Author: Hamidreza Afzali 
Date:   2017-06-02T09:43:32Z

KAFKA-5368 Kafka Streams skipped-records-rate sensor bug




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5304) Kafka Producer throwing infinite NullPointerExceptions

2017-06-02 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034486#comment-16034486
 ] 

Rajini Sivaram commented on KAFKA-5304:
---

[~prchaudh] What is the security protocol of your producer?

> Kafka Producer throwing infinite NullPointerExceptions
> --
>
> Key: KAFKA-5304
> URL: https://issues.apache.org/jira/browse/KAFKA-5304
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.9.0.1
> Environment: RedHat Enterprise Linux 6.8
>Reporter: Pranay Kumar Chaudhary
>
> 2017-05-22 11:38:56,918 LL="ERROR" TR="kafka-producer-network-thread | 
> application-name.hostname.com" LN="o.a.k.c.p.i.Sender"  Uncaught error in 
> kafka producer I/O thread:
> java.lang.NullPointerException: null
> Continuously getting this error in logs which is filling up the disk space. 
> Not able to get a stack trace to pinpoint the source of the error.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5325) Connection Lose during Kafka Kerberos Renewal process

2017-06-02 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-5325:
-

Assignee: Rajini Sivaram

> Connection Lose during Kafka Kerberos Renewal process
> -
>
> Key: KAFKA-5325
> URL: https://issues.apache.org/jira/browse/KAFKA-5325
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.9.0.0
>Reporter: MuthuKumar
>Assignee: Rajini Sivaram
>
> During Kerberos ticket renewal, all requests reaching the server in the 
> interim between the renewal's logout and re-login fail with the 
> error mentioned below.
> kafka-clients-0.9.0.0.jar is being used on the producer end. The reason for 
> using Kafka 0.9.0.0 on the producer end is that the server is running 0.10.0.x. 
> OS: Oracle Linux Server release 6.7
> Kerberos Configuration - Producer end
> -
> KafkaClient {
> com.sun.security.auth.module.Krb5LoginModule required
> refreshKrb5Config=true
> principal="u...@.com"
> useKeyTab=true
> serviceName="kafka"
> keyTab="x.keytab"
> client=true;
> };
> Application Log
> ---
> 2017-05-25 02:20:37,515 INF [Login.java:354] Initiating logout for 
> u...@.com
> 2017-05-25 02:20:37,515 INF [Login.java:365] Initiating re-login for 
> u...@.com
> 2017-05-25 02:20:37,525 INF [SaslChannelBuilder.java:91] Failed to create 
> channel due to
> org.apache.kafka.common.KafkaException: Failed to configure 
> SaslClientAuthenticator
> at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:94)
> at 
> org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:88)
> at org.apache.kafka.common.network.Selector.connect(Selector.java:162)
> at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:514)
> at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:169)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:180)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.NoSuchElementException: null
> at java.util.LinkedList$ListItr.next(LinkedList.java:890)
> at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1056)
> at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:90)
> ... 7 common frames omitted
> 2017-05-25 02:20:37,526 ERR [Sender.java:130] Uncaught error in kafka 
> producer I/O thread:
> org.apache.kafka.common.KafkaException: 
> org.apache.kafka.common.KafkaException: Failed to configure 
> SaslClientAuthenticator
> at 
> org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:92)
> at org.apache.kafka.common.network.Selector.connect(Selector.java:162)
> at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:514)
> at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:169)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:180)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kafka.common.KafkaException: Failed to configure 
> SaslClientAuthenticator
> at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:94)
> at 
> org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:88)
> ... 6 common frames omitted
> Caused by: java.util.NoSuchElementException: null
> at java.util.LinkedList$ListItr.next(LinkedList.java:890)
> at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1056)
> at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:90)
> ... 7 common frames omitted
> 2017-05-25 02:20:37,536 ERR [Sender.java:130] Uncaught error in kafka 
> producer I/O thread:
> java.lang.NullPointerException: null
> 2017-05-25 02:20:37,536 ERR [Sender.java:130] Uncaught error in kafka 
> producer I/O thread:
> java.lang.NullPointerException: null
> 2017-05-25 02:20:37,536 ERR [Sender.java:130] Uncaught error in kafka 
> producer I/O thread:
> java.lang.NullPointerException: null



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
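
For reference, a minimal sketch of a producer configuration that pairs with a 
KafkaClient JAAS section like the one quoted above; the broker address, topic and 
JAAS file path are placeholders, and the property names assume the 0.9+ Java client:

{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SaslProducerSketch {
    public static void main(String[] args) {
        // The JAAS file containing the KafkaClient section is passed via a system property.
        System.setProperty("java.security.auth.login.config", "/etc/kafka/kafka_client_jaas.conf");

        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.kerberos.service.name", "kafka");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        try {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
        } finally {
            producer.close();
        }
    }
}
{code}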


Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-02 Thread Damian Guy
Jan, you have a choice to Fail fast if you want. This is about giving
people options and there are times when you don't want to fail fast.


On Fri, 2 Jun 2017 at 11:00 Jan Filipiak  wrote:

> Hi
>
> 1.
> That greatly complicates monitoring. Fail-fast gives you this: when you
> monitor only the lag of all your apps,
> you are completely covered. With that sort of new feature, application monitoring
> is very much more complicated, as
> you now need to monitor the fail % of some special apps as well. In my
> opinion that is a huge downside already.
>
> 2.
> Using a schema registry like the Avro stuff, it might not even be the record
> that is broken, it might just be your app
> being unable to fetch a schema it needs right now. Maybe you got partitioned
> away from that registry.
>
> 3. When you get alerted because of a too-high fail percentage, what are the
> steps you are going to take?
> Shut it down to buy time. Fix the problem. Spend way too much time
> finding a good reprocess offset.
> Your time windows are in bad shape anyway, and you have pretty much lost.
> This routine is nonsense.
>
> Dead letter queues would be the worst possible addition to the Kafka
> toolkit that I can think of. It just doesn't fit the architecture,
> where having clients fall behind is a valid option.
>
> Further, as I mentioned already, the only bad pill I've seen so far is CRC
> errors. Any plans for those?
>
> Best Jan
>
>
>
>
>
>
> On 02.06.2017 11:34, Damian Guy wrote:
> > I agree with what Matthias has said w.r.t failing fast. There are plenty
> of
> > times when you don't want to fail-fast and must attempt to  make
> progress.
> > The dead-letter queue is exactly for these circumstances. Of course if
> > every record is failing, then you probably do want to give up.
> >
> > On Fri, 2 Jun 2017 at 07:56 Matthias J. Sax 
> wrote:
> >
> >> First a meta comment. KIP discussion should take place on the dev list
> >> -- if user list is cc'ed please make sure to reply to both lists.
> Thanks.
> >>
> >> Thanks for making the scope of the KIP clear. Makes a lot of sense to
> >> focus on deserialization exceptions for now.
> >>
> >> With regard to corrupted state stores, would it make sense to fail a
> >> task and wipe out the store to repair it via recreation from the
> >> changelog? That's of course quite an advanced pattern, but I want to bring
> >> it up to design the first step in a way such that we can get there (if
> >> we think it's a reasonable idea).
> >>
> >> I also want to comment about fail fast vs making progress. I think that
> >> fail-fast need not always be the best option. The scenario I have in
> >> mind is like this: you've got a bunch of producers that feed the Streams
> >> input topic. Most producers work fine, but maybe one producer misbehaves
> >> and the data it writes is corrupted. You might not even be able
> >> to recover this lost data at any point -- thus, there is no reason to
> >> stop processing; you just skip over those records. Of course, you
> >> need to fix the root cause, and thus you need to alert (either via logs
> >> or the exception handler directly) and you need to start to investigate
> >> to find the bad producer, shut it down and fix it.
> >>
> >> Here the dead letter queue comes into play. From my understanding, the
> >> purpose of this feature is solely to enable debugging after the fact. I don't think
> >> those records would be fed back at any point in time (so I don't see any
> >> ordering issue -- a skipped record, in this regard, is just "fully
> >> processed"). Thus, the dead letter queue should actually encode the
> >> original record's metadata (topic, partition, offset, etc.) to enable such
> >> debugging. I guess this might also be possible if you just log the bad
> >> records, but it would be harder to access (you first must find the
> >> Streams instance that did write the log and extract the information from
> >> there). Reading it from a topic is much simpler.
> >>
> >> I also want to mention the following. Assume you have such a topic with
> >> some bad records and some good records. If we always fail-fast, it's
> >> going to be super hard to process the good data. You would need to write
> >> an extra app that copies the data into a new topic, filtering out the bad
> >> records (or apply the map() workaround within Streams). So I don't think
> >> that failing fast is necessarily the best option in production.
> >>
> >> Or do you think there are scenarios for which you can recover the
> >> corrupted records successfully? And even if this is possible, it might
> >> be a case for reprocessing instead of failing the whole application?
> >> Also, if you think you can "repair" a corrupted record, should the
> >> handler be allowed to return a "fixed" record? This would solve the ordering
> >> problem.
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >>
> >>
> >> On 5/30/17 1:47 AM, Michael Noll wrote:
> >>> Thanks for your work on this KIP, Eno -- much appreciated!
> >>>
> >>> - I think it would help to improve the KIP by 

Build failed in Jenkins: kafka-trunk-jdk7 #2324

2017-06-02 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5282; Use a factory method to create producers/consumers and 
close

--
[...truncated 908.86 KB...]
kafka.admin.ConfigCommandTest > 
shouldNotUpdateBrokerConfigIfMalformedBracketConfig PASSED

kafka.admin.ConfigCommandTest > shouldFailIfUnrecognisedEntityType STARTED

kafka.admin.ConfigCommandTest > shouldFailIfUnrecognisedEntityType PASSED

kafka.admin.ConfigCommandTest > 
shouldNotUpdateBrokerConfigIfNonExistingConfigIsDeleted STARTED

kafka.admin.ConfigCommandTest > 
shouldNotUpdateBrokerConfigIfNonExistingConfigIsDeleted PASSED

kafka.admin.ConfigCommandTest > 
shouldNotUpdateBrokerConfigIfMalformedEntityName STARTED

kafka.admin.ConfigCommandTest > 
shouldNotUpdateBrokerConfigIfMalformedEntityName PASSED

kafka.admin.ConfigCommandTest > shouldSupportCommaSeparatedValues STARTED

kafka.admin.ConfigCommandTest > shouldSupportCommaSeparatedValues PASSED

kafka.admin.ConfigCommandTest > shouldNotUpdateBrokerConfigIfMalformedConfig 
STARTED

kafka.admin.ConfigCommandTest > shouldNotUpdateBrokerConfigIfMalformedConfig 
PASSED

kafka.admin.ConfigCommandTest > shouldParseArgumentsForBrokersEntityType STARTED

kafka.admin.ConfigCommandTest > shouldParseArgumentsForBrokersEntityType PASSED

kafka.admin.ConfigCommandTest > shouldAddBrokerConfig STARTED

kafka.admin.ConfigCommandTest > shouldAddBrokerConfig PASSED

kafka.admin.ConfigCommandTest > testQuotaDescribeEntities STARTED

kafka.admin.ConfigCommandTest > testQuotaDescribeEntities PASSED

kafka.admin.ConfigCommandTest > shouldParseArgumentsForClientsEntityType STARTED

kafka.admin.ConfigCommandTest > shouldParseArgumentsForClientsEntityType PASSED

kafka.admin.AddPartitionsTest > testReplicaPlacementAllServers STARTED

kafka.admin.AddPartitionsTest > testReplicaPlacementAllServers PASSED

kafka.admin.AddPartitionsTest > testWrongReplicaCount STARTED

kafka.admin.AddPartitionsTest > testWrongReplicaCount PASSED

kafka.admin.AddPartitionsTest > testReplicaPlacementPartialServers STARTED

kafka.admin.AddPartitionsTest > testReplicaPlacementPartialServers PASSED

kafka.admin.AddPartitionsTest > testTopicDoesNotExist STARTED

kafka.admin.AddPartitionsTest > testTopicDoesNotExist PASSED

kafka.admin.AddPartitionsTest > testIncrementPartitions STARTED

kafka.admin.AddPartitionsTest > testIncrementPartitions PASSED

kafka.admin.AddPartitionsTest > testManualAssignmentOfReplicas STARTED

kafka.admin.AddPartitionsTest > testManualAssignmentOfReplicas PASSED

kafka.admin.ReassignPartitionsIntegrationTest > testRackAwareReassign STARTED

kafka.admin.ReassignPartitionsIntegrationTest > testRackAwareReassign PASSED

kafka.admin.AdminTest > testBasicPreferredReplicaElection STARTED

kafka.admin.AdminTest > testBasicPreferredReplicaElection PASSED

kafka.admin.AdminTest > testPreferredReplicaJsonData STARTED

kafka.admin.AdminTest > testPreferredReplicaJsonData PASSED

kafka.admin.AdminTest > testReassigningNonExistingPartition STARTED

kafka.admin.AdminTest > testReassigningNonExistingPartition PASSED

kafka.admin.AdminTest > testGetBrokerMetadatas STARTED

kafka.admin.AdminTest > testGetBrokerMetadatas PASSED

kafka.admin.AdminTest > testBootstrapClientIdConfig STARTED

kafka.admin.AdminTest > testBootstrapClientIdConfig PASSED

kafka.admin.AdminTest > testPartitionReassignmentNonOverlappingReplicas STARTED

kafka.admin.AdminTest > testPartitionReassignmentNonOverlappingReplicas PASSED

kafka.admin.AdminTest > testReplicaAssignment STARTED

kafka.admin.AdminTest > testReplicaAssignment PASSED

kafka.admin.AdminTest > testPartitionReassignmentWithLeaderNotInNewReplicas 
STARTED

kafka.admin.AdminTest > testPartitionReassignmentWithLeaderNotInNewReplicas 
PASSED

kafka.admin.AdminTest > testTopicConfigChange STARTED

kafka.admin.AdminTest > testTopicConfigChange PASSED

kafka.admin.AdminTest > testResumePartitionReassignmentThatWasCompleted STARTED

kafka.admin.AdminTest > testResumePartitionReassignmentThatWasCompleted PASSED

kafka.admin.AdminTest > testManualReplicaAssignment STARTED

kafka.admin.AdminTest > testManualReplicaAssignment PASSED

kafka.admin.AdminTest > testConcurrentTopicCreation STARTED

kafka.admin.AdminTest > testConcurrentTopicCreation PASSED

kafka.admin.AdminTest > testPartitionReassignmentWithLeaderInNewReplicas STARTED

kafka.admin.AdminTest > testPartitionReassignmentWithLeaderInNewReplicas PASSED

kafka.admin.AdminTest > shouldPropagateDynamicBrokerConfigs STARTED

kafka.admin.AdminTest > shouldPropagateDynamicBrokerConfigs PASSED

kafka.admin.AdminTest > testShutdownBroker STARTED

kafka.admin.AdminTest > testShutdownBroker PASSED

kafka.admin.AdminTest > testTopicCreationWithCollision STARTED

kafka.admin.AdminTest > testTopicCreationWithCollision PASSED

kafka.admin.AdminTest > testTopicCreationInZK STARTED

kafka.admin.AdminTest > testTopicCreationInZK PASSED

kafka.admin.DeleteTopicTest 

[GitHub] kafka pull request #2328: KAFKA-3264: Deprecate the old Scala consumer (KIP-...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2328


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-3264) Mark the old Scala consumer and related classes as deprecated

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3264:
---
   Resolution: Fixed
Fix Version/s: 0.11.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2328
[https://github.com/apache/kafka/pull/2328]

> Mark the old Scala consumer and related classes as deprecated
> -
>
> Key: KAFKA-3264
> URL: https://issues.apache.org/jira/browse/KAFKA-3264
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Vahid Hashemian
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> Once the new consumer is out of beta, we should consider deprecating the old 
> Scala consumers to encourage use of the new consumer and facilitate the 
> removal of the old consumers.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3264) Mark the old Scala consumer and related classes as deprecated

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034536#comment-16034536
 ] 

ASF GitHub Bot commented on KAFKA-3264:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2328


> Mark the old Scala consumer and related classes as deprecated
> -
>
> Key: KAFKA-3264
> URL: https://issues.apache.org/jira/browse/KAFKA-3264
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Vahid Hashemian
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> Once the new consumer is out of beta, we should consider deprecating the old 
> Scala consumers to encourage use of the new consumer and facilitate the 
> removal of the old consumers.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3264) Mark the old Scala consumer and related classes as deprecated

2017-06-02 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3264:
---
Fix Version/s: (was: 0.11.1.0)

> Mark the old Scala consumer and related classes as deprecated
> -
>
> Key: KAFKA-3264
> URL: https://issues.apache.org/jira/browse/KAFKA-3264
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Vahid Hashemian
> Fix For: 0.11.0.0
>
>
> Once the new consumer is out of beta, we should consider deprecating the old 
> Scala consumers to encourage use of the new consumer and facilitate the 
> removal of the old consumers.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3207: MINOR: Fix flaky testHWCheckpointWithFailuresSingl...

2017-06-02 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3207

MINOR: Fix flaky testHWCheckpointWithFailuresSingleLogSegment

It was incorrectly passing `oldLeaderOpt` to 
`waitUntilLeaderIsElectedOrChanged`
as can be seen by the subsequent assertion. This started failing recently
because we fixed `waitUntilLeaderIsElectedOrChanged` to handle
this case correctly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
fix-flaky-testHWCheckpointWithFailuresSingleLogSegment

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3207.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3207


commit d8ad33168da7d780d6905669edafb959ad461a3c
Author: Ismael Juma 
Date:   2017-06-02T11:38:24Z

MINOR: Fix flaky testHWCheckpointWithFailuresSingleLogSegment

It was incorrectly passing `oldLeaderOpt` to 
`waitUntilLeaderIsElectedOrChanged`
as can be seen by the subsequent assertion. This started failing recently
because we fixed `waitUntilLeaderIsElectedOrChanged` to handle
this case correctly.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5123) Refactor ZkUtils readData* methods

2017-06-02 Thread Balint Molnar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034575#comment-16034575
 ] 

Balint Molnar commented on KAFKA-5123:
--

How to refactor zkUtils.readData method:
* move all zkException inside ZkUtils
** Class ConsumerGroupCommand:
*** 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala#L342
*** It is easy because we can match on None and then call printError(…)
** Class ConsumerOffsetChecker:
*** 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L172
*** Maybe we can throw a different exception here / introduce a new one which is 
not related to ZK.
** Class ZkUtils:
*** 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L496
*** We can handle this with None
* This method should only return the data without the stat; introduce a new 
method named readDataAndStat which returns the data and the stat too. With 
that one we don't need to call the annoying ._1 every time. 

How to refactor zkUtils.readDataMaybeNull method:
* Do not return Some(null); convert Some(null) into None by calling Option() 
instead of Some(). Also rename this method to readData. I do not see why we 
need a separate method for these two things.
** This method is a little bit tricky because of the Some(): no caller handles the 
Some(null)
** For example:
*** Class ConsumerGroupCommand:
 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala#L272
 we are calling substring on a null string
*** Class ZookeeperConsumerConnector:
 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala#L422
 we are calling toLong on a null string
* This method should only return the data without the stat; introduce a new 
method named readDataAndStat. 

How to refactor zkUtils.readDataAndVersionMaybeNull method:
* I think we can remove the MaybeNull from its name; otherwise this method 
looks OK to me.

[~ijuma] what do you think? If you agree, I will start to implement this.

> Refactor ZkUtils readData* methods 
> ---
>
> Key: KAFKA-5123
> URL: https://issues.apache.org/jira/browse/KAFKA-5123
> Project: Kafka
>  Issue Type: Bug
>Reporter: Balint Molnar
>Assignee: Balint Molnar
>Priority: Minor
>
> Usually only the data value is required but every readData method in the 
> ZkUtils returns a Tuple with the data and the stat.
> https://github.com/apache/kafka/pull/2888



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to normal : kafka-trunk-jdk8 #1640

2017-06-02 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk7 #2325

2017-06-02 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-3264; Deprecate the old Scala consumer (KIP-109)

--
[...truncated 948.20 KB...]
kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.TopicMetadataTest > testIsrAfterBrokerShutDownAndJoinsBack 
STARTED

kafka.integration.TopicMetadataTest > testIsrAfterBrokerShutDownAndJoinsBack 
PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithCollision STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithCollision PASSED

kafka.integration.TopicMetadataTest > testAliveBrokerListWithNoTopics STARTED

kafka.integration.TopicMetadataTest > testAliveBrokerListWithNoTopics PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.TopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.TopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.TopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.TopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithInvalidReplication 
STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithInvalidReplication 
PASSED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.FetcherTest > testFetcher STARTED

kafka.integration.FetcherTest > testFetcher PASSED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionEnabled 
STARTED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionEnabled 
PASSED

kafka.integration.UncleanLeaderElectionTest > 
testCleanLeaderElectionDisabledByTopicOverride STARTED

kafka.integration.UncleanLeaderElectionTest > 
testCleanLeaderElectionDisabledByTopicOverride SKIPPED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionDisabled 
STARTED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionDisabled 
SKIPPED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionInvalidTopicOverride STARTED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionInvalidTopicOverride PASSED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionEnabledByTopicOverride STARTED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionEnabledByTopicOverride PASSED

kafka.integration.MinIsrConfigTest > testDefaultKafkaConfig STARTED

kafka.integration.MinIsrConfigTest > testDefaultKafkaConfig PASSED

unit.kafka.server.KafkaApisTest >

Re: [DISCUSS] KIP-167: Add a restoreAll method to StateRestoreCallback

2017-06-02 Thread Jim Jagielski

> On Jun 2, 2017, at 12:54 AM, Matthias J. Sax  wrote:
> 
> With regard to backward compatibility, we should not change the current
> interface, but add a new interface that extends the current one.
> 

++1



Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-06-02 Thread Jim Jagielski
This makes much more sense to me. +1

> On Jun 1, 2017, at 10:33 AM, Kyle Winkelman  wrote:
> 
> I have updated the KIP and my PR. Let me know what you think.
> To create a cogrouped stream, just call cogroup on a KGroupedStream and
> supply the initializer, aggValueSerde, and an aggregator. Then continue
> adding KGroupedStreams and aggregators. Then call one of the many aggregate
> calls to create a KTable.
> 
> Thanks,
> Kyle
> 
> On Jun 1, 2017 4:03 AM, "Damian Guy"  wrote:
> 
>> Hi Kyle,
>> 
>> Thanks for the update. I think just one initializer makes sense as it
>> should only be called once per key and generally it is just going to create
>> a new instance of whatever the Aggregate class is.
>> 
>> Cheers,
>> Damian
>> 
>> On Wed, 31 May 2017 at 20:09 Kyle Winkelman 
>> wrote:
>> 
>>> Hello all,
>>> 
>>> I have spent some more time on this and the best alternative I have come
>> up
>>> with is:
>>> KGroupedStream has a single cogroup call that takes an initializer and an
>>> aggregator.
>>> CogroupedKStream has a cogroup call that takes additional groupedStream
>>> aggregator pairs.
>>> CogroupedKStream has multiple aggregate methods that create the different
>>> stores.
>>> 
>>> I plan on updating the KIP, but I want people's input on whether we should
>>> have the initializer be passed in once at the beginning or if we should
>>> instead have the initializer be required for each call to one of the
>>> aggregate calls. The first makes more sense to me but doesn't allow the
>>> user to specify different initializers for different tables.
>>> 
>>> Thanks,
>>> Kyle
>>> 
>>> On May 24, 2017 7:46 PM, "Kyle Winkelman" 
>>> wrote:
>>> 
 Yea, I really like that idea. I'll see what I can do to update the KIP and
 my PR when I have some time. I'm not sure how well creating the
 KStreamAggregates will go, though, because at that point I will have thrown
 away the type of the values. It will be type safe, I just may need to do a
 little forcing.
 
 Thanks,
 Kyle
 
 On May 24, 2017 3:28 PM, "Guozhang Wang"  wrote:
 
> Kyle,
> 
> Thanks for the explanations, my previous read on the wiki examples was
> wrong.
> 
> So I guess my motivation should be "reduced" to: can we move the window
> specs param from "KGroupedStream#cogroup(..)" to
> "CogroupedKStream#aggregate(..)"? My motivations are:
> 
> 1. Minor: we can reduce the number of generics in CogroupedKStream from 3
> to 2.
> 2. Major: this is for extensibility of the APIs, and since we are removing
> the "Evolving" annotations on Streams it may be harder to change it again
> in the future. The extended use case is that people want to have windowed
> running aggregates at different granularities, e.g. "give me the counts
> per-minute, per-hour, per-day and per-week", and today in the DSL we need
> to specify that case with multiple aggregate operators, each of which gets
> a state store / changelog, etc. It is possible to optimize this into a
> single state store as well. Its implementation would be tricky, as you need
> to contain windows of different lengths within your window store, but just
> from the public API point of view it could be specified as:
> 
> CogroupedKStream stream = stream1.cogroup(stream2, ...
> "state-store-name");
> 
> table1 = stream.aggregate(/*per-minute window*/)
> table2 = stream.aggregate(/*per-hour window*/)
> table3 = stream.aggregate(/*per-day window*/)
> 
> while underneath we are only using a single store "state-store-name" for
> it.
> 
> 
> Although this feature is out of the scope of this KIP, I'd like to discuss
> if we can "leave the door open" to make such changes without modifying the
> public APIs.
> 
> Guozhang
> 
> 
> On Wed, May 24, 2017 at 3:57 AM, Kyle Winkelman <
>>> winkelman.k...@gmail.com
>> 
> wrote:
> 
>> I allow defining a single window/session window one time when you make the
>> cogroup call from a KGroupedStream. From then on you are using the cogroup
>> call from within CogroupedKStream, which doesn't accept any additional
>> windows/session windows.
>> 
>> Is this what you meant by your question or did I misunderstand?
>> 
>> On May 23, 2017 9:33 PM, "Guozhang Wang" 
>> wrote:
>> 
>> Another question that came to me is on "window alignment": from the KIP it
>> seems you are allowing users to specify a (potentially different) window
>> spec in each co-grouped input stream. So if these window specs are
>> different, how should we "align" them across the different input streams?
>> I think it is more natural to only specify one window spec in
>> 
>> KTable CogroupedKStream#aggregate(Windows);
>> 
>> 
>> And remove it from the cogroup() functions. WDYT?
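
For readers following along, here is a rough sketch of how the cogroup DSL
described in this thread might be used. Class and method names follow the
emails above, but the exact signatures are assumptions, since the KIP and PR
were still being revised; clickStream and viewStream are assumed to be
KStream<String, String> instances and the usual Streams imports are omitted.

// Hypothetical usage of the proposed cogroup API; signatures are assumptions.
KGroupedStream<String, String> clicks = clickStream.groupByKey();
KGroupedStream<String, String> views  = viewStream.groupByKey();

KTable<String, Long> counts = clicks
    // one initializer and one aggValueSerde for the whole cogroup, plus the first aggregator
    .cogroup(() -> 0L, (key, click, total) -> total + 1, Serdes.Long())
    // further grouped-stream / aggregator pairs
    .cogroup(views, (key, view, total) -> total + 1)
    // one of several aggregate overloads, producing a single KTable backed by one state store
    .aggregate("cogrouped-count-store");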
>> 

Jenkins build is back to normal : kafka-trunk-jdk7 #2326

2017-06-02 Thread Apache Jenkins Server
See 



Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-02 Thread Jan Filipiak

Hi

1.
That greatly complicates monitoring. Fail-fast gives you the property that,
when you monitor only the lag of all your apps, you are completely covered.
With this sort of new behaviour, application monitoring becomes much more
complicated, as you now also need to monitor the failure percentage of some
special apps. In my opinion that is a huge downside already.


2.
With a schema registry and Avro-style tooling, it might not even be the
record that is broken; it might just be your app being unable to fetch a
schema it now needs. Maybe you got partitioned away from that registry.


3. When you get alerted because the failure percentage is too high, what are
the steps you are going to take? Shut the app down to buy time, fix the
problem, and then spend way too much time finding a good offset to reprocess
from.

Your time windows are in bad shape anyway at that point, and you have pretty
much lost. This routine is nonsense.

Dead letter queues would be the worst possible addition to the Kafka toolkit
that I can think of. It just doesn't fit an architecture in which having
clients fall behind is a valid option.

Further, as I mentioned already, the only truly bad pill I've seen so far is
CRC errors. Any plans for those?


Best Jan






On 02.06.2017 11:34, Damian Guy wrote:

I agree with what Matthias has said w.r.t failing fast. There are plenty of
times when you don't want to fail-fast and must attempt to  make progress.
The dead-letter queue is exactly for these circumstances. Of course if
every record is failing, then you probably do want to give up.

On Fri, 2 Jun 2017 at 07:56 Matthias J. Sax  wrote:


First a meta comment. KIP discussion should take place on the dev list
-- if user list is cc'ed please make sure to reply to both lists. Thanks.

Thanks for making the scope of the KIP clear. Makes a lot of sense to
focus on deserialization exceptions for now.

With regard to corrupted state stores, would it make sense to fail a
task and wipe out the store to repair it via recreation from the
changelog? That's of course quite an advanced pattern, but I want to bring
it up to design the first step in a way such that we can get there (if
we think it's a reasonable idea).

I also want to comment on fail-fast vs. making progress. I think that
fail-fast is not always the best option. The scenario I have in
mind is like this: you have a bunch of producers that feed the Streams
input topic. Most producers work fine, but maybe one producer misbehaves
and the data it writes is corrupted. You might not even be able
to recover this lost data at any point -- thus, there is no reason to
stop processing; you just skip over those records. Of course, you
need to fix the root cause, and thus you need to alert (either via logs
or the exception handler directly) and start to investigate
to find the bad producer, shut it down and fix it.

Here the dead letter queue comes into play. From my understanding, the
purpose of this feature is solely to enable debugging after the fact. I
don't think those records would be fed back at any point in time (so I
don't see any ordering issue -- a skipped record, in this regard, is just
"fully processed"). Thus, the dead letter queue should actually encode the
original record's metadata (topic, partition, offset, etc.) to enable such
debugging. I guess this might also be possible if you just log the bad
records, but it would be harder to access (you first must find the
Streams instance that wrote the log and extract the information from
there). Reading it from a topic is much simpler.
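
To make the dead-letter idea concrete, here is a minimal sketch (not the
KIP-161 handler interface itself, which is still under discussion) of
forwarding a bad record plus its source coordinates to a dead-letter topic
with the plain producer API; the class and topic names are made up.

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DeadLetterForwarder implements AutoCloseable {

    private final Producer<String, byte[]> producer;
    private final String deadLetterTopic;

    public DeadLetterForwarder(Properties producerConfig, String deadLetterTopic) {
        // producerConfig must set bootstrap.servers plus String/ByteArray serializers
        this.producer = new KafkaProducer<>(producerConfig);
        this.deadLetterTopic = deadLetterTopic;
    }

    // Forward a record that could not be deserialized, keyed by its source
    // coordinates (topic/partition/offset) so the original can be located later.
    public void forward(ConsumerRecord<byte[], byte[]> failed) {
        String sourceCoordinates = failed.topic() + "/" + failed.partition() + "/" + failed.offset();
        producer.send(new ProducerRecord<>(deadLetterTopic, sourceCoordinates, failed.value()));
    }

    @Override
    public void close() {
        producer.close();
    }
}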

I also want to mention the following. Assume you have such a topic with
some bad records and some good records. If we always fail fast, it's
going to be super hard to process the good data. You would need to write
an extra app that copies the data into a new topic, filtering out the bad
records (or apply the map() workaround within Streams). So I don't think
it is necessarily true that failing fast is the best option in production.

Or do you think there are scenarios for which you can recover the
corrupted records successfully? And even if this is possible, might it
be a case for reprocessing instead of failing the whole application?
Also, if you think you can "repair" a corrupted record, should the
handler allow returning a "fixed" record? This would solve the ordering
problem.



-Matthias




On 5/30/17 1:47 AM, Michael Noll wrote:

Thanks for your work on this KIP, Eno -- much appreciated!

- I think it would help to improve the KIP by adding an end-to-end code
example that demonstrates, with the DSL and with the Processor API, how the
user would write a simple application that would then be augmented with the
proposed KIP changes to handle exceptions.  It should also become much
clearer then that e.g. the KIP would lead to different code paths for the
happy case and any failure scenarios.

- Do we have sufficient information available to make informed decisions on
what to do next?  For example, do we know in which part of the topology the
record failed? `ConsumerRecord` gives 

[GitHub] kafka pull request #3208: KAFKA-5325: Avoid and handle exceptions for Kerber...

2017-06-02 Thread rajinisivaram
GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3208

KAFKA-5325: Avoid and handle exceptions for Kerberos re-login



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka KAFKA-5325

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3208.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3208


commit ad8afca300bda010a4fdbd9bbdcdf80bfda17250
Author: Rajini Sivaram 
Date:   2017-06-02T11:31:48Z

KAFKA-5325: Avoid and handle exceptions for Kerberos re-login




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5325) Connection Lose during Kafka Kerberos Renewal process

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034742#comment-16034742
 ] 

ASF GitHub Bot commented on KAFKA-5325:
---

GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3208

KAFKA-5325: Avoid and handle exceptions for Kerberos re-login



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka KAFKA-5325

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3208.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3208


commit ad8afca300bda010a4fdbd9bbdcdf80bfda17250
Author: Rajini Sivaram 
Date:   2017-06-02T11:31:48Z

KAFKA-5325: Avoid and handle exceptions for Kerberos re-login




> Connection Lose during Kafka Kerberos Renewal process
> -
>
> Key: KAFKA-5325
> URL: https://issues.apache.org/jira/browse/KAFKA-5325
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.9.0.0
>Reporter: MuthuKumar
>Assignee: Rajini Sivaram
>
> During Kerberos ticket renewal, all requests reaching the server in the 
> interval between the Kerberos ticket logout and re-login fail with the error 
> mentioned below.
> kafka-clients-0.9.0.0.jar is being used on the producer side. The reason for 
> using Kafka version 0.9.0.0 on the producer side is that the server is running 0.10.0.x.
> OS: Oracle Linux Server release 6.7
> Kerberos Configuration - Producer end
> -
> KafkaClient {
> com.sun.security.auth.module.Krb5LoginModule required
> refreshKrb5Config=true
> principal="u...@.com"
> useKeyTab=true
> serviceName="kafka"
> keyTab="x.keytab"
> client=true;
> };
> Application Log
> ---
> 2017-05-25 02:20:37,515 INF [Login.java:354] Initiating logout for 
> u...@.com
> 2017-05-25 02:20:37,515 INF [Login.java:365] Initiating re-login for 
> u...@.com
> 2017-05-25 02:20:37,525 INF [SaslChannelBuilder.java:91] Failed to create 
> channel due to
> org.apache.kafka.common.KafkaException: Failed to configure 
> SaslClientAuthenticator
> at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:94)
> at 
> org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:88)
> at org.apache.kafka.common.network.Selector.connect(Selector.java:162)
> at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:514)
> at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:169)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:180)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.NoSuchElementException: null
> at java.util.LinkedList$ListItr.next(LinkedList.java:890)
> at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1056)
> at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:90)
> ... 7 common frames omitted
> 2017-05-25 02:20:37,526 ERR [Sender.java:130] Uncaught error in kafka 
> producer I/O thread:
> org.apache.kafka.common.KafkaException: 
> org.apache.kafka.common.KafkaException: Failed to configure 
> SaslClientAuthenticator
> at 
> org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:92)
> at org.apache.kafka.common.network.Selector.connect(Selector.java:162)
> at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:514)
> at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:169)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:180)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kafka.common.KafkaException: Failed to configure 
> SaslClientAuthenticator
> at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:94)
> at 
> org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:88)
> ... 6 common frames omitted
> Caused by: java.util.NoSuchElementException: null
> at java.util.LinkedList$ListItr.next(LinkedList.java:890)
> at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1056)
> at 
> org.a

[GitHub] kafka pull request #3209: KAFKA-5345; Close KafkaClient when streams client ...

2017-06-02 Thread rajinisivaram
GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3209

KAFKA-5345; Close KafkaClient when streams client is closed

Cherry-pick KAFKA-5345 to 0.10.2

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka KAFKA-5345-0102

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3209.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3209


commit 67ca06ecd7fdecd929dae0b657c70b4be71f49ed
Author: Rajini Sivaram 
Date:   2017-06-01T20:54:28Z

KAFKA-5345; Close KafkaClient when streams client is closed

Author: Rajini Sivaram 

Reviewers: Guozhang Wang , Ismael Juma 


Closes #3195 from rajinisivaram/KAFKA-5345






[jira] [Commented] (KAFKA-5345) Some socket connections not closed after restart of Kafka Streams

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034791#comment-16034791
 ] 

ASF GitHub Bot commented on KAFKA-5345:
---

GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3209

KAFKA-5345; Close KafkaClient when streams client is closed

Cherry-pick KAFKA-5345 to 0.10.2

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka KAFKA-5345-0102

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3209.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3209


commit 67ca06ecd7fdecd929dae0b657c70b4be71f49ed
Author: Rajini Sivaram 
Date:   2017-06-01T20:54:28Z

KAFKA-5345; Close KafkaClient when streams client is closed

Author: Rajini Sivaram 

Reviewers: Guozhang Wang , Ismael Juma 


Closes #3195 from rajinisivaram/KAFKA-5345




> Some socket connections not closed after restart of Kafka Streams
> -
>
> Key: KAFKA-5345
> URL: https://issues.apache.org/jira/browse/KAFKA-5345
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: MacOs 10.12.5 and Ubuntu 14.04
>Reporter: Jeroen van Wilgenburg
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> We ran into a problem that resulted in a "Too many open files" exception 
> because some sockets are not closed after a restart.
> This problem only occurs with version {{0.10.2.1}} and {{0.10.2.0}}. 
> {{0.10.1.1}} and {{0.10.1.0}} both work as expected.
> I used the same version for the server and client.
> I used https://github.com/kohsuke/file-leak-detector to display the open file 
> descriptors. The culprit was :
> {noformat}
> #146 socket channel by thread:pool-2-thread-1 on Mon May 29 11:20:47 CEST 2017
>   at sun.nio.ch.SocketChannelImpl.(SocketChannelImpl.java:108)
>   at 
> sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
>   at java.nio.channels.SocketChannel.open(SocketChannel.java:145)
>   at org.apache.kafka.common.network.Selector.connect(Selector.java:168)
>   at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:629)
>   at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:186)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.ensureOneNodeIsReady(StreamsKafkaClient.java:195)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.getAnyReadyBrokerId(StreamsKafkaClient.java:233)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.checkBrokerCompatibility(StreamsKafkaClient.java:300)
>   at 
> org.apache.kafka.streams.KafkaStreams.checkBrokerVersionCompatibility(KafkaStreams.java:401)
>   at org.apache.kafka.streams.KafkaStreams.start(KafkaStreams.java:425)
> {noformat}
>   
>   
> I could narrow the problem down to a reproducible example below (the only 
> dependency is 
> {{org.apache.kafka:kafka-streams:jar:0.10.2.1}}). 
> *IMPORTANT*: You have to run this code in the Intellij IDEA debugger with a 
> special breakpoint to see it fail. 
> See the comments on the socketChannels variable on how to add this 
> breakpoint. 
> When you run this code you will see the number of open SocketChannels 
> increase (only on version 0.10.2.x).
>   
> {code:title=App.java}
> import org.apache.kafka.common.serialization.Serdes;
> import org.apache.kafka.streams.KafkaStreams;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.kstream.KStream;
> import org.apache.kafka.streams.kstream.KStreamBuilder;
> import java.nio.channels.SocketChannel;
> import java.nio.channels.spi.AbstractInterruptibleChannel;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Properties;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
> import java.util.stream.Collectors;
> public class App {
> private static KafkaStreams streams;
> private static String brokerList;
> // Fill socketChannels with entries on line 'Socket socket = 
> socketChannel.socket();' (line number 170  on 0.10.2.1)
> // of org.apache.kafka.common.network.Selector: Add breakpoint, right 
> click on breakpoint.
> // - Uncheck 'Suspend'
> // - Check 'Evaluate and log' and fill text field with (without quotes) 
> 'App.socketChannels.add(socketChannel)'
> private static final List<SocketChannel> socketChannels = new 
> Ar

[GitHub] kafka pull request #3209: KAFKA-5345; Close KafkaClient when streams client ...

2017-06-02 Thread rajinisivaram
Github user rajinisivaram closed the pull request at:

https://github.com/apache/kafka/pull/3209




[jira] [Commented] (KAFKA-5345) Some socket connections not closed after restart of Kafka Streams

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034805#comment-16034805
 ] 

ASF GitHub Bot commented on KAFKA-5345:
---

Github user rajinisivaram closed the pull request at:

https://github.com/apache/kafka/pull/3209


> Some socket connections not closed after restart of Kafka Streams
> -
>
> Key: KAFKA-5345
> URL: https://issues.apache.org/jira/browse/KAFKA-5345
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: MacOs 10.12.5 and Ubuntu 14.04
>Reporter: Jeroen van Wilgenburg
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> We ran into a problem that resulted in a "Too many open files" exception 
> because some sockets are not closed after a restart.
> This problem only occurs with version {{0.10.2.1}} and {{0.10.2.0}}. 
> {{0.10.1.1}} and {{0.10.1.0}} both work as expected.
> I used the same version for the server and client.
> I used https://github.com/kohsuke/file-leak-detector to display the open file 
> descriptors. The culprit was :
> {noformat}
> #146 socket channel by thread:pool-2-thread-1 on Mon May 29 11:20:47 CEST 2017
>   at sun.nio.ch.SocketChannelImpl.(SocketChannelImpl.java:108)
>   at 
> sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
>   at java.nio.channels.SocketChannel.open(SocketChannel.java:145)
>   at org.apache.kafka.common.network.Selector.connect(Selector.java:168)
>   at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:629)
>   at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:186)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.ensureOneNodeIsReady(StreamsKafkaClient.java:195)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.getAnyReadyBrokerId(StreamsKafkaClient.java:233)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.checkBrokerCompatibility(StreamsKafkaClient.java:300)
>   at 
> org.apache.kafka.streams.KafkaStreams.checkBrokerVersionCompatibility(KafkaStreams.java:401)
>   at org.apache.kafka.streams.KafkaStreams.start(KafkaStreams.java:425)
> {noformat}
>   
>   
> I could narrow the problem down to a reproducible example below (the only 
> dependency is 
> {{org.apache.kafka:kafka-streams:jar:0.10.2.1}}). 
> *IMPORTANT*: You have to run this code in the Intellij IDEA debugger with a 
> special breakpoint to see it fail. 
> See the comments on the socketChannels variable on how to add this 
> breakpoint. 
> When you run this code you will see the number of open SocketChannels 
> increase (only on version 0.10.2.x).
>   
> {code:title=App.java}
> import org.apache.kafka.common.serialization.Serdes;
> import org.apache.kafka.streams.KafkaStreams;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.kstream.KStream;
> import org.apache.kafka.streams.kstream.KStreamBuilder;
> import java.nio.channels.SocketChannel;
> import java.nio.channels.spi.AbstractInterruptibleChannel;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Properties;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
> import java.util.stream.Collectors;
> public class App {
> private static KafkaStreams streams;
> private static String brokerList;
> // Fill socketChannels with entries on line 'Socket socket = 
> socketChannel.socket();' (line number 170  on 0.10.2.1)
> // of org.apache.kafka.common.network.Selector: Add breakpoint, right 
> click on breakpoint.
> // - Uncheck 'Suspend'
> // - Check 'Evaluate and log' and fill text field with (without quotes) 
> 'App.socketChannels.add(socketChannel)'
> private static final List<SocketChannel> socketChannels = new 
> ArrayList<>();
> public static void main(String[] args) {
> brokerList = args[0];
> init();
> ScheduledExecutorService scheduledThreadPool = 
> Executors.newScheduledThreadPool(1);
> Runnable command = () -> {
> streams.close();
> System.out.println("Open socketChannels: " + 
> socketChannels.stream()
> .filter(AbstractInterruptibleChannel::isOpen)
> .collect(Collectors.toList()).size());
> init();
> };
> scheduledThreadPool.scheduleWithFixedDelay(command, 1L, 2000, 
> TimeUnit.MILLISECONDS);
> }
> private static void init() {
> Properties streamsConfiguration = new Properties();
> streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, 
> "JeroenApp");
> 

[jira] [Updated] (KAFKA-5345) Some socket connections not closed after restart of Kafka Streams

2017-06-02 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-5345:
--
Fix Version/s: 0.10.2.2

> Some socket connections not closed after restart of Kafka Streams
> -
>
> Key: KAFKA-5345
> URL: https://issues.apache.org/jira/browse/KAFKA-5345
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: MacOs 10.12.5 and Ubuntu 14.04
>Reporter: Jeroen van Wilgenburg
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0, 0.10.2.2
>
>
> We ran into a problem that resulted in a "Too many open files" exception 
> because some sockets are not closed after a restart.
> This problem only occurs with version {{0.10.2.1}} and {{0.10.2.0}}. 
> {{0.10.1.1}} and {{0.10.1.0}} both work as expected.
> I used the same version for the server and client.
> I used https://github.com/kohsuke/file-leak-detector to display the open file 
> descriptors. The culprit was :
> {noformat}
> #146 socket channel by thread:pool-2-thread-1 on Mon May 29 11:20:47 CEST 2017
>   at sun.nio.ch.SocketChannelImpl.(SocketChannelImpl.java:108)
>   at 
> sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
>   at java.nio.channels.SocketChannel.open(SocketChannel.java:145)
>   at org.apache.kafka.common.network.Selector.connect(Selector.java:168)
>   at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:629)
>   at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:186)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.ensureOneNodeIsReady(StreamsKafkaClient.java:195)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.getAnyReadyBrokerId(StreamsKafkaClient.java:233)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsKafkaClient.checkBrokerCompatibility(StreamsKafkaClient.java:300)
>   at 
> org.apache.kafka.streams.KafkaStreams.checkBrokerVersionCompatibility(KafkaStreams.java:401)
>   at org.apache.kafka.streams.KafkaStreams.start(KafkaStreams.java:425)
> {noformat}
>   
>   
> I could narrow the problem down to a reproducible example below (the only 
> dependency is 
> {{org.apache.kafka:kafka-streams:jar:0.10.2.1}}). 
> *IMPORTANT*: You have to run this code in the Intellij IDEA debugger with a 
> special breakpoint to see it fail. 
> See the comments on the socketChannels variable on how to add this 
> breakpoint. 
> When you run this code you will see the number of open SocketChannels 
> increase (only on version 0.10.2.x).
>   
> {code:title=App.java}
> import org.apache.kafka.common.serialization.Serdes;
> import org.apache.kafka.streams.KafkaStreams;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.kstream.KStream;
> import org.apache.kafka.streams.kstream.KStreamBuilder;
> import java.nio.channels.SocketChannel;
> import java.nio.channels.spi.AbstractInterruptibleChannel;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Properties;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
> import java.util.stream.Collectors;
> public class App {
> private static KafkaStreams streams;
> private static String brokerList;
> // Fill socketChannels with entries on line 'Socket socket = 
> socketChannel.socket();' (line number 170  on 0.10.2.1)
> // of org.apache.kafka.common.network.Selector: Add breakpoint, right 
> click on breakpoint.
> // - Uncheck 'Suspend'
> // - Check 'Evaluate and log' and fill text field with (without quotes) 
> 'App.socketChannels.add(socketChannel)'
> private static final List<SocketChannel> socketChannels = new 
> ArrayList<>();
> public static void main(String[] args) {
> brokerList = args[0];
> init();
> ScheduledExecutorService scheduledThreadPool = 
> Executors.newScheduledThreadPool(1);
> Runnable command = () -> {
> streams.close();
> System.out.println("Open socketChannels: " + 
> socketChannels.stream()
> .filter(AbstractInterruptibleChannel::isOpen)
> .collect(Collectors.toList()).size());
> init();
> };
> scheduledThreadPool.scheduleWithFixedDelay(command, 1L, 2000, 
> TimeUnit.MILLISECONDS);
> }
> private static void init() {
> Properties streamsConfiguration = new Properties();
> streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, 
> "JeroenApp");
> streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, 
> brokerList);
> StreamsConfig config = 

[GitHub] kafka pull request #3210: [WIP] KAFKA-5272: Improve validation for Describe/...

2017-06-02 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3210

[WIP] KAFKA-5272: Improve validation for Describe/Alter Configs (KIP-133)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-5272-improve-validation-for-describe-alter-configs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3210.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3210


commit a944c7b5727ebddec940cbed0ca1622b4f0b4016
Author: Ismael Juma 
Date:   2017-06-01T23:21:56Z

KAFKA-5272: Add AlterConfigPolicy






[jira] [Commented] (KAFKA-5272) Improve validation for Describe/Alter Configs (KIP-133)

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034818#comment-16034818
 ] 

ASF GitHub Bot commented on KAFKA-5272:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3210

[WIP] KAFKA-5272: Improve validation for Describe/Alter Configs (KIP-133)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-5272-improve-validation-for-describe-alter-configs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3210.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3210


commit a944c7b5727ebddec940cbed0ca1622b4f0b4016
Author: Ismael Juma 
Date:   2017-06-01T23:21:56Z

KAFKA-5272: Add AlterConfigPolicy




> Improve validation for Describe/Alter Configs (KIP-133)
> ---
>
> Key: KAFKA-5272
> URL: https://issues.apache.org/jira/browse/KAFKA-5272
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.11.0.0
>
>
> TopicConfigHandler.processConfigChanges() warns about certain topic configs. 
> We should include such validations in alter configs and reject the change if 
> the validation fails. Note that this should be done without changing the 
> behaviour of the ConfigCommand (as it does not have access to the broker 
> configs).
> We should consider adding other validations like KAFKA-4092 and KAFKA-4680.
> Finally, the AlterConfigsPolicy mentioned in KIP-133 will be added at the 
> same time.
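
To illustrate the kind of check being described, here is a free-standing sketch; 
the AlterConfigPolicy interface itself is still WIP in the pull request above, so 
no plugin API is assumed and the specific checks are illustrative only.

{code}
import java.util.Map;

import org.apache.kafka.common.errors.PolicyViolationException;

public final class TopicConfigValidation {

    private TopicConfigValidation() { }

    // Reject obviously unsafe topic config changes; the checks shown here are examples,
    // not the validations the broker will actually perform.
    public static void validate(Map<String, String> requestedConfigs) {
        String minInsync = requestedConfigs.get("min.insync.replicas");
        if (minInsync != null && Integer.parseInt(minInsync) < 1) {
            throw new PolicyViolationException("min.insync.replicas must be at least 1");
        }
        String segmentBytes = requestedConfigs.get("segment.bytes");
        if (segmentBytes != null && Long.parseLong(segmentBytes) < 1024 * 1024) {
            throw new PolicyViolationException("segment.bytes below 1 MB would create excessive segments");
        }
    }
}
{code}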



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Android app produces data in Kafka

2017-06-02 Thread Mireia

> Hi.
> 
> I am going to do a project at the university in order to finish my master's in 
> IoT.
> 
> I need to know if it is possible to connect an Android app with Kafka. I want 
> my Android phone to take data with its sensors and produce the data into 
> Kafka.
> If that is possible, where can I find documentation about it?
> 
> Thanks. 
> 
> Regards,
> 
> Mireya B. de Miguel Álvarez



[jira] [Updated] (KAFKA-5363) Add ability to batch restore and receive restoration stats.

2017-06-02 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-5363:
---
Summary: Add ability to batch restore and receive restoration stats.  (was: 
Add restoreAll functionality to StateRestoreCallback)

> Add ability to batch restore and receive restoration stats.
> ---
>
> Key: KAFKA-5363
> URL: https://issues.apache.org/jira/browse/KAFKA-5363
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>Assignee: Bill Bejeck
>  Labels: kip
> Fix For: 0.11.1.0
>
>
> Add a new method {{restoreAll(List<KeyValue<byte[], byte[]>> records)}} to 
> the {{StateRestoreCallback}} to enable bulk writing to the underlying state 
> store vs individual {{restore(byte[] key, byte[] value)}} resulting in 
> quicker restore times.
> KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+a+restoreAll+method+to+StateRestoreCallback



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5363) Add ability to batch restore and receive restoration stats.

2017-06-02 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-5363:
---
Description: 
Currently, when restoring a state store in a Kafka Streams application, we put 
one key-value at a time into the store.  

This task aims to make this recovery more efficient by creating a new interface 
with "restoreAll" functionality allowing for bulk writes by the underlying 
state store implementation.  

The proposal will also add "beginRestore" and "endRestore" callback methods, 
potentially used for:
- tracking when the bulk restoration process begins and ends;
- keeping track of the number of records and the last offset restored.



KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+a+restoreAll+method+to+StateRestoreCallback

  was:
Add a new method {{restoreAll(List<KeyValue<byte[], byte[]>> records)}} to the 
{{StateRestoreCallback}} to enable bulk writing to the underlying state store 
vs individual {{restore(byte[] key, byte[] value)}} resulting in quicker 
restore times.

KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+a+restoreAll+method+to+StateRestoreCallback


> Add ability to batch restore and receive restoration stats.
> ---
>
> Key: KAFKA-5363
> URL: https://issues.apache.org/jira/browse/KAFKA-5363
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>Assignee: Bill Bejeck
>  Labels: kip
> Fix For: 0.11.1.0
>
>
> Currently, when restoring a state store in a Kafka Streams application, we 
> put one key-value at a time into the store.  
> This task aims to make this recovery more efficient by creating a new 
> interface with "restoreAll" functionality allowing for bulk writes by the 
> underlying state store implementation.  
> The proposal will also add "beginRestore" and "endRestore" callback methods, 
> potentially used for:
> - tracking when the bulk restoration process begins and ends;
> - keeping track of the number of records and the last offset restored.
> KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+a+restoreAll+method+to+StateRestoreCallback



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: 0.11.0.0 Release Update

2017-06-02 Thread Michal Borowiecki

Hi all,

So will Exactly Once Semantics be reason enough to bump version to 1.0? 
Or is the leading zero here to stay indefinitely? :-)


Cheers,

Michal


On 05/05/17 04:28, Ismael Juma wrote:

Hi all,

We're quickly approaching our next time-based release. If you missed any of
the updates on the new time-based releases we'll be following, see
https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan
for an explanation.

The release plan can be found in the usual location:
https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.11.0.0

Here are the important dates (also documented in the wiki):

- KIP Freeze: May 10, 2017 (a KIP must be accepted by this date in order
to be considered for this release)
- Feature Freeze: May 17, 2017 (major features merged & working on
stabilization, minor features have PR, release branch cut; anything not in
this state will be automatically moved to the next release in JIRA)
- Code Freeze: May 31, 2017 (first RC created now)
- Release: June 14, 2017

There are a couple of changes based on Ewen's feedback as release manager
for 0.10.2.0:

1. We now have a KIP freeze one week before the feature freeze to avoid
the risky and confusing situation where some KIPs are being discussed,
voted on and merged all in the same week.
2. All the dates were moved from Friday to Wednesday so that release
management doesn't spill over to the weekend.

KIPs: we have 24 adopted with 10 already committed and 10 with patches in
flight. The feature freeze is 12 days away so we have a lot of reviewing to
do, but significant changes have been merged already.

Open JIRAs: As usual, we have a lot!

*https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22)%20AND%20fixVersion%20%3D%200.11.0.0%20ORDER%20BY%20priority%20DESC
*

146 at the moment. As we get nearer to the feature freeze, I will start
moving JIRAs out of this release.

* Closed JIRAs: So far ~191 closed tickets for 0.11.0.0:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.11.0.0%20ORDER%20BY%20priority%20DESC

* Release features:
https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.11.0.0 has
a "Release Features" section that will be included with the release
notes/email for the release. I added some items to get it going. Please add
to
this list anything you think is worth noting.

I'll plan to give another update next week just before the KIP freeze.

Ismael







Re: 0.11.0.0 Release Update

2017-06-02 Thread Ismael Juma
Hi Michal,

We would like to allow for at least one stabilisation cycle after
exactly-once lands before we consider the 1.0 release. Soon after the
0.11.0.0 release, a release manager for the subsequent release will
hopefully volunteer and we can then discuss the details.

While on the subject, I will send an update on 0.11.0.0 later today or
tomorrow.

Ismael

On Fri, Jun 2, 2017 at 5:10 PM, Michal Borowiecki <
michal.borowie...@openbet.com> wrote:

> Hi all,
>
> So will Exactly Once Semantics be reason enough to bump version to 1.0? Or
> is the leading zero here to stay indefinitely? :-)
>
> Cheers,
>
> Michal
>
> On 05/05/17 04:28, Ismael Juma wrote:
>
> Hi all,
>
> We're quickly approaching our next time-based release. If you missed any of
> the updates on the new time-based releases we'll be following, 
> seehttps://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan
> for an explanation.
>
> The release plan can be found in the usual 
> location:https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.11.0.0
>
> Here are the important dates (also documented in the wiki):
>
>- KIP Freeze: May 10, 2017 (a KIP must be accepted by this date in order
>to be considered for this release)
>- Feature Freeze: May 17, 2017 (major features merged & working on
>stabilization, minor features have PR, release branch cut; anything not in
>this state will be automatically moved to the next release in JIRA)
>- Code Freeze: May 31, 2017 (first RC created now)
>- Release: June 14, 2017
>
> There are a couple of changes based on Ewen's feedback as release manager
> for 0.10.2.0:
>
>1. We now have a KIP freeze one week before the feature freeze to avoid
>the risky and confusing situation where some KIPs are being discussed,
>voted on and merged all in the same week.
>2. All the dates were moved from Friday to Wednesday so that release
>management doesn't spill over to the weekend.
>
> KIPs: we have 24 adopted with 10 already committed and 10 with patches in
> flight. The feature freeze is 12 days away so we have a lot of reviewing to
> do, but significant changes have been merged already.
>
> Open JIRAs: As usual, we have a lot!
>
> *https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22)%20AND%20fixVersion%20%3D%200.11.0.0%20ORDER%20BY%20priority%20DESC
>  
> *
>
> 146 at the moment. As we get nearer to the feature freeze, I will start
> moving JIRAs out of this release.
>
> * Closed JIRAs: So far ~191 closed tickets for 
> 0.11.0.0:https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.11.0.0%20ORDER%20BY%20priority%20DESC
>
> * Release 
> features:https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.11.0.0
>  has
> a "Release Features" section that will be included with the release
> notes/email for the release. I added some items to get it going. Please add
> to
> this list anything you think is worth noting.
>
> I'll plan to give another update next week just before the KIP freeze.
>
> Ismael
>
>
>
>


Re: [DISCUSS] KIP-167: Add a restoreAll method to StateRestoreCallback

2017-06-02 Thread Bill Bejeck
Guozhang, Matthias,

Thanks for the comments.  I have updated the KIP, (JIRA title and
description as well).

I had thought about introducing a separate interface altogether, but
extending the current one makes more sense.

As for intermediate callbacks based on time or number of records, I think
the latest update to the KIP addresses this point of querying for
intermediate results, but it would be per batch restored.

Thanks,
Bill
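
To make that concrete, a rough sketch of what such an extending interface could
look like is below; the method names and signatures are placeholders, not the
final KIP-167 API.

import java.util.Collection;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.processor.StateRestoreCallback;

public interface BatchingStateRestoreCallback extends StateRestoreCallback {

    // Called once before restoration of the store starts.
    void beginRestore(String storeName);

    // Restore a whole batch of raw key-value pairs in one call instead of one at a time.
    void restoreAll(Collection<KeyValue<byte[], byte[]>> records);

    // Called once after restoration completes, with simple stats for monitoring.
    void endRestore(String storeName, long totalRestored, long lastRestoredOffset);
}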





On Fri, Jun 2, 2017 at 8:36 AM, Jim Jagielski  wrote:

>
> > On Jun 2, 2017, at 12:54 AM, Matthias J. Sax 
> wrote:
> >
> > With regard to backward compatibility, we should not change the current
> > interface, but add a new interface that extends the current one.
> >
>
> ++1
>
>


Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-06-02 Thread Michal Borowiecki

Hi Matthias,

Apologies, somehow I totally missed this email earlier.

Wrt ValueTransformer, I added it to the list of deprecated methods 
(PR is up to date).


Wrt Cancellable vs Cancelable:

I'm not fluent enough to have spotted this nuance, but having googled 
for it, you are right.


On the other hand however, the precedent seems to have been set by 
java.util.concurrent.Cancellable and akka for instance followed that 
with akka.actor.Cancellable.


Given established heritage in computing context, I'd err on the side of 
consistency with prior practice.


Unless anyone has strong opinions on this matter?


Thanks,

Michal


On 04/05/17 20:43, Matthias J. Sax wrote:

Hi,

thanks for updating the KIP. Looks good to me overall.

I think adding `Cancellable` (or should it be `Cancelable` to follow
American English?) is a clean solution, in contrast to the proposed
alternative.

One minor comment: can you add `ValueTransformer#punctuate()` to the
list of deprecated methods?


-Matthias



On 5/4/17 1:41 AM, Michal Borowiecki wrote:

Further in this direction I've updated the main proposal to incorporate
the Cancellable return type for ProcessorContext.schedule and the
guidance on how to implement "hybrid" punctuation with the proposed 2
PunctuationTypes.

I look forward to more comments whether the Cancallable return type is
an agreeable solution and it's precise definition.

I shall move all alternatives other than the main proposal into the
Rejected Alternatives section and if I hear any objections, I'll move
those back up and we'll discuss further.


Looking forward to all comments and suggestions.


Thanks,

Michal


On 01/05/17 18:23, Michal Borowiecki wrote:

Hi all,

As promised, here is my take at how one could implement the previously
discussed hybrid semantics using the 2 PunctuationType callbacks (one
for STREAM_TIME and one for SYSTEM_TIME).

However, there's a twist.

Since currently calling context.schedule() adds a new
PunctuationSchedule and does not overwrite the previous one, a slight
change would be required:

a) either that PunctuationSchedules are cancellable

b) or that calling schedule() overwrites (cancels) the previous one
with the given PunctuationType (but that's not how it works currently)


Below is an example assuming approach a) is implemented by having
schedule return Cancellable instead of void.

ProcessorContext context;
long streamTimeInterval = ...;
long systemTimeUpperBound = ...; // e.g. systemTimeUpperBound = streamTimeInterval + some tolerance
Cancellable streamTimeSchedule;
Cancellable systemTimeSchedule;
long lastStreamTimePunctation = -1;

public void init(ProcessorContext context){
    this.context = context;
    streamTimeSchedule = context.schedule(PunctuationType.STREAM_TIME, streamTimeInterval,   this::streamTimePunctuate);
    systemTimeSchedule = context.schedule(PunctuationType.SYSTEM_TIME, systemTimeUpperBound, this::systemTimePunctuate);
}

public void streamTimePunctuate(long streamTime){
    periodicBusiness(streamTime);

    systemTimeSchedule.cancel();
    systemTimeSchedule = context.schedule(PunctuationType.SYSTEM_TIME, systemTimeUpperBound, this::systemTimePunctuate);
}

public void systemTimePunctuate(long systemTime){
    periodicBusiness(context.timestamp());

    streamTimeSchedule.cancel();
    streamTimeSchedule = context.schedule(PunctuationType.STREAM_TIME, streamTimeInterval, this::streamTimePunctuate);
}

public void periodicBusiness(long streamTime){
    // guard against streamTime == -1, easy enough.
    // if you need system time instead, just use System.currentTimeMillis()

    // do something businessy here
}

Where Cancellable is either an interface containing just a single void
cancel() method, or one that also has boolean isCancelled().
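
For reference, the interface in question could be as small as:

public interface Cancellable {
    void cancel();             // stop this punctuation schedule
    boolean isCancelled();     // optional, as discussed above
}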


Please let your opinions known whether we should proceed in this
direction or leave "hybrid" considerations out of scope.

Looking forward to hearing your thoughts.

Thanks,
Michal

On 30/04/17 20:07, Michal Borowiecki wrote:

Hi Matthias,

I'd like to start moving the discarded ideas into Rejected
Alternatives section. Before I do, I want to tidy them up, ensure
they've each been given proper treatment.

To that end let me go back to one of your earlier comments about the
original suggestion (A) to put that to bed.


On 04/04/17 06:44, Matthias J. Sax wrote:

(A) You argue, that users can still "punctuate" on event-time via
process(), but I am not sure if this is possible. Note, that users only
get record timestamps via context.timestamp(). Thus, users would need to
track the time progress per partition (based on the partitions they
observe via context.partition()). (This alone puts a huge burden on the
user by itself.) However, users are not notified at startup what
partitio

[jira] [Created] (KAFKA-5369) Producer hangs on metadata fetch from broker in security authorization related failures and no meaningful exception is thrown

2017-06-02 Thread Koelli Mungee (JIRA)
Koelli Mungee created KAFKA-5369:


 Summary: Producer hangs on metadata fetch from broker in security 
authorization related failures and no meaningful exception is thrown
 Key: KAFKA-5369
 URL: https://issues.apache.org/jira/browse/KAFKA-5369
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, producer 
Affects Versions: 0.10.0.0
Reporter: Koelli Mungee


Debugging security-related misconfigurations becomes painful, since the only 
symptom is a hang trying to fetch the metadata on the producer side:

at java.lang.Object.wait(Native Method) 
at org.apache.kafka.clients.Metadata.awaitUpdate(Metadata.java:129) 
at 
org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(KafkaProducer.java:528)
 
at 
org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:441) 
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:430) 
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:353)

There are no meaningful errors on the broker side either. Enabling 
-Djavax.net.debug=all and/or -Dsun.security.krb5.debug=true is an option; however, 
this should be improved so that a meaningful error is surfaced.
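
As a client-side stop-gap (not a fix for the underlying issue), the metadata wait 
can at least be bounded via the producer's max.block.ms setting, so a broken 
security setup surfaces as an exception instead of an indefinite hang. The broker 
address and topic below are placeholders.

{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BoundedMetadataWait {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9093");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Bound the metadata wait so a broken security setup fails within 10s instead of hanging.
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "10000");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value")).get();
        } catch (Exception e) {
            // With the bound in place, the misconfiguration shows up here (typically as a
            // TimeoutException after max.block.ms) rather than as a hang.
            e.printStackTrace();
        }
    }
}
{code}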



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5164) SetSchemaMetadata does not replace the schemas in structs correctly

2017-06-02 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-5164:
-
   Resolution: Fixed
Fix Version/s: 0.11.1.0
   0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3198
[https://github.com/apache/kafka/pull/3198]

> SetSchemaMetadata does not replace the schemas in structs correctly
> ---
>
> Key: KAFKA-5164
> URL: https://issues.apache.org/jira/browse/KAFKA-5164
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Randall Hauch
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> In SetSchemaMetadataTest we verify that the name and version of the schema in 
> the record have been replaced:
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/test/java/org/apache/kafka/connect/transforms/SetSchemaMetadataTest.java#L62
> However, in the case of Structs, the schema will be attached to both the 
> record and the Struct itself. So we correctly rebuild the Record:
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L77
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L104
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L119
> But if the key or value are a struct, they will still contain the old schema 
> embedded in the struct.
> Ultimately this can lead to validations in other code failing (even for very 
> simple changes like adjusting the name of a schema):
> {code}
> (org.apache.kafka.connect.runtime.WorkerTask:141)
> org.apache.kafka.connect.errors.DataException: Mismatching struct schema
> at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:471)
> at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:295)
> at 
> io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:73)
> at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:196)
> at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:167)
> at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
> at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The solution to this is probably to check whether we're dealing with a Struct 
> when we use a new schema and potentially copy/reallocate it.
> This particular issue would only appear if we don't modify the data, so I 
> think SetSchemaMetadata is currently the only transformation that would have 
> the issue.
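
As a rough illustration of the fix direction described above (a sketch under the 
assumption that the updated schema keeps the same field names, not the code merged in 
the PR), the key or value can be copied into a new Struct built against the updated 
schema:

{code}
import org.apache.kafka.connect.data.Field;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.Struct;

final class SchemaUpdateSketch {
    // If the key/value is a Struct, rebuild it so its embedded schema matches the record's.
    static Object updateSchemaIn(Object keyOrValue, Schema updatedSchema) {
        if (keyOrValue instanceof Struct) {
            Struct original = (Struct) keyOrValue;
            Struct updated = new Struct(updatedSchema);
            for (Field field : updatedSchema.fields()) {
                updated.put(field, original.get(field.name()));
            }
            return updated;
        }
        return keyOrValue; // non-Struct keys/values need no copy
    }
}
{code}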



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3198: KAFKA-5164 Ensure SetSchemaMetadata updates key or...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3198


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5164) SetSchemaMetadata does not replace the schemas in structs correctly

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035053#comment-16035053
 ] 

ASF GitHub Bot commented on KAFKA-5164:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3198


> SetSchemaMetadata does not replace the schemas in structs correctly
> ---
>
> Key: KAFKA-5164
> URL: https://issues.apache.org/jira/browse/KAFKA-5164
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Randall Hauch
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> In SetSchemaMetadataTest we verify that the name and version of the schema in 
> the record have been replaced:
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/test/java/org/apache/kafka/connect/transforms/SetSchemaMetadataTest.java#L62
> However, in the case of Structs, the schema will be attached to both the 
> record and the Struct itself. So we correctly rebuild the Record:
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L77
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L104
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L119
> But if the key or value is a Struct, it will still contain the old schema 
> embedded in the Struct.
> Ultimately this can lead to validations in other code failing (even for very 
> simple changes like adjusting the name of a schema):
> {code}
> (org.apache.kafka.connect.runtime.WorkerTask:141)
> org.apache.kafka.connect.errors.DataException: Mismatching struct schema
> at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:471)
> at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:295)
> at 
> io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:73)
> at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:196)
> at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:167)
> at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
> at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The solution to this is probably to check whether we're dealing with a Struct 
> when we use a new schema and potentially copy/reallocate it.
> This particular issue would only appear if we don't modify the data, so I 
> think SetSchemaMetadata is currently the only transformation that would have 
> the issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5366) Add cases for concurrent transactional reads and writes in system tests

2017-06-02 Thread Apurva Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta updated KAFKA-5366:

Labels: exactly-once  (was: )

> Add cases for concurrent transactional reads and writes in system tests
> ---
>
> Key: KAFKA-5366
> URL: https://issues.apache.org/jira/browse/KAFKA-5366
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 0.11.0.0
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>  Labels: exactly-once
> Fix For: 0.11.1.0
>
>
> Currently the transactions system test does a transactional copy while 
> bouncing brokers and clients, and then does a verifying read on the output 
> topic to ensure that it exactly matches the input. 
> We should also have a transactional consumer reading the tail of the output 
> topic as the writes are happening, and then assert that the values _it_ reads 
> also exactly match the values in the source topics. 
> This test really exercises the abort index, and we don't have any of them in 
> the system or integration tests right now. 
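
A minimal sketch of such a concurrent verifying reader (bootstrap server, topic, group 
id and deserializers are illustrative assumptions): a consumer with read_committed 
isolation tails the output topic while the copy is still in flight, and the collected 
values are later asserted to match the source topic.

{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public final class ConcurrentTailReader {
    public static List<String> readCommittedTail(String outputTopic, int expectedCount) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "concurrent-verifier");
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // skip aborted transactions
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        List<String> observed = new ArrayList<>();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList(outputTopic));
            while (observed.size() < expectedCount) {
                for (ConsumerRecord<String, String> record : consumer.poll(500)) {
                    observed.add(record.value()); // later compared against the source topic values
                }
            }
        }
        return observed;
    }
}
{code}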



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5292) Update authorization checks in AdminClient and add authorization tests

2017-06-02 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-5292:
---
Summary: Update authorization checks in AdminClient and add authorization 
tests  (was: Authorization tests for AdminClient)

> Update authorization checks in AdminClient and add authorization tests
> --
>
> Key: KAFKA-5292
> URL: https://issues.apache.org/jira/browse/KAFKA-5292
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Colin P. McCabe
>Priority: Critical
> Fix For: 0.11.0.0
>
>
> AuthorizerIntegrationTest includes protocol, consumer and producer tests. We 
> should add tests for the AdminClient as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3211: Minor: make flush no-op as we don't need to call f...

2017-06-02 Thread bbejeck
GitHub user bbejeck opened a pull request:

https://github.com/apache/kafka/pull/3211

Minor: make flush no-op as we don't need to call flush on commit.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bbejeck/kafka MINOR_no_flush_on_commit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3211.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3211


commit 7fd20819b250477e4e4f36fe0da60e0ff74a1403
Author: Bill Bejeck 
Date:   2017-06-02T18:12:52Z

Minor: make flush no-op as we don't need to call flush on commit.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5327) Console Consumer should only poll for up to max messages

2017-06-02 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5327.
--
   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 3148
[https://github.com/apache/kafka/pull/3148]

> Console Consumer should only poll for up to max messages
> 
>
> Key: KAFKA-5327
> URL: https://issues.apache.org/jira/browse/KAFKA-5327
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Dustin Cote
>Assignee: huxi
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> The ConsoleConsumer has a --max-messages flag that can be used to limit the 
> number of messages consumed. However, the number of records actually consumed 
> is governed by max.poll.records. This means you see one message on the 
> console, but your offset has moved forward a default of 500, which is kind of 
> counterintuitive. It would be good to only commit offsets for messages we 
> have printed to the console.
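
A hedged illustration of that behaviour (not the actual ConsoleConsumer code; it 
assumes auto commit is disabled and that the consumer is already subscribed): print at 
most maxMessages records and commit only the offsets of the records actually printed, 
rather than everything a single poll() returned.

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

final class BoundedConsoleRead {
    static void printAtMost(KafkaConsumer<String, String> consumer, int maxMessages) {
        int printed = 0;
        Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
        while (printed < maxMessages) {
            for (ConsumerRecord<String, String> record : consumer.poll(1000)) {
                System.out.println(record.value());
                toCommit.put(new TopicPartition(record.topic(), record.partition()),
                             new OffsetAndMetadata(record.offset() + 1)); // next offset to consume
                if (++printed >= maxMessages)
                    break;
            }
        }
        consumer.commitSync(toCommit); // commit only what was printed
    }
}
{code}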



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3148: KAFKA-5327: ConsoleConsumer should manually commit...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3148


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5327) Console Consumer should only poll for up to max messages

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035261#comment-16035261
 ] 

ASF GitHub Bot commented on KAFKA-5327:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3148


> Console Consumer should only poll for up to max messages
> 
>
> Key: KAFKA-5327
> URL: https://issues.apache.org/jira/browse/KAFKA-5327
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Dustin Cote
>Assignee: huxi
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> The ConsoleConsumer has a --max-messages flag that can be used to limit the 
> number of messages consumed. However, the number of records actually consumed 
> is governed by max.poll.records. This means you see one message on the 
> console, but your offset has moved forward a default of 500, which is kind of 
> counterintuitive. It would be good to only commit offsets for messages we 
> have printed to the console.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-5368) Kafka Streams skipped-records-rate sensor produces nonzero values when the timestamps are valid

2017-06-02 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5368.
--
   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 3206
[https://github.com/apache/kafka/pull/3206]

> Kafka Streams skipped-records-rate sensor produces nonzero values when the 
> timestamps are valid
> ---
>
> Key: KAFKA-5368
> URL: https://issues.apache.org/jira/browse/KAFKA-5368
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Hamidreza Afzali
>Assignee: Hamidreza Afzali
> Fix For: 0.11.0.0
>
>
> Kafka Streams skipped-records-rate sensor produces nonzero values even when 
> the timestamps are valid and records are processed. The values are equal to 
> poll-rate.
> Related issue: KAFKA-5055 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3206: KAFKA-5368 Kafka Streams skipped-records-rate sens...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3206


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5368) Kafka Streams skipped-records-rate sensor produces nonzero values when the timestamps are valid

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035281#comment-16035281
 ] 

ASF GitHub Bot commented on KAFKA-5368:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3206


> Kafka Streams skipped-records-rate sensor produces nonzero values when the 
> timestamps are valid
> ---
>
> Key: KAFKA-5368
> URL: https://issues.apache.org/jira/browse/KAFKA-5368
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Hamidreza Afzali
>Assignee: Hamidreza Afzali
> Fix For: 0.11.0.0
>
>
> Kafka Streams skipped-records-rate sensor produces nonzero values even when 
> the timestamps are valid and records are processed. The values are equal to 
> poll-rate.
> Related issue: KAFKA-5055 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3982) Issue with processing order of consumer properties in console consumer

2017-06-02 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3982:
-
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1655
[https://github.com/apache/kafka/pull/1655]

> Issue with processing order of consumer properties in console consumer
> --
>
> Key: KAFKA-3982
> URL: https://issues.apache.org/jira/browse/KAFKA-3982
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> With the recent introduction of the {{consumer.property}} argument in the console 
> consumer, both the new and the old consumer could overwrite certain properties 
> provided using this new argument.
> Specifically, the old consumer would overwrite the values provided for 
> [{{auto.offset.reset}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L173]
>  and 
> [{{zookeeper.connect}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L174],
>  and the new consumer would overwrite the values provided for 
> [{{auto.offset.reset}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L196],
>  
> [{{bootstrap.servers}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L197],
>  
> [{{key.deserializer}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L198],
>  and 
> [{{value.deserializer}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L199].
> For example, when running the old consumer as {{bin/kafka-console-consumer.sh 
> --zookeeper localhost:2181 --topic foo --consumer-property 
> auto.offset.reset=none}}, the value that's eventually selected for 
> {{auto.offset.reset}} will be {{largest}}, overwriting what the user provides 
> on the command line.
> This seems to be because the properties provided via the {{consumer.property}} 
> argument are not considered when finalizing the configuration of the consumer.
> Some properties can now be provided in three different places (directly in 
> the command line, via the {{consumer.property}} argument, and via the 
> {{consumer.config}} argument, in the same order of precedence).
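
A small sketch of that precedence (method and argument names are illustrative, not the 
actual ConsoleConsumer implementation): later putAll calls win, so the config file is 
overridden by --consumer-property values, which are in turn overridden by the dedicated 
command-line options.

{code}
import java.util.Properties;

final class ConsumerPropsMerge {
    static Properties merge(Properties fromConfigFile, Properties fromConsumerProperty,
                            Properties fromCommandLine) {
        Properties merged = new Properties();
        merged.putAll(fromConfigFile);       // lowest precedence: --consumer.config file
        merged.putAll(fromConsumerProperty); // middle: --consumer-property key=value pairs
        merged.putAll(fromCommandLine);      // highest: dedicated command-line options
        return merged;
    }
}
{code}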



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #1655: KAFKA-3982: Fix processing order of some of the co...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1655


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3982) Issue with processing order of consumer properties in console consumer

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035298#comment-16035298
 ] 

ASF GitHub Bot commented on KAFKA-3982:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1655


> Issue with processing order of consumer properties in console consumer
> --
>
> Key: KAFKA-3982
> URL: https://issues.apache.org/jira/browse/KAFKA-3982
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> With the recent introduction of the {{consumer.property}} argument in the console 
> consumer, both the new and the old consumer could overwrite certain properties 
> provided using this new argument.
> Specifically, the old consumer would overwrite the values provided for 
> [{{auto.offset.reset}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L173]
>  and 
> [{{zookeeper.connect}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L174],
>  and the new consumer would overwrite the values provided for 
> [{{auto.offset.reset}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L196],
>  
> [{{bootstrap.servers}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L197],
>  
> [{{key.deserializer}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L198],
>  and 
> [{{value.deserializer}}|https://github.com/apache/kafka/blob/10bbffd75439e10fe9db6cf0aa48a7da7e386ef3/core/src/main/scala/kafka/tools/ConsoleConsumer.scala#L199].
> For example, when running the old consumer as {{bin/kafka-console-consumer.sh 
> --zookeeper localhost:2181 --topic foo --consumer-property 
> auto.offset.reset=none}}, the value that's eventually selected for 
> {{auto.offset.reset}} will be {{largest}}, overwriting what the user provides 
> on the command line.
> This seems to be because the properties provided via the {{consumer.property}} 
> argument are not considered when finalizing the configuration of the consumer.
> Some properties can now be provided in three different places (directly in 
> the command line, via the {{consumer.property}} argument, and via the 
> {{consumer.config}} argument, in the same order of precedence).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5368) Kafka Streams skipped-records-rate sensor produces nonzero values when the timestamps are valid

2017-06-02 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035344#comment-16035344
 ] 

Matthias J. Sax commented on KAFKA-5368:


Could we add a test for this? I think this issue was introduced by a code 
refactoring and slipped through because of a missing test...
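
A rough sketch of what such a check could look like (the metric name comes from this 
ticket; the test harness around it is assumed): after driving records with valid 
timestamps through the topology, assert that skipped-records-rate stayed at zero.

{code}
import java.util.Map;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;
import static org.junit.Assert.assertEquals;

final class SkippedRecordsCheck {
    // Assumes 'streams' has already processed the valid-timestamp input.
    static void assertNoSkippedRecords(KafkaStreams streams) {
        for (Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
            if ("skipped-records-rate".equals(entry.getKey().name())) {
                assertEquals(0.0, entry.getValue().value(), 0.0);
            }
        }
    }
}
{code}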

> Kafka Streams skipped-records-rate sensor produces nonzero values when the 
> timestamps are valid
> ---
>
> Key: KAFKA-5368
> URL: https://issues.apache.org/jira/browse/KAFKA-5368
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Hamidreza Afzali
>Assignee: Hamidreza Afzali
> Fix For: 0.11.0.0
>
>
> Kafka Streams skipped-records-rate sensor produces nonzero values even when 
> the timestamps are valid and records are processed. The values are equal to 
> poll-rate.
> Related issue: KAFKA-5055 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3205: KAFKA-5236; Increase the block/buffer size when co...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3205


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5236) Regression in on-disk log size when using Snappy compression with 0.8.2 log message format

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035353#comment-16035353
 ] 

ASF GitHub Bot commented on KAFKA-5236:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3205


> Regression in on-disk log size when using Snappy compression with 0.8.2 log 
> message format
> --
>
> Key: KAFKA-5236
> URL: https://issues.apache.org/jira/browse/KAFKA-5236
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Nick Travers
>Assignee: Ismael Juma
>  Labels: regression
> Fix For: 0.11.0.0
>
>
> We recently upgraded our brokers in our production environments from 0.10.1.1 
> to 0.10.2.1 and we've noticed a sizable regression in the on-disk .log file 
> size. For some deployments the increase was as much as 50%.
> We run our brokers with the 0.8.2 log message format version. The majority of 
> our message volume comes from 0.10.x Java clients sending messages encoded 
> with the Snappy codec.
> Some initial testing only shows a regression between the two versions when 
> using Snappy compression with a log message format of 0.8.2.
> I also tested 0.10.x log message formats as well as Gzip compression. The log 
> sizes do not differ in this case, so the issue seems confined to 0.8.2 
> message format and Snappy compression.
> A git-bisect led me to this commit, which modified the server-side 
> implementation of `Record`:
> https://github.com/apache/kafka/commit/67f1e5b91bf073151ff57d5d656693e385726697
> Here's the PR, which has more context:
> https://github.com/apache/kafka/pull/2140
> Here is a link to the test I used to reproduce this issue:
> https://github.com/nicktrav/kafka/commit/68e8db4fa525e173651ac740edb270b0d90b8818
> cc: [~hachikuji] [~junrao] [~ijuma] [~guozhang] (tagged on original PR)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-1595) Remove deprecated and slower scala JSON parser from kafka.consumer.TopicCount

2017-06-02 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035365#comment-16035365
 ] 

Onur Karaman commented on KAFKA-1595:
-

Sounds good.

> Remove deprecated and slower scala JSON parser from kafka.consumer.TopicCount
> -
>
> Key: KAFKA-1595
> URL: https://issues.apache.org/jira/browse/KAFKA-1595
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.1.1
>Reporter: Jagbir
>Assignee: Ismael Juma
>  Labels: newbie
>
> The following issue is created as a follow up suggested by Jun Rao
> in a kafka news group message with the Subject
> "Blocking Recursive parsing from 
> kafka.consumer.TopicCount$.constructTopicCount"
> SUMMARY:
> An issue was detected in a typical cluster of 3 kafka instances backed
> by 3 zookeeper instances (kafka version 0.8.1.1, scala version 2.10.3,
> java version 1.7.0_65). On the consumer end, when consumers get recycled,
> there is a troubling JSON parsing recursion which takes a busy lock and
> blocks the consumers' thread pool.
> In 0.8.1.1 scala client library ZookeeperConsumerConnector.scala:355 takes
> a global lock (0xd3a7e1d0) during the rebalance, and fires an
> expensive JSON parsing, while keeping the other consumers from shutting
> down, see, e.g,
> at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:161)
> The deep recursive JSON parsing should be deprecated in favor
> of a better JSON parser, see, e.g,
> http://engineering.ooyala.com/blog/comparing-scala-json-libraries?
> DETAILS:
> The first dump is for a recursive blocking thread holding the lock for 
> 0xd3a7e1d0
> and the subsequent dump is for a waiting thread.
> (Please grep for 0xd3a7e1d0 to see the locked object.)
> 
> -8<-
> "Sa863f22b1e5hjh6788991800900b34545c_profile-a-prod1-s-140789080845312-c397945e8_watcher_executor"
> prio=10 tid=0x7f24dc285800 nid=0xda9 runnable [0x7f249e40b000]
> java.lang.Thread.State: RUNNABLE
> at 
> scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.p$7(Parsers.scala:722)
> at 
> scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.continue$1(Parsers.scala:726)
> at 
> scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:737)
> at 
> scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:721)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Success.flatMapWithNext(Parsers.scala:142)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
> at 
> scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
> at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:22

[GitHub] kafka pull request #3212: KAFKA-5019: Upgrades notes for idempotent/transact...

2017-06-02 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3212

KAFKA-5019: Upgrades notes for idempotent/transactional features and new 
message format



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5019

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3212.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3212


commit 84390c7a6c65e78d01e7eb164a65e27d1cd03adb
Author: Jason Gustafson 
Date:   2017-06-02T20:49:32Z

KAFKA-5019: Upgrades notes for idempotent/transactional features and new 
message format




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5019) Exactly-once upgrade notes

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035403#comment-16035403
 ] 

ASF GitHub Bot commented on KAFKA-5019:
---

GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3212

KAFKA-5019: Upgrades notes for idempotent/transactional features and new 
message format



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5019

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3212.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3212


commit 84390c7a6c65e78d01e7eb164a65e27d1cd03adb
Author: Jason Gustafson 
Date:   2017-06-02T20:49:32Z

KAFKA-5019: Upgrades notes for idempotent/transactional features and new 
message format




> Exactly-once upgrade notes
> --
>
> Key: KAFKA-5019
> URL: https://issues.apache.org/jira/browse/KAFKA-5019
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> We have added some basic upgrade notes, but we need to flesh them out. We 
> should cover every item that has compatibility implications as well as new and 
> updated protocol APIs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5236) Regression in on-disk log size when using Snappy compression with 0.8.2 log message format

2017-06-02 Thread Nick Travers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035404#comment-16035404
 ] 

Nick Travers commented on KAFKA-5236:
-

Thanks for the patch [~ijuma]! I ran the same tests I used originally and 
confirmed that there was no regression in the on-disk size.

> Regression in on-disk log size when using Snappy compression with 0.8.2 log 
> message format
> --
>
> Key: KAFKA-5236
> URL: https://issues.apache.org/jira/browse/KAFKA-5236
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Nick Travers
>Assignee: Ismael Juma
>  Labels: regression
> Fix For: 0.11.0.0
>
>
> We recently upgraded our brokers in our production environments from 0.10.1.1 
> to 0.10.2.1 and we've noticed a sizable regression in the on-disk .log file 
> size. For some deployments the increase was as much as 50%.
> We run our brokers with the 0.8.2 log message format version. The majority of 
> our message volume comes from 0.10.x Java clients sending messages encoded 
> with the Snappy codec.
> Some initial testing only shows a regression between the two versions when 
> using Snappy compression with a log message format of 0.8.2.
> I also tested 0.10.x log message formats as well as Gzip compression. The log 
> sizes do not differ in this case, so the issue seems confined to 0.8.2 
> message format and Snappy compression.
> A git-bisect led me to this commit, which modified the server-side 
> implementation of `Record`:
> https://github.com/apache/kafka/commit/67f1e5b91bf073151ff57d5d656693e385726697
> Here's the PR, which has more context:
> https://github.com/apache/kafka/pull/2140
> Here is a link to the test I used to reproduce this issue:
> https://github.com/nicktrav/kafka/commit/68e8db4fa525e173651ac740edb270b0d90b8818
> cc: [~hachikuji] [~junrao] [~ijuma] [~guozhang] (tagged on original PR)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5019) Exactly-once upgrade notes

2017-06-02 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5019:
---
Status: Patch Available  (was: In Progress)

> Exactly-once upgrade notes
> --
>
> Key: KAFKA-5019
> URL: https://issues.apache.org/jira/browse/KAFKA-5019
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> We have added some basic upgrade notes, but we need to flesh them out. We 
> should cover every item that has compatibility implications as well as new and 
> updated protocol APIs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5236) Regression in on-disk log size when using Snappy compression with 0.8.2 log message format

2017-06-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035417#comment-16035417
 ] 

Ismael Juma commented on KAFKA-5236:


Thanks for verifying [~nickt].

> Regression in on-disk log size when using Snappy compression with 0.8.2 log 
> message format
> --
>
> Key: KAFKA-5236
> URL: https://issues.apache.org/jira/browse/KAFKA-5236
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Nick Travers
>Assignee: Ismael Juma
>  Labels: regression
> Fix For: 0.11.0.0
>
>
> We recently upgraded our brokers in our production environments from 0.10.1.1 
> to 0.10.2.1 and we've noticed a sizable regression in the on-disk .log file 
> size. For some deployments the increase was as much as 50%.
> We run our brokers with the 0.8.2 log message format version. The majority of 
> our message volume comes from 0.10.x Java clients sending messages encoded 
> with the Snappy codec.
> Some initial testing only shows a regression between the two versions when 
> using Snappy compression with a log message format of 0.8.2.
> I also tested 0.10.x log message formats as well as Gzip compression. The log 
> sizes do not differ in this case, so the issue seems confined to 0.8.2 
> message format and Snappy compression.
> A git-bisect led me to this commit, which modified the server-side 
> implementation of `Record`:
> https://github.com/apache/kafka/commit/67f1e5b91bf073151ff57d5d656693e385726697
> Here's the PR, which has more context:
> https://github.com/apache/kafka/pull/2140
> Here is a link to the test I used to reproduce this issue:
> https://github.com/nicktrav/kafka/commit/68e8db4fa525e173651ac740edb270b0d90b8818
> cc: [~hachikuji] [~junrao] [~ijuma] [~guozhang] (tagged on original PR)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3213: KAFKA-5359: Make future exception the exception ca...

2017-06-02 Thread vahidhashemian
GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/3213

KAFKA-5359: Make future exception the exception cause on the client side

Instead of throwing `future.exception()` on the client side, throw an 
exception with `future.exception()` as the cause. This is to better identify 
where on the client side the exception is thrown.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-5359

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3213


commit cee7244dc58fd48e53064f3abc6600a386b274c6
Author: Vahid Hashemian 
Date:   2017-06-02T20:06:21Z

KAFKA-5359: Make future exception the exception cause on the client side

Instead of throwing `future.exception()` on the client side, throw an 
exception with that `future.exception()` as the cause.
This is to better identify where on the client side the exception is thrown.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5359) Exceptions from RequestFuture lack parts of the stack trace

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035419#comment-16035419
 ] 

ASF GitHub Bot commented on KAFKA-5359:
---

GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/3213

KAFKA-5359: Make future exception the exception cause on the client side

Instead of throwing `future.exception()` on the client side, throw an 
exception with `future.exception()` as the cause. This is to better identify 
where on the client side the exception is thrown.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-5359

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3213


commit cee7244dc58fd48e53064f3abc6600a386b274c6
Author: Vahid Hashemian 
Date:   2017-06-02T20:06:21Z

KAFKA-5359: Make future exception the exception cause on the client side

Instead of throwing `future.exception()` on the client side, throw an 
exception with that `future.exception()` as the cause.
This is to better identify where on the client side the exception is thrown.




> Exceptions from RequestFuture lack parts of the stack trace
> ---
>
> Key: KAFKA-5359
> URL: https://issues.apache.org/jira/browse/KAFKA-5359
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Magnus Reftel
>Assignee: Vahid Hashemian
>Priority: Minor
>
> When an exception occurs within a task that reports its result using a 
> RequestFuture, that exception is stored in a field on the RequestFuture using 
> the {{raise}} method. In many places in the code where such futures are 
> completed, that exception is then thrown directly using {{throw 
> future.exception();}} (see e.g. 
> [Fetcher.getTopicMetadata|https://github.com/apache/kafka/blob/aebba89a2b9b5ea6a7cab2599555232ef3fe21ad/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L316]).
> This means that the exception that ends up in client code only has stack 
> traces related to the original exception, but nothing leading up to the 
> completion of the future. The client therefore gets no indication of what was 
> going on in the client code - only that it somehow ended up in the Kafka 
> libraries, and that a task failed at some point.
> One solution to this is to use the exceptions from the future as causes for 
> chained exceptions, so that the client gets a stack trace that shows what the 
> client was doing, in addition to getting the stack traces for the exception 
> in the task.
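
A minimal sketch of that chaining (the helper and its call site are illustrative; 
RequestFuture is an internal client class): rethrow on the caller's stack with the 
future's exception attached as the cause, so both stack traces survive.

{code}
import org.apache.kafka.clients.consumer.internals.RequestFuture;
import org.apache.kafka.common.KafkaException;

final class FutureErrors {
    // Rethrow a failed future's error as the cause of a new exception raised here,
    // so the resulting trace shows both the background task and this call site.
    static <T> T getOrThrow(RequestFuture<T> future, String context) {
        if (future.failed())
            throw new KafkaException(context, future.exception());
        return future.value();
    }
}
{code}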



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5359) Exceptions from RequestFuture lack parts of the stack trace

2017-06-02 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-5359:
---
Status: Patch Available  (was: Open)

> Exceptions from RequestFuture lack parts of the stack trace
> ---
>
> Key: KAFKA-5359
> URL: https://issues.apache.org/jira/browse/KAFKA-5359
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Magnus Reftel
>Assignee: Vahid Hashemian
>Priority: Minor
>
> When an exception occurs within a task that reports its result using a 
> RequestFuture, that exception is stored in a field on the RequestFuture using 
> the {{raise}} method. In many places in the code where such futures are 
> completed, that exception is then thrown directly using {{throw 
> future.exception();}} (see e.g. 
> [Fetcher.getTopicMetadata|https://github.com/apache/kafka/blob/aebba89a2b9b5ea6a7cab2599555232ef3fe21ad/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L316]).
> This means that the exception that ends up in client code only has stack 
> traces related to the original exception, but nothing leading up to the 
> completion of the future. The client therefore gets no indication of what was 
> going on in the client code - only that it somehow ended up in the Kafka 
> libraries, and that a task failed at some point.
> One solution to this is to use the exceptions from the future as causes for 
> chained exceptions, so that the client gets a stack trace that shows what the 
> client was doing, in addition to getting the stack traces for the exception 
> in the task.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

