[jira] [Assigned] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Armin Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Armin Braun reassigned KAFKA-4801:
--

Assignee: Jason Gustafson  (was: Armin Braun)

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Assignee: Jason Gustafson
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little: statistically, reproducing it takes more 
> than 100 runs in half the cases) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> This results in the exception below being thrown at a relatively low rate 
> (definitely less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> This was also reported in a comment on the original KAFKA-4198:
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468
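For context, a minimal sketch of the race the test hits (assuming the single-threaded 0.10.x consumer; this is illustrative, not the actual test code): after a broker bounce triggers a rebalance inside {{poll()}}, a partition from a stale list may no longer be assigned, so {{position()}} should only be asked about partitions in the current assignment:

{code}
// Hedged sketch: iterate the *current* assignment instead of a stale
// partition list. Rebalances happen inside poll() on this single thread,
// so the assignment cannot change between these two calls.
ConsumerRecords<byte[], byte[]> records = consumer.poll(100);
for (TopicPartition tp : consumer.assignment()) {
    long position = consumer.position(tp); // safe: tp is currently assigned
    // ... compare position against the offsets consumed so far ...
}
{code}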



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5097) KafkaConsumer.poll throws IllegalStateException

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988301#comment-15988301
 ] 

ASF GitHub Bot commented on KAFKA-5097:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2887


> KafkaConsumer.poll throws IllegalStateException
> ---
>
> Key: KAFKA-5097
> URL: https://issues.apache.org/jira/browse/KAFKA-5097
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Assignee: Eno Thereska
>Priority: Blocker
> Fix For: 0.10.2.1
>
>
> The backport of KAFKA-5075 to 0.10.2 seems to have introduced a regression: 
> If a fetch returns more data than `max.poll.records` and there is a rebalance 
> or the user changes the assignment/subscription after a `poll` that doesn't 
> return all the fetched data, the next call will throw an 
> `IllegalStateException`. More discussion in the following PR that includes a 
> fix:
> https://github.com/apache/kafka/pull/2876/files#r112413428
> This issue caused a Streams system test to fail, see KAFKA-4755.
> We should fix the regression before releasing 0.10.2.1.
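A minimal repro sketch based on the description above (all names and settings are illustrative, assuming a topic with more buffered fetch data than one poll returns):

{code}
// Hedged repro sketch (not the actual system test):
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "kafka-5097-repro");
props.put("max.poll.records", "1"); // hand out less than one fetch per poll()
props.put("key.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.put("value.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");

KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
consumer.assign(Collections.singleton(new TopicPartition("test", 0)));
consumer.poll(100); // the fetch returns several records, only one is handed out
// Changing the assignment while fetched data is still buffered is what
// triggered the regression in the 0.10.2 backport:
consumer.assign(Collections.singleton(new TopicPartition("test", 1)));
consumer.poll(100); // threw IllegalStateException before the fix
{code}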



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2887: KAFKA-5097: Add testFetchAfterPartitionWithFetched...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2887


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4986) Add producer per task support

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988288#comment-15988288
 ] 

ASF GitHub Bot commented on KAFKA-4986:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2854


> Add producer per task support
> -
>
> Key: KAFKA-4986
> URL: https://issues.apache.org/jira/browse/KAFKA-4986
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0
>
>
> Add a new config parameter {{processing_guarantee}} and enable "producer per 
> task" initialization if the new config is set to {{exactly_once}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2854: KAFKA-4986: Adding producer per task (follow-up)

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2854


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2933: MINOR: Some cleanup in the transactional producer

2017-04-27 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/2933

MINOR: Some cleanup in the transactional producer



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka 
minor-transactional-client-cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2933.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2933


commit 24d2538c585d79b2879f69113e1b09cf8dad7ed8
Author: Jason Gustafson 
Date:   2017-04-28T06:09:27Z

MINOR: Some cleanup in the transactional producer




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5137) Controlled shutdown timeout message improvement

2017-04-27 Thread Umesh Chaudhary (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988235#comment-15988235
 ] 

Umesh Chaudhary commented on KAFKA-5137:


Indeed [~cotedm]. Even the *socketTimeoutMs* instance variable used in this 
method points to *controller.socket.timeout.ms*. Sent an initial PR; please 
review it, and I can improve it if we find other causes of IOException during 
controlled shutdown. 

> Controlled shutdown timeout message improvement
> ---
>
> Key: KAFKA-5137
> URL: https://issues.apache.org/jira/browse/KAFKA-5137
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Priority: Minor
>  Labels: newbie
>
> Currently if you fail during controlled shutdown, you can get a message that 
> says the socket.timeout.ms has expired. This config actually doesn't exist on 
> the broker. Instead, we should explicitly say if we've hit the 
> controller.socket.timeout.ms or the request.timeout.ms as it's confusing to 
> take action given the current message. I believe the relevant code is here:
> https://github.com/apache/kafka/blob/0.10.2/core/src/main/scala/kafka/server/KafkaServer.scala#L428-L454
> I'm also not sure if there's another timeout that could be hit here or 
> another reason why IOException might be thrown. At the least we should call 
> out those two configs instead of the non-existent one, but if we can direct to 
> the proper one that would be even better.
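For reference, a sketch of the broker settings the warning could name instead (a server.properties fragment; the values shown are illustrative, not recommendations):

{code}
# server.properties -- timeouts that can actually expire during controlled
# shutdown; the warning should reference these rather than the non-existent
# socket.timeout.ms:
controller.socket.timeout.ms=30000
request.timeout.ms=30000
{code}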



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5137) Controlled shutdown timeout message improvement

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988230#comment-15988230
 ] 

ASF GitHub Bot commented on KAFKA-5137:
---

GitHub user umesh9794 opened a pull request:

https://github.com/apache/kafka/pull/2932

KAFKA-5137 : Controlled shutdown timeout message improvement

This PR improves the warning message by adding correct config details. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/umesh9794/kafka local

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2932.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2932


commit f2b35a8e56d83301a42a09fef61a9ba752acce70
Author: umesh9794 
Date:   2017-04-28T05:14:04Z

KAFKA-5137 : Controlled shutdown timeout message improvement




> Controlled shutdown timeout message improvement
> ---
>
> Key: KAFKA-5137
> URL: https://issues.apache.org/jira/browse/KAFKA-5137
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Priority: Minor
>  Labels: newbie
>
> Currently if you fail during controlled shutdown, you can get a message that 
> says the socket.timeout.ms has expired. This config actually doesn't exist on 
> the broker. Instead, we should explicitly say if we've hit the 
> controller.socket.timeout.ms or the request.timeout.ms as it's confusing to 
> take action given the current message. I believe the relevant code is here:
> https://github.com/apache/kafka/blob/0.10.2/core/src/main/scala/kafka/server/KafkaServer.scala#L428-L454
> I'm also not sure if there's another timeout that could be hit here or 
> another reason why IOException might be thrown. At the least we should call 
> out those two configs instead of the non-existent one, but if we can direct to 
> the proper one that would be even better.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2932: KAFKA-5137 : Controlled shutdown timeout message i...

2017-04-27 Thread umesh9794
GitHub user umesh9794 opened a pull request:

https://github.com/apache/kafka/pull/2932

KAFKA-5137 : Controlled shutdown timeout message improvement

This PR improves the warning message by adding correct config details. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/umesh9794/kafka local

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2932.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2932


commit f2b35a8e56d83301a42a09fef61a9ba752acce70
Author: umesh9794 
Date:   2017-04-28T05:14:04Z

KAFKA-5137 : Controlled shutdown timeout message improvement




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4593) Task migration during rebalance callback process could lead the obsoleted task's IllegalStateException

2017-04-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988200#comment-15988200
 ] 

Matthias J. Sax commented on KAFKA-4593:


It is possible if A and B are on different machines. And re-reading both JIRAs, 
this one is different from the other. I guess it's a rare scenario but 
possible.

> Task migration during rebalance callback process could lead the obsoleted 
> task's IllegalStateException
> --
>
> Key: KAFKA-4593
> URL: https://issues.apache.org/jira/browse/KAFKA-4593
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>  Labels: infrastructure
>
> 1. Assume 2 running threads A and B, and one task t1, just for simplicity.
> 2. The first rebalance is triggered; task t1 is assigned to A (B has no 
> assigned task).
> 3. During the first rebalance callback, task t1's state store needs to be 
> restored on thread A, and this is called in "restoreActiveState" of 
> "createStreamTask".
> 4. Now suppose thread A has a long GC causing it to stall; a second rebalance 
> will then be triggered and kick A out of the group. B gets task t1 and 
> does the same restoration process; afterwards, thread B continues to 
> process data and update the state store, while at the same time writing more 
> messages to the changelog (so its log end offset has incremented).
> 5. After a while A resumes from the long GC; not knowing it has actually been 
> kicked out of the group and that task t1 is no longer owned by it, it 
> continues the restoration process but then realizes that the log end offset 
> has advanced. When this happens, we will see the following exception on 
> thread A:
> {code}
> java.lang.IllegalStateException: task XXX Log end offset of
> YYY-table_stream-changelog-ZZ should not change while
> restoring: old end offset .., current offset ..
> at
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.restoreActiveState(ProcessorStateManager.java:248)
> at
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.register(ProcessorStateManager.java:201)
> at
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.register(ProcessorContextImpl.java:122)
> at
> org.apache.kafka.streams.state.internals.RocksDBWindowStore.init(RocksDBWindowStore.java:200)
> at
> org.apache.kafka.streams.state.internals.MeteredWindowStore.init(MeteredWindowStore.java:65)
> at
> org.apache.kafka.streams.state.internals.CachingWindowStore.init(CachingWindowStore.java:65)
> at
> org.apache.kafka.streams.processor.internals.AbstractTask.initializeStateStores(AbstractTask.java:86)
> at
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:120)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:794)
> at
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1222)
> at
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1195)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:897)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:71)
> at
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:240)
> at
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:230)
> at
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:314)
> at
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:278)
> at
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:261)
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1039)
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1004)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:570)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:359)
> {code}
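A simplified sketch of the check that fires in step 5 (loosely modeled on {{ProcessorStateManager.restoreActiveState}}; {{restoreTo}} and the other names are illustrative):

{code}
// Record the changelog's end offset before restoring.
restoreConsumer.seekToEnd(Collections.singleton(changelogPartition));
long endOffset = restoreConsumer.position(changelogPartition);

restoreTo(endOffset); // replay changelog records into the local state store

// Re-check: if thread B took over the task during A's GC pause and kept
// appending to the changelog, the end offset has moved and A must give up.
restoreConsumer.seekToEnd(Collections.singleton(changelogPartition));
if (restoreConsumer.position(changelogPartition) != endOffset)
    throw new IllegalStateException(
        "Log end offset should not change while restoring");
{code}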



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-4593) Task migration during rebalance callback process could lead the obsoleted task's IllegalStateException

2017-04-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988200#comment-15988200
 ] 

Matthias J. Sax edited comment on KAFKA-4593 at 4/28/17 4:57 AM:
-

It is possible if A and B are on different machines. And re-reading both JIRAs, 
this one is different from the other. I guess it's a rare scenario but 
possible.


was (Author: mjsax):
It is possible of A and B are on different machines. And re-reading both JIRAs, 
this is a different one than the other. I guess it's a rare scenario but 
possible.

> Task migration during rebalance callback process could lead the obsoleted 
> task's IllegalStateException
> --
>
> Key: KAFKA-4593
> URL: https://issues.apache.org/jira/browse/KAFKA-4593
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>  Labels: infrastructure
>
> 1. Assume 2 running threads A and B, and one task t1, just for simplicity.
> 2. The first rebalance is triggered; task t1 is assigned to A (B has no 
> assigned task).
> 3. During the first rebalance callback, task t1's state store needs to be 
> restored on thread A, and this is called in "restoreActiveState" of 
> "createStreamTask".
> 4. Now suppose thread A has a long GC causing it to stall; a second rebalance 
> will then be triggered and kick A out of the group. B gets task t1 and 
> does the same restoration process; afterwards, thread B continues to 
> process data and update the state store, while at the same time writing more 
> messages to the changelog (so its log end offset has incremented).
> 5. After a while A resumes from the long GC; not knowing it has actually been 
> kicked out of the group and that task t1 is no longer owned by it, it 
> continues the restoration process but then realizes that the log end offset 
> has advanced. When this happens, we will see the following exception on 
> thread A:
> {code}
> java.lang.IllegalStateException: task XXX Log end offset of
> YYY-table_stream-changelog-ZZ should not change while
> restoring: old end offset .., current offset ..
> at
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.restoreActiveState(ProcessorStateManager.java:248)
> at
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.register(ProcessorStateManager.java:201)
> at
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.register(ProcessorContextImpl.java:122)
> at
> org.apache.kafka.streams.state.internals.RocksDBWindowStore.init(RocksDBWindowStore.java:200)
> at
> org.apache.kafka.streams.state.internals.MeteredWindowStore.init(MeteredWindowStore.java:65)
> at
> org.apache.kafka.streams.state.internals.CachingWindowStore.init(CachingWindowStore.java:65)
> at
> org.apache.kafka.streams.processor.internals.AbstractTask.initializeStateStores(AbstractTask.java:86)
> at
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:120)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:794)
> at
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1222)
> at
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1195)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:897)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:71)
> at
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:240)
> at
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:230)
> at
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:314)
> at
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:278)
> at
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:261)
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1039)
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1004)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:570)
> at
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:359)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5025) FetchRequestTest should use batches with more than one message

2017-04-27 Thread Umesh Chaudhary (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982953#comment-15982953
 ] 

Umesh Chaudhary edited comment on KAFKA-5025 at 4/28/17 4:53 AM:
-

[~ijuma], taking this one to start my contribution to the project. May I ask 
for some guidelines to start working on this? What should be considered in this 
restructuring? 


was (Author: umesh9...@gmail.com):
[~ijuma], taking this one to start my contribution to the project. May I ask 
some guidelines to start working on this ?

> FetchRequestTest should use batches with more than one message
> --
>
> Key: KAFKA-5025
> URL: https://issues.apache.org/jira/browse/KAFKA-5025
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Assignee: Umesh Chaudhary
> Fix For: 0.11.0.0
>
>
> As part of the message format changes for KIP-98, 
> FetchRequestTest.produceData was changed to always use record batches 
> containing a single message. We should restructure the test so that it's more 
> realistic. 
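For illustration, a sketch of producing multi-record batches in a test (the settings and names are illustrative, not the ticket's actual fix):

{code}
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer",
    "org.apache.kafka.common.serialization.ByteArraySerializer");
props.put("value.serializer",
    "org.apache.kafka.common.serialization.ByteArraySerializer");
props.put("linger.ms", "100");    // give batches time to fill up
props.put("batch.size", "65536"); // room for many records per batch

KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
byte[] payload = new byte[100];
for (int i = 0; i < 1000; i++)
    producer.send(new ProducerRecord<>("test-topic", 0, null, payload));
producer.flush(); // records sent together land in shared batches
{code}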



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5132) Abort long running transactions

2017-04-27 Thread Umesh Chaudhary (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988198#comment-15988198
 ] 

Umesh Chaudhary commented on KAFKA-5132:


Sure [~damianguy]. I can understand :)

> Abort long running transactions
> ---
>
> Key: KAFKA-5132
> URL: https://issues.apache.org/jira/browse/KAFKA-5132
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> We need to abort any transactions that have been running longer than the txn 
> timeout.
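In pseudocode terms, a sketch of the intended coordinator-side check (every name here is illustrative; the real logic belongs in the transaction coordinator):

{code}
// Periodically scan ongoing transactions and abort the expired ones.
for (TransactionMetadata txn : ongoingTransactions()) {
    if (time.milliseconds() - txn.startTimeMs > txn.timeoutMs) {
        // Write ABORT markers to the txn's partitions and transition its state.
        abortTransaction(txn);
    }
}
{code}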



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5070) org.apache.kafka.streams.errors.LockException: task [0_18] Failed to lock the state directory: /opt/rocksdb/pulse10/0_18

2017-04-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988196#comment-15988196
 ] 

Matthias J. Sax edited comment on KAFKA-5070 at 4/28/17 4:49 AM:
-

Not sure. It sounds like your locally cached application state got corrupted. 
You can try to delete the state directory and let Streams recreate it from the 
changelog.

Btw: your original stack trace only shows a WARN message (the logging is not good 
at this place and looks more severe than it is) -- on rebalance it's expected 
that a state directory might not be released (yet). In this case, we will just 
sleep/wait and try again (from the stack trace: {{at 
o.a.k.s.p.i.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)}}).

Thus, this would only be a problem if the warning does not go away -- do 
you see the stack trace over and over again, and it does not resolve itself?


was (Author: mjsax):
Not sure. It sounds like your locally cached application state got corrupted. 
You can try to delete the state directory and let Streams recreate it from the 
changelog.

Btw: your original stack trace only shows a WARN message (the logging is not good 
at this place and looks more severe than it is) -- on rebalance it's expected 
that a state directory might not be released (yet). In this case, we will just 
sleep/wait and try again (from the stack trace: {{at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)}}).

> org.apache.kafka.streams.errors.LockException: task [0_18] Failed to lock the 
> state directory: /opt/rocksdb/pulse10/0_18
> 
>
> Key: KAFKA-5070
> URL: https://issues.apache.org/jira/browse/KAFKA-5070
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
> Environment: Linux Version
>Reporter: Dhana
> Attachments: RocksDB_LockStateDirec.7z
>
>
> Notes: we run two instances of the consumer on two different machines/nodes.
> We have 400 partitions and 200 stream threads per consumer, with 2 consumers.
> We perform an HA test (on rebalance: shutdown of one of the consumers/brokers), 
> and we see this happening
> Error:
> 2017-04-05 11:36:09.352 WARN  StreamThread:1184 StreamThread-66 - Could not 
> create task 0_115. Will retry.
> org.apache.kafka.streams.errors.LockException: task [0_115] Failed to lock 
> the state directory: /opt/rocksdb/pulse10/0_115
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
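A sketch of the sleep-and-retry behavior described in the comment above (modeled on {{StreamThread$AbstractTaskCreator.retryWithBackoff}}; the names are simplified and illustrative):

{code}
void createTaskWithRetries(TaskId id, Collection<TopicPartition> partitions)
        throws InterruptedException {
    long backoffTimeMs = 50L;
    while (true) {
        try {
            createTask(id, partitions); // throws LockException -> WARN + retry
            return;
        } catch (LockException e) {
            // Expected transiently on rebalance: the previous owner has not
            // released the state directory lock yet. Wait and try again.
            log.warn("Could not create task {}. Will retry.", id, e);
            Thread.sleep(backoffTimeMs);
            backoffTimeMs *= 2;
        }
    }
}
{code}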



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5070) org.apache.kafka.streams.errors.LockException: task [0_18] Failed to lock the state directory: /opt/rocksdb/pulse10/0_18

2017-04-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988196#comment-15988196
 ] 

Matthias J. Sax commented on KAFKA-5070:


Not sure. It sounds like your locally cached application state got corrupted. 
You can try to delete the state directory and let Streams recreate it from the 
changelog.

Btw: your original stack trace only shows a WARN message (the logging is not good 
at this place and looks more severe than it is) -- on rebalance it's expected 
that a state directory might not be released (yet). In this case, we will just 
sleep/wait and try again (from the stack trace: {{at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)}}).

> org.apache.kafka.streams.errors.LockException: task [0_18] Failed to lock the 
> state directory: /opt/rocksdb/pulse10/0_18
> 
>
> Key: KAFKA-5070
> URL: https://issues.apache.org/jira/browse/KAFKA-5070
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
> Environment: Linux Version
>Reporter: Dhana
> Attachments: RocksDB_LockStateDirec.7z
>
>
> Notes: we run two instances of the consumer on two different machines/nodes.
> We have 400 partitions and 200 stream threads per consumer, with 2 consumers.
> We perform an HA test (on rebalance: shutdown of one of the consumers/brokers), 
> and we see this happening
> Error:
> 2017-04-05 11:36:09.352 WARN  StreamThread:1184 StreamThread-66 - Could not 
> create task 0_115. Will retry.
> org.apache.kafka.streams.errors.LockException: task [0_115] Failed to lock 
> the state directory: /opt/rocksdb/pulse10/0_115
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4144) Allow per stream/table timestamp extractor

2017-04-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988191#comment-15988191
 ] 

Matthias J. Sax commented on KAFKA-4144:


[~miguno] I am not sure what you have in mind for docs updates. Can you clarify?

> Allow per stream/table timestamp extractor
> --
>
> Key: KAFKA-4144
> URL: https://issues.apache.org/jira/browse/KAFKA-4144
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Elias Levy
>Assignee: Jeyhun Karimov
>  Labels: api, kip
> Fix For: 0.11.0.0
>
>
> At the moment the timestamp extractor is configured via a {{StreamConfig}} 
> value to {{KafkaStreams}}. That means you can only have a single timestamp 
> extractor per app, even though you may be joining multiple streams/tables 
> that require different timestamp extraction methods.
> You should be able to specify a timestamp extractor via 
> {{KStreamBuilder.stream()/table()}}, just like you can specify key and value 
> serdes that override the StreamConfig defaults.
> KIP: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=68714788
> Specifying a per-stream extractor should only be possible for sources, but 
> not for intermediate topics. For PAPI we cannot enforce this, but for DSL 
> {{through()}} should not allow the user to set a custom extractor. In 
> contrast, with regard to KAFKA-4785, it must internally set an extractor that 
> returns the record's metadata timestamp in order to overwrite the global 
> extractor from {{StreamsConfig}} (i.e., set 
> {{FailOnInvalidTimestampExtractor}}). This change should be done in 
> KAFKA-4785 though.
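A sketch of what the per-source override could look like under the KIP (the overload shown is the proposal's shape, not necessarily the final API; {{MyEvent}} and the serdes are illustrative, and the single-argument extractor matches the 0.10.x interface):

{code}
TimestampExtractor eventTime = new TimestampExtractor() {
    @Override
    public long extract(ConsumerRecord<Object, Object> record) {
        // Pull event time from the payload instead of the record metadata.
        return ((MyEvent) record.value()).eventTimeMs;
    }
};

KStreamBuilder builder = new KStreamBuilder();
// Proposed: per-source extractor, analogous to per-source serde overrides.
KStream<String, MyEvent> events =
    builder.stream(eventTime, stringSerde, eventSerde, "events-topic");
{code}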



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4772) Exploit #peek to implement #print() and other methods

2017-04-27 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988188#comment-15988188
 ] 

Matthias J. Sax commented on KAFKA-4772:


This JIRA is not about any public API changes (see KAFKA-4830). Any public API 
change requires a KIP, and we do have a KIP for the other JIRA: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-132%3A+Augment+KStream.print+to+allow+extra+parameters+in+the+printed+string

This PR is only about internal code refactoring. {{KStreamImpl#print()}} uses 
{{KeyValuePrinter}} as its processor. But with {{#peek()}} we don't need this 
processor anymore; we can simply call {{#peek()}} with an appropriate UDF.

Furthermore, the class {{KStreamForeach}} can be replaced by {{KStreamPeek}} if 
we add a flag to {{KStreamPeek}} controlling whether the data is forwarded or not.

Does this make sense?
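A sketch of the collapsed processor described above (simplified; the flag name is illustrative):

{code}
class KStreamPeek<K, V> implements ProcessorSupplier<K, V> {
    private final ForeachAction<K, V> action;
    private final boolean forwardDownStream; // true for peek/print, false for foreach

    KStreamPeek(final ForeachAction<K, V> action, final boolean forwardDownStream) {
        this.action = action;
        this.forwardDownStream = forwardDownStream;
    }

    @Override
    public Processor<K, V> get() {
        return new AbstractProcessor<K, V>() {
            @Override
            public void process(final K key, final V value) {
                action.apply(key, value);          // the side effect (print, foreach, ...)
                if (forwardDownStream)
                    context().forward(key, value); // peek keeps the record flowing
            }
        };
    }
}
{code}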

> Exploit #peek to implement #print() and other methods
> -
>
> Key: KAFKA-4772
> URL: https://issues.apache.org/jira/browse/KAFKA-4772
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: james chien
>Priority: Minor
>  Labels: beginner, newbie
>
> From: https://github.com/apache/kafka/pull/2493#pullrequestreview-22157555
> Things that I can think of:
> - print / writeAsText can be a special impl of peek; KStreamPrint etc. can be 
> removed.
> - consider collapsing KStreamPeek and KStreamForeach, with a flag parameter 
> indicating if the acted-on key-value pair should still be forwarded.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1459

2017-04-27 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-5005: IntegrationTestUtils to override consumer configs and 
reuse

[wangguoz] KAFKA-5111: Code cleanup and improved log4j on StreamThread and Task

[jason] KAFKA-4379; Follow-up to avoid sending to changelog while restoring

[wangguoz] MINOR: adding global store must ensure unique names

[ismael] Use zkUtils instead of zkClient in AdminUtils

--
[...truncated 2.08 MB...]

org.apache.kafka.common.security.scram.ScramMessagesTest > 
validClientFinalMessage STARTED

org.apache.kafka.common.security.scram.ScramMessagesTest > 
validClientFinalMessage PASSED

org.apache.kafka.common.security.scram.ScramMessagesTest > 
invalidServerFirstMessage STARTED

org.apache.kafka.common.security.scram.ScramMessagesTest > 
invalidServerFirstMessage PASSED

org.apache.kafka.common.security.scram.ScramMessagesTest > 
validServerFinalMessage STARTED

org.apache.kafka.common.security.scram.ScramMessagesTest > 
validServerFinalMessage PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryWithoutPasswordConfiguration STARTED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryWithoutPasswordConfiguration PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > testClientMode STARTED

org.apache.kafka.common.security.ssl.SslFactoryTest > testClientMode PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryConfiguration STARTED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryConfiguration PASSED

org.apache.kafka.common.security.kerberos.KerberosNameTest > testParse STARTED

org.apache.kafka.common.security.kerberos.KerberosNameTest > testParse PASSED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testPrincipalNameCanContainSeparator STARTED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testPrincipalNameCanContainSeparator PASSED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testEqualsAndHashCode STARTED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testEqualsAndHashCode PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameOverride STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameOverride PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingOptionValue 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingOptionValue PASSED

org.apache.kafka.common.security.JaasContextTest > testSingleOption STARTED

org.apache.kafka.common.security.JaasContextTest > testSingleOption PASSED

org.apache.kafka.common.security.JaasContextTest > 
testNumericOptionWithoutQuotes STARTED

org.apache.kafka.common.security.JaasContextTest > 
testNumericOptionWithoutQuotes PASSED

org.apache.kafka.common.security.JaasContextTest > testConfigNoOptions STARTED

org.apache.kafka.common.security.JaasContextTest > testConfigNoOptions PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithWrongListenerName STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithWrongListenerName PASSED

org.apache.kafka.common.security.JaasContextTest > testNumericOptionWithQuotes 
STARTED

org.apache.kafka.common.security.JaasContextTest > testNumericOptionWithQuotes 
PASSED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionValue STARTED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionValue PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingLoginModule 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingLoginModule PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingSemicolon STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingSemicolon PASSED

org.apache.kafka.common.security.JaasContextTest > testMultipleOptions STARTED

org.apache.kafka.common.security.JaasContextTest > testMultipleOptions PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForClientWithListenerName STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForClientWithListenerName PASSED

org.apache.kafka.common.security.JaasContextTest > testMultipleLoginModules 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMultipleLoginModules 
PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingControlFlag 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingControlFlag PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameAndFallback STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameAndFallback PASSED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionName STARTED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionName PASSED

org.apache.kafka

[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2017-04-27 Thread Arpan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988162#comment-15988162
 ] 

Arpan commented on KAFKA-4477:
--

Also, not sure why I am unable to find this bug referenced in the release notes.

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Fix For: 0.10.1.1
>
> Attachments: 2016_12_15.zip, issue_node_1001_ext.log, 
> issue_node_1001.log, issue_node_1002_ext.log, issue_node_1002.log, 
> issue_node_1003_ext.log, issue_node_1003.log, kafka.jstack, 
> state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has re-occurred in different 
> physical environments. We haven't worked out what is going on. We do, though, 
> have a nasty workaround to keep the service alive. 
> We have not had this issue on clusters still running 0.9.0.1.
> We have noticed a node randomly shrinking the ISRs for the partitions it owns 
> down to itself; moments later we see other nodes having disconnects, 
> followed finally by app issues, where producing to these partitions is 
> blocked.
> It seems only restarting the Kafka instance's Java process resolves the 
> issues.
> We have had this occur multiple times, and from all network and machine 
> monitoring the machine never left the network or had any other glitches.
> Below are seen logs from the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing 
> number of CLOSE_WAITs and open file descriptors.
> As a workaround to keep the service up, we are currently putting in an 
> automated process that tails the logs and matches the regex below; where 
> new_partitions is just the node itself, we restart the node. 
> "\[(?P.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P.+) to (?P.+) 
> \(kafka.cluster.Partition\)"
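A sketch of that tail-and-restart check in Java (the named groups in the archived regex were lost, so the group names below are assumptions, as are {{logLine}}, {{brokerId}}, and {{restartBroker}}):

{code}
Pattern shrinkToSelf = Pattern.compile(
    "\\[(?<date>.+)\\] INFO Partition \\[.*\\] on broker .* Shrinking ISR for "
        + "partition \\[.*\\] from (?<oldPartitions>.+) to (?<newPartitions>.+) "
        + "\\(kafka\\.cluster\\.Partition\\)");

Matcher m = shrinkToSelf.matcher(logLine);
if (m.matches() && m.group("newPartitions").equals(brokerId)) {
    restartBroker(); // the node shrank the ISR for a partition down to itself
}
{code}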



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2017-04-27 Thread Arpan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988159#comment-15988159
 ] 

Arpan commented on KAFKA-4477:
--

Hi Apurva, I shall send you the stack trace again, but the behavior is exactly 
the same - we see this behavior almost every week. I haven't observed the file 
descriptor count during the issue.

It also gets resolved after restarts of the nodes. We even observed missing 
offsets in the consumer at the time of the issue.

We restarted it 2-3 days ago; I shall take thread dumps, observe the file 
descriptor count, and let you know today.

Regards,
Arpan Khagram

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Fix For: 0.10.1.1
>
> Attachments: 2016_12_15.zip, issue_node_1001_ext.log, 
> issue_node_1001.log, issue_node_1002_ext.log, issue_node_1002.log, 
> issue_node_1003_ext.log, issue_node_1003.log, kafka.jstack, 
> state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has re-occurred in different 
> physical environments. We haven't worked out what is going on. We do, though, 
> have a nasty workaround to keep the service alive. 
> We have not had this issue on clusters still running 0.9.0.1.
> We have noticed a node randomly shrinking the ISRs for the partitions it owns 
> down to itself; moments later we see other nodes having disconnects, 
> followed finally by app issues, where producing to these partitions is 
> blocked.
> It seems only restarting the Kafka instance's Java process resolves the 
> issues.
> We have had this occur multiple times, and from all network and machine 
> monitoring the machine never left the network or had any other glitches.
> Below are seen logs from the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing 
> number of CLOSE_WAITs and open file descriptors.
> As a workaround to keep the service up, we are currently putting in an 
> automated process that tails the logs and matches the regex below; where 
> new_partitions is just the node itself, we restart the node. 
> "\[(?P.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P.+) to (?P.+) 
> \(kafka.cluster.Partition\)"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5140) Flaky ResetIntegrationTest

2017-04-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5140:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Flaky ResetIntegrationTest
> --
>
> Key: KAFKA-5140
> URL: https://issues.apache.org/jira/browse/KAFKA-5140
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0
>
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3), 
> KeyValue(2986681642175, 3), KeyValue(2986681642195, 2), 
> KeyValue(2986681642155, 4), KeyValue(2986681642175, 3), 
> KeyValue(2986681642195, 3)]>
>  but: was <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5063) Flaky ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic

2017-04-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5063:
---
Status: Patch Available  (was: Open)

https://github.com/apache/kafka/pull/2931

> Flaky 
> ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic
> -
>
> Key: KAFKA-5063
> URL: https://issues.apache.org/jira/browse/KAFKA-5063
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2983939126775, 1), KeyValue(2983939126815, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 1), KeyValue(2983939126855, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 2), KeyValue(2983939126835, 2), 
> KeyValue(2983939126855, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126835, 2), KeyValue(2983939126855, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126855, 2), KeyValue(2983939126875, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126915, 1), 
> KeyValue(2983939126875, 2), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 1), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 3), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4), KeyValue(2983939126895, 4), 
> KeyValue(2983939126915, 3), KeyValue(2983939126935, 3)]>
>  but: was <[KeyValue(2983939126775, 1), KeyValue(2983939126815, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 1), KeyValue(2983939126855, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 2), KeyValue(2983939126835, 2), 
> KeyValue(2983939126855, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126835, 2), KeyValue(2983939126855, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126855, 2), KeyValue(2983939126875, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126915, 1), 
> KeyValue(2983939126875, 2), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 1), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5063) Flaky ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic

2017-04-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-5063:
--

Assignee: Matthias J. Sax

> Flaky 
> ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic
> -
>
> Key: KAFKA-5063
> URL: https://issues.apache.org/jira/browse/KAFKA-5063
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2983939126775, 1), KeyValue(2983939126815, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 1), KeyValue(2983939126855, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 2), KeyValue(2983939126835, 2), 
> KeyValue(2983939126855, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126835, 2), KeyValue(2983939126855, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126855, 2), KeyValue(2983939126875, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126915, 1), 
> KeyValue(2983939126875, 2), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 1), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 3), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4), KeyValue(2983939126895, 4), 
> KeyValue(2983939126915, 3), KeyValue(2983939126935, 3)]>
>  but: was <[KeyValue(2983939126775, 1), KeyValue(2983939126815, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 1), KeyValue(2983939126855, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 2), KeyValue(2983939126835, 2), 
> KeyValue(2983939126855, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126835, 2), KeyValue(2983939126855, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126855, 2), KeyValue(2983939126875, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126915, 1), 
> KeyValue(2983939126875, 2), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 1), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5140) Flaky ResetIntegrationTest

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988155#comment-15988155
 ] 

ASF GitHub Bot commented on KAFKA-5140:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2931

KAFKA-5140: Flaky ResetIntegrationTest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka 
kafka-5140-flaky-reset-integration-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2931.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2931


commit 62b9bd6e72f28c78f6cd0f7f5d72ad38e97065e6
Author: Matthias J. Sax 
Date:   2017-04-28T03:57:26Z

KAFKA-5140: Flaky ResetIntegrationTest




> Flaky ResetIntegrationTest
> --
>
> Key: KAFKA-5140
> URL: https://issues.apache.org/jira/browse/KAFKA-5140
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0
>
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3), 
> KeyValue(2986681642175, 3), KeyValue(2986681642195, 2), 
> KeyValue(2986681642155, 4), KeyValue(2986681642175, 3), 
> KeyValue(2986681642195, 3)]>
>  but: was <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5140) Flaky ResetIntegrationTest

2017-04-27 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5140:
---
Status: Patch Available  (was: Open)

> Flaky ResetIntegrationTest
> --
>
> Key: KAFKA-5140
> URL: https://issues.apache.org/jira/browse/KAFKA-5140
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0
>
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3), 
> KeyValue(2986681642175, 3), KeyValue(2986681642195, 2), 
> KeyValue(2986681642155, 4), KeyValue(2986681642175, 3), 
> KeyValue(2986681642195, 3)]>
>  but: was <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2931: KAFKA-5140: Flaky ResetIntegrationTest

2017-04-27 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2931

KAFKA-5140: Flaky ResetIntegrationTest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka 
kafka-5140-flaky-reset-integration-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2931.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2931


commit 62b9bd6e72f28c78f6cd0f7f5d72ad38e97065e6
Author: Matthias J. Sax 
Date:   2017-04-28T03:57:26Z

KAFKA-5140: Flaky ResetIntegrationTest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5140) Flaky ResetIntegrationTest

2017-04-27 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-5140:
--

 Summary: Flaky ResetIntegrationTest
 Key: KAFKA-5140
 URL: https://issues.apache.org/jira/browse/KAFKA-5140
 Project: Kafka
  Issue Type: Bug
  Components: streams, unit tests
Affects Versions: 0.10.2.0
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
 Fix For: 0.11.0.0


{noformat}
org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
java.lang.AssertionError: 
Expected: <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), KeyValue(2986681642095, 
1), KeyValue(2986681642055, 1), KeyValue(2986681642115, 1), 
KeyValue(2986681642075, 1), KeyValue(2986681642075, 2), KeyValue(2986681642095, 
2), KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), KeyValue(2986681642155, 
1), KeyValue(2986681642135, 1), KeyValue(2986681642115, 2), 
KeyValue(2986681642135, 2), KeyValue(2986681642155, 1), KeyValue(2986681642175, 
1), KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), KeyValue(2986681642135, 
3), KeyValue(2986681642155, 2), KeyValue(2986681642175, 2), 
KeyValue(2986681642195, 1), KeyValue(2986681642155, 3), KeyValue(2986681642175, 
2), KeyValue(2986681642195, 2), KeyValue(2986681642155, 3), 
KeyValue(2986681642175, 3), KeyValue(2986681642195, 2), KeyValue(2986681642155, 
4), KeyValue(2986681642175, 3), KeyValue(2986681642195, 3)]>
 but: was <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), KeyValue(2986681642095, 
1), KeyValue(2986681642055, 1), KeyValue(2986681642115, 1), 
KeyValue(2986681642075, 1), KeyValue(2986681642075, 2), KeyValue(2986681642095, 
2), KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), KeyValue(2986681642155, 
1), KeyValue(2986681642135, 1), KeyValue(2986681642115, 2), 
KeyValue(2986681642135, 2), KeyValue(2986681642155, 1), KeyValue(2986681642175, 
1), KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), KeyValue(2986681642135, 
3), KeyValue(2986681642155, 2), KeyValue(2986681642175, 2), 
KeyValue(2986681642195, 1), KeyValue(2986681642155, 3), KeyValue(2986681642175, 
2), KeyValue(2986681642195, 2), KeyValue(2986681642155, 3)]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


new contributor

2017-04-27 Thread Bryan Karlovitz
Hello,

My name is Bryan. I'd like to be added to the contributor list and, if it
isn't claimed already, I'd like to work on KAFKA-5137.

Thanks,
Bryan


Jenkins build is back to normal : kafka-trunk-jdk7 #2127

2017-04-27 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : kafka-trunk-jdk8 #1458

2017-04-27 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk7 #2126

2017-04-27 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5119; Ensure global metrics are empty before running

[ismael] KAFKA-5101; Remove unused zkClient parameter in 
incrementControllerEpoch

[wangguoz] KAFKA-5005: IntegrationTestUtils to override consumer configs and 
reuse

[wangguoz] KAFKA-5111: Code cleanup and improved log4j on StreamThread and Task

[jason] KAFKA-4379; Follow-up to avoid sending to changelog while restoring

--
[...truncated 1.62 MB...]

kafka.log.LogTest > testParseTopicPartitionNameForMissingPartition STARTED

kafka.log.LogTest > testParseTopicPartitionNameForMissingPartition PASSED

kafka.log.LogTest > testParseTopicPartitionNameForEmptyName STARTED

kafka.log.LogTest > testParseTopicPartitionNameForEmptyName PASSED

kafka.log.LogTest > testOpenDeletesObsoleteFiles STARTED

kafka.log.LogTest > testOpenDeletesObsoleteFiles PASSED

kafka.log.LogTest > testUpdatePidMapWithCompactedData STARTED

kafka.log.LogTest > testUpdatePidMapWithCompactedData PASSED

kafka.log.LogTest > shouldUpdateOffsetForLeaderEpochsWhenDeletingSegments 
STARTED

kafka.log.LogTest > shouldUpdateOffsetForLeaderEpochsWhenDeletingSegments PASSED

kafka.log.LogTest > testPeriodicPidSnapshot STARTED

kafka.log.LogTest > testPeriodicPidSnapshot PASSED

kafka.log.LogTest > testRebuildTimeIndexForOldMessages STARTED

kafka.log.LogTest > testRebuildTimeIndexForOldMessages PASSED

kafka.log.LogTest > testLogRecoversForLeaderEpoch STARTED

kafka.log.LogTest > testLogRecoversForLeaderEpoch PASSED

kafka.log.LogTest > testSizeBasedLogRoll STARTED

kafka.log.LogTest > testSizeBasedLogRoll PASSED

kafka.log.LogTest > shouldNotDeleteSizeBasedSegmentsWhenUnderRetentionSize 
STARTED

kafka.log.LogTest > shouldNotDeleteSizeBasedSegmentsWhenUnderRetentionSize 
PASSED

kafka.log.LogTest > testTimeBasedLogRollJitter STARTED

kafka.log.LogTest > testTimeBasedLogRollJitter PASSED

kafka.log.LogTest > testParseTopicPartitionName STARTED

kafka.log.LogTest > testParseTopicPartitionName PASSED

kafka.log.LogTest > testPidMapTruncateTo STARTED

kafka.log.LogTest > testPidMapTruncateTo PASSED

kafka.log.LogTest > testTruncateTo STARTED

kafka.log.LogTest > testTruncateTo PASSED

kafka.log.LogTest > shouldApplyEpochToMessageOnAppendIfLeader STARTED

kafka.log.LogTest > shouldApplyEpochToMessageOnAppendIfLeader PASSED

kafka.log.LogTest > testCleanShutdownFile STARTED

kafka.log.LogTest > testCleanShutdownFile PASSED

kafka.log.LogTest > testPidExpirationOnSegmentDeletion STARTED

kafka.log.LogTest > testPidExpirationOnSegmentDeletion PASSED

kafka.log.LogTest > testBuildTimeIndexWhenNotAssigningOffsets STARTED

kafka.log.LogTest > testBuildTimeIndexWhenNotAssigningOffsets PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSid

[jira] [Commented] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988049#comment-15988049
 ] 

Jason Gustafson commented on KAFKA-4801:


I submitted a patch to make the expectation that the given topic partition is 
assigned explicit: https://github.com/apache/kafka/pull/2930. I expect the new 
assertion to now fail instead of the call to {{position}}.
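
In rough Java terms, the idea looks like this (a hedged sketch only; the actual test is ConsumerBounceTest.scala and the class/helper names here are illustrative):

{code}
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import static org.junit.Assert.assertTrue;

class AssignmentCheckSketch {
    static void pollAndCheckPosition(KafkaConsumer<byte[], byte[]> consumer,
                                     TopicPartition tp) {
        consumer.poll(100);
        // Make the implicit expectation explicit: tp must still be assigned,
        // so a lost assignment fails with a clear assertion message instead
        // of an IllegalArgumentException from position().
        assertTrue("Partition " + tp + " is no longer assigned to this consumer",
                   consumer.assignment().contains(tp));
        long position = consumer.position(tp); // safe: assignment just verified
        assertTrue("Position should be non-negative", position >= 0);
    }
}
{code}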

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Assignee: Armin Braun
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-4772) Exploit #peek to implement #print() and other methods

2017-04-27 Thread james chien (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988025#comment-15988025
 ] 

james chien edited comment on KAFKA-4772 at 4/28/17 1:51 AM:
-

I want to clarify the problem; as I understand it, we want to do two things.

1. Deprecate "KStream.print()" now that "peek()" is implemented.
(like -> 
https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#peek-java.util.function.Consumer-)

2. Add a flag to "peek()" so the user can decide whether data is still forwarded or 
not (e.g. peek(xxx, true) or peek(xxx, false)); see the sketch below.
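
A rough Java sketch of both ideas, assuming the peek() from PR #2493; the boolean-flag 
overload is hypothetical and not part of any released API:

{code}
import org.apache.kafka.streams.kstream.ForeachAction;
import org.apache.kafka.streams.kstream.KStream;

class PeekIdeas {
    // (1) print() as a special case of peek(): print as a side effect,
    // records keep flowing downstream.
    static <K, V> KStream<K, V> printViaPeek(KStream<K, V> stream, String name) {
        return stream.peek((key, value) ->
                System.out.println("[" + name + "]: " + key + " , " + value));
    }

    // (2) Hypothetical flag-controlled variant: forward == true behaves like
    // peek(), forward == false behaves like the terminal foreach().
    static <K, V> void peekOrForeach(KStream<K, V> stream,
                                     ForeachAction<? super K, ? super V> action,
                                     boolean forward) {
        if (forward) {
            stream.peek(action);    // side effect, records keep flowing
        } else {
            stream.foreach(action); // side effect, stream is terminated
        }
    }
}
{code}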


was (Author: james.c):
I want to clarify the problem, as I know is want to do two things.
1. deprecated function "KStream.print()" due to implement "peek()".
2. add flag in "peek()" to let user decide whether data still be forwarded or 
not (eg. peek(xxx, true)  or peek(xxx, false) )

> Exploit #peek to implement #print() and other methods
> -
>
> Key: KAFKA-4772
> URL: https://issues.apache.org/jira/browse/KAFKA-4772
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: james chien
>Priority: Minor
>  Labels: beginner, newbie
>
> From: https://github.com/apache/kafka/pull/2493#pullrequestreview-22157555
> Things that I can think of:
> - print / writeAsText can be a special impl of peek; KStreamPrint etc. can be 
> removed.
> - consider collapsing KStreamPeek with KStreamForeach, with a flag parameter 
> indicating if the acted-on key-value pair should still be forwarded.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2930: MINOR: Make assignment expectation explicit in tes...

2017-04-27 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/2930

MINOR: Make assignment expectation explicit in 
testConsumptionWithBrokerFailures



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka bouncetest-assert-assignment

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2930.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2930


commit 2fe1219bdbce78e9d4e0ef1153ba7079d5370597
Author: Jason Gustafson 
Date:   2017-04-28T01:48:42Z

MINOR: Make assignment expectation explicit in 
testConsumptionWithBrokerFailures




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (KAFKA-4772) Exploit #peek to implement #print() and other methods

2017-04-27 Thread james chien (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988025#comment-15988025
 ] 

james chien edited comment on KAFKA-4772 at 4/28/17 1:49 AM:
-

I want to clarify the problem, as I know is want to do two things.
1. deprecated function "KStream.print()" due to implement "peek()".
2. add flag in "peek()" to let user decide whether data still be forwarded or 
not (eg. peek(xxx, true)  or peek(xxx, false) )


was (Author: james.c):
I want to clarify the problem, as I know is want to do two things.
1. deprecated function "KStream.print()" due to implement "peak()".
2. add flag in "peak()" to let user decide whether data still be forwarded or 
not (eg. peak(xxx, true)  or peak(xxx, false) )

> Exploit #peek to implement #print() and other methods
> -
>
> Key: KAFKA-4772
> URL: https://issues.apache.org/jira/browse/KAFKA-4772
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: james chien
>Priority: Minor
>  Labels: beginner, newbie
>
> From: https://github.com/apache/kafka/pull/2493#pullrequestreview-22157555
> Things that I can think of:
> - print / writeAsText can be a special impl of peek; KStreamPrint etc. can be 
> removed.
> - consider collapsing KStreamPeek with KStreamForeach, with a flag parameter 
> indicating if the acted-on key-value pair should still be forwarded.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-4772) Exploit #peek to implement #print() and other methods

2017-04-27 Thread james chien (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988025#comment-15988025
 ] 

james chien edited comment on KAFKA-4772 at 4/28/17 1:47 AM:
-

I want to clarify the problem, as I know is want to do two things.
1. deprecated function "KStream.print()" due to implement "peak()".
2. add flag in "peak()" to let user decide whether data still be forwarded or 
not (eg. peak(xxx, true)  or peak(xxx, false) )


was (Author: james.c):
I want to clarify the problem, as I know is want to do two things.
1. deprecated function "KStream.print()" due to we have "peak()".
2. add flag in "peak()" to let user decide whether data still be forwarded or 
not (eg. peak(xxx, true)  or peak(xxx, false) )

> Exploit #peek to implement #print() and other methods
> -
>
> Key: KAFKA-4772
> URL: https://issues.apache.org/jira/browse/KAFKA-4772
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: james chien
>Priority: Minor
>  Labels: beginner, newbie
>
> From: https://github.com/apache/kafka/pull/2493#pullrequestreview-22157555
> Things that I can think of:
> - print / writeAsText can be a special impl of peek; KStreamPrint etc. can be 
> removed.
> - consider collapsing KStreamPeek with KStreamForeach, with a flag parameter 
> indicating if the acted-on key-value pair should still be forwarded.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4830) Augment KStream.print() to allow users pass in extra parameters in the printed string

2017-04-27 Thread james chien (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988028#comment-15988028
 ] 

james chien commented on KAFKA-4830:


Yes, I want to work on that!

> Augment KStream.print() to allow users pass in extra parameters in the 
> printed string
> -
>
> Key: KAFKA-4830
> URL: https://issues.apache.org/jira/browse/KAFKA-4830
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>  Labels: needs-kip, newbie
>
> Today {{KStream.print}} uses the hard-coded result string:
> {code}
> "[" + this.streamName + "]: " + keyToPrint + " , " + valueToPrint
> {code}
> Some users are asking to augment this so that they can customize the 
> output string via {{KStream.print(KeyValueMapper)}}:
> {code}
> "[" + this.streamName + "]: " + mapper.apply(keyToPrint, valueToPrint)
> {code}
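
For illustration, a hedged sketch of the request above; the no-arg print() is real, 
while print(KeyValueMapper) is the hypothetical overload being asked for:

{code}
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KeyValueMapper;

class CustomPrintSketch {
    static void example(KStream<String, Long> stream) {
        stream.print(); // today: fixed "[streamName]: key , value" output

        // The requested overload, stream.print(mapper), is hypothetical; the
        // mapper below shows the kind of output users want to control.
        KeyValueMapper<String, Long, String> mapper =
                (key, value) -> key + " -> " + value + " views";
        // proposed output per record: "[streamName]: " + mapper.apply(key, value)
        System.out.println("[streamName]: " + mapper.apply("page1", 42L));
    }
}
{code}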



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-4772) Exploit #peek to implement #print() and other methods

2017-04-27 Thread james chien (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james chien reassigned KAFKA-4772:
--

Assignee: james chien

> Exploit #peek to implement #print() and other methods
> -
>
> Key: KAFKA-4772
> URL: https://issues.apache.org/jira/browse/KAFKA-4772
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: james chien
>Priority: Minor
>  Labels: beginner, newbie
>
> From: https://github.com/apache/kafka/pull/2493#pullrequestreview-22157555
> Things that I can think of:
> - print / writeAsText can be a special impl of peek; KStreamPrint etc. can be 
> removed.
> - consider collapsing KStreamPeek with KStreamForeach, with a flag parameter 
> indicating if the acted-on key-value pair should still be forwarded.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4772) Exploit #peek to implement #print() and other methods

2017-04-27 Thread james chien (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988025#comment-15988025
 ] 

james chien commented on KAFKA-4772:


I want to clarify the problem, as I know is want to do two things.
1. deprecated function "KStream.print()" due to we have "peak()".
2. add flag in "peak()" to let user decide whether data still be forwarded or 
not (eg. peak(xxx, true)  or peak(xxx, false) )

> Exploit #peek to implement #print() and other methods
> -
>
> Key: KAFKA-4772
> URL: https://issues.apache.org/jira/browse/KAFKA-4772
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Minor
>  Labels: beginner, newbie
>
> From: https://github.com/apache/kafka/pull/2493#pullrequestreview-22157555
> Things that I can think of:
> - print / writeAsText can be a special impl of peek; KStreamPrint etc. can be 
> removed.
> - consider collapsing KStreamPeek with KStreamForeach, with a flag parameter 
> indicating if the acted-on key-value pair should still be forwarded.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988003#comment-15988003
 ] 

Jason Gustafson commented on KAFKA-4801:


[~original-brownbear] I couldn't reproduce the failure locally either. At a 
glance, the test case doesn't look totally safe since we don't actually ensure 
that the partition was assigned to the consumer before calling {{position}}. A 
simple way to make the test more resilient would be to only execute the 
{{commitSync}} block if the call to {{poll}} returned a non-empty record set. 
That said, it's unclear in what situation we would lose the partition 
assignment, so it would be good to understand that before patching the test 
case if possible.
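
In rough Java terms, the suggested guard would look like this (a hedged sketch with 
illustrative names; the real test is Scala):

{code}
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

class ResilientConsumeSketch {
    static void consumeOnce(KafkaConsumer<byte[], byte[]> consumer) {
        ConsumerRecords<byte[], byte[]> records = consumer.poll(100);
        // Only commit when poll() actually returned data; an empty result
        // gives no guarantee that any partition is currently assigned.
        if (!records.isEmpty()) {
            consumer.commitSync();
        }
    }
}
{code}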

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Assignee: Armin Braun
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2916: WIP: Avoid FileInputStream and FileOutputStream

2017-04-27 Thread ijuma
Github user ijuma closed the pull request at:

https://github.com/apache/kafka/pull/2916


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5103) Refactor AdminUtils to use zkUtils methods instead of zkUtils.zkClient

2017-04-27 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5103.

Resolution: Fixed

> Refactor AdminUtils to use zkUtils methods instead of zkUtils.zkClient
> -
>
> Key: KAFKA-5103
> URL: https://issues.apache.org/jira/browse/KAFKA-5103
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Balint Molnar
>Assignee: Balint Molnar
>
> Replace zkUtils.zkClient.createPersistentSequential(seqNode, content) with 
> zkUtils.createSequentialPersistentPath(seqNode, content).
> The zkClient variant does not respect the ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5103) Refactor AdminUtils to use zkUtils methods instead of zkUtils.zkClient

2017-04-27 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5103:
---
Fix Version/s: 0.11.0.0

> Refactor AdminUtils to use zkUtils methods instead of zkUtils.zkClient
> -
>
> Key: KAFKA-5103
> URL: https://issues.apache.org/jira/browse/KAFKA-5103
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Balint Molnar
>Assignee: Balint Molnar
> Fix For: 0.11.0.0
>
>
> Replace zkUtils.zkClient.createPersistentSequential(seqNode, content) with 
> zkUtils.createSequentialPersistentPath(seqNode, content).
> The zkClient variant does not respect the ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5103) Refactor AdminUtils to use zkUtils methods instead of zkUtils.zkClient

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987984#comment-15987984
 ] 

ASF GitHub Bot commented on KAFKA-5103:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2888


> Refactor AdminUtils to use zkUtils methods instead of zkUtils.zkClient
> -
>
> Key: KAFKA-5103
> URL: https://issues.apache.org/jira/browse/KAFKA-5103
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Balint Molnar
>Assignee: Balint Molnar
>
> Replace zkUtils.zkClient.createPersistentSequential(seqNode, content) with 
> zkUtils.createSequentialPersistentPath(seqNode, content).
> The zkClient variant does not respect the ACLs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2888: KAFKA-5103 Refactor AdminUtils to use zkUtils meth...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2888


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2909: MINOR: adding global store must ensure unique name...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2909


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4379) Remove caching of dirty and removed keys from StoreChangeLogger

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987946#comment-15987946
 ] 

ASF GitHub Bot commented on KAFKA-4379:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2908


> Remove caching of dirty and removed keys from StoreChangeLogger
> ---
>
> Key: KAFKA-4379
> URL: https://issues.apache.org/jira/browse/KAFKA-4379
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Damian Guy
>Assignee: Damian Guy
>Priority: Minor
> Fix For: 0.10.1.1, 0.10.2.0
>
>
> The StoreChangeLogger currently keeps a cache of dirty and removed keys and 
> will batch the changelog records such that we don't send a record for each 
> update. However, with KIP-63 this is unnecessary as the batching and 
> de-duping is done by the caching layer. Further, the StoreChangeLogger relies 
> on context.timestamp(), which is likely to be incorrect when caching is enabled.
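
As a rough sketch of what the simplified logger could look like, assuming a stand-in 
writer interface rather than Kafka's actual RecordCollector API:

{code}
// ChangelogWriter is a hypothetical stand-in, not a Kafka interface.
interface ChangelogWriter<K, V> {
    void send(String topic, K key, V value, long timestamp);
}

class SimpleStoreChangeLogger<K, V> {
    private final String topic;
    private final ChangelogWriter<K, V> writer;

    SimpleStoreChangeLogger(String topic, ChangelogWriter<K, V> writer) {
        this.topic = topic;
        this.writer = writer;
    }

    // No dirty/removed-key cache and no batching: every update is forwarded
    // immediately (the caching layer above already de-dupes), and the caller
    // passes the record timestamp instead of relying on context.timestamp().
    void logChange(K key, V value, long timestamp) {
        writer.send(topic, key, value, timestamp);
    }
}
{code}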



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2908: KAFKA-4379 Followup: Avoid sending to changelog wh...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2908


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-trunk-jdk7 #2125

2017-04-27 Thread Apache Jenkins Server
See 




[jira] [Updated] (KAFKA-4953) Global Store: cast exception when initialising with in-memory logged state store

2017-04-27 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4953:
-
Labels: user-experience  (was: )

> Global Store: cast exception when initialising with in-memory logged state 
> store
> 
>
> Key: KAFKA-4953
> URL: https://issues.apache.org/jira/browse/KAFKA-4953
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Yennick Trevels
>  Labels: user-experience
>
> Currently it is not possible to initialise a global store with an in-memory 
> *logged* store via the TopologyBuilder. This results in the following 
> exception:
> {code}
> java.lang.ClassCastException: 
> org.apache.kafka.streams.processor.internals.GlobalProcessorContextImpl 
> cannot be cast to 
> org.apache.kafka.streams.processor.internals.RecordCollector$Supplier
>   at 
> org.apache.kafka.streams.state.internals.StoreChangeLogger.(StoreChangeLogger.java:52)
>   at 
> org.apache.kafka.streams.state.internals.StoreChangeLogger.(StoreChangeLogger.java:44)
>   at 
> org.apache.kafka.streams.state.internals.InMemoryKeyValueLoggedStore.init(InMemoryKeyValueLoggedStore.java:56)
>   at 
> org.apache.kafka.streams.state.internals.MeteredKeyValueStore$7.run(MeteredKeyValueStore.java:99)
>   at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:187)
>   at 
> org.apache.kafka.streams.state.internals.MeteredKeyValueStore.init(MeteredKeyValueStore.java:130)
>   at 
> org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl.initialize(GlobalStateManagerImpl.java:97)
>   at 
> org.apache.kafka.streams.processor.internals.GlobalStateUpdateTask.initialize(GlobalStateUpdateTask.java:61)
>   at 
> org.apache.kafka.test.ProcessorTopologyTestDriver.(ProcessorTopologyTestDriver.java:215)
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorTopologyTest.shouldDriveInMemoryLoggedGlobalStore(ProcessorTopologyTest.java:235)
>   ...
> {code}
> I've created a PR which includes a unit test to verify this behavior.
> If the below PR gets merged, the fixing PR should leverage the provided test 
> {{ProcessorTopologyTest#shouldDriveInMemoryLoggedGlobalStore}} by removing 
> the {{@Ignore}} annotation.
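
One possible shape of a guard, sketched here as an assumption rather than the actual 
fix (the Supplier interface is inferred from the stack trace above):

{code}
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.internals.RecordCollector;

class CollectorGuardSketch {
    // Check whether the context can supply a RecordCollector instead of
    // casting blindly, so the global context (which has no collector)
    // fails gracefully rather than with a ClassCastException.
    static RecordCollector collectorOrNull(ProcessorContext context) {
        if (context instanceof RecordCollector.Supplier) {
            return ((RecordCollector.Supplier) context).recordCollector();
        }
        return null; // e.g. GlobalProcessorContextImpl: skip change logging
    }
}
{code}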



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5111) Improve internal Task APIs

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987931#comment-15987931
 ] 

ASF GitHub Bot commented on KAFKA-5111:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2917


> Improve internal Task APIs
> --
>
> Key: KAFKA-5111
> URL: https://issues.apache.org/jira/browse/KAFKA-5111
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0
>
>
> Currently, the internal interface for tasks is not very clean and it's hard 
> to reason about the control flow when tasks get closed, suspended, resumed, 
> etc. This makes exception handling particularly hard.
> We want to refactor this part of the code to get a clean control flow and 
> interface.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2917: KAFKA-5111: code cleanup follow up

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2917


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Armin Braun (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987910#comment-15987910
 ] 

Armin Braun commented on KAFKA-4801:


Will try to reproduce again tomorrow ...

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Assignee: Armin Braun
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Armin Braun (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Armin Braun reassigned KAFKA-4801:
--

Assignee: Armin Braun

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Assignee: Armin Braun
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5005) JoinIntegrationTest fails occasionally

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987905#comment-15987905
 ] 

ASF GitHub Bot commented on KAFKA-5005:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2920


> JoinIntegrationTest fails occasionally
> --
>
> Key: KAFKA-5005
> URL: https://issues.apache.org/jira/browse/KAFKA-5005
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Armin Braun
> Fix For: 0.11.0.0
>
>
> testLeftKStreamKStream:
> {noformat}
> java.lang.AssertionError: Condition not met within timeout 3. Expecting 1 
> records from topic outputTopic while only received 0: []
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:265)
> at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:247)
> at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.checkResult(JoinIntegrationTest.java:170)
> at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.runTest(JoinIntegrationTest.java:192)
> at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.testLeftKStreamKStream(JoinIntegrationTest.java:250)
> {noformat}
> testInnerKStreamKTable:
> {noformat}
> java.lang.AssertionError: Condition not met within timeout 3. Expecting 1 
> records from topic outputTopic while only received 0: []
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:265)
>   at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:248)
>   at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.checkResult(JoinIntegrationTest.java:171)
>   at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.runTest(JoinIntegrationTest.java:193)
>   at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.testInnerKStreamKTable(JoinIntegrationTest.java:305)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2920: KAFKA-5005: IntegrationTestUtils Consumers need to...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2920


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5005) JoinIntegrationTest fails occasionally

2017-04-27 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-5005:
-
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2920
[https://github.com/apache/kafka/pull/2920]

> JoinIntegrationTest fails occasionally
> --
>
> Key: KAFKA-5005
> URL: https://issues.apache.org/jira/browse/KAFKA-5005
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Armin Braun
> Fix For: 0.11.0.0
>
>
> testLeftKStreamKStream:
> {noformat}
> java.lang.AssertionError: Condition not met within timeout 3. Expecting 1 
> records from topic outputTopic while only received 0: []
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:265)
> at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:247)
> at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.checkResult(JoinIntegrationTest.java:170)
> at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.runTest(JoinIntegrationTest.java:192)
> at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.testLeftKStreamKStream(JoinIntegrationTest.java:250)
> {noformat}
> testInnerKStreamKTable:
> {noformat}
> java.lang.AssertionError: Condition not met within timeout 3. Expecting 1 
> records from topic outputTopic while only received 0: []
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:265)
>   at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:248)
>   at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.checkResult(JoinIntegrationTest.java:171)
>   at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.runTest(JoinIntegrationTest.java:193)
>   at 
> org.apache.kafka.streams.integration.JoinIntegrationTest.testInnerKStreamKTable(JoinIntegrationTest.java:305)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-112 - Handle disk failure for JBOD

2017-04-27 Thread Dong Lin
Thanks to everyone who voted and provided feedback!

This KIP is now adopted with 3 binding +1s (Jun, Joel, Becket) and 1
non-binding +1 (Radai).

Dong

On Thu, Apr 27, 2017 at 4:12 PM, Dong Lin  wrote:

> Thanks for the vote Jun!
>
> I think that statement is probably OK because it assumes that the broker has
> bad log directories. If all log directories are good, the replica should be
> created in one of the good log directories. It is clarified in the wiki
> that "Even if isNewReplica=false and replica is not found on any log
> directory, broker will still create replica on a good log directory if
> there is no bad log directory.".
>
>
> On Thu, Apr 27, 2017 at 4:07 PM, Jun Rao  wrote:
>
>> Hi, Dong,
>>
>> Thanks for the proposal. +1. Just one minor comment.
>>
>> in "3. Broker bootstraps with bad log directories", when a broker receives
>> a LeaderAndIsrRequest with isNewReplica=False but not found on any good
>> log
>> directory, if all log directories are good, it seems that we should create
>> the replica in one of the good log directories? This can happen if a
>> replica is manually deleted from the log directory.
>>
>> Jun
>>
>> On Wed, Apr 26, 2017 at 11:27 AM, Dong Lin  wrote:
>>
>> > Thanks for the vote!
>> >
>> > Discussed with Joel offline. I have updated the KIP to specify that
>> > controller will consider a replica to be offline if
>> KafkaStorageException
>> > is specified for the replica in the LeaderAndIsrResponse. The other two
>> > improvements may be done in the future KIP.
>> >
>> >
>> >
>> > On Wed, Apr 26, 2017 at 10:30 AM, Joel Koshy 
>> wrote:
>> >
>> > > +1
>> > >
>> > > Discussed a few edits/improvements with Dong.
>> > >
>> > > - Rather than a blanket (Error != None) condition for detecting
>> offline
>> > > replicas you probably want a storage exception-specific error code.
>> > >
>> > > - Definitely in favor of improvement #7 and it shouldn’t be too hard
>> to
>> > do.
>> > > When bouncing with a log directory on a faulty disk, the condition
>> may be
>> > > detected while loading logs and you may not have the full list of
>> local
>> > > replicas. So a subsequent L&ISR request would recreate the replica on
>> the
>> > > good disks (which may or may not be what the user wants).
>> > >
>> > > - Another improvement worth investigating is how best to support
>> > partition
>> > > reassignments even with a bad disk. The wiki hints that this is
>> > unnecessary
>> > > because reassignments being disallowed with an offline replica is
>> similar
>> > > to the current state of handling an offline broker. With JBOD though
>> the
>> > > broker with a bad disk does not have to be offline anymore so it
>> should
>> > be
>> > > possible to support reassignments even with offline replicas. I'm not
>> > > suggesting this is trivial, but would better leverage JBOD.
>> > >
>> > > On Wed, Apr 5, 2017 at 5:46 PM, Becket Qin 
>> wrote:
>> > >
>> > > > +1
>> > > >
>> > > > Thanks for the KIP. Made a pass and had some minor change.
>> > > >
>> > > > On Mon, Apr 3, 2017 at 3:16 PM, radai 
>> > > wrote:
>> > > >
>> > > > > +1, LGTM
>> > > > >
>> > > > > On Mon, Apr 3, 2017 at 9:49 AM, Dong Lin 
>> > wrote:
>> > > > >
>> > > > > > Hi all,
>> > > > > >
>> > > > > > It seems that there is no further concern with the KIP-112. We
>> > would
>> > > > like
>> > > > > > to start the voting process. The KIP can be found at
>> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-112%3A+Handle+disk+failure+for+JBOD
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Dong
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
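
The placement rule discussed in this thread can be summarized in rough Java 
pseudocode (illustrative only, not the broker's implementation):

{code}
class ReplicaPlacementSketch {
    static boolean shouldCreateReplica(boolean isNewReplica,
                                       boolean foundOnGoodLogDir,
                                       boolean anyBadLogDir) {
        if (foundOnGoodLogDir) return false; // replica already exists
        if (isNewReplica) return true;       // brand-new replica: create it
        // isNewReplica == false and replica missing: only safe to recreate
        // when every log directory is good, otherwise the replica may sit
        // on a bad disk that is temporarily unreadable.
        return !anyBadLogDir;
    }
}
{code}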


Re: [ANNOUNCE] Apache Kafka 0.10.2.1 Released

2017-04-27 Thread Ismael Juma
Thanks for managing the release Gwen. And thanks to everyone who
contributed. :)

Ismael

On Thu, Apr 27, 2017 at 7:52 PM, Gwen Shapira  wrote:

> The Apache Kafka community is pleased to announce the release for Apache
> Kafka 0.10.2.1. This is a bug fix release that fixes 29 issues in 0.10.2.0.
>
> All of the changes in this release can be found in the release notes:
> https://archive.apache.org/dist/kafka/0.10.2.1/RELEASE_NOTES.html
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
> ** The Producer API allows an application to publish a stream of records to
> one or more Kafka topics.
>
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
>
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an output
> stream to one or more output topics, effectively transforming the input
> streams to output streams.
>
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might capture
> every change to a table.
>
>
> With these APIs, Kafka can be used for two broad classes of application:
>
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
>
> ** Building real-time streaming applications that transform or react to the
> streams of data.
>
>
> You can download the source release from
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.1/kafka-0.10.2.1-src.tgz
>
> and binary releases from
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.1/kafka_2.10-0.10.2.1.tgz
> https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.1/kafka_2.11-0.10.2.1.tgz
>
> A big thank you to the following 25 contributors to this release!
>
> Aaron Coburn, Apurva Mehta, Armin Braun, Ben Stopford, Bill Bejeck,
> Bruce Szalwinski, Clemens Valiente, Colin P. Mccabe, Damian Guy, Dong
> Lin, Eno Thereska, Ewen Cheslack-Postava, Guozhang Wang, Gwen Shapira,
> Ismael Juma, Jason Gustafson, Konstantine Karantasis, Marco Ebert,
> Matthias J. Sax, Michael G. Noll, Onur Karaman, Rajini Sivaram, Ryan
> P, simplesteph, Vahid Hashemian
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> http://kafka.apache.org/
>
>
> Thanks,
> -- Gwen
>


Build failed in Jenkins: kafka-trunk-jdk8 #1457

2017-04-27 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-5028; convert kafka controller to a single-threaded event queue

--
[...truncated 2.42 MB...]
at 
hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:119)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
ERROR: Could not install GRADLE_3_4_RC_2_HOME
java.lang.NullPointerException
at 
hudson.plugins.toolenv.ToolEnvBuildWrapper$1.buildEnvVars(ToolEnvBuildWrapper.java:46)
at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:931)
at hudson.plugins.git.GitSCM.getParamExpandedRepos(GitSCM.java:416)
at 
hudson.plugins.git.GitSCM.compareRemoteRevisionWithImpl(GitSCM.java:622)
at hudson.plugins.git.GitSCM.compareRemoteRevisionWith(GitSCM.java:587)
at hudson.scm.SCM.compareRemoteRevisionWith(SCM.java:391)
at hudson.scm.SCM.poll(SCM.java:408)
at hudson.model.AbstractProject._poll(AbstractProject.java:1460)
at hudson.model.AbstractProject.poll(AbstractProject.java:1363)
at hudson.triggers.SCMTrigger$Runner.runPolling(SCMTrigger.java:563)
at hudson.triggers.SCMTrigger$Runner.run(SCMTrigger.java:609)
at 
hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:119)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

[jira] [Commented] (KAFKA-4763) Handle disk failure for JBOD (KIP-112)

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987865#comment-15987865
 ] 

ASF GitHub Bot commented on KAFKA-4763:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/2929

KAFKA-4763; Handle disk failure for JBOD (KIP-112)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4763

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2929.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2929


commit ab6302b82b6245d1bbf8d77d836e362b95750ca4
Author: Dong Lin 
Date:   2017-04-03T00:46:34Z

KAFKA-4763; Handle disk failure for JBOD (KIP-112)




> Handle disk failure for JBOD (KIP-112)
> --
>
> Key: KAFKA-4763
> URL: https://issues.apache.org/jira/browse/KAFKA-4763
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> See 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-112%3A+Handle+disk+failure+for+JBOD
>  for motivation and design.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2929: KAFKA-4763; Handle disk failure for JBOD (KIP-112)

2017-04-27 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/2929

KAFKA-4763; Handle disk failure for JBOD (KIP-112)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4763

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2929.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2929


commit ab6302b82b6245d1bbf8d77d836e362b95750ca4
Author: Dong Lin 
Date:   2017-04-03T00:46:34Z

KAFKA-4763; Handle disk failure for JBOD (KIP-112)




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] KIP-112 - Handle disk failure for JBOD

2017-04-27 Thread Dong Lin
Thanks for the vote Jun!

I think that statement is probably OK because it assumes that the broker has
bad log directories. If all log directories are good, the replica should be
created in one of the good log directories. It is clarified in the wiki
that "Even if isNewReplica=false and replica is not found on any log
directory, broker will still create replica on a good log directory if
there is no bad log directory.".


On Thu, Apr 27, 2017 at 4:07 PM, Jun Rao  wrote:

> Hi, Dong,
>
> Thanks for the proposal. +1. Just one minor comment.
>
> in "3. Broker bootstraps with bad log directories", when a broker receives
> a LeaderAndIsrRequest with isNewReplica=False but not found on any good log
> directory, if all log directories are good, it seems that we should create
> the replica in one of the good log directories? This can happen if a
> replica is manually deleted from the log directory.
>
> Jun
>
> On Wed, Apr 26, 2017 at 11:27 AM, Dong Lin  wrote:
>
> > Thanks for the vote!
> >
> > Discussed with Joel offline. I have updated the KIP to specify that
> > controller will consider a replica to be offline if KafkaStorageException
> > is specified for the replica in the LeaderAndIsrResponse. The other two
> > improvements may be done in the future KIP.
> >
> >
> >
> > On Wed, Apr 26, 2017 at 10:30 AM, Joel Koshy 
> wrote:
> >
> > > +1
> > >
> > > Discussed a few edits/improvements with Dong.
> > >
> > > - Rather than a blanket (Error != None) condition for detecting offline
> > > replicas you probably want a storage exception-specific error code.
> > >
> > > - Definitely in favor of improvement #7 and it shouldn’t be too hard to
> > do.
> > > When bouncing with a log directory on a faulty disk, the condition may
> be
> > > detected while loading logs and you may not have the full list of local
> > > replicas. So a subsequent L&ISR request would recreate the replica on
> the
> > > good disks (which may or may not be what the user wants).
> > >
> > > - Another improvement worth investigating is how best to support
> > partition
> > > reassignments even with a bad disk. The wiki hints that this is
> > unnecessary
> > > because reassignments being disallowed with an offline replica is
> similar
> > > to the current state of handling an offline broker. With JBOD though
> the
> > > broker with a bad disk does not have to be offline anymore so it should
> > be
> > > possible to support reassignments even with offline replicas. I'm not
> > > suggesting this is trivial, but would better leverage JBOD.
> > >
> > > On Wed, Apr 5, 2017 at 5:46 PM, Becket Qin 
> wrote:
> > >
> > > > +1
> > > >
> > > > Thanks for the KIP. Made a pass and had some minor change.
> > > >
> > > > On Mon, Apr 3, 2017 at 3:16 PM, radai 
> > > wrote:
> > > >
> > > > > +1, LGTM
> > > > >
> > > > > On Mon, Apr 3, 2017 at 9:49 AM, Dong Lin 
> > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > It seems that there is no further concern with the KIP-112. We
> > would
> > > > like
> > > > > > to start the voting process. The KIP can be found at
> > > > > > *https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 112%3A+Handle+disk+failure+for+JBOD
> > > > > >  > > > > > 112%3A+Handle+disk+failure+for+JBOD>.*
> > > > > >
> > > > > > Thanks,
> > > > > > Dong
> > > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Resolved] (KAFKA-5101) Remove KafkaController's incrementControllerEpoch method parameter

2017-04-27 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5101.

   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 2886
[https://github.com/apache/kafka/pull/2886]

> Remove KafkaController's incrementControllerEpoch method parameter 
> ---
>
> Key: KAFKA-5101
> URL: https://issues.apache.org/jira/browse/KAFKA-5101
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Reporter: Balint Molnar
>Assignee: Balint Molnar
>Priority: Trivial
> Fix For: 0.11.0.0
>
>
> KAFKA-4814 replaced the zkClient.createPersistent method with 
> zkUtils.createPersistentPath so the zkClient parameter is no longer required.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5101) Remove KafkaController's incrementControllerEpoch method parameter

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987859#comment-15987859
 ] 

ASF GitHub Bot commented on KAFKA-5101:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2886


> Remove KafkaController's incrementControllerEpoch method parameter 
> ---
>
> Key: KAFKA-5101
> URL: https://issues.apache.org/jira/browse/KAFKA-5101
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Reporter: Balint Molnar
>Assignee: Balint Molnar
>Priority: Trivial
> Fix For: 0.11.0.0
>
>
> KAFKA-4814 replaced the zkClient.createPersistent method with 
> zkUtils.createPersistentPath so the zkClient parameter is no longer required.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2886: KAFKA-5101 Remove KafkaController's incrementContr...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2886


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] KIP-112 - Handle disk failure for JBOD

2017-04-27 Thread Jun Rao
Hi, Dong,

Thanks for the proposal. +1. Just one minor comment.

in "3. Broker bootstraps with bad log directories", when a broker receives
a LeaderAndIsrRequest with isNewReplica=False but not found on any good log
directory, if all log directories are good, it seems that we should create
the replica in one of the good log directories? This can happen if a
replica is manually deleted from the log directory.

Jun

On Wed, Apr 26, 2017 at 11:27 AM, Dong Lin  wrote:

> Thanks for the vote!
>
> Discussed with Joel offline. I have updated the KIP to specify that
> controller will consider a replica to be offline if KafkaStorageException
> is specified for the replica in the LeaderAndIsrResponse. The other two
> improvements may be done in the future KIP.
>
>
>
> On Wed, Apr 26, 2017 at 10:30 AM, Joel Koshy  wrote:
>
> > +1
> >
> > Discussed a few edits/improvements with Dong.
> >
> > - Rather than a blanket (Error != None) condition for detecting offline
> > replicas you probably want a storage exception-specific error code.
> >
> > - Definitely in favor of improvement #7 and it shouldn’t be too hard to
> do.
> > When bouncing with a log directory on a faulty disk, the condition may be
> > detected while loading logs and you may not have the full list of local
> > replicas. So a subsequent L&ISR request would recreate the replica on the
> > good disks (which may or may not be what the user wants).
> >
> > - Another improvement worth investigating is how best to support
> partition
> > reassignments even with a bad disk. The wiki hints that this is
> unnecessary
> > because reassignments being disallowed with an offline replica is similar
> > to the current state of handling an offline broker. With JBOD though the
> > broker with a bad disk does not have to be offline anymore so it should
> be
> > possible to support reassignments even with offline replicas. I'm not
> > suggesting this is trivial, but would better leverage JBOD.
> >
> > On Wed, Apr 5, 2017 at 5:46 PM, Becket Qin  wrote:
> >
> > > +1
> > >
> > > Thanks for the KIP. Made a pass and had some minor change.
> > >
> > > On Mon, Apr 3, 2017 at 3:16 PM, radai 
> > wrote:
> > >
> > > > +1, LGTM
> > > >
> > > > On Mon, Apr 3, 2017 at 9:49 AM, Dong Lin 
> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > It seems that there is no further concern with the KIP-112. We
> would
> > > like
> > > > > to start the voting process. The KIP can be found at
> > > > > *https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 112%3A+Handle+disk+failure+for+JBOD
> > > > >  > > > > 112%3A+Handle+disk+failure+for+JBOD>.*
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > >
> > >
> >
>


[jira] [Resolved] (KAFKA-5119) Transient test failure SocketServerTest.testMetricCollectionAfterShutdown

2017-04-27 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5119.

   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 2915
[https://github.com/apache/kafka/pull/2915]

> Transient test failure SocketServerTest.testMetricCollectionAfterShutdown
> -
>
> Key: KAFKA-5119
> URL: https://issues.apache.org/jira/browse/KAFKA-5119
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: Jason Gustafson
> Fix For: 0.11.0.0
>
>
> From a recent build:
> {code}
> 20:04:15 kafka.network.SocketServerTest > testMetricCollectionAfterShutdown 
> FAILED
> 20:04:15 java.lang.AssertionError: expected:<0.0> but 
> was:<1.603886948862125>
> 20:04:15 at org.junit.Assert.fail(Assert.java:88)
> 20:04:15 at org.junit.Assert.failNotEquals(Assert.java:834)
> 20:04:15 at org.junit.Assert.assertEquals(Assert.java:553)
> 20:04:15 at org.junit.Assert.assertEquals(Assert.java:683)
> 20:04:15 at 
> kafka.network.SocketServerTest.testMetricCollectionAfterShutdown(SocketServerTest.scala:414)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5119) Transient test failure SocketServerTest.testMetricCollectionAfterShutdown

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987849#comment-15987849
 ] 

ASF GitHub Bot commented on KAFKA-5119:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2915


> Transient test failure SocketServerTest.testMetricCollectionAfterShutdown
> -
>
> Key: KAFKA-5119
> URL: https://issues.apache.org/jira/browse/KAFKA-5119
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: Jason Gustafson
>
> From a recent build:
> {code}
> 20:04:15 kafka.network.SocketServerTest > testMetricCollectionAfterShutdown 
> FAILED
> 20:04:15 java.lang.AssertionError: expected:<0.0> but 
> was:<1.603886948862125>
> 20:04:15 at org.junit.Assert.fail(Assert.java:88)
> 20:04:15 at org.junit.Assert.failNotEquals(Assert.java:834)
> 20:04:15 at org.junit.Assert.assertEquals(Assert.java:553)
> 20:04:15 at org.junit.Assert.assertEquals(Assert.java:683)
> 20:04:15 at 
> kafka.network.SocketServerTest.testMetricCollectionAfterShutdown(SocketServerTest.scala:414)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2915: KAFKA-5119: Ensure global metrics are empty before...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2915


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987843#comment-15987843
 ] 

Jason Gustafson commented on KAFKA-4801:


Recent occurrence: https://jenkins.confluent.io/job/kafka-trunk/1847/console.

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reopened KAFKA-4801:
--

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-5139) Transient failure ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-5139.

Resolution: Duplicate

Sorry for the noise.

> Transient failure ConsumerBounceTest.testConsumptionWithBrokerFailures
> --
>
> Key: KAFKA-5139
> URL: https://issues.apache.org/jira/browse/KAFKA-5139
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>
> {code}
> 22:29:34 kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures 
> FAILED
> 22:29:34 java.lang.IllegalArgumentException: You can only check the 
> position for partitions assigned to this consumer.
> 22:29:34 at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1275)
> 22:29:34 at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:110)
> 22:29:34 at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:83)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5139) Transient failure ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-04-27 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5139:
--

 Summary: Transient failure 
ConsumerBounceTest.testConsumptionWithBrokerFailures
 Key: KAFKA-5139
 URL: https://issues.apache.org/jira/browse/KAFKA-5139
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson


{code}
22:29:34 kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
22:29:34 java.lang.IllegalArgumentException: You can only check the 
position for partitions assigned to this consumer.
22:29:34 at 
org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1275)
22:29:34 at 
kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:110)
22:29:34 at 
kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:83)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
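
The IllegalArgumentException in KAFKA-5139 above is thrown when
KafkaConsumer.position() is called for a partition the consumer no longer
owns, e.g. after a rebalance. A minimal sketch of the defensive pattern the
error message suggests (illustrative only, not the test's actual code):

{code}
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

final class PositionCheck {
    // Query positions only for partitions the consumer currently owns;
    // asking about a revoked partition throws IllegalArgumentException,
    // which is the failure seen in the test above.
    static Map<TopicPartition, Long> safePositions(KafkaConsumer<?, ?> consumer) {
        Map<TopicPartition, Long> positions = new HashMap<>();
        for (TopicPartition tp : consumer.assignment())
            positions.put(tp, consumer.position(tp));
        return positions;
    }
}
{code}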


Build failed in Jenkins: kafka-trunk-jdk7 #2124

2017-04-27 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-4818; Exactly once transactional clients

--
[...truncated 835.78 KB...]

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldAddRequestToBrokerQueue PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldRetryGettingLeaderWhenNotFound STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldRetryGettingLeaderWhenNotFound PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldClearPurgatoryForPartitionWhenPartitionEmigrated STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldClearPurgatoryForPartitionWhenPartitionEmigrated PASSED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldDrainBrokerQueueWhenGeneratingRequests STARTED

kafka.coordinator.transaction.TransactionMarkerChannelManagerTest > 
shouldDrainBrokerQueueWhenGeneratingRequests PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldThrowIllegalStateExceptionWhenErrorNotHandled STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldThrowIllegalStateExceptionWhenErrorNotHandled PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenNotLeaderForPartitionError STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenNotLeaderForPartitionError PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldReEnqueuePartitionsWhenBrokerDisconnected STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldReEnqueuePartitionsWhenBrokerDisconnected PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenNotEnoughReplicasError STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenNotEnoughReplicasError PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldThrowIllegalStateExceptionIfErrorsNullForPid STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldThrowIllegalStateExceptionIfErrorsNullForPid PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRemoveCompletedPartitionsFromMetadataWhenNoErrors STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRemoveCompletedPartitionsFromMetadataWhenNoErrors PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenUnknownTopicOrPartitionError STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenUnknownTopicOrPartitionError PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenNotEnoughReplicasAfterAppendError STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldRetryPartitionWhenNotEnoughReplicasAfterAppendError PASSED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldTryCompleteDelayedTxnOperation STARTED

kafka.coordinator.transaction.TransactionMarkerRequestCompletionHandlerTest > 
shouldTryCompleteDelayedTxnOperation PASSED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldUpdateDestinationBrokerNodeWhenUpdatingBroker STARTED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldUpdateDestinationBrokerNodeWhenUpdatingBroker PASSED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldRemoveBrokerRequestsForPartitionWhenPartitionEmigrated STARTED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldRemoveBrokerRequestsForPartitionWhenPartitionEmigrated PASSED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldGetPendingTxnMetadataByPid STARTED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldGetPendingTxnMetadataByPid PASSED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldAddRequestsToCorrectBrokerQueues STARTED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldAddRequestsToCorrectBrokerQueues PASSED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldRemovePendingRequestsForPartitionWhenPartitionEmigrated STARTED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldRemovePendingRequestsForPartitionWhenPartitionEmigrated PASSED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldWakeupNetworkClientWhenRequestsQueued STARTED

kafka.coordinator.transaction.TransactionMarkerChannelTest > 
shouldWakeupNetworkClientWhenRequestsQueued PASSED

kafka.coordinato

[jira] [Resolved] (KAFKA-5122) Kafka Streams unexpected off-heap memory growth

2017-04-27 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5122.
--
Resolution: Fixed

> Kafka Streams unexpected off-heap memory growth
> ---
>
> Key: KAFKA-5122
> URL: https://issues.apache.org/jira/browse/KAFKA-5122
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
> Environment: Linux 64-bit
> Oracle JVM version "1.8.0_121"
>Reporter: Jon Buffington
>Assignee: Guozhang Wang
>Priority: Minor
>
> I have a Kafka Streams application that leaks off-heap memory at a rate of 
> 20MB per commit interval. The application is configured with a 1G heap; the 
> heap memory does not show signs of leaking. The application reaches 16g of 
> system memory usage before terminating and restarting.
> Application facts:
> * The data pipeline is source -> map -> groupByKey -> reduce -> to.
> * The reduce operation uses a tumbling time window 
> TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)).
> * The commit interval is five minutes (300000 ms).
> * The application links to v0.10.2.0-cp1 of the Kafka libraries. When I link 
> to the current 0.10.2.1 RC3, the leak rate changes to ~10MB per commit 
> interval.
> * The application uses the schema registry for two pairs of serdes. One serde 
> pair is used to read from a source topic that has 40 partitions. The other 
> serde pair is used by the internal changelog and repartition topics created 
> by the groupByKey/reduce operations.
> * The source input rate varies between 500-1500 records/sec. The source rate 
> variation does not change the size or frequency of the leak.
> * The application heap has been configured using both 1024m and 2048m. The 
> only observed difference between the two JVM heap sizes is more old gen 
> collections at 1024m although there is little difference in throughput. JVM 
> settings are {-server -Djava.awt.headless=true -Xss256k 
> -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m 
> -XX:CompressedClassSpaceSize=32m -XX:MaxDirectMemorySize=128m 
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:MaxGCPauseMillis=50 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+PerfDisableSharedMem 
> -XX:+UseStringDeduplication -XX:MinMetaspaceFreeRatio=50 
> -XX:MaxMetaspaceFreeRatio=80}
> * We configure a custom RocksDBConfigSetter to set 
> options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors)
> * Per 
> ,
>  the SSTables are being compacted. Total disk usage for the state files 
> (RocksDB) is ~2.5g. Per partition and window, there are 3-4 SSTables.
> * The application is written in Scala and compiled using version 2.12.1.
> • Oracle JVM version "1.8.0_121"
> Various experiments that had no effect on the leak rate:
> * Tried different RocksDB block sizes (4k, 16k, and 32k).
> * Different numbers of instances (1, 2, and 4).
> * Different numbers of threads (1, 4, 10, 40).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
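
For reference, a minimal sketch of a custom RocksDBConfigSetter like the one
described in the ticket, applying the same max-background-compactions
setting. The class name is illustrative; the interface and the
rocksdb.config.setter registration are the standard Streams mechanism:

{code}
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

public class CompactionBoundedRocksDBConfig implements RocksDBConfigSetter {
    @Override
    public void setConfig(String storeName, Options options, Map<String, Object> configs) {
        // The setting described in the ticket: allow one background
        // compaction thread per available processor.
        options.setMaxBackgroundCompactions(Runtime.getRuntime().availableProcessors());
    }
}
{code}

It would be registered through the Streams config, e.g.
props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG,
CompactionBoundedRocksDBConfig.class).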


[jira] [Assigned] (KAFKA-5122) Kafka Streams unexpected off-heap memory growth

2017-04-27 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reassigned KAFKA-5122:


Assignee: Guozhang Wang

> Kafka Streams unexpected off-heap memory growth
> ---
>
> Key: KAFKA-5122
> URL: https://issues.apache.org/jira/browse/KAFKA-5122
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
> Environment: Linux 64-bit
> Oracle JVM version "1.8.0_121"
>Reporter: Jon Buffington
>Assignee: Guozhang Wang
>Priority: Minor
>
> I have a Kafka Streams application that leaks off-heap memory at a rate of 
> 20MB per commit interval. The application is configured with a 1G heap; the 
> heap memory does not show signs of leaking. The application reaches 16g of 
> system memory usage before terminating and restarting.
> Application facts:
> * The data pipeline is source -> map -> groupByKey -> reduce -> to.
> * The reduce operation uses a tumbling time window 
> TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)).
> * The commit interval is five minutes (300000 ms).
> * The application links to v0.10.2.0-cp1 of the Kafka libraries. When I link 
> to the current 0.10.2.1 RC3, the leak rate changes to ~10MB per commit 
> interval.
> * The application uses the schema registry for two pairs of serdes. One serde 
> pair is used to read from a source topic that has 40 partitions. The other 
> serde pair is used by the internal changelog and repartition topics created 
> by the groupByKey/reduce operations.
> * The source input rate varies between 500-1500 records/sec. The source rate 
> variation does not change the size or frequency of the leak.
> * The application heap has been configured using both 1024m and 2048m. The 
> only observed difference between the two JVM heap sizes is more old gen 
> collections at 1024m although there is little difference in throughput. JVM 
> settings are {-server -Djava.awt.headless=true -Xss256k 
> -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m 
> -XX:CompressedClassSpaceSize=32m -XX:MaxDirectMemorySize=128m 
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:MaxGCPauseMillis=50 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+PerfDisableSharedMem 
> -XX:+UseStringDeduplication -XX:MinMetaspaceFreeRatio=50 
> -XX:MaxMetaspaceFreeRatio=80}
> * We configure a custom RocksDBConfigSetter to set 
> options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors)
> * Per 
> ,
>  the SSTables are being compacted. Total disk usage for the state files 
> (RocksDB) is ~2.5g. Per partition and window, there are 3-4 SSTables.
> * The application is written in Scala and compiled using version 2.12.1.
> • Oracle JVM version "1.8.0_121"
> Various experiments that had no effect on the leak rate:
> * Tried different RocksDB block sizes (4k, 16k, and 32k).
> * Different numbers of instances (1, 2, and 4).
> * Different numbers of threads (1, 4, 10, 40).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5122) Kafka Streams unexpected off-heap memory growth

2017-04-27 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987806#comment-15987806
 ] 

Guozhang Wang edited comment on KAFKA-5122 at 4/27/17 10:16 PM:


[~jon_fuseelements] Thanks a lot for reporting this!

We are aware of the windowed store #.segment factor in the memory usage as well 
as Streams capacity planning, and are working towards optimizing that since 
usually only the most recent segment gets read / write access and hence older 
segment's write buffer (i.e. memtables) would be likely wasted.

As for now I will take your suggestion to update the capacity planning guide. 
Since it is referring to the Confluent docs I will close this Apache Kafka 
ticket for now.


was (Author: guozhang):
[~jon_fuseelements] Thanks a lot for reporting this!

We are aware of the windowed store #.segment factor in the memory usage as well 
as Streams capacity planning, and are working towards optimizing that since 
usually only the most recent segment gets read / write access and hence older 
segment's write buffer (i.e. memtables) would be likely wasted.

As for now I will take your suggestion to update the capacity planning guide.

> Kafka Streams unexpected off-heap memory growth
> ---
>
> Key: KAFKA-5122
> URL: https://issues.apache.org/jira/browse/KAFKA-5122
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
> Environment: Linux 64-bit
> Oracle JVM version "1.8.0_121"
>Reporter: Jon Buffington
>Priority: Minor
>
> I have a Kafka Streams application that leaks off-heap memory at a rate of 
> 20MB per commit interval. The application is configured with a 1G heap; the 
> heap memory does not show signs of leaking. The application reaches 16g of 
> system memory usage before terminating and restarting.
> Application facts:
> * The data pipeline is source -> map -> groupByKey -> reduce -> to.
> * The reduce operation uses a tumbling time window 
> TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)).
> * The commit interval is five minutes (300000 ms).
> * The application links to v0.10.2.0-cp1 of the Kafka libraries. When I link 
> to the current 0.10.2.1 RC3, the leak rate changes to ~10MB per commit 
> interval.
> * The application uses the schema registry for two pairs of serdes. One serde 
> pair is used to read from a source topic that has 40 partitions. The other 
> serde pair is used by the internal changelog and repartition topics created 
> by the groupByKey/reduce operations.
> * The source input rate varies between 500-1500 records/sec. The source rate 
> variation does not change the size or frequency of the leak.
> * The application heap has been configured using both 1024m and 2048m. The 
> only observed difference between the two JVM heap sizes is more old gen 
> collections at 1024m although there is little difference in throughput. JVM 
> settings are {-server -Djava.awt.headless=true -Xss256k 
> -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m 
> -XX:CompressedClassSpaceSize=32m -XX:MaxDirectMemorySize=128m 
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:MaxGCPauseMillis=50 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+PerfDisableSharedMem 
> -XX:+UseStringDeduplication -XX:MinMetaspaceFreeRatio=50 
> -XX:MaxMetaspaceFreeRatio=80}
> * We configure a custom RocksDBConfigSetter to set 
> options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors)
> * Per 
> ,
>  the SSTables are being compacted. Total disk usage for the state files 
> (RocksDB) is ~2.5g. Per partition and window, there are 3-4 SSTables.
> * The application is written in Scala and compiled using version 2.12.1.
> • Oracle JVM version "1.8.0_121"
> Various experiments that had no effect on the leak rate:
> * Tried different RocksDB block sizes (4k, 16k, and 32k).
> * Different numbers of instances (1, 2, and 4).
> * Different numbers of threads (1, 4, 10, 40).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5122) Kafka Streams unexpected off-heap memory growth

2017-04-27 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987806#comment-15987806
 ] 

Guozhang Wang commented on KAFKA-5122:
--

[~jon_fuseelements] Thanks a lot for reporting this!

We are aware of the windowed store #.segment factor in the memory usage as well 
as Streams capacity planning, and are working towards optimizing that since 
usually only the most recent segment gets read / write access and hence older 
segment's write buffer (i.e. memtables) would be likely wasted.

As for now I will take your suggestion to update the capacity planning guide.

> Kafka Streams unexpected off-heap memory growth
> ---
>
> Key: KAFKA-5122
> URL: https://issues.apache.org/jira/browse/KAFKA-5122
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
> Environment: Linux 64-bit
> Oracle JVM version "1.8.0_121"
>Reporter: Jon Buffington
>Priority: Minor
>
> I have a Kafka Streams application that leaks off-heap memory at a rate of 
> 20MB per commit interval. The application is configured with a 1G heap; the 
> heap memory does not show signs of leaking. The application reaches 16g of 
> system memory usage before terminating and restarting.
> Application facts:
> * The data pipeline is source -> map -> groupByKey -> reduce -> to.
> * The reduce operation uses a tumbling time window 
> TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)).
> * The commit interval is five minutes (300000 ms).
> * The application links to v0.10.2.0-cp1 of the Kafka libraries. When I link 
> to the current 0.10.2.1 RC3, the leak rate changes to ~10MB per commit 
> interval.
> * The application uses the schema registry for two pairs of serdes. One serde 
> pair is used to read from a source topic that has 40 partitions. The other 
> serde pair is used by the internal changelog and repartition topics created 
> by the groupByKey/reduce operations.
> * The source input rate varies between 500-1500 records/sec. The source rate 
> variation does not change the size or frequency of the leak.
> * The application heap has been configured using both 1024m and 2048m. The 
> only observed difference between the two JVM heap sizes is more old gen 
> collections at 1024m although there is little difference in throughput. JVM 
> settings are {-server -Djava.awt.headless=true -Xss256k 
> -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m 
> -XX:CompressedClassSpaceSize=32m -XX:MaxDirectMemorySize=128m 
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:MaxGCPauseMillis=50 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+PerfDisableSharedMem 
> -XX:+UseStringDeduplication -XX:MinMetaspaceFreeRatio=50 
> -XX:MaxMetaspaceFreeRatio=80}
> * We configure a custom RocksDBConfigSetter to set 
> options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors)
> * Per 
> ,
>  the SSTables are being compacted. Total disk usage for the state files 
> (RocksDB) is ~2.5g. Per partition and window, there are 3-4 SSTables.
> * The application is written in Scala and compiled using version 2.12.1.
> • Oracle JVM version "1.8.0_121"
> Various experiments that had no effect on the leak rate:
> * Tried different RocksDB block sizes (4k, 16k, and 32k).
> * Different numbers of instances (1, 2, and 4).
> * Different numbers of threads (1, 4, 10, 40).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-5086) Update topic expiry time in Metadata every time the topic metadata is requested

2017-04-27 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-5086.
---
   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 2869
[https://github.com/apache/kafka/pull/2869]

> Update topic expiry time in Metadata every time the topic metadata is 
> requested
> ---
>
> Key: KAFKA-5086
> URL: https://issues.apache.org/jira/browse/KAFKA-5086
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
> Fix For: 0.11.0.0
>
>
> As of current implementation, KafkaProducer.waitOnMetadata() will first reset 
> topic expiry time of the topic before repeatedly sending TopicMetadataRequest 
> and waiting for metadata response. However, if the metadata of the topic is 
> not available within Metadata.TOPIC_EXPIRY_MS, which is set to 5 minutes, 
> then the topic will be expired and removed from Metadata.topics. The 
> TopicMetadataRequest will no longer include the topic and the KafkaProducer 
> will never receive the metadata of this topic. It will enter an infinite loop 
> of sending TopicMetadataRequest and waiting for metadata response.
> This problem can be fixed by updating topic expiry time every time the topic 
> metadata is requested.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4818) Implement transactional clients

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987785#comment-15987785
 ] 

ASF GitHub Bot commented on KAFKA-4818:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2840


> Implement transactional clients
> ---
>
> Key: KAFKA-4818
> URL: https://issues.apache.org/jira/browse/KAFKA-4818
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Apurva Mehta
> Fix For: 0.11.0.0
>
>
> This covers the implementation of the producer and consumer to support 
> transactions, as described in KIP-98: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5086) Update topic expiry time in Metadata every time the topic metadata is requested

2017-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987784#comment-15987784
 ] 

ASF GitHub Bot commented on KAFKA-5086:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2869


> Update topic expiry time in Metadata every time the topic metadata is 
> requested
> ---
>
> Key: KAFKA-5086
> URL: https://issues.apache.org/jira/browse/KAFKA-5086
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
> Fix For: 0.11.0.0
>
>
> As of current implementation, KafkaProducer.waitOnMetadata() will first reset 
> topic expiry time of the topic before repeatedly sending TopicMetadataRequest 
> and waiting for metadata response. However, if the metadata of the topic is 
> not available within Metadata.TOPIC_EXPIRY_MS, which is set to 5 minutes, 
> then the topic will be expired and removed from Metadata.topics. The 
> TopicMetadataRequest will no longer include the topic and the KafkaProducer 
> will never receive the metadata of this topic. It will enter an infinite loop 
> of sending TopicMetadataRequest and waiting for metadata response.
> This problem can be fixed by updating topic expiry time every time the topic 
> metadata is requested.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2840: KAFKA-4818: Exactly once transactional clients

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2840


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2869: KAFKA-5086; Update topic expiry time in Metadata e...

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2869


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-4818) Implement transactional clients

2017-04-27 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-4818.

Resolution: Fixed

Issue resolved by pull request 2840
[https://github.com/apache/kafka/pull/2840]

> Implement transactional clients
> ---
>
> Key: KAFKA-4818
> URL: https://issues.apache.org/jira/browse/KAFKA-4818
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Apurva Mehta
> Fix For: 0.11.0.0
>
>
> This covers the implementation of the producer and consumer to support 
> transactions, as described in KIP-98: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-140: Add administrative RPCs for adding, deleting, and listing ACLs

2017-04-27 Thread Jun Rao
Hi, Colin,

Thanks for the KIP. Looks good overall. Just a few minor comments.

1. The controller is responsible for managing the state of topics. So, it
makes sense for create/delete topic requests to be sent to the controller.
The controller is not responsible for managing ACLs right now. So, it's a
bit weird to send acl requests only to the controller. Sending the acl
request to any broker is more intuitive and easier to implement.

10. Since we may add new operations in the future, perhaps use 0 for the
ALL operation?

11. "That is, ListAclsRequest(principal=none) will return all ACLs". Do you
mean principal == null?

12. "Principals must possess Cluster:All permissions to call
CreateAclsRequest". All just means all types of operations. So, we need a
specific operation that allows CreateAclsRequest. Will that be
ClusterAction or a new one?

Jun

On Mon, Apr 24, 2017 at 10:57 AM, Colin McCabe  wrote:

> Thanks for taking a look, Ismael.
>
> On Mon, Apr 24, 2017, at 04:36, Ismael Juma wrote:
> > Thanks Colin. A few quick comments:
> >
> > 1. Is there a reason why AddAclsRequest must be sent to the controller
> > broker? When this was discussed previously, Jun suggested that such a
> > restriction may not be necessary for this request.
>
> Hmm.  I guess my thinking here was that since the controller handles
> AddTopics and DeleteTopics requests, it would be nice if it had the most
> up-to-date ACL information.  This was also in the original KIP-4
> proposal.  However, given that auth is ZK (or a pluggable system,
> optionally) there is no inherent reason the controller broker has to be
> the only one to make the change.  What do you think?
>
> >
> > 2. Other protocol APIs use the Delete prefix instead of Remove. Is there
> > a
> > reason to deviate here? If there's a good reason to do so, we have to fix
> > the one mention of DeleteAcls in the proposal.
>
> Good point.  Let's make it consistent by changing it to DeleteAcls.  I
> will also change AddAclsRequest to CreateAclsRequest to match
> CreateTopicsRequest.
>
> >
> > 3. Do you mean "null" in the following sentence? "Note that an argument
> > of
> > "none" is different than a wildcard argument."
>
> For the string types, NULL is considered "none"; for the INT8 types, -1
> is considered "none".
>
> >
> > 4. How will the non-zero top-level error code be computed? A bit more
> > detail on this would be helpful. As a general rule, most batch protocol
> > APIs don't have a top level error code because errors are usually at the
> > batch element level. This has proved to be a bit of an issue in cases
> > where
> > we want to return a generic error code (e.g. InvalidProtocolVersion).
> > Also,
> > V2 of OffsetFetch has a top-level error code[1]. However, the OffsetFetch
> > behaviour is that we either use the top-level error code or the partition
> > level error codes, which is different than what is being suggested here.
>
> The idea behind the top-level error code is to implement error codes
> that don't have to do with elements in the batch.  For example, if we
> implement backpressure, the server could send back an error code of
> "slow down" telling the client to resend the request after a few
> milliseconds have elapsed.
>
> As you mention, we ought to have a generic response header that would
> let us intelligently handle situations like "server doesn't understand
> this request version or type".  Maybe this is something that needs to be
> handled in another KIP, though, since it would require all the responses
> to have this, and in the same format.  I guess I will remove this for
> now.
>
> >
> > 5. Nit: In other requests we used `error_message` instead of
> > `error_string`.
>
> OK.
>
> > 6. Regarding the migration plan, it's worth making it clear that the CLI
> > transition is not part of this KIP.
>
> OK.
>
> >
> > 7. Regarding the forward compatibility point, we could potentially use
> > enums with UNKNOWN element? This pattern has been used elsewhere in Kafka
> > and it would be good to compare it with the proposed solution. An
> > important
> > question is what can users do with UNKNOWN elements. If the assumption is
> > that users ignore them, then it seems like the enum approach may be good
> > enough.
>
> Hmm.  It seemed straightforward to let callers see the numeric value of
> the element.  It makes the command-line tools more useful when
> interacting with clusters that have newer versions of the software, for
> example.  I guess using UNKNOWN instead of UNKNOWN() has the
> advantage of hiding the internal representation better.
>
> best,
> Colin
>
>
> >
> > Thanks,
> > Ismael
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-88%3A+
> OffsetFetch+Protocol+Update
> >
> >
> > On Fri, Apr 21, 2017 at 9:27 PM, Colin McCabe 
> wrote:
> >
> > > Hi all,
> > >
> > > As part of the AdminClient work, we would like to add methods for
> > > adding, deleting, and listing access control lists (ACLs).  I wrote up
> a
> > > KIP to disc
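
To make the "none" versus wildcard distinction discussed above concrete,
here is a small illustrative model. These are not Kafka classes (the
KIP-140 request types did not exist at the time of this thread), and the
list-based store is purely a stand-in:

{code}
import java.util.ArrayList;
import java.util.List;

final class AclFilterSketch {
    // An ACL is modeled here as {principal, host, operation}; illustrative only.
    // principal == null ("none") means "do not filter on principal at all",
    // so ListAclsRequest(principal=null) returns every ACL. A wildcard
    // principal, by contrast, matches only ACLs whose principal is "*".
    static List<String[]> listAcls(List<String[]> acls, String principal) {
        if (principal == null)
            return acls;
        List<String[]> matched = new ArrayList<>();
        for (String[] acl : acls)
            if (acl[0].equals(principal))
                matched.add(acl);
        return matched;
    }
}
{code}
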

SIEMs Supported

2017-04-27 Thread Kristen Hale
Hi, what SIEMs are supported with the Kafka messaging bus? RSA, NetWitness,
QRADAR, etc.?

Would like to have a list.

Thanks!
-Kristen


[GitHub] kafka pull request #2923: MINOR: Fix a few transaction coordinator issues

2017-04-27 Thread hachikuji
Github user hachikuji closed the pull request at:

https://github.com/apache/kafka/pull/2923


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2017-04-27 Thread Joseph Aliase (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987661#comment-15987661
 ] 

Joseph Aliase commented on KAFKA-4477:
--

[~apurva] I'm hitting this issue. Please refer to 
https://issues.apache.org/jira/browse/KAFKA-5007

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Fix For: 0.10.1.1
>
> Attachments: 2016_12_15.zip, issue_node_1001_ext.log, 
> issue_node_1001.log, issue_node_1002_ext.log, issue_node_1002.log, 
> issue_node_1003_ext.log, issue_node_1003.log, kafka.jstack, 
> state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has recurred in different 
> physical environments. We haven't worked out what is going on. We do, though, 
> have a nasty workaround to keep the service alive. 
> We have not had this issue on clusters still running 0.9.0.1.
> We have noticed a node randomly shrinking the ISRs for the partitions it 
> owns down to itself; moments later we see other nodes having disconnects, 
> followed finally by application issues, where producing to these partitions 
> is blocked.
> It seems only by restarting the kafka instance java process resolves the 
> issues.
> We have had this occur multiple times and from all network and machine 
> monitoring the machine never left the network, or had any other glitches.
> Below are seen logs from the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing 
> number of close_waits and open file descriptors.
> As a workaround to keep service alive, we are currently putting in an automated 
> process that tails the logs and matches the regex below; where new_partitions is 
> just the node itself, we restart the node. 
> "\[(?P.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P.+) to (?P.+) 
> \(kafka.cluster.Partition\)"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
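
The named groups in the pattern quoted above lost their names to the
archive's HTML escaping (each "(?P.+)" was presumably a Python-style
"(?P<name>.+)"). A sketch of the same tail-and-restart check in Java regex
syntax, with assumed group names date, old, and new:

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IsrShrinkWatch {
    // Assumed group names; matches single log lines like:
    // [2016-12-01 07:01:28,112] INFO Partition [topic,10] on broker 7:
    //   Shrinking ISR for partition [topic,10] from 1,2,7 to 7
    //   (kafka.cluster.Partition)
    private static final Pattern SHRINK = Pattern.compile(
        "\\[(?<date>.+)\\] INFO Partition \\[.*\\] on broker .* Shrinking ISR for "
        + "partition \\[.*\\] from (?<old>.+) to (?<new>.+) \\(kafka\\.cluster\\.Partition\\)");

    // Restart candidate when the new ISR is just the broker itself.
    static boolean shrunkToSelf(String logLine, String brokerId) {
        Matcher m = SHRINK.matcher(logLine);
        return m.matches() && m.group("new").trim().equals(brokerId);
    }
}
{code}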


[GitHub] kafka-site pull request #55: Add Rajini to committers page

2017-04-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka-site/pull/55


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5138) MirrorMaker doesn't exit on send failure occasionally

2017-04-27 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5138:
---
Labels: newbie  (was: )

> MirrorMaker doesn't exit on send failure occasionally
> -
>
> Key: KAFKA-5138
> URL: https://issues.apache.org/jira/browse/KAFKA-5138
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>  Labels: newbie
>
> MirrorMaker with abort.on.send.failure=true does not always exit when the 
> producer closes. Here is what happens:
> First we encounter a problem producing and force the producer to close
> {code}
> [2017-04-10 07:17:25,137] ERROR Error when sending message to topic 
> mytopic with key: 20 bytes, value: 314 bytes with error: 
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> org.apache.kafka.common.errors.TimeoutException: Expiring 47 record(s) for 
> mytopic-2: 30879 ms has passed since last append
> [2017-04-10 07:17:25,170] INFO Closing producer due to send failure. 
> (kafka.tools.MirrorMaker$)
> [2017-04-10 07:17:25,170] INFO Closing the Kafka producer with timeoutMillis 
> = 0 ms. (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:25,170] INFO Proceeding to force close the producer since 
> pending requests could not be completed within timeout 0 ms. 
> (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:25,170] DEBUG The Kafka producer has closed. 
> (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:25,170] ERROR Error when sending message to topic mytopic 
> with key: 20 bytes, value: 313 bytes with error: 
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> org.apache.kafka.common.errors.TimeoutException: Expiring 47 record(s) for 
> mytopic-2: 30879 ms has passed since last append
> [2017-04-10 07:17:25,170] INFO Closing producer due to send failure. 
> (kafka.tools.MirrorMaker$)
> [2017-04-10 07:17:25,170] INFO Closing the Kafka producer with timeoutMillis 
> = 0 ms. (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:25,170] INFO Proceeding to force close the producer since 
> pending requests could not be completed within timeout 0 ms. 
> (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:25,170] DEBUG The Kafka producer has closed. 
> (org.apache.kafka.clients.producer.KafkaProducer)
> {code}
> All good there. Then we can't seem to close the producer nicely after about 
> 15 seconds and so it is forcefully killed:
> {code}
> [2017-04-10 07:17:39,778] ERROR Error when sending message to topic 
> mytopic.subscriptions with key: 70 bytes, value: null with error: 
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> java.lang.IllegalStateException: Producer is closed forcefully.
> at 
> org.apache.kafka.clients.producer.internals.RecordAccumulator.abortBatches(RecordAccumulator.java:522)
> at 
> org.apache.kafka.clients.producer.internals.RecordAccumulator.abortIncompleteBatches(RecordAccumulator.java:502)
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:147)
> at java.lang.Thread.run(Unknown Source)
> [2017-04-10 07:17:39,778] INFO Closing producer due to send failure. 
> (kafka.tools.MirrorMaker$)
> [2017-04-10 07:17:39,778] INFO Closing the Kafka producer with timeoutMillis 
> = 0 ms. (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:39,778] INFO Proceeding to force close the producer since 
> pending requests could not be completed within timeout 0 ms. 
> (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:39,778] DEBUG The Kafka producer has closed. 
> (org.apache.kafka.clients.producer.KafkaProducer)
> [2017-04-10 07:17:39,779] DEBUG Removed sensor with name connections-closed: 
> (org.apache.kafka.common.metrics.Metrics)
> {code}
> After removing some metric sensors for a while, this happens:
> {code}
> [2017-04-10 07:17:39,780] DEBUG Removed sensor with name node-3.latency 
> (org.apache.kafka.common.metrics.Metrics)
> [2017-04-10 07:17:39,780] DEBUG Shutdown of Kafka producer I/O thread has 
> completed. (org.apache.kafka.clients.producer.internals.Sender)
> [2017-04-10 07:17:41,852] DEBUG Sending Heartbeat request for group 
> mirror-maker-1491619052-teab1-1 to coordinator myhost1:9092 (id: 2147483643 
> rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2017-04-10 07:17:41,953] DEBUG Received successful Heartbeat response for 
> group mirror-maker-1491619052-teab1-1 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2017-04-10 07:17:44,875] DEBUG Sending Heartbeat request for group 
> mirror-maker-1491619052-teab1-1 to coordinator myhost1:9092 (id: 2147483643 
> rack: null) (org.apache.kafka.clients.
> {code}

[jira] [Updated] (KAFKA-5137) Controlled shutdown timeout message improvement

2017-04-27 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5137:
---
Labels: newbie  (was: )

> Controlled shutdown timeout message improvement
> ---
>
> Key: KAFKA-5137
> URL: https://issues.apache.org/jira/browse/KAFKA-5137
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Priority: Minor
>  Labels: newbie
>
> Currently, if you fail during controlled shutdown, you can get a message 
> saying that socket.timeout.ms has expired. This config doesn't actually exist 
> on the broker. Instead, we should explicitly say whether we've hit 
> controller.socket.timeout.ms or request.timeout.ms, as it's confusing to 
> take action given the current message. I believe the relevant code is here:
> https://github.com/apache/kafka/blob/0.10.2/core/src/main/scala/kafka/server/KafkaServer.scala#L428-L454
> I'm also not sure if there's another timeout that could be hit here or 
> another reason why an IOException might be thrown. At the least we should 
> call out those two configs instead of the non-existent one, but if we can 
> point to the proper one, that would be even better.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5138) MirrorMaker doesn't exit on send failure occasionally

2017-04-27 Thread Dustin Cote (JIRA)
Dustin Cote created KAFKA-5138:
--

 Summary: MirrorMaker doesn't exit on send failure occasionally
 Key: KAFKA-5138
 URL: https://issues.apache.org/jira/browse/KAFKA-5138
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.2.0
Reporter: Dustin Cote


MirrorMaker with abort.on.send.failure=true does not always exit when the 
producer closes. Here is what happens:
First we encounter a problem producing and force the producer to close
{code}
[2017-04-10 07:17:25,137] ERROR Error when sending message to topic mytopic with 
key: 20 bytes, value: 314 bytes with error: 
(org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 47 record(s) for 
mytopic-2: 30879 ms has passed since last append
[2017-04-10 07:17:25,170] INFO Closing producer due to send failure. 
(kafka.tools.MirrorMaker$)
[2017-04-10 07:17:25,170] INFO Closing the Kafka producer with timeoutMillis = 
0 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:25,170] INFO Proceeding to force close the producer since 
pending requests could not be completed within timeout 0 ms. 
(org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:25,170] DEBUG The Kafka producer has closed. 
(org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:25,170] ERROR Error when sending message to topic mytopic 
with key: 20 bytes, value: 313 bytes with error: 
(org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 47 record(s) for 
mytopic-2: 30879 ms has passed since last append
[2017-04-10 07:17:25,170] INFO Closing producer due to send failure. 
(kafka.tools.MirrorMaker$)
[2017-04-10 07:17:25,170] INFO Closing the Kafka producer with timeoutMillis = 
0 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:25,170] INFO Proceeding to force close the producer since 
pending requests could not be completed within timeout 0 ms. 
(org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:25,170] DEBUG The Kafka producer has closed. 
(org.apache.kafka.clients.producer.KafkaProducer)
{code}

All good there. Then we can't seem to close the producer nicely after about 15 
seconds and so it is forcefully killed:
{code}
[2017-04-10 07:17:39,778] ERROR Error when sending message to topic 
mytopic.subscriptions with key: 70 bytes, value: null with error: 
(org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
java.lang.IllegalStateException: Producer is closed forcefully.
at 
org.apache.kafka.clients.producer.internals.RecordAccumulator.abortBatches(RecordAccumulator.java:522)
at 
org.apache.kafka.clients.producer.internals.RecordAccumulator.abortIncompleteBatches(RecordAccumulator.java:502)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:147)
at java.lang.Thread.run(Unknown Source)
[2017-04-10 07:17:39,778] INFO Closing producer due to send failure. 
(kafka.tools.MirrorMaker$)
[2017-04-10 07:17:39,778] INFO Closing the Kafka producer with timeoutMillis = 
0 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:39,778] INFO Proceeding to force close the producer since 
pending requests could not be completed within timeout 0 ms. 
(org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:39,778] DEBUG The Kafka producer has closed. 
(org.apache.kafka.clients.producer.KafkaProducer)
[2017-04-10 07:17:39,779] DEBUG Removed sensor with name connections-closed: 
(org.apache.kafka.common.metrics.Metrics)
{code}

After removing some metric sensors for a while, this happens:
{code}
[2017-04-10 07:17:39,780] DEBUG Removed sensor with name node-3.latency 
(org.apache.kafka.common.metrics.Metrics)
[2017-04-10 07:17:39,780] DEBUG Shutdown of Kafka producer I/O thread has 
completed. (org.apache.kafka.clients.producer.internals.Sender)
[2017-04-10 07:17:41,852] DEBUG Sending Heartbeat request for group 
mirror-maker-1491619052-teab1-1 to coordinator myhost1:9092 (id: 2147483643 
rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2017-04-10 07:17:41,953] DEBUG Received successful Heartbeat response for 
group mirror-maker-1491619052-teab1-1 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2017-04-10 07:17:44,875] DEBUG Sending Heartbeat request for group 
mirror-maker-1491619052-teab1-1 to coordinator myhost1:9092 (id: 2147483643 
rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
{code}

This heartbeating goes on for some time until:
{code}
[2017-04-10 07:19:57,392] DEBUG Received successful Heartbeat response for 
group mirror-maker-1491619052-teab1-1 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2017-04-10 07:19:57,994] DEBUG Connection with myhost1/123.123.321.321 
disconnected (or
{code}
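
For reference, below is a minimal sketch of the abort path that 
abort.on.send.failure=true is meant to take on the first send error. It is 
illustrative only (the class name is made up and this is not MirrorMaker's 
actual code); the bug described above is that the consumer side keeps 
heartbeating instead of the process exiting.
{code}
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.RecordMetadata;

// Sketch: on the first send failure, force-close the producer and exit the
// process, which is what abort.on.send.failure=true is supposed to guarantee.
public class AbortOnSendFailureCallback implements Callback {
    private final KafkaProducer<byte[], byte[]> producer;

    public AbortOnSendFailureCallback(KafkaProducer<byte[], byte[]> producer) {
        this.producer = producer;
    }

    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            // close(0, ...) abandons pending requests, matching the
            // "timeoutMillis = 0 ms" force-close lines in the logs above.
            producer.close(0, TimeUnit.MILLISECONDS);
            System.exit(1); // the step that should take the whole tool down
        }
    }
}
{code}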

[jira] [Created] (KAFKA-5137) Controlled shutdown timeout message improvement

2017-04-27 Thread Dustin Cote (JIRA)
Dustin Cote created KAFKA-5137:
--

 Summary: Controlled shutdown timeout message improvement
 Key: KAFKA-5137
 URL: https://issues.apache.org/jira/browse/KAFKA-5137
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.10.2.0
Reporter: Dustin Cote
Priority: Minor


Currently, if you fail during controlled shutdown, you can get a message saying 
that socket.timeout.ms has expired. This config doesn't actually exist on the 
broker. Instead, we should explicitly say whether we've hit 
controller.socket.timeout.ms or request.timeout.ms, as it's confusing to take 
action given the current message. I believe the relevant code is here:
https://github.com/apache/kafka/blob/0.10.2/core/src/main/scala/kafka/server/KafkaServer.scala#L428-L454

I'm also not sure if there's another timeout that could be hit here or another 
reason why an IOException might be thrown. At the least we should call out 
those two configs instead of the non-existent one, but if we can point to the 
proper one, that would be even better.
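
To make the suggestion concrete, here is a sketch of a retry loop whose warning 
names configs that actually exist on the broker. It is written in Java for 
illustration (the real code is Scala), and the class and method names are 
placeholders, not the actual implementation.
{code}
import java.io.IOException;

// Illustrative only: the point is that the warning names
// controller.socket.timeout.ms and request.timeout.ms, which exist on the
// broker, rather than the non-existent socket.timeout.ms.
public class ControlledShutdownSketch {

    interface ShutdownAttempt {
        boolean attempt() throws IOException;
    }

    static boolean controlledShutdown(ShutdownAttempt attempt, int retries) {
        for (int remaining = retries; remaining > 0; remaining--) {
            try {
                if (attempt.attempt())
                    return true;
            } catch (IOException ioe) {
                System.err.println("WARN: IO error during controlled shutdown, possibly "
                        + "because leader movement took longer than "
                        + "controller.socket.timeout.ms or request.timeout.ms; "
                        + (remaining - 1) + " retries remaining");
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Toy usage: an attempt that always times out, to show the message.
        controlledShutdown(() -> { throw new IOException("simulated timeout"); }, 3);
    }
}
{code}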



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2928: HOTFIX [WIP]: Check on not owned partitions

2017-04-27 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/2928

HOTFIX [WIP]: Check on not owned partitions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka 
KHotfix-check-not-owned-partitions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2928.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2928


commit f1690f3c3d2a86de27cb656a329127bba4659337
Author: Guozhang Wang 
Date:   2017-04-27T18:55:41Z

double check on null task




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[ANNOUNCE] Apache Kafka 0.10.2.1 Released

2017-04-27 Thread Gwen Shapira
The Apache Kafka community is pleased to announce the release of Apache
Kafka 0.10.2.1. This is a bug fix release that fixes 29 issues in 0.10.2.0.

All of the changes in this release can be found in the release notes:
https://archive.apache.org/dist/kafka/0.10.2.1/RELEASE_NOTES.html


Apache Kafka is a distributed streaming platform with four core APIs:

** The Producer API allows an application to publish a stream of records to
one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an output
stream to one or more output topics, effectively transforming the input
streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might capture
every change to a table.
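
As a minimal illustration of the first of these APIs, the sketch below
publishes a single record; the broker address and topic name are placeholders.
{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Publish one record; "localhost:9092" and "my-topic" are placeholders.
public class ProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}
{code}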


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react to the
streams of data.


You can download the source release from
https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.1/kafka-0.10.2.1-src.tgz

and binary releases from
https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.1/kafka_2.10-0.10.2.1.tgz
https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.1/kafka_2.11-0.10.2.1.tgz


A big thank you to the following 25 contributors to this release!

Aaron Coburn, Apurva Mehta, Armin Braun, Ben Stopford, Bill Bejeck,
Bruce Szalwinski, Clemens Valiente, Colin P. Mccabe, Damian Guy, Dong
Lin, Eno Thereska, Ewen Cheslack-Postava, Guozhang Wang, Gwen Shapira,
Ismael Juma, Jason Gustafson, Konstantine Karantasis, Marco Ebert,
Matthias J. Sax, Michael G. Noll, Onur Karaman, Rajini Sivaram, Ryan
P, simplesteph, Vahid Hashemian

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
http://kafka.apache.org/


Thanks,
-- Gwen


Re: [DISCUSS] KIP-143: Controller Health Metrics

2017-04-27 Thread Joel Koshy
Thanks for the KIP - couple of comments:
- Do you intend to actually use yammer metrics? Or use kafka-metrics and
split the timer into an explicit rate and time? I think long term we ought
to move off yammer and use kafka-metrics only. Actually either is fine, but
we should ideally use only one in the long term - and I thought the plan
was to use kafka-metrics.
- metric #9 appears to be redundant since we already have per-API request
rate and time metrics.
- Same for metric #4, #5 (as there are request stats for DeleteTopicRequest
- although it is possible for users to trigger deletes via ZK)
- metric #2, #3 are potentially useful, but a bit overkill for a histogram.
An alternative is to stick to the last known value, but that doesn't play well
with alerts if a high value isn't reset/decayed. Perhaps metric #1 would be
sufficient to gauge slow start/resignation transitions.
- metric #1 - some of the states may actually overlap
- I don't actually understand the semantics of metric #6. Is it the rate of
partition reassignment triggers? Does the number of partitions matter?

Joel

On Thu, Apr 27, 2017 at 8:04 AM, Tom Crayford  wrote:

> Ismael,
>
> Great, that sounds lovely.
>
> I'd like a `Timer` (using yammer metrics parlance) over how long it took to
> process the event, so we can get at p99 and max times spent processing
> things. Maybe we could even do a log at warning level if event processing
> takes over some timeout?
>
> Thanks
>
> Tom
>
> On Thu, Apr 27, 2017 at 3:59 PM, Ismael Juma  wrote:
>
> > Hi Tom,
> >
> > Yes, the plan is to merge KAFKA-5028 first and then use a lock-free
> > approach for the new  metrics. I considered mentioning that in the KIP
> > given KAFKA-5120, but didn't in the end. I'll add it to make it clear.
> >
> > Regarding locks, they are removed by KAFKA-5028, as you say. So, if I
> > understand correctly, you are suggesting an event processing rate metric
> > with event type as a tag? Onur and Jun, what do you think?
> >
> > Ismael
> >
> > On Thu, Apr 27, 2017 at 3:47 PM, Tom Crayford 
> > wrote:
> >
> > > Hi,
> > >
> > > We (Heroku) are very excited about this KIP, as we've struggled a bit
> > with
> > > controller stability recently. Having these additional metrics would be
> > > wonderful.
> > >
> > > I'd like to ensure polling these metrics *doesn't* hold any locks etc,
> > > because, as noted in https://issues.apache.org/jira/browse/KAFKA-5120,
> > > that
> > > lock can be held for quite some time. This may become not an issue as
> of
> > > KAFKA-5028 though.
> > >
> > > Lastly, I'd love to see some metrics around how long the controller
> > spends
> > > inside its lock. We've been tracking an issue (
> > > https://issues.apache.org/jira/browse/KAFKA-5116) where it can hold
> the
> > > lock for many, many minutes in a zk client listener thread when
> > responding
> > > to a single request. I'm not sure how that plays into
> > > https://issues.apache.org/jira/browse/KAFKA-5028 (which I assume will
> > land
> > > before this metrics patch), but it feels like there will be equivalent
> > > problems ("how long does it spend processing any individual message
> from
> > > the queue, broken down by message type").
> > >
> > > These are minor improvements though, the addition of more metrics to
> the
> > > controller is already going to be very helpful.
> > >
> > > Thanks
> > >
> > > Tom Crayford
> > > Heroku Kafka
> > >
> > > On Thu, Apr 27, 2017 at 3:10 PM, Ismael Juma 
> wrote:
> > >
> > > > Hi all,
> > > >
> > > > We've posted "KIP-143: Controller Health Metrics" for discussion:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 143%3A+Controller+Health+Metrics
> > > >
> > > > Please take a look. Your feedback is appreciated.
> > > >
> > > > Thanks,
> > > > Ismael
> > > >
> > >
> >
>
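
For what it's worth, here is a sketch of the per-event timer Tom describes
above, using the yammer metrics API the broker currently depends on. The class
name, metric name, and five-second threshold are illustrative, not part of the
KIP.
{code}
import java.util.concurrent.TimeUnit;

import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.Timer;

// Times each controller event so p99/max processing time is visible, and logs
// a warning when a single event exceeds a threshold.
public class ControllerEventTiming {
    private static final Timer EVENT_TIMER =
        Metrics.newTimer(ControllerEventTiming.class, "ControllerEventProcessTimeMs",
                         TimeUnit.MILLISECONDS, TimeUnit.SECONDS);
    private static final long WARN_THRESHOLD_MS = 5000;

    static void process(Runnable event) {
        long startNs = System.nanoTime();
        try {
            event.run();
        } finally {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
            EVENT_TIMER.update(elapsedMs, TimeUnit.MILLISECONDS);
            if (elapsedMs > WARN_THRESHOLD_MS)
                System.err.println("WARN: controller event took " + elapsedMs + " ms");
        }
    }
}
{code}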


[jira] [Updated] (KAFKA-5107) remove preferred replica election state from ControllerContext

2017-04-27 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-5107:

Status: Patch Available  (was: In Progress)

> remove preferred replica election state from ControllerContext
> --
>
> Key: KAFKA-5107
> URL: https://issues.apache.org/jira/browse/KAFKA-5107
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
>
> KAFKA-5028 moves the controller to a single-threaded model, so we would no 
> longer have other work interleaved with preferred replica leader election, 
> meaning we don't need to keep its state.
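>
> For intuition, a sketch of the single-threaded model follows; names are 
> illustrative, not the actual controller classes. Every event runs to 
> completion on one thread, so nothing can interleave with an in-flight 
> preferred replica election.
> {code}
> import java.util.concurrent.LinkedBlockingQueue;
>
> // One queue, one thread: each event runs start-to-finish with nothing
> // interleaved, so no cross-event election state needs to be kept.
> public class SingleThreadedEventLoop {
>     private final LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
>     private final Thread worker = new Thread(() -> {
>         try {
>             while (true)
>                 queue.take().run();
>         } catch (InterruptedException e) {
>             Thread.currentThread().interrupt();
>         }
>     }, "controller-event-thread");
>
>     public void start() { worker.start(); }
>
>     public void add(Runnable event) { queue.add(event); }
> }
> {code}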



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (KAFKA-5107) remove preferred replica election state from ControllerContext

2017-04-27 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-5107 started by Onur Karaman.
---
> remove preferred replica election state from ControllerContext
> --
>
> Key: KAFKA-5107
> URL: https://issues.apache.org/jira/browse/KAFKA-5107
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
>
> KAFKA-5028 moves the controller to a single-threaded model, so we would no 
> longer have other work interleaved with preferred replica leader election, 
> meaning we don't need to keep its state.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2017-04-27 Thread Apurva Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987228#comment-15987228
 ] 

Apurva Mehta commented on KAFKA-4477:
-

Your stack trace alone doesn't indicate that you have hit this bug. You should 
also see an increased number of file descriptors on the leader, and all of it 
should be fixed after a bounce of the leader. Do you see that?

Given that we had at least 4 reports of this issue for 0.10.1 and all were 
resolved in 0.10.1.1 and later, I doubt you are hitting the same bug. 

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Fix For: 0.10.1.1
>
> Attachments: 2016_12_15.zip, issue_node_1001_ext.log, 
> issue_node_1001.log, issue_node_1002_ext.log, issue_node_1002.log, 
> issue_node_1003_ext.log, issue_node_1003.log, kafka.jstack, 
> state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has re-occurred in different 
> physical environments. We haven't worked out what is going on. We do, though, 
> have a nasty workaround to keep the service alive. 
> We have not had this issue on clusters still running 0.9.0.1.
> We have noticed a node randomly shrinking the ISRs for the partitions it owns 
> down to itself; moments later we see other nodes having disconnects, followed 
> finally by application issues where producing to these partitions is blocked.
> It seems that only restarting the Kafka instance's Java process resolves the 
> issue.
> We have had this occur multiple times, and according to all network and 
> machine monitoring, the machine never left the network or had any other 
> glitches.
> Below are seen logs from the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing 
> number of CLOSE_WAITs and open file descriptors.
> As a workaround to keep the service up, we are currently putting in an 
> automated process that tails the logs and matches the following regex; where 
> the new_partitions group is just the node itself, we restart the node.
> "\[(?P<date>.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P<old_partitions>.+) to (?P<new_partitions>.+) 
> \(kafka.cluster.Partition\)"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

