[jira] [Created] (KAFKA-3399) Offset/logfile corruption and kafka closing connections

2016-03-15 Thread Benjamin Vetter (JIRA)
Benjamin Vetter created KAFKA-3399:
--

 Summary: Offset/logfile corruption and kafka closing connections
 Key: KAFKA-3399
 URL: https://issues.apache.org/jira/browse/KAFKA-3399
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.9.0.1
Reporter: Benjamin Vetter


hi,

It seems Kafka created some kind of gap and corruption within the logfiles, 
apparently during log rotation.

The gap itself is of course no problem, but the "corruption" is, because Kafka 
is closing connections when I try to fetch messages around offset 73581, such 
that my worker can't proceed.

offset 73580 works:

{code}
irb(main):011:0> client.fetch_messages(:topic => "my-topic", :offset => 73580, 
:partition => 0, :max_wait_time => 10).last
=> #

irb(main):003:0> client.fetch_messages(:topic => "my-topic", :offset => 73581, 
:partition => 0, :max_wait_time => 10).last
Kafka::ConnectionError: Connection error: EOFError

irb(main):004:0> client.fetch_messages(:topic => "my-topic", :offset => 73582, 
:partition => 0, :max_wait_time => 10).last
Kafka::ConnectionError: Connection error: EOFError

...

irb(main):007:0> client.fetch_messages(:topic => "my-topic", :offset => 73641, 
:partition => 0, :max_wait_time => 10).first
Kafka::ConnectionError: Connection error: EOFError

irb(main):005:0> client.fetch_messages(:topic => "my-topic", :offset => 73642, 
:partition => 0, :max_wait_time => 10).last
=> #
{code}

The EOFError happens because Kafka is closing the connection.
Starting from offset 73642, fetching messages works again.
Unsurprisingly, the respective logfiles look like:

kafka kafka12256 Mär 15 07:54 .index
kafka kafka  6392949 Mär 15 07:54 .log
kafka kafka 10485760 Mär 15 07:54 00073581.index
kafka kafka  1520909 Mär 15 09:02 00073581.log
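
For reference, a fetch is served from the log segment whose base offset is the largest 
one not exceeding the requested offset; the listing above shows segments starting at 0 
and 73581. A small, purely illustrative Scala sketch of that floor lookup (not Kafka's 
actual Log code):

{code}
import java.util.TreeMap

object SegmentLookup extends App {
  // Base offsets of the two segments from the listing above.
  val segments = new TreeMap[java.lang.Long, String]()
  segments.put(0L, "segment with base offset 0")
  segments.put(73581L, "segment with base offset 73581")

  // A fetch is served from the segment with the greatest base offset <= the requested offset.
  for (offset <- Seq(73580L, 73581L, 73642L))
    println(s"offset $offset -> ${segments.floorEntry(offset).getValue}")
}
{code}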



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3399) Offset/logfile corruption and kafka closing connections

2016-03-15 Thread Benjamin Vetter (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Vetter resolved KAFKA-3399.

Resolution: Fixed

Sorry, this was a client library bug.

> Offset/logfile corruption and kafka closing connections
> ---
>
> Key: KAFKA-3399
> URL: https://issues.apache.org/jira/browse/KAFKA-3399
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Benjamin Vetter
>
> hi,
> It seems Kafka created some kind of gap and corruption within the logfiles, 
> apparently during log rotation.
> The gap itself is of course no problem, but the "corruption" is, because Kafka 
> is closing connections when I try to fetch messages around offset 73581, such 
> that my worker can't proceed.
> offset 73580 works:
> {code}
> irb(main):011:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73580, :partition => 0, :max_wait_time => 10).last
> => #
> irb(main):003:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73581, :partition => 0, :max_wait_time => 10).last
> Kafka::ConnectionError: Connection error: EOFError
> irb(main):004:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73582, :partition => 0, :max_wait_time => 10).last
> Kafka::ConnectionError: Connection error: EOFError
> ...
> irb(main):007:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73641, :partition => 0, :max_wait_time => 10).first
> Kafka::ConnectionError: Connection error: EOFError
> irb(main):005:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73642, :partition => 0, :max_wait_time => 10).last
> => #
> {code}
> The EOFError happens because Kafka is closing the connection.
> Starting from offset 73642, fetching messages works again.
> Unsurprisingly, the respective logfiles look like:
> kafka kafka12256 Mär 15 07:54 .index
> kafka kafka  6392949 Mär 15 07:54 .log
> kafka kafka 10485760 Mär 15 07:54 00073581.index
> kafka kafka  1520909 Mär 15 09:02 00073581.log



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3399) Offset/logfile corruption and kafka closing connections

2016-03-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195006#comment-15195006
 ] 

Ismael Juma commented on KAFKA-3399:


Thanks for letting us know.

> Offset/logfile corruption and kafka closing connections
> ---
>
> Key: KAFKA-3399
> URL: https://issues.apache.org/jira/browse/KAFKA-3399
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Benjamin Vetter
>
> hi,
> It seems Kafka created some kind of gap and corruption within the logfiles, 
> apparently during log rotation.
> The gap itself is of course no problem, but the "corruption" is, because Kafka 
> is closing connections when I try to fetch messages around offset 73581, such 
> that my worker can't proceed.
> offset 73580 works:
> {code}
> irb(main):011:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73580, :partition => 0, :max_wait_time => 10).last
> => #
> irb(main):003:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73581, :partition => 0, :max_wait_time => 10).last
> Kafka::ConnectionError: Connection error: EOFError
> irb(main):004:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73582, :partition => 0, :max_wait_time => 10).last
> Kafka::ConnectionError: Connection error: EOFError
> ...
> irb(main):007:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73641, :partition => 0, :max_wait_time => 10).first
> Kafka::ConnectionError: Connection error: EOFError
> irb(main):005:0> client.fetch_messages(:topic => "my-topic", :offset => 
> 73642, :partition => 0, :max_wait_time => 10).last
> => #
> {code}
> The EOFError happens because Kafka is closing the connection.
> Starting from offset 73642, fetching messages works again.
> Unsurprisingly, the respective logfiles look like:
> kafka kafka12256 Mär 15 07:54 .index
> kafka kafka  6392949 Mär 15 07:54 .log
> kafka kafka 10485760 Mär 15 07:54 00073581.index
> kafka kafka  1520909 Mär 15 09:02 00073581.log



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3202: System test that changes message v...

2016-03-15 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/1070

KAFKA-3202: System test that changes message version on the fly

@becketqin @apovzner please have a look. @becketqin the test fails when the 
producer and consumer are 0.9.x and the message format changes on the fly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka kafka-3202-format-change-fly

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1070.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1070


commit 25a07507e672e76b4861da7c6de1e15fedf62653
Author: Eno Thereska 
Date:   2016-03-15T09:50:12Z

System test that changes message version on the fly




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3202) Add system test for KIP-31 and KIP-32 - Change message format version on the fly

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195020#comment-15195020
 ] 

ASF GitHub Bot commented on KAFKA-3202:
---

GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/1070

KAFKA-3202: System test that changes message version on the fly

@becketqin @apovzner please have a look. @becketqin the test fails when the 
producer and consumer are 0.9.x and the message format changes on the fly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka kafka-3202-format-change-fly

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1070.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1070


commit 25a07507e672e76b4861da7c6de1e15fedf62653
Author: Eno Thereska 
Date:   2016-03-15T09:50:12Z

System test that changes message version on the fly




> Add system test for KIP-31 and KIP-32 - Change message format version on the 
> fly
> 
>
> Key: KAFKA-3202
> URL: https://issues.apache.org/jira/browse/KAFKA-3202
> Project: Kafka
>  Issue Type: Sub-task
>  Components: system tests
>Reporter: Jiangjie Qin
>Assignee: Eno Thereska
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> The system test should cover the case that message format changes are made 
> when clients are producing/consuming. The message format change should not 
> cause client-side issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3400) Topic stops working / can't describe topic

2016-03-15 Thread Tobias (JIRA)
Tobias created KAFKA-3400:
-

 Summary: Topic stops working / can't describe topic
 Key: KAFKA-3400
 URL: https://issues.apache.org/jira/browse/KAFKA-3400
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.9.0.1
Reporter: Tobias


We are seeing an issue where we intermittently (every couple of hours) get an 
error with certain topics. They stop working and producers throw a 
LeaderNotFoundException.
When we then try to use kafka-topics.sh to describe the topic, we get the error 
below.

Error while executing topic command : next on empty iterator
{noformat}
[2016-03-15 17:30:26,231] ERROR java.util.NoSuchElementException: next on empty 
iterator
at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
at scala.collection.IterableLike$class.head(IterableLike.scala:91)
at scala.collection.AbstractIterable.head(Iterable.scala:54)
at 
kafka.admin.TopicCommand$$anonfun$describeTopic$1.apply(TopicCommand.scala:198)
at 
kafka.admin.TopicCommand$$anonfun$describeTopic$1.apply(TopicCommand.scala:188)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at kafka.admin.TopicCommand$.describeTopic(TopicCommand.scala:188)
at kafka.admin.TopicCommand$.main(TopicCommand.scala:66)
at kafka.admin.TopicCommand.main(TopicCommand.scala)
 (kafka.admin.TopicCommand$)
{noformat}

If we delete the topic, it will start to work again for a while.
We can't see anything obvious in the logs, but we are happy to provide them if needed.
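
The stack trace bottoms out in IterableLike.head, i.e. describeTopic calls .head on a 
collection that is empty because no partition metadata came back for the topic. A 
purely illustrative Scala sketch of that failure mode and a defensive alternative (not 
the actual TopicCommand code):

{code}
object DescribeTopicSketch extends App {
  // No partition metadata came back for the topic, as in the failing describe above.
  val partitionMetadata: Seq[String] = Seq.empty

  // What the failing code path effectively does: .head on an empty collection throws
  // NoSuchElementException (in TopicCommand it is an Iterator-backed head, hence the
  // "next on empty iterator" message above).
  try println(s"first partition: ${partitionMetadata.head}")
  catch { case e: NoSuchElementException => println(s"describe failed: $e") }

  // A defensive variant that reports the missing metadata instead of throwing.
  partitionMetadata.headOption match {
    case Some(first) => println(s"first partition: $first")
    case None        => println("no partition metadata available for this topic")
  }
}
{code}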



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Add test that verifies fix for KAFKA-30...

2016-03-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1071

MINOR: Add test that verifies fix for KAFKA-3047

Also clean-up `LogTest` a little.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3047-explicit-offset-assignment-corrupt-log-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1071.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1071


commit aab8f78cee26b869f14b6f3652cbac2245362076
Author: Ismael Juma 
Date:   2016-03-15T11:08:21Z

Add test that verifies fix for KAFKA-3047




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3047) Explicit offset assignment in Log.append can corrupt the log

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195108#comment-15195108
 ] 

ASF GitHub Bot commented on KAFKA-3047:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1071

MINOR: Add test that verifies fix for KAFKA-3047

Also clean-up `LogTest` a little.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3047-explicit-offset-assignment-corrupt-log-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1071.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1071


commit aab8f78cee26b869f14b6f3652cbac2245362076
Author: Ismael Juma 
Date:   2016-03-15T11:08:21Z

Add test that verifies fix for KAFKA-3047




> Explicit offset assignment in Log.append can corrupt the log
> 
>
> Key: KAFKA-3047
> URL: https://issues.apache.org/jira/browse/KAFKA-3047
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.9.0.0
>Reporter: Maciek Makowski
>Assignee: Ismael Juma
> Fix For: 0.10.0.0
>
>
> {{Log.append()}} has {{assignOffsets}} parameter, which, when set to false, 
> should cause Kafka to use the offsets specified in the 
> {{ByteBufferMessageSet}} and not recalculate them based on 
> {{nextOffsetMetadata}}. However, in that function, {{appendInfo.firstOffset}} 
> is unconditionally set to {{nextOffsetMetadata.messageOffset}}. This can 
> cause corruption of the log in the following scenario:
> * {{nextOffsetMetadata.messageOffset}} is 2001
> * {{append(messageSet, assignOffsets = false)}} is called, where 
> {{messageSet}} contains offsets 1001...1500 
> * after {{val appendInfo = analyzeAndValidateMessageSet(messages)}} call, 
> {{appendInfo.firstOffset}} is 1001 and {{appendInfo.lastOffset}} is 1500
> * after {{appendInfo.firstOffset = nextOffsetMetadata.messageOffset}} call, 
> {{appendInfo.firstOffset}} is 2001 and {{appendInfo.lastOffset}} is 1500
> * consistency check {{if(!appendInfo.offsetsMonotonic || 
> appendInfo.firstOffset < nextOffsetMetadata.messageOffset)}} succeeds (the 
> second condition can never fail due to unconditional assignment) and writing 
> proceeds
> * the message set is appended to current log segment starting at offset 2001, 
> but the offsets in the set are 1001...1500
> * the system shuts down abruptly
> * on restart, the following unrecoverable error is reported: 
> {code}
> Exception in thread "main" kafka.common.InvalidOffsetException: Attempt to 
> append an offset (1001) to position 12345 no larger than the last offset 
> appended (1950) to xyz/.index.
>   at 
> kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:207)
>   at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
>   at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
>   at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
>   at kafka.log.LogSegment.recover(LogSegment.scala:188)
>   at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
>   at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
>   at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:778)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>   at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:777)
>   at kafka.log.Log.loadSegments(Log.scala:160)
>   at kafka.log.Log.(Log.scala:90)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
>   at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> {code} 
> *Proposed fix:* the assignment {{appendInfo.firstOffset = 
> nextOffsetMetadata.messageOffset}} should only happen in {{if 
> (assignOffsets)}} branch of code.
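
A simplified Scala sketch of the proposed fix (hypothetical stand-in types, not the real 
Log.scala): the broker only rewrites the offsets when it is assigning them itself, so the 
out-of-order check actually sees the caller-supplied firstOffset.

{code}
// Simplified, hypothetical stand-ins for Kafka's internal types.
case class LogAppendInfo(var firstOffset: Long, var lastOffset: Long, offsetsMonotonic: Boolean)

class LogSketch(var nextOffset: Long) {
  def append(info: LogAppendInfo, assignOffsets: Boolean): Unit = {
    if (assignOffsets) {
      // Only here may the broker replace the caller's offsets with its own counter.
      val count = info.lastOffset - info.firstOffset + 1
      info.firstOffset = nextOffset
      info.lastOffset = nextOffset + count - 1
    } else {
      // With the fix, firstOffset still holds the caller-supplied value, so this check
      // now rejects the 1001...1500-vs-2001 scenario instead of silently passing.
      if (!info.offsetsMonotonic || info.firstOffset < nextOffset)
        throw new IllegalArgumentException(s"Out of order offsets: $info, nextOffset=$nextOffset")
    }
    nextOffset = info.lastOffset + 1
  }
}
{code}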



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3401) Message format change on the fly breaks 0.9 consumer

2016-03-15 Thread Eno Thereska (JIRA)
Eno Thereska created KAFKA-3401:
---

 Summary: Message format change on the fly breaks 0.9 consumer
 Key: KAFKA-3401
 URL: https://issues.apache.org/jira/browse/KAFKA-3401
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.0.0
Reporter: Eno Thereska
Assignee: Jiangjie Qin
Priority: Blocker
 Fix For: 0.10.0.0


The new system test as part of KAFKA-3202 reveals a problem when the message 
format is changed on the fly. When the cluster is using 0.10.x brokers and 
producers and consumers use version 0.9.0.1 an error happens when the message 
format is changed on the fly to version 0.9:

{code}
Exception: {'ConsoleConsumer-worker-1': Exception('Unexpected message format 
(expected an integer). Message: null',)}
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3401) Message format change on the fly breaks 0.9 consumer

2016-03-15 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska updated KAFKA-3401:

Attachment: 2016-03-15--009.zip

Attaching error logs.

> Message format change on the fly breaks 0.9 consumer
> 
>
> Key: KAFKA-3401
> URL: https://issues.apache.org/jira/browse/KAFKA-3401
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Eno Thereska
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
> Attachments: 2016-03-15--009.zip
>
>
> The new system test as part of KAFKA-3202 reveals a problem when the message 
> format is changed on the fly. When the cluster is using 0.10.x brokers and 
> producers and consumers use version 0.9.0.1 an error happens when the message 
> format is changed on the fly to version 0.9:
> {code}
> Exception: {'ConsoleConsumer-worker-1': Exception('Unexpected message format 
> (expected an integer). Message: null',)}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: KAFKA-3373 follow-up, a few val renames...

2016-03-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1072

MINOR: KAFKA-3373 follow-up, a few val renames remaining



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-3373-follow-up

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1072.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1072


commit 13b9fbc6a60a3ace789794072b3ee7ed5c72e713
Author: Ismael Juma 
Date:   2016-03-15T12:19:12Z

MINOR: KAFKA-3373 follow-up, a few val renames remaining




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3373) Add `log` prefix to KIP-31/32 configs

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195191#comment-15195191
 ] 

ASF GitHub Bot commented on KAFKA-3373:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1072

MINOR: KAFKA-3373 follow-up, a few val renames remaining



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-3373-follow-up

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1072.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1072


commit 13b9fbc6a60a3ace789794072b3ee7ed5c72e713
Author: Ismael Juma 
Date:   2016-03-15T12:19:12Z

MINOR: KAFKA-3373 follow-up, a few val renames remaining




> Add `log` prefix to KIP-31/32 configs
> -
>
> Key: KAFKA-3373
> URL: https://issues.apache.org/jira/browse/KAFKA-3373
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> [~jjkoshy] suggested that we should prefix the configs introduced as part of 
> KIP-31/32 to include a `log` prefix:
> message.format.version
> message.timestamp.type
> message.timestamp.difference.max.ms
> If we do it, we must update the KIP.
> Marking it as blocker because we should decide either way before 0.10.0.0.
> Discussion here:
> https://github.com/apache/kafka/pull/907#issuecomment-193950768



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3401) Message format change on the fly breaks 0.9 consumer

2016-03-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195218#comment-15195218
 ] 

Ismael Juma commented on KAFKA-3401:


[~becket_qin], the upgrade notes say:

{code}
Note: By setting the message format version, one certifies that all 
existing messages are on or below that
message format version. Otherwise consumers before 0.10.0.0 might break. In 
particular, after the message format
is set to 0.10.0, one should not change it back to an earlier format as it 
may break consumers on versions before 0.10.0.0.
{code}

So, did you really mean that we should do step 4 in KAFKA-3202?

{code}
4. change the message format version for both topic back to 0.9.0 on the fly.
{code}

Or do we need to switch off older clients before we do that (under the 
assumption that the upgrade notes are correct)?
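
For reference, "changing the message format version on the fly" in these steps amounts to a 
topic-level config change along these lines (a sketch against the 0.9/0.10-era Scala admin 
API; class and method names are from memory, the topic name is a placeholder, and the exact 
signatures may differ):

{code}
import java.util.Properties
import kafka.admin.AdminUtils
import kafka.utils.ZkUtils

object ChangeMessageFormat extends App {
  val zkUtils = ZkUtils("localhost:2181", 30000, 30000, isZkSecurityEnabled = false)
  try {
    val props = new Properties()
    // Topic-level override from KIP-31/32; the broker-level default is the
    // (to-be log-prefixed, per KAFKA-3373) message format config.
    props.put("message.format.version", "0.9.0")
    AdminUtils.changeTopicConfig(zkUtils, "test-topic", props)
  } finally zkUtils.close()
}
{code}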

> Message format change on the fly breaks 0.9 consumer
> 
>
> Key: KAFKA-3401
> URL: https://issues.apache.org/jira/browse/KAFKA-3401
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Eno Thereska
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
> Attachments: 2016-03-15--009.zip
>
>
> The new system test as part of KAFKA-3202 reveals a problem when the message 
> format is changed on the fly. When the cluster is using 0.10.x brokers and 
> producers and consumers use version 0.9.0.1 an error happens when the message 
> format is changed on the fly to version 0.9:
> {code}
> Exception: {'ConsoleConsumer-worker-1': Exception('Unexpected message format 
> (expected an integer). Message: null',)}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3394) Broker fails to parse Null Metadata in OffsetCommit requests

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3394:
---
Reviewer: Jun Rao
  Status: Patch Available  (was: Open)

> Broker fails to parse Null Metadata in OffsetCommit requests
> 
>
> Key: KAFKA-3394
> URL: https://issues.apache.org/jira/browse/KAFKA-3394
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0
>Reporter: Magnus Edenhill
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> librdkafka sends a Null Metadata string (size -1) in its OffsetCommitRequests 
> when there is no metadata; this unfortunately leads to an exception on the 
> broker, which expects a non-null string.
> {noformat}
> [2016-03-11 11:11:57,623] ERROR Closing socket for 
> 10.191.0.33:9092-10.191.0.33:56503 because of error (kafka.network.Processor)
> kafka.network.InvalidRequestException: Error getting request for apiKey: 8 
> and apiVersion: 1
> at 
> kafka.network.RequestChannel$Request.liftedTree2$1(RequestChannel.scala:91)
> at kafka.network.RequestChannel$Request.(RequestChannel.scala:88)
> at kafka.network.Processor$$anonfun$run$11.apply(SocketServer.scala:426)
> at kafka.network.Processor$$anonfun$run$11.apply(SocketServer.scala:421)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.network.Processor.run(SocketServer.scala:421)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kafka.common.protocol.types.SchemaException: Error 
> reading field 'topics': Error reading field 'partitions': Error reading field 
> 'metadata': java.lang.NegativeArraySizeException
> at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:73)
> at 
> org.apache.kafka.common.requests.OffsetCommitRequest.parse(OffsetCommitRequest.java:260)
> at 
> org.apache.kafka.common.requests.AbstractRequest.getRequest(AbstractRequest.java:50)
> at 
> kafka.network.RequestChannel$Request.liftedTree2$1(RequestChannel.scala:88)
> ... 9 more
> {noformat}
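
The NegativeArraySizeException comes from decoding the -1 length. A purely illustrative 
Scala sketch of the difference between a null-intolerant and a null-tolerant string 
decoder (the real fix belongs in the protocol schema, where the field should be treated 
as a nullable string):

{code}
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object NullableStringSketch extends App {
  // Kafka protocol strings are an INT16 length followed by that many bytes; a null
  // nullable string is encoded as length -1, which is what librdkafka sends here.
  def writeNullableString(buf: ByteBuffer, s: String): Unit =
    if (s == null) buf.putShort((-1).toShort)
    else { val b = s.getBytes(StandardCharsets.UTF_8); buf.putShort(b.length.toShort); buf.put(b) }

  // Null-intolerant read (for contrast): new Array[Byte](-1) throws NegativeArraySizeException.
  def readString(buf: ByteBuffer): String = {
    val len = buf.getShort
    val bytes = new Array[Byte](len)
    buf.get(bytes)
    new String(bytes, StandardCharsets.UTF_8)
  }

  // Null-tolerant read: treat a length of -1 as null.
  def readNullableString(buf: ByteBuffer): String = {
    val len = buf.getShort
    if (len < 0) null
    else { val bytes = new Array[Byte](len); buf.get(bytes); new String(bytes, StandardCharsets.UTF_8) }
  }

  val buf = ByteBuffer.allocate(16)
  writeNullableString(buf, null)
  buf.flip()
  println(readNullableString(buf)) // prints null instead of blowing up the request parser
}
{code}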



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2832) support exclude.internal.topics in new consumer

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2832:
---
Assignee: Ismael Juma  (was: Vahid Hashemian)
Reviewer: Gwen Shapira

> support exclude.internal.topics in new consumer
> ---
>
> Key: KAFKA-2832
> URL: https://issues.apache.org/jira/browse/KAFKA-2832
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients
>Reporter: Jun Rao
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> The old consumer supports exclude.internal.topics that prevents internal 
> topics from being consumed by default. It would be useful to add that in the 
> new consumer, especially when wildcards are used.
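
For illustration, this is the kind of usage the config would protect; the property name 
below assumes the new consumer reuses the old consumer's exclude.internal.topics name 
(an assumption, since adding the config is exactly what this ticket is about):

{code}
import java.util.Properties
import java.util.regex.Pattern
import org.apache.kafka.clients.consumer.{ConsumerRebalanceListener, KafkaConsumer}
import org.apache.kafka.common.TopicPartition

object WildcardConsumerSketch extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "wildcard-group")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  // Assumed property name, mirroring the old consumer's config.
  props.put("exclude.internal.topics", "true")

  val consumer = new KafkaConsumer[String, String](props)
  // Wildcard subscription is where accidentally matching internal topics such as
  // __consumer_offsets hurts most.
  consumer.subscribe(Pattern.compile(".*"), new ConsumerRebalanceListener {
    override def onPartitionsRevoked(partitions: java.util.Collection[TopicPartition]): Unit = ()
    override def onPartitionsAssigned(partitions: java.util.Collection[TopicPartition]): Unit = ()
  })
  consumer.close()
}
{code}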



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3402) Restore behaviour of MetadataCache.getTopicMetadata when unsupported security protocol is received

2016-03-15 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-3402:
--

 Summary: Restore behaviour of MetadataCache.getTopicMetadata when 
unsupported security protocol is received
 Key: KAFKA-3402
 URL: https://issues.apache.org/jira/browse/KAFKA-3402
 Project: Kafka
  Issue Type: Bug
Reporter: Ismael Juma
Assignee: Ismael Juma
Priority: Critical
 Fix For: 0.10.0.0


The behaviour of `MetadataCache.getTopicMetadata` when a security protocol that 
is not supported by a broker is passed as a parameter was to throw 
BrokerEndpointNotAvailableException until 
https://github.com/apache/kafka/commit/764d8ca9eb0aba6099ba289a10f437e72b53ffec 
when it was accidentally changed to return a response with error code 
`LeaderNotAvailable` or `ReplicaNotAvailable`.

The right behaviour is probably to return a new error code, but in the meantime 
we should restore the previous behaviour and add some tests.

This issue was found by [~hachikuji].
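
A simplified Scala sketch of the behaviour this ticket restores (hypothetical stand-in 
types, not the real MetadataCache): an endpoint lookup for an unsupported protocol 
throws, rather than being reported as a leader/replica availability problem.

{code}
// Hypothetical, simplified types: a broker advertises one endpoint per security protocol.
class BrokerEndpointNotAvailableException(msg: String) extends RuntimeException(msg)

case class BrokerSketch(id: Int, endpoints: Map[String, String]) { // protocol -> host:port
  def endpointFor(protocol: String): String =
    endpoints.getOrElse(protocol, throw new BrokerEndpointNotAvailableException(
      s"End point with security protocol $protocol not found for broker $id"))
}

object EndpointLookupSketch extends App {
  val broker = BrokerSketch(0, Map("PLAINTEXT" -> "broker0:9092"))
  println(broker.endpointFor("PLAINTEXT"))
  // broker.endpointFor("SSL") would throw BrokerEndpointNotAvailableException, the
  // pre-regression behaviour, rather than mapping to LeaderNotAvailable/ReplicaNotAvailable.
}
{code}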



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2832) support exclude.internal.topics in new consumer

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2832:
---
Assignee: Vahid Hashemian  (was: Ismael Juma)

> support exclude.internal.topics in new consumer
> ---
>
> Key: KAFKA-2832
> URL: https://issues.apache.org/jira/browse/KAFKA-2832
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients
>Reporter: Jun Rao
>Assignee: Vahid Hashemian
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> The old consumer supports exclude.internal.topics that prevents internal 
> topics from being consumed by default. It would be useful to add that in the 
> new consumer, especially when wildcards are used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2832) support exclude.internal.topics in new consumer

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2832:
---
Reviewer:   (was: Gwen Shapira)

> support exclude.internal.topics in new consumer
> ---
>
> Key: KAFKA-2832
> URL: https://issues.apache.org/jira/browse/KAFKA-2832
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients
>Reporter: Jun Rao
>Assignee: Vahid Hashemian
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> The old consumer supports exclude.internal.topics that prevents internal 
> topics from being consumed by default. It would be useful to add that in the 
> new consumer, especially when wildcards are used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3402; Restore behaviour of MetadataCache...

2016-03-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1073

KAFKA-3402; Restore behaviour of MetadataCache.getTopicMetadata when 
unsupported security protocol is received



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3402-restore-get-topic-metadata-behaviour

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1073.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1073


commit f314f180dae4a80a2b47120e6188137bf679ff7e
Author: Ismael Juma 
Date:   2016-03-15T14:11:53Z

Restore behaviour of MetadataCache.getTopicMetadata when unsupported 
security protocol is received




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3402) Restore behaviour of MetadataCache.getTopicMetadata when unsupported security protocol is received

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195347#comment-15195347
 ] 

ASF GitHub Bot commented on KAFKA-3402:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1073

KAFKA-3402; Restore behaviour of MetadataCache.getTopicMetadata when 
unsupported security protocol is received



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3402-restore-get-topic-metadata-behaviour

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1073.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1073


commit f314f180dae4a80a2b47120e6188137bf679ff7e
Author: Ismael Juma 
Date:   2016-03-15T14:11:53Z

Restore behaviour of MetadataCache.getTopicMetadata when unsupported 
security protocol is received




> Restore behaviour of MetadataCache.getTopicMetadata when unsupported security 
> protocol is received
> --
>
> Key: KAFKA-3402
> URL: https://issues.apache.org/jira/browse/KAFKA-3402
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> The behaviour of `MetadataCache.getTopicMetadata` when a security protocol 
> that is not supported by a broker is passed as a parameter was to throw 
> BrokerEndpointNotAvailableException until 
> https://github.com/apache/kafka/commit/764d8ca9eb0aba6099ba289a10f437e72b53ffec
>  when it was accidentally changed to return a response with error code 
> `LeaderNotAvailable` or `ReplicaNotAvailable`.
> The right behaviour is probably to return a new error code, but in the 
> meantime we should restore the previous behaviour and add some tests.
> This issue was found by [~hachikuji].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3402) Restore behaviour of MetadataCache.getTopicMetadata when unsupported security protocol is received

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3402:
---
Status: Patch Available  (was: Open)

> Restore behaviour of MetadataCache.getTopicMetadata when unsupported security 
> protocol is received
> --
>
> Key: KAFKA-3402
> URL: https://issues.apache.org/jira/browse/KAFKA-3402
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> The behaviour of `MetadataCache.getTopicMetadata` when a security protocol 
> that is not supported by a broker is passed as a parameter was to throw 
> BrokerEndpointNotAvailableException until 
> https://github.com/apache/kafka/commit/764d8ca9eb0aba6099ba289a10f437e72b53ffec
>  when it was accidentally changed to return a response with error code 
> `LeaderNotAvailable` or `ReplicaNotAvailable`.
> The right behaviour is probably to return a new error code, but in the 
> meantime we should restore the previous behaviour and add some tests.
> This issue was found by [~hachikuji].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3403) Upgrade ZkClient to 0.8

2016-03-15 Thread Grant Henke (JIRA)
Grant Henke created KAFKA-3403:
--

 Summary: Upgrade ZkClient to 0.8
 Key: KAFKA-3403
 URL: https://issues.apache.org/jira/browse/KAFKA-3403
 Project: Kafka
  Issue Type: Improvement
Reporter: Grant Henke
Assignee: Grant Henke


KAFKA-3328 requires a versioned delete that [~fpj] added to ZkClient, which will 
be available in the 0.8 release. Once it is released, we should upgrade to ZkClient 
0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3282) Change tools to use --new-consumer by default and introduce --old-consumer

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3282:
---
Fix Version/s: (was: 0.10.0.0)
   0.10.1.0

> Change tools to use --new-consumer by default and introduce --old-consumer
> --
>
> Key: KAFKA-3282
> URL: https://issues.apache.org/jira/browse/KAFKA-3282
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Ismael Juma
> Fix For: 0.10.1.0
>
>
> This only applies to tools that support the new consumer and it's similar to 
> what we did with the producer for 0.9.0.0.
> Part of this JIRA is updating the documentation to remove `--new-consumer` 
> from command invocations where appropriate. An example where this will be the 
> case is in the security documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3283) Consider marking the new consumer as production-ready

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3283:
---
Priority: Major  (was: Critical)

> Consider marking the new consumer as production-ready
> -
>
> Key: KAFKA-3283
> URL: https://issues.apache.org/jira/browse/KAFKA-3283
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
> Fix For: 0.10.1.0
>
>
> Ideally, we would:
> * Remove the beta label
> * Remove the `Unstable` annotation
> * Fill any critical gaps in functionality
> * Update the documentation on the old consumers to recommend the new consumer 
> (without deprecating the old consumer, however)
> The new consumer already looks pretty good in 0.9.0.1 so it's feasible that 
> we may be able to do this for 0.10.0.0. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3283) Consider marking the new consumer as production-ready

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3283:
---
Fix Version/s: (was: 0.10.0.0)
   0.10.1.0

> Consider marking the new consumer as production-ready
> -
>
> Key: KAFKA-3283
> URL: https://issues.apache.org/jira/browse/KAFKA-3283
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> Ideally, we would:
> * Remove the beta label
> * Remove the `Unstable` annotation
> * Fill any critical gaps in functionality
> * Update the documentation on the old consumers to recommend the new consumer 
> (without deprecating the old consumer, however)
> The new consumer already looks pretty good in 0.9.0.1 so it's feasible that 
> we may be able to do this for 0.10.0.0. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3177) Kafka consumer can hang when position() is called on a non-existing partition.

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3177:
---
Priority: Critical  (was: Major)

> Kafka consumer can hang when position() is called on a non-existing partition.
> --
>
> Key: KAFKA-3177
> URL: https://issues.apache.org/jira/browse/KAFKA-3177
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> This can be easily reproduced as following:
> {code}
> {
> ...
> consumer.assign(SomeNonExsitingTopicParition);
> consumer.position();
> ...
> }
> {code}
> It seems when position is called we will try to do the following:
> 1. Fetch committed offsets.
> 2. If there are no committed offsets, try to reset the offset using the reset 
> strategy. In sendListOffsetRequest(), if the consumer does not know the 
> TopicPartition, it will refresh its metadata and retry. In this case, because 
> the partition does not exist, we fall into an infinite loop of refreshing 
> topic metadata.
> Another orthogonal issue is that if the topic in the above code snippet does 
> not exist, the position() call will actually create the topic, because 
> currently a topic metadata request can automatically create the topic. 
> This is a known, separate issue.
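
Fleshing out the reproduction above against the new consumer's Java API (a sketch; the 
broker address and topic name are placeholders):

{code}
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

object PositionHangRepro extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "position-hang-repro")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")

  val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](props)
  val missing = new TopicPartition("some-nonexistent-topic", 0)
  consumer.assign(Collections.singletonList(missing))
  // With no committed offset and no such partition, the offset-reset path keeps
  // refreshing metadata and never returns: the hang described above.
  println(consumer.position(missing))
}
{code}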



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-4 Wiki Update

2016-03-15 Thread Grant Henke
Moving the relevant wiki text here for discussion/tracking:
>
> Server-side Admin Request handlers
>
> At the highest level, admin requests will be handled on the brokers the
> same way that all message types are. However, because admin messages modify
> cluster metadata they should be handled by the controller. This allows the
> controller to propagate the changes to the rest of the cluster.  However,
> because the messages need to be handled by the controller does not
> necessarily mean they need to be sent directly to the controller. A message
> forwarding mechanism can be used to forward the message from any broker to
> the correct broker for handling.
>
> Because supporting all of this is quite the undertaking I will describe
> the "ideal functionality" and then the "intermediate functionality" that
> gets us some basic administrative support quickly while working towards the
> optimal state.
>
> *Ideal Functionality:*
>
>1. A client sends an admin request to *any* broker
>2. The admin request is forwarded to the required broker (likely the
>controller)
>3. The request is handled and the server blocks until a timeout is
>reached or the requested operation is completed (failure or success)
>   1. An operation is considered complete/successful when *all
>   required nodes have the correct/current state*.
>   2. Immediate follow up requests to *any broker* will succeed.
>   3. Requests that timeout may still be completed after the timeout.
>   The users would need to poll to check the state.
>4. The response is generated and forwarded back to the broker that
>received the request.
>5. A response is sent back to the client.
>
> *Intermediate Functionality*:
>
>1. A client sends an admin request to *the controller* broker
>   1. As a follow up request forwarding can be added transparently.
>   (see below)
>2. The request is handled and the server blocks until a timeout is
>reached or the requested operation is completed (failure or success)
>   1. An operation is considered complete/successful when *the
>   controller node has the correct/current state.*
>   2. Immediate follow up requests to *the controller* will succeed.
>   Others (not to the controller) are likely to succeed or cause a 
> retriable
>   exception that would eventually succeed.
>   3. Requests that timeout may still be completed after the timeout.
>   The users would need to poll to check the state.
>3. A response is sent back to the client.
>
> The ideal functionality has 2 features that are more challenging
> initially. For that reason those features will be removed from the initial
> changes, but will be tracked as follow-up improvements. However, this
> intermediate solution should allow for a relatively transparent transition
> to the ideal functionality.
>
> *Request Forwarding: KAFKA-1912*
>
> Request forwarding is relevant to any message that needs to be sent to the
> "correct" broker (ex: partition leader, group coordinator, etc). Though at
> first it may seem simple, it has many technical challenges that need to be
> decided in regards to connections, failures, retries, etc. Today, we depend
> on the client to choose the correct broker, and clients that want to utilize
> the cluster "optimally" would likely continue to do so. For those reasons
> it can be handled generically as an independent feature.
>
> *Cluster Consistent Blocking:*
>
> Blocking an admin request until the entire cluster is aware of the
> correct/current state is difficult based on Kafka's current approach for
> propagating metadata. This approach varies based on the metadata being
> changed.
>
>- Topic metadata changes are propagated via UpdateMetadata and
>LeaderAndIsr requests
>- Config changes are propagated via zookeeper and listeners
>- ACL changes depend on the implementation of the Authorizer interface
>   - The default SimpleACLAuthorizer uses zookeeper and listeners
>
> Though all of these mechanisms are different, they are all commonly
> "eventually consistent". None of the mechanisms, as currently implemented,
> will block until the metadata has been propagated successfully. Changing
> this behavior would require a large amount of change to the
> KafkaController, additional inter-broker messages, and potentially a change
> to the Authorizer interface. These are all changes that should not
> block the implementation of KIP-4.
>
> The intermediate changes in KIP-4 should allow an easy transition to
> "complete blocking" when the work can be done. This is supported by
> providing *optional* local blocking in the meantime. This local blocking
> only blocks until the local state on the controller is correct. We will
> still provide a polling mechanism for users that do not want to block at
> all. A polling mechanism is required in the optimal implementation t

[jira] [Updated] (KAFKA-3284) Consider removing beta label in security documentation

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3284:
---
Fix Version/s: (was: 0.10.0.0)
   0.10.0.1

> Consider removing beta label in security documentation
> --
>
> Key: KAFKA-3284
> URL: https://issues.apache.org/jira/browse/KAFKA-3284
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.1
>
>
> We currently state that our security support is beta. It would be good to 
> remove that for 0.10.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3173) Error while moving some partitions to OnlinePartition state

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3173:
---
Reviewer: Jun Rao
  Status: Patch Available  (was: Open)

> Error while moving some partitions to OnlinePartition state 
> 
>
> Key: KAFKA-3173
> URL: https://issues.apache.org/jira/browse/KAFKA-3173
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
>Priority: Critical
> Fix For: 0.10.0.0
>
> Attachments: KAFKA-3173-race-repro.patch
>
>
> We observed another instance of the problem reported in KAFKA-2300, but this 
> time the error appeared in the partition state machine. In KAFKA-2300, we 
> haven't cleaned up the state in {{PartitionStateMachine}} and 
> {{ReplicaStateMachine}} as we do in {{KafkaController}}.
> Here is the stack trace:
> {noformat}
> 2016-01-29 15:26:51,393] ERROR [Partition state machine on Controller 0]: 
> Error while moving some partitions to OnlinePartition state 
> (kafka.controller.PartitionStateMachine)java.lang.IllegalStateException: 
> Controller to broker state change requests batch is not empty while creating 
> a new one. 
> Some LeaderAndIsr state changes Map(0 -> Map(foo-0 -> (LeaderAndIsrInfo:
> (Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:0)))
>  might be lostat 
> kafka.controller.ControllerBrokerRequestBatch.newBatch(ControllerChannelManager.scala:254)
> at 
> kafka.controller.PartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:144)
> at 
> kafka.controller.KafkaController.onNewPartitionCreation(KafkaController.scala:517)
> at 
> kafka.controller.KafkaController.onNewTopicCreation(KafkaController.scala:504)
> at 
> kafka.controller.PartitionStateMachine$TopicChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(PartitionStateMachine.scala:437)
> at 
> kafka.controller.PartitionStateMachine$TopicChangeListener$$anonfun$handleChildChange$1.apply(PartitionStateMachine.scala:419)
> at 
> kafka.controller.PartitionStateMachine$TopicChangeListener$$anonfun$handleChildChange$1.apply(PartitionStateMachine.scala:419)
> at 
> kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)at 
> kafka.controller.PartitionStateMachine$TopicChangeListener.handleChildChange(PartitionStateMachine.scala:418)
> at 
> org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:842)at 
> org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3173) Error while moving some partitions to OnlinePartition state

2016-03-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195412#comment-15195412
 ] 

Ismael Juma commented on KAFKA-3173:


PR from [~fpj] is here: https://github.com/apache/kafka/pull/1041

> Error while moving some partitions to OnlinePartition state 
> 
>
> Key: KAFKA-3173
> URL: https://issues.apache.org/jira/browse/KAFKA-3173
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
>Priority: Critical
> Fix For: 0.10.0.0
>
> Attachments: KAFKA-3173-race-repro.patch
>
>
> We observed another instance of the problem reported in KAFKA-2300, but this 
> time the error appeared in the partition state machine. In KAFKA-2300, we 
> haven't cleaned up the state in {{PartitionStateMachine}} and 
> {{ReplicaStateMachine}} as we do in {{KafkaController}}.
> Here is the stack trace:
> {noformat}
> 2016-01-29 15:26:51,393] ERROR [Partition state machine on Controller 0]: 
> Error while moving some partitions to OnlinePartition state 
> (kafka.controller.PartitionStateMachine)java.lang.IllegalStateException: 
> Controller to broker state change requests batch is not empty while creating 
> a new one. 
> Some LeaderAndIsr state changes Map(0 -> Map(foo-0 -> (LeaderAndIsrInfo:
> (Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:0)))
>  might be lostat 
> kafka.controller.ControllerBrokerRequestBatch.newBatch(ControllerChannelManager.scala:254)
> at 
> kafka.controller.PartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:144)
> at 
> kafka.controller.KafkaController.onNewPartitionCreation(KafkaController.scala:517)
> at 
> kafka.controller.KafkaController.onNewTopicCreation(KafkaController.scala:504)
> at 
> kafka.controller.PartitionStateMachine$TopicChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(PartitionStateMachine.scala:437)
> at 
> kafka.controller.PartitionStateMachine$TopicChangeListener$$anonfun$handleChildChange$1.apply(PartitionStateMachine.scala:419)
> at 
> kafka.controller.PartitionStateMachine$TopicChangeListener$$anonfun$handleChildChange$1.apply(PartitionStateMachine.scala:419)
> at 
> kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)at 
> kafka.controller.PartitionStateMachine$TopicChangeListener.handleChildChange(PartitionStateMachine.scala:418)
> at 
> org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:842)at 
> org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2311) Consumer's ensureNotClosed method not thread safe

2016-03-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195419#comment-15195419
 ] 

Ismael Juma commented on KAFKA-2311:


[~timbrooks], would you like to submit a PR as per 
http://kafka.apache.org/contributing.html ? Thanks.

> Consumer's ensureNotClosed method not thread safe
> -
>
> Key: KAFKA-2311
> URL: https://issues.apache.org/jira/browse/KAFKA-2311
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Tim Brooks
>Assignee: Tim Brooks
> Fix For: 0.10.0.0
>
> Attachments: KAFKA-2311.patch, KAFKA-2311.patch
>
>
> When a call to the consumer is made, the first check is to see that the 
> consumer is not closed. This variable is not volatile so there is no 
> guarantee previous stores will be visible before a read.
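
A minimal Scala sketch of the concern and the fix direction (illustrative only, not the 
consumer's actual code): the close flag needs a visibility guarantee such as volatile.

{code}
class ClosableConsumerSketch {
  // @volatile guarantees that a close() performed on one thread is visible to
  // ensureNotClosed() running on another thread.
  @volatile private var closed = false

  private def ensureNotClosed(): Unit =
    if (closed) throw new IllegalStateException("This consumer has already been closed.")

  def poll(timeoutMs: Long): Unit = {
    ensureNotClosed()
    // ... fetch records ...
  }

  def close(): Unit = closed = true
}
{code}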



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3123) Follower Broker cannot start if offsets are already out of range

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3123:
---
Priority: Critical  (was: Major)

> Follower Broker cannot start if offsets are already out of range
> 
>
> Key: KAFKA-3123
> URL: https://issues.apache.org/jira/browse/KAFKA-3123
> Project: Kafka
>  Issue Type: Bug
>  Components: core, replication
>Affects Versions: 0.9.0.0
>Reporter: Soumyajit Sahu
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: patch
> Fix For: 0.10.0.0
>
> Attachments: 
> 0001-Fix-Follower-crashes-when-offset-out-of-range-during.patch
>
>
> I was trying to upgrade our test Windows cluster from 0.8.1.1 to 0.9.0 one 
> machine at a time. Our logs have just 2 hours of retention. I had re-imaged 
> the test machine under consideration, and got the following error in a loop 
> after starting afresh with the 0.9.0 broker:
> [2016-01-19 13:57:28,809] WARN [ReplicaFetcherThread-1-169595708], Replica 
> 15588 for partition [EventLogs4,1] reset its fetch offset from 0 to 
> current leader 169595708's start offset 334086 
> (kafka.server.ReplicaFetcherThread)
> [2016-01-19 13:57:28,809] ERROR [ReplicaFetcherThread-1-169595708], Error 
> getting offset for partition [EventLogs4,1] to broker 169595708 
> (kafka.server.ReplicaFetcherThread)
> java.lang.IllegalStateException: Compaction for partition [EXO_EventLogs4,1] 
> cannot be aborted and paused since it is in LogCleaningPaused state.
>   at 
> kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply$mcV$sp(LogCleanerManager.scala:149)
>   at 
> kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140)
>   at 
> kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
>   at 
> kafka.log.LogCleanerManager.abortAndPauseCleaning(LogCleanerManager.scala:140)
>   at kafka.log.LogCleaner.abortAndPauseCleaning(LogCleaner.scala:141)
>   at kafka.log.LogManager.truncateFullyAndStartAt(LogManager.scala:304)
>   at 
> kafka.server.ReplicaFetcherThread.handleOffsetOutOfRange(ReplicaFetcherThread.scala:185)
>   at 
> kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:152)
>   at 
> kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:122)
>   at scala.Option.foreach(Option.scala:236)
>   at 
> kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:122)
>   at 
> kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:120)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>   at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>   at 
> kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:120)
>   at 
> kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120)
>   at 
> kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
>   at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
>   at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:93)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> I could unblock myself with a code change. I deleted the action for "case s 
> =>" in LogCleanerManager.scala's abortAndPauseCleaning(). I think we 
> should not throw an exception if the state is already LogCleaningAborted or 
> LogCleaningPaused in this function, but instead just let it roll.
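
A simplified Scala sketch of the change described above (the state names mirror those in 
the error message; this is not the actual LogCleanerManager code): treat "already 
aborted/paused" as a no-op instead of throwing.

{code}
sealed trait LogCleaningState
case object LogCleaningInProgress extends LogCleaningState
case object LogCleaningAborted extends LogCleaningState
case object LogCleaningPaused extends LogCleaningState

class CleanerStateSketch(var state: Option[LogCleaningState]) {
  def abortAndPauseCleaning(): Unit = state match {
    case None                        => state = Some(LogCleaningPaused)  // nothing in progress: just pause
    case Some(LogCleaningInProgress) => state = Some(LogCleaningAborted) // ask the cleaner to abort and pause
    case Some(LogCleaningAborted) | Some(LogCleaningPaused) =>
      // The reporter's suggestion: already aborted/paused is fine, so don't throw
      // IllegalStateException here; just let it roll.
      ()
  }
}
{code}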



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3383) Producer should not remove an in flight request before successfully parsing the response.

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3383:
---
Priority: Critical  (was: Major)

> Producer should not remove an in flight request before successfully parsing 
> the response.
> -
>
> Key: KAFKA-3383
> URL: https://issues.apache.org/jira/browse/KAFKA-3383
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: chen zhu
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> In the NetworkClient, we remove the in flight request before we successfully 
> parse the response. If the response parsing fails, the request will not be 
> fulfilled but simply lost. For a producer request, that means the callbacks of 
> the messages will never be fired.
> We should only remove the in flight request after response parsing succeeds.
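A simplified sketch of that ordering, assuming a hypothetical single-connection client with a FIFO in-flight queue (this is not the actual NetworkClient code): the oldest request is removed only after its response parses, so a malformed response cannot silently drop it.

{code}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Illustrative only -- not the real NetworkClient.
public class InFlightOrderingSketch {
    static final class InFlightRequest {
        final Consumer<Object> callback;   // completed with the parsed response, or an error later
        InFlightRequest(Consumer<Object> callback) { this.callback = callback; }
    }

    private final Deque<InFlightRequest> inFlight = new ArrayDeque<>();

    void send(InFlightRequest request) {
        inFlight.addLast(request);
    }

    void handleResponse(ByteBuffer payload) {
        InFlightRequest oldest = inFlight.peekFirst();   // look, but do NOT remove yet
        if (oldest == null)
            return;
        Object parsed = parse(payload);                  // may throw on a malformed response; the
                                                         // request then stays in flight and can still
                                                         // be completed when the connection is closed
        inFlight.pollFirst();                            // removed only after parsing succeeded
        oldest.callback.accept(parsed);
    }

    private Object parse(ByteBuffer payload) {
        return payload.remaining();                      // stands in for real response deserialization
    }
}
{code}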



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1464) Add a throttling option to the Kafka replication tool

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-1464:
---
Fix Version/s: (was: 0.10.0.0)
   0.10.1.0

> Add a throttling option to the Kafka replication tool
> -
>
> Key: KAFKA-1464
> URL: https://issues.apache.org/jira/browse/KAFKA-1464
> Project: Kafka
>  Issue Type: New Feature
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: mjuarez
>Assignee: Ismael Juma
>Priority: Minor
>  Labels: replication, replication-tools
> Fix For: 0.10.1.0
>
>
> When performing replication on new nodes of a Kafka cluster, the replication 
> process will use all available resources to replicate as fast as possible.  
> This causes performance issues (mostly disk IO and sometimes network 
> bandwidth) in a production environment, where you're trying to serve 
> downstream applications at the same time as you're performing 
> maintenance on the Kafka cluster.
> An option to throttle the replication to a specific rate (in either MB/s or 
> activities/second) would help production systems to better handle maintenance 
> tasks while still serving downstream applications.
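Purely as an illustration of the kind of throttle being asked for (not the mechanism this JIRA eventually produced), a byte-rate limit can be a small token bucket that a replica fetcher consults before appending fetched data:

{code}
// Illustrative token-bucket byte-rate limiter; maxBytesPerSec is the configured throttle.
public class ReplicationThrottleSketch {
    private final double maxBytesPerSec;
    private double availableBytes;
    private long lastRefillNanos;

    public ReplicationThrottleSketch(double maxBytesPerSec) {
        this.maxBytesPerSec = maxBytesPerSec;
        this.availableBytes = maxBytesPerSec;
        this.lastRefillNanos = System.nanoTime();
    }

    /** Blocks until `bytes` may be transferred without exceeding the configured rate. */
    public synchronized void acquire(long bytes) throws InterruptedException {
        while (true) {
            long now = System.nanoTime();
            availableBytes = Math.min(maxBytesPerSec,
                    availableBytes + (now - lastRefillNanos) / 1e9 * maxBytesPerSec);
            lastRefillNanos = now;
            if (availableBytes >= bytes) {
                availableBytes -= bytes;   // e.g. a fetcher calls acquire(fetchedBytes) per response
                return;
            }
            long waitMs = (long) Math.ceil((bytes - availableBytes) / maxBytesPerSec * 1000);
            Thread.sleep(Math.max(1, waitMs));
        }
    }
}
{code}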



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3284) Consider removing beta label in security documentation

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3284:
---
Priority: Major  (was: Critical)

> Consider removing beta label in security documentation
> --
>
> Key: KAFKA-3284
> URL: https://issues.apache.org/jira/browse/KAFKA-3284
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.10.0.1
>
>
> We currently state that our security support is beta. It would be good to 
> remove that for 0.10.0.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3404) Issues running the kafka system test Sasl

2016-03-15 Thread Mickael Maison (JIRA)
Mickael Maison created KAFKA-3404:
-

 Summary: Issues running the kafka system test Sasl
 Key: KAFKA-3404
 URL: https://issues.apache.org/jira/browse/KAFKA-3404
 Project: Kafka
  Issue Type: Bug
  Components: system tests
Affects Versions: 0.9.0.1
 Environment: Intel x86_64
Ubuntu 14.04

Reporter: Mickael Maison


Hi,

I'm trying to run the test_console_consumer.py system test and it's failing 
while testing the SASL protocols. 

[INFO  - 2016-03-15 14:41:58,533 - runner - log - lineno:211]: 
SerialTestRunner: 
kafkatest.sanity_checks.test_console_consumer.ConsoleConsumerTest.test_lifecycle.security_protocol=SASL_SSL:
 Summary: Kafka server didn't finish startup
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/ducktape/tests/runner.py", line 
102, in run_all_tests
result.data = self.run_single_test()
  File "/usr/local/lib/python2.7/dist-packages/ducktape/tests/runner.py", line 
154, in run_single_test
return self.current_test_context.function(self.current_test)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/mark/_mark.py", line 
331, in wrapper
return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File 
"/home/mickael/ibm/messagehub/kafka/tests/kafkatest/sanity_checks/test_console_consumer.py",
 line 54, in test_lifecycle
self.kafka.start()
  File 
"/home/mickael/ibm/messagehub/kafka/tests/kafkatest/services/kafka/kafka.py", 
line 81, in start
Service.start(self)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/services/service.py", 
line 140, in start
self.start_node(node)
  File 
"/home/mickael/ibm/messagehub/kafka/tests/kafkatest/services/kafka/kafka.py", 
line 124, in start_node
monitor.wait_until("Kafka Server.*started", timeout_sec=30, err_msg="Kafka 
server didn't finish startup")
  File 
"/usr/local/lib/python2.7/dist-packages/ducktape/cluster/remoteaccount.py", 
line 303, in wait_until
return wait_until(lambda: self.acct.ssh("tail -c +%d %s | grep '%s'" % 
(self.offset+1, self.log, pattern), allow_fail=True) == 0, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/utils/util.py", line 
36, in wait_until
raise TimeoutError(err_msg)
TimeoutError: Kafka server didn't finish startup

Looking at the logs from the kafka worker, I can see that Kafka is not able to 
connect to the kerberos server:
[2016-03-15 14:41:28,751] FATAL [Kafka Server 1], Fatal error during 
KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: 
javax.security.auth.login.LoginException: Connection refused
at 
org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:74)
at 
org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:60)
at kafka.network.Processor.<init>(SocketServer.scala:379)
at 
kafka.network.SocketServer$$anonfun$startup$1$$anonfun$apply$1.apply$mcVI$sp(SocketServer.scala:96)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at 
kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:95)
at 
kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:91)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at 
scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
at kafka.network.SocketServer.startup(SocketServer.scala:91)
at kafka.server.KafkaServer.startup(KafkaServer.scala:179)
at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
at kafka.Kafka$.main(Kafka.scala:67)
at kafka.Kafka.main(Kafka.scala)

Looking at the kerberos worker, I can see it was started fine:
Standalone MiniKdc Running
---
  Realm   : EXAMPLE.COM
  Running at  : worker4:worker4
  krb5conf: /mnt/minikdc/krb5.conf

  created keytab  : /mnt/minikdc/keytab
  with principals : [client, kafka/worker2]

 Do <CTRL-C> or kill <pid> to stop it
---

Running netstat on the kerberos worker, I can see that it's listening on 47385:
vagrant@worker4:~$ netstat -ano
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address   Foreign Address State   
Timer
tcp0  0 0.0.0.0:111 0.0.0.0:*   LISTEN  
off (0.00/0/0)
tcp0  0 0.0.0.0:22  0.0.0.0:*   LISTEN  
off (0.00/0/0)
tcp0  0 0.0.0.0:44313   0.0.0.0:*   LISTEN  
off (0.00/0/0)
tcp0  0 10.0.2.15:2210.0.2.2:56153  ESTABLISHED 
keepalive (7165.86/0/0)
tcp0  0 127.0.0.1:47747 127.0.1.1:4

[GitHub] kafka pull request: Changed port of bootstrap server to default.

2016-03-15 Thread raphw
GitHub user raphw opened a pull request:

https://github.com/apache/kafka/pull/1074

Changed port of bootstrap server to default.

Kafka typically runs on port 9092. The example named a different port, 
which makes it difficult to run a bootstrap example without any further 
configuration.
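For reference, a minimal consumer configuration pointing at the conventional default port would look like the sketch below (the group id is arbitrary; the point is simply bootstrap.servers on 9092):

{code}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DefaultPortExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        // 9092 is the port a default broker configuration listens on
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // subscribe and poll as usual
        }
    }
}
{code}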

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/raphw/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1074.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1074


commit d3edbf4d6537307dd834b7dd24bcfd2c5edf6513
Author: Rafael Winterhalter 
Date:   2016-03-15T15:15:27Z

Changed port of bootstrap server to default

Kafka typically runs on port 9092. The example named a different port, 
which makes it difficult to run a bootstrap example.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (KAFKA-3250) release tarball is unnecessarily large due to duplicate libraries

2016-03-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KAFKA-3250:
--

Assignee: Grant Henke

> release tarball is unnecessarily large due to duplicate libraries
> -
>
> Key: KAFKA-3250
> URL: https://issues.apache.org/jira/browse/KAFKA-3250
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Gwen Shapira
>Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> Between 0.8.2.2 and 0.9.0, our release tarballs grew from 17M to 34M. We 
> thought it was just due to new libraries and dependencies. But:
> 1. If you untar Kafka into a directory and check the directory size (du -sh), 
> it is around 28M, smaller than the tarball. Recompressing gives you a 25M 
> tarball.
> 2. If you list the original tar contents and grep for "snappy", you see it 4 
> times in the tarball.
> Clearly we are creating a tarball with duplicates (and we didn't before).
> I think it's due to how we generate the tarball from core but pull other 
> projects into the libs/ directory with their dependencies (which overlap).
> We need to find out how to sort it out (possibly with excludes).
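A quick way to confirm the duplication on an unpacked release is to count repeated jar file names; the stand-alone sketch below (a diagnostic only, not part of any build fix) walks a directory and reports names that occur more than once:

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class DuplicateJarScan {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        Map<String, Integer> counts = new HashMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(p -> p.toString().endsWith(".jar"))
                 .forEach(p -> counts.merge(p.getFileName().toString(), 1, Integer::sum));
        }
        counts.forEach((name, count) -> {
            if (count > 1)
                System.out.println(name + " appears " + count + " times");
        });
    }
}
{code}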



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3250) release tarball is unnecessarily large due to duplicate libraries

2016-03-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3250:
---
Fix Version/s: 0.10.0.0

> release tarball is unnecessarily large due to duplicate libraries
> -
>
> Key: KAFKA-3250
> URL: https://issues.apache.org/jira/browse/KAFKA-3250
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Gwen Shapira
>Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> Between 0.8.2.2 and 0.9.0, our release tarballs grew from 17M to 34M. We 
> thought it was just due to new libraries and dependencies. But:
> 1. If you untar Kafka into a directory and check the directory size (du -sh), 
> it is around 28M, smaller than the tarball. Recompressing gives you a 25M 
> tarball.
> 2. If you list the original tar contents and grep for "snappy", you see it 4 
> times in the tarball.
> Clearly we are creating a tarball with duplicates (and we didn't before).
> I think it's due to how we generate the tarball from core but pull other 
> projects into the libs/ directory with their dependencies (which overlap).
> We need to find out how to sort it out (possibly with excludes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3250) release tarball is unnecessarily large due to duplicate libraries

2016-03-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3250:
---
Affects Version/s: 0.9.0.1

> release tarball is unnecessarily large due to duplicate libraries
> -
>
> Key: KAFKA-3250
> URL: https://issues.apache.org/jira/browse/KAFKA-3250
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Gwen Shapira
>Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> Between 0.8.2.2 and 0.9.0, our release tarballs grew from 17M to 34M. We 
> thought it was just due to new libraries and dependencies. But:
> 1. If you untar Kafka into a directory and check the directory size (du -sh), 
> it is around 28M, smaller than the tarball. Recompressing gives you a 25M 
> tarball.
> 2. If you list the original tar contents and grep for "snappy", you see it 4 
> times in the tarball.
> Clearly we are creating a tarball with duplicates (and we didn't before).
> I think it's due to how we generate the tarball from core but pull other 
> projects into the libs/ directory with their dependencies (which overlap).
> We need to find out how to sort it out (possibly with excludes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3250: release tarball is unnecessarily l...

2016-03-15 Thread granthenke
GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/1075

KAFKA-3250: release tarball is unnecessarily large due to duplicate l…

…ibraries

This ensures duplicates are not copied in the distribution without 
rewriting all of the tar'ing logic. A larger improvement could be made to the 
packaging code, but that should be tracked by another jira.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka libs-duplicates

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1075.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1075


commit 8cdbf18fb5b751e0fc922d405643c152daaef4d1
Author: Grant Henke 
Date:   2016-03-15T15:53:43Z

KAFKA-3250: release tarball is unnecessarily large due to duplicate 
libraries

This ensures duplicates are not copied in the distribution without 
rewriting all of the tar'ing logic. A larger improvement could be made to the 
packaging code, but that should be tracked by another jira.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3250) release tarball is unnecessarily large due to duplicate libraries

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195538#comment-15195538
 ] 

ASF GitHub Bot commented on KAFKA-3250:
---

GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/1075

KAFKA-3250: release tarball is unnecessarily large due to duplicate l…

…ibraries

This ensures duplicates are not copied in the distribution without 
rewriting all of the tar'ing logic. A larger improvement could be made to the 
packaging code, but that should be tracked by another jira.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka libs-duplicates

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1075.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1075


commit 8cdbf18fb5b751e0fc922d405643c152daaef4d1
Author: Grant Henke 
Date:   2016-03-15T15:53:43Z

KAFKA-3250: release tarball is unnecessarily large due to duplicate 
libraries

This ensures duplicates are not copied in the distribution without 
rewriting all of the tar'ing logic. A larger improvement could be made to the 
packaging code, but that should be tracked by another jira.




> release tarball is unnecessarily large due to duplicate libraries
> -
>
> Key: KAFKA-3250
> URL: https://issues.apache.org/jira/browse/KAFKA-3250
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Gwen Shapira
>Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> Between 0.8.2.2 and 0.9.0, our release tarballs grew from 17M to 34M. We 
> thought it was just due to new libraries and dependencies. But:
> 1. If you untar Kafka into a directory and check the directory size (du -sh), 
> it is around 28M, smaller than the tarball. Recompressing gives you a 25M 
> tarball.
> 2. If you list the original tar contents and grep for "snappy", you see it 4 
> times in the tarball.
> Clearly we are creating a tarball with duplicates (and we didn't before).
> I think it's due to how we generate the tarball from core but pull other 
> projects into the libs/ directory with their dependencies (which overlap).
> We need to find out how to sort it out (possibly with excludes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3250) release tarball is unnecessarily large due to duplicate libraries

2016-03-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3250:
---
Status: Patch Available  (was: Open)

> release tarball is unnecessarily large due to duplicate libraries
> -
>
> Key: KAFKA-3250
> URL: https://issues.apache.org/jira/browse/KAFKA-3250
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Gwen Shapira
>Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> Between 0.8.2.2 and 0.9.0, our release tarballs grew from 17M to 34M. We 
> thought it was just due to new libraries and dependencies. But:
> 1. If you untar Kafka into a directory and check the directory size (du -sh), 
> it is around 28M, smaller than the tarball. Recompressing gives you a 25M 
> tarball.
> 2. If you list the original tar contents and grep for "snappy", you see it 4 
> times in the tarball.
> Clearly we are creating a tarball with duplicates (and we didn't before).
> I think it's due to how we generate the tarball from core but pull other 
> projects into the libs/ directory with their dependencies (which overlap).
> We need to find out how to sort it out (possibly with excludes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3390) ReplicaManager may infinitely try-fail to shrink ISR set of deleted partition

2016-03-15 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195593#comment-15195593
 ] 

Mayuresh Gharat commented on KAFKA-3390:


Do you mean that even after the topic was completely removed, the replicaManager 
on the leader broker kept trying to shrink the ISR? Am I understanding it 
correctly?

-Mayuresh

> ReplicaManager may infinitely try-fail to shrink ISR set of deleted partition
> -
>
> Key: KAFKA-3390
> URL: https://issues.apache.org/jira/browse/KAFKA-3390
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Stevo Slavic
>Assignee: Mayuresh Gharat
>
> For a topic whose deletion has been requested, Kafka replica manager may end 
> up infinitely trying and failing to shrink ISR.
> Here is a fragment from server.log where this recurring and never-ending 
> condition has been noticed:
> {noformat}
> [2016-03-04 09:42:13,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:13,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:13,898] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> [2016-03-04 09:42:23,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:23,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:23,897] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> [2016-03-04 09:42:33,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:33,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:33,897] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> ...
> {noformat}
> Before topic deletion was requested, this was state in ZK of its sole 
> partition:
> {noformat}
> Zxid: 0x181045
> Cxid: 0xc92
> Client id:0x3532dd88fd2
> Time: Mon Feb 29 16:46:23 CET 2016
> Operation:setData
> Path: /brokers/topics/foo/partitions/0/state
> Data: 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1,3,2]}
> Version:  68
> {noformat}
> Topic (sole partition) had no data ever published to it. I guess at some 
> point after topic deletion has been requested, partition state first got 
> updated and this was updated state:
> {noformat}
> Zxid: 0x18b0be
> Cxid: 0x141e4
> Client id:0x3532dd88fd2
> Time: Fri Mar 04 9:41:52 CET 2016
> Operation:setData
> Path: /brokers/topics/foo/partitions/0/state
> Data: 
> {"controller_epoch":54,"leader":1,"version":1,"leader_epoch":35,"isr":[1,3]}
> Version:  69
> {noformat}
> For whatever reason replica manager (some cache it uses, I guess 
> ReplicaManager.allPartitions) never sees this update, nor does it see that 
> the partition state, partition, partitions node and finally topic node got 
> deleted:
> {noformat}
> Zxid: 0x18b0bf
> Cxid: 0x40fb
> Client id:0x3532dd88fd2000a
> Time: Fri Mar 04 9:41:52 CET 2016
> Operation:delete
> Path: /brokers/topics/foo/partitions/0/state
> ---
> Zxid: 0x18b0c0
> Cxid: 0x40fe
> Client id:0x3532dd88fd2000a
> Time: Fri Mar 04 9:41:52 CET 2016
> Operation:delete
> Path: /brokers/topics/foo/partitions/0
> ---
> Zxid: 0x18b0c1
> Cxid: 0x4100
> Client id:0x3532dd88fd2000a
> Time: Fri Mar 04 9:41:52 CET 2016

[jira] [Commented] (KAFKA-3390) ReplicaManager may infinitely try-fail to shrink ISR set of deleted partition

2016-03-15 Thread Stevo Slavic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195612#comment-15195612
 ] 

Stevo Slavic commented on KAFKA-3390:
-

Yes. The topic ZK node and subnodes would not have been deleted if the topic and 
partitions had not actually been deleted.
The replica manager cache didn't see these changes, so every time it was triggered 
to shrink the ISR it would get 
{{org.apache.zookeeper.KeeperException$NoNodeException}}.
For whatever reason it also never got unscheduled from doing this check 
and ISR shrink.

The only workaround was to restart the node. That flushed the cache, changed 
the controller, and things were peaceful again.
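To make the discussed behaviour concrete, here is an illustrative Java sketch with an entirely hypothetical MetadataStore interface standing in for the ZooKeeper calls (it is not kafka.utils.ZkUtils): a NoNode result could cancel the recurring ISR-shrink task instead of letting it retry forever.

{code}
import java.util.Optional;
import java.util.concurrent.ScheduledFuture;

public class IsrShrinkSketch {
    interface MetadataStore {   // hypothetical stand-in for the ZooKeeper client
        /** New version after a conditional update, or empty if it failed or the node is gone. */
        Optional<Integer> conditionalUpdate(String path, String data, int expectedVersion);
        boolean exists(String path);
    }

    private final MetadataStore store;
    private final ScheduledFuture<?> shrinkTask;   // the recurring "maybe shrink ISR" job
    private int cachedVersion;

    IsrShrinkSketch(MetadataStore store, ScheduledFuture<?> shrinkTask, int cachedVersion) {
        this.store = store;
        this.shrinkTask = shrinkTask;
        this.cachedVersion = cachedVersion;
    }

    void maybeShrinkIsr(String statePath, String newStateJson) {
        Optional<Integer> newVersion = store.conditionalUpdate(statePath, newStateJson, cachedVersion);
        if (newVersion.isPresent()) {
            cachedVersion = newVersion.get();      // normal case: ISR shrunk, cached version updated
        } else if (!store.exists(statePath)) {
            shrinkTask.cancel(false);              // partition state node deleted: stop retrying
        }
        // otherwise: a genuine version conflict; wait for the next metadata update
    }
}
{code}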

> ReplicaManager may infinitely try-fail to shrink ISR set of deleted partition
> -
>
> Key: KAFKA-3390
> URL: https://issues.apache.org/jira/browse/KAFKA-3390
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Stevo Slavic
>Assignee: Mayuresh Gharat
>
> For a topic whose deletion has been requested, Kafka replica manager may end 
> up infinitely trying and failing to shrink ISR.
> Here is a fragment from server.log where this recurring and never-ending 
> condition has been noticed:
> {noformat}
> [2016-03-04 09:42:13,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:13,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:13,898] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> [2016-03-04 09:42:23,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:23,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:23,897] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> [2016-03-04 09:42:33,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:33,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:33,897] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> ...
> {noformat}
> Before topic deletion was requested, this was state in ZK of its sole 
> partition:
> {noformat}
> Zxid: 0x181045
> Cxid: 0xc92
> Client id:0x3532dd88fd2
> Time: Mon Feb 29 16:46:23 CET 2016
> Operation:setData
> Path: /brokers/topics/foo/partitions/0/state
> Data: 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1,3,2]}
> Version:  68
> {noformat}
> Topic (sole partition) had no data ever published to it. I guess at some 
> point after topic deletion has been requested, partition state first got 
> updated and this was updated state:
> {noformat}
> Zxid: 0x18b0be
> Cxid: 0x141e4
> Client id:0x3532dd88fd2
> Time: Fri Mar 04 9:41:52 CET 2016
> Operation:setData
> Path: /brokers/topics/foo/partitions/0/state
> Data: 
> {"controller_epoch":54,"leader":1,"version":1,"leader_epoch":35,"isr":[1,3]}
> Version:  69
> {noformat}
> For whatever reason replica manager (some cache it uses, I guess 
> ReplicaManager.allPartitions) never sees this update, nor does it see that 
> the partition state, partition, partitions node and finally topic node got 
> deleted:
> {noformat}
> Zxid: 0x18b0bf
> Cxid: 0x40fb
> Client id:0x3532dd88fd2000a
> Time: Fri Mar 04 9:41:52 CET 2016
> Operation:delete
> Path: /brokers/topics/foo/partitions/0/state
> ---
> Zxid: 0x18b0c0
> Cxid:   

[jira] [Comment Edited] (KAFKA-3390) ReplicaManager may infinitely try-fail to shrink ISR set of deleted partition

2016-03-15 Thread Stevo Slavic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195612#comment-15195612
 ] 

Stevo Slavic edited comment on KAFKA-3390 at 3/15/16 4:44 PM:
--

Yes. The topic ZK node and subnodes would not have been deleted if the topic and 
partitions had not actually been deleted.
The replica manager cache didn't see these changes, so every time it was triggered 
to shrink the ISR it would get 
{{org.apache.zookeeper.KeeperException$NoNodeException}}.
For whatever reason it also never got unscheduled from doing this check 
and ISR shrink.

The only workaround was to restart the node. That "flushed" the cache, changed 
the controller, and things were peaceful again.


was (Author: sslavic):
Yes. The topic ZK node and subnodes would not have been deleted if the topic and 
partitions had not actually been deleted.
The replica manager cache didn't see these changes, so every time it was triggered 
to shrink the ISR it would get 
{{org.apache.zookeeper.KeeperException$NoNodeException}}.
For whatever reason it also never got unscheduled from doing this check 
and ISR shrink.

The only workaround was to restart the node. That flushed the cache, changed 
the controller, and things were peaceful again.

> ReplicaManager may infinitely try-fail to shrink ISR set of deleted partition
> -
>
> Key: KAFKA-3390
> URL: https://issues.apache.org/jira/browse/KAFKA-3390
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Stevo Slavic
>Assignee: Mayuresh Gharat
>
> For a topic whose deletion has been requested, Kafka replica manager may end 
> up infinitely trying and failing to shrink ISR.
> Here is a fragment from server.log where this recurring and never-ending 
> condition has been noticed:
> {noformat}
> [2016-03-04 09:42:13,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:13,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:13,898] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> [2016-03-04 09:42:23,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:23,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:23,897] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> [2016-03-04 09:42:33,894] INFO Partition [foo,0] on broker 1: Shrinking ISR 
> for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
> [2016-03-04 09:42:33,897] WARN Conditional update of path 
> /brokers/topics/foo/partitions/0/state with data 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]} 
> and expected version 68 failed due to 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
> [2016-03-04 09:42:33,897] INFO Partition [foo,0] on broker 1: Cached 
> zkVersion [68] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> ...
> {noformat}
> Before topic deletion was requested, this was state in ZK of its sole 
> partition:
> {noformat}
> Zxid: 0x181045
> Cxid: 0xc92
> Client id:0x3532dd88fd2
> Time: Mon Feb 29 16:46:23 CET 2016
> Operation:setData
> Path: /brokers/topics/foo/partitions/0/state
> Data: 
> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1,3,2]}
> Version:  68
> {noformat}
> Topic (sole partition) had no data ever published to it. I guess at some 
> point after topic deletion has been requested, partition state first got 
> updated and this was updated state:
> {noformat}
> Zxid: 0x18b0be
> Cxid: 0x141e4
> Client id:0x3532dd88fd2
> Time: Fri Mar 04 9:41:52 CET 2016
> Operation:setData
> Path: /brokers/topics/foo/partitions/0/state
> Data: 
> {"controller_epoch":54,"leader":1,"version":1,"l

[jira] [Updated] (KAFKA-2787) Refactor gradle build

2016-03-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-2787:
---
Fix Version/s: (was: 0.10.0.0)
   0.10.1.0

> Refactor gradle build
> -
>
> Key: KAFKA-2787
> URL: https://issues.apache.org/jira/browse/KAFKA-2787
> Project: Kafka
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.8.2.2
>Reporter: Grant Henke
>Assignee: Grant Henke
> Fix For: 0.10.1.0
>
>
> The build files are quite large with a lot of duplication and overlap. This 
> could lead to mistakes, reduce readability and functionality, and hinder 
> future changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3364) Centrallize doc generation

2016-03-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3364:
---
Summary: Centrallize doc generation  (was: Centrallize doc generation and 
move all generated docs under a single "generated" folder)

> Centrallize doc generation
> --
>
> Key: KAFKA-3364
> URL: https://issues.apache.org/jira/browse/KAFKA-3364
> Project: Kafka
>  Issue Type: Sub-task
>  Components: build
>Reporter: Grant Henke
> Fix For: 0.10.0.0
>
>
> Currently docs generation is scattered throughout the build file/process. 
> Centralizing doc generation into its own location/file would help make the 
> process more clear. 
> Also, every time a new generated file is added the .gitignore file needs to 
> be updated. Instead putting all generated docs under docs/generated/* would 
> allow one entry to be a catchall. This change should be made at the same 
> time. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3405) Deduplicate and break out release tasks

2016-03-15 Thread Grant Henke (JIRA)
Grant Henke created KAFKA-3405:
--

 Summary: Deduplicate and break out release tasks
 Key: KAFKA-3405
 URL: https://issues.apache.org/jira/browse/KAFKA-3405
 Project: Kafka
  Issue Type: Sub-task
Reporter: Grant Henke
Assignee: Grant Henke


Tasks like copyDependantLibs are repeated throughout the build. Other tasks 
like releaseTarGz should be moved out of the core module. 

While refactoring this code, other optimizations, like ensuring sources and 
javadoc jars are not included in the classpath, should be done as well.

If it makes sense, moving the release tasks to a separate gradle file is 
preferred.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3364) Centrallize doc generation

2016-03-15 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195643#comment-15195643
 ] 

Grant Henke commented on KAFKA-3364:


Moving docs under a single "generated" folder was done in KAFKA-3361

> Centrallize doc generation
> --
>
> Key: KAFKA-3364
> URL: https://issues.apache.org/jira/browse/KAFKA-3364
> Project: Kafka
>  Issue Type: Sub-task
>  Components: build
>Reporter: Grant Henke
> Fix For: 0.10.0.0
>
>
> Currently docs generation is scattered throughout the build file/process. 
> Centralizing doc generation into its own location/file would help make the 
> process more clear.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3364) Centrallize doc generation

2016-03-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3364:
---
Description: Currently docs generation is scattered throughout the build 
file/process. Centralizing doc generation into its own location/file would help 
make the process more clear.  (was: Currently docs generation is scattered 
throughout the build file/process. Centralizing doc generation into its own 
location/file would help make the process more clear. 

Also, every time a new generated file is added the .gitignore file needs to be 
updated. Instead putting all generated docs under docs/generated/* would allow 
one entry to be a catchall. This change should be made at the same time. )

> Centrallize doc generation
> --
>
> Key: KAFKA-3364
> URL: https://issues.apache.org/jira/browse/KAFKA-3364
> Project: Kafka
>  Issue Type: Sub-task
>  Components: build
>Reporter: Grant Henke
> Fix For: 0.10.0.0
>
>
> Currently docs generation is scattered throughout the build file/process. 
> Centralizing doc generation into its own location/file would help make the 
> process more clear.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1215) Rack-Aware replica assignment option

2016-03-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-1215.

Resolution: Fixed

Issue resolved by pull request 132
[https://github.com/apache/kafka/pull/132]

> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: Joris Van Remoortere
>Assignee: Allen Wang
>Priority: Critical
> Fix For: 0.10.0.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to kafka config. This rack-id can be used during replica 
> assignment by using the max-rack-replication argument in the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good, I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.
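As a toy illustration of the rack-aware idea only (the algorithm merged for this JIRA is more involved, e.g. it also balances leaders across brokers), replicas for a single partition can be picked by cycling through racks so that no rack holds two replicas unless there are more replicas than racks:

{code}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RackAwareAssignmentSketch {
    static List<Integer> assignReplicas(Map<Integer, String> brokerToRack, int replicationFactor) {
        // group brokers by rack, preserving a stable order
        Map<String, List<Integer>> byRack = new LinkedHashMap<>();
        brokerToRack.forEach((broker, rack) ->
                byRack.computeIfAbsent(rack, r -> new ArrayList<>()).add(broker));

        List<List<Integer>> racks = new ArrayList<>(byRack.values());
        List<Integer> replicas = new ArrayList<>();
        int round = 0;
        while (replicas.size() < replicationFactor) {
            for (List<Integer> brokersInRack : racks) {
                if (replicas.size() == replicationFactor)
                    break;
                if (round < brokersInRack.size())
                    replicas.add(brokersInRack.get(round));   // one broker per rack per round
            }
            round++;
            if (round > brokerToRack.size())                  // not enough brokers for the factor
                break;
        }
        return replicas;
    }
}
{code}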



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-1215: Rack-Aware replica assignment opti...

2016-03-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/132


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195674#comment-15195674
 ] 

ASF GitHub Bot commented on KAFKA-1215:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/132


> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: Joris Van Remoortere
>Assignee: Allen Wang
>Priority: Critical
> Fix For: 0.10.0.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to kafka config. This rack-id can be used during replica 
> assignment by using the max-rack-replication argument in the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good, I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3336) Unify ser/de pair classes into one serde class

2016-03-15 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-3336:

Status: Patch Available  (was: Open)

> Unify ser/de pair classes into one serde class
> --
>
> Key: KAFKA-3336
> URL: https://issues.apache.org/jira/browse/KAFKA-3336
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Right now users must provide two separate classes for serializers and 
> deserializers, respectively.  This means the current API functions have at 
> least 2 * numberOfTypes parameters.
> *Example (current, bad): "foo(..., longSerializer, longDeserializer)".*
> Because the serde aspect of the API is already one of the biggest UX issues, 
> we should unify the serde functionality into a single serde class, i.e. one 
> class that provides both serialization and deserialization functionality.  
> This will reduce the number of required serde parameters in the API by 50%.
> *Example (suggested, better): "foo(..., longSerializerDeserializer)"*. 
> * Note: This parameter name is horrible and is only used to highlight the 
> difference from the "current" example above.
> We also want to 1) add a pairing function for each operator that does not 
> require serialization and 2) add a default serde in the configs so that these 
> configs are not required.
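An illustrative sketch of what such a combined class could look like, built on the existing Serializer/Deserializer interfaces; this is not the final API that was added, and the names are made up:

{code}
import java.util.Map;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serializer;

// Illustrative only: one object carries both directions, so a call site needs
// a single parameter per type instead of a serializer/deserializer pair.
public interface SerdeSketch<T> {
    Serializer<T> serializer();
    Deserializer<T> deserializer();

    default void configure(Map<String, ?> configs, boolean isKey) {
        serializer().configure(configs, isKey);
        deserializer().configure(configs, isKey);
    }

    static <T> SerdeSketch<T> from(Serializer<T> ser, Deserializer<T> de) {
        return new SerdeSketch<T>() {
            @Override public Serializer<T> serializer() { return ser; }
            @Override public Deserializer<T> deserializer() { return de; }
        };
    }
}
{code}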



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3315) Add Connect API to expose connector configuration info

2016-03-15 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3315:
-
Status: Patch Available  (was: Open)

https://github.com/apache/kafka/pull/964

> Add Connect API to expose connector configuration info
> --
>
> Key: KAFKA-3315
> URL: https://issues.apache.org/jira/browse/KAFKA-3315
> Project: Kafka
>  Issue Type: Bug
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Liquan Pei
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Connectors should be able to provide information about how they can be 
> configured. It would be nice to expose this programmatically as part of the 
> standard interface for connectors. This can also include support for more 
> than just a static set of config options. For example, a validation REST API 
> could provide intermediate feedback based on a partial configuration and 
> include recommendations/suggestions for fields based on the settings 
> available so far.
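One way a connector might describe its settings is with the existing ConfigDef class, as sketched below; the describeConfig() method name and the config keys are made up for illustration and are not the interface this JIRA defines:

{code}
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;

public class ExampleConnectorConfig {
    // Hypothetical entry point a connector could expose to advertise its settings.
    public static ConfigDef describeConfig() {
        return new ConfigDef()
            .define("topic", Type.STRING, Importance.HIGH,
                    "Topic the connector writes to")
            .define("batch.size", Type.INT, 100, Importance.MEDIUM,
                    "Maximum number of records per write");
    }
}
{code}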



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3387: Update GetOffsetShell tool to not ...

2016-03-15 Thread SinghAsDev
GitHub user SinghAsDev opened a pull request:

https://github.com/apache/kafka/pull/1076

KAFKA-3387: Update GetOffsetShell tool to not rely on old producer



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SinghAsDev/kafka KAFKA-3387

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1076.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1076


commit f1ad5a2dee9b451573f68259eda5081f52525029
Author: Ashish Singh 
Date:   2016-03-14T23:50:19Z

KAFKA-3387: Update GetOffsetShell tool to not rely on old producer




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3387) Update GetOffsetShell tool to not rely on old producer.

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195752#comment-15195752
 ] 

ASF GitHub Bot commented on KAFKA-3387:
---

GitHub user SinghAsDev opened a pull request:

https://github.com/apache/kafka/pull/1076

KAFKA-3387: Update GetOffsetShell tool to not rely on old producer



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SinghAsDev/kafka KAFKA-3387

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1076.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1076


commit f1ad5a2dee9b451573f68259eda5081f52525029
Author: Ashish Singh 
Date:   2016-03-14T23:50:19Z

KAFKA-3387: Update GetOffsetShell tool to not rely on old producer




> Update GetOffsetShell tool to not rely on old producer.
> ---
>
> Key: KAFKA-3387
> URL: https://issues.apache.org/jira/browse/KAFKA-3387
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3387) Update GetOffsetShell tool to not rely on old producer.

2016-03-15 Thread Ashish K Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish K Singh updated KAFKA-3387:
--
Status: Patch Available  (was: Open)

> Update GetOffsetShell tool to not rely on old producer.
> ---
>
> Key: KAFKA-3387
> URL: https://issues.apache.org/jira/browse/KAFKA-3387
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3401) Message format change on the fly breaks 0.9 consumer

2016-03-15 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195757#comment-15195757
 ] 

Jiangjie Qin commented on KAFKA-3401:
-

[~ijuma] You are right. We should only change the message format version for 
the topic for 0.10.0.0 consumer. Sorry about the confusion.

> Message format change on the fly breaks 0.9 consumer
> 
>
> Key: KAFKA-3401
> URL: https://issues.apache.org/jira/browse/KAFKA-3401
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Eno Thereska
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
> Attachments: 2016-03-15--009.zip
>
>
> The new system test as part of KAFKA-3202 reveals a problem when the message 
> format is changed on the fly. When the cluster is using 0.10.x brokers and 
> producers and consumers use version 0.9.0.1 an error happens when the message 
> format is changed on the fly to version 0.9:
> {code}
> Exception: {'ConsoleConsumer-worker-1': Exception('Unexpected message format 
> (expected an integer). Message: null',)}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3401) Message format change on the fly breaks 0.9 consumer

2016-03-15 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195757#comment-15195757
 ] 

Jiangjie Qin edited comment on KAFKA-3401 at 3/15/16 5:39 PM:
--

[~ijuma] You are right. We should only change the message format version for 
the topic for 0.10.0.0 consumer. 

[~enothereska] Sorry about the confusion. I think this exception is expected.


was (Author: becket_qin):
[~ijuma] You are right. We should only change the message format version for 
the topic for 0.10.0.0 consumer. Sorry about the confusion.

> Message format change on the fly breaks 0.9 consumer
> 
>
> Key: KAFKA-3401
> URL: https://issues.apache.org/jira/browse/KAFKA-3401
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Eno Thereska
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
> Attachments: 2016-03-15--009.zip
>
>
> The new system test as part of KAFKA-3202 reveals a problem when the message 
> format is changed on the fly. When the cluster is using 0.10.x brokers and 
> producers and consumers use version 0.9.0.1 an error happens when the message 
> format is changed on the fly to version 0.9:
> {code}
> Exception: {'ConsoleConsumer-worker-1': Exception('Unexpected message format 
> (expected an integer). Message: null',)}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3401) Message format change on the fly breaks 0.9 consumer

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3401.

Resolution: Not A Problem

Thanks for clarifying Becket. [~enothereska], please update the test to take 
Becket's comment into account.

> Message format change on the fly breaks 0.9 consumer
> 
>
> Key: KAFKA-3401
> URL: https://issues.apache.org/jira/browse/KAFKA-3401
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Eno Thereska
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
> Attachments: 2016-03-15--009.zip
>
>
> The new system test as part of KAFKA-3202 reveals a problem when the message 
> format is changed on the fly. When the cluster is using 0.10.x brokers and 
> producers and consumers use version 0.9.0.1 an error happens when the message 
> format is changed on the fly to version 0.9:
> {code}
> Exception: {'ConsoleConsumer-worker-1': Exception('Unexpected message format 
> (expected an integer). Message: null',)}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3401) Message format change on the fly breaks 0.9 consumer

2016-03-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3401:
---
Fix Version/s: (was: 0.10.0.0)

> Message format change on the fly breaks 0.9 consumer
> 
>
> Key: KAFKA-3401
> URL: https://issues.apache.org/jira/browse/KAFKA-3401
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Eno Thereska
>Assignee: Jiangjie Qin
>Priority: Blocker
> Attachments: 2016-03-15--009.zip
>
>
> The new system test as part of KAFKA-3202 reveals a problem when the message 
> format is changed on the fly. When the cluster is using 0.10.x brokers and 
> producers and consumers use version 0.9.0.1 an error happens when the message 
> format is changed on the fly to version 0.9:
> {code}
> Exception: {'ConsoleConsumer-worker-1': Exception('Unexpected message format 
> (expected an integer). Message: null',)}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: kstream/ktable counting method with def...

2016-03-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1065


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk7 #1116

2016-03-15 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-1215; Rack-Aware replica assignment option

--
[...truncated 70 lines...]
  if (value.expireTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

^
:300:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (partitionData.timestamp == 
OffsetCommitRequest.DEFAULT_TIMESTAMP)
 ^
:307:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.uncleanLeaderElectionRate
^
:308:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.leaderElectionTimer
^
:394:
 constructor UpdateMetadataRequest in class UpdateMetadataRequest is 
deprecated: see corresponding Javadoc for more information.
new UpdateMetadataRequest(controllerId, controllerEpoch, 
liveBrokers.asJava, partitionStates.asJava)
^
:129:
 method readFromReadableChannel in class NetworkReceive is deprecated: see 
corresponding Javadoc for more information.
  response.readFromReadableChannel(channel)
   ^
:300:
 value timestamp in class PartitionData is deprecated: see corresponding 
Javadoc for more information.
  if (partitionData.timestamp == 
OffsetCommitRequest.DEFAULT_TIMESTAMP)
^
:303:
 value timestamp in class PartitionData is deprecated: see corresponding 
Javadoc for more information.
offsetRetention + partitionData.timestamp
^
11 warnings found
:kafka-trunk-jdk7:core:processResources UP-TO-DATE
:kafka-trunk-jdk7:core:classes
:kafka-trunk-jdk7:clients:compileTestJava
:kafka-trunk-jdk7:clients:processTestResources
:kafka-trunk-jdk7:clients:testClasses
:kafka-trunk-jdk7:core:copyDependantLibs
:kafka-trunk-jdk7:core:copyDependantTestLibs
:kafka-trunk-jdk7:core:jar
:jar_core_2_11
Building project 'core' with Scala version 2.11.8
:kafka-trunk-jdk7:clients:compileJava UP-TO-DATE
:kafka-trunk-jdk7:clients:processResources UP-TO-DATE
:kafka-trunk-jdk7:clients:classes UP-TO-DATE
:kafka-trunk-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk7:clients:createVersionFile
:kafka-trunk-jdk7:clients:jar UP-TO-DATE
:kafka-trunk-jdk7:core:compileJava UP-TO-DATE
:kafka-trunk-jdk7:core:compileScala
:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
:403:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresp

[GitHub] kafka pull request: MINOR: Add test that verifies fix for KAFKA-30...

2016-03-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1071


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3047) Explicit offset assignment in Log.append can corrupt the log

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196012#comment-15196012
 ] 

ASF GitHub Bot commented on KAFKA-3047:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1071


> Explicit offset assignment in Log.append can corrupt the log
> 
>
> Key: KAFKA-3047
> URL: https://issues.apache.org/jira/browse/KAFKA-3047
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.9.0.0
>Reporter: Maciek Makowski
>Assignee: Ismael Juma
> Fix For: 0.10.0.0
>
>
> {{Log.append()}} has {{assignOffsets}} parameter, which, when set to false, 
> should cause Kafka to use the offsets specified in the 
> {{ByteBufferMessageSet}} and not recalculate them based on 
> {{nextOffsetMetadata}}. However, in that function, {{appendInfo.firstOffset}} 
> is unconditionally set to {{nextOffsetMetadata.messageOffset}}. This can 
> cause corruption of the log in the following scenario:
> * {{nextOffsetMetadata.messageOffset}} is 2001
> * {{append(messageSet, assignOffsets = false)}} is called, where 
> {{messageSet}} contains offsets 1001...1500 
> * after {{val appendInfo = analyzeAndValidateMessageSet(messages)}} call, 
> {{appendInfo.firstOffset}} is 1001 and {{appendInfo.lastOffset}} is 1500
> * after {{appendInfo.firstOffset = nextOffsetMetadata.messageOffset}} call, 
> {{appendInfo.firstOffset}} is 2001 and {{appendInfo.lastOffset}} is 1500
> * consistency check {{if(!appendInfo.offsetsMonotonic || 
> appendInfo.firstOffset < nextOffsetMetadata.messageOffset)}} succeeds (the 
> second condition can never fail due to unconditional assignment) and writing 
> proceeds
> * the message set is appended to current log segment starting at offset 2001, 
> but the offsets in the set are 1001...1500
> * the system shuts down abruptly
> * on restart, the following unrecoverable error is reported: 
> {code}
> Exception in thread "main" kafka.common.InvalidOffsetException: Attempt to 
> append an offset (1001) to position 12345 no larger than the last offset 
> appended (1950) to xyz/.index.
>   at 
> kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:207)
>   at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
>   at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
>   at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
>   at kafka.log.LogSegment.recover(LogSegment.scala:188)
>   at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
>   at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
>   at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:778)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>   at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:777)
>   at kafka.log.Log.loadSegments(Log.scala:160)
>   at kafka.log.Log.<init>(Log.scala:90)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
>   at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> {code} 
> *Proposed fix:* the assignment {{appendInfo.firstOffset = 
> nextOffsetMetadata.messageOffset}} should only happen in {{if 
> (assignOffsets)}} branch of code.
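To make the failure mode concrete, here is a small self-contained Scala sketch (a toy model with illustrative names, not the actual Log.append code) showing why the guard can never reject the append once firstOffset has been overwritten unconditionally:

{code}
object OffsetGuardDemo extends App {
  // Toy stand-ins for the fields discussed in the report above.
  case class AppendInfo(var firstOffset: Long, lastOffset: Long, offsetsMonotonic: Boolean)

  val nextOffsetInLog = 2001L   // plays the role of nextOffsetMetadata.messageOffset
  val info = AppendInfo(firstOffset = 1001L, lastOffset = 1500L, offsetsMonotonic = true)

  // Unconditional assignment, as in the buggy code path:
  info.firstOffset = nextOffsetInLog

  // The "firstOffset < nextOffset" half of the check can now never be true,
  // so the append proceeds even though the messages carry offsets 1001..1500.
  val rejected = !info.offsetsMonotonic || info.firstOffset < nextOffsetInLog
  println(s"append rejected? $rejected (offsets 1001..1500 would land after offset 2000)")
}
{code}

With the proposed fix the assignment only happens when assignOffsets is true, so in this scenario firstOffset stays at 1001 and the consistency check rejects the append instead of corrupting the segment.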



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-43: Kafka SASL enhancements

2016-03-15 Thread Rajini Sivaram
Following on from the discussions in the KIP meeting today, the suggestion
is to implement a cut-down version of KIP-43 for 0.10.0.0 with a follow-on
KIP after the release to address support for custom mechanisms.

Changes to be removed from KIP-43:

   1. Remove the configuration for CallbackHandler. The callback handler
   implementation in Kafka will support Kerberos, PLAIN and Digest-MD5. It
   will not support custom or more complex mechanisms which require additional
   callbacks.
   2. Remove the configuration for Login. The Login implementation in Kafka
   will support Kerberos and any other mechanism (PLAIN, Digest-MD5 etc) that
   doesn't require functionality like token refresh.

Changes included in KIP-43:

   1. Configurable mechanism
   2. Support for multiple mechanisms in the broker
   3. Implementation of SASL/PLAIN

If there are no objections to this, I can update the KIP and the PR by
tomorrow, and move the support for custom mechanisms into another KIP and
PR for review after the release of 0.10.0.0.
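For illustration only, a rough Scala sketch of what a PLAIN client configuration might look like under the cut-down KIP; the property names here are assumptions based on the proposal and could still change before release:

{code}
import java.util.Properties

// Hedged sketch: "sasl.mechanism" / "sasl.enabled.mechanisms" are names as
// proposed in KIP-43, not settled Kafka configuration at this point.
object SaslPlainClientConfigSketch extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "broker1:9093")
  props.put("security.protocol", "SASL_PLAINTEXT") // existing 0.9 setting
  props.put("sasl.mechanism", "PLAIN")             // (1) configurable mechanism
  // (2) the broker side would list every mechanism it accepts, e.g.
  //     sasl.enabled.mechanisms=GSSAPI,PLAIN in server.properties
  println(props)
}
{code}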


On Mon, Mar 14, 2016 at 7:48 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

> Harsha,
>
> You are right, we don't expect to override callback handler or login for
> Digest-MD5.
>
> Pluggable CallbackHandler and Login modules enable custom SASL mechanisms
> to be implemented without modifying Kafka. For instance, it would enable
> KIP-44 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-44+-+Allow+Kafka+to+have+a+customized+security+protocol)
> to be implemented without making the whole security protocol pluggable. Tao
> Xiao has already confirmed earlier in this discussion thread that the
> proposed callback handler and login interfaces are suitable for their
> custom authentication.
>
>
>
> On Sun, Mar 13, 2016 at 6:59 PM, Harsha  wrote:
>
>> Agree with Gwen here. I feel like these additional pluggable Login
>> Modules are making this KIP complex. Since the main goal of the KIP is
>> to enable additional mechanism , can we limit the scope to that and If
>> we feel necessary for pluggable Login and callback handler classes we
>> can address in another JIRA.
>>
>> Adding Digest-MD5 and password callbacks can be done in the existing
>> callback handler without exposing it as a pluggable class. It would be
>> useful to have the broker support multiple mechanisms. I haven't seen
>> anyone using more than this in Hadoop. It might be different for Kafka,
>> but I personally haven't seen anyone asking for this yet.
>>
>> Thanks,
>> Harsha
>>
>>
>> On Thu, Mar 10, 2016, at 01:44 AM, Rajini Sivaram wrote:
>> > Gwen,
>> >
>> > Just to be clear, the alternative would be:
>> >
>> > *jaas.conf:*
>> >
>> > GssapiKafkaServer {
>> >
>> > com.ibm.security.auth.module.Krb5LoginModule required
>> > credsType=both
>> > useKeytab="file:/kafka/key.tab"
>> > principal="kafka/localh...@example.com ";
>> >
>> > };
>> >
>> > SmartcardKafkaServer {
>> >
>> >   example.SmartcardLoginModule required
>> >
>> >   cardNumber=123;
>> >
>> > };
>> >
>> >
>> > *KafkaConfig*
>> >
>> >
>> >
>> >- login.context.map={"GSSAPI"="GssapiKafkaServer",
>> >   "SMARTCARD"="SmartcardKafkaServer"}
>> >   - login.class.map={"GSSAPI"=GssapiLogin.class,
>> >   "SMARTCARD"=SmartcardLogin.class}
>> >   - callback.handler.class.map={"GSSAPI"=GssapiCallbackHandler.class,
>> >   "SMARTCARD"=SmartcardCallbackHandler.class}
>> >
>> > *Client Config *
>> > Same as the server, but with only one entry allowed in each map and
>> > jaas.conf
>> >
>> >
>> >
>> > This is a different model from the Java standard for supporting multiple
>> > logins. As a developer, I am inclined to stick with approaches that are
>> > widely in use like JSSE. But this alternative can be made to work if the
>> > Kafka community feels it is more appropriate for Kafka. If you know of
>> > other systems which use this approach, that would be helpful.
>> >
>> >
>> >
>> > On Thu, Mar 10, 2016 at 2:07 AM, Gwen Shapira 
>> wrote:
>> >
>> > > What I'm hearing is that:
>> > >
>> > > 1. In order to support authentication mechanisms that were not written
>> > > specifically with Kafka in mind, someone will need to write the
>> > > integration between the mechanism and Kafka. This may include Login
>> > > and CallbackHandler classes. This can be the mechanism vendor, the
>> > > user or a 3rd party vendor.
>> > > 2. If someone wrote the code to support a mechanism in Kafka, and a
>> > > user will want to use more than one mechanism, they will still need to
>> > > write a wrapper.
>> > > 3. In reality, #2 will not be necessary ("edge-case") because Kafka
>> > > will actually already provide the callback needed (and presumably also
>> > > the code to load the LoginModule provided by Example.com)?
>> > >
>> > > Tradeoff #1 sounds reasonable.
>> > > #2 and #3 do not sound reasonable considering one of the goals of the
>> > > patch is to support multiple mechanisms. I don't think we should force
>> > > our users to write code just to avoid w

[GitHub] kafka pull request: MINOR: Remove unused import in `WordCountJob` ...

2016-03-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1077

MINOR: Remove unused import in `WordCountJob` to fix checkstyle failure



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka fix-streams-checkstyle-failure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1077.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1077


commit 2b372c6b4f3ea849a5547159670a29e621dda7a3
Author: Ismael Juma 
Date:   2016-03-15T19:54:14Z

MINOR: Remove unused import in `WordCountJob` to fix checkstyle failure




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: KAFKA-2982; Mark the old Scala producer and re...

2016-03-15 Thread ijuma
Github user ijuma closed the pull request at:

https://github.com/apache/kafka/pull/1067


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2982) Mark the old Scala producer and related classes as deprecated

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196119#comment-15196119
 ] 

ASF GitHub Bot commented on KAFKA-2982:
---

Github user ijuma closed the pull request at:

https://github.com/apache/kafka/pull/1067


> Mark the old Scala producer and related classes as deprecated
> -
>
> Key: KAFKA-2982
> URL: https://issues.apache.org/jira/browse/KAFKA-2982
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 0.9.0.0
>Reporter: Grant Henke
>Assignee: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> Now that the new producer and consumer are released the old Scala producer 
> and consumer clients should be deprecated to encourage use of the new clients 
> and facilitate the removal of the old clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Remove unused import in `WordCountJob` ...

2016-03-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1077


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Kafka KIP meeting Mar 15 at 11:00am PST

2016-03-15 Thread Jun Rao
The following are the notes from today's KIP discussion.


   - KIP-33 - Add a time based log index to Kafka: We decided NOT to
   include this in 0.10.0 since the changes may have performance risks.
   - KIP-45 - Standardize all client sequence interaction on
   j.u.Collection: There is no consensus in the discussion. We will just put
   it to vote.
   - KIP-35 - Retrieving protocol version: This got the longest
   discussion. There is still no consensus. Magnus thinks the current proposal
   of maintaining a global protocol version won't work and will try to submit
   a new proposal.
   - KIP-43 - Kafka SASL enhancements: Rajini will modify the KIP to only
   support native SASL mechanisms and leave the changes to Login and
   CallbackHandler to KIP-44 instead.


The video will be uploaded soon to
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals.

Thanks,

Jun

On Mon, Mar 14, 2016 at 1:37 PM, Jun Rao  wrote:

> Hi, Everyone,
>
> We will have a Kafka KIP meeting tomorrow at 11:00am PST. If you plan to
> attend but haven't received an invite, please let me know. The following is
> the agenda.
>
> Agenda:
>
> KIP-35 - Retrieving protocol version
> KIP-45 - Standardize all client sequence interaction on j.u.Collection
> KIP-33 - Add a time based log index to Kafka (discuss if should be in
> 0.10.0)
> KIP-43 - Kafka SASL enhancements (discuss if should be in 0.10.0)
>
> Thanks,
>
> Jun
>
>


Build failed in Jenkins: kafka-trunk-jdk8 #445

2016-03-15 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] MINOR: kstream/ktable counting method with default long serdes

[wangguoz] MINOR: Add test that verifies fix for KAFKA-3047

[me] MINOR: Remove unused import in `WordCountJob` to fix checkstyle failure

--
Started by an SCM change
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu-us1 (Ubuntu ubuntu ubuntu-us golang-ppa) in 
workspace 
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision fd6efbe0b7f18beb506f112350b30b7209a2f5cc 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f fd6efbe0b7f18beb506f112350b30b7209a2f5cc
 > git rev-list 951e30adc6d4a0ed37dcc3fde0050ca5faff146d # timeout=10
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson1837881408554585785.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 14.089 secs
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson5877046584320792721.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.11/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:clean UP-TO-DATE
:clients:clean UP-TO-DATE
:connect:clean UP-TO-DATE
:core:clean UP-TO-DATE
:examples:clean UP-TO-DATE
:log4j-appender:clean UP-TO-DATE
:streams:clean UP-TO-DATE
:tools:clean UP-TO-DATE
:connect:api:clean UP-TO-DATE
:connect:file:clean UP-TO-DATE
:connect:json:clean UP-TO-DATE
:connect:runtime:clean UP-TO-DATE
:streams:examples:clean UP-TO-DATE
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJava
:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Failed to capture snapshot of input files for task 'compileJava' during 
up-to-date check.
> Could not add entry 
> '/home/jenkins/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-api/1.7.18/b631d286463ced7cc42ee2171fe3beaed2836823/slf4j-api-1.7.18.jar'
>  to cache fileHashes.bin 
> (

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 11.882 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
were found. Configuration error?
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2


[VOTE] KIP-45: Standardize KafkaConsumer API to use Collection

2016-03-15 Thread Jason Gustafson
I'd like to open the vote for KIP-45. We've discussed several alternatives
on the mailing list and in the KIP call, but this vote is only on the
documented KIP:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61337336. This
change will not be compatible with 0.9, but it will provide a cleaner API
long term for users to work with. This is really the last chance to make an
incompatible change like this with 0.10 shortly on the way, but compatible
options (such as method overloading) could be brought up again in the
future if we find it's needed.
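As a rough illustration of what standardizing on Collection means for user code (a sketch against the proposed signatures, shown from Scala since the consumer API is callable from either language):

{code}
import java.util.{Arrays, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

object Kip45Sketch extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[String, String](props)
  val tp = new TopicPartition("my-topic", 0)
  consumer.assign(Arrays.asList(tp))  // any java.util.Collection, not a List or varargs
  consumer.pause(Arrays.asList(tp))   // likewise for pause/resume/seekToBeginning
  consumer.close()
}
{code}

Callers that already pass a List keep compiling, since List is a Collection; varargs call sites such as pause are the ones that need updating, hence the incompatibility noted above.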

Thanks,
Jason


[GitHub] kafka pull request: KAFKA-3260 - Added SourceTask.commitRecord

2016-03-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/950


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3260) Increase the granularity of commit for SourceTask

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196306#comment-15196306
 ] 

ASF GitHub Bot commented on KAFKA-3260:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/950


> Increase the granularity of commit for SourceTask
> -
>
> Key: KAFKA-3260
> URL: https://issues.apache.org/jira/browse/KAFKA-3260
> Project: Kafka
>  Issue Type: Improvement
>  Components: copycat
>Affects Versions: 0.9.0.1
>Reporter: Jeremy Custenborder
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.10.0.0
>
>
> As of right now when commit is called the developer does not know which 
> messages have been accepted since the last poll. I'm proposing that we extend 
> the SourceTask class to allow records to be committed individually.
> {code}
> public void commitRecord(SourceRecord record) throws InterruptedException 
> {
> // This space intentionally left blank.
> }
> {code}
> This method could be overridden to receive a SourceRecord during the callback 
> of producer.send. This will give us messages that have been successfully 
> written to Kafka. The developer then has the capability to commit messages to 
> the source individually or in batch.   
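A minimal Scala sketch of a task using the proposed hook (the class and its behaviour here are made up; only the SourceTask/SourceRecord API is from Connect):

{code}
import java.util
import org.apache.kafka.connect.source.{SourceRecord, SourceTask}

// Illustrative task that reacts to per-record acknowledgements.
class AckTrackingSourceTask extends SourceTask {
  override def version(): String = "0.1"
  override def start(props: util.Map[String, String]): Unit = ()
  override def poll(): util.List[SourceRecord] = util.Collections.emptyList[SourceRecord]()
  override def stop(): Unit = ()

  // Invoked by the framework after the producer has acknowledged `record`,
  // so the task can advance a per-record cursor in the source system.
  override def commitRecord(record: SourceRecord): Unit = {
    // e.g. mark record.sourceOffset() as durable in the upstream system
  }
}
{code}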



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk7 #1117

2016-03-15 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] MINOR: kstream/ktable counting method with default long serdes

[wangguoz] MINOR: Add test that verifies fix for KAFKA-3047

--
[...truncated 5741 lines...]

org.apache.kafka.connect.runtime.WorkerSinkTaskThreadedTest > 
testCommitTaskSuccessAndFlushFailure PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskThreadedTest > 
testCommitConsumerFailure PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskThreadedTest > testCommitTimeout 
PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskThreadedTest > 
testAssignmentPauseResume PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskThreadedTest > testRewind PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskThreadedTest > 
testPollsInBackground PASSED

org.apache.kafka.connect.runtime.AbstractHerderTest > connectorStatus PASSED

org.apache.kafka.connect.runtime.AbstractHerderTest > taskStatus PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > 
testErrorInRebalancePartitionRevocation PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > testPollRedelivery PASSED

org.apache.kafka.connect.runtime.WorkerSinkTaskTest > 
testErrorInRebalancePartitionAssignment PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testCreateSinkConnector PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testDestroyConnector PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testCreateAndStop PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testAccessors PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testCreateSourceConnector PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testCreateConnectorAlreadyExists PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testPutConnectorConfig PASSED

org.apache.kafka.connect.runtime.standalone.StandaloneHerderTest > 
testPutTaskConfigs PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testNormalJoinGroupFollower PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testMetadata PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testLeaderPerformAssignment1 PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testLeaderPerformAssignment2 PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testJoinLeaderCannotAssign PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testRejoinGroup PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testNormalJoinGroupLeader PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testDestroyConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testAccessors PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorAlreadyExists PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testPutConnectorConfig PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigAdded PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorConfigUpdate PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testTaskConfigAdded PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testJoinLeaderCatchUpFails PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testInconsistentConfigs PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testJoinAssignment PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRebalance PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testHaltCleansUpWorker PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testCommit PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testPollsInBackground 
PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testFailureInPoll PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testCommitFailure PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > 
testSendRecordsConvertsData PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testSendRecordsRetries 
PASSED

org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testSlowTaskStart PASSED

org.apache.kafka.connect.runtime.WorkerTaskTest > stopBeforeStarting PASSED

org.apache.kafka.connect.runtime.WorkerTaskTest > standardStartup PASSED

org.apache.kafka.connect.runtime.WorkerTaskTest > cancelBeforeStopping PASSED

org.apache.kafka.connect.runtime.WorkerTest > testStartAndStopConnector PASSED

org.apac

[jira] [Resolved] (KAFKA-3260) Increase the granularity of commit for SourceTask

2016-03-15 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-3260.
--
   Resolution: Fixed
Fix Version/s: 0.10.0.0

Issue resolved by pull request 950
[https://github.com/apache/kafka/pull/950]

> Increase the granularity of commit for SourceTask
> -
>
> Key: KAFKA-3260
> URL: https://issues.apache.org/jira/browse/KAFKA-3260
> Project: Kafka
>  Issue Type: Improvement
>  Components: copycat
>Affects Versions: 0.9.0.1
>Reporter: Jeremy Custenborder
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.10.0.0
>
>
> As of right now when commit is called the developer does not know which 
> messages have been accepted since the last poll. I'm proposing that we extend 
> the SourceTask class to allow records to be committed individually.
> {code}
> public void commitRecord(SourceRecord record) throws InterruptedException 
> {
> // This space intentionally left blank.
> }
> {code}
> This method could be overridden to receive a SourceRecord during the callback 
> of producer.send. This will give us messages that have been successfully 
> written to Kafka. The developer then has the capability to commit messages to 
> the source individually or in batch.   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-15 Thread Ashish Singh
Magnus and I had a brief discussion following the KIP call. KIP-35

wiki has been updated accordingly. Please review the KIP and vote on the
corresponding vote thread.

On Mon, Mar 14, 2016 at 11:27 PM, Ashish Singh  wrote:

> I think there is a bit of misunderstanding going on here regarding
> protocol documentation and its versioning. It could be that I am the one
> who misunderstood it, please correct me if so.
>
> Taking Gwen's example.
>
> 1. 0.10.0 (protocol v4)  is released with current KIP-35
> 2. On trunk, modify produce requests and bump to v5
> 3. On trunk, we modify metadata requests and bump to v6
> 4. Now we decide that the metadata change fixes a super critical issue and
> want to backport the change. What's the protocol version of the next
> release of 0.10.0 - which supports v6 protocol only partially?
>
> As per my understanding, this will be v7. When we say a broker is on
> ApiVersion 7, we do not necessarily mean that it also supports ApiVersion
> up to v7. A broker on ApiVersion v7 should probably mean, please refer v7
> of protocol documentation to find out supported protocol versions of this
> broker.
>
> I just added an example on the KIP wiki to elaborate more on protocol
> documentation versioning. Below is the excerpt.
>
> For instance say we have two brokers, BrokerA has ApiVersion 4 and BrokerB
> has ApiVersion 5. This means we should have protocol documentations for
> ApiVersions 4 and 5. Say we have the following as protocol documentation
> for these two versions.
>
> Sample Protocol Documentation V4
> Version: 4 // Comes from ApiVersion
> REQ_A_0: ...
> REQ_A_1: ...
> RESP_A_0: ...
> RESP_A_1: ...
>
> Sample Protocol Documentation V5
> Version: 5 // Comes from ApiVersion
> REQ_A_1: ...
> REQ_A_2: ...
> RESP_A_1: ...
> RESP_A_2: ...
>
> All a client needs to know to be able to successfully communicate with a
> broker is what is the supported ApiVersion of the broker. Say via some
> mechanism, discussed below, client gets to know that BrokerA has ApiVersion
> 4 and BrokerB has ApiVersion 5. With that information, and the available
> protocol documentations for those ApiVersions, the client can deduce which
> protocol versions the broker supports. In this case the client will deduce
> that it can use v0 and v1 of REQ_A and RESP_A while talking to BrokerA,
> while it can use v1 and v2 of REQ_A and RESP_A while talking to BrokerB.
>
> On Mon, Mar 14, 2016 at 10:50 PM, Ewen Cheslack-Postava  > wrote:
>
>> Yeah, Gwen's example is a good one. And it doesn't even have to be thought
>> of in terms of the implementation -- you can think of the protocol itself
>> as effectively being possible to branch and have changes cherry-picked.
>> Given the way some changes interact and that only some may be feasible to
>> backport, this may be important.
>>
>> Similarly, it's difficult to make that definition . In practice, we
>> sometimes branch and effectively merge the protocol -- i.e. we develop 2
>> KIPs with independent changes at the same time. If you force a linear
>> model, you also *force* the ordering of implementation, which will be a
>> pretty serious constraint in a lot of cases. Two protocol-changing KIPs
>> may
>> occur near in time, but one may be a much larger change.
>>
>> Finally, it might be worth noting that from a client developer's
>> perspective, the linear order may not be all that intuitive when we pile
>> on
>> a bunch of protocol changes in one release. They probably don't actually
>> care about that global protocol version. They'll care more about the types
>> of things Dana was talking about previously: LZ4 support (which isn't even
>> a protocol change, but an important feature clients might need to know
>> about!), Kafka-backed offset storage (requires 2 protocol changes), etc.
>> While we want to encourage supporting all features, we should be realistic
>> about how client developers tackle feature development and limited
>> bandwidth. They are probably more feature driven than version driven.
>>
>> This is what Gwen was saying I had mentioned. The idea of features is
>> actually separate from what has been described so far and *does* require a
>> mapping to protocol versions, but also allows you to capture more than
>> that
>> and at more flexible granularity (single request type protocol version
>> bump
>> or the whole set of requests could change). The idea isn't quite the same
>> as browser feature detection, but that's my frame of reference for it (see
>> https://developer.mozilla.org/en-US/docs/Browser_Feature_Detection), the
>> process of trying to sort out supported features and protocols based on
>> browser version IDs (sort of equivalent to broker implementation versions
>> here) is a huge mess. Going entirely the other route (say, only enabling a
>> feature in CSS3 if *all* CSS3 features are implemented) is really
>> restrictive.
>>
>> I don't have a concrete proposal ri

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-15 Thread Jay Kreps
Hey Ashish,

Can you expand in the proposal on how this would be used by clients?
This proposal only has one slot for api versions, though in fact there
is potentially a different version on each broker. I think the
proposal is that every single time the client establishes a connection
it would then need to issue a metadata request on that connection to
check supported versions. Is that correct?

The point of merging version information with metadata request was
that the client wouldn't have to manage this additional state for each
connection, but rather the broker would gather the information and
give a summary of all brokers in the cluster. (Managing the state
doesn't seem complex but actually since the full state machine for a
request is something like begin connecting=>connection complete=>begin
sending request=>do work sending=>await response=>do work reading
response adding to the state machine around this is not as simple as
it seems...you can see the code in the java client around this).

It sounds like in this proposal you are proposing merging with the
metadata request but not summarizing across the cluster? Can you
explain the thinking vs a separate request?

It would really be good if the KIP can summarize the whole interaction
and how clients will work.

-Jay

On Tue, Mar 15, 2016 at 3:24 PM, Ashish Singh  wrote:
> Magnus and I had a brief discussion following the KIP call. KIP-35
> 
> wiki has been updated accordingly. Please review the KIP and vote on the
> corresponding vote thread.
>
> On Mon, Mar 14, 2016 at 11:27 PM, Ashish Singh  wrote:
>
>> I think there is a bit of misunderstanding going on here regarding
>> protocol documentation and its versioning. It could be that I am the one
>> who misunderstood it, please correct me if so.
>>
>> Taking Gwen's example.
>>
>> 1. 0.10.0 (protocol v4)  is released with current KIP-35
>> 2. On trunk, modify produce requests and bump to v5
>> 3. On trunk, we modify metadata requests and bump to v6
>> 4. Now we decide that the metadata change fixes a super critical issue and
>> want to backport the change. What's the protocol version of the next
>> release of 0.10.0 - which supports v6 protocol only partially?
>>
>> As per my understanding, this will be v7. When we say a broker is on
>> ApiVersion 7, we do not necessarily mean that it also supports ApiVersion
>> up to v7. A broker on ApiVersion v7 should probably mean, please refer v7
>> of protocol documentation to find out supported protocol versions of this
>> broker.
>>
>> I just added an example on the KIP wiki to elaborate more on protocol
>> documentation versioning. Below is the excerpt.
>>
>> For instance say we have two brokers, BrokerA has ApiVersion 4 and BrokerB
>> has ApiVersion 5. This means we should have protocol documentations for
>> ApiVersions 4 and 5. Say we have the following as protocol documentation
>> for these two versions.
>>
>> Sample Protocol Documentation V4
>> Version: 4 // Comes from ApiVersion
>> REQ_A_0: ...
>> REQ_A_1: ...
>> RESP_A_0: ...
>> RESP_A_1: ...
>>
>> Sample Protocol Documentation V5
>> Version: 5 // Comes from ApiVersion
>> REQ_A_1: ...
>> REQ_A_2: ...
>> RESP_A_1: ...
>> RESP_A_2: ...
>>
>> All a client needs to know to be able to successfully communicate with a
>> broker is what is the supported ApiVersion of the broker. Say via some
>> mechanism, discussed below, client gets to know that BrokerA has ApiVersion
>> 4 and BrokerB has ApiVersion 5. With that information, and the available
>> protocol documentations for those ApiVersions client can deduce what
>> protocol versions does the broker supports. In this case client will deduce
>> that it can use v0 and v1 of REQ_A and RESP_A while talking to BrokerA,
>> while it can use v1 and v2 of REQ_A and RESP_A while talking to BrokerB.
>>
>> On Mon, Mar 14, 2016 at 10:50 PM, Ewen Cheslack-Postava > > wrote:
>>
>>> Yeah, Gwen's example is a good one. And it doesn't even have to be thought
>>> of in terms of the implementation -- you can think of the protocol itself
>>> as effectively being possible to branch and have changes cherry-picked.
>>> Given the way some changes interact and that only some may be feasible to
>>> backport, this may be important.
>>>
>>> Similarly, it's difficult to make that definition . In practice, we
>>> sometimes branch and effectively merge the protocol -- i.e. we develop 2
>>> KIPs with independent changes at the same time. If you force a linear
>>> model, you also *force* the ordering of implementation, which will be a
>>> pretty serious constraint in a lot of cases. Two protocol-changing KIPs
>>> may
>>> occur near in time, but one may be a much larger change.
>>>
>>> Finally, it might be worth noting that from a client developer's
>>> perspective, the linear order may not be all that intuitive when we pile
>>> on
>>> a bunch of protocol changes in one release. They probably don't actually
>

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-15 Thread Ashish Singh
Hello Jay,

On Tue, Mar 15, 2016 at 3:35 PM, Jay Kreps  wrote:

> Hey Ashish,
>
> Can you expand in the proposal on how this would be used by clients?
>

Will add some info along this line.

This proposal only has one slot for api versions, though in fact there
> is potentially a different version on each broker. I think the
> proposal is that every single time the client establishes a connection
> it would then need to issue a metadata request on that connection to
> check supported versions. Is that correct?
>
> The point of merging version information with metadata request was
> that the client wouldn't have to manage this additional state for each
> connection, but rather the broker would gather the information and
> give a summary of all brokers in the cluster. (Managing the state
> doesn't seem complex but actually since the full state machine for a
> request is something like begin connecting=>connection complete=>begin
> sending request=>do work sending=>await response=>do work reading
> response adding to the state machine around this is not as simple as
> it seems...you can see the code in the java client around this).
>
> It sounds like in this proposal you are proposing merging with the
> metadata request but not summarizing across the cluster? Can you
> explain the thinking vs a separate request?
>

You are right that this proposal is suggesting that the client will have to ask
each broker it is talking to for its supported protocol versions every time a
new connection is established.

The other option, as you suggested, is to have the broker return information on
supported protocol versions for all brokers. However, that will be a bit
hard to manage. For instance, let's consider the following scenario.

1. Client sends MetadataRequest to BrokerA.
2. BrokerA replies following state to Client.
2.a. BrokerA [(0, 0, 2)] // (api, min_version, max_version)
2.b. BrokerB [(0, 0, 3)]
3. Client stores this info and decides to talk to BrokerB using version 3
for Api 0.
4. Meanwhile BrokerB was downgraded and now it actually only supports [(0,
0, 1)]. Not expecting a v3, broker will close connection to the hopeful
Client.

Yes, at this point the client can re-send the MetadataRequest and update its view,
but in a rolling upgrade or downgrade scenario this can get messy.

As per the current suggestion this scenario will look something like this.

1. Client sends MetadataRequest to BrokerB.
2. BrokerB replies with its supported versions [(0, 0, 3)]
3. Client stores this info and decides to talk to BrokerB using version 3
for Api 0.
4. Meanwhile BrokerB was downgraded and now it actually only supports [(0,
0, 1)]. While doing so, it will close its connection to the Client.
5. The Client re-connects and sends the MetadataRequest.
6. BrokerB replies with its supported versions [(0, 0, 1)]
7. Client and BrokerB talked happily ever after.

AFAIK, we moved away from a separate request because it would require clients
to wait for two network round trips instead of one.
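To sketch the per-connection flow described above (the field names and wire format here are illustrative, not taken from the KIP): after connecting, the client intersects its own supported range for each API with whatever that broker advertises, and repeats the check on every reconnect:

{code}
object ApiVersionPickSketch extends App {
  // (apiKey -> (minVersion, maxVersion)); numbers are illustrative only.
  val brokerAdvertised: Map[Int, (Int, Int)] = Map(0 -> (0, 3)) // e.g. BrokerB, API 0
  val clientImplements: Map[Int, (Int, Int)] = Map(0 -> (0, 2))

  /** Highest version of `api` usable on this connection, if any. */
  def pick(api: Int): Option[Int] =
    for {
      (bMin, bMax) <- brokerAdvertised.get(api)
      (cMin, cMax) <- clientImplements.get(api)
      hi = math.min(bMax, cMax)
      if hi >= math.max(bMin, cMin)
    } yield hi

  println(pick(0)) // Some(2): speak v2 of API 0 to this broker, re-check after any reconnect
}
{code}

If the broker is downgraded underneath the client, the TCP connection drops and the next connection repeats the check against the broker's new range.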

>
> It would really be good if the KIP can summarize the whole interaction
> and how clients will work.
>
> -Jay
>
> On Tue, Mar 15, 2016 at 3:24 PM, Ashish Singh  wrote:
> > Magnus and I had a brief discussion following the KIP call. KIP-35
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> >
> > wiki has been updated accordingly. Please review the KIP and vote on the
> > corresponding vote thread.
> >
> > On Mon, Mar 14, 2016 at 11:27 PM, Ashish Singh 
> wrote:
> >
> >> I think there is a bit of misunderstanding going on here regarding
> >> protocol documentation and its versioning. It could be that I am the one
> >> who misunderstood it, please correct me if so.
> >>
> >> Taking Gwen's example.
> >>
> >> 1. 0.10.0 (protocol v4)  is released with current KIP-35
> >> 2. On trunk, modify produce requests and bump to v5
> >> 3. On trunk, we modify metadata requests and bump to v6
> >> 4. Now we decide that the metadata change fixes a super critical issue
> and
> >> want to backport the change. What's the protocol version of the next
> >> release of 0.10.0 - which supports v6 protocol only partially?
> >>
> >> As per my understanding, this will be v7. When we say a broker is on
> >> ApiVersion 7, we do not necessarily mean that it also supports
> ApiVersion
> >> up to v7. A broker on ApiVersion v7 should probably mean, please refer
> v7
> >> of protocol documentation to find out supported protocol versions of
> this
> >> broker.
> >>
> >> I just added an example on the KIP wiki to elaborate more on protocol
> >> documentation versioning. Below is the excerpt.
> >>
> >> For instance say we have two brokers, BrokerA has ApiVersion 4 and
> BrokerB
> >> has ApiVersion 5. This means we should have protocol documentations for
> >> ApiVersions 4 and 5. Say we have the following as protocol documentation
> >> for these two versions.
> >>
> >> Sample Protocol Documentation V4
> >> Version: 4 // Comes from ApiVersion
> >> REQ_A_0: .

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-15 Thread Magnus Edenhill
Hey Jay,

as discussed earlier it is not safe to cache/relay a broker's version or
its supported API versions:
by the time the client connects, the broker might have upgraded to another
version, which effectively
makes this information useless in cached form.

The complexity of querying for the protocol version is very implementation
dependent and
hard to generalize on. I don't foresee any bigger problems adding support
for an extra protocol version
querying state in librdkafka, but other client devs should chime in.
There are already post-connect, pre-operation states for dealing with SSL
and SASL.

The reason for putting the API versioning stuff in the Metadata request is
that it is already used
for bootstrapping a client and/or connection and thus saves us a round-trip
(and possibly a state).


As for how this will be used: I can't speak for other client devs, but I aim to
map the features my client exposes to a set of specific APIs and their minimum
versions.
E.g.: Balanced consumer groups requires JoinGroup >= V0, LeaveGroup >= V0,
SyncGroup >= V0, and so on.
If those requirements can be fulfilled then the feature is enabled,
otherwise an error is returned to the user.
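A tiny sketch of that feature-gating idea (names are illustrative; this is not librdkafka code):

{code}
object FeatureGateSketch extends App {
  // Feature -> minimum version required per API.
  val balancedGroupsNeeds = Map("JoinGroup" -> 0, "LeaveGroup" -> 0, "SyncGroup" -> 0)

  // Maximum version of each API advertised by the connected broker.
  val brokerMax = Map("JoinGroup" -> 0, "LeaveGroup" -> 0, "SyncGroup" -> 0, "Fetch" -> 2)

  val enabled = balancedGroupsNeeds.forall { case (api, minV) =>
    brokerMax.get(api).exists(_ >= minV)
  }
  println(if (enabled) "balanced consumer groups: enabled" else "balanced consumer groups: error to user")
}
{code}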

/Magnus


2016-03-15 23:35 GMT+01:00 Jay Kreps :

> Hey Ashish,
>
> Can you expand in the proposal on how this would be used by clients?
> This proposal only has one slot for api versions, though in fact there
> is potentially a different version on each broker. I think the
> proposal is that every single time the client establishes a connection
> it would then need to issue a metadata request on that connection to
> check supported versions. Is that correct?
>
> The point of merging version information with metadata request was
> that the client wouldn't have to manage this additional state for each
> connection, but rather the broker would gather the information and
> give a summary of all brokers in the cluster. (Managing the state
> doesn't seem complex but actually since the full state machine for a
> request is something like begin connecting=>connection complete=>begin
> sending request=>do work sending=>await response=>do work reading
> response adding to the state machine around this is not as simple as
> it seems...you can see the code in the java client around this).
>
> It sounds like in this proposal you are proposing merging with the
> metadata request but not summarizing across the cluster? Can you
> explain the thinking vs a separate request?
>
> It would really be good if the KIP can summarize the whole interaction
> and how clients will work.
>
> -Jay
>
> On Tue, Mar 15, 2016 at 3:24 PM, Ashish Singh  wrote:
> > Magnus and I had a brief discussion following the KIP call. KIP-35
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> >
> > wiki has been updated accordingly. Please review the KIP and vote on the
> > corresponding vote thread.
> >
> > On Mon, Mar 14, 2016 at 11:27 PM, Ashish Singh 
> wrote:
> >
> >> I think there is a bit of misunderstanding going on here regarding
> >> protocol documentation and its versioning. It could be that I am the one
> >> who misunderstood it, please correct me if so.
> >>
> >> Taking Gwen's example.
> >>
> >> 1. 0.10.0 (protocol v4)  is released with current KIP-35
> >> 2. On trunk, modify produce requests and bump to v5
> >> 3. On trunk, we modify metadata requests and bump to v6
> >> 4. Now we decide that the metadata change fixes a super critical issue
> and
> >> want to backport the change. What's the protocol version of the next
> >> release of 0.10.0 - which supports v6 protocol only partially?
> >>
> >> As per my understanding, this will be v7. When we say a broker is on
> >> ApiVersion 7, we do not necessarily mean that it also supports
> ApiVersion
> >> up to v7. A broker on ApiVersion v7 should probably mean, please refer
> v7
> >> of protocol documentation to find out supported protocol versions of
> this
> >> broker.
> >>
> >> I just added an example on the KIP wiki to elaborate more on protocol
> >> documentation versioning. Below is the excerpt.
> >>
> >> For instance say we have two brokers, BrokerA has ApiVersion 4 and
> BrokerB
> >> has ApiVersion 5. This means we should have protocol documentations for
> >> ApiVersions 4 and 5. Say we have the following as protocol documentation
> >> for these two versions.
> >>
> >> Sample Protocol Documentation V4
> >> Version: 4 // Comes from ApiVersion
> >> REQ_A_0: ...
> >> REQ_A_1: ...
> >> RESP_A_0: ...
> >> RESP_A_1: ...
> >>
> >> Sample Protocol Documentation V5
> >> Version: 5 // Comes from ApiVersion
> >> REQ_A_1: ...
> >> REQ_A_2: ...
> >> RESP_A_1: ...
> >> RESP_A_2: ...
> >>
> >> All a client needs to know to be able to successfully communicate with a
> >> broker is what is the supported ApiVersion of the broker. Say via some
> >> mechanism, discussed below, client gets to know that BrokerA has
> ApiVersion
> >> 4 and BrokerB has ApiVersion 5. With that information, 

[jira] [Commented] (KAFKA-3205) Error in I/O with host (java.io.EOFException) raised in producer

2016-03-15 Thread Mart Haitjema (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196471#comment-15196471
 ] 

Mart Haitjema commented on KAFKA-3205:
--

I also ran into this issue and discovered that the broker closes connections 
that have been idle for connections.max.idle.ms 
(https://kafka.apache.org/090/configuration.html#brokerconfigs) which has a 
default of 10 minutes.
While this parameter was introduced in 0.8.2 
(https://kafka.apache.org/082/configuration.html#brokerconfigs) it wasn't 
actually enforced by the broker until 0.9.0 which closes the connections inside 
Selector.java::maybeCloseOldestConnection()
(see 
https://github.com/apache/kafka/commit/78ba492e3e70fd9db61bc82469371d04a8d6b762#diff-d71b50516bd2143d208c14563842390a).
While the producer config also defines this parameter with a default of 9 
minutes, it does not appear to be respected by the 0.8.2.x clients, which means 
idle connections aren't being closed on the client-side but are timed out by 
the broker.
When the broker drops the connection, it results in a java.io.EOFException: 
null exception on the producer-side that looks exactly like the one shown in 
the description.

To work around this issue, we explicitly set the connections.max.idle.ms to 
something very large in the broker config  (e.g. 1 year) which seems to have 
mitigated the problem for us.
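For clients on 0.9+ (which do honour the client-side setting), a hedged Scala sketch of the producer-side knob mentioned above; the broker-side workaround this comment describes is simply a very large connections.max.idle.ms in server.properties (roughly 31536000000 ms for one year). Values below are illustrative:

{code}
import java.util.Properties
import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.common.serialization.StringSerializer

object IdleAwareProducerSketch extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", classOf[StringSerializer].getName)
  props.put("value.serializer", classOf[StringSerializer].getName)
  // Keep the client-side idle timeout below the broker's 10-minute default so
  // the client closes idle connections first instead of hitting an EOF.
  props.put("connections.max.idle.ms", "300000") // 5 minutes
  val producer = new KafkaProducer[String, String](props)
  producer.close()
}
{code}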


> Error in I/O with host (java.io.EOFException) raised in producer
> 
>
> Key: KAFKA-3205
> URL: https://issues.apache.org/jira/browse/KAFKA-3205
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1, 0.9.0.0
>Reporter: Jonathan Raffre
>
> In a situation with a Kafka broker in 0.9 and producers still in 0.8.2.x, 
> producers seem to raise the following after a variable amount of time since 
> start :
> {noformat}
> 2016-01-29 14:33:13,066 WARN [] o.a.k.c.n.Selector: Error in I/O with 
> 172.22.2.170
> java.io.EOFException: null
> at 
> org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>  ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
> ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-internal]
> {noformat}
> This can be reproduced successfully by doing the following :
>  * Start a 0.8.2 producer connected to the 0.9 broker
>  * Wait 15 minutes, exactly
>  * See the error in the producer logs.
> Oddly, this also shows up in an active producer but after 10 minutes of 
> activity.
> Kafka's server.properties :
> {noformat}
> broker.id=1
> listeners=PLAINTEXT://:9092
> port=9092
> num.network.threads=2
> num.io.threads=2
> socket.send.buffer.bytes=1048576
> socket.receive.buffer.bytes=1048576
> socket.request.max.bytes=104857600
> log.dirs=/mnt/data/kafka
> num.partitions=4
> auto.create.topics.enable=false
> delete.topic.enable=true
> num.recovery.threads.per.data.dir=1
> log.retention.hours=48
> log.retention.bytes=524288000
> log.segment.bytes=52428800
> log.retention.check.interval.ms=6
> log.roll.hours=24
> log.cleanup.policy=delete
> log.cleaner.enable=true
> zookeeper.connect=127.0.0.1:2181
> zookeeper.connection.timeout.ms=100
> {noformat}
> Producer's configuration :
> {noformat}
>   compression.type = none
>   metric.reporters = []
>   metadata.max.age.ms = 30
>   metadata.fetch.timeout.ms = 6
>   acks = all
>   batch.size = 16384
>   reconnect.backoff.ms = 10
>   bootstrap.servers = [127.0.0.1:9092]
>   receive.buffer.bytes = 32768
>   retry.backoff.ms = 500
>   buffer.memory = 33554432
>   timeout.ms = 3
>   key.serializer = class 
> org.apache.kafka.common.serialization.StringSerializer
>   retries = 3
>   max.request.size = 500
>   block.on.buffer.full = true
>   value.serializer = class 
> org.apache.kafka.common.serialization.StringSerializer
>   metrics.sample.window.ms = 3
>   send.buffer.bytes = 131072
>   max.in.flight.requests.per.connection = 5
>   metrics.num.samples = 2
>   linger.ms = 0
>   client.id = 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3406) CommonClientConfigs.RETRY_BACKOFF_MS_DOC should be more general

2016-03-15 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-3406:
--

 Summary: CommonClientConfigs.RETRY_BACKOFF_MS_DOC should be more 
general
 Key: KAFKA-3406
 URL: https://issues.apache.org/jira/browse/KAFKA-3406
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.9.0.1
Reporter: Jun Rao
Priority: Minor


The doc now says "fetch request" and "repeated fetching-and-failing". However, 
this doc is shared with the producer as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-15 Thread Jay Kreps
Yeah I think there are two possible approaches:
1. You get the versions in the metadata request and cache them and
invalidate that cache if you get a version mismatch error (basically
as we do with leadership information).
2. You check each connection

I think combining metadata request and version check only makes sense
in (1), right? If it is (2) I don't see how you save anything and the
requests don't really make sense because you're mixing cluster wide
state about partitions with info about the answering broker.

-Jay

On Tue, Mar 15, 2016 at 4:25 PM, Magnus Edenhill  wrote:
> Hey Jay,
>
> as discussed earlier it is not safe to cache/relay a broker's version or
> its supported API versions,
> by the time the client connects the broker might have upgraded to another
> version which effectively
> makes this information useless in a cached form.
>
> The complexity of querying for protocol verion is very implementation
> dependent and
> hard to generalize on, I dont foresee any bigger problems adding support
> for an extra protocol version
> querying state in librdkafka, but other client devs should chime in.
> There are already post-connect,pre-operation states for dealing with SSL
> and SASL.
>
> The reason for putting the API versioning stuff in the Metadata request is
> that it is already used
> for bootstrapping a client and/or connection and thus saves us a round-trip
> (and possibly a state).
>
>
> For how this will be used; I can't speak for other client devs but aim to
> make a mapping between
> the features my client exposes to a set of specific APIs and their minimum
> version..
> E.g.: Balanced consumer groups requires JoinGroup >= V0, LeaveGroup >= V0,
> SyncGroup >= V0, and so on.
> If those requirements can be fullfilled then the feature is enabled,
> otherwise an error is returned to the user.
>
> /Magnus
>
>
> 2016-03-15 23:35 GMT+01:00 Jay Kreps :
>
>> Hey Ashish,
>>
>> Can you expand in the proposal on how this would be used by clients?
>> This proposal only has one slot for api versions, though in fact there
>> is potentially a different version on each broker. I think the
>> proposal is that every single time the client establishes a connection
>> it would then need to issue a metadata request on that connection to
>> check supported versions. Is that correct?
>>
>> The point of merging version information with metadata request was
>> that the client wouldn't have to manage this additional state for each
>> connection, but rather the broker would gather the information and
>> give a summary of all brokers in the cluster. (Managing the state
>> doesn't seem complex but actually since the full state machine for a
>> request is something like begin connecting=>connection complete=>begin
>> sending request=>do work sending=>await response=>do work reading
>> response adding to the state machine around this is not as simple as
>> it seems...you can see the code in the java client around this).
>>
>> It sounds like in this proposal you are proposing merging with the
>> metadata request but not summarizing across the cluster? Can you
>> explain the thinking vs a separate request?
>>
>> It would really be good if the KIP can summarize the whole interaction
>> and how clients will work.
>>
>> -Jay
>>
>> On Tue, Mar 15, 2016 at 3:24 PM, Ashish Singh  wrote:
>> > Magnus and I had a brief discussion following the KIP call. KIP-35
>> > <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>> >
>> > wiki has been updated accordingly. Please review the KIP and vote on the
>> > corresponding vote thread.
>> >
>> > On Mon, Mar 14, 2016 at 11:27 PM, Ashish Singh 
>> wrote:
>> >
>> >> I think there is a bit of misunderstanding going on here regarding
>> >> protocol documentation and its versioning. It could be that I am the one
>> >> who misunderstood it, please correct me if so.
>> >>
>> >> Taking Gwen's example.
>> >>
>> >> 1. 0.10.0 (protocol v4)  is released with current KIP-35
>> >> 2. On trunk, modify produce requests and bump to v5
>> >> 3. On trunk, we modify metadata requests and bump to v6
>> >> 4. Now we decide that the metadata change fixes a super critical issue
>> and
>> >> want to backport the change. What's the protocol version of the next
>> >> release of 0.10.0 - which supports v6 protocol only partially?
>> >>
>> >> As per my understanding, this will be v7. When we say a broker is on
>> >> ApiVersion 7, we do not necessarily mean that it also supports
>> ApiVersion
>> >> up to v7. A broker on ApiVersion v7 should probably mean, please refer
>> v7
>> >> of protocol documentation to find out supported protocol versions of
>> this
>> >> broker.
>> >>
>> >> I just added an example on the KIP wiki to elaborate more on protocol
>> >> documentation versioning. Below is the excerpt.
>> >>
>> >> For instance, say we have two brokers: BrokerA has ApiVersion 4 and
>> >> BrokerB has ApiVersion 5. This means we should have protocol
>> >> documentations for
>> >> Ap

Re: [DISCUSS] KIP-43: Kafka SASL enhancements

2016-03-15 Thread Rajini Sivaram
Both the KIP and the PR have been updated to a cut-down version as
discussed in the KIP meeting today.

Any feedback is appreciated.

On Tue, Mar 15, 2016 at 7:39 PM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

> Following on from the discussions in the KIP meeting today, the suggestion
> is to implement a cut-down version of KIP-43 for 0.10.0.0 with a follow-on
> KIP after the release to address support for custom mechanisms.
>
> Changes to be removed from KIP-43:
>
>1. Remove the configuration for CallbackHandler. The callback handler
>implementation in Kafka will support Kerberos, PLAIN and Digest-MD5. It
>will not support custom or more complex mechanisms which require additional
>callbacks.
>2. Remove the configuration for Login. The Login implementation in
>Kafka will support Kerberos and any other mechanism (PLAIN, Digest-MD5 etc)
>that doesn't require functionality like token refresh.
>
> Changes included in KIP-43:
>
>1. Configurable mechanism
>2. Support for multiple mechanisms in the broker
>3. Implementation of SASL/PLAIN
>
> If there are no objections to this, I can update the KIP and the PR by
> tomorrow. And move the support for custom mechanisms into another KIP and
> PR for review after the release of 0.10.0.0.
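
For illustration, client-side configuration for the configurable mechanism in points 1-3 above might look roughly like the sketch below. The property names are assumptions made only to keep the example concrete; they are not fixed by this draft of the KIP.

{code}
import java.util.Properties;

// Sketch only: selecting a SASL mechanism on the client.
public class SaslPlainClientConfig {
    public static Properties saslPlainProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9093");    // placeholder address
        props.put("security.protocol", "SASL_PLAINTEXT");  // or SASL_SSL
        props.put("sasl.mechanism", "PLAIN");              // assumed name for the configurable mechanism
        // For PLAIN, the username/password would come from the JAAS login context.
        return props;
    }
}
{code}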
>
>
> On Mon, Mar 14, 2016 at 7:48 AM, Rajini Sivaram <
> rajinisiva...@googlemail.com> wrote:
>
>> Harsha,
>>
>> You are right, we don't expect to override callback handler or login for
>> Digest-MD5.
>>
>> Pluggable CallbackHandler and Login modules enable custom SASL mechanisms
>> to be implemented without modifying Kafka. For instance, it would enable
>> KIP-44 (
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-44+-+Allow+Kafka+to+have+a+customized+security+protocol)
>> to be implemented without making the whole security protocol pluggable. Tao
>> Xiao has already confirmed earlier in this discussion thread that the
>> proposed callback handler and login interfaces are suitable for their
>> custom authentication.
>>
>>
>>
>> On Sun, Mar 13, 2016 at 6:59 PM, Harsha  wrote:
>>
>>> Agree with Gwen here. I feel like these additional pluggable Login
>>> Modules are making this KIP complex. Since the main goal of the KIP is
>>> to enable additional mechanisms, can we limit the scope to that? If we
>>> feel pluggable Login and callback handler classes are necessary, we can
>>> address them in another JIRA.
>>>
>>> Adding Digest-MD5 and password callbacks can be done in the existing
>>> callback handler without exposing it as a pluggable class. It would be
>>> useful to have the broker support multiple mechanisms. I haven't seen
>>> anyone using more than this in Hadoop. It might be different for Kafka,
>>> but I personally haven't seen anyone asking for this yet.
>>>
>>> Thanks,
>>> Harsha
>>>
>>>
>>> On Thu, Mar 10, 2016, at 01:44 AM, Rajini Sivaram wrote:
>>> > Gwen,
>>> >
>>> > Just to be clear, the alternative would be:
>>> >
>>> > *jaas.conf:*
>>> >
>>> > GssapiKafkaServer {
>>> >
>>> > com.ibm.security.auth.module.Krb5LoginModule required
>>> > credsType=both
>>> > useKeytab="file:/kafka/key.tab"
>>> > principal="kafka/localh...@example.com ";
>>> >
>>> > };
>>> >
>>> > SmartcardKafkaServer {
>>> >
>>> >   example.SmartcardLoginModule required
>>> >
>>> >   cardNumber=123;
>>> >
>>> > };
>>> >
>>> >
>>> > *KafkaConfig*
>>> >
>>> >
>>> >
>>> >   - login.context.map={"GSSAPI"="GssapiKafkaServer",
>>> >     "SMARTCARD"="SmartcardKafkaServer"}
>>> >   - login.class.map={"GSSAPI"=GssapiLogin.class,
>>> >     "SMARTCARD"=SmartcardLogin.class}
>>> >   - callback.handler.class.map={"GSSAPI"=GssapiCallbackHandler.class,
>>> >     "SMARTCARD"=SmartcardCallbackHandler.class}
>>> >
>>> > *Client Config *
>>> > Same as the server, but with only one entry allowed in each map and
>>> > jaas.conf
>>> >
>>> >
>>> >
>>> > This is a different model from the Java standard for supporting
>>> multiple
>>> > logins. As a developer, I am inclined to stick with approaches that are
>>> > widely in use like JSSE. But this alternative can be made to work if
>>> the
>>> > Kafka community feels it is more appropriate for Kafka. If you know of
>>> > other systems which use this approach, that would be helpful.
>>> >
>>> >
>>> >
>>> > On Thu, Mar 10, 2016 at 2:07 AM, Gwen Shapira 
>>> wrote:
>>> >
>>> > > What I'm hearing is that:
>>> > >
>>> > > 1. In order to support authentication mechanisms that were not
>>> written
>>> > > specifically with Kafka in mind, someone will need to write the
>>> > > integration between the mechanism and Kafka. This may include Login
>>> > > and CallbackHandler classes. This can be the mechanism vendor, the
>>> > > user or a 3rd party vendor.
>>> > > 2. If someone wrote the code to support a mechanism in Kafka, and a
>>> > > user will want to use more than one mechanism, they will still need
>>> to
>>> > > write a wrapper.
>>> > > 3. In reality, #2 will not be necessary ("edge-case") because Kafka
>>> >

Build failed in Jenkins: kafka-trunk-jdk7 #1118

2016-03-15 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: Remove unused import in `WordCountJob` to fix checkstyle failure

[me] KAFKA-3260 - Added SourceTask.commitRecord

--
[...truncated 1544 lines...]
kafka.integration.SslTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.KafkaTest > testGetKafkaConfigFromArgsWrongSetValue PASSED

kafka.KafkaTest > testKafkaSslPasswords PASSED

kafka.KafkaTest > testGetKafkaConfigFromArgs PASSED

kafka.KafkaTest > testGetKafkaConfigFromArgsNonArgsAtTheEnd PASSED

kafka.KafkaTest > testGetKafkaConfigFromArgsNonArgsOnly PASSED

kafka.KafkaTest > testGetKafkaConfigFromArgsNonArgsAtTheBegging PASSED

kafka.metrics.KafkaTimerTest > testKafkaTimer PASSED

kafka.utils.UtilsTest > testAbs PASSED

kafka.utils.UtilsTest > testReplaceSuffix PASSED

kafka.utils.UtilsTest > testCircularIterator PASSED

kafka.utils.UtilsTest > testReadBytes PASSED

kafka.utils.UtilsTest > testCsvList PASSED

kafka.utils.UtilsTest > testReadInt PASSED

kafka.utils.UtilsTest > testCsvMap PASSED

kafka.utils.UtilsTest > testInLock PASSED

kafka.utils.UtilsTest > testSwallow PASSED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.ByteBoundedBlockingQueueTest > testByteBoundedBlockingQueue PASSED

kafka.utils.timer.TimerTaskListTest > testAll PASSED

kafka.utils.timer.TimerTest > testAlreadyExpiredTask PASSED

kafka.utils.timer.TimerTest > testTaskExpiration PASSED

kafka.utils.CommandLineUtilsTest > testParseEmptyArg PASSED

kafka.utils.CommandLineUtilsTest > testParseSingleArg PASSED

kafka.utils.CommandLineUtilsTest > testParseArgs PASSED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid PASSED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr PASSED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.JsonTest > testJsonEncoding PASSED

kafka.message.MessageCompressionTest > testCompressSize PASSED

kafka.message.MessageCompressionTest > testSimpleCompressDecompress PASSED

kafka.message.MessageWriterTest > testWithNoCompressionAttribute PASSED

kafka.message.MessageWriterTest > testWithCompressionAttribute PASSED

kafka.message.MessageWriterTest > testBufferingOutputStream PASSED

kafka.message.MessageWriterTest > testWithKey PASSED

kafka.message.MessageTest > testChecksum PASSED

kafka.message.MessageTest > testInvalidTimestamp PASSED

kafka.message.MessageTest > testIsHashable PASSED

kafka.message.MessageTest > testInvalidTimestampAndMagicValueCombination PASSED

kafka.message.MessageTest > testExceptionMapping PASSED

kafka.message.MessageTest > testFieldValues PASSED

kafka.message.MessageTest > testInvalidMagicByte PASSED

kafka.message.MessageTest > testEquality PASSED

kafka.message.MessageTest > testMessageFormatConversion PASSED

kafka.message.ByteBufferMessageSetTest > testMessageWithProvidedOffsetSeq PASSED

kafka.message.ByteBufferMessageSetTest > testValidBytes PASSED

kafka.message.ByteBufferMessageSetTest > testValidBytesWithCompression PASSED

kafka.message.ByteBufferMessageSetTest > 
testOffsetAssignmentAfterMessageFormatConversion PASSED

kafka.message.ByteBufferMessageSetTest > testIteratorIsConsistent PASSED

kafka.message.ByteBufferMessageSetTest > testAbsoluteOffsetAssignment PASSED

kafka.message.ByteBufferMessageSetTest > testCreateTime PASSED

kafka.message.ByteBufferMessageSetTest > testInva

[jira] [Commented] (KAFKA-2078) Getting Selector [WARN] Error in I/O with host java.io.EOFException

2016-03-15 Thread Mart Haitjema (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196540#comment-15196540
 ] 

Mart Haitjema commented on KAFKA-2078:
--

I posted a 
[comment|https://issues.apache.org/jira/browse/KAFKA-3205?focusedCommentId=15196471&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15196471]
 in KAFKA-3205, but this issue seems related: we began experiencing similar 
errors periodically with our producers (using kafka 0.8.2.1) after our brokers 
were upgraded to 0.9. This turned out to be because the brokers now time out 
idle connections after a default of 10 minutes. Increasing 
connections.max.idle.ms in the broker config solved this problem for us.
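
For reference, this is the broker setting in question; the value shown is only an example of raising it above the 10-minute default.

{code}
# server.properties (broker side); 600000 ms (10 minutes) is the default.
# The value below is only an example of raising it.
connections.max.idle.ms=3600000
{code}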

> Getting Selector [WARN] Error in I/O with host java.io.EOFException
> ---
>
> Key: KAFKA-2078
> URL: https://issues.apache.org/jira/browse/KAFKA-2078
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
> Environment: OS Version: 2.6.39-400.209.1.el5uek and Hardware: 8 x 
> Intel(R) Xeon(R) CPU X5660  @ 2.80GHz/44GB
>Reporter: Aravind
>Assignee: Jun Rao
>
> When trying to produce 1000 (10 MB) messages, we get the error below somewhere 
> between the 997th and 1000th message. There is no pattern, but it is 
> reproducible.
> [PDT] 2015-03-31 13:53:50 Selector [WARN] Error in I/O with "our host" 
> java.io.EOFException at 
> org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:248) at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) at 
> java.lang.Thread.run(Thread.java:724)
> I sometimes get this error at the 997th or 999th message. There is no pattern, 
> but it is reproducible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3378) Client blocks forever if SocketChannel connects instantly

2016-03-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-3378:
---
Priority: Blocker  (was: Critical)

Thanks for reporting this. Marking this as a blocker for 0.10.0 since it can 
have a big impact on the client.

> Client blocks forever if SocketChannel connects instantly
> -
>
> Key: KAFKA-3378
> URL: https://issues.apache.org/jira/browse/KAFKA-3378
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: Larkin Lowrey
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Observed that some consumers were blocked in Fetcher.listOffset() when 
> starting many dozens of consumer threads at the same time.
> Selector.connect(...) calls SocketChannel.connect() in non-blocking mode and 
> assumes that false is always returned and that the channel will be in the 
> Selector's readyKeys once the connection is ready for connect completion due 
> to the OP_CONNECT interest op.
> When connect() returns true the channel is fully connected and will not be 
> included in readyKeys since only OP_CONNECT is set.
> I implemented a fix which handles the case when connect(...) returns true and 
> verified that I no longer see stuck consumers. A git pull request will be 
> forthcoming.
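
A minimal NIO sketch of the case being described (hypothetical names; not the actual patch): a non-blocking connect() can return true immediately, e.g. on loopback, and in that case waiting for OP_CONNECT would block forever, so the connection has to be finished right away.

{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the actual Kafka fix.
public class ImmediateConnectSketch {
    private final Selector selector;
    private final Set<SelectionKey> immediatelyConnected = new HashSet<>();

    public ImmediateConnectSketch() throws IOException {
        this.selector = Selector.open();
    }

    public void connect(InetSocketAddress address) throws IOException {
        SocketChannel channel = SocketChannel.open();
        channel.configureBlocking(false);
        boolean connected = channel.connect(address);            // true == connected instantly
        // If already connected, OP_CONNECT will never fire, so register for
        // OP_READ instead and finish the connection setup without waiting on select().
        int interestOps = connected ? SelectionKey.OP_READ : SelectionKey.OP_CONNECT;
        SelectionKey key = channel.register(selector, interestOps);
        if (connected)
            immediatelyConnected.add(key);
    }
}
{code}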



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: Fixes for Windows #154

2016-03-15 Thread JeffersJi
GitHub user JeffersJi opened a pull request:

https://github.com/apache/kafka/pull/1078

Fixes for Windows #154

See the link:
https://github.com/apache/kafka/pull/154

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JeffersJi/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1078.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1078


commit 88563137f9e434e320065f799886c48091384cef
Author: JeffersJi 
Date:   2016-03-15T23:57:52Z

Update LogSegment.scala

commit f7c3cbb5cc115e99b74c588de2f9168fd84e9ffe
Author: JeffersJi 
Date:   2016-03-16T00:02:08Z

Update OffsetIndex.scala




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #446

2016-03-15 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-3260 - Added SourceTask.commitRecord

--
[...truncated 4940 lines...]

org.apache.kafka.common.record.RecordTest > testFields[168] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[169] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[169] PASSED

org.apache.kafka.common.record.RecordTest > testFields[169] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[170] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[170] PASSED

org.apache.kafka.common.record.RecordTest > testFields[170] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[171] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[171] PASSED

org.apache.kafka.common.record.RecordTest > testFields[171] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[172] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[172] PASSED

org.apache.kafka.common.record.RecordTest > testFields[172] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[173] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[173] PASSED

org.apache.kafka.common.record.RecordTest > testFields[173] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[174] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[174] PASSED

org.apache.kafka.common.record.RecordTest > testFields[174] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[175] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[175] PASSED

org.apache.kafka.common.record.RecordTest > testFields[175] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[176] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[176] PASSED

org.apache.kafka.common.record.RecordTest > testFields[176] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[177] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[177] PASSED

org.apache.kafka.common.record.RecordTest > testFields[177] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[178] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[178] PASSED

org.apache.kafka.common.record.RecordTest > testFields[178] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[179] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[179] PASSED

org.apache.kafka.common.record.RecordTest > testFields[179] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[180] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[180] PASSED

org.apache.kafka.common.record.RecordTest > testFields[180] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[181] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[181] PASSED

org.apache.kafka.common.record.RecordTest > testFields[181] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[182] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[182] PASSED

org.apache.kafka.common.record.RecordTest > testFields[182] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[183] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[183] PASSED

org.apache.kafka.common.record.RecordTest > testFields[183] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[184] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[184] PASSED

org.apache.kafka.common.record.RecordTest > testFields[184] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[185] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[185] PASSED

org.apache.kafka.common.record.RecordTest > testFields[185] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[186] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[186] PASSED

org.apache.kafka.common.record.RecordTest > testFields[186] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[187] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[187] PASSED

org.apache.kafka.common.record.RecordTest > testFields[187] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[188] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[188] PASSED

org.apache.kafka.common.record.RecordTest > testFields[188] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[189] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[189] PASSED

org.apache.kafka.common.record.RecordTest > testFields[189] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[190] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[190] PASSED

org.apache.kafka.common.record.RecordTest > testFields[190] PASSED

org.apache.kafka.common.record.RecordTest > testChecksum[191] PASSED

org.apache.kafka.common.record.RecordTest > testEquality[191] PASSED

org.apache.kafka.

[jira] [Created] (KAFKA-3407) ErrorLoggingCallback trims helpful logging messages.

2016-03-15 Thread Jeremy Custenborder (JIRA)
Jeremy Custenborder created KAFKA-3407:
--

 Summary: ErrorLoggingCallback trims helpful logging messages.
 Key: KAFKA-3407
 URL: https://issues.apache.org/jira/browse/KAFKA-3407
 Project: Kafka
  Issue Type: Improvement
Reporter: Jeremy Custenborder
Priority: Minor


ErrorLoggingCallback currently logs only the message of the exception it 
receives. Any inner exception or call stack is not included. This makes 
troubleshooting more difficult. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3407) ErrorLoggingCallback trims helpful diagnosting information.

2016-03-15 Thread Jeremy Custenborder (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Custenborder updated KAFKA-3407:
---
Summary: ErrorLoggingCallback trims helpful diagnosting information.  (was: 
ErrorLoggingCallback trims helpful logging messages.)

> ErrorLoggingCallback trims helpful diagnosting information.
> ---
>
> Key: KAFKA-3407
> URL: https://issues.apache.org/jira/browse/KAFKA-3407
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jeremy Custenborder
>Priority: Minor
>
> ErrorLoggingCallback currently logs only the message of the exception it 
> receives. Any inner exception or call stack is not included. This makes 
> troubleshooting more difficult. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3407) ErrorLoggingCallback trims helpful diagnostic information.

2016-03-15 Thread Jeremy Custenborder (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Custenborder updated KAFKA-3407:
---
Summary: ErrorLoggingCallback trims helpful diagnostic information.  (was: 
ErrorLoggingCallback trims helpful diagnosting information.)

> ErrorLoggingCallback trims helpful diagnostic information.
> --
>
> Key: KAFKA-3407
> URL: https://issues.apache.org/jira/browse/KAFKA-3407
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jeremy Custenborder
>Priority: Minor
>
> ErrorLoggingCallback currently logs only the message of the exception it 
> receives. Any inner exception or call stack is not included. This makes 
> troubleshooting more difficult. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3407 - ErrorLoggingCallback trims helpfu...

2016-03-15 Thread jcustenborder
GitHub user jcustenborder opened a pull request:

https://github.com/apache/kafka/pull/1079

KAFKA-3407 - ErrorLoggingCallback trims helpful diagnostic information.

This should help when diagnosing issues with the console producer. This 
allows the logger to use `exception` rather than `exception.getMessage()`.
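
Roughly, the difference is the one illustrated below (illustrative Java/slf4j, not the actual Scala change):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustration only: why passing the Throwable itself is more useful than
// passing exception.getMessage().
public class ErrorLoggingExample {
    private static final Logger log = LoggerFactory.getLogger(ErrorLoggingExample.class);

    static void onCompletion(String topic, Exception exception) {
        if (exception == null)
            return;
        // Message only: the stack trace and any nested cause are lost.
        log.error("Error when sending message to topic " + topic + ": " + exception.getMessage());
        // Throwable argument: the logger prints the full stack trace, including chained causes.
        log.error("Error when sending message to topic " + topic, exception);
    }
}
{code}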

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jcustenborder/kafka KAFKA-3407

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1079.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1079


commit fe80be8e3837dd6f03f69947e3ac8f7a8c4fe14b
Author: Jeremy Custenborder 
Date:   2016-03-16T02:21:01Z

KAFKA-3407 - Changed to use exception instead of exception.getMessage(). 
This will return callstack and inner exceptions.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3407) ErrorLoggingCallback trims helpful diagnostic information.

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196650#comment-15196650
 ] 

ASF GitHub Bot commented on KAFKA-3407:
---

GitHub user jcustenborder opened a pull request:

https://github.com/apache/kafka/pull/1079

KAFKA-3407 - ErrorLoggingCallback trims helpful diagnostic information.

This should help when diagnosing issues with the console producer. This 
allows the logger to use `exception` rather than `exception.getMessage()`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jcustenborder/kafka KAFKA-3407

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1079.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1079


commit fe80be8e3837dd6f03f69947e3ac8f7a8c4fe14b
Author: Jeremy Custenborder 
Date:   2016-03-16T02:21:01Z

KAFKA-3407 - Changed to use exception instead of exception.getMessage(). 
This will return callstack and inner exceptions.




> ErrorLoggingCallback trims helpful diagnostic information.
> --
>
> Key: KAFKA-3407
> URL: https://issues.apache.org/jira/browse/KAFKA-3407
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jeremy Custenborder
>Priority: Minor
>
> ErrorLoggingCallback currently logs only the message of the exception it 
> receives. Any inner exception or call stack is not included. This makes 
> troubleshooting more difficult. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3408) consumer rebalance fail

2016-03-15 Thread zhongkai liu (JIRA)
zhongkai liu created KAFKA-3408:
---

 Summary: consumer rebalance fail
 Key: KAFKA-3408
 URL: https://issues.apache.org/jira/browse/KAFKA-3408
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.9.0.1
 Environment: centos linux
Reporter: zhongkai liu


I use "/bin/kafka-console-consumer" command to start two consumers of group 
"page_group",then the first conumer console report rebalance failure like this:
ERROR [page_view_group1_slave2-1458095694092-80c33086], error during 
syncedRebalance (kafka.consumer.ZookeeperConsumerConnector)
kafka.common.ConsumerRebalanceFailedException: 
page_view_group1_slave2-1458095694092-80c33086 can't rebalance after 10 retries
at 
kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:660)
at 
kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:579)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3409) Mirror maker hangs indefinitely due to commit

2016-03-15 Thread TAO XIAO (JIRA)
TAO XIAO created KAFKA-3409:
---

 Summary: Mirror maker hangs indefinitely due to commit 
 Key: KAFKA-3409
 URL: https://issues.apache.org/jira/browse/KAFKA-3409
 Project: Kafka
  Issue Type: Bug
  Components: tools
Affects Versions: 0.9.0.1
 Environment: Kafka 0.9.0.1
Reporter: TAO XIAO


Mirror maker hangs indefinitely upon receiving CommitFailedException. I believe 
this is because the CommitFailedException is not caught by mirror maker, so 
mirror maker has no way to recover from it.

A better approach would be to catch the exception and rejoin the group (see the 
sketch after the stack trace below). Here is the stack trace:

[2016-03-15 09:34:36,463] ERROR Error UNKNOWN_MEMBER_ID occurred while 
committing offsets for group x 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2016-03-15 09:34:36,463] FATAL [mirrormaker-thread-3] Mirror maker thread 
failure due to  (kafka.tools.MirrorMaker$MirrorMakerThread)
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be 
completed due to group rebalance
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:552)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:493)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:665)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:644)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:380)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:274)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:193)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:358)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:968)
at 
kafka.tools.MirrorMaker$MirrorMakerNewConsumer.commit(MirrorMaker.scala:548)
at kafka.tools.MirrorMaker$.commitOffsets(MirrorMaker.scala:340)
at 
kafka.tools.MirrorMaker$MirrorMakerThread.maybeFlushAndCommitOffsets(MirrorMaker.scala:438)
at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:399)
[2016-03-15 09:34:36,463] INFO [mirrormaker-thread-3] Flushing producer. 
(kafka.tools.MirrorMaker$MirrorMakerThread)
[2016-03-15 09:34:36,464] INFO [mirrormaker-thread-3] Committing consumer 
offsets. (kafka.tools.MirrorMaker$MirrorMakerThread)
[2016-03-15 09:34:36,477] ERROR Error UNKNOWN_MEMBER_ID occurred while 
committing offsets for group x 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
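
A sketch of the suggested handling (illustrative only, not MirrorMaker's actual code): catch the exception so the thread survives, and let the next poll() rejoin the group.

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.CommitFailedException;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Illustrative only -- not MirrorMaker's actual code.
public class CommitRetryExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "mirror-group");              // placeholder
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
                // ... mirror the records ...
                try {
                    consumer.commitSync();
                } catch (CommitFailedException e) {
                    // The group rebalanced while this consumer held the records.
                    // Swallow the exception instead of dying; the next poll()
                    // rejoins the group and the partitions are re-fetched.
                }
            }
        }
    }
}
{code}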



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3290) WorkerSourceTask testCommit transient failure

2016-03-15 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196767#comment-15196767
 ] 

Ewen Cheslack-Postava commented on KAFKA-3290:
--

This was reopened because it seemed to be triggered very infrequently, but I'm 
actually encountering something a bit different which might also be 
related. I managed to grab a thread dump at (approximately) the point of 
failure, and I think this is the relevant piece:

{quote}
"pool-3-thread-1"
   java.lang.Thread.State: TIMED_WAITING
at java.lang.Object.wait(Native Method)
at 
org.apache.kafka.connect.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:284)
at 
org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:159)
at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:126)
at 
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:139)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{quote}

Triggering this seems to require parallel test execution and there must be at 
least a bit of load on my CPU. I'm not sure the output is perfect because 
during some of the test runs it seemed like the stdout reporting may be somehow 
broken by parallel test execution (e.g. some output seemed cut off at the wrong 
place).

It looks like the accounting for outstanding records might not be handled 
properly because a commit that happens when the SourceTask is being stopped 
ends up waiting for producer callbacks for some outstanding records. However, 
I'm skeptical of that assessment because even adding some debugging printouts 
to stdout is not showing any records being produced in the stdout of the failed 
tests.

[~hachikuji] I found this while trying to merge KAFKA-3394, which is obviously 
unrelated. Seems like as soon as I merged [~jcustenborder]'s commitRecord patch 
to trunk it decided to start failing often for me... I think these two issues 
might have the same cause (some threading/timing related issue).

> WorkerSourceTask testCommit transient failure
> -
>
> Key: KAFKA-3290
> URL: https://issues.apache.org/jira/browse/KAFKA-3290
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.10.0.0
>
>
> From recent failed build:
> {code}
> org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testCommit FAILED
> java.lang.AssertionError:
>   Expectation failure on verify:
> Listener.onStartup(job-0): expected: 1, actual: 1
> Listener.onShutdown(job-0): expected: 1, actual: 1
> at org.easymock.internal.MocksControl.verify(MocksControl.java:225)
> at 
> org.powermock.api.easymock.internal.invocationcontrol.EasyMockMethodInvocationControl.verify(EasyMockMethodInvocationControl.java:132)
> at org.powermock.api.easymock.PowerMock.verify(PowerMock.java:1466)
> at org.powermock.api.easymock.PowerMock.verifyAll(PowerMock.java:1405)
> at 
> org.apache.kafka.connect.runtime.WorkerSourceTaskTest.testCommit(WorkerSourceTaskTest.java:221)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

