Re: Review Request 25995: Patch for KAFKA-1650

2014-12-08 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25995/
---

(Updated Dec. 8, 2014, 9:36 a.m.)


Review request for kafka.


Bugs: KAFKA-1650
https://issues.apache.org/jira/browse/KAFKA-1650


Repository: kafka


Description (updated)
---

Addressed Guozhang's comments.


Addressed Guozhang's comments


commit before switch to trunk


commit before rebase


Rebased on trunk, Addressed Guozhang's comments.


Addressed Guozhang's comments on MaxInFlightRequests


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
mirrormaker-redesign


Incorporated Guozhang's comments


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
mirrormaker-redesign


Merged KAFKA-345 into this patch. Incorporated Joel and Jun's comments.


Added consumer rebalance listener to mirror maker, will test it later.


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
mirrormaker-redesign

Conflicts:
core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala

core/src/test/scala/unit/kafka/consumer/ZookeeperConsumerConnectorTest.scala

added custom config for consumer rebalance listener


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
mirrormaker-redesign


Add configurable consumer rebalance listener


Incorporated Guozhang's comments


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
mirrormaker-redesign


Incorporated Guozhang's comments.


Addressed Guozhang's comment.


numMessageUnacked should be decremented whether or not the send was 
successful.


Addressed Jun's comments.


Incorporated Jun's comments


Incorporated Jun's comments and rebased on trunk


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/ConsumerConnector.scala 
62c0686e816d2888772d5a911becf625eedee397 
  core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
e991d2187d03241f639eeaf6769fb59c8c99664c 
  core/src/main/scala/kafka/javaapi/consumer/ZookeeperConsumerConnector.scala 
9baad34a9793e5067d11289ece2154ba87b388af 
  core/src/main/scala/kafka/tools/MirrorMaker.scala 
b06ff6000183b257005b5ac3ccc7ba8976f1ab8d 

Diff: https://reviews.apache.org/r/25995/diff/


Testing
---


Thanks,

Jiangjie Qin



[jira] [Updated] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.

2014-12-08 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1650:

Attachment: KAFKA-1650_2014-12-08_01:36:01.patch

 Mirror Maker could lose data on unclean shutdown.
 -

 Key: KAFKA-1650
 URL: https://issues.apache.org/jira/browse/KAFKA-1650
 Project: Kafka
  Issue Type: Improvement
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, 
 KAFKA-1650_2014-11-12_09:51:30.patch, KAFKA-1650_2014-11-17_18:44:37.patch, 
 KAFKA-1650_2014-11-20_12:00:16.patch, KAFKA-1650_2014-11-24_08:15:17.patch, 
 KAFKA-1650_2014-12-03_15:02:31.patch, KAFKA-1650_2014-12-03_19:02:13.patch, 
 KAFKA-1650_2014-12-04_11:59:07.patch, KAFKA-1650_2014-12-06_18:58:57.patch, 
 KAFKA-1650_2014-12-08_01:36:01.patch


 Currently, if mirror maker is shut down uncleanly, the data in the data 
 channel and buffer could be lost. With the new producer's 
 callback, this issue can be solved.
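
A minimal sketch of that callback approach (assuming only the new producer's
Callback interface; the class is illustrative, and the counter name echoes the
numMessageUnacked counter from the review above):

import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.RecordMetadata;

// Counts in-flight sends so an orderly shutdown can wait until it reaches zero.
public class UnackedTracker implements Callback {
    private final AtomicInteger numMessageUnacked;

    public UnackedTracker(AtomicInteger numMessageUnacked) {
        this.numMessageUnacked = numMessageUnacked;
    }

    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        // Decrement whether or not the send succeeded, so shutdown never
        // blocks forever waiting on a send that already failed.
        numMessageUnacked.decrementAndGet();
    }
}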



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.

2014-12-08 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237679#comment-14237679
 ] 

Jiangjie Qin commented on KAFKA-1650:
-

Updated reviewboard https://reviews.apache.org/r/25995/diff/
 against branch origin/trunk

 Mirror Maker could lose data on unclean shutdown.
 -

 Key: KAFKA-1650
 URL: https://issues.apache.org/jira/browse/KAFKA-1650
 Project: Kafka
  Issue Type: Improvement
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, 
 KAFKA-1650_2014-11-12_09:51:30.patch, KAFKA-1650_2014-11-17_18:44:37.patch, 
 KAFKA-1650_2014-11-20_12:00:16.patch, KAFKA-1650_2014-11-24_08:15:17.patch, 
 KAFKA-1650_2014-12-03_15:02:31.patch, KAFKA-1650_2014-12-03_19:02:13.patch, 
 KAFKA-1650_2014-12-04_11:59:07.patch, KAFKA-1650_2014-12-06_18:58:57.patch, 
 KAFKA-1650_2014-12-08_01:36:01.patch


 Currently, if mirror maker is shut down uncleanly, the data in the data 
 channel and buffer could be lost. With the new producer's 
 callback, this issue can be solved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28643: Patch for KAFKA-1802

2014-12-08 Thread Andrii Biletskyi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28643/
---

(Updated Dec. 8, 2014, 10:56 a.m.)


Review request for kafka.


Bugs: KAFKA-1802
https://issues.apache.org/jira/browse/KAFKA-1802


Repository: kafka


Description (updated)
---

KAFKA-1802 -  Add a new type of request for the discovery of the controller


KAFKA-1802 -  UpdateMetadataRequest is not sent on startup, so brokers do not 
cache cluster info


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/common/protocol/ApiKeys.java 
109fc965e09b2ed186a073351bd037ac8af20a4c 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
7517b879866fc5dad5f8d8ad30636da8bbe7784a 
  clients/src/main/java/org/apache/kafka/common/requests/AdminRequest.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/requests/AdminResponse.java 
PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/requests/ClusterMetadataRequest.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/requests/ClusterMetadataResponse.java
 PRE-CREATION 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
df37fc6d8f0db0b8192a948426af603be3444da4 
  core/src/main/scala/kafka/api/AdminRequest.scala PRE-CREATION 
  core/src/main/scala/kafka/api/AdminResponse.scala PRE-CREATION 
  core/src/main/scala/kafka/api/ClusterMetadataRequest.scala PRE-CREATION 
  core/src/main/scala/kafka/api/ClusterMetadataResponse.scala PRE-CREATION 
  core/src/main/scala/kafka/api/RequestKeys.scala 
c24c0345feedc7b9e2e9f40af11bfa1b8d328c43 
  core/src/main/scala/kafka/api/admin/request/args/ParseException.scala 
PRE-CREATION 
  core/src/main/scala/kafka/api/admin/request/args/TopicCommandArguments.scala 
PRE-CREATION 
  core/src/main/scala/kafka/common/AdminRequestFailedException.scala 
PRE-CREATION 
  core/src/main/scala/kafka/common/ErrorMapping.scala 
eedc2f5f21dd8755fba891998456351622e17047 
  core/src/main/scala/kafka/controller/ControllerChannelManager.scala 
eb492f00449744bc8d63f55b393e2a1659d38454 
  core/src/main/scala/kafka/controller/KafkaController.scala 
66df6d2fbdbdd556da6bea0df84f93e0472c8fbf 
  core/src/main/scala/kafka/server/KafkaApis.scala 
2a1c0326b6e6966d8b8254bd6a1cb83ad98a3b80 
  core/src/main/scala/kafka/server/MetadataCache.scala 
bf81a1ab88c14be8697b441eedbeb28fa0112643 
  core/src/test/scala/unit/kafka/api/AdminRequestTest.scala PRE-CREATION 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
cd16ced5465d098be7a60498326b2a98c248f343 

Diff: https://reviews.apache.org/r/28643/diff/


Testing
---


Thanks,

Andrii Biletskyi



[jira] [Commented] (KAFKA-1802) Add a new type of request for the discovery of the controller

2014-12-08 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237754#comment-14237754
 ] 

Andrii Biletskyi commented on KAFKA-1802:
-

Updated reviewboard https://reviews.apache.org/r/28643/diff/
 against branch origin/trunk

 Add a new type of request for the discovery of the controller
 -

 Key: KAFKA-1802
 URL: https://issues.apache.org/jira/browse/KAFKA-1802
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3

 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch


 The goal here is similar to metadata discovery for the producer: the CLI can 
 find which broker it should send the rest of its admin requests to. Any 
 broker can respond to this specific AdminMeta RQ/RP, but only the controller 
 broker should respond to Admin messages; any other broker should answer an 
 admin message with a response naming the controller.
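
A hypothetical sketch of the discovery loop this request enables (the types
and method names below are illustrative stand-ins, not the patch's actual
ClusterMetadataRequest/Response classes):

import java.util.List;

// Stand-in for the controller coordinates carried by the response.
class ControllerInfo {
    final int brokerId;
    final String host;
    final int port;
    ControllerInfo(int brokerId, String host, int port) {
        this.brokerId = brokerId;
        this.host = host;
        this.port = port;
    }
}

interface AdminTransport {
    // Any broker may answer this request; it need not be the controller.
    ControllerInfo clusterMetadata(String host, int port) throws Exception;
}

class ControllerDiscovery {
    // Ask bootstrap brokers in turn until one returns the controller;
    // subsequent admin requests then go to that broker.
    static ControllerInfo discover(AdminTransport transport, List<String> bootstrap) {
        for (String hostPort : bootstrap) {
            String[] parts = hostPort.split(":");
            try {
                return transport.clusterMetadata(parts[0], Integer.parseInt(parts[1]));
            } catch (Exception e) {
                // Broker unreachable; try the next one.
            }
        }
        throw new IllegalStateException("no broker answered the cluster metadata request");
    }
}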



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1802) Add a new type of request for the discovery of the controller

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1802:

Attachment: KAFKA-1802_2014-12-08_12:56:03.patch

 Add a new type of request for the discovery of the controller
 -

 Key: KAFKA-1802
 URL: https://issues.apache.org/jira/browse/KAFKA-1802
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3

 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch


 The goal here is similar to metadata discovery for the producer: the CLI can 
 find which broker it should send the rest of its admin requests to. Any 
 broker can respond to this specific AdminMeta RQ/RP, but only the controller 
 broker should respond to Admin messages; any other broker should answer an 
 admin message with a response naming the controller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28481: Patch for KAFKA-1792

2014-12-08 Thread Dmitry Pekar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28481/
---

(Updated Dec. 8, 2014, 11:43 a.m.)


Review request for kafka.


Bugs: KAFKA-1792
https://issues.apache.org/jira/browse/KAFKA-1792


Repository: kafka


Description (updated)
---

KAFKA-1792: CR


KAFKA-1792: CR2


Diffs (updated)
-

  core/src/main/scala/kafka/admin/AdminUtils.scala 
28b12c7b89a56c113b665fbde1b95f873f8624a3 
  core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 
979992b68af3723cd229845faff81c641123bb88 
  core/src/test/scala/unit/kafka/admin/AdminTest.scala 
e28979827110dfbbb92fe5b152e7f1cc973de400 
  topics.json ff011ed381e781b9a177036001d44dca3eac586f 

Diff: https://reviews.apache.org/r/28481/diff/


Testing
---


Thanks,

Dmitry Pekar



[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-12-08 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar updated KAFKA-1792:

Attachment: KAFKA-1792_2014-12-08_13:42:43.patch

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
 KAFKA-1792_2014-12-08_13:42:43.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current 
 replica assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a 
 fourth broker id=3, --generate will create an assignment config that 
 redistributes replicas fairly across brokers [0..3], as if those partitions 
 were created from scratch. It does not consider the current replica 
 assignment and accordingly does not try to minimize the number of replica 
 moves between brokers.
 As proposed by [~charmalloc] this should be improved. The output of the 
 improved --generate algorithm should satisfy the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 
 replicas assigned;
 - minimum of reassignments - the number of replica moves between brokers 
 will be minimal.
 Example.
 Consider the following replica distribution per broker [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for 
 the specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments.
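
A minimal sketch of the fairness arithmetic from the example above (counts
only; the real algorithm must also decide which individual partitions move,
and the class and method names here are illustrative):

import java.util.Arrays;

class FairAssignment {
    // Every broker ends up with R or R+1 replicas, where R = total / brokers.
    static int[] targets(int[] current) {
        int total = Arrays.stream(current).sum();
        int brokers = current.length;
        int base = total / brokers;      // R
        int extra = total % brokers;     // this many brokers get R+1
        int[] target = new int[brokers];
        for (int i = 0; i < brokers; i++)
            target[i] = base + (i < extra ? 1 : 0);
        return target;
    }

    public static void main(String[] args) {
        int[] current = {7, 6, 0, 0};
        int[] target = targets(current); // [4, 3, 3, 3]
        int moves = 0;
        for (int i = 0; i < current.length; i++)
            moves += Math.max(0, current[i] - target[i]);
        // Prints [4, 3, 3, 3] with moves = 6, matching the example.
        System.out.println(Arrays.toString(target) + " moves=" + moves);
    }
}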



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-12-08 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237786#comment-14237786
 ] 

Dmitry Pekar commented on KAFKA-1792:
-

Updated reviewboard https://reviews.apache.org/r/28481/diff/
 against branch origin/trunk

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
 KAFKA-1792_2014-12-08_13:42:43.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current 
 replica assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a 
 fourth broker id=3, --generate will create an assignment config that 
 redistributes replicas fairly across brokers [0..3], as if those partitions 
 were created from scratch. It does not consider the current replica 
 assignment and accordingly does not try to minimize the number of replica 
 moves between brokers.
 As proposed by [~charmalloc] this should be improved. The output of the 
 improved --generate algorithm should satisfy the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 
 replicas assigned;
 - minimum of reassignments - the number of replica moves between brokers 
 will be minimal.
 Example.
 Consider the following replica distribution per broker [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for 
 the specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-12-08 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237796#comment-14237796
 ] 

Dmitry Pekar commented on KAFKA-1792:
-

[~nehanarkhede] Thank you for your comments. I've updated and added a patch 
for the fixed items.
I can't agree with you about the unit test for the algorithm. If it contains 
a bug or could be improved in the future, then we would not be able to 
guarantee its correctness after the fix/improvement without the unit test.

The above unit test, IMHO, already covers those scenarios, but maybe I've 
missed some important scenario.
Could you please review it as well?

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
 KAFKA-1792_2014-12-08_13:42:43.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current 
 replica assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a 
 fourth broker id=3, --generate will create an assignment config that 
 redistributes replicas fairly across brokers [0..3], as if those partitions 
 were created from scratch. It does not consider the current replica 
 assignment and accordingly does not try to minimize the number of replica 
 moves between brokers.
 As proposed by [~charmalloc] this should be improved. The output of the 
 improved --generate algorithm should satisfy the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 
 replicas assigned;
 - minimum of reassignments - the number of replica moves between brokers 
 will be minimal.
 Example.
 Consider the following replica distribution per broker [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for 
 the specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-12-08 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237796#comment-14237796
 ] 

Dmitry Pekar edited comment on KAFKA-1792 at 12/8/14 11:48 AM:
---

[~nehanarkhede] Thank you for your comments. I've updated rb and added a 
patch for the fixed items.
I can't agree with you about the unit test for the algorithm. If it contains 
a bug or could be improved in the future, then we would not be able to 
guarantee its correctness after the fix/improvement without the unit test.

The above unit test, IMHO, already covers those scenarios, but maybe I've 
missed some important scenario.
Could you please review it as well?


was (Author: dmitry pekar):
[~nehanarkhede] Thank you for your comments. I've updated and added a patch 
for the fixed items.
I can't agree with you about the unit test for the algorithm. If it contains 
a bug or could be improved in the future, then we would not be able to 
guarantee its correctness after the fix/improvement without the unit test.

The above unit test, IMHO, already covers those scenarios, but maybe I've 
missed some important scenario.
Could you please review it as well?

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
 KAFKA-1792_2014-12-08_13:42:43.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current 
 replica assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a 
 fourth broker id=3, --generate will create an assignment config that 
 redistributes replicas fairly across brokers [0..3], as if those partitions 
 were created from scratch. It does not consider the current replica 
 assignment and accordingly does not try to minimize the number of replica 
 moves between brokers.
 As proposed by [~charmalloc] this should be improved. The output of the 
 improved --generate algorithm should satisfy the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 
 replicas assigned;
 - minimum of reassignments - the number of replica moves between brokers 
 will be minimal.
 Example.
 Consider the following replica distribution per broker [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for 
 the specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-12-08 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237796#comment-14237796
 ] 

Dmitry Pekar edited comment on KAFKA-1792 at 12/8/14 11:49 AM:
---

[~nehanarkhede] Thank you for your comments. I've updated rb and added a 
patch for the fixed items.
I can't agree with you about the unit test for the algorithm. If the 
algorithm contains a bug or could be improved in the future, then we would 
not be able to verify its correctness after the fix/improvement without the 
unit test.

The above unit test, IMHO, already covers those scenarios, but maybe I've 
missed some important scenario.
Could you please review it as well?


was (Author: dmitry pekar):
[~nehanarkhede] Thank you for your comments. I've updated rb and added a 
patch for the fixed items.
I can't agree with you about the unit test for the algorithm. If it contains 
a bug or could be improved in the future, then we would not be able to 
guarantee its correctness after the fix/improvement without the unit test.

The above unit test, IMHO, already covers those scenarios, but maybe I've 
missed some important scenario.
Could you please review it as well?

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
 KAFKA-1792_2014-12-08_13:42:43.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current 
 replica assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a 
 fourth broker id=3, --generate will create an assignment config that 
 redistributes replicas fairly across brokers [0..3], as if those partitions 
 were created from scratch. It does not consider the current replica 
 assignment and accordingly does not try to minimize the number of replica 
 moves between brokers.
 As proposed by [~charmalloc] this should be improved. The output of the 
 improved --generate algorithm should satisfy the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 
 replicas assigned;
 - minimum of reassignments - the number of replica moves between brokers 
 will be minimal.
 Example.
 Consider the following replica distribution per broker [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for 
 the specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1694) kafka command line and centralized operations

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1694:

Attachment: (was: KAFKA-1772_1802_1775_1774.patch)

 kafka command line and centralized operations
 -

 Key: KAFKA-1694
 URL: https://issues.apache.org/jira/browse/KAFKA-1694
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Priority: Critical
 Fix For: 0.8.3


 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1694) kafka command line and centralized operations

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1694:

Attachment: KAFKA-1772_1802_1775_1774_v2.patch

 kafka command line and centralized operations
 -

 Key: KAFKA-1694
 URL: https://issues.apache.org/jira/browse/KAFKA-1694
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Priority: Critical
 Fix For: 0.8.3

 Attachments: KAFKA-1772_1802_1775_1774_v2.patch


 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1694) kafka command line and centralized operations

2014-12-08 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237283#comment-14237283
 ] 

Andrii Biletskyi edited comment on KAFKA-1694 at 12/8/14 12:40 PM:
---

I've added a single patch that covers all currently implemented functionality 
(Admin message + basic shell functionality with TopicCommand) to receive some 
initial feedback.

To start the Shell, please follow the instructions at 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+Tool+Installation


was (Author: abiletskyi):
I've added a single patch that covers all currently implemented functionality 
(Admin message + basic shell functionality with TopicCommand) to receive some 
initial feedback.
Patch is created against trunk, commit 7e9368b.

To get this working:
1) apply the patch
2) build kafka: ./gradlew releaseTarGz_2_10_4
3) start kafka somewhere from the built release (the archive is in 
./core/build/distributions)
4.1) To start the interactive shell:
# sudo bin/kafka.sh --shell --broker host:port
kafka> help
Or:
4.2) Call TopicCommand right from kafka.sh. E.g.:
# sudo bin/kafka.sh --list-topics --broker host:port



 kafka command line and centralized operations
 -

 Key: KAFKA-1694
 URL: https://issues.apache.org/jira/browse/KAFKA-1694
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Priority: Critical
 Fix For: 0.8.3

 Attachments: KAFKA-1772_1802_1775_1774_v2.patch


 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP

2014-12-08 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237951#comment-14237951
 ] 

Andrii Biletskyi commented on KAFKA-1774:
-

Patch is available under parent ticket - KAFKA-1694

 REPL and Shell Client for Admin Message RQ/RP
 -

 Key: KAFKA-1774
 URL: https://issues.apache.org/jira/browse/KAFKA-1774
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 We should have a REPL we can work in and execute the commands with their 
 arguments. With this we can do:
 ./kafka.sh --shell 
 kafka> attach cluster -b localhost:9092;
 kafka> describe topic sampleTopicNameForExample;
 The command line version can work like it does now, so folks don't have to 
 re-write all of their tooling:
 kafka.sh --topics, with everything the same as kafka-topics.sh 
 kafka.sh --reassign, with everything the same as kafka-reassign-partitions.sh 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1772) Add an Admin message type for request response

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1772:

Attachment: (was: KAFKA-1772_2014-12-02_16:23:26.patch)

 Add an Admin message type for request response
 --

 Key: KAFKA-1772
 URL: https://issues.apache.org/jira/browse/KAFKA-1772
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 - utility int8
 - command int8
 - format int8
 - args variable length bytes
 utility 
 0 - Broker
 1 - Topic
 2 - Replication
 3 - Controller
 4 - Consumer
 5 - Producer
 Command
 0 - Create
 1 - Alter
 3 - Delete
 4 - List
 5 - Audit
 format
 0 - JSON
 args e.g. (which would equate to the data structure values == 2,1,0)
 meta-store: {
   "zookeeper": "localhost:12913/kafka"
 }
 args: {
   "partitions": [
     {"topic": "topic1", "partition": 0},
     {"topic": "topic1", "partition": 1},
     {"topic": "topic1", "partition": 2},
     {"topic": "topic2", "partition": 0},
     {"topic": "topic2", "partition": 1}
   ]
 }
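
For illustration, a message with this layout (three int8 header fields plus
variable-length args) could be encoded as below; this is a sketch only, and
the class and constant names are assumptions, not the patch's actual code:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class AdminMessage {
    // Values matching the example above (utility=2, command=1, format=0).
    static final byte UTILITY_REPLICATION = 2;
    static final byte COMMAND_ALTER = 1;
    static final byte FORMAT_JSON = 0;

    static ByteBuffer encode(byte utility, byte command, byte format, String args) {
        byte[] argBytes = args.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(3 + 4 + argBytes.length);
        buf.put(utility).put(command).put(format); // three int8 header fields
        buf.putInt(argBytes.length);               // length-prefixed args bytes
        buf.put(argBytes);
        buf.flip();
        return buf;
    }
}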



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1772) Add an Admin message type for request response

2014-12-08 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216456#comment-14216456
 ] 

Andrii Biletskyi edited comment on KAFKA-1772 at 12/8/14 3:12 PM:
--

Patch is available under parent ticket - KAFKA-1694


was (Author: abiletskyi):
Created reviewboard https://reviews.apache.org/r/28175/diff/
 against branch origin/trunk

 Add an Admin message type for request response
 --

 Key: KAFKA-1772
 URL: https://issues.apache.org/jira/browse/KAFKA-1772
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 - utility int8
 - command int8
 - format int8
 - args variable length bytes
 utility 
 0 - Broker
 1 - Topic
 2 - Replication
 3 - Controller
 4 - Consumer
 5 - Producer
 Command
 0 - Create
 1 - Alter
 3 - Delete
 4 - List
 5 - Audit
 format
 0 - JSON
 args e.g. (which would equate to the data structure values == 2,1,0)
 meta-store: {
   "zookeeper": "localhost:12913/kafka"
 }
 args: {
   "partitions": [
     {"topic": "topic1", "partition": 0},
     {"topic": "topic1", "partition": 1},
     {"topic": "topic1", "partition": 2},
     {"topic": "topic2", "partition": 0},
     {"topic": "topic2", "partition": 1}
   ]
 }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1772) Add an Admin message type for request response

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1772:

Attachment: (was: KAFKA-1772.patch)

 Add an Admin message type for request response
 --

 Key: KAFKA-1772
 URL: https://issues.apache.org/jira/browse/KAFKA-1772
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 - utility int8
 - command int8
 - format int8
 - args variable length bytes
 utility 
 0 - Broker
 1 - Topic
 2 - Replication
 3 - Controller
 4 - Consumer
 5 - Producer
 Command
 0 - Create
 1 - Alter
 3 - Delete
 4 - List
 5 - Audit
 format
 0 - JSON
 args e.g. (which would equate to the data structure values == 2,1,0)
 meta-store: {
   "zookeeper": "localhost:12913/kafka"
 }
 args: {
   "partitions": [
     {"topic": "topic1", "partition": 0},
     {"topic": "topic1", "partition": 1},
     {"topic": "topic1", "partition": 2},
     {"topic": "topic2", "partition": 0},
     {"topic": "topic2", "partition": 1}
   ]
 }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-1802) Add a new type of request for the discovery of the controller

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1802:

Comment: was deleted

(was: Updated reviewboard https://reviews.apache.org/r/28643/diff/
 against branch origin/trunk)

 Add a new type of request for the discovery of the controller
 -

 Key: KAFKA-1802
 URL: https://issues.apache.org/jira/browse/KAFKA-1802
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3

 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch


 The goal here is similar to metadata discovery for the producer: the CLI can 
 find which broker it should send the rest of its admin requests to. Any 
 broker can respond to this specific AdminMeta RQ/RP, but only the controller 
 broker should respond to Admin messages; any other broker should answer an 
 admin message with a response naming the controller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-1802) Add a new type of request for the discovery of the controller

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1802:

Comment: was deleted

(was: Created reviewboard https://reviews.apache.org/r/28643/diff/
 against branch origin/trunk

[patch created on top of patch for KAFKA-1772])

 Add a new type of request for the discovery of the controller
 -

 Key: KAFKA-1802
 URL: https://issues.apache.org/jira/browse/KAFKA-1802
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3

 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch


 The goal here is similar to metadata discovery for the producer: the CLI can 
 find which broker it should send the rest of its admin requests to. Any 
 broker can respond to this specific AdminMeta RQ/RP, but only the controller 
 broker should respond to Admin messages; any other broker should answer an 
 admin message with a response naming the controller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi resolved KAFKA-1774.
-
Resolution: Implemented

 REPL and Shell Client for Admin Message RQ/RP
 -

 Key: KAFKA-1774
 URL: https://issues.apache.org/jira/browse/KAFKA-1774
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 We should have a REPL we can work in and execute the commands with their 
 arguments. With this we can do:
 ./kafka.sh --shell 
 kafka> attach cluster -b localhost:9092;
 kafka> describe topic sampleTopicNameForExample;
 The command line version can work like it does now, so folks don't have to 
 re-write all of their tooling:
 kafka.sh --topics, with everything the same as kafka-topics.sh 
 kafka.sh --reassign, with everything the same as kafka-reassign-partitions.sh 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1802) Add a new type of request for the discovery of the controller

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1802:

Attachment: (was: KAFKA-1802.patch)

 Add a new type of request for the discovery of the controller
 -

 Key: KAFKA-1802
 URL: https://issues.apache.org/jira/browse/KAFKA-1802
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 The goal here is similar to metadata discovery for the producer: the CLI can 
 find which broker it should send the rest of its admin requests to. Any 
 broker can respond to this specific AdminMeta RQ/RP, but only the controller 
 broker should respond to Admin messages; any other broker should answer an 
 admin message with a response naming the controller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1802) Add a new type of request for the discovery of the controller

2014-12-08 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237954#comment-14237954
 ] 

Andrii Biletskyi commented on KAFKA-1802:
-

Patch is available under parent ticket - KAFKA-1694

 Add a new type of request for the discovery of the controller
 -

 Key: KAFKA-1802
 URL: https://issues.apache.org/jira/browse/KAFKA-1802
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 The goal here is similar to metadata discovery for the producer: the CLI can 
 find which broker it should send the rest of its admin requests to. Any 
 broker can respond to this specific AdminMeta RQ/RP, but only the controller 
 broker should respond to Admin messages; any other broker should answer an 
 admin message with a response naming the controller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1775) Re-factor TopicCommand into the new handleAdminMessage call

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi resolved KAFKA-1775.
-
Resolution: Implemented

 Re-factor TopicCommand into the new handleAdminMessage call 
 -

 Key: KAFKA-1775
 URL: https://issues.apache.org/jira/browse/KAFKA-1775
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 kafka-topic.sh should become
 kafka --topic, with everything else the same from the CLI perspective, so we 
 need the calls from the byte layer to be fed into that same code (with as 
 few changes as possible), called from the handleAdmin call after deducing 
 which Utility[1] it is operating for.
 I think we should not remove the existing kafka-topic.sh, and should 
 preserve the existing functionality (with as little code duplication as 
 possible) until 0.9 (where we can remove it after folks have used the new 
 tool for a release or two and given feedback)[2]
 [1] https://issues.apache.org/jira/browse/KAFKA-1772
 [2] https://issues.apache.org/jira/browse/KAFKA-1776



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1802) Add a new type of request for the discovery of the controller

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-1802:

Attachment: (was: KAFKA-1802_2014-12-08_12:56:03.patch)

 Add a new type of request for the discovery of the controller
 -

 Key: KAFKA-1802
 URL: https://issues.apache.org/jira/browse/KAFKA-1802
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 The goal here is similar to metadata discovery for the producer: the CLI can 
 find which broker it should send the rest of its admin requests to. Any 
 broker can respond to this specific AdminMeta RQ/RP, but only the controller 
 broker should respond to Admin messages; any other broker should answer an 
 admin message with a response naming the controller.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi reopened KAFKA-1774:
-

 REPL and Shell Client for Admin Message RQ/RP
 -

 Key: KAFKA-1774
 URL: https://issues.apache.org/jira/browse/KAFKA-1774
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 We should have a REPL we can work in and execute the commands with their 
 arguments. With this we can do:
 ./kafka.sh --shell 
 kafka> attach cluster -b localhost:9092;
 kafka> describe topic sampleTopicNameForExample;
 The command line version can work like it does now, so folks don't have to 
 re-write all of their tooling:
 kafka.sh --topics, with everything the same as kafka-topics.sh 
 kafka.sh --reassign, with everything the same as kafka-reassign-partitions.sh 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (KAFKA-1775) Re-factor TopicCommand into the new handleAdminMessage call

2014-12-08 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi reopened KAFKA-1775:
-

 Re-factor TopicCommand into the new handleAdminMessage call 
 -

 Key: KAFKA-1775
 URL: https://issues.apache.org/jira/browse/KAFKA-1775
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 kafka-topic.sh should become
 kafka --topic, with everything else the same from the CLI perspective, so we 
 need the calls from the byte layer to be fed into that same code (with as 
 few changes as possible), called from the handleAdmin call after deducing 
 which Utility[1] it is operating for.
 I think we should not remove the existing kafka-topic.sh, and should 
 preserve the existing functionality (with as little code duplication as 
 possible) until 0.9 (where we can remove it after folks have used the new 
 tool for a release or two and given feedback)[2]
 [1] https://issues.apache.org/jira/browse/KAFKA-1772
 [2] https://issues.apache.org/jira/browse/KAFKA-1776



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1775) Re-factor TopicCommand into the new handleAdminMessage call

2014-12-08 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237960#comment-14237960
 ] 

Andrii Biletskyi commented on KAFKA-1775:
-

Patch is available under parent ticket - KAFKA-1694

 Re-factor TopicCommand into the new handleAdminMessage call 
 -

 Key: KAFKA-1775
 URL: https://issues.apache.org/jira/browse/KAFKA-1775
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 kafka-topic.sh should become
 kafka --topic, with everything else the same from the CLI perspective, so we 
 need the calls from the byte layer to be fed into that same code (with as 
 few changes as possible), called from the handleAdmin call after deducing 
 which Utility[1] it is operating for.
 I think we should not remove the existing kafka-topic.sh, and should 
 preserve the existing functionality (with as little code duplication as 
 possible) until 0.9 (where we can remove it after folks have used the new 
 tool for a release or two and given feedback)[2]
 [1] https://issues.apache.org/jira/browse/KAFKA-1772
 [2] https://issues.apache.org/jira/browse/KAFKA-1776



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1806) broker can still expose uncommitted data to a consumer

2014-12-08 Thread lokesh Birla (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238136#comment-14238136
 ] 

lokesh Birla commented on KAFKA-1806:
-

is there any update on this?

 broker can still expose uncommitted data to a consumer
 --

 Key: KAFKA-1806
 URL: https://issues.apache.org/jira/browse/KAFKA-1806
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.8.1.1
Reporter: lokesh Birla
Assignee: Neha Narkhede

 Although the following issue, https://issues.apache.org/jira/browse/KAFKA-727, 
 is marked fixed, I still see this issue in 0.8.1.1. I am able to 
 reproduce the issue consistently. 
 [2014-08-18 06:43:58,356] ERROR [KafkaApi-1] Error when processing fetch 
 request for partition [mmetopic4,2] offset 1940029 from consumer with 
 correlation id 21 (kafka.server.Kaf
 kaApis)
 java.lang.IllegalArgumentException: Attempt to read with a maximum offset 
 (1818353) less than the start offset (1940029).
 at kafka.log.LogSegment.read(LogSegment.scala:136)
 at kafka.log.Log.read(Log.scala:386)
 at 
 kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.scala:530)
 at 
 kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:476)
 at 
 kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:471)
 at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
 at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
 at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
 at 
 scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
 at scala.collection.immutable.Map$Map1.map(Map.scala:107)
 at 
 kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSets(KafkaApis.scala:471)
 at 
 kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:783)
 at 
 kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:765)
 at 
 kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:216)
 at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2014-12-08 Thread Jeff Holoman (JIRA)
Jeff Holoman created KAFKA-1810:
---

 Summary: Add IP Filtering / Whitelists-Blacklists 
 Key: KAFKA-1810
 URL: https://issues.apache.org/jira/browse/KAFKA-1810
 Project: Kafka
  Issue Type: New Feature
  Components: core, network
Reporter: Jeff Holoman
Assignee: Jun Rao
Priority: Minor


While longer-term security goals for Kafka are on the roadmap, there is some 
value in the ability to restrict connections to Kafka brokers based on IP 
address. This is not intended as a replacement for security but more of a 
precaution against misconfiguration, and a way to give Kafka administrators 
some control over who is reading from and writing to their cluster.

1) In some organizations, software administration vs. o/s systems administration 
and network administration is disjointed and not well choreographed. Letting 
software administrators configure their platform relatively independently 
(after initial configuration) of systems administrators is desirable.
2) Configuration and deployment is sometimes error prone, and there are 
situations when test environments could erroneously read/write to production 
environments.
3) An additional precaution against reading sensitive data is typically 
welcomed in most large enterprise deployments.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2014-12-08 Thread Jeff Holoman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Holoman reassigned KAFKA-1810:
---

Assignee: Jeff Holoman  (was: Jun Rao)

 Add IP Filtering / Whitelists-Blacklists 
 -

 Key: KAFKA-1810
 URL: https://issues.apache.org/jira/browse/KAFKA-1810
 Project: Kafka
  Issue Type: New Feature
  Components: core, network
Reporter: Jeff Holoman
Assignee: Jeff Holoman
Priority: Minor
 Fix For: 0.8.3


 While longer-term security goals for Kafka are on the roadmap, there is some 
 value in the ability to restrict connections to Kafka brokers based on IP 
 address. This is not intended as a replacement for security but more of a 
 precaution against misconfiguration, and a way to give Kafka administrators 
 some control over who is reading from and writing to their cluster.
 1) In some organizations, software administration vs. o/s systems 
 administration and network administration is disjointed and not well 
 choreographed. Letting software administrators configure their platform 
 relatively independently (after initial configuration) of systems 
 administrators is desirable.
 2) Configuration and deployment is sometimes error prone, and there are 
 situations when test environments could erroneously read/write to production 
 environments.
 3) An additional precaution against reading sensitive data is typically 
 welcomed in most large enterprise deployments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2014-12-08 Thread Jeff Holoman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Holoman updated KAFKA-1810:

Fix Version/s: 0.8.3

 Add IP Filtering / Whitelists-Blacklists 
 -

 Key: KAFKA-1810
 URL: https://issues.apache.org/jira/browse/KAFKA-1810
 Project: Kafka
  Issue Type: New Feature
  Components: core, network
Reporter: Jeff Holoman
Assignee: Jun Rao
Priority: Minor
 Fix For: 0.8.3


 While longer-term security goals for Kafka are on the roadmap, there is some 
 value in the ability to restrict connections to Kafka brokers based on IP 
 address. This is not intended as a replacement for security but more of a 
 precaution against misconfiguration, and a way to give Kafka administrators 
 some control over who is reading from and writing to their cluster.
 1) In some organizations, software administration vs. o/s systems 
 administration and network administration is disjointed and not well 
 choreographed. Letting software administrators configure their platform 
 relatively independently (after initial configuration) of systems 
 administrators is desirable.
 2) Configuration and deployment is sometimes error prone, and there are 
 situations when test environments could erroneously read/write to production 
 environments.
 3) An additional precaution against reading sensitive data is typically 
 welcomed in most large enterprise deployments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2014-12-08 Thread Jeff Holoman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238179#comment-14238179
 ] 

Jeff Holoman commented on KAFKA-1810:
-

There was a discussion in KAFKA-1512 regarding the capability of limiting 
connections by setting max connections to 0. This gives a cleaner way to 
provide similar functionality.
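
For reference, the KAFKA-1512 mechanism mentioned above looks roughly like 
this in server.properties (the property values here are illustrative):

# Default cap on the number of connections from any single IP.
max.connections.per.ip=100
# Effectively block a specific host by allowing it zero connections.
max.connections.per.ip.overrides=10.0.0.5:0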

 Add IP Filtering / Whitelists-Blacklists 
 -

 Key: KAFKA-1810
 URL: https://issues.apache.org/jira/browse/KAFKA-1810
 Project: Kafka
  Issue Type: New Feature
  Components: core, network
Reporter: Jeff Holoman
Assignee: Jeff Holoman
Priority: Minor
 Fix For: 0.8.3


 While longer-term security goals for Kafka are on the roadmap, there is some 
 value in the ability to restrict connections to Kafka brokers based on IP 
 address. This is not intended as a replacement for security but more of a 
 precaution against misconfiguration, and a way to give Kafka administrators 
 some control over who is reading from and writing to their cluster.
 1) In some organizations, software administration vs. o/s systems 
 administration and network administration is disjointed and not well 
 choreographed. Letting software administrators configure their platform 
 relatively independently (after initial configuration) of systems 
 administrators is desirable.
 2) Configuration and deployment is sometimes error prone, and there are 
 situations when test environments could erroneously read/write to production 
 environments.
 3) An additional precaution against reading sensitive data is typically 
 welcomed in most large enterprise deployments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] adding the serializer api back to the new java producer

2014-12-08 Thread Jun Rao
Ok, based on all the feedback we have heard, I plan to do the
following.

1. Keep the generic api in KAFKA-1797.
2. Add a new constructor in Producer/Consumer that takes the key and the
value serializer instance.
3. Have KAFKA-1797 reviewed and checked into 0.8.2 and trunk.

This will make it easy for people to reuse common serializers while at the
same time allowing people to use the byte array api if they choose to do so.
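
As a sketch of point 2, the constructor could be used along these lines (the
Serializer shape follows the generic api in KAFKA-1797; the Producer class
here is a simplified stand-in, not the final api):

import java.util.Properties;

interface Serializer<T> {
    byte[] serialize(String topic, T data);
}

class StringSerializer implements Serializer<String> {
    public byte[] serialize(String topic, String data) {
        return data == null ? null : data.getBytes();
    }
}

class Producer<K, V> {
    // Takes serializer instances directly instead of class names from config.
    Producer(Properties props, Serializer<K> keySerializer, Serializer<V> valueSerializer) {
        // wire the serializers into the send path ...
    }
}

class Example {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Reuse common serializers without going through config strings:
        Producer<String, String> producer =
                new Producer<>(props, new StringSerializer(), new StringSerializer());
    }
}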

I plan to make those changes in the next couple of days unless someone
strongly objects.

Thanks,

Jun


On Fri, Dec 5, 2014 at 5:46 PM, Jiangjie Qin j...@linkedin.com.invalid
wrote:

 Hi Jun,

 Thanks for pointing this out. Yes, putting serialization/deserialization
 code into the record does lose some flexibility. On further thought, I think
 no matter what we do to bind the producer and serializer/deserializer, we
 can always do the same thing on Record, i.e. we can also have some
 constructor like ProducerRecord(Serializer<K, V>, Deserializer<K, V>). The
 downside of this is that we could potentially have a
 serializer/deserializer instance for each record (that's actually the very
 reason that I propose to put the code in the record). This problem could be
 addressed by using either a singleton class or a factory for the
 serializer/deserializer library. But it might be a little bit complicated,
 and we are not able to enforce that on external libraries either. So it
 only seems to make sense if we really want to:
 1. Have a single simple producer interface.
 AND
 2. Use a single producer to send all types of messages.

 I'm not sure if these requirements are strong enough to make us take on the
 complexity of a singleton/factory serializer/deserializer library.

 Thanks.

 Jiangjie (Becket) Qin

 On 12/5/14, 3:16 PM, Jun Rao j...@confluent.io wrote:

 Jiangjie,
 
 The issue with adding the serializer in ProducerRecord is that you need to
 implement all combinations of serializers for key and value. So, instead of
 just implementing int and string serializers, you will have to implement
 all 4 combinations.
 
 Adding a new producer constructor like Producer<K, V>(KeySerializer<K>,
 ValueSerializer<V>, Properties properties) can be useful.
 
 Thanks,
 
 Jun
 
 On Thu, Dec 4, 2014 at 10:33 AM, Jiangjie Qin j...@linkedin.com.invalid
 wrote:
 
 
  I'm just thinking that instead of binding serialization with the producer,
  another option is to bind the serializer/deserializer with
  ProducerRecord/ConsumerRecord (please see the detailed proposal below).
  The arguments for this option are:
  A. A single producer could send different message types. There are
  several use cases at LinkedIn for a per-record serializer:
  - In Samza, there are some in-stream order-sensitive control messages
  having a different deserializer from other messages.
  - There are use cases which need support for sending both Avro messages
  and raw bytes.
  - Some use cases need to deserialize some Avro messages into generic
  records and some other messages into specific records.
  B. In the current proposal, the serializer/deserializer is instantiated
  according to config. Compared with that, binding the serializer with
  ProducerRecord and ConsumerRecord is less error prone.
 
 
  This option includes the following changes:
  A. Add serializer and deserializer interfaces to replace the
  serializer instantiation from config:

      public interface Serializer<K, V> {
          public byte[] serializeKey(K key);
          public byte[] serializeValue(V value);
      }

      public interface Deserializer<K, V> {
          public K deserializeKey(byte[] key);
          public V deserializeValue(byte[] value);
      }

  B. Make ProducerRecord and ConsumerRecord abstract classes
  implementing Serializer<K, V> and Deserializer<K, V> respectively:

      public abstract class ProducerRecord<K, V> implements Serializer<K, V> {...}

      public abstract class ConsumerRecord<K, V> implements Deserializer<K, V> {...}

  C. Instead of instantiating the serializer/deserializer from config,
  let concrete ProducerRecord/ConsumerRecord classes extend the abstract
  classes and override the serialize/deserialize methods:

      public class AvroProducerRecord extends ProducerRecord<String, GenericRecord> {
          ...
          @Override
          public byte[] serializeKey(String key) {...}
          @Override
          public byte[] serializeValue(GenericRecord value) {...}
      }

      public class AvroConsumerRecord extends ConsumerRecord<String, GenericRecord> {
          ...
          @Override
          public String deserializeKey(byte[] key) {...}
          @Override
   

controller conflict

2014-12-08 Thread Kane Kim
Hello,

We are using kafka 0.8.1.1 and hit this bug - KAFKA-1029, /controller
ephemeral node conflict. Is it supposed to be fixed, or does it have something
to do with KAFKA-1451?

Thanks.


[jira] [Commented] (KAFKA-1044) change log4j to slf4j

2014-12-08 Thread Jon Barksdale (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238611#comment-14238611
 ] 

Jon Barksdale commented on KAFKA-1044:
--

As a workaround, you can exclude log4j from kafka and just use 
org.slf4j:log4j-over-slf4j instead.  That routes most log4j calls to 
slf4j, so logback or another slf4j binding can then be used.  
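
As a minimal check of the bridge (assuming log4j is excluded from the kafka
dependency and log4j-over-slf4j plus a binding such as logback are on the
classpath):

    // org.apache.log4j.Logger now comes from log4j-over-slf4j, so the
    // existing log4j call sites are routed to the slf4j binding.
    import org.apache.log4j.Logger;

    public class Log4jBridgeCheck {
        public static void main(String[] args) {
            Logger log = Logger.getLogger("kafka.bridge.check");
            log.info("handled by the slf4j binding, e.g. logback");
        }
    }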

 change log4j to slf4j 
 --

 Key: KAFKA-1044
 URL: https://issues.apache.org/jira/browse/KAFKA-1044
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 0.8.0
Reporter: sjk
Assignee: Jay Kreps
 Fix For: 0.9.0


 Can you change log4j to slf4j? In my project I use logback, and it conflicts 
 with log4j.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1797) add the serializer/deserializer api to the new java client

2014-12-08 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1797:
-
Reviewer: Neha Narkhede

 add the serializer/deserializer api to the new java client
 --

 Key: KAFKA-1797
 URL: https://issues.apache.org/jira/browse/KAFKA-1797
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 0.8.2
Reporter: Jun Rao
Assignee: Jun Rao
 Attachments: kafka-1797.patch


 Currently, the new java clients take a byte array for both the key and the 
 value. While this api is simple, it pushes the serialization/deserialization 
 logic into the application. This makes it hard to reason about what type of 
 data flows through Kafka and also makes it hard to share an implementation of 
 the serializer/deserializer. For example, to support Avro, the serialization 
 logic could be quite involved since it might need to register the Avro schema 
 in some remote registry and maintain a schema cache locally, etc. Without a 
 serialization api, it's impossible to share such an implementation so that 
 people can easily reuse it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1784) Implement a ConsumerOffsetClient library

2014-12-08 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238723#comment-14238723
 ] 

Neha Narkhede commented on KAFKA-1784:
--

[~mgharat] Thanks for the patch. Can you add a little more context on the 
intent and usage of this library? Is it meant for admin usage only, or do you 
intend to refactor the existing KafkaApis to use this new library?

 Implement a ConsumerOffsetClient library
 

 Key: KAFKA-1784
 URL: https://issues.apache.org/jira/browse/KAFKA-1784
 Project: Kafka
  Issue Type: New Feature
Reporter: Joel Koshy
Assignee: Mayuresh Gharat
Priority: Blocker
 Fix For: 0.8.2

 Attachments: KAFKA-1784.patch


 I think it would be useful to provide an offset client library. It would make 
 the documentation a lot simpler. Right now it is non-trivial to commit/fetch 
 offsets to/from Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1784) Implement a ConsumerOffsetClient library

2014-12-08 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1784:
-
Reviewer: Neha Narkhede

Assigning to myself for review since [~jjkoshy] is on vacation.

 Implement a ConsumerOffsetClient library
 

 Key: KAFKA-1784
 URL: https://issues.apache.org/jira/browse/KAFKA-1784
 Project: Kafka
  Issue Type: New Feature
Reporter: Joel Koshy
Assignee: Mayuresh Gharat
Priority: Blocker
 Fix For: 0.8.2

 Attachments: KAFKA-1784.patch


 I think it would be useful to provide an offset client library. It would make 
 the documentation a lot simpler. Right now it is non-trivial to commit/fetch 
 offsets to/from Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1784) Implement a ConsumerOffsetClient library

2014-12-08 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238736#comment-14238736
 ] 

Mayuresh Gharat commented on KAFKA-1784:


Yes, we are planning to do that. The patch for KAFKA-1013 actually has those 
changes. For now, this patch can be used for admin purposes, and then we will 
get back to KAFKA-1013 to make it work with KafkaApis and the Consumer.

 Implement a ConsumerOffsetClient library
 

 Key: KAFKA-1784
 URL: https://issues.apache.org/jira/browse/KAFKA-1784
 Project: Kafka
  Issue Type: New Feature
Reporter: Joel Koshy
Assignee: Mayuresh Gharat
Priority: Blocker
 Fix For: 0.8.2

 Attachments: KAFKA-1784.patch


 I think it would be useful to provide an offset client library. It would make 
 the documentation a lot simpler. Right now it is non-trivial to commit/fetch 
 offsets to/from Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] adding the serializer api back to the new java producer

2014-12-08 Thread Sriram Subramanian
Thank you, Jay. I agree with the issue you point out w.r.t. paired
serializers. I also think mixed serialization types are rare. To get
the current behavior, one can simply use a ByteArraySerializer. This is
best understood by talking with many customers, and you seem to have done
that. I am convinced about the change.

For the rest who gave -1 or 0 for this proposal, do the answers for the
three points (updated) below seem reasonable? Are these explanations
convincing? 


1. Can we keep the serialization semantics outside the Producer interface
and have simple bytes in / bytes out for the interface (this is what we
have today)?

The points for this are to keep the interface simple and usage easy to
understand. The points against are that it gets hard to share common
usage patterns around serialization/message validations in the future.

2. Can we create a wrapper producer that does the serialization and have
different variants of it for different data formats?

The points for this are again to keep the main API clean. The points
against are that it duplicates the API, increases the surface area, and
creates redundancy for a minor addition.

3. Do we need to support different data types per record? The current
interface (bytes in/bytes out) lets you instantiate one producer and use
it to send multiple data formats. There seem to be some valid use cases
for this.


Mixed serialization types are rare based on interactions with customers.
To get the current behavior, one can simply use a ByteArraySerializer.
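
A sketch of that fallback for the mixed-format case (assuming the generic
ProducerRecord from KAFKA-1797; avroEncode is a user-supplied encoder, not a
Kafka API):

    // One bytes in / bytes out producer can still carry mixed formats;
    // the caller serializes each record before send().
    Producer<byte[], byte[]> producer =
        new KafkaProducer<byte[], byte[]>(props,
            new ByteArraySerializer(), new ByteArraySerializer());

    byte[] avroPayload = avroEncode(event);     // user-supplied encoder
    byte[] rawPayload = "plain bytes".getBytes();

    producer.send(new ProducerRecord<byte[], byte[]>("mixed-topic", avroPayload));
    producer.send(new ProducerRecord<byte[], byte[]>("mixed-topic", rawPayload));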

On 12/5/14 5:00 PM, Jay Kreps j...@confluent.io wrote:

Hey Sriram,

Thanks! I think this is a very helpful summary.

Let me try to address your point about passing in the serde at send time.

I think the first objection is really to the paired key/value serializer
interfaces. This leads to kind of a weird combinatorial thing where you
would have an avro/avro serializer, a string/avro serializer, a pb/pb
serializer, and a string/pb serializer, and so on. But your proposal would
work as well with separate serializers for key and value.

I think the downside is just the one you call out--that this is a corner
case and you end up with two versions of all the apis to support it. This
also makes the serializer api more annoying to implement. I think the
alternative solution for this case, and any others we can give people, is
just configuring ByteArraySerializer, which gives you basically the api that
you have now with byte arrays. If this were incredibly common then it would
be a silly solution, but I guess the belief is that these cases are rare and
a really well implemented avro or json serializer should be 100% of what
most people need.

In practice, the cases that actually mix serialization types in a single
stream are pretty rare, I think, just because the consumer then has the
problem of guessing how to deserialize, so most of these will end up with
at least some marker or schema id or whatever that tells you how to read
the data. Arguably, this mixed serialization with a marker is itself a
serializer type and should have a serializer of its own...
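
To make the combinatorial point concrete, here is a self-contained sketch (the
interface names are illustrative, not the proposed API):

    // Paired: one class per (key format, value format) combination, so
    // string/avro, avro/avro, string/pb, pb/pb, ... each need a class.
    interface PairedSerializer<K, V> {
        byte[] serializeKey(K key);
        byte[] serializeValue(V value);
    }

    // Separate: one class per format, combined freely at the producer.
    interface Serializer<T> {
        byte[] serialize(T data);
    }

    class StringSerializer implements Serializer<String> {
        public byte[] serialize(String data) { return data.getBytes(); }
    }

    class IntSerializer implements Serializer<Integer> {
        public byte[] serialize(Integer data) {
            return java.nio.ByteBuffer.allocate(4).putInt(data).array();
        }
    }

    // Two separate serializers cover all four int/string key-value
    // combinations; the paired interface would need four classes.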

-Jay

On Fri, Dec 5, 2014 at 3:48 PM, Sriram Subramanian 
srsubraman...@linkedin.com.invalid wrote:

 This thread has diverged multiple times now, and it would be worth
 summarizing the discussion.

 There seems to be the following points of discussion -

 1. Can we keep the serialization semantics outside the Producer interface
 and have simple bytes in / bytes out for the interface (this is what we
 have today)?
 
 The points for this are to keep the interface simple and usage easy to
 understand. The points against are that it gets hard to share common
 usage patterns around serialization/message validations in the future.

 2. Can we create a wrapper producer that does the serialization and have
 different variants of it for different data formats?
 
 The points for this are again to keep the main API clean. The points
 against are that it duplicates the API, increases the surface area, and
 creates redundancy for a minor addition.
 
 3. Do we need to support different data types per record? The current
 interface (bytes in/bytes out) lets you instantiate one producer and use
 it to send multiple data formats. There seem to be some valid use cases
 for this.

 I have still not seen a strong argument for dropping this
 functionality. Can someone provide their views on why we don't need this
 support that is possible with the current API?

 One possible approach for the per-record serialization would be to define
 
 public interface SerDe<K,V> {
   public byte[] serializeKey();
 
   public K deserializeKey();
 
   public byte[] serializeValue();
 
   public V deserializeValue();
 }

 This would be used by both the Producer and the Consumer.

 The send APIs can then be
 
 public Future<RecordMetadata> send(ProducerRecord<K,V> record);
 public Future<RecordMetadata> send(ProducerRecord<K,V> record, Callback
 callback);
 
 
 public Future<RecordMetadata> send(ProducerRecord<K,V> 

[jira] [Created] (KAFKA-1811) ensuring registered broker host:port is unique

2014-12-08 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-1811:
--

 Summary: ensuring registered broker host:port is unique
 Key: KAFKA-1811
 URL: https://issues.apache.org/jira/browse/KAFKA-1811
 Project: Kafka
  Issue Type: Improvement
Reporter: Jun Rao


Currently, we expect each registered broker to have a unique host:port 
pair. However, we don't enforce that, which causes various weird problems. It 
would be useful to ensure this during broker registration.
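
A minimal sketch of the intended check (purely illustrative, not actual
broker-registration code):

    import java.util.List;

    class BrokerRegistrationCheck {
        // Fail registration if the candidate host:port is already taken.
        static void ensureUnique(List<String> registeredHostPorts, String candidate) {
            if (registeredHostPorts.contains(candidate))
                throw new IllegalStateException(
                    "host:port already registered: " + candidate);
        }
    }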



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1812) Allow IpV6 in configuration with parseCsvMap

2014-12-08 Thread Jeff Holoman (JIRA)
Jeff Holoman created KAFKA-1812:
---

 Summary:  Allow IpV6 in configuration with parseCsvMap
 Key: KAFKA-1812
 URL: https://issues.apache.org/jira/browse/KAFKA-1812
 Project: Kafka
  Issue Type: Bug
Reporter: Jeff Holoman
Assignee: Jeff Holoman
Priority: Minor
 Fix For: 0.8.3


The current implementation of parseCsvMap in Utils expects k:v,k:v. This 
change modifies that function to accept a string with multiple ':' characters, 
splitting on the last occurrence per pair. 

This limitation is noted in the Reviewboard comments for KAFKA-1512.
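
A sketch of the last-colon split in Java (a hypothetical stand-in for the
patched parseCsvMap, which itself lives in Scala):

    import java.util.HashMap;
    import java.util.Map;

    class CsvMapParser {
        // Parses "k:v,k:v", splitting each pair on the LAST ':' so that
        // IPv6 keys such as "::1:9092" keep their colons intact.
        static Map<String, String> parse(String csv) {
            Map<String, String> result = new HashMap<String, String>();
            if (csv == null || csv.trim().isEmpty())
                return result;
            for (String pair : csv.split(",")) {
                int idx = pair.lastIndexOf(':');
                result.put(pair.substring(0, idx).trim(),
                           pair.substring(idx + 1).trim());
            }
            return result;
        }
    }

    // parse("127.0.0.1:9092,::1:9093") yields {127.0.0.1=9092, ::1=9093}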



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1812) Allow IpV6 in configuration with parseCsvMap

2014-12-08 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1812:
-
Labels: newbie  (was: )

  Allow IpV6 in configuration with parseCsvMap
 -

 Key: KAFKA-1812
 URL: https://issues.apache.org/jira/browse/KAFKA-1812
 Project: Kafka
  Issue Type: Bug
Reporter: Jeff Holoman
Assignee: Jeff Holoman
Priority: Minor
  Labels: newbie
 Fix For: 0.8.3


 The current implementation of parseCsvMap in Utils expects k:v,k:v. This 
 change modifies that function to accept a string with multiple ':' characters, 
 splitting on the last occurrence per pair. 
 This limitation is noted in the Reviewboard comments for KAFKA-1512.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1811) ensuring registered broker host:port is unique

2014-12-08 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1811:
-
Labels: newbie  (was: )

 ensuring registered broker host:port is unique
 --

 Key: KAFKA-1811
 URL: https://issues.apache.org/jira/browse/KAFKA-1811
 Project: Kafka
  Issue Type: Improvement
Reporter: Jun Rao
  Labels: newbie

 Currently, we expect each registered broker to have a unique host:port 
 pair. However, we don't enforce that, which causes various weird problems. It 
 would be useful to ensure this during broker registration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2014-12-08 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238985#comment-14238985
 ] 

Neha Narkhede commented on KAFKA-1810:
--

[~jholoman] Not sure I understood what you are proposing. Can you be more 
specific about the changes you have in mind?

 Add IP Filtering / Whitelists-Blacklists 
 -

 Key: KAFKA-1810
 URL: https://issues.apache.org/jira/browse/KAFKA-1810
 Project: Kafka
  Issue Type: New Feature
  Components: core, network
Reporter: Jeff Holoman
Assignee: Jeff Holoman
Priority: Minor
 Fix For: 0.8.3


 While longer-term security goals for Kafka are on the roadmap, there is 
 some value in the ability to restrict connections to Kafka brokers based on 
 IP address. This is not intended as a replacement for security but more as a 
 precaution against misconfiguration, and it provides some level of control to 
 Kafka administrators over who is reading/writing to their cluster.
 1) In some organizations, software administration vs. O/S systems 
 administration and network administration is disjointed and not well 
 choreographed. Providing software administrators the ability to configure 
 their platform relatively independently (after initial configuration) from 
 systems administrators is desirable.
 2) Configuration and deployment are sometimes error-prone, and there are 
 situations where test environments could erroneously read/write to production 
 environments.
 3) An additional precaution against reading sensitive data is typically 
 welcomed in most large enterprise deployments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1806) broker can still expose uncommitted data to a consumer

2014-12-08 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238987#comment-14238987
 ] 

Neha Narkhede commented on KAFKA-1806:
--

[~lokeshbirla] Can you please provide the steps to reproduce this issue?

 broker can still expose uncommitted data to a consumer
 --

 Key: KAFKA-1806
 URL: https://issues.apache.org/jira/browse/KAFKA-1806
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.8.1.1
Reporter: lokesh Birla
Assignee: Neha Narkhede

 Although the following issue, https://issues.apache.org/jira/browse/KAFKA-727, 
 is marked fixed, I still see this issue in 0.8.1.1. I am able to 
 reproduce the issue consistently. 
 [2014-08-18 06:43:58,356] ERROR [KafkaApi-1] Error when processing fetch 
 request for partition [mmetopic4,2] offset 1940029 from consumer with 
 correlation id 21 (kafka.server.KafkaApis)
 java.lang.IllegalArgumentException: Attempt to read with a maximum offset 
 (1818353) less than the start offset (1940029).
 at kafka.log.LogSegment.read(LogSegment.scala:136)
 at kafka.log.Log.read(Log.scala:386)
 at 
 kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.scala:530)
 at 
 kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:476)
 at 
 kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:471)
 at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
 at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
 at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
 at 
 scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
 at scala.collection.immutable.Map$Map1.map(Map.scala:107)
 at 
 kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSets(KafkaApis.scala:471)
 at 
 kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:783)
 at 
 kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:765)
 at 
 kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:216)
 at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: controller conflict

2014-12-08 Thread Neha Narkhede
It's hard to say without more details. But you are free to try 0.8.2-beta, as
both of those JIRAs should be fixed in that version.

On Mon, Dec 8, 2014 at 2:29 PM, Kane Kim kane.ist...@gmail.com wrote:

 Hello,

 We are using kafka 0.8.1.1 and hit this bug - KAFKA-1029, /controller
 ephemeral node conflict. Is it supposed to be fixed, or does it have something
 to do with KAFKA-1451?

 Thanks.




-- 
Thanks,
Neha


[jira] [Commented] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2014-12-08 Thread Jeff Holoman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239020#comment-14239020
 ] 

Jeff Holoman commented on KAFKA-1810:
-

[~nehanarkhede]  Sure, no problem. I had a request to provide the ability to 
specify a range of IP addresses to either include or exclude. I was thinking 
the easiest way would be to specify IP addresses in CIDR notation and include 
them in server.properties, such as 192.168.2.0/24:allow, 
192.168.1.0/16:deny. This would allow an administrator to accept/deny 
connections based on IP ranges. Does that clarify?
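
A rough sketch of the matching side of that idea (illustrative only; rule
parsing, ordering, and the default policy would need real design):

    import java.net.InetAddress;
    import java.net.UnknownHostException;

    class CidrRule {
        private final byte[] network;
        private final int prefixLen;
        final boolean allow;

        // e.g. new CidrRule("192.168.2.0", 24, true) for "192.168.2.0/24:allow"
        CidrRule(String network, int prefixLen, boolean allow) throws UnknownHostException {
            this.network = InetAddress.getByName(network).getAddress();
            this.prefixLen = prefixLen;
            this.allow = allow;
        }

        // True if addr falls inside this rule's network/prefix.
        boolean matches(InetAddress addr) {
            byte[] a = addr.getAddress();
            if (a.length != network.length)
                return false;                      // v4 rule vs v6 address
            int fullBytes = prefixLen / 8, remBits = prefixLen % 8;
            for (int i = 0; i < fullBytes; i++)
                if (a[i] != network[i])
                    return false;
            if (remBits == 0)
                return true;
            int mask = (0xFF << (8 - remBits)) & 0xFF;
            return (a[fullBytes] & mask) == (network[fullBytes] & mask);
        }
    }

The broker could then evaluate such rules in order at connection-accept time,
falling back to a configurable default.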


 Add IP Filtering / Whitelists-Blacklists 
 -

 Key: KAFKA-1810
 URL: https://issues.apache.org/jira/browse/KAFKA-1810
 Project: Kafka
  Issue Type: New Feature
  Components: core, network
Reporter: Jeff Holoman
Assignee: Jeff Holoman
Priority: Minor
 Fix For: 0.8.3


 While longer-term security goals for Kafka are on the roadmap, there is 
 some value in the ability to restrict connections to Kafka brokers based on 
 IP address. This is not intended as a replacement for security but more as a 
 precaution against misconfiguration, and it provides some level of control to 
 Kafka administrators over who is reading/writing to their cluster.
 1) In some organizations, software administration vs. O/S systems 
 administration and network administration is disjointed and not well 
 choreographed. Providing software administrators the ability to configure 
 their platform relatively independently (after initial configuration) from 
 systems administrators is desirable.
 2) Configuration and deployment are sometimes error-prone, and there are 
 situations where test environments could erroneously read/write to production 
 environments.
 3) An additional precaution against reading sensitive data is typically 
 welcomed in most large enterprise deployments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost

2014-12-08 Thread Bhavesh Mistry (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239063#comment-14239063
 ] 

Bhavesh Mistry commented on KAFKA-1642:
---

[~stevenz3wu],

0.8.2 is very well tested and has worked well under heavy load.  This bug is 
rare and only happens when the broker or network has issues.  We have been 
producing about 7 to 10 TB per day using this new producer, so 0.8.2 is very 
safe to use in production.  It has survived peak traffic of the year on a large 
e-commerce site.  So I am fairly confident that the new Java API indeed does 
true round-robin and is much faster than the Scala-based API.

[~ewencp],  I will verify the patch by the end of this Friday, but do let me 
know your understanding based on my last comment.

Thanks,

Bhavesh

 [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network 
 connection is lost
 ---

 Key: KAFKA-1642
 URL: https://issues.apache.org/jira/browse/KAFKA-1642
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 0.8.2
Reporter: Bhavesh Mistry
Assignee: Ewen Cheslack-Postava
Priority: Blocker
 Fix For: 0.8.2

 Attachments: 
 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, 
 KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, 
 KAFKA-1642_2014-10-23_16:19:41.patch


 I see my CPU spike to 100% when the network connection is lost for a while.  It 
 seems the network I/O threads are very busy logging the following error 
 message.  Is this expected behavior?
 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR 
 org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka 
 producer I/O thread: 
 java.lang.IllegalStateException: No entry found for node -2
 at 
 org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110)
 at 
 org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99)
 at 
 org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394)
 at 
 org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380)
 at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174)
 at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175)
 at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
 at java.lang.Thread.run(Thread.java:744)
 Thanks,
 Bhavesh



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost

2014-12-08 Thread Bhavesh Mistry (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239063#comment-14239063
 ] 

Bhavesh Mistry edited comment on KAFKA-1642 at 12/9/14 6:53 AM:


[~stevenz3wu],

0.8.2 is very well tested and has worked well under heavy load.  This bug is 
rare and only happens when the broker or network has issues.  We have been 
producing about 7 to 10 TB per day using this new producer, so 0.8.2 is very 
safe to use in production.  It has survived peak traffic of the year on a large 
e-commerce site.  So I am fairly confident that the new Java API indeed does 
true round-robin and is much faster than the Scala-based API.

[~ewencp],  I will verify the patch by the end of this Friday, but do let me 
know your understanding based on my last comment. The goal is to put this issue 
to rest and cover all the use cases.

Thanks,

Bhavesh


was (Author: bmis13):
[~stevenz3wu],

0.8.2 is very well tested and has worked well under heavy load.  This bug is 
rare and only happens when the broker or network has issues.  We have been 
producing about 7 to 10 TB per day using this new producer, so 0.8.2 is very 
safe to use in production.  It has survived peak traffic of the year on a large 
e-commerce site.  So I am fairly confident that the new Java API indeed does 
true round-robin and is much faster than the Scala-based API.

[~ewencp],  I will verify the patch by the end of this Friday, but do let me 
know your understanding based on my last comment.

Thanks,

Bhavesh

 [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network 
 connection is lost
 ---

 Key: KAFKA-1642
 URL: https://issues.apache.org/jira/browse/KAFKA-1642
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 0.8.2
Reporter: Bhavesh Mistry
Assignee: Ewen Cheslack-Postava
Priority: Blocker
 Fix For: 0.8.2

 Attachments: 
 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, 
 KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, 
 KAFKA-1642_2014-10-23_16:19:41.patch


 I see my CPU spike to 100% when the network connection is lost for a while.  It 
 seems the network I/O threads are very busy logging the following error 
 message.  Is this expected behavior?
 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR 
 org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka 
 producer I/O thread: 
 java.lang.IllegalStateException: No entry found for node -2
 at 
 org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110)
 at 
 org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99)
 at 
 org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394)
 at 
 org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380)
 at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174)
 at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175)
 at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
 at java.lang.Thread.run(Thread.java:744)
 Thanks,
 Bhavesh



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)