[jira] [Commented] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387828#comment-14387828 ]

Sriharsha Chintalapani commented on KAFKA-1646:
-----------------------------------------------

[~jkreps] Any guidance on the latest patch?

Improve consumer read performance for Windows
---------------------------------------------

                Key: KAFKA-1646
                URL: https://issues.apache.org/jira/browse/KAFKA-1646
            Project: Kafka
         Issue Type: Improvement
         Components: log
   Affects Versions: 0.8.1.1
        Environment: Windows
           Reporter: xueqiang wang
           Assignee: xueqiang wang
             Labels: newbie, patch
        Attachments: Improve consumer read performance for Windows.patch, KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, KAFKA-1646_20150312_200352.patch

This patch is for the Windows platform only. On Windows, if more than one replica is writing to disk, the segment log files end up fragmented (not contiguous) on disk, and consumer read performance drops significantly. This fix allocates more disk space when rolling a new segment, which improves consumer read performance on the NTFS file system. The patch does not affect file allocation on other filesystems, since it only adds statements guarded by checks like 'if(Os.isWindows)' or methods used only on Windows.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
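The core idea of the patch, preallocating a segment's full size when it is rolled so the filesystem can lay it out contiguously, can be sketched as follows. This is a minimal stand-in in Python, not Kafka's actual Scala code; the file name and segment size are illustrative.

```python
import os
import tempfile

def roll_segment(path, preallocate_bytes=0):
    """Create a new log segment, optionally preallocating disk space.

    Preallocation (the approach the KAFKA-1646 patch takes on Windows)
    reserves the full segment size up front so NTFS can lay the file out
    contiguously, instead of growing it in small fragmented extents on
    every append.
    """
    f = open(path, "w+b")
    if preallocate_bytes > 0:
        f.truncate(preallocate_bytes)  # reserve space; contents read as zeros
        f.seek(0)                      # appends still start at offset 0
    return f

seg_dir = tempfile.mkdtemp()
seg = roll_segment(os.path.join(seg_dir, "00000000000000000000.log"),
                   preallocate_bytes=1024 * 1024)
print(os.path.getsize(seg.name))  # 1048576
```

The real fix also has to truncate the unused trailing zeros back off (note the truncate-off-trailing-zeros attachment), since a preallocated segment is larger than its logical end offset when the broker restarts.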
[jira] [Commented] (KAFKA-1983) TestEndToEndLatency can be unreliable after hard kill
[ https://issues.apache.org/jira/browse/KAFKA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386040#comment-14386040 ]

Sriharsha Chintalapani commented on KAFKA-1983:
-----------------------------------------------

[~gchao] Here are the steps:
1) Start a single-node Kafka server.
2) Create a topic "test".
3) ./bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency localhost:9092 localhost:2181 test 10 1000 1
   (USAGE: java kafka.tools.TestEndToEndLatency$ broker_list zookeeper_connect topic num_messages consumer_fetch_max_wait producer_acks)
4) Hard-kill TestEndToEndLatency.
5) Re-running step 3 causes it to report artificially low latency.

TestEndToEndLatency can be unreliable after hard kill
-----------------------------------------------------

                Key: KAFKA-1983
                URL: https://issues.apache.org/jira/browse/KAFKA-1983
            Project: Kafka
         Issue Type: Improvement
           Reporter: Jun Rao
           Assignee: Grayson Chao
             Labels: newbie

If you hard-kill TestEndToEndLatency, the committed offset remains the last checkpointed one, but more messages have since been appended after that offset. When TestEndToEndLatency restarts, the consumer resumes from the last checkpointed offset and reports very low latency, since it never has to wait for a new message to be produced before reading the next one.
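The mechanism behind step 5 can be shown with a toy model: a consumer that resumes behind a backlog reads already-appended messages instantly, so the measured "end-to-end" time collapses. This is a hypothetical simulation, not the tool's actual code; the 10 ms sleep stands in for a real produce round-trip.

```python
import time

class ToyLog:
    """A toy single-partition log for simulating TestEndToEndLatency."""
    def __init__(self):
        self.messages = []

def measure(log, committed_offset, produce_first):
    """Average per-message latency: time from just before a read until the
    next message is available at the consumer's current offset."""
    latencies = []
    offset = committed_offset
    for _ in range(3):
        start = time.perf_counter()
        if produce_first:
            time.sleep(0.01)               # stand-in for a produce round-trip
            log.messages.append(b"m")
        # Consume the message at `offset`; if it already exists (backlog
        # left behind by a hard kill), this "read" returns immediately.
        assert offset < len(log.messages)
        offset += 1
        latencies.append(time.perf_counter() - start)
    return sum(latencies) / len(latencies)

# Normal run: each read must wait for a fresh produce.
normal = measure(ToyLog(), committed_offset=0, produce_first=True)

# After a hard kill: 10 messages were appended past the checkpointed offset,
# so every read is served from the existing backlog.
stale = ToyLog()
stale.messages = [b"m"] * 10
after_kill = measure(stale, committed_offset=0, produce_first=False)

print(normal > after_kill)  # backlog reads look artificially fast
```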
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384810#comment-14384810 ]

Sriharsha Chintalapani commented on KAFKA-1461:
-----------------------------------------------

Updated reviewboard https://reviews.apache.org/r/31366/diff/ against branch origin/trunk

Replica fetcher thread does not implement any back-off behavior
---------------------------------------------------------------

                Key: KAFKA-1461
                URL: https://issues.apache.org/jira/browse/KAFKA-1461
            Project: Kafka
         Issue Type: Improvement
         Components: replication
   Affects Versions: 0.8.1.1
           Reporter: Sam Meder
           Assignee: Sriharsha Chintalapani
             Labels: newbie++
            Fix For: 0.8.3
        Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch, KAFKA-1461_2015-03-12_13:54:51.patch, KAFKA-1461_2015-03-17_16:03:33.patch, KAFKA-1461_2015-03-27_15:31:11.patch

The current replica fetcher thread retries in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection-refused exception, leaving several replica fetcher threads spinning in a tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps somewhat.
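The fix amounts to sleeping a configurable interval between failed fetch attempts instead of retrying immediately. A minimal sketch of that retry-with-back-off pattern (in Python, not Kafka's actual Scala fetcher; the callable, parameter names, and attempt limit are illustrative):

```python
import time

def fetch_with_backoff(fetch, backoff_ms=1000, max_attempts=5):
    """Retry a failing fetch with a back-off between attempts, instead of
    spinning in a tight loop (the behavior KAFKA-1461 fixes).

    `fetch` is any callable that raises on error, e.g. a connection-refused
    failure when the partition leader is down.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == max_attempts:
                raise                      # give up after the last attempt
            time.sleep(backoff_ms / 1000.0)  # back off before retrying

calls = []
def flaky_fetch():
    """Fails twice with 'connection refused', then succeeds."""
    calls.append(time.monotonic())
    if len(calls) < 3:
        raise ConnectionError("connection refused")
    return b"records"

result = fetch_with_backoff(flaky_fetch, backoff_ms=10)
print(result, len(calls))  # b'records' 3
```

With a tight loop, the two failures above would burn CPU back-to-back; with the back-off, consecutive attempts are spaced at least `backoff_ms` apart.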
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384813#comment-14384813 ]

Sriharsha Chintalapani commented on KAFKA-1461:
-----------------------------------------------

[~guozhang] addressed your last review. Please take a look. Thanks.
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani updated KAFKA-1461:
------------------------------------------
    Attachment: KAFKA-1461_2015-03-27_15:31:11.patch
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384922#comment-14384922 ]

Sriharsha Chintalapani commented on KAFKA-1461:
-----------------------------------------------

Updated reviewboard https://reviews.apache.org/r/31366/diff/ against branch origin/trunk
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani updated KAFKA-1461:
------------------------------------------
    Attachment: KAFKA-1461_2015-03-27_16:56:45.patch
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384932#comment-14384932 ]

Sriharsha Chintalapani commented on KAFKA-1461:
-----------------------------------------------

Updated reviewboard https://reviews.apache.org/r/31366/diff/ against branch origin/trunk
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani updated KAFKA-1461:
------------------------------------------
    Attachment: KAFKA-1461_2015-03-27_17:02:32.patch
[jira] [Commented] (KAFKA-2052) zookeeper.connect does not work when specifying multiple zk nodes with chroot
[ https://issues.apache.org/jira/browse/KAFKA-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381516#comment-14381516 ]

Sriharsha Chintalapani commented on KAFKA-2052:
-----------------------------------------------

[~uohzxela] The syntax is to add the chroot once, at the end of the zookeeper.connect string. Here is an example:

zookeeper.connect=zookeeper-staging-a-1.bezurk.org:2181,zookeeper-staging-b-1.bezurk.org:2181/kafka

Here is the text from https://kafka.apache.org/08/configuration.html:

"Zookeeper also allows you to add a chroot path which will make all kafka data for this cluster appear under a particular path. This is a way to setup multiple Kafka clusters or other applications on the same zookeeper cluster. To do this give a connection string in the form hostname1:port1,hostname2:port2,hostname3:port3/chroot/path which would put all this cluster's data under the path /chroot/path. Note that you must create this path yourself prior to starting the broker and consumers must use the same connection string."

Closing this as invalid. Please re-open if necessary.
zookeeper.connect does not work when specifying multiple zk nodes with chroot
-----------------------------------------------------------------------------

                Key: KAFKA-2052
                URL: https://issues.apache.org/jira/browse/KAFKA-2052
            Project: Kafka
         Issue Type: Bug
         Components: config
           Reporter: Alex Jiao Ziheng
           Priority: Minor

I wish to specify multiple zk nodes with chroot strings in the server.properties config file, but Kafka is unable to separate the zk nodes properly in the presence of chroot strings:

{code:title=server.properties|borderStyle=solid}
zookeeper.connect=zookeeper-staging-a-1.bezurk.org:2181/kafka,zookeeper-staging-b-1.bezurk.org:2181/kafka
{code}

After running {code}service kafka start{code}, here are the error logs:

{code:title=kafka_init_stdout.log|borderStyle=solid}
[2015-03-26 13:58:21,330] INFO [Kafka Server 1], Connecting to zookeeper on zookeeper-staging-b-1.bezurk.org:2181/kafka,zookeeper-staging-a-1.bezurk.org:2181/kafka (kafka.server.KafkaServer)
[2015-03-26 13:58:21,438] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
...
{code}
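The rule from the docs quoted above, the chroot appears exactly once, after the last host:port pair, can be captured in a small parser. This is an illustrative sketch, not Kafka's or ZkClient's actual parsing code:

```python
def parse_zk_connect(connect):
    """Split a zookeeper.connect string into (host list, chroot).

    Per the Kafka docs, the chroot is written once, after the final
    host:port pair: host1:port1,host2:port2/chroot/path. A chroot written
    per host (as in this report) leaves a comma inside the chroot part,
    which means the string is malformed.
    """
    slash = connect.find("/")
    if slash == -1:
        return connect.split(","), "/"          # no chroot: ZK root
    hosts, chroot = connect[:slash], connect[slash:]
    if "," in chroot:
        raise ValueError(
            "chroot must appear once, after the last host:port pair")
    return hosts.split(","), chroot

hosts, chroot = parse_zk_connect(
    "zookeeper-staging-a-1.bezurk.org:2181,"
    "zookeeper-staging-b-1.bezurk.org:2181/kafka")
print(hosts, chroot)  # two hosts, chroot '/kafka'
```

Applied to the reporter's string, `a:2181/kafka,b:2181/kafka`, the first `/` cuts the string too early, which is exactly why the broker ends up treating the remainder as one opaque chroot.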
[jira] [Resolved] (KAFKA-2052) zookeeper.connect does not work when specifying multiple zk nodes with chroot
[ https://issues.apache.org/jira/browse/KAFKA-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani resolved KAFKA-2052.
-------------------------------------------
    Resolution: Invalid
[jira] [Commented] (KAFKA-2050) Avoid calling .size() on java.util.ConcurrentLinkedQueue
[ https://issues.apache.org/jira/browse/KAFKA-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380809#comment-14380809 ]

Sriharsha Chintalapani commented on KAFKA-2050:
-----------------------------------------------

[~timbrooks] The changes look good to me, but can you send the patch using kafka-patch-review.py? See https://cwiki.apache.org/confluence/display/KAFKA/Patch+submission+and+review

Avoid calling .size() on java.util.ConcurrentLinkedQueue
--------------------------------------------------------

                Key: KAFKA-2050
                URL: https://issues.apache.org/jira/browse/KAFKA-2050
            Project: Kafka
         Issue Type: Bug
         Components: network
           Reporter: Tim Brooks
           Assignee: Jun Rao
        Attachments: dont_call_queue_size.patch

Generally, it is preferable to avoid calling .size() on a java.util.ConcurrentLinkedQueue: it is an O(n) operation, since it must iterate through all the nodes, and calling it on every pass through the loop makes the problem worse under high load. The same functionality can be achieved by just polling and checking for null. This is more imperative and less functional, but that seems acceptable for lower-level networking code.
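The poll-and-check-for-null pattern the description recommends looks like this. The sketch is in Python (where the `null` signal becomes an `IndexError` on `popleft`); in Java the loop body would be `while ((item = queue.poll()) != null)`:

```python
from collections import deque

def drain(queue):
    """Drain a queue by polling until it is empty, instead of asking for
    its size up front.

    java.util.ConcurrentLinkedQueue.size() walks every node (O(n)),
    whereas poll() just detaches the head; an empty-queue signal from
    poll() is the natural loop terminator.
    """
    drained = []
    while True:
        try:
            item = queue.popleft()   # analogue of poll(): take the head
        except IndexError:           # analogue of poll() returning null
            break
        drained.append(item)
    return drained

q = deque(range(5))
print(drain(q), len(q))  # [0, 1, 2, 3, 4] 0
```

Note that a size-then-loop approach is also racy on a concurrent queue: the size can change between the check and the reads, which the poll-until-empty form avoids entirely.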
[jira] [Commented] (KAFKA-2046) Delete topic still doesn't work
[ https://issues.apache.org/jira/browse/KAFKA-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378741#comment-14378741 ]

Sriharsha Chintalapani commented on KAFKA-2046:
-----------------------------------------------

[~clarkhaskins] Can you add details on how big the cluster was? Also, do you have the state-change.log files from the brokers where the log data was not deleted?

Delete topic still doesn't work
-------------------------------

                Key: KAFKA-2046
                URL: https://issues.apache.org/jira/browse/KAFKA-2046
            Project: Kafka
         Issue Type: Bug
           Reporter: Clark Haskins

I just attempted to delete a 128-partition topic with all inbound producers stopped. The result was as follows:
- The /admin/delete_topics znode was empty
- The topic under /brokers/topics was removed
- The Kafka topics command showed that the topic was removed

However, the data on disk on each broker was not deleted, and the topic has not yet been re-created by starting up the inbound mirror maker. Let me know what additional information is needed.
[jira] [Assigned] (KAFKA-2046) Delete topic still doesn't work
[ https://issues.apache.org/jira/browse/KAFKA-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani reassigned KAFKA-2046:
---------------------------------------------
    Assignee: Sriharsha Chintalapani
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379162#comment-14379162 ]

Sriharsha Chintalapani commented on KAFKA-1461:
-----------------------------------------------

[~guozhang] Thanks for the review. Can you please take a look at my reply to your comment?
[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378335#comment-14378335 ]

Sriharsha Chintalapani commented on KAFKA-1684:
-----------------------------------------------

[~gwenshap] If you have a patch available for KAFKA-1928, can you please upload it? I can modify my SSL and Kerberos patches to match the new code.

Implement TLS/SSL authentication
--------------------------------

                Key: KAFKA-1684
                URL: https://issues.apache.org/jira/browse/KAFKA-1684
            Project: Kafka
         Issue Type: Sub-task
         Components: security
   Affects Versions: 0.9.0
           Reporter: Jay Kreps
           Assignee: Sriharsha Chintalapani
        Attachments: KAFKA-1684.patch, KAFKA-1684.patch

Add an SSL port to the configuration and advertise it as part of the metadata request. If the SSL port is configured, the socket server will need to add a second Acceptor thread to listen on it. Connections accepted on this port will need to go through the SSL handshake before being registered with a Processor for request processing. SSL requests and responses may need to be wrapped or unwrapped using the SSLEngine that was initialized by the acceptor. This wrapping and unwrapping is very similar to what will need to be done for SASL-based authentication schemes, so we should have a uniform interface that covers both, and we will need to store the instance in the session with the request. The socket server will have to use this object when reading and writing requests. We will need to take care with FetchRequests, as the current FileChannel.transferTo mechanism is incompatible with wrap/unwrap, so we can only use this optimization for unencrypted sockets that don't require userspace translation (wrapping).
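The transferTo incompatibility mentioned at the end can be illustrated by contrasting the two send paths. This is a conceptual Python sketch only: real TLS needs SSLEngine and key material, so a trivial XOR transform stands in for wrap/unwrap here, purely to show where the bytes must pass through userspace.

```python
import io

def send_plaintext(src, dst):
    """Plaintext path: bytes can go straight from the file to the socket,
    which is what FileChannel.transferTo does via zero-copy sendfile."""
    dst.write(src.read())

def send_wrapped(src, dst, wrap):
    """Encrypted path: every chunk must be pulled into userspace and run
    through wrap() (SSLEngine.wrap in the Java implementation) before it
    reaches the socket, so transferTo/sendfile cannot be used."""
    while True:
        chunk = src.read(4)
        if not chunk:
            break
        dst.write(wrap(chunk))

# Stand-in "cipher": byte-wise XOR. NOT real TLS; it only marks the
# userspace translation step that wrap/unwrap would perform.
xor = lambda data: bytes(b ^ 0x5A for b in data)

plain_out, tls_out = io.BytesIO(), io.BytesIO()
send_plaintext(io.BytesIO(b"kafka-records"), plain_out)
send_wrapped(io.BytesIO(b"kafka-records"), tls_out, xor)

print(plain_out.getvalue())     # the original bytes, untouched
print(xor(tls_out.getvalue()))  # "unwrapping" recovers the original bytes
```

This is why the description restricts the zero-copy optimization to unencrypted connections: once a per-chunk transform sits between file and socket, the kernel can no longer move the data on its own.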
[jira] [Commented] (KAFKA-1507) Using GetOffsetShell against non-existent topic creates the topic unintentionally
[ https://issues.apache.org/jira/browse/KAFKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376040#comment-14376040 ]

Sriharsha Chintalapani commented on KAFKA-1507:
-----------------------------------------------

[~jkreps] Since create/update topic requests are part of KIP-4: is your proposal that when the producer throws errors like UnknownTopicOrPartition, users should catch the error and use AdminClient to create the topic? I still see a benefit in allowing users to pass in their required topic config (partitions, replication, etc.) and, if no topic exists, send a CreateTopicRequest. If that is not desirable, then per your suggestion we need to implement AdminClient? In that case users can use AdminUtils, and we should modify AdminUtils to send requests to the broker instead of directly to zookeeper. This would also help KAFKA-1688, as all create/update/delete requests would go through the broker authorizer. Let me know if this is what you are thinking.

Using GetOffsetShell against non-existent topic creates the topic unintentionally
---------------------------------------------------------------------------------

                Key: KAFKA-1507
                URL: https://issues.apache.org/jira/browse/KAFKA-1507
            Project: Kafka
         Issue Type: Bug
   Affects Versions: 0.8.1.1
        Environment: centos
           Reporter: Luke Forehand
           Assignee: Sriharsha Chintalapani
           Priority: Minor
             Labels: newbie
        Attachments: KAFKA-1507.patch, KAFKA-1507.patch, KAFKA-1507_2014-07-22_10:27:45.patch, KAFKA-1507_2014-07-23_17:07:20.patch, KAFKA-1507_2014-08-12_18:09:06.patch, KAFKA-1507_2014-08-22_11:06:38.patch, KAFKA-1507_2014-08-22_11:08:51.patch

A typo when using the GetOffsetShell command can cause a topic to be created which cannot be deleted (because deletion is still in progress):

./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic typo --time 1
./kafka-topics.sh --zookeeper stormqa1/kafka-prod --describe --topic typo
Topic:typo  PartitionCount:8  ReplicationFactor:1  Configs:
  Topic: typo  Partition: 0  Leader: 10  Replicas: 10  Isr: 10
...
[jira] [Assigned] (KAFKA-1684) Implement TLS/SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani reassigned KAFKA-1684:
---------------------------------------------
    Assignee: Sriharsha Chintalapani  (was: Ivan Lyutov)
[jira] [Resolved] (KAFKA-2031) Executing scripts that invoke kafka-run-class.sh results in 'permission denied to create log dir' warning.
[ https://issues.apache.org/jira/browse/KAFKA-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani resolved KAFKA-2031. --- Resolution: Duplicate Executing scripts that invoke kafka-run-class.sh results in 'permission denied to create log dir' warning. -- Key: KAFKA-2031 URL: https://issues.apache.org/jira/browse/KAFKA-2031 Project: Kafka Issue Type: Bug Components: packaging Affects Versions: 0.8.1.1 Reporter: Manikandan Narayanaswamy Priority: Minor Kafka-run-class.sh script expects LOG_DIR variable to be set, and if this variable is not set, it defaults log dir to $base_dir/logs. And in the event the executor of the script does not have the right permissions, it would lead to errors such as: {noformat} mkdir: cannot create directory `/usr/lib/kafka/bin/../logs': Permission denied. {noformat} Proposing one way to make this more configurable is by introducing kafka-env.sh. Sourcing this file would export LOG_DIR and potentially other variables if need be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
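The resolution order the report describes (honor LOG_DIR when set, otherwise default to $base_dir/logs, which unprivileged users often cannot create) can be illustrated with a small hypothetical helper; this is not Kafka code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative model of kafka-run-class.sh's log-dir logic: an explicit
// LOG_DIR wins; otherwise fall back to <base_dir>/logs, the default that
// triggers "Permission denied" for users without write access there.
class LogDirResolver {
    static Path resolve(String logDirEnv, Path baseDir) {
        return (logDirEnv != null && !logDirEnv.isEmpty())
                ? Paths.get(logDirEnv)
                : baseDir.resolve("logs");
    }
}
```

Sourcing a kafka-env.sh that exports LOG_DIR (the KAFKA-1566 proposal) corresponds to always taking the first branch.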
[jira] [Commented] (KAFKA-2031) Executing scripts that invoke kafka-run-class.sh results in 'permission denied to create log dir' warning.
[ https://issues.apache.org/jira/browse/KAFKA-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368239#comment-14368239 ] Sriharsha Chintalapani commented on KAFKA-2031: --- [~mnarayan] we've a JIRA to add kafka-env.sh here https://issues.apache.org/jira/browse/KAFKA-1566 Executing scripts that invoke kafka-run-class.sh results in 'permission denied to create log dir' warning. -- Key: KAFKA-2031 URL: https://issues.apache.org/jira/browse/KAFKA-2031 Project: Kafka Issue Type: Bug Components: packaging Affects Versions: 0.8.1.1 Reporter: Manikandan Narayanaswamy Priority: Minor Kafka-run-class.sh script expects LOG_DIR variable to be set, and if this variable is not set, it defaults log dir to $base_dir/logs. And in the event the executor of the script does not have the right permissions, it would lead to errors such as: {noformat} mkdir: cannot create directory `/usr/lib/kafka/bin/../logs': Permission denied. {noformat} Proposing one way to make this more configurable is by introducing kafka-env.sh. Sourcing this file would export LOG_DIR and potentially other variables if need be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KAFKA-1912) Create a simple request re-routing facility
[ https://issues.apache.org/jira/browse/KAFKA-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani reassigned KAFKA-1912: - Assignee: Sriharsha Chintalapani Create a simple request re-routing facility --- Key: KAFKA-1912 URL: https://issues.apache.org/jira/browse/KAFKA-1912 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps Assignee: Sriharsha Chintalapani Fix For: 0.8.3 We are accumulating a lot of requests that have to be directed to the correct server. This makes sense for high volume produce or fetch requests. But it is silly to put the extra burden on the client for the many miscellaneous requests such as fetching or committing offsets and so on. This adds a ton of practical complexity to the clients with little or no payoff in performance. We should add a generic request-type agnostic re-routing facility on the server. This would allow any server to accept a request and forward it to the correct destination, proxying the response back to the user. Naturally it needs to do this without blocking the thread. The result is that a client implementation can choose to be optimally efficient and manage a local cache of cluster state and attempt to always direct its requests to the proper server OR it can choose simplicity and just send things all to a single host and let that host figure out where to forward it. I actually think we should implement this more or less across the board, but some requests such as produce and fetch require more logic to proxy since they have to be scattered out to multiple servers and gathered back to create the response. So these could be done in a second phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
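The proxying behavior described above — any broker accepts a request and, if it is not the owner of the target resource, forwards it and relays the response — can be sketched with a toy router. All names here are illustrative; Kafka's real request plumbing would be non-blocking and would need scatter-gather for produce/fetch, as the ticket notes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy sketch of a request-type-agnostic re-routing facility: a simple
// client can send everything to one broker, which handles the request
// locally when it owns the topic and proxies to the owner otherwise.
class RoutingSketch {
    final String brokerId;
    final Map<String, String> ownerByTopic;              // cluster metadata
    final Map<String, Function<String, String>> brokers; // id -> remote handler

    RoutingSketch(String brokerId, Map<String, String> ownerByTopic,
                  Map<String, Function<String, String>> brokers) {
        this.brokerId = brokerId;
        this.ownerByTopic = ownerByTopic;
        this.brokers = brokers;
    }

    // Handle locally when we own the topic, otherwise forward and relay.
    String handle(String topic, String request) {
        String owner = ownerByTopic.get(topic);
        if (brokerId.equals(owner)) {
            return brokerId + ":" + request;          // local processing
        }
        return brokers.get(owner).apply(request);     // proxy to owner
    }
}
```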
[jira] [Assigned] (KAFKA-2007) update offsetrequest for more useful information we have on broker about partition
[ https://issues.apache.org/jira/browse/KAFKA-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani reassigned KAFKA-2007: - Assignee: Sriharsha Chintalapani update offsetrequest for more useful information we have on broker about partition -- Key: KAFKA-2007 URL: https://issues.apache.org/jira/browse/KAFKA-2007 Project: Kafka Issue Type: Bug Reporter: Joe Stein Assignee: Sriharsha Chintalapani Fix For: 0.8.3 this will need a KIP via [~jkreps] in KIP-6 discussion about KAFKA-1694 The other information that would be really useful to get would be information about partitions--how much data is in the partition, what are the segment offsets, what is the log-end offset (i.e. last offset), what is the compaction point, etc. I think that done right this would be the successor to the very awkward OffsetRequest we have today. This is not really blocking that ticket and could happen before/after and has a lot of other useful purposes and is important to get done so tracking it here in this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366377#comment-14366377 ] Sriharsha Chintalapani commented on KAFKA-1566: --- [~nehanarkhede] patch applies against the trunk cleanly now. Can you please take a look. Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch, KAFKA-1566_2015-03-17_17:01:38.patch, KAFKA-1566_2015-03-17_17:19:23.patch It would be useful (especially for automated deployments) to have an environment configuration file that could be sourced from the launcher files (e.g. kafka-run-server.sh). This is how this could look like kafka-env.sh {code} export KAFKA_JVM_PERFORMANCE_OPTS=-XX:+UseCompressedOops -XX:+DisableExplicitGC -Djava.awt.headless=true \ -XX:+UseG1GC -XX:PermSize=48m -XX:MaxPermSize=48m -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35' % export KAFKA_HEAP_OPTS='-Xmx1G -Xms1G' % export KAFKA_LOG4J_OPTS=-Dkafka.logs.dir=/var/log/kafka {code} kafka-server-start.sh {code} ... source $base_dir/config/kafka-env.sh ... {code} This approach is consistent with Hadoop and HBase. However the idea here is to be able to set these values in a single place without having to edit startup scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366347#comment-14366347 ] Sriharsha Chintalapani commented on KAFKA-1566: --- Updated reviewboard https://reviews.apache.org/r/29724/diff/ against branch origin/trunk Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch, KAFKA-1566_2015-03-17_17:01:38.patch It would be useful (especially for automated deployments) to have an environment configuration file that could be sourced from the launcher files (e.g. kafka-run-server.sh). This is how this could look like kafka-env.sh {code} export KAFKA_JVM_PERFORMANCE_OPTS=-XX:+UseCompressedOops -XX:+DisableExplicitGC -Djava.awt.headless=true \ -XX:+UseG1GC -XX:PermSize=48m -XX:MaxPermSize=48m -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35' % export KAFKA_HEAP_OPTS='-Xmx1G -Xms1G' % export KAFKA_LOG4J_OPTS=-Dkafka.logs.dir=/var/log/kafka {code} kafka-server-start.sh {code} ... source $base_dir/config/kafka-env.sh ... {code} This approach is consistent with Hadoop and HBase. However the idea here is to be able to set these values in a single place without having to edit startup scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1566: -- Attachment: KAFKA-1566_2015-03-17_17:01:38.patch Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch, KAFKA-1566_2015-03-17_17:01:38.patch It would be useful (especially for automated deployments) to have an environment configuration file that could be sourced from the launcher files (e.g. kafka-run-server.sh). This is how this could look like kafka-env.sh {code} export KAFKA_JVM_PERFORMANCE_OPTS=-XX:+UseCompressedOops -XX:+DisableExplicitGC -Djava.awt.headless=true \ -XX:+UseG1GC -XX:PermSize=48m -XX:MaxPermSize=48m -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35' % export KAFKA_HEAP_OPTS='-Xmx1G -Xms1G' % export KAFKA_LOG4J_OPTS=-Dkafka.logs.dir=/var/log/kafka {code} kafka-server-start.sh {code} ... source $base_dir/config/kafka-env.sh ... {code} This approach is consistent with Hadoop and HBase. However the idea here is to be able to set these values in a single place without having to edit startup scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1566: -- Attachment: KAFKA-1566_2015-03-17_17:19:23.patch Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch, KAFKA-1566_2015-03-17_17:01:38.patch, KAFKA-1566_2015-03-17_17:19:23.patch It would be useful (especially for automated deployments) to have an environment configuration file that could be sourced from the launcher files (e.g. kafka-run-server.sh). This is how this could look like kafka-env.sh {code} export KAFKA_JVM_PERFORMANCE_OPTS=-XX:+UseCompressedOops -XX:+DisableExplicitGC -Djava.awt.headless=true \ -XX:+UseG1GC -XX:PermSize=48m -XX:MaxPermSize=48m -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35' % export KAFKA_HEAP_OPTS='-Xmx1G -Xms1G' % export KAFKA_LOG4J_OPTS=-Dkafka.logs.dir=/var/log/kafka {code} kafka-server-start.sh {code} ... source $base_dir/config/kafka-env.sh ... {code} This approach is consistent with Hadoop and HBase. However the idea here is to be able to set these values in a single place without having to edit startup scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366373#comment-14366373 ] Sriharsha Chintalapani commented on KAFKA-1566: --- Updated reviewboard https://reviews.apache.org/r/29724/diff/ against branch origin/trunk Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch, KAFKA-1566_2015-03-17_17:01:38.patch, KAFKA-1566_2015-03-17_17:19:23.patch It would be useful (especially for automated deployments) to have an environment configuration file that could be sourced from the launcher files (e.g. kafka-run-server.sh). This is how this could look like kafka-env.sh {code} export KAFKA_JVM_PERFORMANCE_OPTS=-XX:+UseCompressedOops -XX:+DisableExplicitGC -Djava.awt.headless=true \ -XX:+UseG1GC -XX:PermSize=48m -XX:MaxPermSize=48m -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35' % export KAFKA_HEAP_OPTS='-Xmx1G -Xms1G' % export KAFKA_LOG4J_OPTS=-Dkafka.logs.dir=/var/log/kafka {code} kafka-server-start.sh {code} ... source $base_dir/config/kafka-env.sh ... {code} This approach is consistent with Hadoop and HBase. However the idea here is to be able to set these values in a single place without having to edit startup scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1461: -- Attachment: KAFKA-1461_2015-03-17_16:03:33.patch Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch, KAFKA-1461_2015-03-12_13:54:51.patch, KAFKA-1461_2015-03-17_16:03:33.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366262#comment-14366262 ] Sriharsha Chintalapani commented on KAFKA-1461: --- Updated reviewboard https://reviews.apache.org/r/31366/diff/ against branch origin/trunk Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch, KAFKA-1461_2015-03-12_13:54:51.patch, KAFKA-1461_2015-03-17_16:03:33.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1912) Create a simple request re-routing facility
[ https://issues.apache.org/jira/browse/KAFKA-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1912: -- Assignee: (was: Sriharsha Chintalapani) Create a simple request re-routing facility --- Key: KAFKA-1912 URL: https://issues.apache.org/jira/browse/KAFKA-1912 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps Fix For: 0.8.3 We are accumulating a lot of requests that have to be directed to the correct server. This makes sense for high volume produce or fetch requests. But it is silly to put the extra burden on the client for the many miscellaneous requests such as fetching or committing offsets and so on. This adds a ton of practical complexity to the clients with little or no payoff in performance. We should add a generic request-type agnostic re-routing facility on the server. This would allow any server to accept a request and forward it to the correct destination, proxying the response back to the user. Naturally it needs to do this without blocking the thread. The result is that a client implementation can choose to be optimally efficient and manage a local cache of cluster state and attempt to always direct its requests to the proper server OR it can choose simplicity and just send things all to a single host and let that host figure out where to forward it. I actually think we should implement this more or less across the board, but some requests such as produce and fetch require more logic to proxy since they have to be scattered out to multiple servers and gathered back to create the response. So these could be done in a second phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364344#comment-14364344 ] Sriharsha Chintalapani commented on KAFKA-1646: --- [~waldenchen] It looks like the patch is against the 0.8.1.1 branch; can you send us a patch against trunk? Improve consumer read performance for Windows - Key: KAFKA-1646 URL: https://issues.apache.org/jira/browse/KAFKA-1646 Project: Kafka Issue Type: Improvement Components: log Affects Versions: 0.8.1.1 Environment: Windows Reporter: xueqiang wang Assignee: xueqiang wang Labels: newbie, patch Attachments: Improve consumer read performance for Windows.patch, KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, KAFKA-1646_20150312_200352.patch This patch is for the Windows platform only. On Windows, if more than one replica is writing to disk, the segment log files will not be consistent on disk and consumer read performance drops greatly. This fix allocates more disk space when rolling a new segment, which improves consumer read performance on the NTFS file system. The patch doesn't affect file allocation on other filesystems, since it only adds statements like 'if(Os.iswindow)' or adds methods used on Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KAFKA-1685) Implement TLS/SSL tests
[ https://issues.apache.org/jira/browse/KAFKA-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani reassigned KAFKA-1685: - Assignee: Sriharsha Chintalapani Implement TLS/SSL tests --- Key: KAFKA-1685 URL: https://issues.apache.org/jira/browse/KAFKA-1685 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Sriharsha Chintalapani Fix For: 0.9.0 We need to write a suite of unit tests for TLS authentication. This should be doable with a junit integration test. We can use the simple authorization plugin with only a single user whitelisted. The test can start the server and then connects with and without TLS and validates that access is only possible when authenticated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1682) Security for Kafka
[ https://issues.apache.org/jira/browse/KAFKA-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362644#comment-14362644 ] Sriharsha Chintalapani commented on KAFKA-1682: --- Also 5. KAFKA-1688(Authorization), by [~parthrbhatt] , depending on KAFKA-1684 KAFKA-1686 Security for Kafka -- Key: KAFKA-1682 URL: https://issues.apache.org/jira/browse/KAFKA-1682 Project: Kafka Issue Type: New Feature Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Parent ticket for security. Wiki and discussion is here: https://cwiki.apache.org/confluence/display/KAFKA/Security -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1682) Security for Kafka
[ https://issues.apache.org/jira/browse/KAFKA-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362644#comment-14362644 ] Sriharsha Chintalapani edited comment on KAFKA-1682 at 3/16/15 1:14 AM: Also 5. KAFKA-1688(Authorization), by [~parth.brahmbhatt] , depending on KAFKA-1684 KAFKA-1686 was (Author: sriharsha): Also 5. KAFKA-1688(Authorization), by [~parthrbhatt] , depending on KAFKA-1684 KAFKA-1686 Security for Kafka -- Key: KAFKA-1682 URL: https://issues.apache.org/jira/browse/KAFKA-1682 Project: Kafka Issue Type: New Feature Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Parent ticket for security. Wiki and discussion is here: https://cwiki.apache.org/confluence/display/KAFKA/Security -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14361539#comment-14361539 ] Sriharsha Chintalapani commented on KAFKA-1684: --- [~junrao] Definitely we can go that route. I would like to get more details on what needs to be done for KAFKA-1928. Currently the SSLChannel can be used by both the client and the socket server. Also we need to get KAFKA-1809 in, which sets up the framework for multiple ports. [~gwenshap] KAFKA-1928 is currently assigned to you; if you are not actively working on it, can I take it? Implement TLS/SSL authentication Key: KAFKA-1684 URL: https://issues.apache.org/jira/browse/KAFKA-1684 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Ivan Lyutov Attachments: KAFKA-1684.patch, KAFKA-1684.patch Add an SSL port to the configuration and advertise this as part of the metadata request. If the SSL port is configured the socket server will need to add a second Acceptor thread to listen on it. Connections accepted on this port will need to go through the SSL handshake prior to being registered with a Processor for request processing. SSL requests and responses may need to be wrapped or unwrapped using the SSLEngine that was initialized by the acceptor. This wrapping and unwrapping is very similar to what will need to be done for SASL-based authentication schemes. We should have a uniform interface that covers both of these and we will need to store the instance in the session with the request. The socket server will have to use this object when reading and writing requests. We will need to take care with the FetchRequests as the current FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we can only use this optimization for unencrypted sockets that don't require userspace translation (wrapping). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359384#comment-14359384 ] Sriharsha Chintalapani commented on KAFKA-1461: --- Updated reviewboard https://reviews.apache.org/r/31927/diff/ against branch origin/trunk Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch, KAFKA-1461_2015-03-12_13:54:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1461: -- Attachment: KAFKA-1461_2015-03-12_13:54:51.patch Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch, KAFKA-1461_2015-03-12_13:54:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359469#comment-14359469 ] Sriharsha Chintalapani commented on KAFKA-1461: --- Thanks [~junrao], I'll incorporate your and [~guozhang]'s feedback in RB 31366. Yes, we can add that condition at partitionMapCond.await(200L) and use fetchBackoffMs. I'll send an updated PR for it. Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch, KAFKA-1461_2015-03-12_13:54:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
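The back-off shape discussed in the comment — after a failed fetch, wait on a condition for fetchBackoffMs instead of spinning, so a partition reassignment can still wake the thread early — can be sketched as follows. This is an illustrative model, not the actual ReplicaFetcherThread code.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

// Toy model of a fetcher loop with back-off: failed fetches trigger a
// timed, interruptible wait on a condition (mirroring the
// partitionMapCond.await(fetchBackoffMs) idea) rather than a tight retry.
class BackoffFetcher {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition partitionMapCond = lock.newCondition();
    private final long fetchBackoffMs;
    int attempts = 0;

    BackoffFetcher(long fetchBackoffMs) { this.fetchBackoffMs = fetchBackoffMs; }

    // Retry `fetch` until it succeeds, backing off between failures.
    boolean fetchWithBackoff(BooleanSupplier fetch, int maxAttempts) {
        while (attempts < maxAttempts) {
            attempts++;
            if (fetch.getAsBoolean()) return true;
            lock.lock();
            try {
                // Wakeable sleep: e.g. addPartitions() could signal here
                // to cut the back-off short instead of a fixed Thread.sleep.
                partitionMapCond.await(fetchBackoffMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            } finally {
                lock.unlock();
            }
        }
        return false;
    }
}
```

Compared with a bare sleep, the condition wait keeps the thread responsive to shutdown and to partition-map changes while still avoiding the connection-refused spin described in the ticket.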
[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357665#comment-14357665 ] Sriharsha Chintalapani commented on KAFKA-1684: --- [~junrao] [~jkreps] [~gwenshap] [~jjkoshy] Hi everyone, I am trying to set up a pattern for both SSL and Kerberos auth. The above patch includes SSL auth for the socket server. Please ignore the changes around the KafkaServer and SocketServer way of creating acceptors, as that work is being done in KAFKA-1809. Based on the earlier comments I didn't extend SocketChannel; instead, SSLChannel is more of a wrapper around a SocketChannel. The SSL handshake is done using non-blocking I/O and without any Thread.sleep calls. Since SocketServer.Processor depends on SocketChannel selection, I had to use a ConcurrentHashMap that stores the SocketChannel as the key and the ChannelWrapper as the value. The other option I have is to extend SocketChannel or use the SelectionKey attachment, but we are already using that attachment for requests and responses. The patch also contains a unit test in SocketServerTest which generates self-signed certs, runs the socket server using them, and sends a request. I also have a GSSChannel for KAFKA-1686 which does the Kerberos auth in a similar pattern to SSLChannel. Once this patch gets reviewed I'll update the SSL and Kerberos patches based on the feedback. I appreciate any feedback on the above patch. Implement TLS/SSL authentication Key: KAFKA-1684 URL: https://issues.apache.org/jira/browse/KAFKA-1684 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Ivan Lyutov Attachments: KAFKA-1684.patch, KAFKA-1684.patch Add an SSL port to the configuration and advertise this as part of the metadata request. If the SSL port is configured the socket server will need to add a second Acceptor thread to listen on it. 
Connections accepted on this port will need to go through the SSL handshake prior to being registered with a Processor for request processing. SSL requests and responses may need to be wrapped or unwrapped using the SSLEngine that was initialized by the acceptor. This wrapping and unwrapping is very similar to what will need to be done for SASL-based authentication schemes. We should have a uniform interface that covers both of these and we will need to store the instance in the session with the request. The socket server will have to use this object when reading and writing requests. We will need to take care with the FetchRequests as the current FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we can only use this optimization for unencrypted sockets that don't require userspace translation (wrapping). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357660#comment-14357660 ] Sriharsha Chintalapani commented on KAFKA-1684: --- Created reviewboard https://reviews.apache.org/r/31958/diff/ against branch origin/trunk Implement TLS/SSL authentication Key: KAFKA-1684 URL: https://issues.apache.org/jira/browse/KAFKA-1684 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Ivan Lyutov Attachments: KAFKA-1684.patch, KAFKA-1684.patch Add an SSL port to the configuration and advertise this as part of the metadata request. If the SSL port is configured the socket server will need to add a second Acceptor thread to listen on it. Connections accepted on this port will need to go through the SSL handshake prior to being registered with a Processor for request processing. SSL requests and responses may need to be wrapped or unwrapped using the SSLEngine that was initialized by the acceptor. This wrapping and unwrapping is very similar to what will need to be done for SASL-based authentication schemes. We should have a uniform interface that covers both of these and we will need to store the instance in the session with the request. The socket server will have to use this object when reading and writing requests. We will need to take care with the FetchRequests as the current FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we can only use this optimization for unencrypted sockets that don't require userspace translation (wrapping). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1684) Implement TLS/SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1684: -- Attachment: KAFKA-1684.patch Implement TLS/SSL authentication Key: KAFKA-1684 URL: https://issues.apache.org/jira/browse/KAFKA-1684 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Ivan Lyutov Attachments: KAFKA-1684.patch, KAFKA-1684.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357263#comment-14357263 ] Sriharsha Chintalapani commented on KAFKA-1461: --- Updated reviewboard https://reviews.apache.org/r/31927/diff/ against branch origin/trunk Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1461: -- Attachment: KAFKA-1461_2015-03-11_10:41:26.patch Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1461: -- Attachment: KAFKA-1461_2015-03-11_18:17:51.patch Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357927#comment-14357927 ] Sriharsha Chintalapani commented on KAFKA-1461: --- Updated reviewboard https://reviews.apache.org/r/31927/diff/ against branch origin/trunk Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358040#comment-14358040 ] Sriharsha Chintalapani commented on KAFKA-1461: --- [~charmalloc] Since there aren't any interface changes, I am not sure a KIP is necessary. Of course, we did add a new config, replica.fetch.backoff.ms. If this warrants a KIP, then I can write one up. Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
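For illustration, the new setting discussed above would be set in a broker's server.properties like this (the value shown is an arbitrary example, not a documented default):

```properties
# back off instead of retrying in a tight loop when a replica fetch fails
replica.fetch.backoff.ms=1000
```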
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356341#comment-14356341 ] Sriharsha Chintalapani commented on KAFKA-1461: --- Created reviewboard https://reviews.apache.org/r/31927/diff/ against branch origin/trunk Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356344#comment-14356344 ] Sriharsha Chintalapani commented on KAFKA-1461: --- [~junrao] [~guozhang] Please take a look at the above patch. Let me know if that's what you have in mind. I also added replica.fetch.backoff.ms and controller.socket.timeout.ms to TestUtils.createBrokerConfig; this reduced the total test run time from 15 mins to under 10 mins on my machine. Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1461: -- Attachment: KAFKA-1461.patch Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14354139#comment-14354139 ] Sriharsha Chintalapani commented on KAFKA-1461: --- [~junrao] I'll work on it tomorrow and finish up the patch. A few questions on your recommendations. "In AbstractFetcherThread, simply back off based on the configured time, if it hits an exception when doing a fetch." So instead of handling partition errors, if there is an exception while fetching, we will just back off the AbstractFetcherThread and wait until the configured time has elapsed or a condition is met? Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
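The back-off being discussed can be sketched as follows (a simplified stand-in for AbstractFetcherThread, not Kafka's actual code; the condition-wait is what would let a shutdown signal cut the sleep short instead of a plain Thread.sleep):

```java
// Sketch: on a fetch exception, back off for the configured time instead of
// retrying in a tight loop. All names and values here are illustrative.
public class BackoffSketch {
    static int attempts = 0;

    // Stand-in for the fetch call; fails twice, then succeeds.
    static void fetch() {
        attempts++;
        if (attempts < 3) throw new RuntimeException("connection refused");
    }

    public static void main(String[] args) throws InterruptedException {
        long backoffMs = 10; // stands in for replica.fetch.backoff.ms
        Object condition = new Object(); // shutdown could notify() this
        while (true) {
            try {
                fetch();
                break; // success
            } catch (RuntimeException e) {
                synchronized (condition) {
                    condition.wait(backoffMs); // back off instead of spinning
                }
            }
        }
        System.out.println("fetched after " + attempts + " attempts");
    }
}
```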
[jira] [Commented] (KAFKA-2010) unit tests sometimes are slow during shutdown
[ https://issues.apache.org/jira/browse/KAFKA-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14352329#comment-14352329 ] Sriharsha Chintalapani commented on KAFKA-2010: --- [~junrao] It might be due to this https://issues.apache.org/jira/browse/KAFKA-1887?focusedCommentId=14330093page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14330093 . KAFKA-1971 removed healthCheck.shutdown(), which doesn't remove the brokerId, causing controllerChannelManager.shutdown() to throw an exception as noted in my comment in KAFKA-1887. unit tests sometimes are slow during shutdown - Key: KAFKA-2010 URL: https://issues.apache.org/jira/browse/KAFKA-2010 Project: Kafka Issue Type: Bug Reporter: Jun Rao Assignee: Jun Rao Fix For: 0.8.3 Our unit tests in trunk seem to be slower than before. The slowness seems to be due to a handful of tests. For example, if you run the following test, sometimes it can take more than 40 secs, while normally it takes less than 10 secs. ./gradlew -i cleanTest core:test --tests kafka.admin.AddPartitionsTest.testIncrementPartitions -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-2000) Delete consumer offsets from kafka once the topic is deleted
Sriharsha Chintalapani created KAFKA-2000: - Summary: Delete consumer offsets from kafka once the topic is deleted Key: KAFKA-2000 URL: https://issues.apache.org/jira/browse/KAFKA-2000 Project: Kafka Issue Type: Bug Reporter: Sriharsha Chintalapani Assignee: Sriharsha Chintalapani -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()
[ https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340162#comment-14340162 ] Sriharsha Chintalapani commented on KAFKA-1866: --- [~nehanarkhede] I updated the patch; can you please take a look? Thanks. LogStartOffset gauge throws exceptions after log.delete() - Key: KAFKA-1866 URL: https://issues.apache.org/jira/browse/KAFKA-1866 Project: Kafka Issue Type: Bug Reporter: Gian Merlino Assignee: Sriharsha Chintalapani Attachments: KAFKA-1866.patch, KAFKA-1866_2015-02-10_22:50:09.patch, KAFKA-1866_2015-02-11_09:25:33.patch The LogStartOffset gauge does logSegments.head.baseOffset, which throws NoSuchElementException on an empty list, which can occur after a delete() of the log. This makes life harder for custom MetricsReporters, since they have to deal with .value() possibly throwing an exception. Locally we're dealing with this by having Log.delete() also call removeMetric on all the gauges. That also has the benefit of not having a bunch of metrics floating around for logs that the broker is not actually handling. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
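The failure mode described above can be sketched like this (illustrative Java, not Kafka's Scala code; the reporter's actual fix removes the gauges in Log.delete(), while this sketch only demonstrates the exception and a defensive read):

```java
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch: a gauge computed as logSegments.head.baseOffset throws
// NoSuchElementException once the segment list is empty after delete().
public class GaugeSketch {
    // Empty after a log delete(); names here are illustrative.
    static List<Long> segmentBaseOffsets = Collections.emptyList();

    static long logStartOffsetUnsafe() {
        return segmentBaseOffsets.iterator().next(); // throws on empty list
    }

    static long logStartOffsetSafe() {
        // Defensive read: return a sentinel instead of throwing.
        return segmentBaseOffsets.isEmpty() ? -1L : segmentBaseOffsets.get(0);
    }

    public static void main(String[] args) {
        try {
            logStartOffsetUnsafe();
            System.out.println("no exception");
        } catch (NoSuchElementException e) {
            System.out.println("unsafe gauge threw");
        }
        System.out.println("safe gauge: " + logStartOffsetSafe());
    }
}
```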
[jira] [Commented] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340163#comment-14340163 ] Sriharsha Chintalapani commented on KAFKA-1852: --- [~jjkoshy] Updated the patch as per your suggestions; can you please take a look? Thanks. OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch, KAFKA-1852_2015-02-16_13:21:46.patch, KAFKA-1852_2015-02-18_13:13:17.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1852: -- Attachment: KAFKA-1852_2015-02-27_13:50:34.patch OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch, KAFKA-1852_2015-02-16_13:21:46.patch, KAFKA-1852_2015-02-18_13:13:17.patch, KAFKA-1852_2015-02-27_13:50:34.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340874#comment-14340874 ] Sriharsha Chintalapani commented on KAFKA-1852: --- Updated reviewboard https://reviews.apache.org/r/29912/diff/ against branch origin/trunk OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch, KAFKA-1852_2015-02-16_13:21:46.patch, KAFKA-1852_2015-02-18_13:13:17.patch, KAFKA-1852_2015-02-27_13:50:34.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1596) Exception in KafkaScheduler while shutting down
[ https://issues.apache.org/jira/browse/KAFKA-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1596: -- Resolution: Duplicate Status: Resolved (was: Patch Available) This issue was already fixed as part of KAFKA-1760. Exception in KafkaScheduler while shutting down --- Key: KAFKA-1596 URL: https://issues.apache.org/jira/browse/KAFKA-1596 Project: Kafka Issue Type: Bug Reporter: Joel Koshy Labels: newbie Attachments: kafka-1596.patch Saw this while trying to reproduce KAFKA-1577. It is very minor and won't happen in practice but annoying nonetheless. {code} [2014-08-14 18:03:56,686] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient) [2014-08-14 18:03:56,776] INFO Loading logs. (kafka.log.LogManager) [2014-08-14 18:03:56,783] INFO Logs loading complete. (kafka.log.LogManager) [2014-08-14 18:03:57,120] INFO Starting log cleanup with a period of 30 ms. (kafka.log.LogManager) [2014-08-14 18:03:57,124] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager) [2014-08-14 18:03:57,158] INFO Awaiting socket connections on 0.0.0.0:9092. 
(kafka.network.Acceptor) [2014-08-14 18:03:57,160] INFO [Socket Server on Broker 0], Started (kafka.network.SocketServer) ^C[2014-08-14 18:03:57,203] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer) [2014-08-14 18:03:57,211] INFO [Socket Server on Broker 0], Shutting down (kafka.network.SocketServer) [2014-08-14 18:03:57,222] INFO [Socket Server on Broker 0], Shutdown completed (kafka.network.SocketServer) [2014-08-14 18:03:57,226] INFO [Replica Manager on Broker 0]: Shut down (kafka.server.ReplicaManager) [2014-08-14 18:03:57,228] INFO [ReplicaFetcherManager on broker 0] shutting down (kafka.server.ReplicaFetcherManager) [2014-08-14 18:03:57,233] INFO [ReplicaFetcherManager on broker 0] shutdown completed (kafka.server.ReplicaFetcherManager) [2014-08-14 18:03:57,274] INFO [Replica Manager on Broker 0]: Shut down completely (kafka.server.ReplicaManager) [2014-08-14 18:03:57,276] INFO Shutting down. (kafka.log.LogManager) [2014-08-14 18:03:57,296] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$) [2014-08-14 18:03:57,297] INFO Shutdown complete. (kafka.log.LogManager) [2014-08-14 18:03:57,301] FATAL Fatal error during KafkaServerStable startup. Prepare to shutdown (kafka.server.KafkaServerStartable) java.lang.IllegalStateException: Kafka scheduler has not been started at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114) at kafka.utils.KafkaScheduler.schedule(KafkaScheduler.scala:95) at kafka.server.ReplicaManager.startup(ReplicaManager.scala:138) at kafka.server.KafkaServer.startup(KafkaServer.scala:112) at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:28) at kafka.Kafka$.main(Kafka.scala:46) at kafka.Kafka.main(Kafka.scala) [2014-08-14 18:03:57,324] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer) [2014-08-14 18:03:57,326] INFO Terminate ZkClient event thread. 
(org.I0Itec.zkclient.ZkEventThread) [2014-08-14 18:03:57,329] INFO Session: 0x147d5b0a51a closed (org.apache.zookeeper.ZooKeeper) [2014-08-14 18:03:57,329] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn) [2014-08-14 18:03:57,329] INFO [Kafka Server 0], shut down completed (kafka.server.KafkaServer) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
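The guard that produces the IllegalStateException in the log above can be sketched as follows (a simplified stand-in for kafka.utils.KafkaScheduler, not its real implementation): schedule() requires startup() to have been called first, which is why scheduling during an interrupted startup/shutdown sequence fails.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of a started-state guard; task names here are illustrative.
public class SchedulerSketch {
    private final AtomicBoolean started = new AtomicBoolean(false);

    public void startup() { started.set(true); }

    public void schedule(String taskName) {
        // Mirrors ensureStarted(): scheduling before startup is an error.
        if (!started.get())
            throw new IllegalStateException("Kafka scheduler has not been started");
        System.out.println("scheduled " + taskName);
    }

    public static void main(String[] args) {
        SchedulerSketch s = new SchedulerSketch();
        try {
            s.schedule("highwatermark-checkpoint"); // before startup(): throws
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
        s.startup();
        s.schedule("highwatermark-checkpoint"); // after startup(): succeeds
    }
}
```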
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1461: -- Attachment: KAFKA-1461.patch Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335164#comment-14335164 ] Sriharsha Chintalapani commented on KAFKA-1461: --- Created reviewboard https://reviews.apache.org/r/31366/diff/ against branch origin/trunk Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1461: -- Status: Patch Available (was: Open) Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335166#comment-14335166 ] Sriharsha Chintalapani commented on KAFKA-1461: --- [~guozhang] Thanks for the pointers. Can you please take a look at the patch when you get a chance? Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1757) Can not delete Topic index on Windows
[ https://issues.apache.org/jira/browse/KAFKA-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14334389#comment-14334389 ] Sriharsha Chintalapani commented on KAFKA-1757: --- patch merged by [~jkreps] closing this as fixed. Can not delete Topic index on Windows - Key: KAFKA-1757 URL: https://issues.apache.org/jira/browse/KAFKA-1757 Project: Kafka Issue Type: Bug Components: log Affects Versions: 0.8.2.0 Reporter: Lukáš Vyhlídka Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1757.patch, lucky-v.patch When running the Kafka 0.8.2-Beta (Scala 2.10) on Windows, an attempt to delete the Topic throwed an error: ERROR [KafkaApi-1] error when handling request Name: StopReplicaRequest; Version: 0; CorrelationId: 38; ClientId: ; DeletePartitions: true; ControllerId: 0; ControllerEpoch: 3; Partitions: [test,0] (kafka.server.KafkaApis) kafka.common.KafkaStorageException: Delete of index .index failed. at kafka.log.LogSegment.delete(LogSegment.scala:283) at kafka.log.Log$$anonfun$delete$1.apply(Log.scala:608) at kafka.log.Log$$anonfun$delete$1.apply(Log.scala:608) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at kafka.log.Log.delete(Log.scala:608) at kafka.log.LogManager.deleteLog(LogManager.scala:375) at kafka.cluster.Partition$$anonfun$delete$1.apply$mcV$sp(Partition.scala:144) at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:139) at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:139) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.utils.Utils$.inWriteLock(Utils.scala:543) at kafka.cluster.Partition.delete(Partition.scala:139) at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:158) at 
kafka.server.ReplicaManager$$anonfun$stopReplicas$3.apply(ReplicaManager.scala:191) at kafka.server.ReplicaManager$$anonfun$stopReplicas$3.apply(ReplicaManager.scala:190) at scala.collection.immutable.Set$Set1.foreach(Set.scala:74) at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:190) at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:96) at kafka.server.KafkaApis.handle(KafkaApis.scala:59) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59) at java.lang.Thread.run(Thread.java:744) When I have investigated the issue I figured out that the index file (in my environment it was C:\tmp\kafka-logs\----0014-0\.index) was locked by the kafka process and the OS did not allow to delete that file. I tried to fix the problem in source codes and when I added close() method call into LogSegment.delete(), the Topic deletion started to work. I will add here (not sure how to upload the file during issue creation) a diff with the changes I have made so You can take a look on that whether it is reasonable or not. It would be perfect if it could make it into the product... In the end I would like to say that on Linux the deletion works just fine... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
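The fix described above, closing the index file's handle before deleting it, can be sketched like this (illustrative Java; the actual change is a close() call in LogSegment.delete(), which is Scala). Windows refuses to delete a file with an open handle, while Linux allows it, which is why the bug only shows on Windows:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: close the channel backing an index file before deleting the file.
public class DeleteSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for a segment's .index file; the name is illustrative.
        File index = File.createTempFile("00000000000000000000", ".index");
        FileChannel channel = new RandomAccessFile(index, "rw").getChannel();
        channel.write(ByteBuffer.wrap(new byte[]{1, 2, 3}));

        channel.close();                  // the added close() call
        boolean deleted = index.delete(); // now succeeds on Windows as well
        System.out.println("deleted=" + deleted);
    }
}
```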
[jira] [Updated] (KAFKA-1757) Can not delete Topic index on Windows
[ https://issues.apache.org/jira/browse/KAFKA-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1757: -- Resolution: Fixed Status: Resolved (was: Patch Available) Can not delete Topic index on Windows - Key: KAFKA-1757 URL: https://issues.apache.org/jira/browse/KAFKA-1757 Project: Kafka Issue Type: Bug Components: log Affects Versions: 0.8.2.0 Reporter: Lukáš Vyhlídka Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1757.patch, lucky-v.patch When running Kafka 0.8.2-Beta (Scala 2.10) on Windows, an attempt to delete a topic threw an error: ERROR [KafkaApi-1] error when handling request Name: StopReplicaRequest; Version: 0; CorrelationId: 38; ClientId: ; DeletePartitions: true; ControllerId: 0; ControllerEpoch: 3; Partitions: [test,0] (kafka.server.KafkaApis) kafka.common.KafkaStorageException: Delete of index .index failed. at kafka.log.LogSegment.delete(LogSegment.scala:283) at kafka.log.Log$$anonfun$delete$1.apply(Log.scala:608) at kafka.log.Log$$anonfun$delete$1.apply(Log.scala:608) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at kafka.log.Log.delete(Log.scala:608) at kafka.log.LogManager.deleteLog(LogManager.scala:375) at kafka.cluster.Partition$$anonfun$delete$1.apply$mcV$sp(Partition.scala:144) at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:139) at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:139) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.utils.Utils$.inWriteLock(Utils.scala:543) at kafka.cluster.Partition.delete(Partition.scala:139) at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:158) at kafka.server.ReplicaManager$$anonfun$stopReplicas$3.apply(ReplicaManager.scala:191) at kafka.server.ReplicaManager$$anonfun$stopReplicas$3.apply(ReplicaManager.scala:190) at scala.collection.immutable.Set$Set1.foreach(Set.scala:74) at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:190) at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:96) at kafka.server.KafkaApis.handle(KafkaApis.scala:59) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59) at java.lang.Thread.run(Thread.java:744) When I investigated the issue I found that the index file (in my environment it was C:\tmp\kafka-logs\----0014-0\.index) was locked by the Kafka process and the OS did not allow the file to be deleted. I tried to fix the problem in the source code, and when I added a close() call into LogSegment.delete(), topic deletion started to work. I will add here (not sure how to upload the file during issue creation) a diff with the changes I made so you can take a look and judge whether it is reasonable or not. It would be perfect if it could make it into the product... Finally, I would like to note that on Linux the deletion works just fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
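The close-before-delete behavior described in the report above can be sketched in plain Java. This is an illustration of the Windows file-locking semantics, not Kafka's actual LogSegment code; `deleteIndex` is a hypothetical helper:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseBeforeDelete {
    // On Windows, deleting a file that still has an open handle (or an
    // active memory map) fails; on Linux the unlink succeeds even while
    // the file is open. Releasing the handle first, as the reporter did
    // by adding close() to LogSegment.delete(), makes the delete work on
    // both platforms. Hypothetical helper, not Kafka's LogSegment code.
    static void deleteIndex(Path index) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(index.toFile(), "rw");
        try {
            // ... index file is in use here ...
        } finally {
            raf.close(); // without this, Files.delete fails on Windows
        }
        Files.delete(index);
    }

    public static void main(String[] args) throws IOException {
        Path index = Files.createTempFile("topic-0", ".index");
        deleteIndex(index);
        System.out.println(Files.exists(index)); // false
    }
}
```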
[jira] [Commented] (KAFKA-1724) Errors after reboot in single node setup
[ https://issues.apache.org/jira/browse/KAFKA-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14334148#comment-14334148 ] Sriharsha Chintalapani commented on KAFKA-1724: --- [~junrao] Yes. We have isStarted in KafkaScheduler, which gets set after it's started, and in shutdown we check isStarted and go through the shutdown process. Tested it in a cluster to reproduce; I don't see any errors. Errors after reboot in single node setup Key: KAFKA-1724 URL: https://issues.apache.org/jira/browse/KAFKA-1724 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2.0 Reporter: Ciprian Hacman Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1724.patch In a single node setup, after reboot, Kafka logs show the following: {code} [2014-10-22 16:37:22,206] INFO [Controller 0]: Controller starting up (kafka.controller.KafkaController) [2014-10-22 16:37:22,419] INFO [Controller 0]: Controller startup complete (kafka.controller.KafkaController) [2014-10-22 16:37:22,554] INFO conflict in /brokers/ids/0 data: {jmx_port:-1,timestamp:1413995842465,host:ip-10-91-142-54.eu-west-1.compute.internal,version:1,port:9092} stored data: {jmx_port:-1,timestamp:1413994171579,host:ip-10-91-142-54.eu-west-1.compute.internal,version:1,port:9092} (kafka.utils.ZkUtils$) [2014-10-22 16:37:22,736] INFO I wrote this conflicted ephemeral node [{jmx_port:-1,timestamp:1413995842465,host:ip-10-91-142-54.eu-west-1.compute.internal,version:1,port:9092}] at /brokers/ids/0 a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$) [2014-10-22 16:37:25,010] ERROR Error handling event ZkEvent[Data of /controller changed sent to kafka.server.ZookeeperLeaderElector$LeaderChangeListener@a6af882] (org.I0Itec.zkclient.ZkEventThread) java.lang.IllegalStateException: Kafka scheduler has not been started at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114) at 
kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:86) at kafka.controller.KafkaController.onControllerResignation(KafkaController.scala:350) at kafka.controller.KafkaController$$anonfun$2.apply$mcV$sp(KafkaController.scala:162) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:138) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply(ZookeeperLeaderElector.scala:134) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply(ZookeeperLeaderElector.scala:134) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener.handleDataDeleted(ZookeeperLeaderElector.scala:134) at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:549) at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) [2014-10-22 16:37:28,757] INFO Registered broker 0 at path /brokers/ids/0 with address ip-10-91-142-54.eu-west-1.compute.internal:9092. (kafka.utils.ZkUtils$) [2014-10-22 16:37:28,849] INFO [Kafka Server 0], started (kafka.server.KafkaServer) [2014-10-22 16:38:56,718] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor) [2014-10-22 16:38:56,850] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor) [2014-10-22 16:38:56,985] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor) {code} The last log line repeats forever and is correlated with errors on the app side. Restarting Kafka fixes the errors. 
Steps to reproduce (with help from the mailing list):
# start zookeeper
# start kafka-broker
# create a topic or start a producer writing to a topic
# stop zookeeper
# stop kafka-broker (the broker shutdown runs into WARN Session 0x14938d9dc010001 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) java.net.ConnectException: Connection refused)
# kill -9 kafka-broker
# restart zookeeper and then kafka-broker, which leads to the error above
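The guard described in the comment above (an isStarted flag checked in shutdown) can be sketched as follows. `GuardedScheduler` is a hypothetical stand-in, not Kafka's actual KafkaScheduler:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the fix: shutdown() is a no-op unless startup() ran first,
// so a controller resignation that fires before the scheduler ever
// started no longer throws IllegalStateException.
public class GuardedScheduler {
    private final AtomicBoolean started = new AtomicBoolean(false);
    private ScheduledExecutorService executor;

    public synchronized void startup() {
        executor = Executors.newScheduledThreadPool(1);
        started.set(true);
    }

    // Returns true if a running scheduler was actually shut down,
    // false when there was nothing to do.
    public synchronized boolean shutdown() {
        if (!started.getAndSet(false)) {
            return false; // never started (or already shut down)
        }
        executor.shutdownNow();
        return true;
    }

    public static void main(String[] args) {
        GuardedScheduler s = new GuardedScheduler();
        s.shutdown();              // safe: no-op before startup
        s.startup();
        System.out.println(s.shutdown()); // true
    }
}
```

Calling shutdown() on a scheduler that was never started simply returns instead of throwing, which is the behavior the comment says was verified in a cluster.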
[jira] [Commented] (KAFKA-1724) Errors after reboot in single node setup
[ https://issues.apache.org/jira/browse/KAFKA-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14334088#comment-14334088 ] Sriharsha Chintalapani commented on KAFKA-1724: --- [~junrao] Thanks for the comments on the patch. So it looks like this is already fixed in the trunk. We can close this JIRA. Errors after reboot in single node setup Key: KAFKA-1724 URL: https://issues.apache.org/jira/browse/KAFKA-1724 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2.0 Reporter: Ciprian Hacman Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1724.patch In a single node setup, after reboot, Kafka logs show the following: {code} [2014-10-22 16:37:22,206] INFO [Controller 0]: Controller starting up (kafka.controller.KafkaController) [2014-10-22 16:37:22,419] INFO [Controller 0]: Controller startup complete (kafka.controller.KafkaController) [2014-10-22 16:37:22,554] INFO conflict in /brokers/ids/0 data: {jmx_port:-1,timestamp:1413995842465,host:ip-10-91-142-54.eu-west-1.compute.internal,version:1,port:9092} stored data: {jmx_port:-1,timestamp:1413994171579,host:ip-10-91-142-54.eu-west-1.compute.internal,version:1,port:9092} (kafka.utils.ZkUtils$) [2014-10-22 16:37:22,736] INFO I wrote this conflicted ephemeral node [{jmx_port:-1,timestamp:1413995842465,host:ip-10-91-142-54.eu-west-1.compute.internal,version:1,port:9092}] at /brokers/ids/0 a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$) [2014-10-22 16:37:25,010] ERROR Error handling event ZkEvent[Data of /controller changed sent to kafka.server.ZookeeperLeaderElector$LeaderChangeListener@a6af882] (org.I0Itec.zkclient.ZkEventThread) java.lang.IllegalStateException: Kafka scheduler has not been started at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114) at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:86) at 
kafka.controller.KafkaController.onControllerResignation(KafkaController.scala:350) at kafka.controller.KafkaController$$anonfun$2.apply$mcV$sp(KafkaController.scala:162) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:138) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply(ZookeeperLeaderElector.scala:134) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply(ZookeeperLeaderElector.scala:134) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.server.ZookeeperLeaderElector$LeaderChangeListener.handleDataDeleted(ZookeeperLeaderElector.scala:134) at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:549) at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) [2014-10-22 16:37:28,757] INFO Registered broker 0 at path /brokers/ids/0 with address ip-10-91-142-54.eu-west-1.compute.internal:9092. (kafka.utils.ZkUtils$) [2014-10-22 16:37:28,849] INFO [Kafka Server 0], started (kafka.server.KafkaServer) [2014-10-22 16:38:56,718] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor) [2014-10-22 16:38:56,850] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor) [2014-10-22 16:38:56,985] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor) {code} The last log line repeats forever and is correlated with errors on the app side. Restarting Kafka fixes the errors. 
Steps to reproduce (with help from the mailing list):
# start zookeeper
# start kafka-broker
# create a topic or start a producer writing to a topic
# stop zookeeper
# stop kafka-broker (the broker shutdown runs into WARN Session 0x14938d9dc010001 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) java.net.ConnectException: Connection refused)
# kill -9 kafka-broker
# restart zookeeper and then kafka-broker, which leads to the error above
[jira] [Commented] (KAFKA-1976) transient unit test failure in ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown
[ https://issues.apache.org/jira/browse/KAFKA-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14332514#comment-14332514 ] Sriharsha Chintalapani commented on KAFKA-1976: --- [~junrao] I covered this issue as part of this JIRA https://issues.apache.org/jira/browse/KAFKA-1887 transient unit test failure in ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown --- Key: KAFKA-1976 URL: https://issues.apache.org/jira/browse/KAFKA-1976 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.3 Reporter: Jun Rao Saw the following failure a few times. kafka.api.test.ProducerFailureHandlingTest testNotEnoughReplicasAfterBrokerShutdown FAILED org.scalatest.junit.JUnitTestFailedError: Expected NotEnoughReplicasException when producing to topic with fewer brokers than min.insync.replicas at org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:101) at org.scalatest.junit.JUnit3Suite.newAssertionFailedException(JUnit3Suite.scala:149) at org.scalatest.Assertions$class.fail(Assertions.scala:711) at org.scalatest.junit.JUnit3Suite.fail(JUnit3Suite.scala:149) at kafka.api.test.ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown(ProducerFailureHandlingTest.scala:352) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1566: -- Attachment: KAFKA-1566_2015-02-21_21:57:02.patch Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch It would be useful (especially for automated deployments) to have an environment configuration file that could be sourced from the launcher files (e.g. kafka-run-server.sh). This is how it could look: kafka-env.sh {code} export KAFKA_JVM_PERFORMANCE_OPTS='-XX:+UseCompressedOops -XX:+DisableExplicitGC -Djava.awt.headless=true \ -XX:+UseG1GC -XX:PermSize=48m -XX:MaxPermSize=48m -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35' export KAFKA_HEAP_OPTS='-Xmx1G -Xms1G' export KAFKA_LOG4J_OPTS='-Dkafka.logs.dir=/var/log/kafka' {code} kafka-server-start.sh {code} ... source $base_dir/config/kafka-env.sh ... {code} This approach is consistent with Hadoop and HBase. However, the idea here is to be able to set these values in a single place without having to edit startup scripts.
[jira] [Commented] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14332047#comment-14332047 ] Sriharsha Chintalapani commented on KAFKA-1566: --- Updated reviewboard https://reviews.apache.org/r/29724/diff/ against branch origin/trunk Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch
[jira] [Commented] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14332028#comment-14332028 ] Sriharsha Chintalapani commented on KAFKA-1566: --- [~nehanarkhede] will rebase and send a new patch. [~omkreddy] makes sense, will add it to the new patch. Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch
[jira] [Commented] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14332048#comment-14332048 ] Sriharsha Chintalapani commented on KAFKA-1566: --- [~nehanarkhede] can you please check the latest patch? I am able to apply it cleanly on the trunk. Thanks. Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch, KAFKA-1566_2015-02-21_21:57:02.patch
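The sourcing line proposed for kafka-server-start.sh can be made tolerant of a missing env file. The existence guard below is an assumption for illustration, not part of the attached patch:

```shell
#!/usr/bin/env bash
base_dir=$(dirname "$0")/..

# Source the optional environment file only if it exists, so a missing
# config/kafka-env.sh does not abort server startup.
if [ -f "$base_dir/config/kafka-env.sh" ]; then
    . "$base_dir/config/kafka-env.sh"
fi

echo "heap opts: ${KAFKA_HEAP_OPTS:-<default>}"
```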
[jira] [Commented] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330090#comment-14330090 ] Sriharsha Chintalapani commented on KAFKA-1887: --- Created reviewboard https://reviews.apache.org/r/31256/diff/ against branch origin/trunk controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1887.patch We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at 
scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356) at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568) at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1887: -- Attachment: KAFKA-1887.patch controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1887.patch We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
[jira] [Updated] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1887: -- Attachment: KAFKA-1887_2015-02-21_01:12:25.patch controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1887.patch, KAFKA-1887_2015-02-21_01:12:25.patch We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
[jira] [Commented] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330094#comment-14330094 ] Sriharsha Chintalapani commented on KAFKA-1887: --- Updated reviewboard https://reviews.apache.org/r/31256/diff/ against branch origin/trunk controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1887.patch, KAFKA-1887_2015-02-21_01:12:25.patch We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
[jira] [Updated] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1887: -- Status: Patch Available (was: Open) controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1887.patch We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
[jira] [Commented] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330093#comment-14330093 ] Sriharsha Chintalapani commented on KAFKA-1887: --- [~nehanarkhede] I moved KafkaController.shutdown() followed by KafkaHealthCheck.shutdown() above SocketServer.shutdown(). 1) Moving the kafkaHealthCheck shutdown below the controller shutdown doesn't trigger ReplicaStateMachine.BrokerChangeListener(). 2) Because of 1, controllerContext.controllerChannelManager.removeBroker doesn't get called for the current brokerId, and it continues to exist in controllerContext.controllerChannelManager.brokerStateInfo. 3) When kafkaController.shutdown() gets called it calls controllerChannelManager.shutdown(), which goes through removeExistingBroker for the brokerId whose SocketServer is already shut down, causing removeExistingBroker().brokerStateInfo(brokerId).channel.disconnect() to throw an exception. Because of this exception, KafkaBroker.shutdown() slows down. In the above patch I moved KafkaController.shutdown and KafkaHealthCheck.shutdown above SocketServer.shutdown(). controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1887.patch We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. 
Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) at 
kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356) at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568) at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330093#comment-14330093 ] Sriharsha Chintalapani edited comment on KAFKA-1887 at 2/21/15 5:53 PM: [~nehanarkhede] I moved KafkaController.shutdown() followed by KafkaHealthCheck.shutdown() above SocketServer.shutdown(). 1) Moving the kafkaHealthCheck shutdown below the controller shutdown doesn't trigger ReplicaStateMachine.BrokerChangeListener(). 2) Because of 1, controllerContext.controllerChannelManager.removeBroker doesn't get called for the current brokerId, and it continues to exist in controllerContext.controllerChannelManager.brokerStateInfo. 3) When kafkaController.shutdown() gets called it calls controllerChannelManager.shutdown(), which goes through removeExistingBroker for the brokerId whose SocketServer is already shut down, causing removeExistingBroker().brokerStateInfo(brokerId).channel.disconnect() to throw an exception. Because of this exception, KafkaBroker.shutdown() slows down. In the above patch I moved KafkaController.shutdown and KafkaHealthCheck.shutdown above SocketServer.shutdown() was (Author: sriharsha): [~nehanarkhede] I moved KafkaController.shutdown() followed by KafkaHealthCheck.shutdown() above SocketServer.shutdown(). 1) Moving kafkaHealthCheck below controller shutdown doesn't trigger ReplicaStateMachine.BrokerChangeListener() 2) because of 1 controllerContext.controllerChannelManager.removeBroker for the current brokerId and it continues exist in controllerContext.controllerChannelManager.brokerStateInfo. 3) when kafkaController.shutdown() gets called it calls controllerChannelManager.shutdown() and it will go through removeExistingBroker for the brokerId whose SocketServer is shutdown causing removeExistingBroker().brokerStateInfo(brokerId).channel.disconnect() throw an exception . Because of this exception KafkaBroker.shutdown() is slowing down. 
In the above patch moved KafkaController.shutdown and KafkaHealthCheck.shutdown above SocketServer.shutdown() controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1887.patch, KAFKA-1887_2015-02-21_01:12:25.patch We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at 
kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at
[jira] [Commented] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329122#comment-14329122 ] Sriharsha Chintalapani commented on KAFKA-1887: --- [~nehanarkhede] Another problem I forgot to add in my comment above: with this fix, i.e. moving kafkaHealthCheck below the controller shutdown, broker controlled shutdown can fail. Here is the log on a 2-node cluster with a topic with 2 partitions and replication factor 2. This happens after I shut down the second broker and then shut down the controller; I'll get more details on this. [2015-02-20 08:24:34,548] INFO [Kafka Server 0], Starting controlled shutdown (kafka.server.KafkaServer) [2015-02-20 08:24:34,569] INFO [Kafka Server 0], Remaining partitions to move: [test3,0],[test3,1] (kafka.server.KafkaServer) [2015-02-20 08:24:34,570] INFO [Kafka Server 0], Error code from controller: 0 (kafka.server.KafkaServer) [2015-02-20 08:24:39,575] WARN [Kafka Server 0], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2015-02-20 08:24:39,596] INFO [Kafka Server 0], Remaining partitions to move: [test3,0],[test3,1] (kafka.server.KafkaServer) [2015-02-20 08:24:39,596] INFO [Kafka Server 0], Error code from controller: 0 (kafka.server.KafkaServer) ^C[2015-02-20 08:24:44,598] WARN [Kafka Server 0], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2015-02-20 08:24:44,617] INFO [Kafka Server 0], Remaining partitions to move: [test3,0],[test3,1] (kafka.server.KafkaServer) [2015-02-20 08:24:44,617] INFO [Kafka Server 0], Error code from controller: 0 (kafka.server.KafkaServer) [2015-02-20 08:24:49,620] WARN [Kafka Server 0], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2015-02-20 08:24:49,621] INFO Closing socket connection to /192.168.202.1. 
(kafka.network.Processor) [2015-02-20 08:24:49,621] WARN [Kafka Server 0], Proceeding to do an unclean shutdown as all the controlled shutdown attempts failed (kafka.server.KafkaServer) controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at 
kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at
[jira] [Issue Comment Deleted] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1887: -- Comment: was deleted (was: [~nehanarkhede] Another problem I forgot add in my comment above with this fix i.e moving kafkaHealthCheck below the controller shutdown causing broker controlled shutdown to fail . Here is the log on a 2 node cluster with 2 partitions, 2 replication factor topic. This happens after I shutdown second broker and next shutdown the controller I'll get more details on this. [2015-02-20 08:24:34,548] INFO [Kafka Server 0], Starting controlled shutdown (kafka.server.KafkaServer) [2015-02-20 08:24:34,569] INFO [Kafka Server 0], Remaining partitions to move: [test3,0],[test3,1] (kafka.server.KafkaServer) [2015-02-20 08:24:34,570] INFO [Kafka Server 0], Error code from controller: 0 (kafka.server.KafkaServer) [2015-02-20 08:24:39,575] WARN [Kafka Server 0], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2015-02-20 08:24:39,596] INFO [Kafka Server 0], Remaining partitions to move: [test3,0],[test3,1] (kafka.server.KafkaServer) [2015-02-20 08:24:39,596] INFO [Kafka Server 0], Error code from controller: 0 (kafka.server.KafkaServer) ^C[2015-02-20 08:24:44,598] WARN [Kafka Server 0], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2015-02-20 08:24:44,617] INFO [Kafka Server 0], Remaining partitions to move: [test3,0],[test3,1] (kafka.server.KafkaServer) [2015-02-20 08:24:44,617] INFO [Kafka Server 0], Error code from controller: 0 (kafka.server.KafkaServer) [2015-02-20 08:24:49,620] WARN [Kafka Server 0], Retrying controlled shutdown after the previous attempt failed... (kafka.server.KafkaServer) [2015-02-20 08:24:49,621] INFO Closing socket connection to /192.168.202.1. 
(kafka.network.Processor) [2015-02-20 08:24:49,621] WARN [Kafka Server 0], Proceeding to do an unclean shutdown as all the controlled shutdown attempts failed (kafka.server.KafkaServer) ) controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at 
kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
[jira] [Commented] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329318#comment-14329318 ] Sriharsha Chintalapani commented on KAFKA-1887: --- This can be fixed by moving KafkaHealthcheck.shutdown() after controller.shutdown(), as suggested by [~nehanarkhede] and [~gwenshap]. We also need to move kafkaHealthCheck.start() before controller.start(); otherwise we will see the same error in state-change.log after the broker starts. The patch below causes the unit test run time to go from 9 min 30 sec to 15 min on my machine, and also causes an intermittent test failure in ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown, as producer.send.get gets a NotEnoughReplicasAfterAppendException instead of a NotEnoughReplicasException (probably not related to this patch). I am looking into the test slowness with the patch, but if you have any idea/fix for this please go ahead and take the jira; I don't want to hold up the 0.8.2.1 release. I'll update the jira as soon as I have a fix. 
{code}
+    /* tell everyone we are alive */
+    kafkaHealthcheck = new KafkaHealthcheck(config.brokerId, config.advertisedHostName, config.advertisedPort, config.zkSessionTimeoutMs, zkClient)
+    kafkaHealthcheck.startup()
+
     /* start kafka controller */
     kafkaController = new KafkaController(config, zkClient, brokerState)
     kafkaController.startup()
@@ -152,10 +156,6 @@ class KafkaServer(val config: KafkaConfig, time: Time = SystemTime) extends Logg
     topicConfigManager = new TopicConfigManager(zkClient, logManager)
     topicConfigManager.startup()
 
-    /* tell everyone we are alive */
-    kafkaHealthcheck = new KafkaHealthcheck(config.brokerId, config.advertisedHostName, config.advertisedPort, config.zkSessionTimeoutMs, zkClient)
-    kafkaHealthcheck.startup()
-
     /* register broker metrics */
     registerStats()
@@ -310,8 +310,6 @@ class KafkaServer(val config: KafkaConfig, time: Time = SystemTime) extends Logg
     if (canShutdown) {
       Utils.swallow(controlledShutdown())
       brokerState.newState(BrokerShuttingDown)
-      if(kafkaHealthcheck != null)
-        Utils.swallow(kafkaHealthcheck.shutdown())
       if(socketServer != null)
         Utils.swallow(socketServer.shutdown())
       if(requestHandlerPool != null)
@@ -329,6 +327,8 @@ class KafkaServer(val config: KafkaConfig, time: Time = SystemTime) extends Logg
       Utils.swallow(consumerCoordinator.shutdown())
       if(kafkaController != null)
         Utils.swallow(kafkaController.shutdown())
+      if(kafkaHealthcheck != null)
+        Utils.swallow(kafkaHealthcheck.shutdown())
       if(zkClient != null)
         Utils.swallow(zkClient.close())
{code}
controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 We always see the following error in state-change log on shutting down the last broker. 
[2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
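The shutdown reordering in the patch above can be sketched generically. The following is a hypothetical illustration (the class `OrderedShutdown`, `Step`, and `shutdownAll` are invented names for this sketch, not Kafka's actual code): run the shutdown steps in a fixed order, swallowing failures in the spirit of kafka.utils.Utils.swallow so one failing component cannot abort or stall the rest of the sequence.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of the ordering idea from the patch above: the socket
 * server goes down first, then the controller, and the health check (cluster
 * membership) last. Failures are swallowed, like kafka.utils.Utils.swallow,
 * so later steps still run.
 */
public class OrderedShutdown {

    /** A named shutdown step whose action may throw. */
    static class Step {
        final String name;
        final Runnable action;
        Step(String name, Runnable action) { this.name = name; this.action = action; }
    }

    /**
     * Runs every step in order, swallowing failures and returning the names
     * of the steps that completed cleanly.
     */
    static List<String> shutdownAll(List<Step> steps) {
        List<String> completed = new ArrayList<>();
        for (Step s : steps) {
            try {
                s.action.run();
                completed.add(s.name);
            } catch (RuntimeException e) {
                // Swallowed: log and continue so the broker still shuts down.
                System.out.println("swallowed failure in " + s.name + ": " + e.getMessage());
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        List<Step> order = new ArrayList<>();
        // Order from the patch: socket server, then controller, then health check.
        order.add(new Step("socketServer", () -> {}));
        order.add(new Step("kafkaController", () -> {}));
        order.add(new Step("kafkaHealthcheck", () -> {}));
        System.out.println(shutdownAll(order));
        // prints [socketServer, kafkaController, kafkaHealthcheck]
    }
}
```

The point of swallowing is visible when a middle step throws: the remaining steps still execute, which is exactly why the disconnect exception described in the comments slowed shutdown down rather than aborting it outright.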
[jira] [Updated] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1852: -- Attachment: KAFKA-1852_2015-02-18_13:13:17.patch OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch, KAFKA-1852_2015-02-16_13:21:46.patch, KAFKA-1852_2015-02-18_13:13:17.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326566#comment-14326566 ] Sriharsha Chintalapani commented on KAFKA-1852: --- Updated reviewboard https://reviews.apache.org/r/29912/diff/ against branch origin/trunk OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch, KAFKA-1852_2015-02-16_13:21:46.patch, KAFKA-1852_2015-02-18_13:13:17.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1965) Leaner DelayedItem
[ https://issues.apache.org/jira/browse/KAFKA-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326973#comment-14326973 ] Sriharsha Chintalapani commented on KAFKA-1965: --- [~yasuhiro.matsuda] Changes look good to me. Can you attach it to the review board? Here are the instructions: https://cwiki.apache.org/confluence/display/KAFKA/Patch+submission+and+review Leaner DelayedItem -- Key: KAFKA-1965 URL: https://issues.apache.org/jira/browse/KAFKA-1965 Project: Kafka Issue Type: Improvement Components: purgatory Reporter: Yasuhiro Matsuda Assignee: Joel Koshy Priority: Trivial Attachments: KAFKA-1965.patch In DelayedItem, which is a superclass of DelayedOperation, both the creation timestamp and the length of the delay are stored. However, all it needs is one timestamp: the due time of the item. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
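The "leaner DelayedItem" idea from KAFKA-1965 can be sketched as follows. This is a hypothetical illustration (the class `LeanDelayedItem` and its methods are invented for this sketch, not Kafka's actual code): store only the absolute due time and derive the remaining delay on demand, instead of keeping both a creation timestamp and a delay length.

```java
/**
 * Hypothetical sketch of a leaner delayed item: one stored field (the due
 * time) replaces the creation-timestamp + delay-length pair, since the
 * remaining delay can always be recomputed from "now".
 */
public class LeanDelayedItem implements Comparable<LeanDelayedItem> {
    private final long dueMs;  // absolute due time, the only stored field

    public LeanDelayedItem(long delayMs, long nowMs) {
        long due = nowMs + delayMs;
        // Guard against overflow for very large delays.
        this.dueMs = (delayMs > 0 && due < 0) ? Long.MAX_VALUE : due;
    }

    /** Remaining delay relative to nowMs; non-positive once the item is due. */
    public long remainingMs(long nowMs) {
        return dueMs - nowMs;
    }

    /** Items expire in due-time order, so a delay queue pops the earliest due item first. */
    @Override
    public int compareTo(LeanDelayedItem other) {
        return Long.compare(this.dueMs, other.dueMs);
    }
}
```

This mirrors the contract of `java.util.concurrent.Delayed`, where `getDelay` is computed against the current time rather than stored.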
[jira] [Commented] (KAFKA-1867) liveBroker list not updated on a cluster with no topics
[ https://issues.apache.org/jira/browse/KAFKA-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324659#comment-14324659 ] Sriharsha Chintalapani commented on KAFKA-1867: --- [~nehanarkhede] pinging for a review. Thanks. liveBroker list not updated on a cluster with no topics --- Key: KAFKA-1867 URL: https://issues.apache.org/jira/browse/KAFKA-1867 Project: Kafka Issue Type: Bug Reporter: Jun Rao Assignee: Sriharsha Chintalapani Fix For: 0.8.3 Attachments: KAFKA-1867.patch, KAFKA-1867.patch, KAFKA-1867_2015-01-25_21:07:47.patch Currently, when there is no topic in a cluster, the controller doesn't send any UpdateMetadataRequest to the broker when it starts up. As a result, the liveBroker list in metadataCache is empty. This means that we will return incorrect broker list in TopicMetatadataResponse. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323316#comment-14323316 ] Sriharsha Chintalapani commented on KAFKA-1852: --- Updated reviewboard https://reviews.apache.org/r/29912/diff/ against branch origin/trunk OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch, KAFKA-1852_2015-02-16_13:21:46.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319327#comment-14319327 ] Sriharsha Chintalapani commented on KAFKA-1852: --- Updated reviewboard https://reviews.apache.org/r/29912/diff/ against branch origin/trunk OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1852: -- Attachment: KAFKA-1852_2015-02-12_16:46:10.patch OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319332#comment-14319332 ] Sriharsha Chintalapani commented on KAFKA-1852: --- Thanks [~jjkoshy] for the review. I added your suggestion; please take a look when you get a chance. OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318312#comment-14318312 ] Sriharsha Chintalapani commented on KAFKA-1887: --- [~gwenshap] I am worried about moving kafkaHealthCheck below, as it won't be able to trigger onBrokerFailure in KafkaController. In a cluster this is probably fine, as another controller will get elected. Also, this change causes ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown to fail, as producer.send.get gets a NotEnoughReplicasAfterAppendException instead of a NotEnoughReplicasException. controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. 
Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) at 
kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356) at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568) at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()
[ https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1866: -- Attachment: KAFKA-1866_2015-02-11_09:25:33.patch LogStartOffset gauge throws exceptions after log.delete() - Key: KAFKA-1866 URL: https://issues.apache.org/jira/browse/KAFKA-1866 Project: Kafka Issue Type: Bug Reporter: Gian Merlino Assignee: Sriharsha Chintalapani Attachments: KAFKA-1866.patch, KAFKA-1866_2015-02-10_22:50:09.patch, KAFKA-1866_2015-02-11_09:25:33.patch The LogStartOffset gauge does logSegments.head.baseOffset, which throws NoSuchElementException on an empty list, which can occur after a delete() of the log. This makes life harder for custom MetricsReporters, since they have to deal with .value() possibly throwing an exception. Locally we're dealing with this by having Log.delete() also call removeMetric on all the gauges. That also has the benefit of not having a bunch of metrics floating around for logs that the broker is not actually handling. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
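The failure mode in this report is easy to reproduce in miniature. The sketch below is hypothetical illustration code, not Kafka's implementation: a gauge that reads the head of the segment list (as logSegments.head.baseOffset does) throws once the list is empty, while a guarded variant avoids the exception. Kafka's actual fix, per the description, is to remove the metric in Log.delete(); the guard here just makes the failure mode concrete.

```java
import java.util.LinkedList;
import java.util.NoSuchElementException;

public class LogStartOffsetGaugeSketch {
    // Hypothetical stand-in for kafka.log.LogSegment; only baseOffset matters here.
    static class Segment {
        final long baseOffset;
        Segment(long baseOffset) { this.baseOffset = baseOffset; }
    }

    // Mirrors the buggy gauge: reading the head of an empty segment list
    // throws NoSuchElementException, which a MetricsReporter then sees
    // when it calls value() after Log.delete().
    static long unsafeLogStartOffset(LinkedList<Segment> segments) {
        return segments.getFirst().baseOffset;
    }

    // Guarded variant: returns a sentinel instead of throwing.
    static long safeLogStartOffset(LinkedList<Segment> segments) {
        return segments.isEmpty() ? -1L : segments.getFirst().baseOffset;
    }

    public static void main(String[] args) {
        LinkedList<Segment> segments = new LinkedList<>();
        segments.add(new Segment(42L));
        System.out.println(unsafeLogStartOffset(segments)); // 42
        segments.clear(); // what Log.delete() effectively does to the list
        System.out.println(safeLogStartOffset(segments));   // -1
        try {
            unsafeLogStartOffset(segments);
        } catch (NoSuchElementException e) {
            System.out.println("gauge threw " + e.getClass().getSimpleName());
        }
    }
}
```

Removing the metric on delete, as the reporter did locally, has the extra benefit noted above: no stale gauges linger for logs the broker no longer hosts.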
[jira] [Commented] (KAFKA-1757) Can not delete Topic index on Windows
[ https://issues.apache.org/jira/browse/KAFKA-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316454#comment-14316454 ] Sriharsha Chintalapani commented on KAFKA-1757: --- [~junrao] Can you please review the patch? Thanks. Can not delete Topic index on Windows - Key: KAFKA-1757 URL: https://issues.apache.org/jira/browse/KAFKA-1757 Project: Kafka Issue Type: Bug Components: log Affects Versions: 0.8.2.0 Reporter: Lukáš Vyhlídka Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 Attachments: KAFKA-1757.patch, lucky-v.patch When running Kafka 0.8.2-Beta (Scala 2.10) on Windows, an attempt to delete the Topic threw an error: ERROR [KafkaApi-1] error when handling request Name: StopReplicaRequest; Version: 0; CorrelationId: 38; ClientId: ; DeletePartitions: true; ControllerId: 0; ControllerEpoch: 3; Partitions: [test,0] (kafka.server.KafkaApis) kafka.common.KafkaStorageException: Delete of index .index failed. at kafka.log.LogSegment.delete(LogSegment.scala:283) at kafka.log.Log$$anonfun$delete$1.apply(Log.scala:608) at kafka.log.Log$$anonfun$delete$1.apply(Log.scala:608) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at kafka.log.Log.delete(Log.scala:608) at kafka.log.LogManager.deleteLog(LogManager.scala:375) at kafka.cluster.Partition$$anonfun$delete$1.apply$mcV$sp(Partition.scala:144) at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:139) at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:139) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.utils.Utils$.inWriteLock(Utils.scala:543) at kafka.cluster.Partition.delete(Partition.scala:139) at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:158) at 
kafka.server.ReplicaManager$$anonfun$stopReplicas$3.apply(ReplicaManager.scala:191) at kafka.server.ReplicaManager$$anonfun$stopReplicas$3.apply(ReplicaManager.scala:190) at scala.collection.immutable.Set$Set1.foreach(Set.scala:74) at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:190) at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:96) at kafka.server.KafkaApis.handle(KafkaApis.scala:59) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59) at java.lang.Thread.run(Thread.java:744) When I investigated the issue I figured out that the index file (in my environment it was C:\tmp\kafka-logs\----0014-0\.index) was locked by the kafka process and the OS did not allow that file to be deleted. I tried to fix the problem in the source code, and when I added a close() method call into LogSegment.delete(), the Topic deletion started to work. I will add here (not sure how to upload the file during issue creation) a diff with the changes I have made so you can take a look and judge whether it is reasonable or not. It would be perfect if it could make it into the product... In the end I would like to say that on Linux the deletion works just fine... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
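The essence of the reporter's fix (release the file handle before deleting) can be sketched as follows. All names here are an editor's hypothetical illustration, not Kafka's LogSegment code. On Windows/NTFS, deleting a file that still has an open handle fails, which produces the KafkaStorageException quoted above; on Linux the delete succeeds either way, which is why the bug is Windows-only. The demo uses a temp file so it runs anywhere.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class IndexDeleteSketch {
    // The fix in essence: close first, then delete. The close() call is the
    // one the reporter added to LogSegment.delete(); without it the index
    // file stays locked on Windows and Files.delete() fails.
    static void closeThenDelete(FileChannel channel, Path file) throws IOException {
        channel.close();    // release the OS-level handle
        Files.delete(file); // now succeeds on NTFS as well
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a segment's .index file.
        Path index = Files.createTempFile("segment", ".index");
        FileChannel ch = FileChannel.open(index,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        closeThenDelete(ch, index);
        System.out.println(Files.exists(index)); // false
    }
}
```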
[jira] [Commented] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()
[ https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316572#comment-14316572 ] Sriharsha Chintalapani commented on KAFKA-1866: --- Updated reviewboard https://reviews.apache.org/r/30084/diff/ against branch origin/trunk LogStartOffset gauge throws exceptions after log.delete() - Key: KAFKA-1866 URL: https://issues.apache.org/jira/browse/KAFKA-1866 Project: Kafka Issue Type: Bug Reporter: Gian Merlino Assignee: Sriharsha Chintalapani Attachments: KAFKA-1866.patch, KAFKA-1866_2015-02-10_22:50:09.patch, KAFKA-1866_2015-02-11_09:25:33.patch The LogStartOffset gauge does logSegments.head.baseOffset, which throws NoSuchElementException on an empty list, which can occur after a delete() of the log. This makes life harder for custom MetricsReporters, since they have to deal with .value() possibly throwing an exception. Locally we're dealing with this by having Log.delete() also call removeMetric on all the gauges. That also has the benefit of not having a bunch of metrics floating around for logs that the broker is not actually handling. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1566) Kafka environment configuration (kafka-env.sh)
[ https://issues.apache.org/jira/browse/KAFKA-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316455#comment-14316455 ] Sriharsha Chintalapani commented on KAFKA-1566: --- [~jkreps] [~nehanarkhede] Can you please review this? kafka-env.sh will give users the flexibility of defining a custom java_home. Kafka environment configuration (kafka-env.sh) -- Key: KAFKA-1566 URL: https://issues.apache.org/jira/browse/KAFKA-1566 Project: Kafka Issue Type: Improvement Components: tools Reporter: Cosmin Lehene Assignee: Sriharsha Chintalapani Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1566.patch It would be useful (especially for automated deployments) to have an environment configuration file that could be sourced from the launcher files (e.g. kafka-run-server.sh). This is how it could look:
kafka-env.sh
{code}
export KAFKA_JVM_PERFORMANCE_OPTS='-XX:+UseCompressedOops -XX:+DisableExplicitGC -Djava.awt.headless=true \
-XX:+UseG1GC -XX:PermSize=48m -XX:MaxPermSize=48m -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35'
export KAFKA_HEAP_OPTS='-Xmx1G -Xms1G'
export KAFKA_LOG4J_OPTS=-Dkafka.logs.dir=/var/log/kafka
{code}
kafka-server-start.sh
{code}
...
source $base_dir/config/kafka-env.sh
...
{code}
This approach is consistent with Hadoop and HBase. However the idea here is to be able to set these values in a single place without having to edit startup scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316447#comment-14316447 ] Sriharsha Chintalapani commented on KAFKA-1852: --- [~jjkoshy] pinging for a review. Thanks. OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani reassigned KAFKA-1887: - Assignee: Sriharsha Chintalapani controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at 
kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356) at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568) at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1926) Replace kafka.utils.Utils with o.a.k.common.utils.Utils
[ https://issues.apache.org/jira/browse/KAFKA-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316319#comment-14316319 ] Sriharsha Chintalapani commented on KAFKA-1926: --- [~tongli] you should make the patch against trunk. Replace kafka.utils.Utils with o.a.k.common.utils.Utils --- Key: KAFKA-1926 URL: https://issues.apache.org/jira/browse/KAFKA-1926 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.2.0 Reporter: Jay Kreps Labels: newbie, patch Attachments: KAFKA-1926.patch There is currently a lot of duplication between the Utils class in common and the one in core. Our plan has been to deprecate duplicate code in the server and replace it with the new common code. As such we should evaluate each method in the scala Utils and do one of the following: 1. Migrate it to o.a.k.common.utils.Utils if it is a sensible general purpose utility in active use that is not Kafka-specific. If we migrate it we should really think about the API and make sure there is some test coverage. A few things in there are kind of funky and we shouldn't just blindly copy them over. 2. Create a new class ServerUtils or ScalaUtils in kafka.utils that will hold any utilities that really need to make use of Scala features to be convenient. 3. Delete it if it is not used, or has a bad api. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1887) controller error message on shutting the last broker
[ https://issues.apache.org/jira/browse/KAFKA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317678#comment-14317678 ] Sriharsha Chintalapani commented on KAFKA-1887: --- [~junrao] Shutting down the controller before shutting down the broker is not a good idea, as the broker does a controlledShutdown which sends a ControlledShutdownRequest, and there won't be any controller handling these requests. Moreover, if we don't shut down the broker first and it happens to be the last broker in the cluster, it won't go through onBrokerFailover, which moves the partitions to the offline state. I am not sure there is a good way to fix this, as it does go through the expected steps. Please let me know if you have any ideas. controller error message on shutting the last broker Key: KAFKA-1887 URL: https://issues.apache.org/jira/browse/KAFKA-1887 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Assignee: Sriharsha Chintalapani Priority: Minor Fix For: 0.8.3 We always see the following error in state-change log on shutting down the last broker. [2015-01-20 13:21:04,036] ERROR Controller 0 epoch 3 initiated state change for partition [test,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [test,0] is alive. 
Live brokers are: [Set()], Assigned replicas are: [List(0)] at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75) at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357) at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120) at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117) at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:446) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:373) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) at 
kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356) at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568) at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1926) Replace kafka.utils.Utils with o.a.k.common.utils.Utils
[ https://issues.apache.org/jira/browse/KAFKA-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314920#comment-14314920 ] Sriharsha Chintalapani commented on KAFKA-1926: --- [~tongli] you might want to check these: https://cwiki.apache.org/confluence/display/KAFKA/Patch+submission+and+review and https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool. You can submit the patch using the Kafka patch review tool; it will upload the patch to https://reviews.apache.org and attach a link here on JIRA. Replace kafka.utils.Utils with o.a.k.common.utils.Utils --- Key: KAFKA-1926 URL: https://issues.apache.org/jira/browse/KAFKA-1926 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.2.0 Reporter: Jay Kreps Labels: newbie, patch Attachments: KAFKA-1926-p1.patch There is currently a lot of duplication between the Utils class in common and the one in core. Our plan has been to deprecate duplicate code in the server and replace it with the new common code. As such we should evaluate each method in the scala Utils and do one of the following: 1. Migrate it to o.a.k.common.utils.Utils if it is a sensible general purpose utility in active use that is not Kafka-specific. If we migrate it we should really think about the API and make sure there is some test coverage. A few things in there are kind of funky and we shouldn't just blindly copy them over. 2. Create a new class ServerUtils or ScalaUtils in kafka.utils that will hold any utilities that really need to make use of Scala features to be convenient. 3. Delete it if it is not used, or has a bad api. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()
[ https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315682#comment-14315682 ] Sriharsha Chintalapani commented on KAFKA-1866: --- Updated reviewboard https://reviews.apache.org/r/30084/diff/ against branch origin/trunk LogStartOffset gauge throws exceptions after log.delete() - Key: KAFKA-1866 URL: https://issues.apache.org/jira/browse/KAFKA-1866 Project: Kafka Issue Type: Bug Reporter: Gian Merlino Assignee: Sriharsha Chintalapani Attachments: KAFKA-1866.patch, KAFKA-1866_2015-02-10_22:50:09.patch The LogStartOffset gauge does logSegments.head.baseOffset, which throws NoSuchElementException on an empty list, which can occur after a delete() of the log. This makes life harder for custom MetricsReporters, since they have to deal with .value() possibly throwing an exception. Locally we're dealing with this by having Log.delete() also call removeMetric on all the gauges. That also has the benefit of not having a bunch of metrics floating around for logs that the broker is not actually handling. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()
[ https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1866: -- Attachment: KAFKA-1866_2015-02-10_22:50:09.patch LogStartOffset gauge throws exceptions after log.delete() - Key: KAFKA-1866 URL: https://issues.apache.org/jira/browse/KAFKA-1866 Project: Kafka Issue Type: Bug Reporter: Gian Merlino Assignee: Sriharsha Chintalapani Attachments: KAFKA-1866.patch, KAFKA-1866_2015-02-10_22:50:09.patch The LogStartOffset gauge does logSegments.head.baseOffset, which throws NoSuchElementException on an empty list, which can occur after a delete() of the log. This makes life harder for custom MetricsReporters, since they have to deal with .value() possibly throwing an exception. Locally we're dealing with this by having Log.delete() also call removeMetric on all the gauges. That also has the benefit of not having a bunch of metrics floating around for logs that the broker is not actually handling. -- This message was sent by Atlassian JIRA (v6.3.4#6332)