[jira] [Commented] (KAFKA-3015) Improve JBOD data balancing
[ https://issues.apache.org/jira/browse/KAFKA-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069215#comment-15069215 ] Joe Stein commented on KAFKA-3015:
--

Can we do both of these at the same time, https://cwiki.apache.org/confluence/display/KAFKA/KIP-18+-+JBOD+Support and this, and provide a per-topic option for which approach a topic uses? I haven't looked at KAFKA-2188 in a while; if that is also a good direction for folks, we should talk about picking it back up too. It's a little stale, but with some rebasing, fixes, and reviews it could serve folks who need Kafka brokers to stay up on disk failure without RAID. So there would be at least three parts to it. There may be other items in the "JBOD" realm folks want to work on too.

> Improve JBOD data balancing
> ---
>
> Key: KAFKA-3015
> URL: https://issues.apache.org/jira/browse/KAFKA-3015
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jay Kreps
>
> When running with multiple data directories (i.e. JBOD) we currently place partitions entirely within one data directory. This tends to lead to poor balancing across disks, as some topics have more throughput/retention and not all disks get data from all topics. You can't fix this problem with smarter partition placement strategies, because ultimately you don't know, when a partition is created, when or how heavily it will be used (this is a subtle point, and the tendency is to try to think of some more sophisticated way to place partitions based on current data size, but this is actually exceptionally dangerous and can lead to much worse imbalance when creating many partitions at once, as they would all go to the disk with the least data). We don't support online rebalancing across directories/disks, so this imbalance is a big problem and limits the usefulness of this configuration.
> Implementing online rebalancing of data across disks without downtime is actually quite hard and requires lots of I/O, since you have to actually rewrite full partitions of data.
> An alternative would be to place each partition in *all* directories/drives and round-robin *segments* within the partition across the directories. So the layout would be something like:
> drive-a/mytopic-0/
> 000.data
> 000.index
> 0024680.data
> 0024680.index
> drive-b/mytopic-0/
> 0012345.data
> 0012345.index
> 0036912.data
> 0036912.index
> This is a little harder to implement than the current approach, but not very hard, and it is a lot easier than implementing online data balancing across disks while retaining the current approach. I think this could easily be done in a backwards-compatible way.
> I think the balancing you would get from this in most cases would be good enough to make JBOD the default configuration. Thoughts?

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
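The segment-level round-robin proposed above can be sketched in a few lines. This is an illustrative sketch only; the helper name and directory strings are hypothetical, not Kafka code:

```python
import itertools

def make_segment_dir_picker(data_dirs):
    """Hand out data directories round-robin, so consecutive
    segments of one partition land on different drives."""
    cycle = itertools.cycle(data_dirs)
    return lambda: next(cycle)

# Each segment roll asks the picker where the new segment's files go.
pick = make_segment_dir_picker(["drive-a/mytopic-0", "drive-b/mytopic-0"])
segment_dirs = [pick() for _ in range(4)]
# alternates: drive-a, drive-b, drive-a, drive-b
```

Because placement ignores current disk usage, a burst of newly created partitions cannot all pile onto the emptiest drive, which is the failure mode the description warns about for size-based placement.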
[jira] [Commented] (KAFKA-2079) Support exhibitor
[ https://issues.apache.org/jira/browse/KAFKA-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907723#comment-14907723 ] Joe Stein commented on KAFKA-2079:
--

Not yet. I started a KIP for it: https://cwiki.apache.org/confluence/display/KAFKA/KIP-30+-+Allow+for+brokers+to+have+plug-able+consensus+and+meta+data+storage+sub+systems

I think we need a way to have some pluggable support for different remote interface/system libraries. It would be great for the base Kafka code to continue to support how it works now. The existing code should get refactored and released as kafka-zkclient-connector.jar (or such), still supporting how it works now for folks. We could then have a way of launching other libraries (not unlike what we do for metrics) that implement the remote interfaces:
- async watchers
- leader election
- metadata storage

I don't think we should split the plug-in up; it should contain at least those three base pieces of functionality. Then folks can work on and build out different implementations and/or collaborate on implementations. I still need to write it up some more in the KIP and start a discussion on the mailing list. KAFKA-873 could be another implementation using Curator (or maybe it can get into Exhibitor or something). Other implementations folks have brought up are Akka, Consul and etcd. I think folks can work on, build out and support the different implementations, and the existing Kafka brokers can still work how they do now. At some point in a subsequent release we could replace the project-released jar with something else.

> Support exhibitor
> -
>
> Key: KAFKA-2079
> URL: https://issues.apache.org/jira/browse/KAFKA-2079
> Project: Kafka
> Issue Type: Improvement
> Reporter: Aaron Dixon
>
> Exhibitor (https://github.com/Netflix/exhibitor) is a discovery/monitoring solution for managing Zookeeper clusters.
> It supports use cases like discovery, node replacements and auto-scaling of Zk cluster hosts (so you don't have to manage a fixed set of Zk hosts--especially useful in cloud environments.) The easiest way for Kafka to support connecting to Zk clusters via Exhibitor is to use Curator as its client. There is already a separate ticket for this: KAFKA-873
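The three base pieces of functionality listed in the comment could be grouped behind a connector interface roughly like the following. All names here are hypothetical illustrations, not the KIP-30 API:

```python
from abc import ABC, abstractmethod

class ConsensusConnector(ABC):
    """Hypothetical plug-in surface covering the three base capabilities:
    async watchers, leader election, and metadata storage."""

    @abstractmethod
    def watch(self, path, callback):
        """Register an async watcher for changes under path."""

    @abstractmethod
    def elect_leader(self, group, candidate_id):
        """Try to become leader of group; return the current leader's id."""

    @abstractmethod
    def get_metadata(self, path): ...

    @abstractmethod
    def put_metadata(self, path, value): ...

class InMemoryConnector(ConsensusConnector):
    """Toy implementation standing in for zkclient/Curator/Consul/etcd."""
    def __init__(self):
        self.store, self.leaders = {}, {}
    def watch(self, path, callback):
        pass  # a real backend would register an async watcher here
    def elect_leader(self, group, candidate_id):
        # First candidate wins, mimicking an ephemeral-znode race.
        return self.leaders.setdefault(group, candidate_id)
    def get_metadata(self, path):
        return self.store.get(path)
    def put_metadata(self, path, value):
        self.store[path] = value
```

Any backend (Exhibitor via Curator, Consul, etcd) would then be a drop-in implementation of the same interface, while the default zkclient-based connector keeps today's behavior.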
[jira] [Commented] (KAFKA-2339) broker becomes unavailable if bad data is passed through the protocol
[ https://issues.apache.org/jira/browse/KAFKA-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633756#comment-14633756 ] Joe Stein commented on KAFKA-2339:
--

I haven't had a chance to reproduce this more exactly yet. I will see about doing that in the next day or so.

broker becomes unavailable if bad data is passed through the protocol
-

Key: KAFKA-2339
URL: https://issues.apache.org/jira/browse/KAFKA-2339
Project: Kafka
Issue Type: Bug
Reporter: Joe Stein
Assignee: Timothy Chen
Priority: Critical
Fix For: 0.8.3

I ran into a situation where a non-integer value got passed for the partition and the brokers went bonkers. Reproducible:

{code}
ah=1..2
echo "don't do this in production" | kafkacat -b localhost:9092 -p $ah
{code}
[jira] [Created] (KAFKA-2339) broker becomes unavailable if bad data is passed through the protocol
Joe Stein created KAFKA-2339:

Summary: broker becomes unavailable if bad data is passed through the protocol
Key: KAFKA-2339
URL: https://issues.apache.org/jira/browse/KAFKA-2339
Project: Kafka
Issue Type: Bug
Reporter: Joe Stein
Priority: Critical
Fix For: 0.8.3

I ran into a situation where a non-integer value got passed for the partition and the brokers went bonkers. Reproducible:

{code}
ah=1..2
echo "don't do this in production" | kafkacat -b localhost:9092 -p $ah
{code}
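The broker-side fix amounts to validating the partition field before using it. A minimal sketch of such a guard, illustrative only and not the actual broker code:

```python
def parse_partition_id(raw):
    """Reject partition values that are not non-negative integers,
    rather than letting bad input take the broker down."""
    try:
        pid = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"partition must be an integer, got {raw!r}")
    if pid < 0:
        raise ValueError(f"partition must be >= 0, got {pid}")
    return pid
```

With a guard like this, the literal string "1..2" from the reproduction above (the shell does not expand `1..2` into a range) would be rejected with a clear error instead of destabilizing the broker.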
[jira] [Commented] (KAFKA-2310) Add config to prevent broker becoming controller
[ https://issues.apache.org/jira/browse/KAFKA-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618034#comment-14618034 ] Joe Stein commented on KAFKA-2310:
--

Hey [~jjkoshy], I had suggested to Andrii that it might make sense to make this a new ticket. The old ticket (which I think we should close) had the idea that we wanted to re-elect the controller. That would be problematic for what this is trying to solve, based on what we have seen in the field. E.g., if you have 12 brokers and they are all under heavy load, then providing a way to bounce the controller around isn't going to help if, whichever broker it gets to, that broker can't perform the controller's responsibilities sufficiently. The consensus I have been able to get from ops folks is that separating/isolating the controller onto two brokers (two for redundancy) on lower-end equipment solves the problem fully. Since this is just another config I didn't think it needed a KIP, but honestly I wasn't 100% sure; otherwise I would have already committed this feature. The purpose of the patch for different versions is that I know a bunch of folks who are going to take it for the version of Kafka they are using and start using the feature.

Add config to prevent broker becoming controller

Key: KAFKA-2310
URL: https://issues.apache.org/jira/browse/KAFKA-2310
Project: Kafka
Issue Type: Bug
Reporter: Andrii Biletskyi
Assignee: Andrii Biletskyi
Attachments: KAFKA-2310.patch, KAFKA-2310_0.8.1.patch, KAFKA-2310_0.8.2.patch

The goal is to be able to specify which cluster brokers can serve as a controller and which cannot. This way it will be possible to reserve a particular broker, one not overloaded with partitions and other operations, as the controller.
Proposed to add config _controller.eligibility_ defaulted to true (for backward compatibility, since currently any broker can become a controller).
Patch will be available for trunk, 0.8.2 and 0.8.1
[jira] [Commented] (KAFKA-2304) Support enabling JMX in Kafka Vagrantfile
[ https://issues.apache.org/jira/browse/KAFKA-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616979#comment-14616979 ] Joe Stein commented on KAFKA-2304:
--

Thanks [~ewencp] for the review, will take a look later tonight and commit if good to go.

Support enabling JMX in Kafka Vagrantfile
-

Key: KAFKA-2304
URL: https://issues.apache.org/jira/browse/KAFKA-2304
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.3
Reporter: Stevo Slavic
Assignee: Joe Stein
Priority: Minor
Fix For: 0.8.3
Attachments: KAFKA-2304-JMX.patch, KAFKA-2304-JMX.patch
[jira] [Updated] (KAFKA-2304) Support enabling JMX in Kafka Vagrantfile
[ https://issues.apache.org/jira/browse/KAFKA-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2304:
-
Resolution: Fixed
Status: Resolved (was: Patch Available)

Pushed to trunk, thanks for the patch and review.

Support enabling JMX in Kafka Vagrantfile
-

Key: KAFKA-2304
URL: https://issues.apache.org/jira/browse/KAFKA-2304
Project: Kafka
Issue Type: Bug
Reporter: Stevo Slavic
Assignee: Stevo Slavic
Priority: Minor
Fix For: 0.8.3
Attachments: KAFKA-2304-JMX.patch, KAFKA-2304-JMX.patch
[jira] [Comment Edited] (KAFKA-2304) Support enabling JMX in Kafka Vagrantfile
[ https://issues.apache.org/jira/browse/KAFKA-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616979#comment-14616979 ] Joe Stein edited comment on KAFKA-2304 at 7/7/15 5:01 PM:
--

Thanks [~ewencp] for the review, will take a look now and commit if good to go.

was (Author: joestein): Thanks [~ewencp] for the review, will take a look later tonight and commit if good to go

Support enabling JMX in Kafka Vagrantfile
-

Key: KAFKA-2304
URL: https://issues.apache.org/jira/browse/KAFKA-2304
Project: Kafka
Issue Type: Bug
Reporter: Stevo Slavic
Assignee: Stevo Slavic
Priority: Minor
Fix For: 0.8.3
Attachments: KAFKA-2304-JMX.patch, KAFKA-2304-JMX.patch
[jira] [Updated] (KAFKA-2304) Support enabling JMX in Kafka Vagrantfile
[ https://issues.apache.org/jira/browse/KAFKA-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2304:
-
Affects Version/s: (was: 0.8.3)
Reviewer: Ewen Cheslack-Postava

Support enabling JMX in Kafka Vagrantfile
-

Key: KAFKA-2304
URL: https://issues.apache.org/jira/browse/KAFKA-2304
Project: Kafka
Issue Type: Bug
Reporter: Stevo Slavic
Assignee: Stevo Slavic
Priority: Minor
Fix For: 0.8.3
Attachments: KAFKA-2304-JMX.patch, KAFKA-2304-JMX.patch
[jira] [Updated] (KAFKA-2304) Support enabling JMX in Kafka Vagrantfile
[ https://issues.apache.org/jira/browse/KAFKA-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2304:
-
Fix Version/s: 0.8.3

Support enabling JMX in Kafka Vagrantfile
-

Key: KAFKA-2304
URL: https://issues.apache.org/jira/browse/KAFKA-2304
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.3
Reporter: Stevo Slavic
Assignee: Stevo Slavic
Priority: Minor
Fix For: 0.8.3
Attachments: KAFKA-2304-JMX.patch, KAFKA-2304-JMX.patch
[jira] [Updated] (KAFKA-2304) Support enabling JMX in Kafka Vagrantfile
[ https://issues.apache.org/jira/browse/KAFKA-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2304:
-
Assignee: Stevo Slavic (was: Joe Stein)

Support enabling JMX in Kafka Vagrantfile
-

Key: KAFKA-2304
URL: https://issues.apache.org/jira/browse/KAFKA-2304
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.3
Reporter: Stevo Slavic
Assignee: Stevo Slavic
Priority: Minor
Fix For: 0.8.3
Attachments: KAFKA-2304-JMX.patch, KAFKA-2304-JMX.patch
[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka
[ https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605003#comment-14605003 ] Joe Stein commented on KAFKA-1173:
--

Can you create a new ticket please, thanks.

Using Vagrant to get up and running with Apache Kafka
-

Key: KAFKA-1173
URL: https://issues.apache.org/jira/browse/KAFKA-1173
Project: Kafka
Issue Type: Improvement
Reporter: Joe Stein
Assignee: Ewen Cheslack-Postava
Fix For: 0.8.3
Attachments: KAFKA-1173-JMX.patch, KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, KAFKA-1173_2014-11-11_13:50:55.patch, KAFKA-1173_2014-11-12_11:32:09.patch, KAFKA-1173_2014-11-18_16:01:33.patch

Vagrant has been getting a lot of pickup in the tech communities. I have found it very useful for development and testing, and am working with a few clients now using it to help virtualize their environments in repeatable ways. Using Vagrant to get up and running: for 0.8.0 I have a patch on github https://github.com/stealthly/kafka

1) Install Vagrant: http://www.vagrantup.com/
2) Install VirtualBox: https://www.virtualbox.org/

In the main kafka folder:

1) ./sbt update
2) ./sbt package
3) ./sbt assembly-package-dependency
4) vagrant up

Once this is done:
* Zookeeper will be running on 192.168.50.5
* Broker 1 on 192.168.50.10
* Broker 2 on 192.168.50.20
* Broker 3 on 192.168.50.30

When you are all up and running you will be back at a command prompt. If you want, you can log in to the machines using vagrant ssh machineName, but you don't need to. You can access the brokers and zookeeper by their IP, e.g.

bin/kafka-console-producer.sh --broker-list 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox --from-beginning
[jira] [Updated] (KAFKA-2254) The shell script should be optimized , even kafka-run-class.sh has a syntax error.
[ https://issues.apache.org/jira/browse/KAFKA-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2254:
-
Fix Version/s: (was: 0.8.2.1) 0.8.3

The shell script should be optimized, even kafka-run-class.sh has a syntax error.
--

Key: KAFKA-2254
URL: https://issues.apache.org/jira/browse/KAFKA-2254
Project: Kafka
Issue Type: Bug
Components: build
Affects Versions: 0.8.2.1
Environment: linux
Reporter: Bo Wang
Labels: client-script, kafka-run-class.sh, shell-script
Fix For: 0.8.3
Attachments: kafka-shell-script.patch
Original Estimate: 24h
Remaining Estimate: 24h

kafka-run-class.sh line 128 has a syntax error (missing a space before the closing bracket):

{code}
127   -loggc)
128     if [ -z $KAFKA_GC_LOG_OPTS] ; then
129       GC_LOG_ENABLED=true
130     fi
{code}

Also, running ShellCheck against the shell scripts shows some errors, warnings and notes:
https://github.com/koalaman/shellcheck/wiki/SC2068
https://github.com/koalaman/shellcheck/wiki/Sc2046
https://github.com/koalaman/shellcheck/wiki/Sc2086
[jira] [Resolved] (KAFKA-1000) Inbuilt consumer offset management feature for kakfa
[ https://issues.apache.org/jira/browse/KAFKA-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein resolved KAFKA-1000.
--
Resolution: Fixed

Inbuilt consumer offset management feature for Kafka

Key: KAFKA-1000
URL: https://issues.apache.org/jira/browse/KAFKA-1000
Project: Kafka
Issue Type: New Feature
Components: consumer
Affects Versions: 0.8.1
Reporter: Tejas Patil
Assignee: Tejas Patil
Priority: Minor
Labels: features
Fix For: 0.8.2.0

Kafka currently stores offsets in zookeeper. This is a problem for several reasons. First, it means the consumer must embed the zookeeper client, which is not available in all languages. Secondly, offset commits are actually quite frequent and Zookeeper does not scale well for this kind of high-write load. This Jira is for tracking phase #2 of Offset Management [0]. Joel and I have been working on this. [1] is the overall design of the feature.
[0]: https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management
[1]: https://cwiki.apache.org/confluence/display/KAFKA/Inbuilt+Consumer+Offset+Management
[jira] [Updated] (KAFKA-2207) The testCannotSendToInternalTopic test method in ProducerFailureHandlingTest fails consistently with the following exception:
[ https://issues.apache.org/jira/browse/KAFKA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2207:
-
Fix Version/s: (was: 0.8.2.1) 0.8.3

The testCannotSendToInternalTopic test method in ProducerFailureHandlingTest fails consistently with the following exception:
-

Key: KAFKA-2207
URL: https://issues.apache.org/jira/browse/KAFKA-2207
Project: Kafka
Issue Type: Bug
Reporter: Deepthi
Fix For: 0.8.3
Attachments: KAFKA-2207.patch

{code}
kafka.api.ProducerFailureHandlingTest > testCannotSendToInternalTopic FAILED
    java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 3000 ms.
        at org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:437)
        at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:352)
        at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:248)
        at kafka.api.ProducerFailureHandlingTest.testCannotSendToInternalTopic(ProducerFailureHandlingTest.scala:309)
    Caused by: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 3000 ms.
{code}

The following attached patch has resolved the issue.
[jira] [Updated] (KAFKA-1326) New consumer checklist
[ https://issues.apache.org/jira/browse/KAFKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1326:
-
Fix Version/s: 0.8.3

New consumer checklist
--

Key: KAFKA-1326
URL: https://issues.apache.org/jira/browse/KAFKA-1326
Project: Kafka
Issue Type: New Feature
Components: consumer
Affects Versions: 0.9.0
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Labels: feature
Fix For: 0.8.3

We will use this JIRA to track the list of issues to resolve to get a working new consumer client. The consumer client can work in phases:
1. Add new consumer APIs and configs
2. Refactor Sender. We will need to use some common APIs from Sender.java (https://issues.apache.org/jira/browse/KAFKA-1316)
3. Add metadata fetch and refresh functionality to the consumer (this will require https://issues.apache.org/jira/browse/KAFKA-1316)
4. Add functionality to support subscribe(TopicPartition... partitions). This will add SimpleConsumer functionality to the new consumer. This does not include any group management related work.
5. Add ability to commit offsets to Kafka. This will include adding functionality to the commit()/commitAsync()/committed() APIs. This still does not include any group management related work.
6. Add functionality to the offsetsBeforeTime() API.
7. Add consumer co-ordinator election to the server. This will only add a new module for the consumer co-ordinator, but not necessarily all the logic to do group management. At this point, we will have a fully functional standalone consumer and a server-side co-ordinator module. This will be a good time to start adding group management functionality to the server and consumer.
8. Add failure detection capability to the consumer when group management is used. This will not include any rebalancing logic, just the ability to detect failures using session.timeout.ms.
9. Add rebalancing logic to the server and consumer. This will be a tricky and potentially large change since it will involve implementing the group management protocol.
10. Add system tests for the new consumer
11. Add metrics
12. Convert mirror maker to use the new consumer.
13. Convert perf test to use the new consumer
14. Performance testing and analysis.
15. Review and fine tune log4j logging
[jira] [Commented] (KAFKA-2161) Fix a few copyrights
[ https://issues.apache.org/jira/browse/KAFKA-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571985#comment-14571985 ] Joe Stein commented on KAFKA-2161:
--

We stopped running rat when we moved to gradle (https://github.com/apache/kafka/blob/trunk/build.gradle#L44) in 0.8.1. We should add running rat again for a release. I don't think we need to put the script back into the repo to do that, though. I never used that script; for 0.8.0 and below I always did java -jar ../../apache-rat-0.8/apache-rat-0.8.jar in the release steps https://cwiki.apache.org/confluence/display/KAFKA/Release+Process

Fix a few copyrights

Key: KAFKA-2161
URL: https://issues.apache.org/jira/browse/KAFKA-2161
Project: Kafka
Issue Type: Bug
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Trivial
Attachments: KAFKA-2161.patch

I noticed that I accidentally let some incorrect copyright headers slip in with the KAFKA-1501 patch.
[jira] [Commented] (KAFKA-2161) Fix a few copyrights
[ https://issues.apache.org/jira/browse/KAFKA-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572045#comment-14572045 ] Joe Stein commented on KAFKA-2161:
--

[~ewencp] that would be great.

Fix a few copyrights

Key: KAFKA-2161
URL: https://issues.apache.org/jira/browse/KAFKA-2161
Project: Kafka
Issue Type: Bug
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Trivial
Attachments: KAFKA-2161.patch

I noticed that I accidentally let some incorrect copyright headers slip in with the KAFKA-1501 patch.
[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function
[ https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565914#comment-14565914 ] Joe Stein commented on KAFKA-1778:
--

Hey, sorry for the late reply. I have now seen, on a few dozen clusters, situations where the broker gets into a state where the controller is hung, and the only recourse is to either delete the znode from Zookeeper (/controller) to force a re-election or shut down the broker. In the former case I have seen one situation where the entire cluster went down. I am fairly certain this was because of the version of Zookeeper they were running (3.4.5); however, I haven't ever tried to reproduce it. In the latter case, many folks don't want to shut down the broker because they are in high-traffic situations and doing so could be a lot worse than the controller not working... sometimes that changes and they shut the broker down so the controller can fail over and their partition reassignment can continue to the new brokers they just launched (as an example).

So, originally we were thinking of fixing this by having an admin call that could safely trigger another leader election. We have been finding, though, that just having the broker start without it ever being able to be the controller (can.be.controller = false) is preferable in *a lot* of cases. This way there are brokers that will never be the controller and some that could, and of the brokers that could, one of them would.

~ Joe Stein

Create new re-elect controller admin function
-

Key: KAFKA-1778
URL: https://issues.apache.org/jira/browse/KAFKA-1778
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Abhishek Nigam
Fix For: 0.8.3

kafka --controller --elect
[jira] [Created] (KAFKA-2218) reassignment tool needs to parse and validate the json
Joe Stein created KAFKA-2218:

Summary: reassignment tool needs to parse and validate the json
Key: KAFKA-2218
URL: https://issues.apache.org/jira/browse/KAFKA-2218
Project: Kafka
Issue Type: Bug
Reporter: Joe Stein
Priority: Critical
Fix For: 0.8.3

Ran into a production issue with the broker.id being set to a string instead of an integer; the controller had nothing in the log and stayed stuck. Eventually we saw this in the logs of the brokers:

{code}
[2015-05-23 15:41:05,863] 67396362 [ZkClient-EventThread-14-ERROR org.I0Itec.zkclient.ZkEventThread - Error handling event ZkEvent[Data of /admin/reassign_partitions changed sent to kafka.controller.PartitionsReassignedListener@78c6aab8]
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
    at scala.runtime.BoxesRunTime.unboxToInt(Unknown Source)
    at kafka.controller.KafkaController$$anonfun$4.apply(KafkaController.scala:579)
{code}

We then had to delete the znode from zookeeper (/admin/reassign_partitions), fix the json, and try it again.
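The fix the ticket asks for is to validate the JSON before it is written to /admin/reassign_partitions. A rough sketch of such a check (the function name is hypothetical; the JSON shape follows the reassignment tool's partitions/replicas format):

```python
import json

def validate_reassignment(payload):
    """Fail fast if any replica (broker id) is not an integer, instead of
    letting the controller hit a ClassCastException later."""
    doc = json.loads(payload)
    for entry in doc.get("partitions", []):
        for replica in entry.get("replicas", []):
            if not isinstance(replica, int):
                raise ValueError(
                    f"broker id {replica!r} for {entry['topic']}-"
                    f"{entry['partition']} must be an integer")
    return doc

# A quoted broker id like "2" is exactly the production mistake above.
bad = '{"partitions": [{"topic": "t1", "partition": 0, "replicas": [1, "2"]}]}'
```

Rejecting the input in the tool keeps the bad data out of ZooKeeper entirely, so the controller never sees it.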
[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function
[ https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547838#comment-14547838 ] Joe Stein commented on KAFKA-1778:
--

I was thinking that the broker, when starting up, would have another property:

can.be.controller=false || can.be.controller=true

If a broker has this value set to true, then it can be the controller and the thread for the KafkaController starts up; else it doesn't. Should be a few lines of change in KafkaServer plus the config modification.

Create new re-elect controller admin function
-

Key: KAFKA-1778
URL: https://issues.apache.org/jira/browse/KAFKA-1778
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Abhishek Nigam
Fix For: 0.8.3

kafka --controller --elect
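The proposed property boils down to a startup-time gate. A Python sketch of the idea, not the actual KafkaServer change:

```python
def start_broker_services(config):
    """Start core services; only eligible brokers get a controller module.
    Defaults to true so existing deployments keep today's behavior."""
    services = ["log-manager", "replica-manager"]
    if config.get("can.be.controller", "true") == "true":
        # Only now does this broker participate in controller election.
        services.append("kafka-controller")
    return services
```

A broker started with can.be.controller=false never spins up the controller module, so it can never win the election, which matches the deployment pattern of reserving a couple of dedicated controller-eligible brokers.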
[jira] [Created] (KAFKA-2180) topics never create on brokers though it succeeds in tool and is in zookeeper
Joe Stein created KAFKA-2180:

Summary: topics never create on brokers though it succeeds in tool and is in zookeeper
Key: KAFKA-2180
URL: https://issues.apache.org/jira/browse/KAFKA-2180
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.1.2
Reporter: Joe Stein
Priority: Critical
Fix For: 0.8.3

Ran into an issue with a 0.8.2.1 cluster where creating a topic succeeded when running bin/kafka-topics.sh --create and was seen in zookeeper, but the brokers never got updated. We ended up fixing this by deleting the /controller znode so a controller leader election would result. We really should have some better way to make the controller fail over (KAFKA-1778) than rmr /controller in the zookeeper shell.
[jira] [Created] (KAFKA-2179) no graceful nor fast way to shutdown every broker without killing them
Joe Stein created KAFKA-2179:

Summary: no graceful nor fast way to shutdown every broker without killing them
Key: KAFKA-2179
URL: https://issues.apache.org/jira/browse/KAFKA-2179
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.1.2
Reporter: Joe Stein
Priority: Minor
Fix For: 0.8.3

If you do a controlled shutdown of every broker at the same time, the controlled shutdown process spins out of control. No leader can go anywhere because every broker is trying to do a controlled shutdown of itself. The result is that the brokers take a long (variable) time before they eventually do actually shut down.
[jira] [Commented] (KAFKA-2132) Move Log4J appender to clients module
[ https://issues.apache.org/jira/browse/KAFKA-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513110#comment-14513110 ] Joe Stein commented on KAFKA-2132:
--

+1 -- make a stand-alone module called log4j/ that has this one class, and move the class to Java so people don't have the Scala dependency.

Move Log4J appender to clients module
-

Key: KAFKA-2132
URL: https://issues.apache.org/jira/browse/KAFKA-2132
Project: Kafka
Issue Type: Improvement
Reporter: Gwen Shapira
Assignee: Ashish K Singh

The Log4j appender is just a producer. Since we have a new producer in the clients module, there is no need to keep the Log4j appender in core and force people to package all of Kafka with their apps. Let's move the Log4jAppender to the clients module.
[jira] [Commented] (KAFKA-2132) Move Log4J appender to clients module
[ https://issues.apache.org/jira/browse/KAFKA-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505370#comment-14505370 ] Joe Stein commented on KAFKA-2132:
--

Can we put it in the new admin client tools jar that KAFKA-1694 is creating? tools/src/main/java/org/apache/kafka/loggers/KafkaLog4JAppenderBasic.java or something... That is all Java code, and I think the Log4j appender being in Java would be preferable.

Move Log4J appender to clients module
-

Key: KAFKA-2132
URL: https://issues.apache.org/jira/browse/KAFKA-2132
Project: Kafka
Issue Type: Improvement
Reporter: Gwen Shapira
Assignee: Ashish K Singh

The Log4j appender is just a producer. Since we have a new producer in the clients module, there is no need to keep the Log4j appender in core and force people to package all of Kafka with their apps. Let's move the Log4jAppender to the clients module.
[jira] [Commented] (KAFKA-2079) Support exhibitor
[ https://issues.apache.org/jira/browse/KAFKA-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391680#comment-14391680 ] Joe Stein commented on KAFKA-2079:
--

I think if we do this (which we should), the work should also go into separating out the metadata storage and consensus service (async watchers and leader election). Once that is pluggable, you can use Exhibitor, Akka, Consul, Zookeeper, etc., whatever folks like to do. From there I think the project can either adopt support for the Exhibitor version or stick to zkclient for out-of-the-box support. Zkclient has worked, and we should just abstract around it to reduce the most risk in code changes. Once it's pluggable, Exhibitor can be used too by folks who want to build and support it.

Support exhibitor
-

Key: KAFKA-2079
URL: https://issues.apache.org/jira/browse/KAFKA-2079
Project: Kafka
Issue Type: Improvement
Reporter: Aaron Dixon

Exhibitor (https://github.com/Netflix/exhibitor) is a discovery/monitoring solution for managing Zookeeper clusters. It supports use cases like discovery, node replacements and auto-scaling of Zk cluster hosts (so you don't have to manage a fixed set of Zk hosts--especially useful in cloud environments.) The easiest way for Kafka to support connecting to Zk clusters via Exhibitor is to use Curator as its client. There is already a separate ticket for this: KAFKA-873
[jira] [Updated] (KAFKA-1856) Add PreCommit Patch Testing
[ https://issues.apache.org/jira/browse/KAFKA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1856: - Resolution: Fixed Fix Version/s: 0.8.3 Status: Resolved (was: Patch Available) This still requires the Jenkins build to get updated with the job I created https://builds.apache.org/job/KafkaPreCommit/rename?newName=PreCommit-Kafka and I will ping INFRA about getting that connected. Thanks! Add PreCommit Patch Testing --- Key: KAFKA-1856 URL: https://issues.apache.org/jira/browse/KAFKA-1856 Project: Kafka Issue Type: Task Reporter: Ashish K Singh Assignee: Ashish K Singh Fix For: 0.8.3 Attachments: KAFKA-1845.result.txt, KAFKA-1856.patch, KAFKA-1856_2015-01-18_21:43:56.patch, KAFKA-1856_2015-02-04_14:57:05.patch, KAFKA-1856_2015-02-04_15:44:47.patch h1. Kafka PreCommit Patch Testing - *Don't wait for it to break* h2. Motivation *With great power comes great responsibility* - Uncle Ben. As the Kafka user list grows, a mechanism to ensure product quality is required. Quality becomes hard to measure and maintain in an open source project because of a wide community of contributors. Luckily, Kafka is not the first open source project and can benefit from the experience of prior projects. PreCommit tests are the tests that are run for each patch that gets attached to an open JIRA. Based on the test results, the test execution framework (test bot) gives the patch a +1 or -1. Having PreCommit tests takes the load off committers to look at or test each patch. h2. Tests in Kafka h3. Unit and Integration Tests [Unit and Integration tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Unit+and+Integration+Tests] are cardinal to helping contributors avoid breaking existing functionality while adding new functionality or fixing older ones. These tests, at least the ones relevant to the changes, must be run by contributors before attaching a patch to a JIRA. h3. 
System Tests [System tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+System+Tests] are much broader tests that, unlike unit tests, focus on end-to-end scenarios rather than a specific method or class. h2. Apache PreCommit tests Apache provides a mechanism to automatically build a project and run a series of tests whenever a patch is uploaded to a JIRA. Based on the test execution, the test framework will comment with a +1 or -1 on the JIRA. You can read more about the framework here: http://wiki.apache.org/general/PreCommitBuilds h2. Plan # Create a test-patch.py script (similar to the one used in Flume, Sqoop and other projects) that will take a JIRA as a parameter, apply the patch on the appropriate branch, build the project, run tests and report results. This script should be committed into the Kafka code-base. To begin with, this will only run unit tests. We can add code sanity checks, system_tests, etc. in the future. # Create a Jenkins job for running the test (as described in http://wiki.apache.org/general/PreCommitBuilds) and validate that it works manually. This must be done by a committer with Jenkins access. # Ask someone with access to https://builds.apache.org/job/PreCommit-Admin/ to add Kafka to the list of projects PreCommit-Admin triggers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1856) Add PreCommit Patch Testing
[ https://issues.apache.org/jira/browse/KAFKA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14379331#comment-14379331 ] Joe Stein commented on KAFKA-1856: -- Testing file [KAFKA-1856_2015-02-04_15%3A44%3A47.patch|https://issues.apache.org/jira/secure/attachment/12696611/KAFKA-1856_2015-02-04_15%3A44%3A47.patch] against branch trunk took 0:25:20.725178. {color:red}Overall:{color} -1 due to 2 errors {color:red}ERROR:{color} Some unit tests failed (report) {color:red}ERROR:{color} Failed unit test: {{unit.kafka.consumer.PartitionAssignorTest testRangePartitionAssignor FAILED }} {color:green}SUCCESS:{color} Gradle bootstrap was successful {color:green}SUCCESS:{color} Clean was successful {color:green}SUCCESS:{color} Patch applied correctly {color:green}SUCCESS:{color} Patch add/modify test case {color:green}SUCCESS:{color} Gradle bootstrap was successful {color:green}SUCCESS:{color} Patch compiled {color:green}SUCCESS:{color} Checked style for Main {color:green}SUCCESS:{color} Checked style for Test This message is automatically generated. Add PreCommit Patch Testing --- Key: KAFKA-1856 URL: https://issues.apache.org/jira/browse/KAFKA-1856 Project: Kafka Issue Type: Task Reporter: Ashish K Singh Assignee: Ashish K Singh Attachments: KAFKA-1845.result.txt, KAFKA-1856.patch, KAFKA-1856_2015-01-18_21:43:56.patch, KAFKA-1856_2015-02-04_14:57:05.patch, KAFKA-1856_2015-02-04_15:44:47.patch h1. Kafka PreCommit Patch Testing - *Don't wait for it to break* h2. Motivation *With great power comes great responsibility* - Uncle Ben. As the Kafka user list grows, a mechanism to ensure product quality is required. Quality becomes hard to measure and maintain in an open source project because of a wide community of contributors. Luckily, Kafka is not the first open source project and can benefit from the experience of prior projects. PreCommit tests are the tests that are run for each patch that gets attached to an open JIRA. 
Based on the test results, the test execution framework (test bot) gives the patch a +1 or -1. Having PreCommit tests takes the load off committers to look at or test each patch. h2. Tests in Kafka h3. Unit and Integration Tests [Unit and Integration tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Unit+and+Integration+Tests] are cardinal to helping contributors avoid breaking existing functionality while adding new functionality or fixing older ones. These tests, at least the ones relevant to the changes, must be run by contributors before attaching a patch to a JIRA. h3. System Tests [System tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+System+Tests] are much broader tests that, unlike unit tests, focus on end-to-end scenarios rather than a specific method or class. h2. Apache PreCommit tests Apache provides a mechanism to automatically build a project and run a series of tests whenever a patch is uploaded to a JIRA. Based on the test execution, the test framework will comment with a +1 or -1 on the JIRA. You can read more about the framework here: http://wiki.apache.org/general/PreCommitBuilds h2. Plan # Create a test-patch.py script (similar to the one used in Flume, Sqoop and other projects) that will take a JIRA as a parameter, apply the patch on the appropriate branch, build the project, run tests and report results. This script should be committed into the Kafka code-base. To begin with, this will only run unit tests. We can add code sanity checks, system_tests, etc. in the future. # Create a Jenkins job for running the test (as described in http://wiki.apache.org/general/PreCommitBuilds) and validate that it works manually. This must be done by a committer with Jenkins access. # Ask someone with access to https://builds.apache.org/job/PreCommit-Admin/ to add Kafka to the list of projects PreCommit-Admin triggers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1912) Create a simple request re-routing facility
[ https://issues.apache.org/jira/browse/KAFKA-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1912: - Fix Version/s: 0.8.3 Create a simple request re-routing facility --- Key: KAFKA-1912 URL: https://issues.apache.org/jira/browse/KAFKA-1912 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps Fix For: 0.8.3 We are accumulating a lot of requests that have to be directed to the correct server. This makes sense for high volume produce or fetch requests. But it is silly to put the extra burden on the client for the many miscellaneous requests such as fetching or committing offsets and so on. This adds a ton of practical complexity to the clients with little or no payoff in performance. We should add a generic request-type agnostic re-routing facility on the server. This would allow any server to accept a request and forward it to the correct destination, proxying the response back to the user. Naturally it needs to do this without blocking the thread. The result is that a client implementation can choose to be optimally efficient and manage a local cache of cluster state and attempt to always direct its requests to the proper server OR it can choose simplicity and just send things all to a single host and let that host figure out where to forward it. I actually think we should implement this more or less across the board, but some requests such as produce and fetch require more logic to proxy since they have to be scattered out to multiple servers and gathered back to create the response. So these could be done in a second phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1927) Replace requests in kafka.api with requests in org.apache.kafka.common.requests
[ https://issues.apache.org/jira/browse/KAFKA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1927: - Fix Version/s: 0.8.3 Replace requests in kafka.api with requests in org.apache.kafka.common.requests --- Key: KAFKA-1927 URL: https://issues.apache.org/jira/browse/KAFKA-1927 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps Assignee: Gwen Shapira Fix For: 0.8.3 The common package introduced a better way of defining requests using a new protocol definition DSL and also includes wrapper objects for these. We should switch KafkaApis over to use these request definitions and consider the scala classes deprecated (we probably need to retain some of them for a while for the scala clients). This will be a big improvement because 1. We will have each request now defined in only one place (Protocol.java) 2. We will have built-in support for multi-version requests 3. We will have much better error messages (no more cryptic underflow errors) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KAFKA-1927) Replace requests in kafka.api with requests in org.apache.kafka.common.requests
[ https://issues.apache.org/jira/browse/KAFKA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein reassigned KAFKA-1927: Assignee: Gwen Shapira Replace requests in kafka.api with requests in org.apache.kafka.common.requests --- Key: KAFKA-1927 URL: https://issues.apache.org/jira/browse/KAFKA-1927 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps Assignee: Gwen Shapira Fix For: 0.8.3 The common package introduced a better way of defining requests using a new protocol definition DSL and also includes wrapper objects for these. We should switch KafkaApis over to use these request definitions and consider the scala classes deprecated (we probably need to retain some of them for a while for the scala clients). This will be a big improvement because 1. We will have each request now defined in only one place (Protocol.java) 2. We will have built-in support for multi-version requests 3. We will have much better error messages (no more cryptic underflow errors) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2028) Unable to start the ZK instance after myid file was missing and had to recreate it.
[ https://issues.apache.org/jira/browse/KAFKA-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366349#comment-14366349 ] Joe Stein commented on KAFKA-2028: -- The issue is likely related to using the /tmp directory (dataDir=/tmp/zookeeper). You should not use the /tmp directory for ZooKeeper or Kafka data (check log.dirs in your server.properties). What could have happened is a reboot, in which case the OS deletes everything in /tmp. Unable to start the ZK instance after myid file was missing and had to recreate it. --- Key: KAFKA-2028 URL: https://issues.apache.org/jira/browse/KAFKA-2028 Project: Kafka Issue Type: Bug Components: admin Affects Versions: 0.8.1.1 Environment: Non Prod Reporter: InduR Created a Dev 3 node cluster environment in Jan and the environment has been up and running without any issues until a few days ago. The Kafka server stopped running but the ZK listener was up. Noticed that the myid file was missing on all 3 servers. Recreating the file while ZK was still running did not help. Stopped all of the ZK/Kafka server instances and saw the following error when starting ZK. kafka_2.10-0.8.1.1 OS: RHEL [root@lablx0025 bin]# ./zookeeper-server-start.sh ../config/zookeeper.properties [1] 31053 [* bin]# [2015-03-17 15:04:33,876] INFO Reading configuration from: ../config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig) [2015-03-17 15:04:33,885] INFO Defaulting to majority quorums (org.apache.zookeeper.server.quorum.QuorumPeerConfig) [2015-03-17 15:04:33,911] DEBUG preRegister called. 
Server=com.sun.jmx.mbeanserver.JmxMBeanServer@4891d863, name=log4j:logger=kafka (kafka) [2015-03-17 15:04:33,915] INFO Starting quorum peer (org.apache.zookeeper.server.quorum.QuorumPeerMain) [2015-03-17 15:04:33,940] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxn) [2015-03-17 15:04:33,966] INFO tickTime set to 3000 (org.apache.zookeeper.server.quorum.QuorumPeer) [2015-03-17 15:04:33,966] INFO minSessionTimeout set to -1 (org.apache.zookeeper.server.quorum.QuorumPeer) [2015-03-17 15:04:33,966] INFO maxSessionTimeout set to -1 (org.apache.zookeeper.server.quorum.QuorumPeer) [2015-03-17 15:04:33,966] INFO initLimit set to 5 (org.apache.zookeeper.server.quorum.QuorumPeer) [2015-03-17 15:04:34,023] ERROR Failed to increment parent cversion for: /consumers/console-consumer-6249/offsets/test (org.apache.zookeeper.server.persistence.FileTxnSnapLog) org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/console-consumer-6249/offsets/test at org.apache.zookeeper.server.DataTree.incrementCversion(DataTree.java:1218) at org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:222) at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:150) at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:222) at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:398) at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:143) at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:103) at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:76) [2015-03-17 15:04:34,027] FATAL Unable to load database on disk (org.apache.zookeeper.server.quorum.QuorumPeer) java.io.IOException: Failed to process transaction type: 2 error: KeeperErrorCode = NoNode for /consumers/console-consumer-6249/offsets/test at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:152) at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:222) at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:398) at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:143) at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:103) at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:76) [2015-03-17 15:04:34,027] FATAL Unexpected exception, exiting abnormally
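The remedy suggested in the comment above, moving ZooKeeper data off /tmp and recreating myid before restarting each node, can be sketched as follows. The directory path and server id are illustrative values, not from the ticket:

```shell
# Illustrative recovery steps: keep ZooKeeper data in a persistent directory
# (e.g. /var/lib/zookeeper in production) instead of /tmp, and recreate the
# myid file before the quorum peer starts. Example values below.
DATADIR="./zookeeper-data"   # example path; use a persistent location in production
MYID=1                       # must be unique per ZooKeeper node (1..3 in a 3-node ensemble)

mkdir -p "$DATADIR"
echo "$MYID" > "$DATADIR/myid"   # quorum mode refuses to start without this file
cat "$DATADIR/myid"
```

The zookeeper.properties dataDir setting (and log.dirs in Kafka's server.properties) would then point at this persistent directory, so a reboot that clears /tmp no longer wipes the ensemble state.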
[jira] [Updated] (KAFKA-2023) git clone kafka repository requires https
[ https://issues.apache.org/jira/browse/KAFKA-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2023: - Reviewer: Gwen Shapira git clone kafka repository requires https - Key: KAFKA-2023 URL: https://issues.apache.org/jira/browse/KAFKA-2023 Project: Kafka Issue Type: Bug Components: website Reporter: Anatoli Fomenko Assignee: Anatoly Fayngelerin Priority: Minor Attachments: KAFKA-2023.patch From http://kafka.apache.org/code.html: Our code is kept in git. You can check it out like this: git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka On CentOS 6.5: {code} $ git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ error: RPC failed; result=22, HTTP code = 405 {code} while: {code} $ git clone https://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ remote: Counting objects: 24607, done. remote: Compressing objects: 100% (9212/9212), done. remote: Total 24607 (delta 14449), reused 19801 (delta 11465) Receiving objects: 100% (24607/24607), 15.61 MiB | 5.85 MiB/s, done. Resolving deltas: 100% (14449/14449), done. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KAFKA-2023) git clone kafka repository requires https
[ https://issues.apache.org/jira/browse/KAFKA-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein reassigned KAFKA-2023: Assignee: Anatoly Fayngelerin git clone kafka repository requires https - Key: KAFKA-2023 URL: https://issues.apache.org/jira/browse/KAFKA-2023 Project: Kafka Issue Type: Bug Components: website Reporter: Anatoli Fomenko Assignee: Anatoly Fayngelerin Priority: Minor Attachments: KAFKA-2023.patch From http://kafka.apache.org/code.html: Our code is kept in git. You can check it out like this: git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka On CentOS 6.5: {code} $ git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ error: RPC failed; result=22, HTTP code = 405 {code} while: {code} $ git clone https://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ remote: Counting objects: 24607, done. remote: Compressing objects: 100% (9212/9212), done. remote: Total 24607 (delta 14449), reused 19801 (delta 11465) Receiving objects: 100% (24607/24607), 15.61 MiB | 5.85 MiB/s, done. Resolving deltas: 100% (14449/14449), done. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2023) git clone kafka repository requires https
[ https://issues.apache.org/jira/browse/KAFKA-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364263#comment-14364263 ] Joe Stein commented on KAFKA-2023: -- Looks like the issue may be the version of git. I tried a few other ASF repos and hit the same issue with git 1.7.1, which is what comes with yum install git. git clone kafka repository requires https - Key: KAFKA-2023 URL: https://issues.apache.org/jira/browse/KAFKA-2023 Project: Kafka Issue Type: Bug Components: website Reporter: Anatoli Fomenko Priority: Minor Attachments: KAFKA-2023.patch From http://kafka.apache.org/code.html: Our code is kept in git. You can check it out like this: git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka On CentOS 6.5: {code} $ git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ error: RPC failed; result=22, HTTP code = 405 {code} while: {code} $ git clone https://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ remote: Counting objects: 24607, done. remote: Compressing objects: 100% (9212/9212), done. remote: Total 24607 (delta 14449), reused 19801 (delta 11465) Receiving objects: 100% (24607/24607), 15.61 MiB | 5.85 MiB/s, done. Resolving deltas: 100% (14449/14449), done. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
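If an old git is indeed the culprit, one client-side workaround (an assumption on my part, not part of the ticket's resolution) is git's url.<base>.insteadOf rewriting, which transparently upgrades the documented http URL to https:

```
# ~/.gitconfig fragment: rewrite the failing http remote URL to https
# before git contacts the server.
[url "https://git-wip-us.apache.org/repos/asf/"]
    insteadOf = http://git-wip-us.apache.org/repos/asf/
```

With this in place, the quick-start command from the website works unchanged even where plain-http smart-HTTP transport fails.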
[jira] [Commented] (KAFKA-2023) git clone kafka repository requires https
[ https://issues.apache.org/jira/browse/KAFKA-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364184#comment-14364184 ] Joe Stein commented on KAFKA-2023: -- Works OK for me on Ubuntu and Red Hat on two different networks: {code} $ git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka Cloning into 'kafka'... remote: Counting objects: 24607, done. remote: Compressing objects: 100% (9212/9212), done. remote: Total 24607 (delta 14447), reused 19803 (delta 11465) Receiving objects: 100% (24607/24607), 15.62 MiB | 3.46 MiB/s, done. Resolving deltas: 100% (14447/14447), done. Checking connectivity... done. {code} git clone kafka repository requires https - Key: KAFKA-2023 URL: https://issues.apache.org/jira/browse/KAFKA-2023 Project: Kafka Issue Type: Bug Components: website Reporter: Anatoli Fomenko Priority: Minor From http://kafka.apache.org/code.html: Our code is kept in git. You can check it out like this: git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka On CentOS 6.5: {code} $ git clone http://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ error: RPC failed; result=22, HTTP code = 405 {code} while: {code} $ git clone https://git-wip-us.apache.org/repos/asf/kafka.git kafka Initialized empty Git repository in /home/anatoli/git/kafka/.git/ remote: Counting objects: 24607, done. remote: Compressing objects: 100% (9212/9212), done. remote: Total 24607 (delta 14449), reused 19801 (delta 11465) Receiving objects: 100% (24607/24607), 15.61 MiB | 5.85 MiB/s, done. Resolving deltas: 100% (14449/14449), done. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-2006) switch the broker server over to the new java protocol definitions
[ https://issues.apache.org/jira/browse/KAFKA-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2006: - Description: This was brought up during the review of KAFKA-1694. The latest patch for KAFKA-1694 now uses the new Java protocol for the new messages, so the work here will not be bloated for new messages, just the ones that are already there. switch the broker server over to the new java protocol definitions -- Key: KAFKA-2006 URL: https://issues.apache.org/jira/browse/KAFKA-2006 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 This was brought up during the review of KAFKA-1694. The latest patch for KAFKA-1694 now uses the new Java protocol for the new messages, so the work here will not be bloated for new messages, just the ones that are already there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359359#comment-14359359 ] Joe Stein commented on KAFKA-1461: -- Here is my reasoning. Say you are an operations person. And, in the next release we tell folks about the KIP to learn and understand changes that affect them (yada yada language for the release). And something like this isn't in there. We are changing the behavior of an existing config and removing another. It makes the communication of behavior incongruent for the changes of a release. So, while I agree we don't need it for this, the reason I even brought it up was looking at it from the release perspective, for what ops folks are going to be looking at when we get there. Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359359#comment-14359359 ] Joe Stein edited comment on KAFKA-1461 at 3/12/15 8:42 PM: --- Here is my reasoning. Say you are an operations person. And, in the next release we tell folks about the KIP to learn and understand changes that affect them (yada yada language for the release). And something like this isn't in there. We are changing the behavior of an existing config and removing another. It makes the communication of behavior incongruent for the changes of a release. So, while I agree we don't need it technically but for this consistency reason is why I even brought it up. I was just looking at it from the release perspective for what ops folks are going to be looking at when we get there. was (Author: joestein): Here is my reasoning. Say you are an operations person. And, in the next release we tell folks about the KIP to learn and understand changes that affect them (yada yada language for the release). And something like this isn't in there. We are changing the behavior of an existing config and removing another. It makes the communication of behavior incongruent for the changes of a release. So, while I agree we don't need it for this the reason I even brought it up was looking at it from the release perspective for what ops folks are going to be looking at when we get there. Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. 
For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1461: - Comment: was deleted (was: Here is my reasoning. Say you are an operations person. And, in the next release we tell folks about the KIP to learn and understand changes that affect them (yada yada language for the release). And something like this isn't in there. We are changing the behavior of an existing config and removing another. It makes the communication of behavior incongruent for the changes of a release. So, while I agree we don't need it technically but for this consistency reason is why I even brought it up. I was just looking at it from the release perspective for what ops folks are going to be looking at when we get there.) Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-2006) switch the broker server over to the new java protocol definitions
[ https://issues.apache.org/jira/browse/KAFKA-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2006: - Priority: Major (was: Blocker) switch the broker server over to the new java protocol definitions -- Key: KAFKA-2006 URL: https://issues.apache.org/jira/browse/KAFKA-2006 Project: Kafka Issue Type: Bug Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-2006) switch the broker server over to the new java protocol definitions
[ https://issues.apache.org/jira/browse/KAFKA-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2006: - Assignee: (was: Andrii Biletskyi) switch the broker server over to the new java protocol definitions -- Key: KAFKA-2006 URL: https://issues.apache.org/jira/browse/KAFKA-2006 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1546) Automate replica lag tuning
[ https://issues.apache.org/jira/browse/KAFKA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1546: - Fix Version/s: 0.8.3 Automate replica lag tuning --- Key: KAFKA-1546 URL: https://issues.apache.org/jira/browse/KAFKA-1546 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.0, 0.8.1, 0.8.1.1 Reporter: Neha Narkhede Assignee: Aditya Auradkar Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1546.patch, KAFKA-1546_2015-03-11_18:48:09.patch Currently, there is no good way to tune the replica lag configs to automatically account for high and low volume topics on the same cluster. For the low-volume topic it will take a very long time to detect a lagging replica, and for the high-volume topic it will have false-positives. One approach to making this easier would be to have the configuration be something like replica.lag.max.ms and translate this into a number of messages dynamically based on the throughput of the partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
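The approach in the description, translating a time-based setting such as the proposed replica.lag.max.ms into a per-partition message count from observed throughput, amounts to simple arithmetic. The config name and all numbers below are illustrative, not from any shipped Kafka release:

```shell
# Illustrative only: derive a per-partition lag threshold (in messages) from
# a hypothetical time-based config and that partition's measured throughput.
replica_lag_max_ms=10000    # proposed time-based setting (assumed name/value)
msgs_per_sec=500            # measured inbound rate for a high-volume partition

max_lag_messages=$(( replica_lag_max_ms * msgs_per_sec / 1000 ))
echo "$max_lag_messages"    # high-volume partitions tolerate a larger absolute lag
```

A low-volume partition with, say, 2 msgs/sec would get a threshold of only 20 messages from the same 10-second budget, which is exactly how one setting can serve both traffic profiles without false positives.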
[jira] [Updated] (KAFKA-2001) OffsetCommitTest hang during setup
[ https://issues.apache.org/jira/browse/KAFKA-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2001: - Fix Version/s: 0.8.3 OffsetCommitTest hang during setup -- Key: KAFKA-2001 URL: https://issues.apache.org/jira/browse/KAFKA-2001 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Joel Koshy Fix For: 0.8.3 OffsetCommitTest seems to hang in trunk reliably. The following is the stacktrace. It seems to hang during setup. Test worker prio=5 tid=7fb5ab154800 nid=0x11198e000 waiting on condition [11198c000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at kafka.server.OffsetCommitTest$$anonfun$setUp$2.apply(OffsetCommitTest.scala:59) at kafka.server.OffsetCommitTest$$anonfun$setUp$2.apply(OffsetCommitTest.scala:58) at scala.collection.immutable.Stream.dropWhile(Stream.scala:825) at kafka.server.OffsetCommitTest.setUp(OffsetCommitTest.scala:58) at junit.framework.TestCase.runBare(TestCase.java:132) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:232) at junit.framework.TestSuite.run(TestSuite.java:227) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:91) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:86) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:49) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:69) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:48) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1986) Producer request failure rate should not include InvalidMessageSizeException and OffsetOutOfRangeException
[ https://issues.apache.org/jira/browse/KAFKA-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1986: - Fix Version/s: 0.8.3 Producer request failure rate should not include InvalidMessageSizeException and OffsetOutOfRangeException -- Key: KAFKA-1986 URL: https://issues.apache.org/jira/browse/KAFKA-1986 Project: Kafka Issue Type: Sub-task Reporter: Aditya Auradkar Assignee: Aditya Auradkar Fix For: 0.8.3 Attachments: KAFKA-1986.patch InvalidMessageSizeException and OffsetOutOfRangeException should not be counted as failedProduce and failedFetch requests since they are client-side errors. They should be treated similarly to UnknownTopicOrPartitionException or NotLeaderForPartitionException.
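The filtering the ticket describes can be sketched as follows. This is a hypothetical helper, not the actual ReplicaManager code; the class and method names are illustrative, and the classifier keys on exception names only to keep the sketch self-contained:

```java
import java.util.Set;

// Hypothetical classifier: only broker-side failures should bump the
// failedProduce/failedFetch meters; known client-side errors are excluded.
class FailureMetricFilter {
    private static final Set<String> CLIENT_SIDE_ERRORS = Set.of(
            "InvalidMessageSizeException",
            "OffsetOutOfRangeException",
            "UnknownTopicOrPartitionException",
            "NotLeaderForPartitionException");

    static boolean countsAsBrokerFailure(String exceptionName) {
        return !CLIENT_SIDE_ERRORS.contains(exceptionName);
    }
}
```

Anything outside the client-side set would still be counted, so genuine broker problems keep showing up in the failure-rate metrics.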
[jira] [Updated] (KAFKA-1938) [doc] Quick start example should reference appropriate Kafka version
[ https://issues.apache.org/jira/browse/KAFKA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1938: - Fix Version/s: 0.8.3 [doc] Quick start example should reference appropriate Kafka version Key: KAFKA-1938 URL: https://issues.apache.org/jira/browse/KAFKA-1938 Project: Kafka Issue Type: Improvement Components: website Affects Versions: 0.8.2.0 Reporter: Stevo Slavic Assignee: Manikumar Reddy Priority: Trivial Fix For: 0.8.3 Attachments: lz4-compression.patch, remove-081-references-1.patch, remove-081-references.patch Kafka 0.8.2.0 documentation, quick start example on https://kafka.apache.org/documentation.html#quickstart in step 1 links and instructs reader to download Kafka 0.8.1.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1925) Coordinator Node Id set to INT_MAX breaks coordinator metadata updates
[ https://issues.apache.org/jira/browse/KAFKA-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1925: - Fix Version/s: 0.8.3 Coordinator Node Id set to INT_MAX breaks coordinator metadata updates -- Key: KAFKA-1925 URL: https://issues.apache.org/jira/browse/KAFKA-1925 Project: Kafka Issue Type: Sub-task Components: consumer Reporter: Guozhang Wang Assignee: Guozhang Wang Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-1925.v1.patch KafkaConsumer used INT_MAX to mimic a new socket for the coordinator (details can be found in KAFKA-1760). However, this behavior breaks the coordinator as the underlying NetworkClient only uses the node id to determine when to initiate a new connection:
{code}
if (connectionStates.canConnect(node.id(), now))
    // if we are interested in sending to a node and we don't have a connection to it, initiate one
    initiateConnect(node, now);
{code}
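Since NetworkClient keys its connection state by node id, one way out (a sketch only, not the attached patch) is to derive a distinct sentinel id per broker instead of sharing a single INT_MAX, so coordinators on different brokers map to different connections and a coordinator move forces a fresh connect:

```java
// Illustrative scheme: map broker id n to a high id range disjoint from
// real broker ids, so the id-keyed connection state can tell coordinators
// apart instead of collapsing them all onto one INT_MAX entry.
class CoordinatorNodeId {
    static int coordinatorNodeId(int brokerId) {
        return Integer.MAX_VALUE - brokerId;
    }
}
```

With this, a coordinator migrating from broker 1 to broker 2 yields a different node id, and NetworkClient initiates a new connection as intended.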
[jira] [Updated] (KAFKA-1943) Producer request failure rate should not include MessageSetSizeTooLarge and MessageSizeTooLargeException
[ https://issues.apache.org/jira/browse/KAFKA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1943: - Fix Version/s: 0.8.3 Producer request failure rate should not include MessageSetSizeTooLarge and MessageSizeTooLargeException Key: KAFKA-1943 URL: https://issues.apache.org/jira/browse/KAFKA-1943 Project: Kafka Issue Type: Sub-task Reporter: Aditya A Auradkar Assignee: Aditya Auradkar Fix For: 0.8.3 Attachments: KAFKA-1943.patch If MessageSetSizeTooLargeException or MessageSizeTooLargeException is thrown from Log, then ReplicaManager counts it as a failed produce request. My understanding is that this metric should only count failures as a result of broker issues and not bad requests sent by the clients. If the message or message set is too large, then it is a client side error and should not be reported. (similar to NotLeaderForPartitionException, UnknownTopicOrPartitionException). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1957) code doc typo
[ https://issues.apache.org/jira/browse/KAFKA-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1957: - Fix Version/s: 0.8.3 code doc typo - Key: KAFKA-1957 URL: https://issues.apache.org/jira/browse/KAFKA-1957 Project: Kafka Issue Type: Improvement Components: config Reporter: Yaguo Zhou Priority: Trivial Fix For: 0.8.3 Attachments: 0001-fix-typo-SO_SNDBUFF-SO_SNDBUF-SO_RCVBUFF-SO_RCVBUF.patch There is a doc typo in kafka.server.KafkaConfig.scala: SO_SNDBUFF should be SO_SNDBUF, and SO_RCVBUFF should be SO_RCVBUF.
[jira] [Updated] (KAFKA-1948) kafka.api.consumerTests are hanging
[ https://issues.apache.org/jira/browse/KAFKA-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1948: - Fix Version/s: 0.8.3 kafka.api.consumerTests are hanging --- Key: KAFKA-1948 URL: https://issues.apache.org/jira/browse/KAFKA-1948 Project: Kafka Issue Type: Bug Reporter: Gwen Shapira Assignee: Guozhang Wang Fix For: 0.8.3 Attachments: KAFKA-1948.patch Noticed today that very often when I run the full test suite, it hangs on kafka.api.consumerTest (not always same test though). It doesn't reproduce 100% of the time, but enough to be very annoying. I also saw it happening on trunk after KAFKA-1333: https://builds.apache.org/view/All/job/Kafka-trunk/389/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1959) Class CommitThread overwrite group of Thread class causing compile errors
[ https://issues.apache.org/jira/browse/KAFKA-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1959: - Fix Version/s: 0.8.3 Class CommitThread overwrite group of Thread class causing compile errors - Key: KAFKA-1959 URL: https://issues.apache.org/jira/browse/KAFKA-1959 Project: Kafka Issue Type: Bug Components: core Environment: scala 2.10.4 Reporter: Tong Li Assignee: Tong Li Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1959.patch, compileError.png
{code}
class CommitThread(id: Int, partitionCount: Int, commitIntervalMs: Long, zkClient: ZkClient) extends ShutdownableThread("commit-thread") with KafkaMetricsGroup {
  private val group = "group-" + id
{code}
group overwrites the group member of the Thread class, causing the following compile error:
overriding variable group in class Thread of type ThreadGroup; value group has weaker access privileges; it should not be private
[jira] [Updated] (KAFKA-1969) NPE in unit test for new consumer
[ https://issues.apache.org/jira/browse/KAFKA-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1969: - Fix Version/s: 0.8.3 NPE in unit test for new consumer - Key: KAFKA-1969 URL: https://issues.apache.org/jira/browse/KAFKA-1969 Project: Kafka Issue Type: Bug Reporter: Neha Narkhede Assignee: Guozhang Wang Labels: newbie Fix For: 0.8.3 Attachments: stack.out
{code}
kafka.api.ConsumerTest testConsumptionWithBrokerFailures FAILED
    java.lang.NullPointerException
        at org.apache.kafka.clients.consumer.KafkaConsumer.ensureCoordinatorReady(KafkaConsumer.java:1238)
        at org.apache.kafka.clients.consumer.KafkaConsumer.initiateCoordinatorRequest(KafkaConsumer.java:1189)
        at org.apache.kafka.clients.consumer.KafkaConsumer.commit(KafkaConsumer.java:777)
        at org.apache.kafka.clients.consumer.KafkaConsumer.commit(KafkaConsumer.java:816)
        at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:704)
        at kafka.api.ConsumerTest.consumeWithBrokerFailures(ConsumerTest.scala:167)
        at kafka.api.ConsumerTest.testConsumptionWithBrokerFailures(ConsumerTest.scala:152)
{code}
[jira] [Updated] (KAFKA-1964) testPartitionReassignmentCallback hangs occasionally
[ https://issues.apache.org/jira/browse/KAFKA-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1964: - Fix Version/s: 0.8.3 testPartitionReassignmentCallback hangs occasionally - Key: KAFKA-1964 URL: https://issues.apache.org/jira/browse/KAFKA-1964 Project: Kafka Issue Type: Bug Components: admin Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Guozhang Wang Labels: newbie++ Fix For: 0.8.3 Attachments: stack.out -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1960) .gitignore does not exclude test generated files and folders.
[ https://issues.apache.org/jira/browse/KAFKA-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1960: - Fix Version/s: 0.8.3 .gitignore does not exclude test generated files and folders. - Key: KAFKA-1960 URL: https://issues.apache.org/jira/browse/KAFKA-1960 Project: Kafka Issue Type: Bug Components: build Reporter: Tong Li Assignee: Tong Li Priority: Minor Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1960.patch gradle test can create quite a few folders; .gitignore should exclude these files for an easier git submit.
[jira] [Commented] (KAFKA-1546) Automate replica lag tuning
[ https://issues.apache.org/jira/browse/KAFKA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358058#comment-14358058 ] Joe Stein commented on KAFKA-1546: -- We could also mark the JIRA as a bug instead of an improvement. Automate replica lag tuning --- Key: KAFKA-1546 URL: https://issues.apache.org/jira/browse/KAFKA-1546 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.0, 0.8.1, 0.8.1.1 Reporter: Neha Narkhede Assignee: Aditya Auradkar Labels: newbie++ Attachments: KAFKA-1546.patch, KAFKA-1546_2015-03-11_18:48:09.patch Currently, there is no good way to tune the replica lag configs to automatically account for high and low volume topics on the same cluster. For the low-volume topic it will take a very long time to detect a lagging replica, and for the high-volume topic it will have false-positives. One approach to making this easier would be to have the configuration be something like replica.lag.max.ms and translate this into a number of messages dynamically based on the throughput of the partition.
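The translation the ticket proposes is simple arithmetic; the sketch below is illustrative only (replica.lag.max.ms here is the hypothetical setting from the ticket, not a shipped config name):

```java
// Convert a time-based lag budget into a message-count threshold scaled by
// the partition's observed throughput, so high- and low-volume topics get
// appropriately different limits.
class ReplicaLagThreshold {
    static long maxLagMessages(double messagesPerSec, long replicaLagMaxMs) {
        long byThroughput = (long) (messagesPerSec * replicaLagMaxMs / 1000.0);
        return Math.max(1L, byThroughput); // never allow a zero threshold
    }
}
```

A partition doing 1000 msg/s with a 10 s budget tolerates 10000 messages of lag; a near-idle partition bottoms out at 1, so a lagging replica is still detected eventually.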
[jira] [Updated] (KAFKA-2009) Fix UncheckedOffset.removeOffset synchronization and trace logging issue in mirror maker
[ https://issues.apache.org/jira/browse/KAFKA-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2009: - Fix Version/s: 0.8.3 Fix UncheckedOffset.removeOffset synchronization and trace logging issue in mirror maker Key: KAFKA-2009 URL: https://issues.apache.org/jira/browse/KAFKA-2009 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Fix For: 0.8.3 Attachments: KAFKA-2009.patch, KAFKA-2009_2015-03-11_11:26:57.patch This ticket is to fix the mirror maker problem on current trunk, which was introduced in KAFKA-1650.
[jira] [Updated] (KAFKA-1914) Count TotalProduceRequestRate and TotalFetchRequestRate in BrokerTopicMetrics
[ https://issues.apache.org/jira/browse/KAFKA-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1914: - Fix Version/s: 0.8.3 Count TotalProduceRequestRate and TotalFetchRequestRate in BrokerTopicMetrics - Key: KAFKA-1914 URL: https://issues.apache.org/jira/browse/KAFKA-1914 Project: Kafka Issue Type: Sub-task Components: core Reporter: Aditya A Auradkar Assignee: Aditya Auradkar Fix For: 0.8.3 Attachments: KAFKA-1914.patch, KAFKA-1914.patch, KAFKA-1914_2015-02-17_15:46:27.patch Currently the BrokerTopicMetrics only counts the failedProduceRequestRate and the failedFetchRequestRate. We should add 2 metrics to count the overall produce/fetch request rates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1865) Add a flush() call to the new producer API
[ https://issues.apache.org/jira/browse/KAFKA-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1865: - Fix Version/s: 0.8.3 Add a flush() call to the new producer API -- Key: KAFKA-1865 URL: https://issues.apache.org/jira/browse/KAFKA-1865 Project: Kafka Issue Type: Bug Reporter: Jay Kreps Assignee: Jay Kreps Fix For: 0.8.3 Attachments: KAFKA-1865.patch, KAFKA-1865_2015-02-21_15:36:54.patch, KAFKA-1865_2015-02-22_16:26:46.patch, KAFKA-1865_2015-02-23_18:29:16.patch, KAFKA-1865_2015-02-25_17:15:26.patch, KAFKA-1865_2015-02-26_10:37:16.patch The postconditions of this would be that any record enqueued prior to flush() would have completed being sent (either successfully or not). An open question is whether you can continue sending new records while this call is executing (on other threads). We should only do this if it doesn't add inefficiencies for people who don't use it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
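The postcondition in the ticket, and the open question about concurrent sends, can be modeled in a few lines. This is a toy model with illustrative names, not the producer API: flush() waits only on records enqueued before the call, so other threads can keep sending:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy producer: flush() blocks until every record enqueued before the call
// has completed (successfully or not); later sends are not waited on.
class ToyProducer {
    private final ExecutorService io = Executors.newFixedThreadPool(2);
    private final List<Future<?>> inFlight = new ArrayList<>();

    synchronized Future<?> send(Runnable record) {
        Future<?> f = io.submit(record);
        inFlight.add(f);
        return f;
    }

    void flush() {
        List<Future<?>> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(inFlight); // only records enqueued so far
        }
        for (Future<?> f : snapshot) {
            try {
                f.get(); // a failed send still counts as "completed"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            } catch (ExecutionException ignored) {
                // the postcondition only requires completion, not success
            }
        }
    }

    void close() {
        io.shutdown();
    }
}
```

Snapshotting the in-flight list at call time is what keeps flush() from penalizing threads that continue sending while it runs.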
[jira] [Updated] (KAFKA-1852) OffsetCommitRequest can commit offset on unknown topic
[ https://issues.apache.org/jira/browse/KAFKA-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1852: - Fix Version/s: 0.8.3 OffsetCommitRequest can commit offset on unknown topic -- Key: KAFKA-1852 URL: https://issues.apache.org/jira/browse/KAFKA-1852 Project: Kafka Issue Type: Bug Affects Versions: 0.8.3 Reporter: Jun Rao Assignee: Sriharsha Chintalapani Fix For: 0.8.3 Attachments: KAFKA-1852.patch, KAFKA-1852_2015-01-19_10:44:01.patch, KAFKA-1852_2015-02-12_16:46:10.patch, KAFKA-1852_2015-02-16_13:21:46.patch, KAFKA-1852_2015-02-18_13:13:17.patch, KAFKA-1852_2015-02-27_13:50:34.patch Currently, we allow an offset to be committed to Kafka, even when the topic/partition for the offset doesn't exist. We probably should disallow that and send an error back in that case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1910) Refactor KafkaConsumer
[ https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1910: - Fix Version/s: 0.8.3 Refactor KafkaConsumer -- Key: KAFKA-1910 URL: https://issues.apache.org/jira/browse/KAFKA-1910 Project: Kafka Issue Type: Sub-task Components: consumer Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.8.3 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch KafkaConsumer now contains all the logic on the consumer side, making it a very large class file; it would be better to refactor it into multiple layers on top of KafkaClient.
[jira] [Updated] (KAFKA-1831) Producer does not provide any information about which host the data was sent to
[ https://issues.apache.org/jira/browse/KAFKA-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1831: - Fix Version/s: 0.8.2.0 Producer does not provide any information about which host the data was sent to --- Key: KAFKA-1831 URL: https://issues.apache.org/jira/browse/KAFKA-1831 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 0.8.1.1 Reporter: Mark Payne Assignee: Jun Rao Fix For: 0.8.2.0 For traceability purposes and for troubleshooting, when sending data to Kafka, the Producer should provide information about which host the data was sent to. This works well already in the SimpleConsumer, which provides host() and port() methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1546) Automate replica lag tuning
[ https://issues.apache.org/jira/browse/KAFKA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358017#comment-14358017 ] Joe Stein commented on KAFKA-1546: -- Shouldn't we have a KIP for this? It seems like we are changing/adding public features that will affect folks. Automate replica lag tuning --- Key: KAFKA-1546 URL: https://issues.apache.org/jira/browse/KAFKA-1546 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.0, 0.8.1, 0.8.1.1 Reporter: Neha Narkhede Assignee: Aditya Auradkar Labels: newbie++ Attachments: KAFKA-1546.patch, KAFKA-1546_2015-03-11_18:48:09.patch Currently, there is no good way to tune the replica lag configs to automatically account for high and low volume topics on the same cluster. For the low-volume topic it will take a very long time to detect a lagging replica, and for the high-volume topic it will have false-positives. One approach to making this easier would be to have the configuration be something like replica.lag.max.ms and translate this into a number of messages dynamically based on the throughput of the partition.
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358018#comment-14358018 ] Joe Stein commented on KAFKA-1461: -- Shouldn't there be a KIP for this? Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
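The tight-loop problem in the ticket is the classic case for capped exponential back-off. A minimal sketch, with illustrative parameters rather than the actual fetcher code:

```java
// Capped exponential back-off: each consecutive fetch error doubles the
// sleep, up to a ceiling, instead of retrying immediately in a tight loop.
class FetchBackoff {
    static long backoffMs(int consecutiveErrors, long baseMs, long maxMs) {
        if (consecutiveErrors <= 0) return 0L;
        // Cap the shift so the left-shift cannot overflow for large streaks.
        long exp = baseMs << Math.min(consecutiveErrors - 1, 20);
        return Math.min(exp, maxMs);
    }
}
```

A fetcher hitting repeated connection-refused errors would then sleep 100 ms, 200 ms, 400 ms, ... up to the cap, rather than spinning.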
[jira] [Commented] (KAFKA-1546) Automate replica lag tuning
[ https://issues.apache.org/jira/browse/KAFKA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358054#comment-14358054 ] Joe Stein commented on KAFKA-1546: -- If folks are going to read the KIPs to understand what features went into a release and why, and this isn't there, I think that would be odd. How will they know what to do with the setting? Granted, that is what JIRA is for too, if folks read the JIRAs for a release, but that isn't how the KIPs have been discussed and worked so far, regardless of what I think here. Automate replica lag tuning --- Key: KAFKA-1546 URL: https://issues.apache.org/jira/browse/KAFKA-1546 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.0, 0.8.1, 0.8.1.1 Reporter: Neha Narkhede Assignee: Aditya Auradkar Labels: newbie++ Attachments: KAFKA-1546.patch, KAFKA-1546_2015-03-11_18:48:09.patch Currently, there is no good way to tune the replica lag configs to automatically account for high and low volume topics on the same cluster. For the low-volume topic it will take a very long time to detect a lagging replica, and for the high-volume topic it will have false-positives. One approach to making this easier would be to have the configuration be something like replica.lag.max.ms and translate this into a number of messages dynamically based on the throughput of the partition.
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358056#comment-14358056 ] Joe Stein commented on KAFKA-1461: -- I personally think it is overkill, but I bring it up because it seems to be required for other changes, so I am just asking the question. If we are using KIPs to help folks understand the reasons behind changes, then we should do that and be complete, or not do it at all. Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 Attachments: KAFKA-1461.patch, KAFKA-1461.patch, KAFKA-1461_2015-03-11_10:41:26.patch, KAFKA-1461_2015-03-11_18:17:51.patch The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps some.
[jira] [Commented] (KAFKA-1930) Move server over to new metrics library
[ https://issues.apache.org/jira/browse/KAFKA-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355335#comment-14355335 ] Joe Stein commented on KAFKA-1930: -- +1 to KIP, I think one thing that is going to be important for folks is how do I use my existing trusted reporter library with the new version or how do I change it so it works. I don't think the project should have to develop a Riemann, statsd, graphite, ganglia, etc reporters but make it as easy to ramp as possible for folks doing this today. Move server over to new metrics library --- Key: KAFKA-1930 URL: https://issues.apache.org/jira/browse/KAFKA-1930 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps We are using org.apache.kafka.common.metrics on the clients, but using Coda Hale metrics on the server. We should move the server over to the new metrics package as well. This will help to make all our metrics self-documenting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-2007) update offsetrequest for more useful information we have on broker about partition
[ https://issues.apache.org/jira/browse/KAFKA-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-2007: - Description: this will need a KIP via [~jkreps] in KIP-6 discussion about KAFKA-1694 The other information that would be really useful to get would be information about partitions--how much data is in the partition, what are the segment offsets, what is the log-end offset (i.e. last offset), what is the compaction point, etc. I think that done right this would be the successor to the very awkward OffsetRequest we have today. This is not really blocking that ticket and could happen before/after and has a lot of other useful purposes and is important to get done so tracking it here in this JIRA. was: this will need a KIP via [~jkreps] in KIP-6 discussion about KAFKA- The other information that would be really useful to get would be information about partitions--how much data is in the partition, what are the segment offsets, what is the log-end offset (i.e. last offset), what is the compaction point, etc. I think that done right this would be the successor to the very awkward OffsetRequest we have today. This is not really blocking that ticket and could happen before/after and has a lot of other useful purposes and is important to get done so tracking it here in this JIRA. update offsetrequest for more useful information we have on broker about partition -- Key: KAFKA-2007 URL: https://issues.apache.org/jira/browse/KAFKA-2007 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 this will need a KIP via [~jkreps] in KIP-6 discussion about KAFKA-1694 The other information that would be really useful to get would be information about partitions--how much data is in the partition, what are the segment offsets, what is the log-end offset (i.e. last offset), what is the compaction point, etc. I think that done right this would be the successor to the very awkward OffsetRequest we have today. 
This is not really blocking that ticket and could happen before/after and has a lot of other useful purposes and is important to get done so tracking it here in this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
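The richer per-partition information described above can be pictured as a small record. This is purely a hypothetical shape for illustration (field and method names are made up, not a wire format or the eventual OffsetRequest successor):

```java
// Hypothetical per-partition metadata: data size, segment boundaries would
// accompany these, plus the log-end offset and the compaction point.
class PartitionInfoSketch {
    final long logStartOffset;
    final long logEndOffset;    // last offset + 1
    final long sizeBytes;       // how much data is in the partition
    final long compactionPoint; // first uncompacted ("dirty") offset

    PartitionInfoSketch(long logStartOffset, long logEndOffset,
                        long sizeBytes, long compactionPoint) {
        this.logStartOffset = logStartOffset;
        this.logEndOffset = logEndOffset;
        this.sizeBytes = sizeBytes;
        this.compactionPoint = compactionPoint;
    }

    // One thing such metadata enables: computing consumer lag without a fetch.
    long lagFor(long consumerOffset) {
        return logEndOffset - consumerOffset;
    }
}
```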
[jira] [Created] (KAFKA-2006) switch the broker server over to the new java protocol definitions
Joe Stein created KAFKA-2006: Summary: switch the broker server over to the new java protocol definitions Key: KAFKA-2006 URL: https://issues.apache.org/jira/browse/KAFKA-2006 Project: Kafka Issue Type: Bug Reporter: Joe Stein Assignee: Andrii Biletskyi Priority: Blocker Fix For: 0.8.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-2007) update offsetrequest for more useful information we have on broker about partition
Joe Stein created KAFKA-2007: Summary: update offsetrequest for more useful information we have on broker about partition Key: KAFKA-2007 URL: https://issues.apache.org/jira/browse/KAFKA-2007 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 this will need a KIP via [~jkreps] in KIP-6 discussion about KAFKA- The other information that would be really useful to get would be information about partitions--how much data is in the partition, what are the segment offsets, what is the log-end offset (i.e. last offset), what is the compaction point, etc. I think that done right this would be the successor to the very awkward OffsetRequest we have today. This is not really blocking that ticket and could happen before/after and has a lot of other useful purposes and is important to get done so tracking it here in this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1995) JMS to Kafka: Inbuilt JMSAdaptor/JMSProxy/JMSBridge (Client can speak JMS but hit Kafka)
[ https://issues.apache.org/jira/browse/KAFKA-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348951#comment-14348951 ] Joe Stein commented on KAFKA-1995: -- "This sounds like a good idea, but I think it doesn't belong in Kafka itself": agreed on both. A JMS client would be great, https://cwiki.apache.org/confluence/display/KAFKA/Clients, for sure. JMS to Kafka: Inbuilt JMSAdaptor/JMSProxy/JMSBridge (Client can speak JMS but hit Kafka) Key: KAFKA-1995 URL: https://issues.apache.org/jira/browse/KAFKA-1995 Project: Kafka Issue Type: Wish Components: core Affects Versions: 0.8.3 Reporter: Rekha Joshi Kafka is a great alternative to JMS, providing high performance and throughput as a scalable, distributed pub-sub/commit log service. However, there always exist traditional systems running on JMS. Rather than rewriting, it would be great if we just had an inbuilt JMSAdaptor/JMSProxy/JMSBridge by which a client can speak JMS but hit Kafka behind the scenes. Something like Chukwa's o.a.h.chukwa.datacollection.adaptor.jms.JMSAdaptor, which receives messages off a JMS queue and transforms them to a Chukwa chunk? I have come across folks talking of this need in the past as well. Is it considered and/or part of the roadmap? http://grokbase.com/t/kafka/users/131cst8xpv/stomp-binding-for-kafka http://grokbase.com/t/kafka/users/148dm4247q/consuming-messages-from-kafka-and-pushing-on-to-a-jms-queue http://grokbase.com/t/kafka/users/143hjepbn2/request-kafka-zookeeper-jms-details Looking for inputs on the correct way to approach this so as to retain all the good features of Kafka while still not rewriting the entire application. Possible?
[jira] [Updated] (KAFKA-1845) KafkaConfig should use ConfigDef
[ https://issues.apache.org/jira/browse/KAFKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1845: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the patch, Andrii, and the review, Gwen; committed to trunk. KafkaConfig should use ConfigDef - Key: KAFKA-1845 URL: https://issues.apache.org/jira/browse/KAFKA-1845 Project: Kafka Issue Type: Sub-task Reporter: Gwen Shapira Assignee: Andrii Biletskyi Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1845.patch, KAFKA-1845_2015-02-08_17:05:22.patch, KAFKA-1845_2015-03-05_01:12:22.patch ConfigDef is already used for the new producer and for TopicConfig. It will be nice to standardize and use one configuration and validation library across the board.
[jira] [Updated] (KAFKA-1882) Create extendable channel interface and default implementations
[ https://issues.apache.org/jira/browse/KAFKA-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1882: - Status: Patch Available (was: Open) Create extendable channel interface and default implementations --- Key: KAFKA-1882 URL: https://issues.apache.org/jira/browse/KAFKA-1882 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Priority: Blocker Fix For: 0.8.3 For the security infrastructure, we need an extendible interface to replace SocketChannel. KAFKA-1684 suggests extending SocketChannel itself, but since SocketChannel is part of Java's standard library, the interface changes between different Java versions, so extending it directly can become a compatibility issue. Instead, we can implement a KafkaChannel interface, which will implement connect(), read(), write() and possibly other methods we use. We will replace direct use of SocketChannel in our code with use of KafkaChannel. Different implementations of KafkaChannel will be instantiated based on the port/SecurityProtocol configuration. This patch will provide at least the PLAINTEXT implementation for KafkaChannel. I will validate that the SSL implementation in KAFKA-1684 can be refactored to use a KafkaChannel interface rather than extend SocketChannel directly. However, the patch will not include the SSL channel itself. The interface should also include setters/getters for principal and remote IP, which will be used for the authentication code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1882) Create extendable channel interface and default implementations
[ https://issues.apache.org/jira/browse/KAFKA-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1882: - Priority: Blocker (was: Major) Fix Version/s: 0.8.3 supported in this patch https://issues.apache.org/jira/browse/KAFKA-1809 with PLAINTEXT as the default implementation. The KIP has been accepted too https://cwiki.apache.org/confluence/display/KAFKA/KIP-2+-+Refactor+brokers+to+allow+listening+on+multiple+ports+and+IPs Create extendable channel interface and default implementations --- Key: KAFKA-1882 URL: https://issues.apache.org/jira/browse/KAFKA-1882 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Priority: Blocker Fix For: 0.8.3 For the security infrastructure, we need an extendible interface to replace SocketChannel. KAFKA-1684 suggests extending SocketChannel itself, but since SocketChannel is part of Java's standard library, the interface changes between different Java versions, so extending it directly can become a compatibility issue. Instead, we can implement a KafkaChannel interface, which will implement connect(), read(), write() and possibly other methods we use. We will replace direct use of SocketChannel in our code with use of KafkaChannel. Different implementations of KafkaChannel will be instantiated based on the port/SecurityProtocol configuration. This patch will provide at least the PLAINTEXT implementation for KafkaChannel. I will validate that the SSL implementation in KAFKA-1684 can be refactored to use a KafkaChannel interface rather than extend SocketChannel directly. However, the patch will not include the SSL channel itself. The interface should also include setters/getters for principal and remote IP, which will be used for the authentication code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
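The channel abstraction described in the ticket can be sketched as an interface plus one trivial implementation. The method set here is illustrative (the real interface was settled in the security patches), and the in-memory channel is a stand-in for a PLAINTEXT socket so the sketch stays self-contained:

```java
import java.nio.ByteBuffer;

// Sketch of the KafkaChannel idea: an abstraction over SocketChannel so
// PLAINTEXT, SSL, etc. can be chosen by port/SecurityProtocol configuration.
interface KafkaChannelSketch {
    int write(ByteBuffer src);
    int read(ByteBuffer dst);
    String principal(); // exposed for the authentication code
}

// Loopback PLAINTEXT stand-in: whatever is written can be read back.
class InMemoryPlaintextChannel implements KafkaChannelSketch {
    private final ByteBuffer buf = ByteBuffer.allocate(1024);

    public int write(ByteBuffer src) {
        int n = src.remaining();
        buf.put(src);
        return n;
    }

    public int read(ByteBuffer dst) {
        buf.flip();
        int n = Math.min(buf.remaining(), dst.remaining());
        for (int i = 0; i < n; i++) dst.put(buf.get());
        buf.compact();
        return n;
    }

    public String principal() {
        return "ANONYMOUS"; // PLAINTEXT has no authenticated principal
    }
}
```

Code that today holds a SocketChannel would hold a KafkaChannelSketch instead, and the concrete class would be instantiated per listener configuration.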
[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336573#comment-14336573 ] Joe Stein commented on KAFKA-1792: -- [~Dmitry Pekar] we should have this discussion on the KIP thread, where we already came up with the idea of treating backwards compatible == preserve functionality. So --generate would stay as it is, with no changes, and your new code would become --re-balance. We would deprecate --generate in the 0.9.0 release. Can you update the KIP and continue the discussion on the mailing list thread please, thanks! change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, KAFKA-1792_2014-12-08_13:42:43.patch, KAFKA-1792_2014-12-19_16:48:12.patch, KAFKA-1792_2015-01-14_12:54:52.patch, KAFKA-1792_2015-01-27_19:09:27.patch, KAFKA-1792_2015-02-13_21:07:06.patch, generate_alg_tests.txt, rebalance_use_cases.txt The current implementation produces a fair replica distribution across the specified list of brokers. Unfortunately, it doesn't take into account the current replica assignment. So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth broker id=3, --generate will create an assignment config that redistributes replicas fairly across brokers [0..3] as if those partitions had been created from scratch. It will not take the current replica assignment into consideration and accordingly will not try to minimize the number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. 
The new output of the improved --generate algorithm should satisfy the following requirements: - fairness of replica distribution - every broker will have R or R+1 replicas assigned; - minimum of reassignments - the number of replica moves between brokers will be minimal. Example. Consider the following replica distribution per brokers [0..3] (we just added brokers 2 and 3): - broker - 0, 1, 2, 3 - replicas - 7, 6, 0, 0 The new algorithm will produce the following assignment: - broker - 0, 1, 2, 3 - replicas - 4, 3, 3, 3 - moves - -3, -3, +3, +3 It will be fair and the number of moves will be 6, which is minimal for the specified initial distribution. The scope of this issue is: - design an algorithm matching the above requirements; - implement this algorithm and unit tests; - test it manually using different initial assignments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
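The fairness/minimal-moves arithmetic in the example above can be sketched in a few lines. This is a hypothetical helper, not the patch's actual code, and it computes only per-broker target counts and the move count, not which specific partitions move:

```python
def rebalance_targets(current):
    # Fair target: every broker ends with R or R+1 replicas, where
    # R = total replicas // number of brokers. Give the +1s to the most
    # loaded brokers so the fewest existing replicas have to move.
    total, n = sum(current), len(current)
    base, extra = divmod(total, n)
    targets = [base] * n
    for i in sorted(range(n), key=lambda i: -current[i])[:extra]:
        targets[i] += 1
    # Minimal move count = replicas that must land on under-target brokers.
    moves = sum(t - c for c, t in zip(current, targets) if t > c)
    return targets, moves

# The ticket's example: brokers [0..3] holding 7, 6, 0, 0 replicas
# rebalance to 4, 3, 3, 3 with 6 moves.
```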
[jira] [Commented] (KAFKA-1856) Add PreCommit Patch Testing
[ https://issues.apache.org/jira/browse/KAFKA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14333560#comment-14333560 ] Joe Stein commented on KAFKA-1856: -- [~singhashish] I didn't want to make these changes while the trunk build is currently breaking in Jenkins. There is a new Jenkins job ready to go; once that is fixed we can do this. Add PreCommit Patch Testing --- Key: KAFKA-1856 URL: https://issues.apache.org/jira/browse/KAFKA-1856 Project: Kafka Issue Type: Task Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: KAFKA-1845.result.txt, KAFKA-1856.patch, KAFKA-1856_2015-01-18_21:43:56.patch, KAFKA-1856_2015-02-04_14:57:05.patch, KAFKA-1856_2015-02-04_15:44:47.patch h1. Kafka PreCommit Patch Testing - *Don't wait for it to break* h2. Motivation *With great power comes great responsibility* - Uncle Ben. As the Kafka user list is growing, a mechanism to ensure the quality of the product is required. Quality becomes hard to measure and maintain in an open source project because of its wide community of contributors. Luckily, Kafka is not the first open source project and can benefit from the learnings of prior projects. PreCommit tests are the tests that are run for each patch that gets attached to an open JIRA. Based on the test results, the test execution framework (test bot) will +1 or -1 the patch. Having PreCommit tests takes the load off committers having to look at or test each patch. h2. Tests in Kafka h3. Unit and Integration Tests [Unit and Integration tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Unit+and+Integration+Tests] are cardinal to help contributors avoid breaking existing functionality while adding new functionality or fixing older ones. These tests, at least the ones relevant to the changes, must be run by contributors before attaching a patch to a JIRA. h3. 
System Tests [System tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+System+Tests] are much wider tests that, unlike unit tests, focus on end-to-end scenarios and not some specific method or class. h2. Apache PreCommit tests Apache provides a mechanism to automatically build a project and run a series of tests whenever a patch is uploaded to a JIRA. Based on test execution, the test framework will comment with a +1 or -1 on the JIRA. You can read more about the framework here: http://wiki.apache.org/general/PreCommitBuilds h2. Plan # Create a test-patch.py script (similar to the one used in Flume, Sqoop and other projects) that will take a jira as a parameter, apply on the appropriate branch, build the project, run tests and report results. This script should be committed into the Kafka code-base. To begin with, this will only run unit tests. We can add code sanity checks, system_tests, etc in the future. # Create a jenkins job for running the test (as described in http://wiki.apache.org/general/PreCommitBuilds) and validate that it works manually. This must be done by a committer with Jenkins access. # Ask someone with access to https://builds.apache.org/job/PreCommit-Admin/ to add Kafka to the list of projects PreCommit-Admin triggers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1856) Add PreCommit Patch Testing
[ https://issues.apache.org/jira/browse/KAFKA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313509#comment-14313509 ] Joe Stein commented on KAFKA-1856: -- [~singhashish] I created a new Jenkins build https://builds.apache.org/job/KafkaPreCommit/ and will do some more testing on your patch, but it lgtm. Tomorrow(ish), if I can get it all hooked up, I will commit it or let you know if I have any questions/issues. Add PreCommit Patch Testing --- Key: KAFKA-1856 URL: https://issues.apache.org/jira/browse/KAFKA-1856 Project: Kafka Issue Type: Task Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: KAFKA-1845.result.txt, KAFKA-1856.patch, KAFKA-1856_2015-01-18_21:43:56.patch, KAFKA-1856_2015-02-04_14:57:05.patch, KAFKA-1856_2015-02-04_15:44:47.patch h1. Kafka PreCommit Patch Testing - *Don't wait for it to break* h2. Motivation *With great power comes great responsibility* - Uncle Ben. As the Kafka user list is growing, a mechanism to ensure the quality of the product is required. Quality becomes hard to measure and maintain in an open source project because of its wide community of contributors. Luckily, Kafka is not the first open source project and can benefit from the learnings of prior projects. PreCommit tests are the tests that are run for each patch that gets attached to an open JIRA. Based on the test results, the test execution framework (test bot) will +1 or -1 the patch. Having PreCommit tests takes the load off committers having to look at or test each patch. h2. Tests in Kafka h3. Unit and Integration Tests [Unit and Integration tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Unit+and+Integration+Tests] are cardinal to help contributors avoid breaking existing functionality while adding new functionality or fixing older ones. These tests, at least the ones relevant to the changes, must be run by contributors before attaching a patch to a JIRA. h3. 
System Tests [System tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+System+Tests] are much wider tests that, unlike unit tests, focus on end-to-end scenarios and not some specific method or class. h2. Apache PreCommit tests Apache provides a mechanism to automatically build a project and run a series of tests whenever a patch is uploaded to a JIRA. Based on test execution, the test framework will comment with a +1 or -1 on the JIRA. You can read more about the framework here: http://wiki.apache.org/general/PreCommitBuilds h2. Plan # Create a test-patch.py script (similar to the one used in Flume, Sqoop and other projects) that will take a jira as a parameter, apply on the appropriate branch, build the project, run tests and report results. This script should be committed into the Kafka code-base. To begin with, this will only run unit tests. We can add code sanity checks, system_tests, etc in the future. # Create a jenkins job for running the test (as described in http://wiki.apache.org/general/PreCommitBuilds) and validate that it works manually. This must be done by a committer with Jenkins access. # Ask someone with access to https://builds.apache.org/job/PreCommit-Admin/ to add Kafka to the list of projects PreCommit-Admin triggers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1856) Add PreCommit Patch Testing
[ https://issues.apache.org/jira/browse/KAFKA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1856: - Attachment: KAFKA-1845.result.txt really cool, just tried this out {code} python dev-utils/test-patch.py --defect KAFKA-1845 --output patch-process --run-tests {code} which I think once this is in the jenkins build would have shown up on the KAFKA-1845 ticket as Testing file [KAFKA-1845_2015-02-08_17%3A05%3A22.patch|https://issues.apache.org/jira/secure/attachment/12697336/KAFKA-1845_2015-02-08_17%3A05%3A22.patch] against branch trunk took 0:31:28.393900. {color:green}Overall:{color} +1 all checks pass {color:green}SUCCESS:{color} Gradle bootstrap was successful {color:green}SUCCESS:{color} Clean was successful {color:green}SUCCESS:{color} Patch applied, but there has been warnings: {code}stdin:233: space before tab in indent. if (trimmed.equalsIgnoreCase("true")) stdin:234: space before tab in indent. return true; stdin:235: space before tab in indent. else if (trimmed.equalsIgnoreCase("false")) stdin:236: space before tab in indent. return false; stdin:237: space before tab in indent. else warning: squelched 1 whitespace error warning: 6 lines add whitespace errors. {code} {color:green}SUCCESS:{color} Patch add/modify test case {color:green}SUCCESS:{color} Gradle bootstrap was successful {color:green}SUCCESS:{color} Patch compiled {color:green}SUCCESS:{color} Checked style for Main {color:green}SUCCESS:{color} Checked style for Test {color:green}SUCCESS:{color} All unit tests passed This message is automatically generated. Add PreCommit Patch Testing --- Key: KAFKA-1856 URL: https://issues.apache.org/jira/browse/KAFKA-1856 Project: Kafka Issue Type: Task Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: KAFKA-1845.result.txt, KAFKA-1856.patch, KAFKA-1856_2015-01-18_21:43:56.patch, KAFKA-1856_2015-02-04_14:57:05.patch, KAFKA-1856_2015-02-04_15:44:47.patch h1. 
Kafka PreCommit Patch Testing - *Don't wait for it to break* h2. Motivation *With great power comes great responsibility* - Uncle Ben. As the Kafka user list is growing, a mechanism to ensure the quality of the product is required. Quality becomes hard to measure and maintain in an open source project because of its wide community of contributors. Luckily, Kafka is not the first open source project and can benefit from the learnings of prior projects. PreCommit tests are the tests that are run for each patch that gets attached to an open JIRA. Based on the test results, the test execution framework (test bot) will +1 or -1 the patch. Having PreCommit tests takes the load off committers having to look at or test each patch. h2. Tests in Kafka h3. Unit and Integration Tests [Unit and Integration tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Unit+and+Integration+Tests] are cardinal to help contributors avoid breaking existing functionality while adding new functionality or fixing older ones. These tests, at least the ones relevant to the changes, must be run by contributors before attaching a patch to a JIRA. h3. System Tests [System tests|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+System+Tests] are much wider tests that, unlike unit tests, focus on end-to-end scenarios and not some specific method or class. h2. Apache PreCommit tests Apache provides a mechanism to automatically build a project and run a series of tests whenever a patch is uploaded to a JIRA. Based on test execution, the test framework will comment with a +1 or -1 on the JIRA. You can read more about the framework here: http://wiki.apache.org/general/PreCommitBuilds h2. Plan # Create a test-patch.py script (similar to the one used in Flume, Sqoop and other projects) that will take a jira as a parameter, apply on the appropriate branch, build the project, run tests and report results. This script should be committed into the Kafka code-base. To begin with, this will only run unit tests. 
We can add code sanity checks, system_tests, etc in the future. # Create a jenkins job for running the test (as described in http://wiki.apache.org/general/PreCommitBuilds) and validate that it works manually. This must be done by a committer with Jenkins access. # Ask someone with access to https://builds.apache.org/job/PreCommit-Admin/ to add Kafka to the list of projects PreCommit-Admin triggers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
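The test-patch flow described in the plan above - apply the patch, build, run tests, then vote +1 or -1 on the JIRA - can be sketched as a simple driver. This is a hypothetical illustration of the control flow, not the actual test-patch.py; the step names and the `run_precommit` helper are made up:

```python
def run_precommit(jira_id, steps):
    # Run each named step in order, stopping at the first failure.
    # A +1 verdict requires every step to have run and passed.
    results = []
    for name, step in steps:
        ok = step(jira_id)
        results.append((name, ok))
        if not ok:
            break
    passed = len(results) == len(steps) and all(ok for _, ok in results)
    return ("+1" if passed else "-1"), results

# Stub steps standing in for the real apply/build/test phases:
steps = [
    ("apply patch", lambda jira: True),
    ("build", lambda jira: True),
    ("unit tests", lambda jira: False),  # a failing step yields -1
]
```

In the real setup the Jenkins job would run this per attached patch and post the verdict back to the ticket.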
[jira] [Commented] (KAFKA-1898) compatibility testing framework
[ https://issues.apache.org/jira/browse/KAFKA-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311098#comment-14311098 ] Joe Stein commented on KAFKA-1898: -- Updated the description. The part the client library would be doing is consuming the data it is given (so we need some data sets to use) plus a set of test scenarios wherein they need to produce the right data. The data sets and the analysis output would be what the client library is trying to get to from a result perspective. Another example/test would be "supports compression" ... in this case the data set would be compressed; they would have to read it, make some change (like hash the message) and re-compress that hashed message into another topic. The validation could then run and see: yup, my input of List(x) (which was compressed) gave me a result of hash(List(x)) in the dest topic (and when analyzing you would uncompress to see the value and check every message one by one). It would be hard to cheat since new data could be generated each time. We are also looking to use this for some end-to-end latency testing. The goal behind that is to calculate an array of timestamps (based on a key/value map) so you can figure out some more internals and things about the system e.g. {code} {"namespace": "ly.stealth.kafka.metrics", "type": "record", "name": "Timings", "fields": [ {"name": "id", "type": "long"}, {"name": "timings", "type": {"type": "array", "items": "long"} } ] } {code} compatibility testing framework Key: KAFKA-1898 URL: https://issues.apache.org/jira/browse/KAFKA-1898 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 Attachments: cctk.png There are a few different scenarios where you want/need to know the status/state of a client library that works with Kafka. Client library development is not just about supporting the wire protocol but also the implementations around specific interactions of the API. 
The API has blossomed into a robust set of producer, consumer, broker and administrative calls, all of which have layers of logic above them. A client library may choose to deviate from the path the project sets out and that is ok. The goal of this ticket is to have a system for Kafka that can help to explain what the library is or isn't doing (regardless of what it claims). The idea behind this stems from being able to quickly/easily/succinctly analyze the topic message data. Once you can analyze the topic(s) messages you can gather lots of information about what the client library is doing, is not doing and such. There are a few components to this. 1) dataset-generator Test Kafka dataset generation tool. Generates a random text file with given params: --filename, -f - output file name. --filesize, -s - desired size of output file. The actual size will always be a bit larger (with a maximum size of $filesize + $max.length - 1) --min.length, -l - minimum generated entry length. --max.length, -h - maximum generated entry length. Usage: ./gradlew build java -jar dataset-generator/build/libs/dataset-generator-*.jar -s 10 -l 2 -h 20 2) dataset-producer Test Kafka dataset producer tool. Able to produce the given dataset to Kafka or a Syslog server. The idea here is you already have lots of data sets that you want to test different things with. You might have different-sized messages, formats, etc. and want a repeatable benchmark to run and re-run the testing on. You could just have a day's worth of data and choose to replay it. The CCTK idea is that you are always starting from CONSUME in your library's state. If your library is only producing then you will fail a bunch of tests and that might be ok for people. Accepts the following params: {code} --filename, -f - input file name. --kafka, -k - Kafka broker address in host:port format. If this parameter is set, --producer.config and --topic must be set too (otherwise they're ignored). 
--producer.config, -p - Kafka producer properties file location. --topic, -t - Kafka topic to produce to. --syslog, -s - Syslog server address. Format: protocol://host:port (tcp://0.0.0.0:5140 or udp://0.0.0.0:5141 for example) --loop, -l - flag to loop through file until shut off manually. False by default. Usage: ./gradlew build java -jar dataset-producer/build/libs/dataset-producer-*.jar --filename dataset --syslog tcp://0.0.0.0:5140 --loop true {code} 3) extract This step is good so you can save data and compare tests. It could also be removed if folks are just looking for a real live test (and we could support that too). Here we are taking data out of Kafka and putting it into Cassandra (but other data stores can be used too and we should come up with
[jira] [Updated] (KAFKA-1898) compatibility testing framework
[ https://issues.apache.org/jira/browse/KAFKA-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1898: - Description: There are a few different scenarios where you want/need to know the status/state of a client library that works with Kafka. Client library development is not just about supporting the wire protocol but also the implementations around specific interactions of the API. The API has blossomed into a robust set of producer, consumer, broker and administrative calls, all of which have layers of logic above them. A client library may choose to deviate from the path the project sets out and that is ok. The goal of this ticket is to have a system for Kafka that can help to explain what the library is or isn't doing (regardless of what it claims). The idea behind this stems from being able to quickly/easily/succinctly analyze the topic message data. Once you can analyze the topic(s) messages you can gather lots of information about what the client library is doing, is not doing and such. There are a few components to this. 1) dataset-generator Test Kafka dataset generation tool. Generates a random text file with given params: --filename, -f - output file name. --filesize, -s - desired size of output file. The actual size will always be a bit larger (with a maximum size of $filesize + $max.length - 1) --min.length, -l - minimum generated entry length. --max.length, -h - maximum generated entry length. Usage: ./gradlew build java -jar dataset-generator/build/libs/dataset-generator-*.jar -s 10 -l 2 -h 20 2) dataset-producer Test Kafka dataset producer tool. Able to produce the given dataset to Kafka or a Syslog server. The idea here is you already have lots of data sets that you want to test different things with. You might have different-sized messages, formats, etc. and want a repeatable benchmark to run and re-run the testing on. You could just have a day's worth of data and choose to replay it. 
The CCTK idea is that you are always starting from CONSUME in your library's state. If your library is only producing then you will fail a bunch of tests and that might be ok for people. Accepts the following params: {code} --filename, -f - input file name. --kafka, -k - Kafka broker address in host:port format. If this parameter is set, --producer.config and --topic must be set too (otherwise they're ignored). --producer.config, -p - Kafka producer properties file location. --topic, -t - Kafka topic to produce to. --syslog, -s - Syslog server address. Format: protocol://host:port (tcp://0.0.0.0:5140 or udp://0.0.0.0:5141 for example) --loop, -l - flag to loop through file until shut off manually. False by default. Usage: ./gradlew build java -jar dataset-producer/build/libs/dataset-producer-*.jar --filename dataset --syslog tcp://0.0.0.0:5140 --loop true {code} 3) extract This step is good so you can save data and compare tests. It could also be removed if folks are just looking for a real live test (and we could support that too). Here we are taking data out of Kafka and putting it into Cassandra (but other data stores can be used too and we should come up with a way to abstract this out completely so folks could implement whatever they wanted).
{code} package ly.stealth.shaihulud.reader import java.util.UUID import com.datastax.spark.connector._ import com.datastax.spark.connector.cql.CassandraConnector import consumer.kafka.MessageAndMetadata import consumer.kafka.client.KafkaReceiver import org.apache.spark._ import org.apache.spark.storage.StorageLevel import org.apache.spark.streaming._ import org.apache.spark.streaming.dstream.DStream object Main extends App with Logging { val parser = new scopt.OptionParser[ReaderConfiguration]("spark-reader") { head("Spark Reader for Kafka client applications", "1.0") opt[String]("testId") unbounded() optional() action { (x, c) => c.copy(testId = x) } text ("Source topic with initial set of data") opt[String]("source") unbounded() required() action { (x, c) => c.copy(sourceTopic = x) } text ("Source topic with initial set of data") opt[String]("destination") unbounded() required() action { (x, c) => c.copy(destinationTopic = x) } text ("Destination topic with processed set of data") opt[Int]("partitions") unbounded() optional() action { (x, c) => c.copy(partitions = x) } text ("Partitions in topic") opt[String]("zookeeper") unbounded() required() action { (x, c) => c.copy(zookeeper = x) } text ("Zookeeper connection host:port") opt[Int]("kafka.fetch.size") unbounded() optional() action { (x, c) => c.copy(kafkaFetchSize = x) } text ("Maximum KBs to fetch from Kafka") checkConfig { c => if (c.testId.isEmpty || c.sourceTopic.isEmpty || c.destinationTopic.isEmpty || c.zookeeper.isEmpty) { failure("You haven't provided all required parameters") } else
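The dataset-producer parameter rule quoted above (--kafka requires --producer.config and --topic to be set too) can be sketched as a small argument-validation snippet. This is an illustrative Python sketch, not the tool's actual code; the `build_parser`/`validate` helpers are hypothetical:

```python
import argparse

def build_parser():
    # Flags mirror the ones listed in the ticket's description.
    p = argparse.ArgumentParser(prog="dataset-producer")
    p.add_argument("--filename", "-f", required=True)
    p.add_argument("--kafka", "-k")
    p.add_argument("--producer.config", "-p", dest="producer_config")
    p.add_argument("--topic", "-t")
    p.add_argument("--syslog", "-s")
    p.add_argument("--loop", "-l", action="store_true")
    return p

def validate(args):
    # The ticket's rule: if --kafka is set, --producer.config and
    # --topic must be set too.
    if args.kafka and not (args.producer_config and args.topic):
        return "error: --kafka requires --producer.config and --topic"
    return "ok"
```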
[jira] [Updated] (KAFKA-1898) compatibility testing framework
[ https://issues.apache.org/jira/browse/KAFKA-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1898: - Attachment: cctk.png compatibility testing framework Key: KAFKA-1898 URL: https://issues.apache.org/jira/browse/KAFKA-1898 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 Attachments: cctk.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1904) run sanity failed test
[ https://issues.apache.org/jira/browse/KAFKA-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1904: - Priority: Blocker (was: Major) run sanity failed test -- Key: KAFKA-1904 URL: https://issues.apache.org/jira/browse/KAFKA-1904 Project: Kafka Issue Type: Bug Reporter: Joe Stein Priority: Blocker Fix For: 0.8.3 Attachments: run_sanity.log.gz _test_case_name : testcase_1 _test_class_name : ReplicaBasicTest arg : bounce_broker : true arg : broker_type : leader arg : message_producing_free_time_sec : 15 arg : num_iteration : 2 arg : num_messages_to_produce_per_producer_call : 50 arg : num_partition : 2 arg : replica_factor : 3 arg : sleep_seconds_between_producer_calls : 1 validation_status : Test completed : FAILED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1912) Create a simple request re-routing facility
[ https://issues.apache.org/jira/browse/KAFKA-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301956#comment-14301956 ] Joe Stein commented on KAFKA-1912: -- I think this makes sense. Do you want to fold re-routing into the ClusterMetaData RQ/RP https://reviews.apache.org/r/29301/diff/ and commit at the same time, or make this patch work first, commit, and then rebase and fold KAFKA-1694 into trunk? We could make it pluggable, but if we did then https://issues.apache.org/jira/browse/KAFKA-1845 should go first before that interface gets in. Create a simple request re-routing facility --- Key: KAFKA-1912 URL: https://issues.apache.org/jira/browse/KAFKA-1912 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps We are accumulating a lot of requests that have to be directed to the correct server. This makes sense for high volume produce or fetch requests. But it is silly to put the extra burden on the client for the many miscellaneous requests such as fetching or committing offsets and so on. This adds a ton of practical complexity to the clients with little or no payoff in performance. We should add a generic request-type agnostic re-routing facility on the server. This would allow any server to accept a request and forward it to the correct destination, proxying the response back to the user. Naturally it needs to do this without blocking the thread. The result is that a client implementation can choose to be optimally efficient and manage a local cache of cluster state and attempt to always direct its requests to the proper server OR it can choose simplicity and just send things all to a single host and let that host figure out where to forward it. I actually think we should implement this more or less across the board, but some requests such as produce and fetch require more logic to proxy since they have to be scattered out to multiple servers and gathered back to create the response. 
So these could be done in a second phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
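The serve-or-forward decision described above can be sketched as a toy model. This is a hypothetical illustration of the routing idea only (synchronous, single request, no scatter/gather); the function names and the dict-shaped request/response are made up for the example:

```python
def handle_request(broker_id, request, owner_of, send_to):
    # Serve locally when this broker owns the request's target;
    # otherwise forward to the owner and relay its response back,
    # tagging it so we can tell a proxied response from a local one.
    owner = owner_of(request["target"])
    if owner == broker_id:
        return {"served_by": broker_id, "proxied": False}
    response = dict(send_to(owner, request))  # forward and await reply
    response["proxied"] = True
    return response
```

A client can then either track ownership itself and always hit the right broker, or send everything to one broker and let this facility do the forwarding - exactly the trade-off the ticket describes.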
[jira] [Created] (KAFKA-1904) run sanity failured test
Joe Stein created KAFKA-1904: Summary: run sanity failured test Key: KAFKA-1904 URL: https://issues.apache.org/jira/browse/KAFKA-1904 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.2 _test_case_name : testcase_1 _test_class_name : ReplicaBasicTest arg : bounce_broker : true arg : broker_type : leader arg : message_producing_free_time_sec : 15 arg : num_iteration : 2 arg : num_messages_to_produce_per_producer_call : 50 arg : num_partition : 2 arg : replica_factor : 3 arg : sleep_seconds_between_producer_calls : 1 validation_status : Test completed : FAILED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1904) run sanity failed test
[ https://issues.apache.org/jira/browse/KAFKA-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1904: - Summary: run sanity failed test (was: run sanity failured test) run sanity failed test -- Key: KAFKA-1904 URL: https://issues.apache.org/jira/browse/KAFKA-1904 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.2 _test_case_name : testcase_1 _test_class_name : ReplicaBasicTest arg : bounce_broker : true arg : broker_type : leader arg : message_producing_free_time_sec : 15 arg : num_iteration : 2 arg : num_messages_to_produce_per_producer_call : 50 arg : num_partition : 2 arg : replica_factor : 3 arg : sleep_seconds_between_producer_calls : 1 validation_status : Test completed : FAILED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1904) run sanity failed test
[ https://issues.apache.org/jira/browse/KAFKA-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1904: - Attachment: run_sanity.log.gz attached run on trunk, 1 failure lots of skips run sanity failed test -- Key: KAFKA-1904 URL: https://issues.apache.org/jira/browse/KAFKA-1904 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.2 Attachments: run_sanity.log.gz _test_case_name : testcase_1 _test_class_name : ReplicaBasicTest arg : bounce_broker : true arg : broker_type : leader arg : message_producing_free_time_sec : 15 arg : num_iteration : 2 arg : num_messages_to_produce_per_producer_call : 50 arg : num_partition : 2 arg : replica_factor : 3 arg : sleep_seconds_between_producer_calls : 1 validation_status : Test completed : FAILED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1904) run sanity failed test
[ https://issues.apache.org/jira/browse/KAFKA-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296179#comment-14296179 ] Joe Stein commented on KAFKA-1904: -- 0.8.2 works = Total failures count : 0 run sanity failed test -- Key: KAFKA-1904 URL: https://issues.apache.org/jira/browse/KAFKA-1904 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 Attachments: run_sanity.log.gz _test_case_name : testcase_1 _test_class_name : ReplicaBasicTest arg : bounce_broker : true arg : broker_type : leader arg : message_producing_free_time_sec : 15 arg : num_iteration : 2 arg : num_messages_to_produce_per_producer_call : 50 arg : num_partition : 2 arg : replica_factor : 3 arg : sleep_seconds_between_producer_calls : 1 validation_status : Test completed : FAILED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1904) run sanity failed test
[ https://issues.apache.org/jira/browse/KAFKA-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1904: - Fix Version/s: (was: 0.8.2) 0.8.3 run sanity failed test -- Key: KAFKA-1904 URL: https://issues.apache.org/jira/browse/KAFKA-1904 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 Attachments: run_sanity.log.gz _test_case_name : testcase_1 _test_class_name : ReplicaBasicTest arg : bounce_broker : true arg : broker_type : leader arg : message_producing_free_time_sec : 15 arg : num_iteration : 2 arg : num_messages_to_produce_per_producer_call : 50 arg : num_partition : 2 arg : replica_factor : 3 arg : sleep_seconds_between_producer_calls : 1 validation_status : Test completed : FAILED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1901) Move Kafka version to be generated in code by build (instead of in manifest)
[ https://issues.apache.org/jira/browse/KAFKA-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295614#comment-14295614 ] Joe Stein commented on KAFKA-1901: -- The same way you are suggesting for the brokers to have their version metric emitted, do the same from producers and consumers. I think though that the producer and consumer metrics should keep flowing down to the brokers, so you also see the metrics of what each broker is seeing for each producer. Ops can watch the brokers roll, and then all the different consumers and producers, and see succinctly how everything is going during testing prior to release. Making sure this is in the wire protocol allows non-Apache clients to utilize the feature. Move Kafka version to be generated in code by build (instead of in manifest) Key: KAFKA-1901 URL: https://issues.apache.org/jira/browse/KAFKA-1901 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2 Reporter: Jason Rosenberg With 0.8.2 (rc2), I've started seeing this warning in the logs of apps deployed to our staging (both server and client): {code} 2015-01-23 00:55:25,273 WARN [async-message-sender-0] common.AppInfo$ - Can't read Kafka version from MANIFEST.MF. Possible cause: java.lang.NullPointerException {code} The issue is that in our deployment, apps are deployed with single 'shaded' jars (e.g. using the maven shade plugin). This means the MANIFEST.MF file won't have a Kafka version. Instead, suggest the Kafka build generate the proper version in code, as part of the build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1901) Move Kafka version to be generated in code by build (instead of in manifest)
[ https://issues.apache.org/jira/browse/KAFKA-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294838#comment-14294838 ] Joe Stein commented on KAFKA-1901: -- Can we collect that info during a metadata fetch for producers and consumers too? We could then visualize producer, consumer, and broker roll states from each broker's perspective. Move Kafka version to be generated in code by build (instead of in manifest) Key: KAFKA-1901 URL: https://issues.apache.org/jira/browse/KAFKA-1901 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2 Reporter: Jason Rosenberg With 0.8.2 (rc2), I've started seeing this warning in the logs of apps deployed to our staging environment (both server and client): {code} 2015-01-23 00:55:25,273 WARN [async-message-sender-0] common.AppInfo$ - Can't read Kafka version from MANIFEST.MF. Possible cause: java.lang.NullPointerException {code} The issue is that in our deployment, apps are deployed as single 'shaded' jars (e.g. using the Maven shade plugin). This means the MANIFEST.MF file won't have a Kafka version. Instead, I suggest the Kafka build generate the proper version in code, as part of the build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292368#comment-14292368 ] Joe Stein commented on KAFKA-1792: -- [~Dmitry Pekar] can you write up a design for what you have already coded and post it to https://cwiki.apache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+re-balancing please. From there we can move discussions to the mailing list. change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, KAFKA-1792_2014-12-08_13:42:43.patch, KAFKA-1792_2014-12-19_16:48:12.patch, KAFKA-1792_2015-01-14_12:54:52.patch, generate_alg_tests.txt, rebalance_use_cases.txt The current implementation produces a fair replica distribution across the specified list of brokers. Unfortunately, it doesn't take the current replica assignment into account. So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth broker id=3, --generate will create an assignment config which redistributes replicas fairly across brokers [0..3] as if those partitions were being created from scratch. It will not take the current replica assignment into consideration and accordingly will not try to minimize the number of replica moves between brokers. As proposed by [~charmalloc], this should be improved. The output of the improved --generate algorithm should satisfy the following requirements: - fairness of replica distribution - every broker will have R or R+1 replicas assigned; - minimum of reassignments - the number of replica moves between brokers will be minimal. Example.
Consider the following replica distribution across brokers [0..3] (we just added brokers 2 and 3): - broker - 0, 1, 2, 3 - replicas - 7, 6, 0, 0 The new algorithm will produce the following assignment: - broker - 0, 1, 2, 3 - replicas - 4, 3, 3, 3 - moves - -3, -3, +3, +3 It is fair, and the number of moves is 6, which is minimal for the specified initial distribution. The scope of this issue is: - design an algorithm matching the above requirements; - implement this algorithm and unit tests; - test it manually using different initial assignments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
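The counting in the example above — every broker ends up with R or R+1 replicas, and the brokers already holding the most replicas keep the +1 extras so the move count stays minimal — can be sketched as follows. This is only an illustration of the fairness/minimal-moves arithmetic from the ticket, not the actual reassignment algorithm, which must also decide which specific partition replicas move:

```python
def fair_targets(current):
    """Given current replica counts per broker, return (target, moves):
    a distribution where every broker holds R or R+1 replicas, with the
    +1 extras assigned to the brokers that already hold the most
    replicas, so the total number of replica moves is minimal."""
    total, n = sum(current), len(current)
    base, extra = divmod(total, n)
    # Brokers with the most replicas keep the extras: fewer moves needed.
    keep_extra = sorted(range(n), key=lambda i: current[i], reverse=True)[:extra]
    target = [base + (1 if i in keep_extra else 0) for i in range(n)]
    moves = [t - c for t, c in zip(target, current)]
    return target, moves

# The ticket's example: brokers [0..3] holding [7, 6, 0, 0] replicas
# becomes [4, 3, 3, 3] with moves [-3, -3, +3, +3], i.e. 6 moves total.
target, moves = fair_targets([7, 6, 0, 0])
```

Assigning the extras greedily to the fullest brokers is what keeps the move count at the theoretical minimum for a given starting distribution.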
[jira] [Created] (KAFKA-1898) compatibility testing framework
Joe Stein created KAFKA-1898: Summary: compatibility testing framework Key: KAFKA-1898 URL: https://issues.apache.org/jira/browse/KAFKA-1898 Project: Kafka Issue Type: Bug Reporter: Joe Stein Fix For: 0.8.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1845) KafkaConfig should use ConfigDef
[ https://issues.apache.org/jira/browse/KAFKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1845: - Fix Version/s: 0.8.3 KafkaConfig should use ConfigDef - Key: KAFKA-1845 URL: https://issues.apache.org/jira/browse/KAFKA-1845 Project: Kafka Issue Type: Sub-task Reporter: Gwen Shapira Assignee: Andrii Biletskyi Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1845.patch ConfigDef is already used for the new producer and for TopicConfig. It would be nice to standardize and use one configuration and validation library across the board. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1890) Fix bug preventing Mirror Maker from successful rebalance.
[ https://issues.apache.org/jira/browse/KAFKA-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1890: - Fix Version/s: 0.8.3 Fix bug preventing Mirror Maker from successful rebalance. -- Key: KAFKA-1890 URL: https://issues.apache.org/jira/browse/KAFKA-1890 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Fix For: 0.8.3 Attachments: KAFKA-1890.patch Follow-up patch for KAFKA-1650 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1890) Fix bug preventing Mirror Maker from successful rebalance.
[ https://issues.apache.org/jira/browse/KAFKA-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1890: - Description: Follow-up patch for KAFKA-1650 Fix bug preventing Mirror Maker from successful rebalance. -- Key: KAFKA-1890 URL: https://issues.apache.org/jira/browse/KAFKA-1890 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Fix For: 0.8.3 Attachments: KAFKA-1890.patch Follow-up patch for KAFKA-1650 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1890) Fix bug preventing Mirror Maker from successful rebalance.
[ https://issues.apache.org/jira/browse/KAFKA-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1890: - Summary: Fix bug preventing Mirror Maker from successful rebalance. (was: Follow-up patch for KAFKA-1650) Fix bug preventing Mirror Maker from successful rebalance. -- Key: KAFKA-1890 URL: https://issues.apache.org/jira/browse/KAFKA-1890 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Fix For: 0.8.3 Attachments: KAFKA-1890.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1890) Fix bug preventing Mirror Maker from successful rebalance.
[ https://issues.apache.org/jira/browse/KAFKA-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1890: - Reviewer: Gwen Shapira Fix bug preventing Mirror Maker from successful rebalance. -- Key: KAFKA-1890 URL: https://issues.apache.org/jira/browse/KAFKA-1890 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Fix For: 0.8.3 Attachments: KAFKA-1890.patch Follow-up patch for KAFKA-1650 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1890) Follow-up patch for KAFKA-1650
[ https://issues.apache.org/jira/browse/KAFKA-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1890: - Description: (was: Fix bug preventing Mirror Maker from successful rebalance.) Follow-up patch for KAFKA-1650 -- Key: KAFKA-1890 URL: https://issues.apache.org/jira/browse/KAFKA-1890 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Fix For: 0.8.3 Attachments: KAFKA-1890.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1891) MirrorMaker hides consumer exception - making troubleshooting challenging
[ https://issues.apache.org/jira/browse/KAFKA-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1891: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. MirrorMaker hides consumer exception - making troubleshooting challenging - Key: KAFKA-1891 URL: https://issues.apache.org/jira/browse/KAFKA-1891 Project: Kafka Issue Type: Bug Reporter: Gwen Shapira Assignee: Gwen Shapira Fix For: 0.8.3 Attachments: KAFKA-1891.patch When MirrorMaker encounters an issue creating a consumer, it gives a generic "unable to create stream" error while hiding the actual issue. We should print the original exception too, so users can resolve whatever issue prevents MirrorMaker from creating a stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
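The fix described in KAFKA-1891 — surfacing the underlying cause instead of only a generic error — is the standard exception-chaining pattern. A minimal sketch of the idea (MirrorMaker itself is JVM code; the class and function names here are hypothetical):

```python
class StreamCreationError(Exception):
    """Hypothetical stand-in for the generic 'unable to create stream' failure."""

def create_stream(connect):
    """Wrap a lower-level failure in a higher-level error without
    discarding the original exception."""
    try:
        return connect()
    except Exception as exc:
        # Chain the original exception so the root cause shows up in the
        # traceback instead of being hidden behind a generic message.
        raise StreamCreationError("unable to create stream") from exc
```

With chaining, an operator troubleshooting the failure sees both the generic wrapper and the real cause (e.g. a bad consumer config) in one traceback.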
[jira] [Commented] (KAFKA-1888) Add a rolling upgrade system test
[ https://issues.apache.org/jira/browse/KAFKA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288694#comment-14288694 ] Joe Stein commented on KAFKA-1888: -- I agree this is something that we need. I don't know if the current system tests are the right vehicle for the effort, though; the system tests haven't been able to help with compatibility for clients or tooling. We started on some Spark jobs https://github.com/stealthly/gauntlet which I think we can make work for this type of test too. If this is an approach that folks in the core project might be interested in, I could write up a KIP. Add a rolling upgrade system test --- Key: KAFKA-1888 URL: https://issues.apache.org/jira/browse/KAFKA-1888 Project: Kafka Issue Type: Improvement Components: system tests Reporter: Gwen Shapira Assignee: Gwen Shapira Fix For: 0.9.0 To help test upgrades and compatibility between versions, it will be cool to add a rolling-upgrade test to the system tests: given two versions (just a path to the jars?), check that you can do a rolling upgrade of the brokers from one version to another (using clients from the old version) without losing data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1891) MirrorMaker hides consumer exception - making troubleshooting challenging
[ https://issues.apache.org/jira/browse/KAFKA-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein updated KAFKA-1891: - Fix Version/s: 0.8.3 MirrorMaker hides consumer exception - making troubleshooting challenging - Key: KAFKA-1891 URL: https://issues.apache.org/jira/browse/KAFKA-1891 Project: Kafka Issue Type: Bug Reporter: Gwen Shapira Assignee: Gwen Shapira Fix For: 0.8.3 Attachments: KAFKA-1891.patch When MirrorMaker encounters an issue creating a consumer, it gives a generic "unable to create stream" error while hiding the actual issue. We should print the original exception too, so users can resolve whatever issue prevents MirrorMaker from creating a stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)