Re: Kafka/Hadoop consumers and producers
I think it makes sense to kill the Hadoop consumer/producer code in Kafka, given, as you said, Camus and the simplicity of the Hadoop producer.

/Sam

On Jul 2, 2013, at 5:01 PM, Jay Kreps jay.kr...@gmail.com wrote:

We currently have a contrib package for consuming and producing messages from MapReduce ( https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tree;f=contrib;h=e53e1fb34893e733b10ff27e79e6a1dcbb8d7ab0;hb=HEAD ). We keep running into problems (e.g. KAFKA-946) that are basically due to the fact that most of the Kafka committers aren't Hadoop developers, so we aren't doing a good job of maintaining this code (keeping it tested, improving it, documenting it, writing tutorials, getting it moved over to the more modern APIs, getting it working with newer Hadoop versions, etc.).

A couple of options:
1. We could try to get someone in the Kafka community (either a current committer or not) who would adopt this as their baby (it's not much code).
2. We could just let Camus take over this functionality. They already have a more sophisticated consumer, and the producer is pretty minimal.

So, are there any people who would like to adopt the current Hadoop contrib code? Conversely, would it be possible to provide the same or similar functionality in Camus and just delete these?

-Jay
Re: Kafka/Hadoop consumers and producers
If the Hadoop consumer/producer use case will remain relevant for Kafka (I assume it will), it would make sense to have the core components (the Kafka input/output formats at least) as part of Kafka, so that they can be built, tested and versioned together to maintain compatibility. This would also make it easier to build custom MR jobs on top of Kafka, rather than having to decouple stuff from Camus. It would also be less confusing for users, at least when starting to use Kafka. Camus could use those components instead of providing its own.

That being said, we did some work on the consumer side (0.8 and the new(er) MR API). We could probably try to rewrite it to use Camus, or fix Camus, or whatever, but please consider this alternative as well.

Thanks,
Cosmin

On 7/3/13 11:06 AM, Sam Meder sam.me...@jivesoftware.com wrote:

I think it makes sense to kill the Hadoop consumer/producer code in Kafka, given, as you said, Camus and the simplicity of the Hadoop producer.

/Sam

On Jul 2, 2013, at 5:01 PM, Jay Kreps jay.kr...@gmail.com wrote:

[...]
[jira] [Created] (KAFKA-958) Please compile list of key metrics on the broker and client side and put it on a wiki
Vadim created KAFKA-958:
----------------------------

             Summary: Please compile list of key metrics on the broker and client side and put it on a wiki
                 Key: KAFKA-958
                 URL: https://issues.apache.org/jira/browse/KAFKA-958
             Project: Kafka
          Issue Type: Bug
          Components: website
    Affects Versions: 0.8
            Reporter: Vadim
            Priority: Minor

Please compile a list of the important metrics that need to be monitored by companies to ensure healthy operation of the Kafka service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-943) Move all configuration key string to constants
[ https://issues.apache.org/jira/browse/KAFKA-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699095#comment-13699095 ]

Jay Kreps commented on KAFKA-943:
---------------------------------

Yeah, I think the core disagreement is that we are optimizing for slightly different uses. I think the right way to handle config is through a config management system external to the application--that is why we use properties to begin with instead of just a config pojo--so that is the use case I am optimizing for. In that case the property name *is* the contract. This has obvious pros and cons. The pro is having a hard separation between config and code and the ability to manage configuration at user-defined groupings (the instance, server, cluster, datacenter, and fabric levels, say). The con is that it is not compile-time checked. I agree that this is somewhat subjective.

The rationale for warning versus error for unused configs is the ability to keep configuration detached from the application. I.e. let's say you want to roll out kafka client++ and it has a new configuration. Do you roll out the new configuration first or the new code? If you do the config first, then it is important that we be able to ignore the property so that restarting your app works; if you do the code first, then you will end up with the default, which may or may not work.

The non-existent properties are obviously embarrassing, and we should fix them irrespective of anything else.

Move all configuration key string to constants
----------------------------------------------

                Key: KAFKA-943
                URL: https://issues.apache.org/jira/browse/KAFKA-943
            Project: Kafka
         Issue Type: Improvement
         Components: config
   Affects Versions: 0.8
           Reporter: Sam Meder
        Attachments: configConstants.patch

The current code base has configuration key strings duplicated all over the place. They show up in the actual *Config classes, a lot of tests, command line utilities and other examples. This makes changes hard and error prone. DRY...

The attached patch moves these configuration keys to constants and replaces their usage with a reference to the constant. It also cleans up a few old properties and a few misconfigured tests. I've admittedly not written a whole lot of Scala, so there may be some improvements that can be made; in particular, I am not sure I chose the best strategy for keys needed by the SyncProducerConfigShared trait (or traits in general).
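The refactoring the patch describes — replacing duplicated config key strings with shared constants — can be sketched in plain Java. (The actual patch is against Kafka's Scala *Config classes; the class and constant names below are illustrative, not Kafka's real ones.)

```java
import java.util.Properties;

// Illustrative only: Kafka's real config keys live in its Scala *Config
// classes; these constants just demonstrate the DRY refactoring.
final class ConfigKeys {
    static final String BROKER_ID = "broker.id";
    static final String LOG_DIR = "log.dir";
    static final String ZK_CONNECT = "zookeeper.connect";

    private ConfigKeys() {}
}

public class ConfigConstantsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // A typo in ConfigKeys.BROKER_ID is a compile error; a typo in the
        // string literal "broker.id" would just be a silently ignored key.
        props.setProperty(ConfigKeys.BROKER_ID, "1");
        props.setProperty(ConfigKeys.ZK_CONNECT, "localhost:2181");
        System.out.println(props.getProperty(ConfigKeys.BROKER_ID)); // prints 1
    }
}
```

Tests, command-line utilities, and examples would then reference the constants instead of repeating the literals, which is the duplication the patch removes.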
Re: criteria for fixes on 0.8 beta
+1 on Jun's defn.

On Tue, Jul 2, 2013 at 10:21 PM, Jun Rao jun...@gmail.com wrote:

My vote is that a patch can go into 0.8 if (1) it fixes a critical issue or (2) the change is trivial and it makes the 0.8 experience better (e.g., improving log4j readability). KAFKA-946 may fall into (2).

Thanks,
Jun

On Tue, Jul 2, 2013 at 7:34 PM, Joe Stein crypt...@gmail.com wrote:

How about for now: by default we will not incorporate fixes into 0.8 unless there is a compelling argument (e.g., regression/clear bug with no good workaround) to do so.

+1

On Tue, Jul 2, 2013 at 8:48 PM, Joel Koshy jjkosh...@gmail.com wrote:

Good question. Some fixes are clearly critical (e.g., consumer deadlocks) that would impact everyone and need to go into 0.8. Unfortunately, the criticality of most other fixes is subjective, and I'm not sure how feasible it is to develop global criteria. It probably needs to be determined through consensus whether a fix needs to go into 0.8 or not.

How about for now: by default we will not incorporate fixes into 0.8 unless there is a compelling argument (e.g., regression/clear bug with no good workaround) to do so.

Joel

On Tue, Jul 2, 2013 at 5:11 PM, Jay Kreps jay.kr...@gmail.com wrote:

What should the criteria for fixes on 0.8 be? This seems like a reasonable candidate, but I don't think we discussed what we would be taking, so I thought I would ask...
https://issues.apache.org/jira/browse/KAFKA-946

-Jay

--
/* Joe Stein
   http://www.linkedin.com/in/charmalloc
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
*/
[jira] [Commented] (KAFKA-943) Move all configuration key string to constants
[ https://issues.apache.org/jira/browse/KAFKA-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699128#comment-13699128 ]

Sam Meder commented on KAFKA-943:
---------------------------------

So we also have a config management system external to the application, but it has a key space managed by ourselves, and we do not expose Kafka properties directly in that configuration. We also end up not exposing some of the Kafka properties at all; some of them can just be set to reasonable defaults for our use case and not surfaced in our configuration system. As always, it is hard to say how common our use is, but it doesn't strike me as that unusual.

Fair point regarding coupling code and config (we end up deploying code and configuration in lockstep - they're versioned together). I guess one solution would be versioning the configuration, but that doesn't seem worth the complexity here.
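Jay's warn-versus-error argument for unused configs can be sketched as a tiny standalone wrapper. Note this is not Kafka's actual VerifiableProperties API (which does a similar "verify and warn on unused properties" pass at startup); the class below is a hypothetical illustration of the behavior being discussed.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical illustration of "warn, don't fail, on unused config keys".
// Kafka's VerifiableProperties behaves similarly; this is not its real API.
public class LenientConfig {
    private final Properties props;
    private final Set<String> used = new HashSet<>();

    public LenientConfig(Properties props) { this.props = props; }

    public String get(String key, String defaultValue) {
        used.add(key);
        return props.getProperty(key, defaultValue);
    }

    // Keys that were supplied but never read. Warning on these (instead of
    // throwing) lets a config referencing a not-yet-deployed feature roll
    // out ahead of the code that understands it, so app restarts still work.
    public Set<String> unusedKeys() {
        Set<String> unused = new HashSet<>(props.stringPropertyNames());
        unused.removeAll(used);
        return unused;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("broker.id", "1");
        p.setProperty("future.feature.flag", "on"); // unknown to this version
        LenientConfig config = new LenientConfig(p);
        config.get("broker.id", "0");
        for (String k : config.unusedKeys()) {
            System.err.println("WARN: unused config key: " + k);
        }
    }
}
```

The trade-off both comments circle is exactly here: a compile-time-checked constant can't represent a key the code doesn't know about yet, while string-keyed properties can, at the cost of typos being ignored rather than rejected.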
Re: Helping out
Sam,

Interested in taking a look at KAFKA-959 ( https://issues.apache.org/jira/browse/KAFKA-959 )?

Thanks,
Jun

On Sun, Jun 23, 2013 at 11:01 PM, Sam Meder sam.me...@jivesoftware.com wrote:

Hey,

I now have roughly a day a week I can dedicate to working on Kafka, so I am looking for issues in the 0.8.1 batch that you think might be good starting points. Input would be much appreciated.

Speaking of issues, I think it would be good to either fix https://issues.apache.org/jira/browse/KAFKA-946 for 0.8 or just drop the code from the release.

/Sam
[jira] [Created] (KAFKA-960) Upgrade Metrics to 3.x
Cosmin Lehene created KAFKA-960:
-----------------------------------

             Summary: Upgrade Metrics to 3.x
                 Key: KAFKA-960
                 URL: https://issues.apache.org/jira/browse/KAFKA-960
             Project: Kafka
          Issue Type: Improvement
    Affects Versions: 0.8
            Reporter: Cosmin Lehene
             Fix For: 0.8

Now that metrics 3.0 has been released (http://metrics.codahale.com/about/release-notes/) we can upgrade back.
[jira] [Commented] (KAFKA-826) Make Kafka 0.8 depend on metrics 2.2.0 instead of 3.x
[ https://issues.apache.org/jira/browse/KAFKA-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699191#comment-13699191 ]

Cosmin Lehene commented on KAFKA-826:
-------------------------------------

https://issues.apache.org/jira/browse/KAFKA-960

Make Kafka 0.8 depend on metrics 2.2.0 instead of 3.x
-----------------------------------------------------

                Key: KAFKA-826
                URL: https://issues.apache.org/jira/browse/KAFKA-826
            Project: Kafka
         Issue Type: Bug
         Components: core
   Affects Versions: 0.8
           Reporter: Neha Narkhede
           Assignee: Dragos Manolescu
           Priority: Blocker
             Labels: build, kafka-0.8, metrics
            Fix For: 0.8
        Attachments: kafka-fix-for-826-complete.patch, kafka-fix-for-826.patch, kafka-fix-for-826-take2.patch

In order to mavenize Kafka 0.8, we have to depend on metrics 2.2.0, since metrics 3.x is a huge change as well as not an officially supported release.
[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699211#comment-13699211 ]

Joel Koshy commented on KAFKA-960:
----------------------------------

Given that there are API changes and mbean name changes between 2.x and 3.x, my preference would be to defer this for a few months (until the official 3.x release has proven to be stable).
Re: Helping out
Jun,

Looks like the update would happen around here?

On Wed, Jul 3, 2013 at 12:53 PM, Jun Rao jun...@gmail.com wrote:

Sam,

Interested in taking a look at KAFKA-959 ( https://issues.apache.org/jira/browse/KAFKA-959 )?

Thanks,
Jun

On Sun, Jun 23, 2013 at 11:01 PM, Sam Meder sam.me...@jivesoftware.com wrote:

[...]
Re: Helping out
https://github.com/apache/kafka/blob/0.8/core/src/main/scala/kafka/producer/async/DefaultEventHandler.scala#L152

On Wed, Jul 3, 2013 at 2:18 PM, S Ahmed sahmed1...@gmail.com wrote:

Jun,

Looks like the update would happen around here?

On Wed, Jul 3, 2013 at 12:53 PM, Jun Rao jun...@gmail.com wrote:

[...]
[jira] [Updated] (KAFKA-957) MirrorMaker needs to preserve the key in the source cluster
[ https://issues.apache.org/jira/browse/KAFKA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-957:
--------------------------------

    Assignee: Guozhang Wang

MirrorMaker needs to preserve the key in the source cluster
-----------------------------------------------------------

                Key: KAFKA-957
                URL: https://issues.apache.org/jira/browse/KAFKA-957
            Project: Kafka
         Issue Type: Bug
         Components: core
   Affects Versions: 0.8
           Reporter: Jun Rao
           Assignee: Guozhang Wang

Currently, MirrorMaker only propagates the message to the target cluster, but not the associated key.
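The shape of the fix is small: when re-producing a mirrored message, pass the consumed key through instead of dropping it. The sketch below uses hypothetical stand-in record types rather than Kafka's real 0.8 consumer/producer classes (MessageAndMetadata and KeyedMessage), purely to show the before/after:

```java
// Hypothetical stand-ins for the real 0.8 types; only the key-forwarding
// logic is the point here.
public class MirrorSketch {
    record SourceRecord(String topic, byte[] key, byte[] value) {}
    record TargetMessage(String topic, byte[] key, byte[] value) {}

    // Behavior described in the ticket: the key is dropped, so key-based
    // partitioning in the target cluster diverges from the source cluster.
    static TargetMessage mirrorDroppingKey(SourceRecord r) {
        return new TargetMessage(r.topic(), null, r.value());
    }

    // Fixed behavior: forward the key along with the message.
    static TargetMessage mirrorPreservingKey(SourceRecord r) {
        return new TargetMessage(r.topic(), r.key(), r.value());
    }

    public static void main(String[] args) {
        SourceRecord r = new SourceRecord("page_visits",
                "user-42".getBytes(), "v".getBytes());
        System.out.println(mirrorPreservingKey(r).key() != null); // prints true
    }
}
```

Preserving the key matters because the target cluster's partitioner hashes on it; without it, messages that were co-partitioned at the source scatter across target partitions.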
[jira] [Created] (KAFKA-961) state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0)
Garrett Barton created KAFKA-961:
------------------------------------

             Summary: state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0)
                 Key: KAFKA-961
                 URL: https://issues.apache.org/jira/browse/KAFKA-961
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8
         Environment: Linux gman-minty 3.8.0-19-generic #29-Ubuntu SMP Wed Apr 17 18:16:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Garrett Barton

Been having issues embedding 0.8 servers into some Yarn stuff I'm doing. I just pulled the latest from git, did a ./sbt +package, followed by ./sbt assembly-package-dependency, and pushed core/target/scala-2.8.0/kafka_2.8.0-0.8.0-beta1.jar into my local mvn repo. Here is sample code, ripped out into little classes to show my error.

Starting up a broker embedded in Java, with the following code:

    ...
    Properties props = new Properties();
    // don't set hostname so it binds to all interfaces
    // props.setProperty("hostname", hostName);
    props.setProperty("port", );
    props.setProperty("broker.id", "1");
    props.setProperty("log.dir", "/tmp/embeddedkafka/" + randId); // TODO: hardcode bad
    props.setProperty("zookeeper.connect", "localhost:2181/" + randId);
    KafkaConfig kconf = new KafkaConfig(props);
    server = new KafkaServer(kconf, null);
    server.startup();
    LOG.info("Broker online");

The sample producer has the following code:

    Properties props = new Properties();
    props.put("metadata.broker.list", "gman-minty:");
    props.put("serializer.class", "kafka.serializer.StringEncoder");
    props.put("partitioner.class", "com.gman.broker.SimplePartitioner");
    props.put("request.required.acks", "1");
    ProducerConfig config = new ProducerConfig(props);
    Producer<String, String> producer = new Producer<String, String>(config);
    LOG.info("producer created");
    KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", "key1", "value1");
    producer.send(data);
    LOG.info("wrote message: " + data);

And here is the server log:

    INFO 2013-07-03 13:47:30,538 [Thread-0] kafka.utils.VerifiableProperties: Verifying properties
    INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property port is overridden to
    INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property broker.id is overridden to 1
    INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property zookeeper.connect is overridden to localhost:2181/kafkatest
    INFO 2013-07-03 13:47:30,569 [Thread-0] kafka.utils.VerifiableProperties: Property log.dir is overridden to \tmp\embeddedkafka\1372873650268
    INFO 2013-07-03 13:47:30,574 [Thread-0] kafka.server.KafkaServer: [Kafka Server 1], Starting
    INFO 2013-07-03 13:47:30,609 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Log directory '/home/gman/workspace/distributed_parser/\tmp\embeddedkafka\1372873650268' not found, creating it.
    INFO 2013-07-03 13:47:30,619 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log cleaner every 60 ms
    INFO 2013-07-03 13:47:30,630 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log flusher every 3000 ms with the following overrides Map()
    INFO 2013-07-03 13:47:30,687 [Thread-0] kafka.network.Acceptor: Awaiting socket connections on 0.0.0.0:.
    INFO 2013-07-03 13:47:30,688 [Thread-0] kafka.network.SocketServer: [Socket Server on Broker 1], Started
    INFO 2013-07-03 13:47:30,696 [Thread-0] kafka.server.KafkaZooKeeper: connecting to ZK: localhost:2181/kafkatest
    INFO 2013-07-03 13:47:30,707 [ZkClient-EventThread-17-localhost:2181/kafkatest] org.I0Itec.zkclient.ZkEventThread: Starting ZkClient event thread.
    INFO 2013-07-03 13:47:30,716 [Thread-0] org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.2-1221870, built on 12/21/2011 20:46 GMT
    INFO 2013-07-03 13:47:30,717 [Thread-0] org.apache.zookeeper.ZooKeeper: Client environment:host.name=gman-minty
    INFO 2013-07-03 13:47:30,717 [Thread-0] org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_27
    INFO 2013-07-03 13:47:30,717 [Thread-0] org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
    INFO 2013-07-03 13:47:30,717 [Thread-0] org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-6-openjdk-amd64/jre
    INFO 2013-07-03 13:47:30,717 [Thread-0] org.apache.zookeeper.ZooKeeper: Client
[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699389#comment-13699389 ]

Cosmin Lehene commented on KAFKA-960:
-------------------------------------

[~jjkoshy] I was going to set it back to 3.x on our internal branch. Are you aware of any issues with 3.x that would make it advisable not to?
[jira] [Resolved] (KAFKA-621) System Test 9051 : ConsoleConsumer doesn't receives any data for 20 topics but works for 10
[ https://issues.apache.org/jira/browse/KAFKA-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung resolved KAFKA-621.
-----------------------------

    Resolution: Fixed

Not an issue any more.

System Test 9051 : ConsoleConsumer doesn't receives any data for 20 topics but works for 10
-------------------------------------------------------------------------------------------

                Key: KAFKA-621
                URL: https://issues.apache.org/jira/browse/KAFKA-621
            Project: Kafka
         Issue Type: Bug
           Reporter: John Fung
        Attachments: kafka-621-reproduce-issue.patch

* This issue may be related to KAFKA-618
* To reproduce the issue:
1. Download the latest 0.8 branch and apply the attached patch
2. In kafka_home, execute ./sbt update package
3. In kafka_home/system_test, execute python -B system_test_runner.py and it will execute testcase_9051
* The validation output would be as follows:

    validation_status :
        Unique messages from consumer on [t001] : 0
        Unique messages from consumer on [t002] : 0
        . . .
        Unique messages from consumer on [t019] : 0
        Unique messages from consumer on [t020] : 0
        Unique messages from producer on [t001] : 1000
        Unique messages from producer on [t002] : 1000
        . . .
        Unique messages from producer on [t018] : 1000
        Unique messages from producer on [t019] : 1000
        Unique messages from producer on [t020] : 1000
        Validate for data matched on topic [t001] : FAILED
        Validate for data matched on topic [t002] : FAILED
        . . .
        Validate for data matched on topic [t019] : FAILED
        Validate for data matched on topic [t020] : FAILED
        Validate for merged log segment checksum in cluster [source] : PASSED

* However, it will work fine if there are only 10 topics. In system_test/replication_testsuite/testcase_9051/testcase_9051_properties.json, update the following line to 10 topics:

    topic: t001,t002,t003,t004,t005,t006,t007,t008,t009,t010,t011,t012,t013,t014,t015,t016,t017,t018,t019,t020,
[jira] [Closed] (KAFKA-619) Regression : System Test 900x (Migration Tool) - java.lang.ClassCastException: kafka.message.Message cannot be cast to [B
[ https://issues.apache.org/jira/browse/KAFKA-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung closed KAFKA-619.
---------------------------

Regression : System Test 900x (Migration Tool) - java.lang.ClassCastException: kafka.message.Message cannot be cast to [B
-------------------------------------------------------------------------------------------------------------------------

                Key: KAFKA-619
                URL: https://issues.apache.org/jira/browse/KAFKA-619
            Project: Kafka
         Issue Type: Bug
           Reporter: John Fung

This error is happening in testcase_900x (migration tool test group). The issue is that no data is received by ConsoleConsumer. All Migration Tool log4j messages are showing the following error:

    . . .
    [2012-11-16 08:28:55,361] INFO FetchRunnable-1 start fetching topic: test_1 part: 3 offset: 0 from 127.0.0.1:9093 (kafka.consumer.FetcherRunnable)
    [2012-11-16 08:28:55,361] INFO FetchRunnable-0 start fetching topic: test_1 part: 3 offset: 0 from 127.0.0.1:9092 (kafka.consumer.FetcherRunnable)
    [2012-11-16 08:28:55,361] INFO FetchRunnable-2 start fetching topic: test_1 part: 0 offset: 0 from 127.0.0.1:9091 (kafka.consumer.FetcherRunnable)
    Migration thread failure due to java.lang.ClassCastException: kafka.message.Message cannot be cast to [B
    java.lang.ClassCastException: kafka.message.Message cannot be cast to [B
        at kafka.serializer.DefaultEncoder.toBytes(Encoder.scala:36)
        at kafka.producer.async.DefaultEventHandler$$anonfun$serialize$1.apply(DefaultEventHandler.scala:111)
        at kafka.producer.async.DefaultEventHandler$$anonfun$serialize$1.apply(DefaultEventHandler.scala:106)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
        at scala.collection.mutable.WrappedArray.map(WrappedArray.scala:32)
        at kafka.producer.async.DefaultEventHandler.serialize(DefaultEventHandler.scala:106)
        at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:47)
        at kafka.producer.Producer.send(Producer.scala:75)
        at kafka.javaapi.producer.Producer.send(Producer.scala:32)
        at kafka.tools.KafkaMigrationTool$MigrationThread.run(KafkaMigrationTool.java:287)
    [2012-11-16 08:30:04,854] INFO test-consumer-group_jfung-ld-1353083318174-2b62271b begin rebalancing consumer test-consumer-group_jfung-ld-1353083318174-2b62271b try #0 (kafka.consumer.ZookeeperConsumerConnector)
    [2012-11-16 08:30:04,858] INFO Constructing topic count for test-consumer-group_jfung-ld-1353083318174-2b62271b from *2*.* using \*(\p{Digit}+)\*(.*) as pattern. (kafka.consumer.TopicCount$)
    . . .
[jira] [Resolved] (KAFKA-624) Add 07 ConsoleConsumer to validate message content for Migration Tool testcases
[ https://issues.apache.org/jira/browse/KAFKA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung resolved KAFKA-624.
-----------------------------

    Resolution: Fixed

This is handled in KAFKA-882 and KAFKA-883.

Add 07 ConsoleConsumer to validate message content for Migration Tool testcases
-------------------------------------------------------------------------------

                Key: KAFKA-624
                URL: https://issues.apache.org/jira/browse/KAFKA-624
            Project: Kafka
         Issue Type: Task
           Reporter: John Fung
           Assignee: John Fung
             Labels: replication-testing
[jira] [Resolved] (KAFKA-606) System Test Transient Failure (case 0302 GC Pause) - Log segments mismatched across replicas
[ https://issues.apache.org/jira/browse/KAFKA-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung resolved KAFKA-606.
-----------------------------

    Resolution: Fixed

Not an issue any more.

System Test Transient Failure (case 0302 GC Pause) - Log segments mismatched across replicas
--------------------------------------------------------------------------------------------

                Key: KAFKA-606
                URL: https://issues.apache.org/jira/browse/KAFKA-606
            Project: Kafka
         Issue Type: Bug
           Reporter: John Fung
        Attachments: testcase_0302_data_and_log4j.tar.gz
[jira] [Closed] (KAFKA-606) System Test Transient Failure (case 0302 GC Pause) - Log segments mismatched across replicas
[ https://issues.apache.org/jira/browse/KAFKA-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung closed KAFKA-606.
---------------------------

System Test Transient Failure (case 0302 GC Pause) - Log segments mismatched across replicas
--------------------------------------------------------------------------------------------

                Key: KAFKA-606
                URL: https://issues.apache.org/jira/browse/KAFKA-606
            Project: Kafka
         Issue Type: Bug
           Reporter: John Fung
        Attachments: testcase_0302_data_and_log4j.tar.gz
[jira] [Resolved] (KAFKA-607) System Test Transient Failure (case 4011 Log Retention) - ConsoleConsumer receives less data
[ https://issues.apache.org/jira/browse/KAFKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung resolved KAFKA-607.
-----------------------------

    Resolution: Fixed

Not an issue any more.

System Test Transient Failure (case 4011 Log Retention) - ConsoleConsumer receives less data
--------------------------------------------------------------------------------------------

                Key: KAFKA-607
                URL: https://issues.apache.org/jira/browse/KAFKA-607
            Project: Kafka
         Issue Type: Bug
           Reporter: John Fung
        Attachments: testcase_4011_data_and_log4j.tar.gz
[jira] [Closed] (KAFKA-607) System Test Transient Failure (case 4011 Log Retention) - ConsoleConsumer receives less data
[ https://issues.apache.org/jira/browse/KAFKA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung closed KAFKA-607.
---------------------------

System Test Transient Failure (case 4011 Log Retention) - ConsoleConsumer receives less data
--------------------------------------------------------------------------------------------

                Key: KAFKA-607
                URL: https://issues.apache.org/jira/browse/KAFKA-607
            Project: Kafka
         Issue Type: Bug
           Reporter: John Fung
        Attachments: testcase_4011_data_and_log4j.tar.gz
[jira] [Closed] (KAFKA-819) System Test : Add validation of log segment index to offset
[ https://issues.apache.org/jira/browse/KAFKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung closed KAFKA-819.
---------------------------

System Test : Add validation of log segment index to offset
-----------------------------------------------------------

                Key: KAFKA-819
                URL: https://issues.apache.org/jira/browse/KAFKA-819
            Project: Kafka
         Issue Type: Task
           Reporter: John Fung
           Assignee: John Fung
             Labels: kafka-0.8, replication-testing
            Fix For: 0.8
        Attachments: kafka-819-v1.patch

This can be achieved by calling DumpLogSegments.
[jira] [Closed] (KAFKA-792) Update multiple attributes in testcase_xxxx_properties.json
[ https://issues.apache.org/jira/browse/KAFKA-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung closed KAFKA-792.
---------------------------

Update multiple attributes in testcase_xxxx_properties.json
-----------------------------------------------------------

                Key: KAFKA-792
                URL: https://issues.apache.org/jira/browse/KAFKA-792
            Project: Kafka
         Issue Type: Task
           Reporter: John Fung
           Assignee: John Fung
             Labels: replication-testing
            Fix For: 0.8
        Attachments: kafka-792-v1.patch

The following are some of the properties that need to be updated in some testcase_xxxx_properties.json files. These changes have been patched in local Hudson for a while; this new JIRA was created to check in these changes.

    log.segment.bytes
    default.replication.factor
    num.partitions
[jira] [Closed] (KAFKA-716) SimpleConsumerPerformance does not consume all available messages
[ https://issues.apache.org/jira/browse/KAFKA-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Fung closed KAFKA-716. --- SimpleConsumerPerformance does not consume all available messages - Key: KAFKA-716 URL: https://issues.apache.org/jira/browse/KAFKA-716 Project: Kafka Issue Type: Bug Reporter: John Fung Priority: Blocker Labels: p2 To reproduce the issue: 1. Start 1 zookeeper 2. Start 1 broker 3. Send some messages 4. Start SimpleConsumerPerformance to consume messages. The only way to consume all messages is to set the fetch-size to be greater than the log segment file size. 5. This output shows that SimpleConsumerPerformance consumes only 6 messages: $ bin/kafka-run-class.sh kafka.perf.SimpleConsumerPerformance --server kafka://host1:9092 --topic topic_001 --fetch-size 2048 --partition 0 start.time, end.time, fetch.size, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec 2013-01-21 15:09:21:124, 2013-01-21 15:09:21:165, 2048, 0.0059, 0.1429, 6, 146.3415 6. This output shows that ConsoleConsumer consumes all 5500 messages (same test as the above) $ bin/kafka-run-class.sh kafka.consumer.ConsoleConsumer --zookeeper host2:2181 --topic topic_001 --consumer-timeout-ms 5000 --formatter kafka.consumer.ChecksumMessageFormatter --from-beginning | grep ^checksum | wc -l Consumed 5500 messages 5500 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (KAFKA-716) SimpleConsumerPerformance does not consume all available messages
[ https://issues.apache.org/jira/browse/KAFKA-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Fung resolved KAFKA-716. - Resolution: Fixed Not an issue SimpleConsumerPerformance does not consume all available messages - Key: KAFKA-716 URL: https://issues.apache.org/jira/browse/KAFKA-716 Project: Kafka Issue Type: Bug Reporter: John Fung Priority: Blocker Labels: p2 To reproduce the issue: 1. Start 1 zookeeper 2. Start 1 broker 3. Send some messages 4. Start SimpleConsumerPerformance to consume messages. The only way to consume all messages is to set the fetch-size to be greater than the log segment file size. 5. This output shows that SimpleConsumerPerformance consumes only 6 messages: $ bin/kafka-run-class.sh kafka.perf.SimpleConsumerPerformance --server kafka://host1:9092 --topic topic_001 --fetch-size 2048 --partition 0 start.time, end.time, fetch.size, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec 2013-01-21 15:09:21:124, 2013-01-21 15:09:21:165, 2048, 0.0059, 0.1429, 6, 146.3415 6. This output shows that ConsoleConsumer consumes all 5500 messages (same test as the above) $ bin/kafka-run-class.sh kafka.consumer.ConsoleConsumer --zookeeper host2:2181 --topic topic_001 --consumer-timeout-ms 5000 --formatter kafka.consumer.ChecksumMessageFormatter --from-beginning | grep ^checksum | wc -l Consumed 5500 messages 5500 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
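One plausible reading of the "not an issue" resolution (an interpretation, not stated explicitly in the thread) is that a single fetch request returns at most fetch-size bytes, so draining a topic requires looping and advancing the offset rather than raising fetch-size past the segment size. A toy sketch, with message size and counts made up to mirror the numbers in the report:

```java
public class BoundedFetchExample {
    static final int MESSAGE_BYTES = 300;     // assumed per-message size, chosen for illustration
    static final int TOTAL_MESSAGES = 5500;   // matches the message count in the report

    // One fetch returns at most maxBytes worth of messages starting at offset,
    // mimicking how a Kafka fetch request is bounded by fetch-size.
    static int fetchCount(int offset, int maxBytes) {
        int n = Math.min(maxBytes / MESSAGE_BYTES, TOTAL_MESSAGES - offset);
        return Math.max(n, 0);
    }

    // Draining the log means advancing the offset and fetching again.
    static int consumeAll(int maxBytes) {
        int offset = 0;
        int batch;
        while ((batch = fetchCount(offset, maxBytes)) > 0) offset += batch;
        return offset;
    }

    public static void main(String[] args) {
        System.out.println(fetchCount(0, 2048)); // a single 2 KB fetch returns only 6 messages
        System.out.println(consumeAll(2048));    // looping drains all 5500
    }
}
```

Under these assumptions a single 2048-byte fetch yields 6 messages, the same count the report observed, while a fetch loop consumes everything.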
[jira] [Commented] (KAFKA-961) state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0)
[ https://issues.apache.org/jira/browse/KAFKA-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699489#comment-13699489 ] Garrett Barton commented on KAFKA-961: -- I found out what is going on. Since I'm doing this in Java (excuse my lack of Scala knowledge here) I had to give the KafkaServer constructor a value for the Time object. Since SystemTime apparently does not convert to the Time interface/trait in Java, I left it as null, thinking that the Scala default in the constructor: time: Time = SystemTime would init one for me. Well, it doesn't. I made an inner class with SystemTime's functionality that implements Time and passed that into the constructor, and now I can get past my NPE. Is this expected behavior? state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0) - Key: KAFKA-961 URL: https://issues.apache.org/jira/browse/KAFKA-961 Project: Kafka Issue Type: Bug Affects Versions: 0.8 Environment: Linux gman-minty 3.8.0-19-generic #29-Ubuntu SMP Wed Apr 17 18:16:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux Reporter: Garrett Barton Been having issues embedding 0.8 servers into some Yarn stuff I'm doing. I just pulled the latest from git, did a ./sbt +package, followed by ./sbt assembly-package-dependency. And pushed core/target/scala-2.8.0/kafka_2.8.0-0.8.0-beta1.jar into my local mvn repo. Here is sample code ripped out to little classes to show my error: Starting up a broker embedded in java, with the following code: ...
Properties props = new Properties();
// don't set hostname so it binds to all interfaces
// props.setProperty("hostname", hostName);
props.setProperty("port", );
props.setProperty("broker.id", "1");
props.setProperty("log.dir", "/tmp/embeddedkafka/" + randId); // TODO: hardcode bad
props.setProperty("zookeeper.connect", "localhost:2181/" + randId);
KafkaConfig kconf = new KafkaConfig(props);
server = new KafkaServer(kconf, null);
server.startup();
LOG.info("Broker online");
Sample Producer has the following code: ...
Properties props = new Properties();
props.put("metadata.broker.list", "gman-minty:");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("partitioner.class", "com.gman.broker.SimplePartitioner");
props.put("request.required.acks", "1");
ProducerConfig config = new ProducerConfig(props);
Producer<String, String> producer = new Producer<String, String>(config);
LOG.info("producer created");
KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", "key1", "value1");
producer.send(data);
LOG.info("wrote message: " + data);
And here is the server log:
INFO 2013-07-03 13:47:30,538 [Thread-0] kafka.utils.VerifiableProperties: Verifying properties
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property port is overridden to
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property broker.id is overridden to 1
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property zookeeper.connect is overridden to localhost:2181/kafkatest
INFO 2013-07-03 13:47:30,569 [Thread-0] kafka.utils.VerifiableProperties: Property log.dir is overridden to \tmp\embeddedkafka\1372873650268
INFO 2013-07-03 13:47:30,574 [Thread-0] kafka.server.KafkaServer: [Kafka Server 1], Starting
INFO 2013-07-03 13:47:30,609 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Log directory '/home/gman/workspace/distributed_parser/\tmp\embeddedkafka\1372873650268' not found, creating it.
INFO 2013-07-03 13:47:30,619 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log cleaner every 60 ms
INFO 2013-07-03 13:47:30,630 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log flusher every 3000 ms with the following overrides Map()
INFO 2013-07-03 13:47:30,687 [Thread-0] kafka.network.Acceptor: Awaiting socket connections on 0.0.0.0:.
INFO 2013-07-03 13:47:30,688 [Thread-0] kafka.network.SocketServer: [Socket Server on Broker 1], Started
INFO 2013-07-03 13:47:30,696 [Thread-0] kafka.server.KafkaZooKeeper:
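Garrett's workaround can be sketched as follows. The `Time` interface below is a hypothetical stand-in for Kafka's `kafka.utils.Time` trait (the method names are assumptions based on the 0.8 source, and the Kafka classes themselves are not imported here); the point is that a plain Java class can supply SystemTime's behavior and be passed to the KafkaServer constructor instead of null:

```java
// Hypothetical stand-in for kafka.utils.Time; the real trait lives in the
// Kafka jar and its exact method set is assumed here, not verified.
interface Time {
    long milliseconds();
    long nanoseconds();
    void sleep(long ms);
}

public class EmbeddedTimeExample {
    // The workaround from the thread: a Java class mirroring SystemTime,
    // suitable for passing as the constructor's Time argument instead of
    // null (which otherwise triggers the NPE on startup).
    static class JavaSystemTime implements Time {
        public long milliseconds() { return System.currentTimeMillis(); }
        public long nanoseconds()  { return System.nanoTime(); }
        public void sleep(long ms) {
            try { Thread.sleep(ms); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        Time time = new JavaSystemTime();
        // e.g. new KafkaServer(kconf, time) rather than new KafkaServer(kconf, null)
        System.out.println(time.milliseconds() > 0); // prints "true"
    }
}
```

Scala default arguments such as `time: Time = SystemTime` only apply when the argument is omitted at the call site; explicitly passing null bypasses the default, which is why the workaround is needed from Java.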
[jira] [Resolved] (KAFKA-729) Gzip compression codec complains about missing SnappyInputStream
[ https://issues.apache.org/jira/browse/KAFKA-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Fung resolved KAFKA-729. - Resolution: Fixed Not an issue any more. Gzip compression codec complains about missing SnappyInputStream Key: KAFKA-729 URL: https://issues.apache.org/jira/browse/KAFKA-729 Project: Kafka Issue Type: Bug Reporter: John Fung Priority: Critical $ bin/kafka-run-class.sh kafka.perf.ProducerPerformance --broker-list localhost:9092 --topic test_1 --messages 10 --batch-size 1 --compression-codec 1
java.lang.NoClassDefFoundError: org/xerial/snappy/SnappyInputStream
at kafka.message.ByteBufferMessageSet$.kafka$message$ByteBufferMessageSet$$create(ByteBufferMessageSet.scala:41)
at kafka.message.ByteBufferMessageSet.<init>(ByteBufferMessageSet.scala:98)
at kafka.producer.async.DefaultEventHandler$$anonfun$4.apply(DefaultEventHandler.scala:291)
at kafka.producer.async.DefaultEventHandler$$anonfun$4.apply(DefaultEventHandler.scala:279)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:161)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:194)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:80)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.mutable.HashMap.map(HashMap.scala:39)
at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$groupMessagesToSet(DefaultEventHandler.scala:279)
at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$1.apply(DefaultEventHandler.scala:102)
at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$1.apply(DefaultEventHandler.scala:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:161)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:194)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:80)
at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:98)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:72)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:104)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:87)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:67)
at scala.collection.immutable.Stream.foreach(Stream.scala:254)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:66)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:44)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-911) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request
[ https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699500#comment-13699500 ] Sriram Subramanian commented on KAFKA-911: -- This has been fixed. Bug in controlled shutdown logic in controller leads to controller not sending out some state change request - Key: KAFKA-911 URL: https://issues.apache.org/jira/browse/KAFKA-911 Project: Kafka Issue Type: Bug Components: controller Affects Versions: 0.8 Reporter: Neha Narkhede Assignee: Neha Narkhede Priority: Blocker Labels: kafka-0.8, p1 Attachments: kafka-911-v1.patch, kafka-911-v2.patch The controlled shutdown logic in the controller first tries to move the leaders from the broker being shutdown. Then it tries to remove the broker from the isr list. During that operation, it does not synchronize on the controllerLock. This causes a race condition while dispatching data using the controller's channel manager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-911) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request
[ https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-911: - Resolution: Fixed Status: Resolved (was: Patch Available) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request - Key: KAFKA-911 URL: https://issues.apache.org/jira/browse/KAFKA-911 Project: Kafka Issue Type: Bug Components: controller Affects Versions: 0.8 Reporter: Neha Narkhede Assignee: Neha Narkhede Priority: Blocker Labels: kafka-0.8, p1 Attachments: kafka-911-v1.patch, kafka-911-v2.patch The controlled shutdown logic in the controller first tries to move the leaders from the broker being shutdown. Then it tries to remove the broker from the isr list. During that operation, it does not synchronize on the controllerLock. This causes a race condition while dispatching data using the controller's channel manager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-184) Log retention size and file size should be a long
[ https://issues.apache.org/jira/browse/KAFKA-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699510#comment-13699510 ] Sriram Subramanian commented on KAFKA-184: -- We have already fixed all the config naming and types in 0.8. We can resolve this. Log retention size and file size should be a long - Key: KAFKA-184 URL: https://issues.apache.org/jira/browse/KAFKA-184 Project: Kafka Issue Type: Bug Affects Versions: 0.7 Reporter: Joel Koshy Priority: Minor Fix For: 0.8.1 Attachments: KAFKA-184-0.8.patch Realized this in a local set up: the log.retention.size config option should be a long, or we're limited to 2GB. Also, the name can be improved to log.retention.size.bytes or Mbytes as appropriate. Same comments for log.file.size. If we rename the configs, it would be better to resolve KAFKA-181 first. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699545#comment-13699545 ] Joel Koshy commented on KAFKA-960: -- Not really. However, my point is that given that going both directions (upgrade and downgrade) are a bit painful due to the API and mbean changes we should let 3.x prove itself to be stable in other contexts for a period of time before we switch to it. Upgrade Metrics to 3.x -- Key: KAFKA-960 URL: https://issues.apache.org/jira/browse/KAFKA-960 Project: Kafka Issue Type: Improvement Affects Versions: 0.8 Reporter: Cosmin Lehene Fix For: 0.8 Now that metrics 3.0 has been released (http://metrics.codahale.com/about/release-notes/) we can upgrade back -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-961) state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0)
[ https://issues.apache.org/jira/browse/KAFKA-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699567#comment-13699567 ] Joel Koshy commented on KAFKA-961: -- Passing in null for time would definitely lead to that NPE as you found. I think we only needed a time interface to support a mocktime for tests. Also, we probably didn't anticipate that KafkaServer's would need to be embedded in Java code. If you are okay with your work-around, then great. Another (ugly) way to do it would be to pass in a dynamically instantiated SystemTime - so something like (Time) Class.forName(SystemTime.class.getName()).newInstance() - not sure if it will work though. We can also provide an explicit constructor without the time argument and get rid of the scala default arg. state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0) - Key: KAFKA-961 URL: https://issues.apache.org/jira/browse/KAFKA-961 Project: Kafka Issue Type: Bug Affects Versions: 0.8 Environment: Linux gman-minty 3.8.0-19-generic #29-Ubuntu SMP Wed Apr 17 18:16:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux Reporter: Garrett Barton Been having issues embedding 0.8 servers into some Yarn stuff I'm doing. I just pulled the latest from git, did a ./sbt +package, followed by ./sbt assembly-package-dependency. And pushed core/target/scala-2.8.0/kafka_2.8.0-0.8.0-beta1.jar into my local mvn repo. Here is sample code ripped out to little classes to show my error: Starting up a broker embedded in java, with the following code: ... 
Properties props = new Properties();
// don't set hostname so it binds to all interfaces
// props.setProperty("hostname", hostName);
props.setProperty("port", );
props.setProperty("broker.id", "1");
props.setProperty("log.dir", "/tmp/embeddedkafka/" + randId); // TODO: hardcode bad
props.setProperty("zookeeper.connect", "localhost:2181/" + randId);
KafkaConfig kconf = new KafkaConfig(props);
server = new KafkaServer(kconf, null);
server.startup();
LOG.info("Broker online");
Sample Producer has the following code: ...
Properties props = new Properties();
props.put("metadata.broker.list", "gman-minty:");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("partitioner.class", "com.gman.broker.SimplePartitioner");
props.put("request.required.acks", "1");
ProducerConfig config = new ProducerConfig(props);
Producer<String, String> producer = new Producer<String, String>(config);
LOG.info("producer created");
KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", "key1", "value1");
producer.send(data);
LOG.info("wrote message: " + data);
And here is the server log:
INFO 2013-07-03 13:47:30,538 [Thread-0] kafka.utils.VerifiableProperties: Verifying properties
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property port is overridden to
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property broker.id is overridden to 1
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property zookeeper.connect is overridden to localhost:2181/kafkatest
INFO 2013-07-03 13:47:30,569 [Thread-0] kafka.utils.VerifiableProperties: Property log.dir is overridden to \tmp\embeddedkafka\1372873650268
INFO 2013-07-03 13:47:30,574 [Thread-0] kafka.server.KafkaServer: [Kafka Server 1], Starting
INFO 2013-07-03 13:47:30,609 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Log directory '/home/gman/workspace/distributed_parser/\tmp\embeddedkafka\1372873650268' not found, creating it.
INFO 2013-07-03 13:47:30,619 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log cleaner every 60 ms
INFO 2013-07-03 13:47:30,630 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log flusher every 3000 ms with the following overrides Map()
INFO 2013-07-03 13:47:30,687 [Thread-0] kafka.network.Acceptor: Awaiting socket connections on 0.0.0.0:.
INFO 2013-07-03 13:47:30,688 [Thread-0] kafka.network.SocketServer: [Socket Server on Broker 1], Started
INFO 2013-07-03 13:47:30,696 [Thread-0]
[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699608#comment-13699608 ] Jay Kreps commented on KAFKA-960: - Word. I actually think given our experience we might be better off in the long term if we remove metrics entirely from the client and just expose some simple counters. Users can wrap those in the monitoring library of their choice. Otherwise we are incompatible with most people Upgrade Metrics to 3.x -- Key: KAFKA-960 URL: https://issues.apache.org/jira/browse/KAFKA-960 Project: Kafka Issue Type: Improvement Affects Versions: 0.8 Reporter: Cosmin Lehene Fix For: 0.8 Now that metrics 3.0 has been released (http://metrics.codahale.com/about/release-notes/) we can upgrade back -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (KAFKA-962) Add list topics to ClientUtils
Jakob Homan created KAFKA-962: - Summary: Add list topics to ClientUtils Key: KAFKA-962 URL: https://issues.apache.org/jira/browse/KAFKA-962 Project: Kafka Issue Type: Improvement Affects Versions: 0.8 Reporter: Jakob Homan Assignee: Jakob Homan Currently there is no programmatic way to get a list of topics supported directly by Kafka (one can talk to ZooKeeper directly). There is a CLI tool for this ListTopicCommand, but it'd be good to provide this directly to clients as an API. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-915) System Test - Mirror Maker testcase_5001 failed
[ https://issues.apache.org/jira/browse/KAFKA-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699641#comment-13699641 ] Joel Koshy commented on KAFKA-915: -- This failure occurs because the leaderAndIsr request does not reach the brokers until after the mirror maker's rebalance completes. This is related to the issue reported in KAFKA-956. Previously (before we started caching metadata at the brokers) the partition information was retrieved directly from zk. The fix for now would be to use the create-topic admin tool before starting the mirror maker (or move the producer performance startup to well before the mirror maker startup). System Test - Mirror Maker testcase_5001 failed --- Key: KAFKA-915 URL: https://issues.apache.org/jira/browse/KAFKA-915 Project: Kafka Issue Type: Bug Reporter: John Fung Assignee: Joel Koshy Priority: Critical Labels: kafka-0.8, replication-testing Attachments: testcase_5001_debug_logs.tar.gz This case passes if brokers are set to partition = 1, replicas = 1 It fails if brokers are set to partition = 5, replicas = 3 (consistently reproducible) This test case is set up as shown below. 1. Start 2 ZK as a cluster in Source 2. Start 2 ZK as a cluster in Target 3. Start 3 brokers as a cluster in Source (partition = 1, replicas = 1) 4. Start 3 brokers as a cluster in Target (partition = 1, replicas = 1) 5. Start 1 MM 6. Start ProducerPerformance to send some data 7. After Producer is done, start ConsoleConsumer to consume data 8. Stop all processes and validate if there is any data loss. 9. No failure is introduced to any process in this test Attached a tar file which contains the logs and system test output for both cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-559) Garbage collect old consumer metadata entries
[ https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated KAFKA-559: -- Attachment: KAFKA-559.v1.patch Attached KAFKA-559.v1.patch above. - It has been developed to do cleanup with group-id as input from the user instead of topic. - An additional dry-run feature is provided so that people can see which znodes would get deleted w/o actually deleting them. Garbage collect old consumer metadata entries - Key: KAFKA-559 URL: https://issues.apache.org/jira/browse/KAFKA-559 Project: Kafka Issue Type: New Feature Reporter: Jay Kreps Assignee: Tejas Patil Labels: project Attachments: KAFKA-559.v1.patch Many use cases involve transient consumers. These consumers create entries under their consumer group in zk and maintain offsets there as well. There is currently no way to delete these entries. It would be good to have a tool that did something like bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] --zookeeper [zk_connect] This would scan through consumer group entries and delete any that had no offset update since the given date. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
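The selection logic behind such a cleanup tool could look like the sketch below. This is a hypothetical illustration only: the map of last-update timestamps stands in for data that the real patch would read from ZooKeeper, and the class and method names are invented, not taken from KAFKA-559.v1.patch.

```java
import java.util.*;

public class ObsoleteGroupFinder {
    // Given each consumer group's most recent offset-update time (which the
    // real tool would read from zookeeper), return the groups with no update
    // since the cutoff. With dry-run, the caller would only print these;
    // otherwise it would delete the corresponding znodes.
    public static List<String> obsoleteGroups(Map<String, Long> lastUpdateMs, long cutoffMs) {
        List<String> stale = new ArrayList<String>();
        for (Map.Entry<String, Long> e : lastUpdateMs.entrySet()) {
            if (e.getValue() < cutoffMs) stale.add(e.getKey());
        }
        Collections.sort(stale); // deterministic output for reporting
        return stale;
    }

    public static void main(String[] args) {
        Map<String, Long> groups = new HashMap<String, Long>();
        groups.put("etl-group", 1000L);   // no offset update since the cutoff
        groups.put("live-group", 5000L);  // recently updated
        System.out.println(obsoleteGroups(groups, 2000L)); // prints "[etl-group]"
    }
}
```

The dry-run mode mentioned in the comment then reduces to running this selection and printing the result without issuing any deletes.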
[jira] [Commented] (KAFKA-961) state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0)
[ https://issues.apache.org/jira/browse/KAFKA-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699741#comment-13699741 ] Garrett Barton commented on KAFKA-961: -- Thanks for the reply! I personally lean towards the cleaner constructor, as I think that logic should be defaulted in the class instead of my having to pass one in every time I use it. Having said that, I am probably doing odd things with the embedding of Kafka, so I understand if you think differently. state.change.logger: Error on broker 1 while processing LeaderAndIsr request correlationId 6 received from controller 1 epoch 1 for partition (page_visits,0) - Key: KAFKA-961 URL: https://issues.apache.org/jira/browse/KAFKA-961 Project: Kafka Issue Type: Bug Affects Versions: 0.8 Environment: Linux gman-minty 3.8.0-19-generic #29-Ubuntu SMP Wed Apr 17 18:16:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux Reporter: Garrett Barton Been having issues embedding 0.8 servers into some Yarn stuff I'm doing. I just pulled the latest from git, did a ./sbt +package, followed by ./sbt assembly-package-dependency. And pushed core/target/scala-2.8.0/kafka_2.8.0-0.8.0-beta1.jar into my local mvn repo. Here is sample code ripped out to little classes to show my error: Starting up a broker embedded in java, with the following code: ...
Properties props = new Properties();
// don't set hostname so it binds to all interfaces
// props.setProperty("hostname", hostName);
props.setProperty("port", );
props.setProperty("broker.id", "1");
props.setProperty("log.dir", "/tmp/embeddedkafka/" + randId); // TODO: hardcode bad
props.setProperty("zookeeper.connect", "localhost:2181/" + randId);
KafkaConfig kconf = new KafkaConfig(props);
server = new KafkaServer(kconf, null);
server.startup();
LOG.info("Broker online");
Sample Producer has the following code: ...
Properties props = new Properties();
props.put("metadata.broker.list", "gman-minty:");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("partitioner.class", "com.gman.broker.SimplePartitioner");
props.put("request.required.acks", "1");
ProducerConfig config = new ProducerConfig(props);
Producer<String, String> producer = new Producer<String, String>(config);
LOG.info("producer created");
KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", "key1", "value1");
producer.send(data);
LOG.info("wrote message: " + data);
And here is the server log:
INFO 2013-07-03 13:47:30,538 [Thread-0] kafka.utils.VerifiableProperties: Verifying properties
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property port is overridden to
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property broker.id is overridden to 1
INFO 2013-07-03 13:47:30,568 [Thread-0] kafka.utils.VerifiableProperties: Property zookeeper.connect is overridden to localhost:2181/kafkatest
INFO 2013-07-03 13:47:30,569 [Thread-0] kafka.utils.VerifiableProperties: Property log.dir is overridden to \tmp\embeddedkafka\1372873650268
INFO 2013-07-03 13:47:30,574 [Thread-0] kafka.server.KafkaServer: [Kafka Server 1], Starting
INFO 2013-07-03 13:47:30,609 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Log directory '/home/gman/workspace/distributed_parser/\tmp\embeddedkafka\1372873650268' not found, creating it.
INFO 2013-07-03 13:47:30,619 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log cleaner every 60 ms
INFO 2013-07-03 13:47:30,630 [Thread-0] kafka.log.LogManager: [Log Manager on Broker 1] Starting log flusher every 3000 ms with the following overrides Map()
INFO 2013-07-03 13:47:30,687 [Thread-0] kafka.network.Acceptor: Awaiting socket connections on 0.0.0.0:.
INFO 2013-07-03 13:47:30,688 [Thread-0] kafka.network.SocketServer: [Socket Server on Broker 1], Started
INFO 2013-07-03 13:47:30,696 [Thread-0] kafka.server.KafkaZooKeeper: connecting to ZK: localhost:2181/kafkatest
INFO 2013-07-03 13:47:30,707 [ZkClient-EventThread-17-localhost:2181/kafkatest] org.I0Itec.zkclient.ZkEventThread: Starting ZkClient event thread.
INFO 2013-07-03 13:47:30,716 [Thread-0] org.apache.zookeeper.ZooKeeper:
Re: Kafka/Hadoop consumers and producers
IMHO, I think Camus should probably be decoupled from Avro before the simpler contribs are deleted. We don't actually use the contribs, so I'm not saying this for our sake, but it seems like the right thing to do to provide simple examples for this type of stuff, no...? -- Felix On Wed, Jul 3, 2013 at 4:56 AM, Cosmin Lehene cleh...@adobe.com wrote: If the Hadoop consumer/producers use-case will remain relevant for Kafka (I assume it will), it would make sense to have the core components (kafka input/output format at least) as part of Kafka so that it could be built, tested and versioned together to maintain compatibility. This would also make it easier to build custom MR jobs on top of Kafka, rather than having to decouple stuff from Camus. It would also be less confusing for users, at least when starting to use Kafka. Camus could use those instead of providing its own. This being said we did some work on the consumer side (0.8 and the new(er) MR API). We could probably try to rewrite them to use Camus or fix Camus or whatever, but please consider this alternative as well. Thanks, Cosmin On 7/3/13 11:06 AM, Sam Meder sam.me...@jivesoftware.com wrote: I think it makes sense to kill the hadoop consumer/producer code in Kafka, given, as you said, Camus and the simplicity of the Hadoop producer. /Sam On Jul 2, 2013, at 5:01 PM, Jay Kreps jay.kr...@gmail.com wrote: We currently have a contrib package for consuming and producing messages from mapreduce ( https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tree;f=contrib;h=e53e1fb34893e733b10ff27e79e6a1dcbb8d7ab0;hb=HEAD ). We keep running into problems (e.g. KAFKA-946) that are basically due to the fact that the Kafka committers don't seem to mostly be Hadoop developers and aren't doing a good job of maintaining this code (keeping it tested, improving it, documenting it, writing tutorials, getting it moved over to the more modern apis, getting it working with newer Hadoop versions, etc). A couple of options: 1.
We could try to get someone in the Kafka community (either a current committer or not) who would adopt this as their baby (it's not much code). 2. We could just let Camus take over this functionality. They already have a more sophisticated consumer and the producer is pretty minimal. So are there any people who would like to adopt the current Hadoop contrib code? Conversely would it be possible to provide the same or similar functionality in Camus and just delete these? -Jay -- You received this message because you are subscribed to the Google Groups Camus - Kafka ETL for Hadoop group. To unsubscribe from this group and stop receiving emails from it, send an email to camus_etl+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: Helping out
Yes. We will need to think a bit more for the more general case when there are messages for multiple topics. Thanks, Jun On Wed, Jul 3, 2013 at 11:18 AM, S Ahmed sahmed1...@gmail.com wrote: https://github.com/apache/kafka/blob/0.8/core/src/main/scala/kafka/producer/async/DefaultEventHandler.scala#L152 On Wed, Jul 3, 2013 at 2:18 PM, S Ahmed sahmed1...@gmail.com wrote: Jun, Looks like the update would happen around here? On Wed, Jul 3, 2013 at 12:53 PM, Jun Rao jun...@gmail.com wrote: Sam, Interested in taking a look at KAFKA-959https://issues.apache.org/jira/browse/KAFKA-959 ? Thanks, Jun On Sun, Jun 23, 2013 at 11:01 PM, Sam Meder sam.me...@jivesoftware.com wrote: Hey, I now have roughly a day a week I can dedicate to working on Kafka, so I am looking for issues in the 0.8.1 batch that you think might be good starting points. Input would be much appreciated. Speaking of issues, I think it would be good to either fix https://issues.apache.org/jira/browse/KAFKA-946 for 0.8 or just drop the code from the release. /Sam