compatibility: 0.8.1.1 broker, 0.8.2.2 producer
Hi All, Does anyone have experience with, or encountered any issues, using a 0.8.2.2 producer against a 0.8.1.1 broker (specifically kafka_2.9.2-0.8.1.1)? I want to upgrade my existing producer (0.8.2-beta). Also, is there a functional difference between the Scala versions (2.9.2, 2.10, 2.11)? Thanks, Shlomi
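For context on the Scala-version question: the broker and the old Scala clients live in a jar with a Scala build suffix, while the new Java producer lives in a separate artifact with no suffix. A build sketch — coordinates assumed from the standard Maven naming for these releases:

```xml
<!-- broker / old Scala clients: pick whichever Scala build suffix suits
     your app; there is no functional difference between them -->
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.10</artifactId>
  <version>0.8.1.1</version>
</dependency>
<!-- the new Java producer ships in kafka-clients, with no Scala suffix -->
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-clients</artifactId>
  <version>0.8.2.2</version>
</dependency>
```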
Re: compatibility: 0.8.1.1 broker, 0.8.2.2 producer
Thank you both Ewen & Andrey! The general rule of upgrading brokers is clear, but it was important for me to hear what other people experienced. Ewen, I assume the broker upgrade from 0.8.1.1 to 0.8.2.2 is as safe as it could be, right? Like I can just take down a single broker, replace jars, and kick it up again seamlessly. If so I will probably give it a go unless another better version is coming. 10x, Shlomi On Thu, Dec 24, 2015 at 3:12 AM, Andrey Yegorov <andrey.yego...@gmail.com> wrote: > I am using 0.8.2.2 producer with 0.8.1.1 brokers without problems. > Version of scala matters if you are building with scala or some other > components that use scala. > Hope this helps. > > -- > Andrey Yegorov > > On Wed, Dec 23, 2015 at 1:11 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > > Shlomi, > > > > You should always upgrade brokers before clients. Newer versions of > clients > > aren't guaranteed to work with older versions of brokers. > > > > For scala versions, there is no functional difference. Generally you only > > need to worry about the Scala version if you are using the old clients > > (which are in the core jar) and the rest of your app requires a specific > > Scala version. > > > > -Ewen > > > > On Wed, Dec 23, 2015 at 6:31 AM, Shlomi Hazan <shl...@viber.com> wrote: > > > > > Hi All, > > > > > > Does someone has experience / encountered any issues using a 0.8.2.2 > > > producer against a 0.8.1.1 broker (specifically kafka_2.9.2-0.8.1.1)? > > > I want to upgrade my existing producer (0.8.2-beta). > > > Also, is there a functional difference between the scala versions > > > (2.9.2,2.10,2.11)? > > > > > > Thanks, > > > Shlomi > > > > > > > > > > > -- > > Thanks, > > Ewen > > >
Re: latency - how to reduce?
I would like to test locally first as it is easier than setting up a test cluster to model production, yet the kafka-producer-perf-test script is not available for Windows. Jun, what kind of basic I/O testing on the local FS did you have in mind? Thanks, Shlomi On Tue, Jan 6, 2015 at 5:40 PM, Jayesh Thakrar j_thak...@yahoo.com.invalid wrote: Have you tried using the built-in stress test scripts? bin/kafka-producer-perf-test.sh bin/kafka-consumer-perf-test.sh Here's how I stress tested them - nohup ${KAFKA_HOME}/bin/kafka-producer-perf-test.sh --broker-list ${KAFKA_SERVERS} --topic ${TOPIC_NAME} --new-producer --threads 16 --messages 1 1>kafka-producer-perf-test.sh.log 2>&1 nohup ${KAFKA_HOME}/bin/kafka-consumer-perf-test.sh --zookeeper ${ZOOKEEPER_QUORUM} --topic ${TOPIC_NAME} --threads 16 1>kafka-consumer-perf-test.sh.log 2>&1 And I used screen scraping of the JMX UI screens to push metrics into TSDB to get the following. The rate below is per second - so I could push the Kafka cluster to 140k+ messages/sec on a 4-node cluster with very little utilization (<30% utilization). From: Shlomi Hazan shl...@viber.com To: users@kafka.apache.org Sent: Tuesday, January 6, 2015 1:06 AM Subject: Re: latency - how to reduce? Will do. What did you have in mind? just write a big file to disk and measure the time it took to write? maybe also read back? using specific API's? Apart from the local Win machine case, are you aware of any issues with Amazon EC2 instances that may be causing that same latency in production? Thanks, Shlomi On Tue, Jan 6, 2015 at 4:04 AM, Jun Rao j...@confluent.io wrote: Not setting log.flush.interval.messages is good since the default gives the best latency. Could you do some basic I/O testing on the local FS in your windows machine to make sure the I/O latency is ok? Thanks, Jun On Thu, Jan 1, 2015 at 1:40 AM, Shlomi Hazan shl...@viber.com wrote: Happy new year! I did not set log.flush.interval.messages. I also could not find a default value in the docs.
Could you explain about that? Thanks, Shlomi On Thu, Jan 1, 2015 at 2:20 AM, Jun Rao j...@confluent.io wrote: What's your setting of log.flush.interval.messages on the broker? Thanks, Jun On Mon, Dec 29, 2014 at 3:26 AM, Shlomi Hazan shl...@viber.com wrote: Hi, I am using 0.8.1.1, and I have hundreds of msec latency at best and even seconds at worst. I have this latency both on production, (with peak load of 30K msg/sec, replication = 2 across 5 brokers, acks = 1), and on the local windows machine using just one process for each of producer, zookeeper, kafka, consumer. Also tried batch.num.messages=1 and producer.type=sync on the local machine but saw no improvement. How can I push latency down to several millis, at least when running local? Thanks, Shlomi
Re: latency - how to reduce?
Will do. What did you have in mind? just write a big file to disk and measure the time it took to write? maybe also read back? using specific API's? Apart from the local Win machine case, are you aware of any issues with Amazon EC2 instances that may be causing that same latency in production? Thanks, Shlomi On Tue, Jan 6, 2015 at 4:04 AM, Jun Rao j...@confluent.io wrote: Not setting log.flush.interval.messages is good since the default gives the best latency. Could you do some basic I/O testing on the local FS in your windows machine to make sure the I/O latency is ok? Thanks, Jun On Thu, Jan 1, 2015 at 1:40 AM, Shlomi Hazan shl...@viber.com wrote: Happy new year! I did not set log.flush.interval.messages. I also could not find a default value in the docs. Could you explain about that? Thanks, Shlomi On Thu, Jan 1, 2015 at 2:20 AM, Jun Rao j...@confluent.io wrote: What's your setting of log.flush.interval.messages on the broker? Thanks, Jun On Mon, Dec 29, 2014 at 3:26 AM, Shlomi Hazan shl...@viber.com wrote: Hi, I am using 0.8.1.1, and I have hundreds of msec latency at best and even seconds at worst. I have this latency both on production, (with peak load of 30K msg/sec, replication = 2 across 5 brokers, acks = 1), and on the local windows machine using just one process for each of producer, zookeeper, kafka, consumer. Also tried batch.num.messages=1 and producer.type=sync on the local machine but saw no improvement. How can I push latency down to several millis, at least when running local? Thanks, Shlomi
Re: latency - how to reduce?
Happy new year! I did not set log.flush.interval.messages. I also could not find a default value in the docs. Could you explain about that? Thanks, Shlomi On Thu, Jan 1, 2015 at 2:20 AM, Jun Rao j...@confluent.io wrote: What's your setting of log.flush.interval.messages on the broker? Thanks, Jun On Mon, Dec 29, 2014 at 3:26 AM, Shlomi Hazan shl...@viber.com wrote: Hi, I am using 0.8.1.1, and I have hundreds of msec latency at best and even seconds at worst. I have this latency both on production, (with peak load of 30K msg/sec, replication = 2 across 5 brokers, acks = 1), and on the local windows machine using just one process for each of producer, zookeeper, kafka, consumer. Also tried batch.num.messages=1 and producer.type=sync on the local machine but saw no improvement. How can I push latency down to several millis, at least when running local? Thanks, Shlomi
latency - how to reduce?
Hi, I am using 0.8.1.1, and I have hundreds of msec latency at best and even seconds at worst. I have this latency both on production, (with peak load of 30K msg/sec, replication = 2 across 5 brokers, acks = 1), and on the local windows machine using just one process for each of producer, zookeeper, kafka, consumer. Also tried batch.num.messages=1 and producer.type=sync on the local machine but saw no improvement. How can I push latency down to several millis, at least when running local? Thanks, Shlomi
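The knobs tried above can be gathered into one old-producer config sketch; the values are illustrative assumptions for a latency test, not tuned recommendations:

```properties
# old (0.8.1.x) Scala producer: send synchronously, one message per batch
producer.type=sync
batch.num.messages=1
# ack once the partition leader has the message
request.required.acks=1
# broker side: leave log.flush.interval.messages unset so flushing is
# deferred to the OS page cache, which gives the best latency
```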
Re: [DISCUSSION] adding the serializer api back to the new java producer
Jay, Jun, Thank you both for explaining. I understand this is important enough such that it must be done, and if so, the sooner the better. How will the change be released? A beta-2 or a release candidate? I think that if possible, it should not overrun the already released version. Thank you guys for the hard work. Shlomi On Tue, Nov 25, 2014 at 7:37 PM, Jun Rao jun...@gmail.com wrote: Bhavesh, This api change doesn't mean you need to change the format of the encoded data. It simply moves the serialization logic from the application to a pluggable serializer. As long as you preserve the serialization logic, the consumer should still see the same bytes. If you are talking about how to evolve the data schema over time, that's a separate story. Serialization libraries like Avro have better support for schema evolution. Thanks, Jun On Tue, Nov 25, 2014 at 8:41 AM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote: How will a mixed bag work on the consumer side? An entire site cannot be rolled at once, so consumers will have to deal with new and old serialized bytes? This could be the app team's responsibility. Are you guys targeting the 0.8.2 release, which may break customers who are already using the new producer API (beta version)? Thanks, Bhavesh On Tue, Nov 25, 2014 at 8:29 AM, Manikumar Reddy ku...@nmsworks.co.in wrote: +1 for this change. What about the de-serializer class in 0.8.2? Say I am using the new producer with Avro and an old consumer combination. Then I need to give a custom Decoder implementation for Avro, right? On Tue, Nov 25, 2014 at 9:19 PM, Joe Stein joe.st...@stealth.ly wrote: The serializer is an expected use of the producer/consumer now and I think we should continue that support in the new client. As far as breaking the API, that is why we released the 0.8.2-beta: to help get through just these types of blocking issues in a way that the community at large could be involved in more easily, with a build/binaries to download and use from maven also.
+1 on the change now prior to the 0.8.2 release. - Joe Stein On Mon, Nov 24, 2014 at 11:43 PM, Sriram Subramanian srsubraman...@linkedin.com.invalid wrote: Looked at the patch. +1 from me. On 11/24/14 8:29 PM, Gwen Shapira gshap...@cloudera.com wrote: As one of the people who spent too much time building Avro repositories, +1 on bringing the serializer API back. I think it will make the new producer easier to work with. Gwen On Mon, Nov 24, 2014 at 6:13 PM, Jay Kreps jay.kr...@gmail.com wrote: This is admittedly late in the release cycle to make a change. To add to Jun's description, the motivation was that we felt it would be better to change that interface now rather than after the release if it needed to change. The motivation for wanting to make a change was the ability to really be able to develop support for Avro and other serialization formats. The current status is pretty scattered--there is a schema repository on an Avro JIRA and another fork of that on github, and a bunch of people we have talked to have done similar things for other serialization systems. It would be nice if these things could be packaged in such a way that it was possible to just change a few configs in the producer and get rich metadata support for messages. As we were thinking this through we realized that the new api we were about to introduce was kind of not very compatible with this since it was just byte[] oriented. You can always do this by adding some kind of wrapper api that wraps the producer. But this puts us back in the position of trying to document and support multiple interfaces. This also opens up the possibility of adding a MessageValidator or MessageInterceptor plug-in transparently, so that you can do other custom validation on the messages you are sending, which obviously requires access to the original object, not the byte array. This api doesn't prevent using byte[]: by configuring the ByteArraySerializer it works as it currently does.
-Jay On Mon, Nov 24, 2014 at 5:58 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I'd like to start a discussion on whether it makes sense to add the serializer api back to the new java producer. Currently, the new java producer takes a byte array for both the key and the value. While this api is simple, it pushes the serialization logic into the application. This makes it hard to reason about what type of data is being sent to Kafka and
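The api being discussed can be sketched roughly like this; the interface shape here is an assumption modeled on the thread (typed keys/values plus a pluggable serializer), not the final released signature:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical minimal serializer interface, shaped like the one under
// discussion: turn a typed key or value into the byte[] the producer sends.
interface Serializer<T> {
    byte[] serialize(String topic, T data);
}

// A trivial String implementation; a producer configured with this would
// call it for every record instead of forcing the app to pass byte[].
class StringSerializer implements Serializer<String> {
    public byte[] serialize(String topic, String data) {
        return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
    }
}
```

An Avro or Thrift serializer would plug in the same way, which is the "change a few configs and get rich metadata support" goal Jay describes above.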
Re: issues using the new 0.8.2 producer
All clear, Thank you. I guess an example will be available when the version is released. Shlomi On Tue, Nov 25, 2014 at 7:33 AM, Jun Rao jun...@gmail.com wrote: 1. The new producer takes only the new producer configs. 2. There is no longer a pluggable partitioner. By default, if a key is provided, the producer hashes the bytes to get the partition. There is an interface for the client to explicitly specify a partition, if it wants to. 3. Currently, the new producer only takes bytes. We are discussing now if we want to make it take generic types like the old producer. Thanks, Jun On Sun, Nov 23, 2014 at 2:12 AM, Shlomi Hazan shl...@viber.com wrote: Hi, Started to dig into that new producer and have a few questions: 1. what part (if any) of the old producer config still apply to the new producer or is it just what is specified on New Producer Configs? 2. how do you specify a partitioner to the new producer? if no such option, what usage is made with the given key? is it simply hashed with Java's String API? 3. the javadoc example ( ProducerRecord record = new ProducerRecord(the-topic, key, value); ) is incorrect and shows as if creating a producer record takes 3 strings whereas it takes byte arrays for the last two arguments. will the final API be the one documented or rather the one implemented? I am really missing a working example for the new producer so if anyone has one I will be happy to get inspired... Shlomi
issues using the new 0.8.2 producer
Hi, Started to dig into that new producer and have a few questions: 1. what part (if any) of the old producer config still apply to the new producer or is it just what is specified on New Producer Configs? 2. how do you specify a partitioner to the new producer? if no such option, what usage is made with the given key? is it simply hashed with Java's String API? 3. the javadoc example ( ProducerRecord record = new ProducerRecord(the-topic, key, value); ) is incorrect and shows as if creating a producer record takes 3 strings whereas it takes byte arrays for the last two arguments. will the final API be the one documented or rather the one implemented? I am really missing a working example for the new producer so if anyone has one I will be happy to get inspired... Shlomi
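On point 2 above (the producer hashes the key bytes to pick a partition), here is a self-contained sketch of that contract; the actual hash function the real producer uses is not reproduced here, the point is only that equal key bytes map deterministically to one partition:

```java
import java.util.Arrays;

// Hedged sketch of hash-based default partitioning: the real producer's
// hash function differs, but the contract is the same - equal key bytes
// always map to the same partition, so keyed records stay together.
class PartitionSketch {
    static int partitionFor(byte[] keyBytes, int numPartitions) {
        // mask the sign bit so the modulo result is always non-negative
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
    }
}
```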
Re: will adding partitions to existing topic change leadership to existing partitions?
Thank you, Guozhang. All clear now. On Thu, Nov 20, 2014 at 1:29 AM, Guozhang Wang wangg...@gmail.com wrote: Hi Shlomi, By just use kafka-topics.sh --zookeeper localhost:2181 --alter --topic test_topic --partitions 8 the controller will auto assign replicas to the newly added partitions, but will not touch the existing ones. Guozhang On Mon, Nov 17, 2014 at 11:13 PM, Shlomi Hazan shl...@viber.com wrote: Hi Guozhang, Sorry for being too brief but the question referred to adding partitions with the topic tool (without specifying json file). I was not aware of the json file option at all and so my question is dealing with the case the partitions are added like: kafka-topics.sh --zookeeper localhost:2181 --alter --topic test_topic --partitions 8 Will then existing partitions be subject to leadership change? 10x Shlomi On Mon, Nov 17, 2014 at 7:04 PM, Guozhang Wang wangg...@gmail.com wrote: Hi Shlomi, As long as your json file indicating the partition addition operation does not touch the existing partitions you should be fine. Guozhang On Mon, Nov 17, 2014 at 1:08 AM, Shlomi Hazan shl...@viber.com wrote: Hi I want to add partitions to a running topic, and since I use the python producer I will eventually have to restart producers to reflect the change. the question is if leadership will change for the existing partitions too, forcing me to immediately restart the producers. 10x, Shlomi -- -- Guozhang -- -- Guozhang
will adding partitions to existing topic change leadership to existing partitions?
Hi I want to add partitions to a running topic, and since I use the python producer I will eventually have to restart producers to reflect the change. the question is if leadership will change for the existing partitions too, forcing me to immediately restart the producers. 10x, Shlomi
selecting java producer (0.8.2 or 0.8.1.1?)
Hi, I need to make a choice and I can't get a full picture on the differences between the two. E.g.: Are both producers async capable to the same extent? Is the new producer stable for production? Is there some usage example for the new producer? What are the tradeoffs using one or another? 10x, Shlomi
Re: will adding partitions to existing topic change leadership to existing partitions?
Hi Guozhang, Sorry for being too brief but the question referred to adding partitions with the topic tool (without specifying json file). I was not aware of the json file option at all and so my question is dealing with the case the partitions are added like: kafka-topics.sh --zookeeper localhost:2181 --alter --topic test_topic --partitions 8 Will then existing partitions be subject to leadership change? 10x Shlomi On Mon, Nov 17, 2014 at 7:04 PM, Guozhang Wang wangg...@gmail.com wrote: Hi Shlomi, As long as your json file indicating the partition addition operation does not touch the existing partitions you should be fine. Guozhang On Mon, Nov 17, 2014 at 1:08 AM, Shlomi Hazan shl...@viber.com wrote: Hi I want to add partitions to a running topic, and since I use the python producer I will eventually have to restart producers to reflect the change. the question is if leadership will change for the existing partitions too, forcing me to immediately restart the producers. 10x, Shlomi -- -- Guozhang
Re: 0.8.2 producer with 0.8.1.1 cluster?
10x Christian On Thu, Nov 13, 2014 at 9:50 AM, cac...@gmail.com cac...@gmail.com wrote: I used the 0.8.2 producer in a 0.8.1 cluster in a nonproduction environment. No problems to report it worked great, but my testing at that time was not particularly extensive for failure scenarios. Christian On Wed, Nov 12, 2014 at 10:37 PM, Shlomi Hazan shl...@viber.com wrote: I was asking to know if there's a point in trying... From your answer I understand the answer is yes. 10x, Shlomi On Wed, Nov 12, 2014 at 7:04 PM, Guozhang Wang wangg...@gmail.com wrote: Shlomi, It should be compatible, did you see any issues using it against a 0.8.1.1 cluster? Guozhang On Wed, Nov 12, 2014 at 5:43 AM, Shlomi Hazan shl...@viber.com wrote: Hi, Is the new producer 0.8.2 supposed to work with 0.8.1.1 cluster? Shlomi -- -- Guozhang
0.8.2 producer with 0.8.1.1 cluster?
Hi, Is the new producer 0.8.2 supposed to work with 0.8.1.1 cluster? Shlomi
Re: expanding cluster and reassigning parititions without restarting producer
Neha, I understand that the producer kafka.javaapi.producer.Producer shown in examples is old, and that a new producer (org.apache.kafka.clients.producer) is available? Is it available for 0.8.1.1? How does it work? Does it have a trigger fired when partitions are added, or does the producer refresh some cache every given time period? Shlomi On Tue, Nov 11, 2014 at 4:25 AM, Neha Narkhede neha.narkh...@gmail.com wrote: How can I auto refresh keyed producers to use new partitions as these partitions are added? Try using the new producer under org.apache.kafka.clients.producer. Thanks, Neha On Mon, Nov 10, 2014 at 8:52 AM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote: I had different experience with expanding partitions for the new producer and its impact. I only tried for non-key messages. I would always advise to keep batch size relatively low, or plan for expansion with the new java producer in advance or since inception, otherwise running producer code is impacted. Here is the mail chain: http://mail-archives.apache.org/mod_mbox/kafka-dev/201411.mbox/%3ccaoejijit4cgry97dgzfjkfvaqfduv-o1x1kafefbshgirkm...@mail.gmail.com%3E Thanks, Bhavesh On Mon, Nov 10, 2014 at 5:20 AM, Shlomi Hazan shl...@viber.com wrote: Hmmm.. The Java producer example seems to ignore added partitions too... How can I auto refresh keyed producers to use new partitions as these partitions are added? On Mon, Nov 10, 2014 at 12:33 PM, Shlomi Hazan shl...@viber.com wrote: One more thing: I saw that the Python client is also unaffected by the addition of partitions to a topic and that it continues to send requests only to the old partitions. Is this also handled appropriately by the Java producer? Will it see the change and produce to the new partitions as well? 
Shlomi On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan shl...@viber.com wrote: No I don't see anything like that, the question was aimed at learning if it is worthwhile to make the effort of reimplementing the Python producer in Java, I so I will not make all the effort just to be disappointed afterwards. understand I have nothing to worry about, so I will try to simulate this situation in small scale... maybe 3 brokers, one topic with one partition and then add partitions. we'll see. thanks for clarifying. Oh, Good luck with Confluent!! :) On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede neha.narkh...@gmail.com wrote: The producer might get an error code if the leader of the partitions being reassigned also changes. However it should retry and succeed. Do you see a behavior that suggests otherwise? On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan shl...@viber.com wrote: Hi All, I recently had an issue producing from python where expanding a cluster from 3 to 5 nodes and reassigning partitions forced me to restart the producer b/c of KeyError thrown. Is this situation handled by the Java producer automatically or need I do something to have the java producer refresh itself to see the reassigned partition layout and produce away ? Shlomi
Re: zookeeper snapshot files eat up disk space
That looks like a complete answer. BUT just to be sure: it says "Automatic purging of the snapshots and corresponding transaction logs was introduced in version 3.4.0". Using 0.8.1.1 (which bundles an older ZooKeeper) means that I will have to purge manually, right? Is there some convention for Kafka users? E.g.: delete all but the last X=3 maybe? Shlomi On Tue, Nov 11, 2014 at 4:27 PM, Joe Stein joe.st...@stealth.ly wrote: http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop / On Tue, Nov 11, 2014 at 9:24 AM, Shlomi Hazan shl...@viber.com wrote: Hi, My zookeeper 'dataLogDir' is eating up my disk with tons of snapshot files. what are these files? what files can I delete? are week-old files disposable? This folder only gets bigger... How can I avoid blowing my disk? Thanks, Shlomi
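For ZooKeeper 3.4.0+ the purge can be automated in zoo.cfg; on older ZooKeeper releases the same cleanup has to be scheduled externally (e.g. cron'ing the bundled zkCleanup.sh script). Retention values below are illustrative, not a recommendation:

```properties
# zoo.cfg (ZooKeeper 3.4.0+): keep the 3 most recent snapshots plus their
# transaction logs, and run the purge task every 24 hours
autopurge.snapRetainCount=3
autopurge.purgeInterval=24
```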
Re: expanding cluster and reassigning parititions without restarting producer
Understood. Thank you guys. On Wed, Nov 12, 2014 at 4:48 AM, Jun Rao jun...@gmail.com wrote: Just to extend what Neha said. The new producer also picks up the new partitions by refreshing the metadata periodically (controlled by metadata.max.age.ms). The new producer distributes the data more evenly to all partitions than the old producer. Thanks, Jun On Tue, Nov 11, 2014 at 11:19 AM, Neha Narkhede neha.narkh...@gmail.com wrote: The new producer is available in 0.8.2-beta (the most recent Kafka release). The old producer only detects new partitions at an interval configured by topic.metadata.refresh.interval.ms. This constraint is no longer true for the new producer and you would likely end up with an even distribution of data across all partitions. If you want to stay with the old producer on 0.8.1.1, you can try reducing topic.metadata.refresh.interval.ms but it may have some performance impact on the Kafka cluster since it ends up sending topic metadata requests to the broker at that interval. Thanks, Neha On Tue, Nov 11, 2014 at 1:45 AM, Shlomi Hazan shl...@viber.com wrote: Neha, I understand that the producer kafka.javaapi.producer.Producer shown in examples is old, and that a new producer (org.apache.kafka.clients.producer) is avail? is it available for 0.8.1.1? how does it work? does it have a trigger fired when partitions are added or does the producer refresh some cache every some given time period? Shlomi On Tue, Nov 11, 2014 at 4:25 AM, Neha Narkhede neha.narkh...@gmail.com wrote: How can I auto refresh keyed producers to use new partitions as these partitions are added? Try using the new producer under org.apache.kafka.clients.producer. Thanks, Neha On Mon, Nov 10, 2014 at 8:52 AM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote: I had different experience with expanding partition for new producer and its impact. 
I only tried for non-key message.I would always advice to keep batch size relatively low or plan for expansion with new java producer in advance or since inception otherwise running producer code is impacted. Here is mail chain: http://mail-archives.apache.org/mod_mbox/kafka-dev/201411.mbox/%3ccaoejijit4cgry97dgzfjkfvaqfduv-o1x1kafefbshgirkm...@mail.gmail.com%3E Thanks, Bhavesh On Mon, Nov 10, 2014 at 5:20 AM, Shlomi Hazan shl...@viber.com wrote: Hmmm.. The Java producer example seems to ignore added partitions too... How can I auto refresh keyed producers to use new partitions as these partitions are added? On Mon, Nov 10, 2014 at 12:33 PM, Shlomi Hazan shl...@viber.com wrote: One more thing: I saw that the Python client is also unaffected by addition of partitions to a topic and that it continues to send requests only to the old partitions. is this also handled appropriately by the Java producer? Will he see the change and produce to the new partitions as well? Shlomi On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan shl...@viber.com wrote: No I don't see anything like that, the question was aimed at learning if it is worthwhile to make the effort of reimplementing the Python producer in Java, I so I will not make all the effort just to be disappointed afterwards. understand I have nothing to worry about, so I will try to simulate this situation in small scale... maybe 3 brokers, one topic with one partition and then add partitions. we'll see. thanks for clarifying. Oh, Good luck with Confluent!! :) On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede neha.narkh...@gmail.com wrote: The producer might get an error code if the leader of the partitions being reassigned also changes. However it should retry and succeed. Do you see a behavior that suggests otherwise? 
On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan shl...@viber.com wrote: Hi All, I recently had an issue producing from python where expanding a cluster from 3 to 5 nodes and reassigning partitions forced me to restart the producer b/c of KeyError thrown. Is this situation handled by the Java producer automatically or need I do something to have the java producer refresh itself to see the reassigned partition layout and produce away ? Shlomi
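The two refresh intervals discussed in this thread, as a config sketch (the values shown are illustrative, not recommendations):

```properties
# old (Scala) producer: refresh topic metadata this often; lowering it
# picks up added partitions sooner at the cost of extra metadata requests
topic.metadata.refresh.interval.ms=60000

# new (Java) producer: metadata is refreshed at least this often even
# without errors, so added partitions are picked up automatically
metadata.max.age.ms=300000
```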
Re: expanding cluster and reassigning parititions without restarting producer
One more thing: I saw that the Python client is also unaffected by the addition of partitions to a topic and that it continues to send requests only to the old partitions. Is this also handled appropriately by the Java producer? Will it see the change and produce to the new partitions as well? Shlomi On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan shl...@viber.com wrote: No, I don't see anything like that; the question was aimed at learning if it is worthwhile to make the effort of reimplementing the Python producer in Java, so I will not make all the effort just to be disappointed afterwards. I understand I have nothing to worry about, so I will try to simulate this situation at small scale... maybe 3 brokers, one topic with one partition, and then add partitions. we'll see. thanks for clarifying. Oh, Good luck with Confluent!! :) On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede neha.narkh...@gmail.com wrote: The producer might get an error code if the leader of the partitions being reassigned also changes. However it should retry and succeed. Do you see a behavior that suggests otherwise? On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan shl...@viber.com wrote: Hi All, I recently had an issue producing from python where expanding a cluster from 3 to 5 nodes and reassigning partitions forced me to restart the producer b/c of a KeyError thrown. Is this situation handled by the Java producer automatically, or need I do something to have the java producer refresh itself to see the reassigned partition layout and produce away? Shlomi
Re: expanding cluster and reassigning parititions without restarting producer
Hmmm.. The Java producer example seems to ignore added partitions too... How can I auto refresh keyed producers to use new partitions as these partitions are added? On Mon, Nov 10, 2014 at 12:33 PM, Shlomi Hazan shl...@viber.com wrote: One more thing: I saw that the Python client is also unaffected by addition of partitions to a topic and that it continues to send requests only to the old partitions. is this also handled appropriately by the Java producer? Will he see the change and produce to the new partitions as well? Shlomi On Mon, Nov 10, 2014 at 9:34 AM, Shlomi Hazan shl...@viber.com wrote: No I don't see anything like that, the question was aimed at learning if it is worthwhile to make the effort of reimplementing the Python producer in Java, I so I will not make all the effort just to be disappointed afterwards. understand I have nothing to worry about, so I will try to simulate this situation in small scale... maybe 3 brokers, one topic with one partition and then add partitions. we'll see. thanks for clarifying. Oh, Good luck with Confluent!! :) On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede neha.narkh...@gmail.com wrote: The producer might get an error code if the leader of the partitions being reassigned also changes. However it should retry and succeed. Do you see a behavior that suggests otherwise? On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan shl...@viber.com wrote: Hi All, I recently had an issue producing from python where expanding a cluster from 3 to 5 nodes and reassigning partitions forced me to restart the producer b/c of KeyError thrown. Is this situation handled by the Java producer automatically or need I do something to have the java producer refresh itself to see the reassigned partition layout and produce away ? Shlomi
Re: expanding cluster and reassigning parititions without restarting producer
No, I don't see anything like that; the question was aimed at learning if it is worthwhile to make the effort of reimplementing the Python producer in Java, so I will not make all the effort just to be disappointed afterwards. I understand I have nothing to worry about, so I will try to simulate this situation at small scale... maybe 3 brokers, one topic with one partition, and then add partitions. we'll see. thanks for clarifying. Oh, Good luck with Confluent!! :) On Mon, Nov 10, 2014 at 4:17 AM, Neha Narkhede neha.narkh...@gmail.com wrote: The producer might get an error code if the leader of the partitions being reassigned also changes. However it should retry and succeed. Do you see a behavior that suggests otherwise? On Sat, Nov 8, 2014 at 11:45 PM, Shlomi Hazan shl...@viber.com wrote: Hi All, I recently had an issue producing from python where expanding a cluster from 3 to 5 nodes and reassigning partitions forced me to restart the producer b/c of a KeyError thrown. Is this situation handled by the Java producer automatically, or need I do something to have the java producer refresh itself to see the reassigned partition layout and produce away? Shlomi
expanding cluster and reassigning parititions without restarting producer
Hi All, I recently had an issue producing from python where expanding a cluster from 3 to 5 nodes and reassigning partitions forced me to restart the producer b/c of KeyError thrown. Is this situation handled by the Java producer automatically or need I do something to have the java producer refresh itself to see the reassigned partition layout and produce away ? Shlomi
Re: partitions stealing balancing consumer threads across servers
Jun, Joel, The issue here is exactly which threads are left out and which threads are assigned partitions. Maybe I am missing something, but what I want is to balance consuming threads across machines/processes, regardless of the number of threads each machine launches (side effect: this way, if you have more threads than partitions, you get a reserve force awaiting to charge in). Example: launching 4 processes on 4 different machines, with 4 threads per process, on a 12-partition topic should leave each machine with 3 assigned threads and one doing nothing. Moreover, no matter how many threads each process has, as long as it is bigger than 3, the end result should stay the same: 3 assigned threads per machine, and the rest doing nothing. Ideally, I would want something like a consumer set/ensemble/{whatever word, not group} used to denote a group of threads on a machine, so that when specific threads request to join a consumer group they are elected such that they are balanced across the machines denoted by the consumer set/ensemble identifier. Will partition.assignment.strategy=roundrobin help with that? 10x, Shlomi On Thu, Oct 30, 2014 at 4:00 AM, Joel Koshy jjkosh...@gmail.com wrote: Shlomi, If you are on trunk, and your consumer subscriptions are identical, then you can try a slightly different partition assignment strategy. Try setting partition.assignment.strategy=roundrobin in your consumer config. Thanks, Joel On Wed, Oct 29, 2014 at 06:29:30PM -0700, Jun Rao wrote: By consumer, I actually mean consumer threads (the thread # you used when creating consumer streams). So, if you have 4 consumers, each with 4 threads, 4 of the threads will not get any data with 12 partitions. It sounds like that's not what you get? What's the output of the ConsumerOffsetChecker (see http://kafka.apache.org/documentation.html)? For consumer.id, you don't need to set it in general. We generate some uuid automatically.
Thanks, Jun On Tue, Oct 28, 2014 at 4:59 AM, Shlomi Hazan shl...@viber.com wrote: Jun, I hear you say partitions are evenly distributed among all consumers in the same group, yet I did bump into a case where launching a process with X high-level consumer API threads took over all partitions, leaving the existing consumers unemployed. According to the claim above, and if I am not mistaken: on a topic T with 12 partitions and 3 consumers C1-C3 in the same group with 4 threads each, adding a new consumer C4 with 12 threads should yield the following balance: C1-C3 each relinquish a single partition, holding only 3 partitions each; C4 holds the 3 partitions relinquished by C1-C3. Yet, in the case I described, what happened is that C4 gained all 12 partitions and sent C1-C3 out of business with 0 partitions each. Now, maybe I overlooked something, but I think I did see that happen. BTW, what key is used to distinguish one consumer from another? consumer.id? The docs for consumer.id say 'Generated automatically if not set'. What is the best practice for setting its value? Leave it empty? Is the server host name good enough? What are the considerations? When using the high-level consumer API, are all threads identified as the same consumer? I guess they are, right?... Thanks, Shlomi On Tue, Oct 28, 2014 at 4:21 AM, Jun Rao jun...@gmail.com wrote: You can take a look at the consumer rebalancing algorithm part in http://kafka.apache.org/documentation.html. Basically, partitions are evenly distributed among all consumers in the same group. If there are more consumers in a group than partitions, some consumers will never get any data. Thanks, Jun On Mon, Oct 27, 2014 at 4:14 AM, Shlomi Hazan shl...@viber.com wrote: Hi All, Using Kafka's high-level consumer API, I have bumped into a situation where launching a consumer process P with X consuming threads on a topic with X partitions kicks out all other existing consumer threads that consumed prior to launching the process P. That is, consumer process P is stealing all partitions from all other consumer processes. While understandable, it makes it hard to size and deploy a cluster with a number of partitions that will, on the one hand, allow balancing of consumption across consuming processes (dividing the partitions across consumers by setting each consumer with its share of the total number of partitions on the consumed topic) and, on the other hand, provide room for growth and the addition of new consumers to help with increasing traffic into the cluster and the topic. This stealing effect forces me to either have more partitions than really needed at the moment, planning for future growth, or stick to what I need and trust the option to add partitions
Re: partitions stealing balancing consumer threads across servers
Jun, I hear you say partitions are evenly distributed among all consumers in the same group, yet I did bump into a case where launching a process with X high-level consumer API threads took over all partitions, leaving the existing consumers unemployed. According to the claim above, and if I am not mistaken: on a topic T with 12 partitions and 3 consumers C1-C3 in the same group with 4 threads each, adding a new consumer C4 with 12 threads should yield the following balance: C1-C3 each relinquish a single partition, holding only 3 partitions each; C4 holds the 3 partitions relinquished by C1-C3. Yet, in the case I described, what happened is that C4 gained all 12 partitions and sent C1-C3 out of business with 0 partitions each. Now, maybe I overlooked something, but I think I did see that happen. BTW, what key is used to distinguish one consumer from another? consumer.id? The docs for consumer.id say 'Generated automatically if not set'. What is the best practice for setting its value? Leave it empty? Is the server host name good enough? What are the considerations? When using the high-level consumer API, are all threads identified as the same consumer? I guess they are, right?... Thanks, Shlomi On Tue, Oct 28, 2014 at 4:21 AM, Jun Rao jun...@gmail.com wrote: You can take a look at the consumer rebalancing algorithm part in http://kafka.apache.org/documentation.html. Basically, partitions are evenly distributed among all consumers in the same group. If there are more consumers in a group than partitions, some consumers will never get any data. Thanks, Jun On Mon, Oct 27, 2014 at 4:14 AM, Shlomi Hazan shl...@viber.com wrote: Hi All, Using Kafka's high-level consumer API, I have bumped into a situation where launching a consumer process P with X consuming threads on a topic with X partitions kicks out all other existing consumer threads that consumed prior to launching the process P. That is, consumer process P is stealing all partitions from all other consumer processes. While understandable, it makes it hard to size and deploy a cluster with a number of partitions that will, on the one hand, allow balancing of consumption across consuming processes (dividing the partitions across consumers by setting each consumer with its share of the total number of partitions on the consumed topic) and, on the other hand, provide room for growth and the addition of new consumers to help with increasing traffic into the cluster and the topic. This stealing effect forces me to either have more partitions than really needed at the moment, planning for future growth, or stick to what I need and trust the option to add partitions, which comes with a price in terms of restarting consumers, bumping into out-of-order messages (hash partitioning), etc. Is this stealing policy intended, or did I just jump to conclusions? What is the way to cope with the sizing question? Shlomi
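The "out of order messages (hash partitioning)" price mentioned above can be made concrete with a toy sketch (illustrative only; integer keys stand in for hashed keys): once the partition count grows, most keys map to a different partition, so per-key ordering is not preserved across the expansion.

```python
# Illustrative sketch: with modulo-style hash partitioning, growing the
# partition count remaps existing keys, so per-key ordering guarantees
# are lost across the expansion.

def partition_for(key, num_partitions):
    return key % num_partitions  # stand-in for hash(key) % n

before = {k: partition_for(k, 12) for k in range(100)}
after  = {k: partition_for(k, 16) for k in range(100)}

moved = [k for k in range(100) if before[k] != after[k]]
# Most keys land on a different partition after the expansion, which is
# why adding partitions "comes with a price" for keyed ordering.
assert moved  # at least one key changed partitions
```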
partitions stealing balancing consumer threads across servers
Hi All, Using Kafka's high-level consumer API, I have bumped into a situation where launching a consumer process P with X consuming threads on a topic with X partitions kicks out all other existing consumer threads that consumed prior to launching the process P. That is, consumer process P is stealing all partitions from all other consumer processes. While understandable, it makes it hard to size and deploy a cluster with a number of partitions that will, on the one hand, allow balancing of consumption across consuming processes (dividing the partitions across consumers by setting each consumer with its share of the total number of partitions on the consumed topic) and, on the other hand, provide room for growth and the addition of new consumers to help with increasing traffic into the cluster and the topic. This stealing effect forces me to either have more partitions than really needed at the moment, planning for future growth, or stick to what I need and trust the option to add partitions, which comes with a price in terms of restarting consumers, bumping into out-of-order messages (hash partitioning), etc. Is this stealing policy intended, or did I just jump to conclusions? What is the way to cope with the sizing question? Shlomi
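The behavior discussed in this thread follows from the range assignment algorithm described in the consumer rebalancing section of the 0.8 documentation. A simplified sketch (not the broker code; thread names like "C1-0" are illustrative): partitions and consumer threads are sorted, and each thread gets a contiguous range, so with more threads than partitions the threads that sort last get nothing.

```python
# Simplified sketch of Kafka 0.8's range partition assignment: sort
# partitions and consumer-thread ids, then hand each thread a
# contiguous slice. With 4 consumers x 4 threads on 12 partitions,
# 4 threads end up with no partitions at all.

def range_assign(partitions, threads):
    partitions, threads = sorted(partitions), sorted(threads)
    per, extra = divmod(len(partitions), len(threads))
    assignment, start = {}, 0
    for i, t in enumerate(threads):
        n = per + (1 if i < extra else 0)
        assignment[t] = partitions[start:start + n]
        start += n
    return assignment

threads = [f"C{c}-{t}" for c in range(1, 5) for t in range(4)]
a = range_assign(list(range(12)), threads)
idle = [t for t, ps in a.items() if not ps]
assert len(idle) == 4  # 4 of the 16 threads get no partitions
```

Note that which threads end up idle depends entirely on the sorted order of the consumer/thread ids, which is one reason a newly launched process can displace existing consumers rather than share with them.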
Re: taking broker down and returning it does not restore cluster state (nor rebalance)
Trying to reproduce this failed: after some fairly long minutes I noticed that the partition leaders regained balance, and the only issue left is that the preferred replicas were not balanced as they were before taking the broker down. Meaning, the output of the topic description shows broker 1 (out of 3) as the preferred replica (first in the ISR) in 66% of the cases instead of the expected 33%. On Mon, Oct 20, 2014 at 11:36 PM, Joel Koshy jjkosh...@gmail.com wrote: As Neha mentioned, with a replication factor of 2, this shouldn't normally cause an issue. Taking the broker down will cause the leader to move to another replica; consumers and producers will rediscover the new leader; no rebalances should be triggered. When you bring the broker back up, unless you run a preferred replica leader re-election, the broker will remain a follower. Again, there will be no effect on the producers or consumers (i.e., no rebalances). If you can reproduce this easily, can you please send exact steps to reproduce and send over your consumer logs? Thanks, Joel On Mon, Oct 20, 2014 at 09:13:27PM +0300, Shlomi Hazan wrote: Yes I did. It is set to 2. On Oct 20, 2014 5:38 PM, Neha Narkhede neha.narkh...@gmail.com wrote: Did you ensure that your replication factor was set higher than 1? If so, things should recover automatically after adding the killed broker back into the cluster. On Mon, Oct 20, 2014 at 1:32 AM, Shlomi Hazan shl...@viber.com wrote: Hi, Running some tests on 0.8.1.1, I wanted to see what happens when a broker is taken down with 'kill'. I bumped into the situation in the subject, where bringing the broker back up left it somewhat out of the game, as far as I could see using Stackdriver metrics. Trying to rebalance with verify consumer rebalance returned a 'no owner for partition' error for all partitions of that topic (128 partitions). Moreover, aside from the issue at hand, changing the group name to a non-existent group returned success. Taking both the consumers and producers down allowed the rebalance to return success... And the question is: how do you restore 100% of the cluster state after taking down a broker? What is the best practice? What needs to be checked, and what needs to be done? Shlomi
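The leadership imbalance observed in this thread can be quantified from the topic description: the first broker in each partition's replica list is the preferred leader, and balance is restored once each partition is led by its preferred replica again. A sketch with made-up assignments (6 partitions over brokers 1-3, after broker 1 was bounced):

```python
# Sketch: quantify preferred-leader imbalance from a topic description.
# The first broker in each partition's replica list is its preferred
# leader; after a broker bounce, leadership stays wherever it moved
# until a preferred-replica election runs. All assignments are made up.

from collections import Counter

replicas = {0: [1, 2], 1: [2, 3], 2: [3, 1], 3: [1, 3], 4: [2, 1], 5: [3, 2]}
leaders  = {0: 2, 1: 2, 2: 3, 3: 3, 4: 2, 5: 3}  # after broker 1 bounced

preferred = {p: r[0] for p, r in replicas.items()}
not_preferred = [p for p in leaders if leaders[p] != preferred[p]]

# Broker 1 leads none of its preferred partitions until re-election:
assert Counter(leaders.values())[1] == 0
assert sorted(not_preferred) == [0, 3]
```

Kafka 0.8 ships bin/kafka-preferred-replica-election.sh to move leadership back to the preferred replicas; depending on the version, auto.leader.rebalance.enable can also do this automatically.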
0.8.1.2
Hi All, Will version 0.8.1.2 happen? Shlomi
taking broker down and returning it does not restore cluster state (nor rebalance)
Hi, Running some tests on 0.8.1.1, I wanted to see what happens when a broker is taken down with 'kill'. I bumped into the situation in the subject, where bringing the broker back up left it somewhat out of the game, as far as I could see using Stackdriver metrics. Trying to rebalance with verify consumer rebalance returned a 'no owner for partition' error for all partitions of that topic (128 partitions). Moreover, aside from the issue at hand, changing the group name to a non-existent group returned success. Taking both the consumers and producers down allowed the rebalance to return success... And the question is: how do you restore 100% of the cluster state after taking down a broker? What is the best practice? What needs to be checked, and what needs to be done? Shlomi
Re: taking broker down and returning it does not restore cluster state (nor rebalance)
Yes I did. It is set to 2. On Oct 20, 2014 5:38 PM, Neha Narkhede neha.narkh...@gmail.com wrote: Did you ensure that your replication factor was set higher than 1? If so, things should recover automatically after adding the killed broker back into the cluster. On Mon, Oct 20, 2014 at 1:32 AM, Shlomi Hazan shl...@viber.com wrote: Hi, Running some tests on 0811 and wanted to see what happens when a broker is taken down with 'kill'. I bumped into the situation at the subject where launching the broker back left him a bit out of the game as far as I could see using stack driver metrics. Trying to rebalance with verify consumer rebalance return an error no owner for partition for all partitions of that topic (128 partitions). moreover, yet aside from the issue at hand, changing the group name to a non-existent group returned success. taking both the consumers and producers down allowed the rebalance to return success... And the question is: How do you restore 100% state after taking down a broker? what is the best practice? what needs be checked and what needs be done? Shlomi
Re: programmatically get number of items in topic/partition
Bingo. 10x!! On Wed, Oct 1, 2014 at 6:41 PM, chetan conikee coni...@gmail.com wrote: The other method is via command line bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group *groupName* --zkconnect *zkServer:2181* Refer : https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-ConsumerOffsetChecker https://apache.googlesource.com/kafka/+/0.8.0-beta1-candidate1/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala On Wed, Oct 1, 2014 at 8:28 AM, Gwen Shapira gshap...@cloudera.com wrote: Take a look at ConsumerOffsetChecker. It does just that: print the offset and lag for each consumer and partition. You can either use that class directly, or use it as a guideline for your implementation On Wed, Oct 1, 2014 at 2:10 AM, Shlomi Hazan shl...@viber.com wrote: Hi, How can I programmatically get the number of items in a topic, pending for consumption? If no programmatic way is avail, what other method is available? Shlomi
Re: programmatically get number of items in topic/partition
Actually, this tool is not a 100% match for what I need, since it can only provide information on topics that have consumers. Is there an equivalent tool/method for querying topics that have no consumers? In that case, this tool will not help, as it requires a group id as a mandatory parameter... On Sun, Oct 5, 2014 at 1:22 PM, Shlomi Hazan shl...@viber.com wrote: Bingo. 10x!! On Wed, Oct 1, 2014 at 6:41 PM, chetan conikee coni...@gmail.com wrote: The other method is via command line bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group *groupName* --zkconnect *zkServer:2181* Refer : https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-ConsumerOffsetChecker https://apache.googlesource.com/kafka/+/0.8.0-beta1-candidate1/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala On Wed, Oct 1, 2014 at 8:28 AM, Gwen Shapira gshap...@cloudera.com wrote: Take a look at ConsumerOffsetChecker. It does just that: print the offset and lag for each consumer and partition. You can either use that class directly, or use it as a guideline for your implementation On Wed, Oct 1, 2014 at 2:10 AM, Shlomi Hazan shl...@viber.com wrote: Hi, How can I programmatically get the number of items in a topic, pending for consumption? If no programmatic way is available, what other method is available? Shlomi
programmatically get number of items in topic/partition
Hi, How can I programmatically get the number of items in a topic, pending for consumption? If no programmatic way is available, what other method is available? Shlomi
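The ConsumerOffsetChecker tool suggested in this thread reports lag directly; the underlying arithmetic is simply the log-end offset minus the committed consumer offset, summed over partitions. A sketch with made-up offset values:

```python
# Sketch of the lag arithmetic behind ConsumerOffsetChecker: messages
# pending consumption = log-end offset minus committed consumer offset,
# summed across partitions. Offsets below are made-up example values.

def pending_messages(log_end_offsets, committed_offsets):
    # A partition with no committed offset counts from offset 0.
    return sum(log_end_offsets[p] - committed_offsets.get(p, 0)
               for p in log_end_offsets)

log_end   = {0: 1500, 1: 900, 2: 1200}   # latest offset per partition
committed = {0: 1400, 1: 900, 2: 1000}   # consumer group's position

assert pending_messages(log_end, committed) == 300
```

For topics with no consumer group at all, kafka.tools.GetOffsetShell can print the earliest and latest offsets per partition, and the total message count follows from the same subtraction.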
Re: Error in acceptor (kafka.network.Acceptor)
No, just a bare centos 6.5 on an EC2 instance On Sep 11, 2014 1:39 AM, Jun Rao jun...@gmail.com wrote: I meant whether you start the broker in service containers like jetty or tomcat. Thanks, Jun On Wed, Sep 10, 2014 at 12:28 AM, Shlomi Hazan shl...@viber.com wrote: Hi, sorry, what do you mean by 'container'? I use bare EC2 instances... Shlomi On Wed, Sep 10, 2014 at 1:41 AM, Jun Rao jun...@gmail.com wrote: Are you starting the broker in some container? You want to make sure that the container doesn't overwrite the open file handler limit. Thanks, Jun On Tue, Sep 9, 2014 at 12:05 AM, Shlomi Hazan shl...@viber.com wrote: Hi, it's probably beyond that. it may be an issue with the number of files Kafka can have opened concurrently. A previous conversation with Joe about (build failes for latest stable source tgz (kafka_2.9.2-0.8.1.1)) turned out to discuss this (Q's by Joe, A's by me): 1. what else on the logs? [*see below*] 2. other broker failure reason? [**] 3. other broker failure after taking leadership? [*how can I be sure? ask another to describe topic?*] 4. how do I measure number of connections? [*ls -l /proc/pid/fd | grep socket | wc -l, also did watch on that*] 5. is that number equals the number of {new Producer}? [*yes*] 6. how many topics? [*1*] how many partitions [*504*] 7. Are u using a partition key? [*yes, I use the python client with* ] *class ProducerIdPartitioner(Partitioner):Implements a partitioner which selects the target partition based on the sending producer IDdef partition(self, key, partitions): size = len(partitions)prod_id = int(key)idx = prod_id % sizereturn partitions[idx]* 8. maybe running into over partitioned topic? [*producer instances is 6 machines * 84 procs * 24 threads, but never got to start them all*,*b/c of errors*] 9. r u running anything else? [*yes, zookeeper*] answer to 1,2: the error's I see on the python client are first timeouts and then message send failures, using sync send. 
on the controller log: controller.log.2014-08-26-13:[2014-08-26 13:40:44,317] ERROR [Controller-1-to-broker-3-send-thread], Controller 1 epoch 3 failed to send StopReplica request with correlation id 519 to broker id:3,host:shlomi-kafka-broker-3,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread) controller.log.2014-08-26-13:[2014-08-26 13:40:44,319] ERROR [Controller-1-to-broker-3-send-thread], Controller 1's connection to broker id:3,host:shlomi-kafka-broker-3,port:9092 was unsuccessful (kafka.controller.RequestSendThread) on the server log (selected greps): ... server.log.2014-08-27-01:[2014-08-27 01:44:23,143] ERROR [ReplicaFetcherThread-4-2], Error for partition [vpq_android_gcm_h,270] to broker 2:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread) ... server.log.2014-08-27-12:[2014-08-27 12:08:34,638] ERROR Closing socket for /10.184.150.54 because of error (kafka.network.Processor) ... server.log.2014-08-28-07:[2014-08-28 07:57:35,944] ERROR [KafkaApi-1] Error when processing fetch request for partition [vpq_android_gcm_h,184] offset 8798 from follower with correlation id 0 (kafka.server.KafkaApis) ...
erver.log.2014-09-03-15:[2014-09-03 15:46:18,220] ERROR [ReplicaFetcherThread-2-3], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 177593; ClientId: ReplicaFetcherThread-2-3; ReplicaId: 1; MaxWait: 1000 ms; MinBytes: 1 bytes; RequestInfo: [vpq_android_gcm_h,196] - PartitionFetchInfo(65283,8388608),[vpq_android_gcm_h,76] - PartitionFetchInfo(262787,8388608),[vpq_android_gcm_h,460] - PartitionFetchInfo(285709,8388608),[vpq_android_gcm_h,100] - PartitionFetchInfo(199405,8388608),[vpq_android_gcm_h,148] - PartitionFetchInfo(339032,8388608),[vpq_android_gcm_h,436] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,124] - PartitionFetchInfo(484447,8388608),[vpq_android_gcm_h,484] - PartitionFetchInfo(105945,8388608),[vpq_android_gcm_h,340] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,388] - PartitionFetchInfo(9,8388608),[vpq_android_gcm_h,316] - PartitionFetchInfo(194766,8388608),[vpq_android_gcm_h,364] - PartitionFetchInfo(139897,8388608),[vpq_android_gcm_h,292] - PartitionFetchInfo(195408,8388608),[vpq_android_gcm_h,28] - PartitionFetchInfo(329961,8388608),[vpq_android_gcm_h,172] - PartitionFetchInfo(436959,8388608),[vpq_android_gcm_h,268] - PartitionFetchInfo(59827,8388608),[vpq_android_gcm_h,244] - PartitionFetchInfo(259731,8388608),[vpq_android_gcm_h,220] - PartitionFetchInfo(61669,8388608),[vpq_android_gcm_h,412
Re: Error in acceptor (kafka.network.Acceptor)
Hi, sorry, what do you mean by 'container'? I use bare EC2 instances... Shlomi On Wed, Sep 10, 2014 at 1:41 AM, Jun Rao jun...@gmail.com wrote: Are you starting the broker in some container? You want to make sure that the container doesn't overwrite the open file handler limit. Thanks, Jun On Tue, Sep 9, 2014 at 12:05 AM, Shlomi Hazan shl...@viber.com wrote: Hi, it's probably beyond that. it may be an issue with the number of files Kafka can have opened concurrently. A previous conversation with Joe about (build failes for latest stable source tgz (kafka_2.9.2-0.8.1.1)) turned out to discuss this (Q's by Joe, A's by me): 1. what else on the logs? [*see below*] 2. other broker failure reason? [**] 3. other broker failure after taking leadership? [*how can I be sure? ask another to describe topic?*] 4. how do I measure number of connections? [*ls -l /proc/pid/fd | grep socket | wc -l, also did watch on that*] 5. is that number equals the number of {new Producer}? [*yes*] 6. how many topics? [*1*] how many partitions [*504*] 7. Are u using a partition key? [*yes, I use the python client with* ] *class ProducerIdPartitioner(Partitioner):Implements a partitioner which selects the target partition based on the sending producer IDdef partition(self, key, partitions):size = len(partitions)prod_id = int(key)idx = prod_id % sizereturn partitions[idx]* 8. maybe running into over partitioned topic? [*producer instances is 6 machines * 84 procs * 24 threads, but never got to start them all*,*b/c of errors*] 9. r u running anything else? [*yes, zookeeper*] answer to 1,2: the error's I see on the python client are first timeouts and then message send failures, using sync send. on the controller log: ontroller.log.2014-08-26-13:[2014-08-26 13:40:44,317] ERROR [Controller-1-to-broker-3-send-thread], Controller 1 epoch 3 failed to send StopReplica request with correlation id 519 to broker id:3,host:shlomi-kafka-broker-3,port:9092. Reconnecting to broker. 
(kafka.controller.RequestSendThread) controller.log.2014-08-26-13:[2014-08-26 13:40:44,319] ERROR [Controller-1-to-broker-3-send-thread], Controller 1's connection to broker id:3,host:shlomi-kafka-broker-3,port:9092 was unsuccessful (kafka.controller.RequestSendThread) on the server log (selected greps): ... server.log.2014-08-27-01:[2014-08-27 01:44:23,143] ERROR [ReplicaFetcherThread-4-2], Error for partition [vpq_android_gcm_h,270] to broker 2:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread) ... server.log.2014-08-27-12:[2014-08-27 12:08:34,638] ERROR Closing socket for /10.184.150.54 because of error (kafka.network.Processor) ... server.log.2014-08-28-07:[2014-08-28 07:57:35,944] ERROR [KafkaApi-1] Error when processing fetch request for partition [vpq_android_gcm_h,184] offset 8798 from follower with correlation id 0 (kafka.server.KafkaApis) ... erver.log.2014-09-03-15:[2014-09-03 15:46:18,220] ERROR [ReplicaFetcherThread-2-3], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 177593; ClientId: ReplicaFetcherThread-2-3; ReplicaId: 1; MaxWait: 1000 ms; MinBytes: 1 bytes; RequestInfo: [vpq_android_gcm_h,196] - PartitionFetchInfo(65283,8388608),[vpq_android_gcm_h,76] - PartitionFetchInfo(262787,8388608),[vpq_android_gcm_h,460] - PartitionFetchInfo(285709,8388608),[vpq_android_gcm_h,100] - PartitionFetchInfo(199405,8388608),[vpq_android_gcm_h,148] - PartitionFetchInfo(339032,8388608),[vpq_android_gcm_h,436] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,124] - PartitionFetchInfo(484447,8388608),[vpq_android_gcm_h,484] - PartitionFetchInfo(105945,8388608),[vpq_android_gcm_h,340] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,388] - PartitionFetchInfo(9,8388608),[vpq_android_gcm_h,316] - PartitionFetchInfo(194766,8388608),[vpq_android_gcm_h,364] - PartitionFetchInfo(139897,8388608),[vpq_android_gcm_h,292] - PartitionFetchInfo(195408,8388608),[vpq_android_gcm_h,28] - 
PartitionFetchInfo(329961,8388608),[vpq_android_gcm_h,172] - PartitionFetchInfo(436959,8388608),[vpq_android_gcm_h,268] - PartitionFetchInfo(59827,8388608),[vpq_android_gcm_h,244] - PartitionFetchInfo(259731,8388608),[vpq_android_gcm_h,220] - PartitionFetchInfo(61669,8388608),[vpq_android_gcm_h,412] - PartitionFetchInfo(563609,8388608),[vpq_android_gcm_h,4] - PartitionFetchInfo(360336,8388608),[vpq_android_gcm_h,52] - PartitionFetchInfo(378533,8388608) (kafka.server.ReplicaFetcherThread) ... server.log.2014-09-03-14:[2014-09-03 14:04:18,548] ERROR Error in acceptor (kafka.network.Acceptor) ... and these may not be all (other logs may have some more of that) Joe said to just lower the number of connections but I still can't see the exact problem. is there a kafka limit
Re: Error in acceptor (kafka.network.Acceptor)
Hi, it's probably beyond that. it may be an issue with the number of files Kafka can have opened concurrently. A previous conversation with Joe about (build failes for latest stable source tgz (kafka_2.9.2-0.8.1.1)) turned out to discuss this (Q's by Joe, A's by me): 1. what else on the logs? [*see below*] 2. other broker failure reason? [**] 3. other broker failure after taking leadership? [*how can I be sure? ask another to describe topic?*] 4. how do I measure number of connections? [*ls -l /proc/pid/fd | grep socket | wc -l, also did watch on that*] 5. is that number equals the number of {new Producer}? [*yes*] 6. how many topics? [*1*] how many partitions [*504*] 7. Are u using a partition key? [*yes, I use the python client with* ] *class ProducerIdPartitioner(Partitioner):Implements a partitioner which selects the target partition based on the sending producer IDdef partition(self, key, partitions):size = len(partitions)prod_id = int(key)idx = prod_id % sizereturn partitions[idx]* 8. maybe running into over partitioned topic? [*producer instances is 6 machines * 84 procs * 24 threads, but never got to start them all*,*b/c of errors*] 9. r u running anything else? [*yes, zookeeper*] answer to 1,2: the error's I see on the python client are first timeouts and then message send failures, using sync send. on the controller log: ontroller.log.2014-08-26-13:[2014-08-26 13:40:44,317] ERROR [Controller-1-to-broker-3-send-thread], Controller 1 epoch 3 failed to send StopReplica request with correlation id 519 to broker id:3,host:shlomi-kafka-broker-3,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread) controller.log.2014-08-26-13:[2014-08-26 13:40:44,319] ERROR [Controller-1-to-broker-3-send-thread], Controller 1's connection to broker id:3,host:shlomi-kafka-broker-3,port:9092 was unsuccessful (kafka.controller.RequestSendThread) on the server log (selected greps): ... 
server.log.2014-08-27-01:[2014-08-27 01:44:23,143] ERROR [ReplicaFetcherThread-4-2], Error for partition [vpq_android_gcm_h,270] to broker 2:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread) ... server.log.2014-08-27-12:[2014-08-27 12:08:34,638] ERROR Closing socket for /10.184.150.54 because of error (kafka.network.Processor) ... server.log.2014-08-28-07:[2014-08-28 07:57:35,944] ERROR [KafkaApi-1] Error when processing fetch request for partition [vpq_android_gcm_h,184] offset 8798 from follower with correlation id 0 (kafka.server.KafkaApis) ... erver.log.2014-09-03-15:[2014-09-03 15:46:18,220] ERROR [ReplicaFetcherThread-2-3], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 177593; ClientId: ReplicaFetcherThread-2-3; ReplicaId: 1; MaxWait: 1000 ms; MinBytes: 1 bytes; RequestInfo: [vpq_android_gcm_h,196] - PartitionFetchInfo(65283,8388608),[vpq_android_gcm_h,76] - PartitionFetchInfo(262787,8388608),[vpq_android_gcm_h,460] - PartitionFetchInfo(285709,8388608),[vpq_android_gcm_h,100] - PartitionFetchInfo(199405,8388608),[vpq_android_gcm_h,148] - PartitionFetchInfo(339032,8388608),[vpq_android_gcm_h,436] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,124] - PartitionFetchInfo(484447,8388608),[vpq_android_gcm_h,484] - PartitionFetchInfo(105945,8388608),[vpq_android_gcm_h,340] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,388] - PartitionFetchInfo(9,8388608),[vpq_android_gcm_h,316] - PartitionFetchInfo(194766,8388608),[vpq_android_gcm_h,364] - PartitionFetchInfo(139897,8388608),[vpq_android_gcm_h,292] - PartitionFetchInfo(195408,8388608),[vpq_android_gcm_h,28] - PartitionFetchInfo(329961,8388608),[vpq_android_gcm_h,172] - PartitionFetchInfo(436959,8388608),[vpq_android_gcm_h,268] - PartitionFetchInfo(59827,8388608),[vpq_android_gcm_h,244] - PartitionFetchInfo(259731,8388608),[vpq_android_gcm_h,220] - PartitionFetchInfo(61669,8388608),[vpq_android_gcm_h,412] - 
PartitionFetchInfo(563609,8388608),[vpq_android_gcm_h,4] - PartitionFetchInfo(360336,8388608),[vpq_android_gcm_h,52] - PartitionFetchInfo(378533,8388608) (kafka.server.ReplicaFetcherThread) ... server.log.2014-09-03-14:[2014-09-03 14:04:18,548] ERROR Error in acceptor (kafka.network.Acceptor) ... and these may not be all (other logs may have some more of that) Joe said to just lower the number of connections but I still can't see the exact problem. is there a kafka limit to the number of concurrent open files? cause the process was not limited... Thanks, Shlomi On Tue, Sep 9, 2014 at 7:12 AM, Jun Rao jun...@gmail.com wrote: What type of error did you see? You may need to configure a larger open file handler limit. Thanks, Jun On Wed, Sep 3, 2014 at 12:01 PM, Shlomi Hazan hzshl...@gmail.com wrote: Hi, I am trying to load a cluster with over than 10K connections, and bumped into the error in the subject. Is there any limitation on Kafka's side? if so it configurable? how? on first look, it looks like the selector
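Since the acceptor errors in this thread trace back to the open-file-descriptor limit the broker process inherits, it is worth checking what that limit actually is in the broker's launch environment. A small illustrative check using only the Python standard library (run it the same way the broker is launched, so inherited limits match):

```python
# Illustrative check of the open-file-descriptor limit a process
# inherits, using only the standard library. Each client connection and
# each partition segment file consumes a descriptor, so a broker with
# 500+ partitions and 10K connections easily exhausts the common
# default of 1024 (raise it via `ulimit -n` or /etc/security/limits.conf).
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file soft limit: {soft}, hard limit: {hard}")

assert soft > 0
```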
Re: build fails for latest stable source tgz (kafka_2.9.2-0.8.1.1)
just VPN'ed into my workstation: the answer to 5 is [*yes*] answer to 1,2: the error's I see on the python client are first timeouts and then message send failures, using sync send. on the controller log: ontroller.log.2014-08-26-13:[2014-08-26 13:40:44,317] ERROR [Controller-1-to-broker-3-send-thread], Controller 1 epoch 3 failed to send StopReplica request with correlation id 519 to broker id:3,host:shlomi-kafka-broker-3,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread) controller.log.2014-08-26-13:[2014-08-26 13:40:44,319] ERROR [Controller-1-to-broker-3-send-thread], Controller 1's connection to broker id:3,host:shlomi-kafka-broker-3,port:9092 was unsuccessful (kafka.controller.RequestSendThread) on the server log (selected greps): ... server.log.2014-08-27-01:[2014-08-27 01:44:23,143] ERROR [ReplicaFetcherThread-4-2], Error for partition [vpq_android_gcm_h,270] to broker 2:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread) ... server.log.2014-08-27-12:[2014-08-27 12:08:34,638] ERROR Closing socket for /10.184.150.54 because of error (kafka.network.Processor) ... server.log.2014-08-28-07:[2014-08-28 07:57:35,944] ERROR [KafkaApi-1] Error when processing fetch request for partition [vpq_android_gcm_h,184] offset 8798 from follower with correlation id 0 (kafka.server.KafkaApis) ... 
server.log.2014-09-03-15:[2014-09-03 15:46:18,220] ERROR [ReplicaFetcherThread-2-3], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 177593; ClientId: ReplicaFetcherThread-2-3; ReplicaId: 1; MaxWait: 1000 ms; MinBytes: 1 bytes; RequestInfo: [vpq_android_gcm_h,196] - PartitionFetchInfo(65283,8388608),[vpq_android_gcm_h,76] - PartitionFetchInfo(262787,8388608),[vpq_android_gcm_h,460] - PartitionFetchInfo(285709,8388608),[vpq_android_gcm_h,100] - PartitionFetchInfo(199405,8388608),[vpq_android_gcm_h,148] - PartitionFetchInfo(339032,8388608),[vpq_android_gcm_h,436] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,124] - PartitionFetchInfo(484447,8388608),[vpq_android_gcm_h,484] - PartitionFetchInfo(105945,8388608),[vpq_android_gcm_h,340] - PartitionFetchInfo(0,8388608),[vpq_android_gcm_h,388] - PartitionFetchInfo(9,8388608),[vpq_android_gcm_h,316] - PartitionFetchInfo(194766,8388608),[vpq_android_gcm_h,364] - PartitionFetchInfo(139897,8388608),[vpq_android_gcm_h,292] - PartitionFetchInfo(195408,8388608),[vpq_android_gcm_h,28] - PartitionFetchInfo(329961,8388608),[vpq_android_gcm_h,172] - PartitionFetchInfo(436959,8388608),[vpq_android_gcm_h,268] - PartitionFetchInfo(59827,8388608),[vpq_android_gcm_h,244] - PartitionFetchInfo(259731,8388608),[vpq_android_gcm_h,220] - PartitionFetchInfo(61669,8388608),[vpq_android_gcm_h,412] - PartitionFetchInfo(563609,8388608),[vpq_android_gcm_h,4] - PartitionFetchInfo(360336,8388608),[vpq_android_gcm_h,52] - PartitionFetchInfo(378533,8388608) (kafka.server.ReplicaFetcherThread)
...
server.log.2014-09-03-14:[2014-09-03 14:04:18,548] ERROR Error in acceptor (kafka.network.Acceptor)
...

On Sat, Sep 6, 2014 at 5:48 PM, Shlomi Hazan shl...@viber.com wrote:

Hi and sorry for the late response, I just got into the weekend and it is still Saturday here... Well, not at my desk but will answer what I can:

1. what else on the logs? [will VPN and check]
2. other broker failure reason? []
3. other broker failure after taking leadership? [how can I be sure? ask another to describe topic?]
4. how do I measure number of connections? [ls -l /proc/pid/fd | grep socket | wc -l, also did watch on that]
5. is that number equal to the number of {new Producer}?
6. how many topics? [1] how many partitions? [504]
7. Are u using a partition key? [yes, I use the python client with]

    class ProducerIdPartitioner(Partitioner):
        """Implements a partitioner which selects the target partition
        based on the sending producer ID."""
        def partition(self, key, partitions):
            size = len(partitions)
            prod_id = int(key)
            idx = prod_id % size
            return partitions[idx]

8. maybe running into over partitioned topic? [producer instances: 6 machines * 84 procs * 24 threads, but never got to start them all, b/c of errors]
9. r u running anything else? [yes, zookeeper]

additional: do you want broker or other config? EC2 instance types? anything else?

Thanks,
Shlomi

On Thu, Sep 4, 2014 at 10:02 PM, Joe Stein joe.st...@stealth.ly wrote:

I think it sounds more like another issue than your thinking... the broker should not be failing like that, especially another broker being affected doesn't make sense. What else is in the logs on failure? Is the other broker failing because of number of files too? Is it happening after it becomes the leader? How are you measuring number of connections? Is this how many producer connections you are opening up yourself (new Producer())? How many topics do you have? How many partitions? Are you using a partition key? Maybe you are running
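The one-liner in item 4 can be wrapped into a small reusable check; a sketch, assuming a Linux host and permission to read the broker's /proc entry:

```shell
#!/bin/sh
# Count open socket fds for a process via /proc (Linux only).
# `|| true` keeps the output at "0" instead of a failure when
# grep finds no socket entries.
count_sockets() {
  ls -l "/proc/$1/fd" 2>/dev/null | grep -c socket || true
}

# $$ (this shell) stands in for the broker pid in this sketch.
echo "open sockets: $(count_sockets $$)"
```

Comparing that count against the broker's fd limit shows how close the acceptor is to the ceiling; note that the producer fleet described in item 8 (6 machines * 84 procs * 24 threads) works out to over 12,000 potential connections, well past a default 4096 limit.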
build fails (with JDK8)
Hi,

While I am not sure that JDK8 is the problem, what I did is simply clone and gradle the source. I kept getting failures and excluding tasks until eventually I did this:

gradle -PscalaVersion=2.9.2 -x :clients:javadoc -x :clients:signArchives -x :clients:licenseTest -x :contrib:signArchives clean build

and got this (tail):

...
:contrib:test UP-TO-DATE
:contrib:check UP-TO-DATE
:contrib:build
:core:compileJava UP-TO-DATE
:core:compileScala
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
error: error while loading CharSequence, class file '/usr/java/jdk1.8.0_11/jre/lib/rt.jar(java/lang/CharSequence.class)' is broken
(bad constant pool tag 15 at byte 1501)
error: error while loading AnnotatedElement, class file '/usr/java/jdk1.8.0_11/jre/lib/rt.jar(java/lang/reflect/AnnotatedElement.class)' is broken
(bad constant pool tag 15 at byte 2713)
error: error while loading Arrays, class file '/usr/java/jdk1.8.0_11/jre/lib/rt.jar(java/util/Arrays.class)' is broken
(bad constant pool tag 15 at byte 12801)
error: error while loading Comparator, class file '/usr/java/jdk1.8.0_11/jre/lib/rt.jar(java/util/Comparator.class)' is broken
(bad constant pool tag 15 at byte 5003)
/tmp/sbt_7c18fddb/xsbt/ExtractAPI.scala:395: error: java.util.Comparator does not take type parameters
  private[this] val sortClasses = new Comparator[Symbol] {
                                      ^
5 errors found
:core:compileScala FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':core:compileScala'.
org.gradle.messaging.remote.internal.PlaceholderException (no error message)

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Total time: 12.969 secs

It looks like now it wants me to give it a JDK 7 rt.jar. Can't we all just get along? :( And those poor tasks I excluded too...

--
Shlomi
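For what it's worth, those class files are not actually corrupt: constant pool tag 15 is CONSTANT_MethodHandle, introduced with the Java 7 class-file format, which the Scala 2.9.x compiler predates. A pre-flight sketch (the helper name and parsing are mine, not anything Kafka ships) that extracts the JDK major version from a `java -version` banner before kicking off a build:

```shell
#!/bin/sh
# Extract the major version from a `java -version` banner line,
# e.g. 'java version "1.8.0_11"' -> 8. In practice feed it
# "$(java -version 2>&1 | head -n 1)".
jdk_major() {
  echo "$1" | sed -n 's/.*"1\.\([0-9]*\)\..*/\1/p'
}

jdk_major 'java version "1.8.0_11"'    # prints 8
jdk_major 'java version "1.7.0_51"'    # prints 7
```

If it reports 8, pointing JAVA_HOME at a JDK 7 install before running gradle avoids the rt.jar parse errors above.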
build fails for latest stable source tgz (kafka_2.9.2-0.8.1.1)
what gradle version is used to build kafka_2.9.2-0.8.1.1? tried with v2 and failed with:

gradle --stacktrace clean

FAILURE: Build failed with an exception.

* Where:
Build file '/home/shlomi/0dec0xb/project/vpmb/master/3rdparty/kafka/code/kafka-0.8.1.1-src/build.gradle' line: 34

* What went wrong:
A problem occurred evaluating root project 'kafka-0.8.1.1-src'.
Could not find method add() for arguments [licenseMain, class nl.javadude.gradle.plugins.license.License] on task set.

* Try:
Run with --info or --debug option to get more log output.

* Exception is:
org.gradle.api.GradleScriptException: A problem occurred evaluating root project 'kafka-0.8.1.1-src'.
        at org.gradle.groovy.scripts.internal.DefaultScriptRunnerFactory$ScriptRunnerImpl.run(DefaultScriptRunnerFactory.java:54)
        at org.gradle.configuration.DefaultScriptPluginFactory$ScriptPluginImpl.apply(DefaultScriptPluginFactory.java:187)
        at org.gradle.configuration.project.BuildScriptProcessor.execute(BuildScriptProcessor.java:39)
        at org.gradle.configuration.project.BuildScriptProcessor.execute(BuildScriptProcessor.java:26)
        at org.gradle.configuration.project.ConfigureActionsProjectEvaluator.evaluate(ConfigureActionsProjectEvaluator.java:34)
        at org.gradle.configuration.project.LifecycleProjectEvaluator.evaluate(LifecycleProjectEvaluator.java:55)
        at org.gradle.api.internal.project.AbstractProject.evaluate(AbstractProject.java:470)
        at org.gradle.api.internal.project.AbstractProject.evaluate(AbstractProject.java:79)
        at org.gradle.configuration.DefaultBuildConfigurer.configure(DefaultBuildConfigurer.java:31)
        at org.gradle.initialization.DefaultGradleLauncher.doBuildStages(DefaultGradleLauncher.java:128)
        at org.gradle.initialization.DefaultGradleLauncher.doBuild(DefaultGradleLauncher.java:105)
        at org.gradle.initialization.DefaultGradleLauncher.run(DefaultGradleLauncher.java:85)
        at org.gradle.launcher.exec.InProcessBuildActionExecuter$DefaultBuildController.run(InProcessBuildActionExecuter.java:81)
        at org.gradle.launcher.cli.ExecuteBuildAction.run(ExecuteBuildAction.java:33)
        at org.gradle.launcher.cli.ExecuteBuildAction.run(ExecuteBuildAction.java:24)
        at org.gradle.launcher.exec.InProcessBuildActionExecuter.execute(InProcessBuildActionExecuter.java:39)
        at org.gradle.launcher.exec.InProcessBuildActionExecuter.execute(InProcessBuildActionExecuter.java:29)
        at org.gradle.launcher.cli.RunBuildAction.run(RunBuildAction.java:50)
        at org.gradle.internal.Actions$RunnableActionAdapter.execute(Actions.java:171)
        at org.gradle.launcher.cli.CommandLineActionFactory$ParseAndBuildAction.execute(CommandLineActionFactory.java:237)
        at org.gradle.launcher.cli.CommandLineActionFactory$ParseAndBuildAction.execute(CommandLineActionFactory.java:210)
        at org.gradle.launcher.cli.JavaRuntimeValidationAction.execute(JavaRuntimeValidationAction.java:35)
        at org.gradle.launcher.cli.JavaRuntimeValidationAction.execute(JavaRuntimeValidationAction.java:24)
        at org.gradle.launcher.cli.CommandLineActionFactory$WithLogging.execute(CommandLineActionFactory.java:206)
        at org.gradle.launcher.cli.CommandLineActionFactory$WithLogging.execute(CommandLineActionFactory.java:169)
        at org.gradle.launcher.cli.ExceptionReportingAction.execute(ExceptionReportingAction.java:33)
        at org.gradle.launcher.cli.ExceptionReportingAction.execute(ExceptionReportingAction.java:22)
        at org.gradle.launcher.Main.doAction(Main.java:33)
        at org.gradle.launcher.bootstrap.EntryPoint.run(EntryPoint.java:45)
        at org.gradle.launcher.bootstrap.ProcessBootstrap.runNoExit(ProcessBootstrap.java:54)
        at org.gradle.launcher.bootstrap.ProcessBootstrap.run(ProcessBootstrap.java:35)
        at org.gradle.launcher.GradleMain.main(GradleMain.java:23)
Caused by: org.gradle.api.internal.MissingMethodException: Could not find method add() for arguments [licenseMain, class nl.javadude.gradle.plugins.license.License] on task set.
        at org.gradle.api.internal.AbstractDynamicObject.methodMissingException(AbstractDynamicObject.java:68)
        at org.gradle.api.internal.AbstractDynamicObject.invokeMethod(AbstractDynamicObject.java:56)
        at org.gradle.api.internal.CompositeDynamicObject.invokeMethod(CompositeDynamicObject.java:172)
        at org.gradle.api.internal.tasks.DefaultTaskContainer_Decorated.invokeMethod(Unknown Source)
        at nl.javadude.gradle.plugins.license.LicensePlugin$_configureSourceSetRule_closure6_closure18.doCall(LicensePlugin.groovy:117)
        at org.gradle.api.internal.ClosureBackedAction.execute(ClosureBackedAction.java:59)
        at org.gradle.listener.ActionBroadcast.execute(ActionBroadcast.java:39)
        at org.gradle.api.internal.DefaultDomainObjectCollection.doAdd(DefaultDomainObjectCollection.java:164)
        at org.gradle.api.internal.DefaultDomainObjectCollection.add(DefaultDomainObjectCollection.java:159)
        at
Re: build fails for latest stable source tgz (kafka_2.9.2-0.8.1.1)
./gradlew -PscalaVersion=2.9.2 clean jar

failed with JDK 8 (error: error while loading CharSequence, class file '/usr/java/jdk1.8.0_20/jre/lib/rt.jar(java/lang/CharSequence.class)' is broken). I understand there's no escape from installing JDK 7?

10x
Shlomi

On Thu, Sep 4, 2014 at 6:11 PM, Joe Stein joe.st...@stealth.ly wrote:

When building you need to use the ./gradlew script as Harsha said. Please take a look at the README for specific commands and how to run them.

/***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
***/

On Thu, Sep 4, 2014 at 10:59 AM, Shlomi Hazan shl...@viber.com wrote:

it failed with JDK 8 so I hoped a newer gradle will maybe do the magic, and stepped into this other problem. I assume you will say: install JDK 7 and build with our gradle 1.6. is it so?

Shlomi

On Thu, Sep 4, 2014 at 5:41 PM, Harsha ka...@harsha.io wrote:

Did you try the gradlew script in the kafka source dir?
-Harsha

On Thu, Sep 4, 2014, at 07:32 AM, Shlomi Hazan wrote:

what gradle version is used to build kafka_2.9.2-0.8.1.1? tried with v2 and failed with:

gradle --stacktrace clean

FAILURE: Build failed with an exception.

* Where:
Build file '/home/shlomi/0dec0xb/project/vpmb/master/3rdparty/kafka/code/kafka-0.8.1.1-src/build.gradle' line: 34

* What went wrong:
A problem occurred evaluating root project 'kafka-0.8.1.1-src'.
Could not find method add() for arguments [licenseMain, class nl.javadude.gradle.plugins.license.License] on task set.

* Try:
Run with --info or --debug option to get more log output.

* Exception is:
org.gradle.api.GradleScriptException: A problem occurred evaluating root project 'kafka-0.8.1.1-src'.
Re: build fails for latest stable source tgz (kafka_2.9.2-0.8.1.1)
I sure did. The reason I am building is trying to patch some, specifically this: KAFKA-1623. Actually, if I felt more confident about scala, I would happily send you a patch. If you don't care screening, just tell me how to prep it for ya and I will.

The bigger problem is running into too many open files when ramping up to several thousand connections. Going above 4K is hard and around 6K the broker says goodbye, sometimes taking a broker friend for a ride. This was what led me to the acceptor and its being slow... Have you got an idea why 10K connections should pose a problem? ulimit checked, not that. ??

Shlomi

On Thu, Sep 4, 2014 at 7:00 PM, Joe Stein joe.st...@stealth.ly wrote:

Have you tried using a binary release <http://kafka.apache.org/downloads.html>? This way you don't have to do a build. We build using JDK 6; you should be able to run in 8 (I know for sure 6 & 7 work, honestly never tried 8). I just did a quick test with a broker running on 8 and produced/consumed a few messages, didn't run into issues... As for building in JDK 8: I reproduced your issue and created a ticket https://issues.apache.org/jira/browse/KAFKA-1624 patches are welcomed.

/***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
***/

On Thu, Sep 4, 2014 at 11:22 AM, Shlomi Hazan shl...@viber.com wrote:

./gradlew -PscalaVersion=2.9.2 clean jar

failed with JDK 8 (error: error while loading CharSequence, class file '/usr/java/jdk1.8.0_20/jre/lib/rt.jar(java/lang/CharSequence.class)' is broken). I understand there's no escape from installing JDK 7?

10x
Shlomi

On Thu, Sep 4, 2014 at 6:11 PM, Joe Stein joe.st...@stealth.ly wrote:

When building you need to use the ./gradlew script as Harsha said. Please take a look at the README for specific commands and how to run them.
/***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
***/

On Thu, Sep 4, 2014 at 10:59 AM, Shlomi Hazan shl...@viber.com wrote:

it failed with JDK 8 so I hoped a newer gradle will maybe do the magic, and stepped into this other problem. I assume you will say: install JDK 7 and build with our gradle 1.6. is it so?

Shlomi

On Thu, Sep 4, 2014 at 5:41 PM, Harsha ka...@harsha.io wrote:

Did you try the gradlew script in the kafka source dir?
-Harsha

On Thu, Sep 4, 2014, at 07:32 AM, Shlomi Hazan wrote:

what gradle version is used to build kafka_2.9.2-0.8.1.1? tried with v2 and failed with:

gradle --stacktrace clean

FAILURE: Build failed with an exception.

* Where:
Build file '/home/shlomi/0dec0xb/project/vpmb/master/3rdparty/kafka/code/kafka-0.8.1.1-src/build.gradle' line: 34

* What went wrong:
A problem occurred evaluating root project 'kafka-0.8.1.1-src'.
Could not find method add() for arguments [licenseMain, class nl.javadude.gradle.plugins.license.License] on task set.

* Try:
Run with --info or --debug option to get more log output.

* Exception is:
org.gradle.api.GradleScriptException: A problem occurred evaluating root project 'kafka-0.8.1.1-src'.
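One note on the "ulimit checked, not that" point in this thread: `ulimit -n` in a login shell is not necessarily the limit of the already-running broker JVM, which inherits its limit from whatever started it. A sketch that reads a process's effective limit straight from /proc instead (Linux; `$$` stands in for the broker pid):

```shell
#!/bin/sh
# Print a running process's soft "Max open files" limit from
# /proc/<pid>/limits (4th field of the matching line).
max_open_files() {
  awk '/Max open files/ {print $4}' "/proc/$1/limits"
}

echo "effective open-files limit: $(max_open_files $$)"
```

If that number is around 4096 while the shell's ulimit reports more, the broker never saw the raised limit, which would match connections dying off in the 4K-6K range.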
trying to tune kafka's internal logging - need help...
Hi,

I am trying to get rid of the log files written under “$base_dir/logs”, the folder created by line 26 of “bin/kafka-run-class.sh”. I use an EC2 machine with a small primary disk and it blows away on occasions when writing to these logs is excessive, and I bumped into a few already (from Jira it looks like you guys know about them). Tried to export “LOG_DIR”, “KAFKA_LOG4J_OPTS”, no luck till now...

What log4j properties file should be put where to squelch that logging? Is there any such file?

P.S. Saw that SCALA_VERSION defaults to 2.8.0 even in the distributables for the other scala versions. Should I set it to 2.9.2/2.10/etc? Are there any other vars to take into account?

10x,
Shlomi
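For reference, a sketch of the kind of log4j.properties override that bounds disk usage; the file path, sizes, and levels here are illustrative assumptions, not a tested recipe for 0.8.1.1:

```properties
# Replace the daily-rolling appenders with a size-capped rolling appender.
log4j.rootLogger=WARN, kafkaAppender
log4j.appender.kafkaAppender=org.apache.log4j.RollingFileAppender
log4j.appender.kafkaAppender.File=/mnt/data/kafka-logs/server.log
log4j.appender.kafkaAppender.MaxFileSize=50MB
log4j.appender.kafkaAppender.MaxBackupIndex=4
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
```

kafka-run-class.sh only applies its default -Dlog4j.configuration when KAFKA_LOG4J_OPTS is unset, so exporting KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/path/to/log4j.properties" before starting the broker should make it pick up an override like this.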
delete topic ?
Hi,

Doing some evaluation testing, and accidentally created a topic with the wrong replication factor. Trying to delete as in:

kafka_2.10-0.8.1.1/bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic replicated-topic

yielded:

Command must include exactly one action: --list, --describe, --create or --alter

even though this page (https://kafka.apache.org/documentation.html) says:

    And finally deleting a topic:
    bin/kafka-topics.sh --zookeeper zk_host:port/chroot --delete --topic my_topic_name
    WARNING: Delete topic functionality is beta in 0.8.1. Please report any bugs that you encounter on the us...@kafka.apache.org mailing list or the https://issues.apache.org/jira/browse/KAFKA JIRA.
    Kafka does not currently support reducing the number of partitions for a topic or changing the replication factor.

What should I do?
Shlomi
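The action list in that error message is the tell: the kafka-topics.sh shipped with 0.8.1.1 evidently has no --delete action at all, whatever the website's docs say. Until running a release that supports deletion, a common workaround for a wrongly-configured topic is to create a replacement with the intended settings and repoint producers at it. A sketch; the topic name, partition count, and replication factor below are made up, and the command is echoed rather than executed because running it needs a live cluster:

```shell
#!/bin/sh
# Build the create command for a replacement topic; echoed so it can
# be reviewed before running against a real ZooKeeper/broker setup.
TOPIC="replicated-topic-v2"
CMD="bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic $TOPIC --partitions 8 --replication-factor 3"
echo "$CMD"
```

The docs' note that replication factor cannot be changed in place is exactly why recreate-and-migrate is the fallback here.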