Storm MBeans
Hi all, I know Kafka automatically exposes MBeans over JMX, but Storm doesn't seem to. I wonder if anyone has experience using JConsole to read Storm's built-in metrics through MBeans, or whether I will have to write a separate metrics consumer to register the metrics as MBeans. Is there such source code available? Thanks, AL
install kafka and hadoop on the same cluster?
Hi all, I have a 9-node cluster where I have already installed Cloudera Hadoop/Spark, and now I want to install Kafka on this cluster too. Is it a good idea to install Kafka on each of the 9 nodes? If so, are there any potential risks? I am also thinking of installing Cassandra on each of these nodes, so basically all components would sit on the same nodes. Would that be OK? Thanks, AL
gradle building error
Hi all, I am getting the following error when building Kafka:

* Where:
Build file '/usr/local/kafka/build.gradle' line: 164

* What went wrong:
A problem occurred evaluating root project 'kafka'.
> Could not find property 'ScalaPlugin' on project ':clients'.

I tried searching online but couldn't find a solution for this. Has anyone had a similar issue before? Thanks, SL
Re: gradle building error
Hi Guozhang, I re-installed Gradle and it works now, thanks a lot. SL > On Dec 9, 2015, at 3:47 PM, Guozhang Wang <wangg...@gmail.com> wrote: > > Sa, > > Which command line did you use under what path? > > Guozhang > > On Wed, Dec 9, 2015 at 1:57 PM, Sa Li <sal...@gmail.com> wrote: > >> Hi, All >> >> I am having such error to build Kafka, >> >> * Where: >> Build file '/usr/local/kafka/build.gradle' line: 164 >> >> * What went wrong: >> A problem occurred evaluating root project 'kafka'. >>> Could not find property 'ScalaPlugin' on project ':clients'. >> >> >> I try to search online, but can't even find a solution for this, anyone had >> similar issues before? >> >> thanks >> >> SL >> > > > > -- > -- Guozhang
kafka-python question
Hi all, I have a question about the kafka-python producer. Here is the record I have:

id (uuid) | sensor_id (character) | timestamp | period (int) | current (numeric) | date_received | factor (bigint)
75da661c-bd5c-40e3-8691-9034f34262e3 | ff0057 | 2013-03-21 11:44:00-07 | 60 | 0.1200 | 2013-03-26 14:40:51.829-07 | 7485985

I am reading data from a database and publishing it to Kafka, but I am getting a serialization error on the timestamp and decimal fields; I can't just publish each record as a list. I am thinking of converting each record to a JSON object first. Before I do that, does anyone know a more straightforward way to publish records directly, so that a Kafka consumer can read each record as a list or dictionary? Thanks, AL
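A minimal sketch of the JSON route, assuming kafka-python's KafkaProducer and its value_serializer parameter (the topic name and broker address below are made up): a default encoder handles the uuid, Decimal, and datetime fields that json cannot serialize on its own.

```python
import json
import uuid
import datetime
from decimal import Decimal

def json_default(obj):
    """Fallback encoder for types json can't serialize natively."""
    if isinstance(obj, uuid.UUID):
        return str(obj)
    if isinstance(obj, Decimal):
        return float(obj)  # or str(obj) to avoid any precision loss
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError("unserializable type: %r" % type(obj))

def encode_record(record):
    """Turn a DB row (as a dict) into UTF-8 JSON bytes for Kafka."""
    return json.dumps(record, default=json_default).encode("utf-8")

# Example row shaped like the one in the post:
record = {
    "id": uuid.UUID("75da661c-bd5c-40e3-8691-9034f34262e3"),
    "sensor_id": "ff0057",
    "period": 60,
    "current": Decimal("0.1200"),
    "factor": 7485985,
}
payload = encode_record(record)

# With kafka-python, the encoder can be wired in directly
# (broker address and topic name are illustrative):
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="broker:9092",
#                          value_serializer=encode_record)
# producer.send("sensor-readings", record)
```

On the consumer side, json.loads on the message value gives back a dictionary, which answers the "read each record as a dict" part; there is no built-in way to ship a Python list/dict without some serialization step.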
connection refused between two VMs
Hi all, I have set up Kafka clusters on physical servers before. Currently I set up two VMs and fire up one broker on each VM (brokers 0 and 2). I created a topic test-rep-1:

Topic: test-rep-1  PartitionCount: 2  ReplicationFactor: 1  Configs:
  Topic: test-rep-1  Partition: 0  Leader: 0  Replicas: 0  Isr: 0
  Topic: test-rep-1  Partition: 1  Leader: 2  Replicas: 2  Isr: 2

However, when I ran the producer test with this command:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test-rep-1 5000 1000 -1 acks=1 bootstrap.servers=sa-vm1:9092 buffer.memory=67108864 batch.size=8196

I kept getting connection refused errors like:

[2015-05-14 15:05:13,961] WARN Error in I/O with sa-vm2/10.43.34.143 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:238)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:135)
        at java.lang.Thread.run(Thread.java:745)

(the same warning and stack trace repeat at 15:05:13,972). Any clue how to solve the connection issue between the VMs? Thanks, AL
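Before digging into Kafka configs, it is worth confirming plain TCP reachability from the producer VM to each broker, since "connection refused" usually means nothing is listening on that host:port or a firewall is rejecting the connection. A small sketch; the hostnames in the comment are the ones from the post, used only as illustration:

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Illustrative check against the two VM brokers:
# for host in ("sa-vm1", "sa-vm2"):
#     print(host, can_connect(host, 9092))
```

If this fails for sa-vm2, it points at an iptables/firewall rule, the broker not actually running there, or the broker binding/advertising a different address in server.properties, rather than at the producer itself.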
still too many open files errors for kafka web console
Hi all, I'd like to use kafka-web-console to monitor offsets/topics. It is easy to use; however, it freezes or dies far too frequently. I don't think it's a problem at the OS level; it seems to be a problem at the application level. I've already raised the open file limit to 98000 for all users and set time_waits to 30s instead of the default 5 minutes. From what I can see in the logs, it starts with Play:

[error] play - Cannot invoke the action, eventually got an error: java.lang.RuntimeException: Exception while executing statement : IO Exception: java.io.IOException: Too many open files; /etc/kafka-web-console/play; SQL statement: delete from offsetPoints where (offsetPoints.offsetHistoryId = ?) [90031-172] errorCode: 90031, sqlState: 90031
Caused by: java.lang.RuntimeException: Exception while executing statement : IO Exception: java.io.IOException: Too many open files; /etc/kafka-web-console/play; SQL statement: delete from offsetPoints where (offsetPoints.offsetHistoryId = ?) [90031-172] errorCode: 90031, sqlState: 90031

then this seems to cause socket connection errors:

Caused by: java.io.IOException: Too many open files
        at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[na:1.7.0_75]
        at java.io.File.createNewFile(File.java:1006) ~[na:1.7.0_75]
        at org.h2.store.fs.FilePathDisk.createTempFile(FilePathDisk.java:367) ~[h2.jar:1.3.172]
        at org.h2.store.fs.FileUtils.createTempFile(FileUtils.java:329) ~[h2.jar:1.3.172]
        at org.h2.engine.Database.createTempFile(Database.java:1529) ~[h2.jar:1.3.172]
        at org.h2.result.RowList.writeAllRows(RowList.java:90) ~[h2.jar:1.3.172]
[debug] application - Getting partition leaders for topic topic-exist-test
[debug] application - Getting partition leaders for topic topic-rep-3-test
[debug] application - Getting partition leaders for topic PofApiTest
[debug] application - Getting partition leaders for topic PofApiTest-2
[debug] application - Getting partition leaders for topic fileread
[debug] application - Getting partition leaders for topic pageview
[debug] application - Getting partition log sizes for topic topic-exist-test from partition leaders 10.100.71.42:9092 (x8)
[warn] application - Could not connect to partition leader 10.100.71.42:9092. Error message: Failed to open a socket. (repeated 8 times)
[debug] application - Getting partition offsets for topic topic-exist-test
[debug] application - Getting partition log sizes for topic topic-exist-test from partition leaders harmful-jar:9092, exemplary-birds:9092, voluminous-mass:9092
[warn] application - Could not connect to partition leader voluminous-mass:9092. Error message: Failed to open a socket.
[warn] application - Could not connect to partition leader exemplary-birds:9092. Error message: Failed to open a socket.
[warn] application - Could not connect to partition leader harmful-jar:9092. Error message: Failed to open a socket.
(the same three warnings repeat several times)
[debug] application - Getting partition offsets for topic PofApiTest
[debug] application - Getting partition log sizes for topic topic-rep-3-test from partition leaders exemplary-birds:9092, voluminous-mass:9092, harmful-jar:9092, exemplary-birds:9092, voluminous-mass:9092,
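To tell whether the console process itself is leaking descriptors (rather than the limit simply being too low), it helps to compare the per-process soft limit with the number of descriptors actually in use. A small sketch for the current process; /proc is Linux-specific, and for the console's own PID the same count comes from lsof -p <pid>:

```python
import os
import resource

def fd_status():
    """Return (open_fd_count, soft_limit) for the current process.

    /proc/<pid>/fd exists only on Linux; report -1 elsewhere.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    fd_dir = "/proc/%d/fd" % os.getpid()
    open_fds = len(os.listdir(fd_dir)) if os.path.isdir(fd_dir) else -1
    return open_fds, soft
```

If the in-use count climbs steadily toward the limit while the console is idle, raising ulimit only delays the crash; the leak is in the application's socket/temp-file handling.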
rebalancing leadership quite often
Hi all, my dev cluster has three nodes (1, 2, 3), but I've often seen that node 1 just does not stay leader. I have run preferred-replica-election many times; every time I run it, node 1 becomes leader for some partitions, but it loses leadership after a while and those partitions' leadership transfers to the other brokers, meaning node 1 will not be used by consumers. I thought that would only happen if broker 1 crashed or stopped, but it didn't, and I still see the leadership shift. Any ideas about this? thanks -- Alec Li
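If the goal is for leadership to return to broker 1 automatically, the controller can run the preferred-replica election itself; these broker settings (real server.properties keys in 0.8.x, values shown only as an example) control that behavior. Note that repeatedly *losing* leadership usually means broker 1 is falling out of the ISR, e.g. from ZooKeeper session timeouts or long GC pauses, which is worth checking in the controller and server logs first.

```properties
# server.properties (values illustrative)
# Let the controller move leadership back to the preferred replica
auto.leader.rebalance.enable=true
# How often the controller checks for leader imbalance
leader.imbalance.check.interval.seconds=300
# Rebalance once a broker's non-preferred leadership exceeds this percentage
leader.imbalance.per.broker.percentage=10
```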
Re: Number of Consumers Connected
Guozhang, sorry for leaving this topic for a while. I am still not clear on how to commit the offset to ZK from the command line. I tried this:

bin/kafka-console-consumer.sh --zookeeper 10.100.71.33:2181 --topic pipe-test-2 --from-beginning --property pipe

It seems to generate a console-consumer-001 node in ZK, but when I did that with other topics, nothing appeared in ZK (I can't read anything from the consumer group in kafka-web-console), see:

[zk: localhost:2181(CONNECTED) 2] ls /consumers/web-console-consumer-38650
[offsets, owners, ids]
[zk: localhost:2181(CONNECTED) 3] ls /consumers/web-console-consumer-38650/offsets
[PofApiTest-1]
[zk: localhost:2181(CONNECTED) 4] ls /consumers/web-console-consumer-38650/offsets/PofApiTest-1
[3, 2, 1, 0, 7, 6, 5, 4]
[zk: localhost:2181(CONNECTED) 5] ls /consumers/web-console-consumer-38650/offsets/PofApiTest-1/3
[]

Any ideas? Thanks, AL

On Tue, Jan 20, 2015 at 9:57 PM, Guozhang Wang wangg...@gmail.com wrote:

It seems not the latest version of Kafka, which version are you using?

On Tue, Jan 20, 2015 at 9:46 AM, Sa Li sal...@gmail.com wrote:

Guozhang, thank you very much for the reply, here I print out the kafka-console-consumer.sh help:

root@voluminous-mass:/srv/kafka# bin/kafka-console-consumer.sh
Missing required argument [zookeeper]
Option                                  Description
------                                  -----------
--autocommit.interval.ms <Integer: ms>  The time interval at which to save the current offset in ms (default: 6)
--blacklist <blacklist>                 Blacklist of topics to exclude from consumption.
--consumer-timeout-ms <Integer: prop>   consumer throws timeout exception after waiting this much of time without incoming messages (default: -1)
--csv-reporter-enabled                  If set, the CSV metrics reporter will be enabled
--fetch-size <Integer: size>            The amount of data to fetch in a single request. (default: 1048576)
--formatter <class>                     The name of a class to use for formatting kafka messages for display. (default: kafka.consumer.DefaultMessageFormatter)
--from-beginning                        If the consumer does not already have an established offset to consume from, start with the earliest message present in the log rather than the latest message.
--group <gid>                           The group id to consume on. (default: console-consumer-85664)
--max-messages <Integer: num_messages>  The maximum number of messages to consume before exiting. If not set, consumption is continual.
--max-wait-ms <Integer: ms>             The max amount of time each fetch request waits. (default: 100)
--metrics-dir <metrics dictory>         If csv-reporter-enable is set, and this parameter isset, the csv metrics will be outputed here
--min-fetch-bytes <Integer: bytes>      The min number of bytes each fetch request waits for. (default: 1)
--property <prop>
--refresh-leader-backoff-ms <Integer: ms>  Backoff time before refreshing metadata (default: 200)
--skip-message-on-error                 If there is an error when processing a message, skip it instead of halt.
--socket-buffer-size <Integer: size>    The size of the tcp RECV size. (default: 2097152)
--socket-timeout-ms <Integer: ms>       The socket timeout used for the connection to the broker (default: 3)
--topic <topic>                         The topic id to consume on.
--whitelist <whitelist>                 Whitelist of topics to include for consumption.
--zookeeper <urls>                      REQUIRED: The connection string for the zookeeper connection in the form host:port. Multiple URLS can be given to allow fail-over.
kafka-web-console goes down regularly
Hi all, I am currently using kafka-web-console to monitor the Kafka system, but it goes down regularly, so I have to restart it every few hours, which is kind of annoying. I downloaded two versions:

https://github.com/claudemamo/kafka-web-console
http://mungeol-heo.blogspot.ca/2014/12/kafka-web-console.html

I start it with "play start" (using port 9000) or "play start -Dhttp.port=8080"; either version works fine at the beginning, but goes down after a few hours. I am thinking of using upstart to bring it back up automatically, but that should not be necessary. Any idea how to fix the problem, or did I do something wrong? thanks -- Alec Li
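Supervising the process does not fix the underlying crash, but as a stopgap an upstart job can restart the console whenever it dies. A sketch, with the install path and port as assumptions:

```
# /etc/init/kafka-web-console.conf  (path and port are illustrative)
description "kafka-web-console"
start on runlevel [2345]
stop on runlevel [016]
respawn
respawn limit 10 60
chdir /opt/kafka-web-console
exec play start -Dhttp.port=8080
```

The respawn limit stops upstart from looping forever if the console dies immediately on startup.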
Re: Number of Consumers Connected
Hi Guozhang, thank you very much for the reply. As you mentioned, I downloaded the latest version:

https://www.apache.org/dyn/closer.cgi?path=/kafka/0.8.2-beta/kafka-0.8.2-beta-src.tgz

I untarred the build and here is what I see:

root@DO-mq-dev:/home/stuser/kafka-0.8.2-beta-src/bin# kafka-console-consumer.sh
The console consumer is a tool that reads data from Kafka and outputs it to standard output.
Option                                  Description
------                                  -----------
--blacklist <blacklist>                 Blacklist of topics to exclude from consumption.
--consumer.config <config file>         Consumer config properties file.
--csv-reporter-enabled                  If set, the CSV metrics reporter will be enabled
--delete-consumer-offsets               If specified, the consumer path in zookeeper is deleted when starting up
--formatter <class>                     The name of a class to use for formatting kafka messages for display. (default: kafka.tools.DefaultMessageFormatter)
--from-beginning                        If the consumer does not already have an established offset to consume from, start with the earliest message present in the log rather than the latest message.
--max-messages <Integer: num_messages>  The maximum number of messages to consume before exiting. If not set, consumption is continual.
--metrics-dir <metrics dictory>         If csv-reporter-enable is set, and this parameter isset, the csv metrics will be outputed here
--property <prop>
--skip-message-on-error                 If there is an error when processing a message, skip it instead of halt.
--topic <topic>                         The topic id to consume on.
--whitelist <whitelist>                 Whitelist of topics to include for consumption.
--zookeeper <urls>                      REQUIRED: The connection string for the zookeeper connection in the form host:port. Multiple URLS can be given to allow fail-over.

Again, I am still not able to see a description of --property. Did I download the wrong version? Thanks, AL

On Tue, Feb 3, 2015 at 4:29 PM, Guozhang Wang wangg...@gmail.com wrote:

Hello Sa, could you try the latest 0.8.2 release, whose console consumer tool has been polished a bit with clearer properties? Guozhang

On Tue, Feb 3, 2015 at 10:32 AM, Sa Li sal...@gmail.com wrote:

Guozhang, sorry for leaving this topic for a while. I am still not clear how to commit the offset to ZK from the command line. I tried:

bin/kafka-console-consumer.sh --zookeeper 10.100.71.33:2181 --topic pipe-test-2 --from-beginning --property pipe

It seems to generate a console-consumer-001 node in ZK, but when I did that with other topics, nothing appeared in ZK (I can't read anything from the consumer group in kafka-web-console), see:

[zk: localhost:2181(CONNECTED) 2] ls /consumers/web-console-consumer-38650
[offsets, owners, ids]
[zk: localhost:2181(CONNECTED) 3] ls /consumers/web-console-consumer-38650/offsets
[PofApiTest-1]
[zk: localhost:2181(CONNECTED) 4] ls /consumers/web-console-consumer-38650/offsets/PofApiTest-1
[3, 2, 1, 0, 7, 6, 5, 4]
[zk: localhost:2181(CONNECTED) 5] ls /consumers/web-console-consumer-38650/offsets/PofApiTest-1/3
[]

Any ideas? Thanks, AL

On Tue, Jan 20, 2015 at 9:57 PM, Guozhang Wang wangg...@gmail.com wrote:

It seems not the latest version of Kafka, which version are you using?

On Tue, Jan 20, 2015 at 9:46 AM, Sa Li sal...@gmail.com wrote:

Guozhang, thank you very much for the reply, here I print out the kafka-console-consumer.sh help:

root@voluminous-mass:/srv/kafka# bin/kafka-console-consumer.sh
Missing required argument [zookeeper]
Option                                  Description
------                                  -----------
--autocommit.interval.ms <Integer: ms>  The time interval at which to save the current offset in ms (default: 6)
--blacklist <blacklist>                 Blacklist of topics to exclude from consumption.
--consumer-timeout-ms <Integer: prop>   consumer throws timeout
java.nio.channels.ClosedChannelException
Hi all, I send messages from one VM to production, but I am getting this error:

[2015-01-30 18:43:44,810] WARN Failed to send producer request with correlation id 126 to broker 101 with data for partitions [test-rep-three,5],[test-rep-three,2] (kafka.producer.async.DefaultEventHandler)
java.nio.channels.ClosedChannelException
        at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
        at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
        at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.producer.SyncProducer.send(SyncProducer.scala:101)
        at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:256)
        at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:107)
        at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:99)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)

I actually had this kind of error before on another VM; it works fine on that VM now, but when I start a new VM and build Kafka on it, the error comes back. I really can't recall what I did to fix this problem before. Any ideas? BTW, when I telnet the broker it says connected. Thanks, AL -- Alec Li
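A ClosedChannelException from the old sync producer, even when telnet to the bootstrap broker works, often means the broker registered an unreachable address in ZooKeeper: the client connects fine for metadata, then fails when it dials the advertised host. In 0.8.x the relevant server.properties keys are below (the address shown is only an example); whatever is set here must resolve and be reachable from every producer VM.

```properties
# server.properties on each broker (address illustrative)
# What the broker registers in ZooKeeper and returns in metadata;
# clients connect to this, not to the bootstrap address.
advertised.host.name=10.100.71.41
advertised.port=9092
```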
Re: kafka production server test
Thank you for the reply, Guozhang. Right now I can get it to work out of the box: run testcase_1 on the VM and access production. However, from my point of view, we would really like to test the existing configs on production, which means, for example, that replica_basic_test.py should not start ZooKeeper/Kafka, since I want to test the already-started brokers. I am thinking of commenting out this part:

self.log_message("starting zookeepers")
kafka_system_test_utils.start_zookeepers(self.systemTestEnv, self.testcaseEnv)
self.anonLogger.info("sleeping for 2s")
time.sleep(2)
self.log_message("starting brokers")
kafka_system_test_utils.start_brokers(self.systemTestEnv, self.testcaseEnv)
self.anonLogger.info("sleeping for 5s")
time.sleep(5)

Now I plan to modify the properties files in /system_test/replication_testsuite/config/, and cluster_config.json and testcase_1_properties.json in /system_test/replication_testsuite/testcase_1/, to make these config files exactly the same as what we have on production. Will that work, or do I need to change some other dependencies to get it working? Thanks, AL

On Mon, Jan 26, 2015 at 12:16 PM, Guozhang Wang wangg...@gmail.com wrote:

Sa, I believe your questions have mostly been answered by Ewen, and sorry for getting to this late. As you noticed, the current system test's out-of-the-box experience is not very good, and we are proposing ways to improve that situation:
KAFKA-1748 https://issues.apache.org/jira/browse/KAFKA-1748
KAFKA-1589 https://issues.apache.org/jira/browse/KAFKA-1589
And we are adding some more test cases at the same time:
KAFKA-1888 https://issues.apache.org/jira/browse/KAFKA-1888
So if you have new observations while using the package, or if you are willing to contribute to those tickets, you are most welcome. Guozhang

On Thu, Jan 22, 2015 at 3:02 PM, Sa Li sal...@gmail.com wrote:

Hi Guozhang, can I run this package remotely to test another server? That is, I run this package on dev but test the Kafka system on production? thanks AL

On Thu, Jan 22, 2015 at 2:55 PM, Sa Li sal...@gmail.com wrote:

Hi Guozhang, good to know such a package, will try it now. :-) thanks

On Thu, Jan 22, 2015 at 2:40 PM, Guozhang Wang wangg...@gmail.com wrote:

Hi Sa, have you looked into the system test package? It contains a suite of tests on different failure modes of Kafka brokers. Guozhang

On Thu, Jan 22, 2015 at 12:00 PM, Sa Li sal...@gmail.com wrote:

Hi all, we are about to deliver a Kafka production server. I have been working on different tests, like the performance test from LinkedIn. This is a 3-node cluster with a 5-node ZooKeeper ensemble. I assume there are lots of tests I need to do, like network, node failure, flush time, etc. Is there a complete guide for testing Kafka production servers? thanks -- Alec Li

-- Guozhang

-- Alec Li
Kafka System test
Hi all, from my last thread (Subject: kafka production server test), Guozhang kindly pointed me to the system test package that comes with the Kafka source build, which is a really cool package. I took a look at this package; things are clear if I run it on localhost: I don't need to change anything. cluster_config.json defines the entities, and the system test reads testcase__properties.json to override the properties in cluster_config.json. For example, cluster_config.json defaults the hostname to localhost with three brokers, so I assume it will create 3 brokers on localhost and run the test.

Currently I installed the package on a vagrant VM, and I would like to run the system test on the VM and remotely access production to test the production cluster. The production cluster has 3 nodes, and the Kafka production cluster sits on top of a 5-node ZooKeeper ensemble. My question is how to effectively change the properties in the vagrant system test package.

1. Change cluster_config.json, like:

{
  "entity_id": "0",
  "hostname": "10.100.70.28,10.100.70.29,10.100.70.30,10.100.70.31,10.100.70.32",
  "role": "zookeeper",
  "cluster_name": "target",
  "kafka_home": "/etc/kafka",
  "java_home": "/usr/lib/jvm/java-7-openjdk-amd64/jre",
  "jmx_port": "9990"
},
{
  "entity_id": "1",
  "hostname": "10.100.70.28",
  "role": "broker",
  "cluster_name": "target",
  "kafka_home": "/etc/kafka",
  "java_home": "/usr/lib/jvm/java-7-openjdk-amd64/jre",
  "jmx_port": "9991"
},

Because I want to test remote servers, I need to change the cluster_name to "target", right?

2. In the directory ./replication_testsuite/config/, for all the properties files, do I need to change them all to be the same as the properties on the production servers?

3. In ./replication_testsuite/testcase_/, it seems I need to make corresponding changes as well to stay consistent with the ./config/ properties; e.g., log.dir: /tmp/kafka_server_1_logs would be changed to the log.dir in my production server.properties. Is that right?

Hope someone who has done the system test on a remote server can share some experience. thanks AL -- Alec Li
Re: Kafka System test
Also, I found that ./kafka/system_test/cluster_config.json is duplicated in each directory ./kafka/system_test/replication_testsuite/testcase_/. When I change ./kafka/system_test/cluster_config.json, do I need to overwrite each ./kafka/system_test/replication_testsuite/testcase_/cluster_config.json as well? Thanks, AL

On Fri, Jan 23, 2015 at 1:39 PM, Sa Li sal...@gmail.com wrote:

Thanks for the reply, Ewen. Pertaining to your statement "... hostname setting being a list instead of a single host", are you referring to entity_id 1 or 0?

entity_id: 0, hostname: 10.100.70.28,10.100.70.29,10.100.70.30,10.100.70.31,10.100.70.32
entity_id: 1, hostname: 10.100.70.28

I thought the zookeeper role has multiple hosts, so I listed all the IPs of the ensemble, while entity 1 is about only 1 broker (my design for the production cluster is to fire up one broker per host, so 3 nodes with 3 brokers), so I specified only one hostname IP there. How do I change it? Thanks, AL

On Fri, Jan 23, 2015 at 1:22 PM, Ewen Cheslack-Postava e...@confluent.io wrote:

1. Except for that hostname setting being a list instead of a single host, the changes look reasonable. That is where you want to customize settings for your setup.

2 & 3. Yes, you'll want to update those files as well. The top-level ones provide defaults; the ones in specific test directories provide overrides for that specific test. But they aren't combined in any way, i.e. the more specific one is just taken as a whole rather than acting like a diff, so you do have to update both.

You might want to take a look at https://issues.apache.org/jira/browse/KAFKA-1748. Currently if you want to run all tests it's a pain to change the hosts they're running on, since it requires manually editing all those files. The patch gets rid of cluster_config.json and provides a couple of different ways of configuring the cluster -- run everything on localhost, get cluster info from a single json file, or get the ssh info from Vagrant.
On Fri, Jan 23, 2015 at 11:50 AM, Sa Li sal...@gmail.com wrote:

Hi, All From my last ticket (Subject: kafka production server test), Guozhang kindly point me the system test package come with kafka source build which is really cool package. I took a look at this package, things are clear if I run it on localhost, I don't need to change anything, say, cluster_config.json defines entities, and system test reads testcase__properties.json to override the properties in cluster_config.json. For example, cluster_config.json defaults hostname as localhost, and three brokers, I assume it will create 3 brokers in localhost and continue the test. Currently I install the package on a vagrant VM, and like to run the system test on VM and remotely access production to test production cluster. The production cluster has 3 nodes, on top of a 5-node zookeeper ensemble. My questions is how to effectively change the properties on the vagrant system test package.

1. change on cluster_config.json, like:

{ entity_id: 0, hostname: 10.100.70.28,10.100.70.29,10.100.70.30,10.100.70.31,10.100.70.32, role: zookeeper, cluster_name: target, kafka_home: /etc/kafka, java_home: /usr/lib/jvm/java-7-openjdk-amd64/jre, jmx_port: 9990 },
{ entity_id: 1, hostname: 10.100.70.28, role: broker, cluster_name: target, kafka_home: /etc/kafka, java_home: /usr/lib/jvm/java-7-openjdk-amd64/jre, jmx_port: 9991 },

Here because I want to test remote servers, so I need to change the cluster_name as target, right?

2. In directory ./replication_testsuite/config/, for all the properties files, do I need to change them all to be the same as the properties on production servers?

3. In ./replication_testsuite/testcase_/, seems I need to make corresponding changes as well to keep consistent with ./config/ properties, such as log.dir: /tmp/kafka_server_1_logs will be changed to the log.dir in my production server.properties, is that right?

Hope someone who has done the system test on remote server can share some experience, thanks AL

-- Alec Li

--
Thanks, Ewen

-- Alec Li
Re: Kafka System test
Thanks for the reply, Ewen. Pertaining to your statement "... hostname setting being a list instead of a single host", are you referring to entity_id 1 or 0?

entity_id: 0, hostname: 10.100.70.28,10.100.70.29,10.100.70.30,10.100.70.31,10.100.70.32
entity_id: 1, hostname: 10.100.70.28

I thought the zookeeper role has multiple hosts, so I listed all the IPs of the ensemble, while entity 1 is about only 1 broker (my design for the production cluster is to fire up one broker per host, so 3 nodes with 3 brokers), so I specified only one hostname IP there. How do I change it? Thanks, AL

On Fri, Jan 23, 2015 at 1:22 PM, Ewen Cheslack-Postava e...@confluent.io wrote:

1. Except for that hostname setting being a list instead of a single host, the changes look reasonable. That is where you want to customize settings for your setup.

2 & 3. Yes, you'll want to update those files as well. The top-level ones provide defaults; the ones in specific test directories provide overrides for that specific test. But they aren't combined in any way, i.e. the more specific one is just taken as a whole rather than acting like a diff, so you do have to update both.

You might want to take a look at https://issues.apache.org/jira/browse/KAFKA-1748. Currently if you want to run all tests it's a pain to change the hosts they're running on, since it requires manually editing all those files. The patch gets rid of cluster_config.json and provides a couple of different ways of configuring the cluster -- run everything on localhost, get cluster info from a single json file, or get the ssh info from Vagrant.

On Fri, Jan 23, 2015 at 11:50 AM, Sa Li sal...@gmail.com wrote:

Hi, All From my last ticket (Subject: kafka production server test), Guozhang kindly point me the system test package come with kafka source build which is really cool package. I took a look at this package, things are clear if I run it on localhost, I don't need to change anything, say, cluster_config.json defines entities, and system test reads testcase__properties.json to override the properties in cluster_config.json. For example, cluster_config.json defaults hostname as localhost, and three brokers, I assume it will create 3 brokers in localhost and continue the test. Currently I install the package on a vagrant VM, and like to run the system test on VM and remotely access production to test production cluster. The production cluster has 3 nodes, on top of a 5-node zookeeper ensemble. My questions is how to effectively change the properties on the vagrant system test package.

1. change on cluster_config.json, like:

{ entity_id: 0, hostname: 10.100.70.28,10.100.70.29,10.100.70.30,10.100.70.31,10.100.70.32, role: zookeeper, cluster_name: target, kafka_home: /etc/kafka, java_home: /usr/lib/jvm/java-7-openjdk-amd64/jre, jmx_port: 9990 },
{ entity_id: 1, hostname: 10.100.70.28, role: broker, cluster_name: target, kafka_home: /etc/kafka, java_home: /usr/lib/jvm/java-7-openjdk-amd64/jre, jmx_port: 9991 },

Here because I want to test remote servers, so I need to change the cluster_name as target, right?

2. In directory ./replication_testsuite/config/, for all the properties files, do I need to change them all to be the same as the properties on production servers?

3. In ./replication_testsuite/testcase_/, seems I need to make corresponding changes as well to keep consistent with ./config/ properties, such as log.dir: /tmp/kafka_server_1_logs will be changed to the log.dir in my production server.properties, is that right?

Hope someone who has done the system test on remote server can share some experience, thanks AL

-- Alec Li

--
Thanks, Ewen

-- Alec Li
Re: kafka production server test
Hi, Guozhang

Can I use this package to remotely test another server? That is, can I run the package on dev while testing the Kafka system on production?

thanks

AL

On Thu, Jan 22, 2015 at 2:55 PM, Sa Li sal...@gmail.com wrote:

Hi, Guozhang, Good to know about such a package, will try it now. :-) thanks

On Thu, Jan 22, 2015 at 2:40 PM, Guozhang Wang wangg...@gmail.com wrote:

Hi Sa, Have you looked into the system test package? It contains a suite of tests on different failure modes of Kafka brokers. Guozhang

On Thu, Jan 22, 2015 at 12:00 PM, Sa Li sal...@gmail.com wrote:

Hi, All We are about to deliver a Kafka production server, and I have been working on different tests, like the performance test from LinkedIn. This is a 3-node cluster with a 5-node ZooKeeper ensemble. I assume there are lots of tests I need to do: network, node failure, flush time, etc. Is there a complete guide for testing Kafka production servers? thanks -- Alec Li

-- -- Guozhang -- Alec Li -- Alec Li
Re: kafka production server test
Hi, Guozhang,

Good to know about such a package, will try it now. :-) thanks

On Thu, Jan 22, 2015 at 2:40 PM, Guozhang Wang wangg...@gmail.com wrote:

Hi Sa, Have you looked into the system test package? It contains a suite of tests on different failure modes of Kafka brokers. Guozhang

On Thu, Jan 22, 2015 at 12:00 PM, Sa Li sal...@gmail.com wrote:

Hi, All We are about to deliver a Kafka production server, and I have been working on different tests, like the performance test from LinkedIn. This is a 3-node cluster with a 5-node ZooKeeper ensemble. I assume there are lots of tests I need to do: network, node failure, flush time, etc. Is there a complete guide for testing Kafka production servers? thanks -- Alec Li

-- -- Guozhang -- Alec Li
kafka production server test
Hi, All

We are about to deliver a Kafka production server, and I have been working on different tests, like the performance test from LinkedIn. This is a 3-node cluster with a 5-node ZooKeeper ensemble. I assume there are lots of tests I need to do: network, node failure, flush time, etc. Is there a complete guide for testing Kafka production servers?

thanks

-- Alec Li
Re: Number of Consumers Connected
Guozhang

Thank you very much for the reply. Here I print out the kafka-console-consumer.sh help:

root@voluminous-mass:/srv/kafka# bin/kafka-console-consumer.sh
Missing required argument [zookeeper]
Option                                     Description
------                                     -----------
--autocommit.interval.ms <Integer: ms>     The time interval at which to save the current offset in ms (default: 6)
--blacklist <blacklist>                    Blacklist of topics to exclude from consumption.
--consumer-timeout-ms <Integer: prop>      consumer throws timeout exception after waiting this much of time without incoming messages (default: -1)
--csv-reporter-enabled                     If set, the CSV metrics reporter will be enabled
--fetch-size <Integer: size>               The amount of data to fetch in a single request. (default: 1048576)
--formatter <class>                        The name of a class to use for formatting kafka messages for display. (default: kafka.consumer.DefaultMessageFormatter)
--from-beginning                           If the consumer does not already have an established offset to consume from, start with the earliest message present in the log rather than the latest message.
--group <gid>                              The group id to consume on. (default: console-consumer-85664)
--max-messages <Integer: num_messages>     The maximum number of messages to consume before exiting. If not set, consumption is continual.
--max-wait-ms <Integer: ms>                The max amount of time each fetch request waits. (default: 100)
--metrics-dir <metrics directory>          If csv-reporter-enabled is set, and this parameter is set, the csv metrics will be output here
--min-fetch-bytes <Integer: bytes>         The min number of bytes each fetch request waits for. (default: 1)
--property <prop>
--refresh-leader-backoff-ms <Integer: ms>  Backoff time before refreshing metadata (default: 200)
--skip-message-on-error                    If there is an error when processing a message, skip it instead of halting.
--socket-buffer-size <Integer: size>       The size of the tcp RECV buffer. (default: 2097152)
--socket-timeout-ms <Integer: ms>          The socket timeout used for the connection to the broker (default: 3)
--topic <topic>                            The topic id to consume on.
--whitelist <whitelist>                    Whitelist of topics to include for consumption.
--zookeeper <urls>                         REQUIRED: The connection string for the zookeeper connection in the form host:port. Multiple URLS can be given to allow fail-over.

The --property option comes with no description; is there an example of how to use it?

thanks

AL

On Mon, Jan 19, 2015 at 6:30 PM, Guozhang Wang wangg...@gmail.com wrote:

There is a property config you can set via bin/kafka-console-consumer.sh to commit offsets to ZK; you can use bin/kafka-console-consumer.sh --help to list all the properties. Guozhang

On Mon, Jan 19, 2015 at 5:15 PM, Sa Li sal...@gmail.com wrote:

Guozhang, Currently we are in the stage of testing the producer. Our C# producer sends data to the brokers, and we use the bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance command to produce the messages. We don't have a coded consumer to commit offsets; we use the bin/kafka-console-consumer.sh --zookeeper command to consume. Is there a command we can use on the command line to create the zk path?

thanks

AL

On Mon, Jan 19, 2015 at 4:14 PM, Guozhang Wang wangg...@gmail.com wrote:

Sa, Did your consumer ever commit offsets to Kafka? If not then no corresponding ZK path will be created. Guozhang

On Mon, Jan 19, 2015 at 3:58 PM, Sa Li sal...@gmail.com wrote:

Hi, I use such tool Consumer Offset Checker
Re: Number of Consumers Connected
Hi,

I use the ConsumerOffsetChecker tool, which displays the Consumer Group, Topic, Partitions, Offset, logSize, Lag, and Owner for the specified set of topics and consumer group:

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker

To find the consumer group, in zkCli.sh:

[zk: localhost:2181(CONNECTED) 3] ls /
[transactional, admin, zookeeper, consumers, config, controller, storm, brokers, controller_epoch]
[zk: localhost:2181(CONNECTED) 4] ls /consumers
[web-console-consumer-99295, web-console-consumer-37853, web-console-consumer-30841, perf-consumer-92283, perf-consumer-21631, perf-consumer-95281, perf-consumer-59296, web-console-consumer-52126, web-console-consumer-89137, perf-consumer-72484, perf-consumer-80363, web-console-consumer-47543, web-console-consumer-22509, perf-consumer-16954, perf-consumer-53957, perf-consumer-39448, web-console-consumer-17021, perf-consumer-88693, web-console-consumer-48744, web-console-consumer-82543, perf-consumer-89565, web-console-consumer-97959, perf-consumer-40427, web-console-consumer-95350, web-console-consumer-26473, web-console-consumer-79384, web-console-consumer-8, perf-consumer-91681, web-console-consumer-36136, web-console-consumer-86924, perf-consumer-24510, perf-consumer-5888, perf-consumer-73534, perf-consumer-92985, perf-consumer-7675, perf-consumer-52306, perf-consumer-87352, web-console-consumer-30400]
[zk: localhost:2181(CONNECTED) 5]

I then run:

root@exemplary-birds:/srv/kafka# bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic PofApiTest-1 --group web-console-consumer-48744
Group Topic Pid Offset logSize Lag Owner
Exception in thread main org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/web-console-consumer-48744/offsets/PofApiTest-1/0 at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) at 
org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) at kafka.utils.ZkUtils$.readData(ZkUtils.scala:461) at kafka.tools.ConsumerOffsetChecker$.kafka$tools$ConsumerOffsetChecker$$processPartition(ConsumerOffsetChecker.scala:59) at kafka.tools.ConsumerOffsetChecker$$anonfun$kafka$tools$ConsumerOffsetChecker$$processTopic$1.apply$mcVI$sp(ConsumerOffsetChecker.scala:89) at kafka.tools.ConsumerOffsetChecker$$anonfun$kafka$tools$ConsumerOffsetChecker$$processTopic$1.apply(ConsumerOffsetChecker.scala:89) at kafka.tools.ConsumerOffsetChecker$$anonfun$kafka$tools$ConsumerOffsetChecker$$processTopic$1.apply(ConsumerOffsetChecker.scala:89) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at kafka.tools.ConsumerOffsetChecker$.kafka$tools$ConsumerOffsetChecker$$processTopic(ConsumerOffsetChecker.scala:88) at kafka.tools.ConsumerOffsetChecker$$anonfun$main$3.apply(ConsumerOffsetChecker.scala:153) at kafka.tools.ConsumerOffsetChecker$$anonfun$main$3.apply(ConsumerOffsetChecker.scala:153) at scala.collection.immutable.List.foreach(List.scala:318) at kafka.tools.ConsumerOffsetChecker$.main(ConsumerOffsetChecker.scala:152) at kafka.tools.ConsumerOffsetChecker.main(ConsumerOffsetChecker.scala) Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/web-console-consumer-48744/offsets/PofApiTest-1/0 at org.apache.zookeeper.KeeperException.create(KeeperException.java:102) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956) at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103) at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770) at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766) at 
org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) ... 15 more

So the consumer groups are confusing. I didn't specify a consumer group id in the producer, and the only place I know to configure a group is consumer.properties:

#consumer group id
group.id=test-consumer-group

Any hints? Thanks AL

On Mon, Dec 15, 2014 at 6:46 PM, nitin sharma kumarsharma.ni...@gmail.com wrote: got it ... thanks a lot. Regards, Nitin Kumar Sharma.

On Mon, Dec 15, 2014 at 9:26 PM, Gwen Shapira gshap...@cloudera.com wrote: Hi Nitin, Go to where you installed zookeeper and run: bin/zkCli.sh -server 127.0.0.1:2181

On Mon, Dec 15, 2014 at 6:09 PM, nitin sharma kumarsharma.ni...@gmail.com wrote: Thanks Neha and Gwen for your responses..
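Tying the thread together: the console consumer invents a random group id (console-consumer-NNNNN, as in the ls /consumers output above) unless one is given, and ConsumerOffsetChecker only finds groups that have actually committed offsets. A command sketch, assuming a 0.8.x install reachable at localhost and using the topic name from the thread (the group name my-test-group is just an example):

```shell
# Consume with an explicit group id instead of an auto-generated one;
# autocommit (on by default) will create /consumers/my-test-group/offsets/... in ZK.
bin/kafka-console-consumer.sh --zookeeper localhost:2181 \
  --topic PofApiTest-1 --group my-test-group --from-beginning

# After offsets have been committed, the checker can report lag for that group:
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --topic PofApiTest-1 --group my-test-group
```

Checking a randomly generated console-consumer group, as in the thread, fails with NoNode whenever that short-lived consumer never committed before exiting.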
Re: kafka-web-console error
Continuing this kafka-web-console thread, I followed this page: http://mungeol-heo.blogspot.ca/2014/12/kafka-web-console.html and ran the command: play start -Dhttp.port=8080

It works well for a while, but then I get this error:

at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Caused by: java.sql.SQLException: Timed out waiting for a free available connection. at com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(DefaultConnectionStrategy.java:88) at com.jolbox.bonecp.AbstractConnectionStrategy.getConnection(AbstractConnectionStrategy.java:90) at com.jolbox.bonecp.BoneCP.getConnection(BoneCP.java:553) at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:131) at play.api.db.DBApi$class.getConnection(DB.scala:67) at play.api.db.BoneCPApi.getConnection(DB.scala:276) at play.api.db.DB$$anonfun$getConnection$1.apply(DB.scala:133) at play.api.db.DB$$anonfun$getConnection$1.apply(DB.scala:133) at scala.Option.map(Option.scala:145) at play.api.db.DB$.getConnection(DB.scala:133) at Global$.Global$$getSession(Global.scala:58) at Global$$anonfun$initiateDb$1.apply(Global.scala:47) at Global$$anonfun$initiateDb$1.apply(Global.scala:47) at org.squeryl.SessionFactory$.newSession(Session.scala:95) at org.squeryl.dsl.QueryDsl$class.inTransaction(QueryDsl.scala:100) at org.squeryl.PrimitiveTypeMode$.inTransaction(PrimitiveTypeMode.scala:40) at models.Setting$.findByKey(Setting.scala:46) at actors.OffsetHistoryManager.actors$OffsetHistoryManager$$schedule(OffsetHistoryManager.scala:102) at actors.OffsetHistoryManager.preStart(OffsetHistoryManager.scala:55) at akka.actor.Actor$class.postRestart(Actor.scala:532) at actors.OffsetHistoryManager.postRestart(OffsetHistoryManager.scala:41) at 
akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:229) ... 11 more

Any hints? thanks AL

On Fri, Jan 2, 2015 at 5:07 PM, Joe Stein joe.st...@stealth.ly wrote:

The Kafka project doesn't have an official web console, so you might need to open an issue on the GitHub page of the web console project you are using; it may not be closing connections and may be using up all available resources regardless of what you have set. If it is not a bug in the client you are using, you may have to increase this value for the operating system. You can give http://www.cyberciti.biz/faq/howto-linux-get-list-of-open-files/ a try to ascertain which is the problem, and/or do a ulimit -n on your machine and see if the value = 1024, which is likely the default for your OS.

/*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop /

On Fri, Jan 2, 2015 at 7:41 PM, Sa Li sal...@gmail.com wrote:

Hi, all I am running kafka-web-console, and I periodically get such an error, which brings the UI down: ! 
@6kldaf9lj - Internal server error, for (GET) [/assets/images/zookeeper_small.gif] - play.api.Application$$anon$1: Execution exception[[FileNotFoundException: /vagrant/kafka-web-console-master/target/scala-2.10/classes/public/images/zookeeper_small.gif (Too many open files)]] at play.api.Application$class.handleError(Application.scala:293) ~[play_2.10-2.2.1.jar:2.2.1] at play.api.DefaultApplication.handleError(Application.scala:399) [play_2.10-2.2.1.jar:2.2.1] at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$12$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:165) [play_2.10-2.2.1.jar:2.2.1] at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$12$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:162) [play_2.10-2.2.1.jar:2.2.1] at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33) [scala-library-2.10.2.jar:na] at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:185) [scala-library-2.10.2.jar:na] Caused by: java.io.FileNotFoundException: /vagrant/kafka-web-console-master/target/scala-2.10/classes/public/images/zookeeper_small.gif (Too many open files) at java.io.FileInputStream.open(Native Method) ~[na:1.7.0_65] at java.io.FileInputStream.init(FileInputStream.java:146) ~[na:1.7.0_65] at java.io.FileInputStream.init(FileInputStream.java:101) ~[na:1.7.0_65
Re: connection error among nodes
Hello, Jun

I run this command:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test-rep-three 100 3000 -1 acks=-1 bootstrap.servers=10.100.98.100:9092,10.100.98.101:9092,10.100.98.102:9092 buffer.memory=67108864 batch.size=8196

This is the perf test for producing data to different hosts. I have these questions:

1. If I use bootstrap.servers=10.100.98.100:9092,10.100.98.101:9092,10.100.98.102:9092, does this mean I send data to each of the servers through port 9092? And if I set bootstrap.servers=10.100.98.100:9092 only, will the producer send data to .100 while the data replicates through other connections? Here is the netstat output I saw:

netstat -plantue | egrep -i '.98.101|.98.102'
tcp  0 0   10.100.98.100:37512 10.100.98.102:22    ESTABLISHED 1004 7516316 5862/ssh
tcp6 0 0   10.100.98.100:9092  10.100.98.102:56819 ESTABLISHED 0 371522  3852/java
tcp6 0 0   10.100.98.100:3888  10.100.98.101:40052 ESTABLISHED 0 27514   1793/java
tcp6 0 202 10.100.98.100:53592 10.100.98.101:9092  ESTABLISHED 0 497715  3852/java
tcp6 0 0   10.100.98.100:3888  10.100.98.102:32837 ESTABLISHED 0 21701   1793/java
tcp6 0 0   10.100.98.100:9092  10.100.98.101:51053 ESTABLISHED 0 371526  3852/java
tcp6 0 270 10.100.98.100:53591 10.100.98.101:9092  ESTABLISHED 0 497713  3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.102:57554 ESTABLISHED 0 7491796 3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.101:51055 ESTABLISHED 0 601277  3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.102:56824 ESTABLISHED 0 614795  3852/java
tcp6 0 0   10.100.98.100:48226 10.100.98.102:9092  ESTABLISHED 0 3659949 3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.101:51054 ESTABLISHED 0 601275  3852/java
tcp6 0 0   10.100.98.100:48225 10.100.98.102:9092  ESTABLISHED 0 3803556 3852/java
tcp6 0 0   10.100.98.100:53593 10.100.98.101:9092  ESTABLISHED 0 638462  3852/java
tcp6 0 236 10.100.98.100:48228 10.100.98.102:9092  ESTABLISHED 0 3936260 3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.102:56827 ESTABLISHED 0 601276  3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.101:51052 ESTABLISHED 0 614796  3852/java
tcp6 0 230 10.100.98.100:53594 10.100.98.101:9092  ESTABLISHED 0 637547  3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.102:56826 ESTABLISHED 0 601274  3852/java
tcp6 0 0   10.100.98.100:2181  10.100.98.102:34162 ESTABLISHED 0 7491795 1793/java
tcp6 0 230 10.100.98.100:48227 10.100.98.102:9092  ESTABLISHED 0 3121735 3852/java
tcp6 0 0   10.100.98.100:9092  10.100.98.102:56825 ESTABLISHED 0 497716  3852/java

2. I still see the server-disconnect error during the perf test, but it goes back to normal and prints results like:

11212 records sent, 2239.7 records/sec (6.41 MB/sec), 7287.1 ms avg latency, 15005.0 max latency.
11620 records sent, 2313.8 records/sec (6.62 MB/sec), 7248.9 ms avg latency, 14807.0 max latency.
11522 records sent, 2293.4 records/sec (6.56 MB/sec), 7110.9 ms avg latency, 14551.0 max latency.
11058 records sent, 2200.2 records/sec (6.29 MB/sec), 7176.8 ms avg latency, 14774.0 max latency.

By monitoring the connections between nodes, we didn't see any disconnections, but we are not sure whether the Kafka servers actually disconnected. How can we check?

thanks AL

On Sun, Jan 18, 2015 at 10:21 AM, Jun Rao j...@confluent.io wrote: Any issue with the network? Thanks, Jun

On Wed, Jan 7, 2015 at 1:59 PM, Sa Li sal...@gmail.com wrote: One thing bothers me: sometimes the errors won't pop up, sometimes they do. Why?

On Wed, Jan 7, 2015 at 1:49 PM, Sa Li sal...@gmail.com wrote:

Hi, Experts

Our cluster is a 3-node cluster. I simply tested the producer locally:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test-rep-three 100 3000 -1 acks=1 bootstrap.servers=10.100.98.100:9092 buffer.memory=67108864 batch.size=8196

But I got such an error. I do think this is a critical issue; it temporarily loses the connection and then gets it back. What is the reason for this?
[2015-01-07 21:44:14,180] WARN Error in I/O with voluminous-mass.master/ 10.100.98.101 (org.apache.kafka.common.network.Selector) java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method
java.io.IOException: Too many open files error
Hi, all

We are testing our production Kafka and getting this error:

[2015-01-15 19:03:45,057] ERROR Error in acceptor (kafka.network.Acceptor) java.io.IOException: Too many open files at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) at kafka.network.Acceptor.accept(SocketServer.scala:200) at kafka.network.Acceptor.run(SocketServer.scala:154) at java.lang.Thread.run(Thread.java:745)

I noticed some other developers had similar issues; one suggestion was:

Without knowing the intricacies of Kafka, I think the default open file descriptor limit is 1024 on unix. This can be changed by setting a higher ulimit value (typically 8192 but sometimes even 10 ). Before modifying the ulimit I would recommend you check the number of sockets stuck in TIME_WAIT mode. In this case, it looks like the broker has too many open sockets. This could be because you have a rogue client connecting and disconnecting repeatedly. You might have to reduce the TIME_WAIT state to 30 seconds or lower.

We increased the open file handles by inserting

kafka - nofile 10

in /etc/security/limits.conf. Is that the right way to change the open file descriptor limit? In addition, it says to reduce TIME_WAIT; where do I change that setting? Or is there any other solution for this issue?

thanks

-- Alec Li
Re: java.io.IOException: Too many open files error
Thanks for the reply. I have changed the configuration and am running to see if any errors come up.

SL

On Thu, Jan 15, 2015 at 3:34 PM, István lecc...@gmail.com wrote:

Hi Sa Li, Depending on your system, that configuration entry needs to be modified. The first parameter after the insert is the username you use to run Kafka. It might be your own username or something else; in the following example it is called kafkauser. On top of that I also like to use soft and hard limits: when you hit the soft limit the system will log a meaningful message in dmesg so you can see what is happening.

kafkauser soft nofile 8
kafkauser hard nofile 10

Hope that helps, Istvan

On Thu, Jan 15, 2015 at 12:30 PM, Sa Li sal...@gmail.com wrote:

Hi, all We test our production kafka, and getting such error [2015-01-15 19:03:45,057] ERROR Error in acceptor (kafka.network.Acceptor) java.io.IOException: Too many open files at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) at kafka.network.Acceptor.accept(SocketServer.scala:200) at kafka.network.Acceptor.run(SocketServer.scala:154) at java.lang.Thread.run(Thread.java:745) I noticed some other developers had similar issues, one suggestion was Without knowing the intricacies of Kafka, i think the default open file descriptors is 1024 on unix. This can be changed by setting a higher ulimit value (typically 8192 but sometimes even 10 ). Before modifying the ulimit I would recommend you check the number of sockets stuck in TIME_WAIT mode. In this case, it looks like the broker has too many open sockets. This could be because you have a rogue client connecting and disconnecting repeatedly. You might have to reduce the TIME_WAIT state to 30 seconds or lower. We increase the open file handles by doing this: insert kafka - nofile 10 in /etc/security/limits.conf Is that right to change the open file descriptors? In addition, it says to reduce the TIME_WAIT, where do I change this setting? Or any other solution for this issue? thanks -- Alec Li

-- the sun shines for all -- Alec Li
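To cross-check the limits discussed above, you can inspect what a broker started from your shell would actually inherit. A minimal sketch (the pgrep pattern is a hypothetical example; the /proc path is Linux-specific):

```shell
# Limits of the current shell (what a broker started from here inherits):
ulimit -Sn   # soft open-file limit, the one the "Too many open files" error hits
ulimit -Hn   # hard open-file limit, the ceiling the soft limit can be raised to

# Limits of an already-running broker process (example lookup, Linux only):
# BROKER_PID=$(pgrep -f kafka.Kafka)
# grep 'Max open files' /proc/$BROKER_PID/limits

# Sockets stuck in TIME_WAIT, per the suggestion above (where netstat exists):
# netstat -ant | grep -c TIME_WAIT
```

Note that limits.conf entries only take effect for new login sessions (via pam_limits), so the broker must be restarted from a fresh session to pick them up; checking /proc/<pid>/limits confirms whether it actually did.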
metric-kafka problems
Hello, all

I'd like to use the metrics-kafka tool, which seems attractive for reporting Kafka metrics and graphing them with Graphite; however, I am having trouble making it work. In https://github.com/stealthly/metrics-kafka, it says: In the main metrics-kafka folder 1) sudo ./bootstrap.sh 2) ./gradlew test 3) sudo ./shutdown.sh

When I run ./bootstrap.sh, this is what I got:

root@DO-mq-dev:/home/stuser/jmx/metrics-kafka# ././bootstrap.sh
/dev/stdin: line 1: syntax error near unexpected token `newline'
/dev/stdin: line 1: `!DOCTYPE html'
/dev/stdin: line 1: syntax error near unexpected token `newline'
/dev/stdin: line 1: `!DOCTYPE html'
e348a98a5afb8b89b94fce51b125e8a2045d9834268ec64c3e38cb7b165ef642
2015/01/09 16:49:21 Error response from daemon: Could not find entity for broker1

And this is how I vagrant up:

root@DO-mq-dev:/home/stuser/jmx/metrics-kafka# vagrant up
/usr/share/vagrant/plugins/provisioners/docker/plugin.rb:13:in `require_relative': /usr/share/vagrant/plugins/provisioners/docker/config.rb:23: syntax error, unexpected tPOW (SyntaxError) def run(name, **options) ^ /usr/share/vagrant/plugins/provisioners/docker/config.rb:43: syntax error, unexpected keyword_end, expecting $end from /usr/share/vagrant/plugins/provisioners/docker/plugin.rb:13:in `block in class:Plugin' from /usr/lib/ruby/vendor_ruby/vagrant/registry.rb:27:in `call' from /usr/lib/ruby/vendor_ruby/vagrant/registry.rb:27:in `get' from /usr/share/vagrant/plugins/kernel_v2/config/vm_provisioner.rb:34:in `initialize' from /usr/share/vagrant/plugins/kernel_v2/config/vm.rb:223:in `new' from /usr/share/vagrant/plugins/kernel_v2/config/vm.rb:223:in `provision' from /home/stuser/jmx/metrics-kafka/Vagrantfile:29:in `block (2 levels) in top (required)' from /usr/lib/ruby/vendor_ruby/vagrant/config/v2/loader.rb:37:in `call' from /usr/lib/ruby/vendor_ruby/vagrant/config/v2/loader.rb:37:in `load' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:104:in `block (2 levels) in load' from 
/usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:98:in `each' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:98:in `block in load' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:95:in `each' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:95:in `load' from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:335:in `machine' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:142:in `block in with_target_vms' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:175:in `call' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:175:in `block in with_target_vms' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:174:in `map' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:174:in `with_target_vms' from /usr/share/vagrant/plugins/commands/up/command.rb:56:in `block in execute' from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:210:in `block (2 levels) in batch' from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:208:in `tap' from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:208:in `block in batch' from internal:prelude:10:in `synchronize' from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:207:in `batch' from /usr/share/vagrant/plugins/commands/up/command.rb:55:in `execute' from /usr/lib/ruby/vendor_ruby/vagrant/cli.rb:38:in `execute' from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:484:in `cli' from /usr/bin/vagrant:127:in `main' Any idea to make it work? thanks -- Alec Li
Re: metric-kafka problems
Thank you very much, Joe. I will try all of them and keep this thread posted.

On Jan 9, 2015 5:55 PM, Joe Stein joe.st...@stealth.ly wrote:

Hi, https://github.com/stealthly/metrics-kafka is a project to be used as an example of how to use Kafka as a central point to send all of your metrics for your entire infrastructure. The consumers integrate so as to abstract the load and coupling of services, so systems can just send their stats to Kafka and then you can do whatever you want with them from there (often multiple things). We also built a Yammer Metrics Reporter (which is what Kafka uses to send its metrics) for Kafka itself, so brokers can send their stats into a Kafka topic to be used downstream (typically another cluster). The issue you reported was caused by changes by GitHub, and I just pushed fixes for them so things are working again.

If you are not looking for that type of solution and just want to see and chart broker metrics, then I would suggest taking a look at https://github.com/airbnb/kafka-statsd-metrics2 and pointing it to https://github.com/kamon-io/docker-grafana-graphite. I find this a very quick out-of-the-box way to see what is going on with a broker when no stats reporter is already in place.

If you want a Kafka metrics reporter for just Graphite, check out https://github.com/damienclaveau/kafka-graphite; for just Ganglia, https://github.com/criteo/kafka-ganglia; for just Riemann, https://github.com/TheLadders/KafkaRiemannMetricsReporter; and/or you can also use a service like SPM https://apps.sematext.com/spm-reports/mainPage.do?selectedApplication=4293 or DataDog https://www.datadoghq.com/.

Hope this helps, thanks! 
/*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop / On Fri, Jan 9, 2015 at 7:51 PM, Sa Li sal...@gmail.com wrote: Hello, all I like to use the tool metrics-kafka which seems to be attractive to report kafka metric and use graphite to graph metrics, however I am having trouble to make it work. In https://github.com/stealthly/metrics-kafka, it says: In the main metrics-kafka folder 1) sudo ./bootstrap.sh 2) ./gradlew test 3) sudo ./shutdown.sh When I run ./bootstrap, see this is what I got root@DO-mq-dev:/home/stuser/jmx/metrics-kafka# ././bootstrap.sh /dev/stdin: line 1: syntax error near unexpected token `newline' /dev/stdin: line 1: `!DOCTYPE html' /dev/stdin: line 1: syntax error near unexpected token `newline' /dev/stdin: line 1: `!DOCTYPE html' e348a98a5afb8b89b94fce51b125e8a2045d9834268ec64c3e38cb7b165ef642 2015/01/09 16:49:21 Error response from daemon: Could not find entity for broker1 And this is how I vagrant up: root@DO-mq-dev:/home/stuser/jmx/metrics-kafka# vagrant up /usr/share/vagrant/plugins/provisioners/docker/plugin.rb:13:in `require_relative': /usr/share/vagrant/plugins/provisioners/docker/config.rb:23: syntax error, unexpected tPOW (SyntaxError) def run(name, **options) ^ /usr/share/vagrant/plugins/provisioners/docker/config.rb:43: syntax error, unexpected keyword_end, expecting $end from /usr/share/vagrant/plugins/provisioners/docker/plugin.rb:13:in `block in class:Plugin' from /usr/lib/ruby/vendor_ruby/vagrant/registry.rb:27:in `call' from /usr/lib/ruby/vendor_ruby/vagrant/registry.rb:27:in `get' from /usr/share/vagrant/plugins/kernel_v2/config/vm_provisioner.rb:34:in `initialize' from /usr/share/vagrant/plugins/kernel_v2/config/vm.rb:223:in `new' from /usr/share/vagrant/plugins/kernel_v2/config/vm.rb:223:in `provision' from /home/stuser/jmx/metrics-kafka/Vagrantfile:29:in `block (2 levels) in top (required)' from 
/usr/lib/ruby/vendor_ruby/vagrant/config/v2/loader.rb:37:in `call' from /usr/lib/ruby/vendor_ruby/vagrant/config/v2/loader.rb:37:in `load' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:104:in `block (2 levels) in load' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:98:in `each' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:98:in `block in load' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:95:in `each' from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:95:in `load' from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:335:in `machine' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:142:in `block in with_target_vms' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:175:in `call' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:175:in `block in with_target_vms' from /usr/lib/ruby/vendor_ruby/vagrant/plugin/v2/command.rb:174:in `map
Re: zookeeper monitoring
Hi,

I went through zkServer.sh and made changes to /etc/zookeeper/conf/environment:

ZOOMAIN=-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false com.sun.management.jmxremote.port=2 org.apache.zookeeper.server.quorum.QuorumPeerMain

which was originally

ZOOMAIN=org.apache.zookeeper.server.quorum.QuorumPeerMain

When I run it, I see:

root@DO-mq-dev:/etc/zookeeper/bin# ./zkServer.sh start
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... STARTED

But when I try to connect with jconsole to 10.100.70.128:2, it fails to connect. Is there a way to confirm jmxremote port = 2?

thanks AL

On Thu, Jan 8, 2015 at 4:02 PM, Sa Li sal...@gmail.com wrote:

Hi, all I've just figured out the monitoring of Kafka with jconsole, and I want to do the same thing for ZooKeeper. The ZooKeeper site says: The class *org.apache.zookeeper.server.quorum.QuorumPeerMain* will start a JMX manageable ZooKeeper server. This class registers the proper MBeans during initialization to support JMX monitoring and management of the instance. See *bin/zkServer.sh* for one example of starting ZooKeeper using QuorumPeerMain.

I found that when I type:

root@pof-kstorm-dev1:/etc/kafka# zkServer.sh start
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... STARTED

it seems JMX is enabled by default. Checking zkServer.sh:

ZOOMAIN=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=$JMXLOCALONLY org.apache.zookeeper.server.quorum.QuorumPeerMain

Here, given -Dcom.sun.management.jmxremote.local.only=$JMXLOCALONLY, should I change a JMX port here, or what is the default JMX port number for ZooKeeper?

thanks -- Alec Li -- Alec Li
Re: zookeeper monitoring
Worked, thanks.

On Fri, Jan 9, 2015 at 10:37 AM, Sa Li sal...@gmail.com wrote: [...]

-- Alec Li
Re: kafka monitoring
Thank you very much for all the replies. I am able to connect jconsole now, by setting the env JMX_PORT before starting the server. However, when I connect I found there is a port conflict with kafka-run-class.sh:

Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: ; nested exception is: java.net.BindException: Address already in use

I can of course reset JMX_PORT to another number, but I am curious, do I have to? thanks AL

On Thu, Jan 8, 2015 at 11:57 AM, Gene Robichaux gene.robich...@match.com wrote: Is there a firewall between your DEV and PROD environments? If so you will need to open access on all ports, not just the JMX port. It gets complicated with JMX. Gene Robichaux, Manager, Database Operations, Match.com, 8300 Douglas Avenue, Suite 800, Dallas, TX 75225

-----Original Message----- From: Sa Li [mailto:sal...@gmail.com] Sent: Thursday, January 08, 2015 1:09 PM To: users@kafka.apache.org Subject: kafka monitoring

Hello, All. I understand many of you are using jmxtrans along with graphite/ganglia to pull out metrics. According to https://kafka.apache.org/081/ops.html: "The easiest way to see the available metrics is to fire up jconsole and point it at a running kafka client or server; this will allow browsing all metrics with JMX." I tried to fire up jconsole on Windows, attempting to access our dev and production clusters, which are running fine. Here is the main node of my dev: 10.100.75.128, broker port 9092, zk port 2181. Jconsole shows:

New Connection
Remote Process:
Usage: hostname:port OR service:jmx:protocol:sap
Username: Password:

Sorry about my naivety; I tried to connect based on the above IP but just can't connect. Do I need to do something on the dev server to make it work? thanks -- Alec Li
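The clash likely happens because kafka-run-class.sh applies the exported JMX_PORT to every JVM it launches, so a CLI tool started on the broker host tries to bind the same port the broker already holds. One practical workaround is to pick a port that is not in the busy list before starting. A small sketch, not part of Kafka itself; the 9990-9999 range is an arbitrary choice, and in practice you would feed it the ports from `ss -tln` or `netstat -tln`:

```shell
# Pick the first free TCP port in a range, given a whitespace-separated
# list of ports already in use.
pick_free_port() {
  used="$1"
  for p in $(seq 9990 9999); do
    case " $used " in
      *" $p "*) ;;              # port is busy, try the next one
      *) echo "$p"; return 0;;  # first port not in the busy list
    esac
  done
  return 1                      # whole range busy
}

# Sample: 9990 and 9991 are taken, so 9992 is chosen.
pick_free_port "9990 9991"
```

Hypothetical usage on a broker host: `JMX_PORT=$(pick_free_port "$(ss -tln | awk 'NR>1 {sub(/.*:/, "", $4); print $4}')") bin/kafka-console-consumer.sh ...` so each tool gets its own JMX port.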
Re: kafka monitoring
In addition, I found all the attributes in the jconsole MBeans tab are useful, but they are not graphed. So again, if I want real-time graphing, is jmxtrans + graphite the solution? thanks AL

On Thu, Jan 8, 2015 at 1:35 PM, Sa Li sal...@gmail.com wrote: [...]

-- Alec Li
Re: NotLeaderForPartitionException while doing performance test
26u IPv6 213145 0t0 TCP *:2181 (LISTEN)
java 22152 root 27u IPv6 211541 0t0 TCP exemplary-birds.master:3888 (LISTEN)
java 22152 root 28u IPv6 443527 0t0 TCP exemplary-birds.master:3888->complicated-laugh.master:43940 (ESTABLISHED)
java 22152 root 29u IPv6 23347 0t0 TCP exemplary-birds.master:43797->harmful-jar.master:2888 (ESTABLISHED)
java 22152 root 30u IPv6 204517 0t0 TCP exemplary-birds.master:3888->harmful-jar.master:50791 (ESTABLISHED)
java 22152 root 31u IPv6 4278513 0t0 TCP exemplary-birds.master:3888->voluminous-mass.master:50452 (ESTABLISHED)
java 22152 root 32u IPv6 4345845 0t0 TCP exemplary-birds.master:2181->harmful-jar.master:45048 (ESTABLISHED)
java 22152 root 33u IPv6 443552 0t0 TCP exemplary-birds.master:3888->beloved-judge.master:56370 (ESTABLISHED)
java 22152 root 35u IPv6 4364514 0t0 TCP exemplary-birds.master:2181->voluminous-mass.master:60600 (ESTABLISHED)
ssh 24632 sa 3u IPv4 4289852 0t0 TCP exemplary-birds.master:60510->harmful-jar.master:ssh (ESTABLISHED)
ssh 24645 sa 3u IPv4 4289867 0t0 TCP exemplary-birds.master:33295->voluminous-mass.master:ssh (ESTABLISHED)

I didn't see anything wrong with it, but it seems the connection was temporarily closed. Anyone have a similar issue? thanks

On Wed, Jan 7, 2015 at 10:32 PM, Jaikiran Pai jai.forums2...@gmail.com wrote: There are different ways to find the connection count, and each one depends on the operating system being used. lsof -i is one option, for example, on *nix systems. -Jaikiran

On Thursday 08 January 2015 11:40 AM, Sa Li wrote: [...]
Re: NotLeaderForPartitionException while doing performance test
Yes, it is a weird hostname ;) — that is what our system guys named it. How do I take note of and measure the connections open to 10.100.98.102? Thanks AL

On Jan 7, 2015 9:42 PM, Jaikiran Pai jai.forums2...@gmail.com wrote:

On Thursday 08 January 2015 01:51 AM, Sa Li wrote: see this type of error again, back to normal in a few secs: [2015-01-07 20:19:49,744] WARN Error in I/O with harmful-jar.master/10.100.98.102 [...]

That's a really weird hostname, the "harmful-jar.master". Is that really your hostname? You mention that this happens during performance testing. Have you taken note of how many connections are open to that 10.100.98.102 IP when this Connection refused exception happens? -Jaikiran
NotLeaderForPartitionException while doing performance test
Hi, All, I am doing a performance test with:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test-rep-three 5 100 -1 acks=1 bootstrap.servers=10.100.98.100:9092,10.100.98.101:9092,10.100.98.102:9092 buffer.memory=67108864 batch.size=8196

where the topic test-rep-three is described as follows:

bin/kafka-topics.sh --describe --zookeeper 10.100.98.101:2181 --topic test-rep-three
Topic: test-rep-three  PartitionCount: 8  ReplicationFactor: 3  Configs:
  Topic: test-rep-three  Partition: 0  Leader: 100  Replicas: 100,102,101  Isr: 102,101,100
  Topic: test-rep-three  Partition: 1  Leader: 101  Replicas: 101,100,102  Isr: 102,101,100
  Topic: test-rep-three  Partition: 2  Leader: 102  Replicas: 102,101,100  Isr: 101,102,100
  Topic: test-rep-three  Partition: 3  Leader: 100  Replicas: 100,101,102  Isr: 101,100,102
  Topic: test-rep-three  Partition: 4  Leader: 101  Replicas: 101,102,100  Isr: 102,100,101
  Topic: test-rep-three  Partition: 5  Leader: 102  Replicas: 102,100,101  Isr: 100,102,101
  Topic: test-rep-three  Partition: 6  Leader: 102  Replicas: 100,102,101  Isr: 102,101,100
  Topic: test-rep-three  Partition: 7  Leader: 101  Replicas: 101,100,102  Isr: 101,100,102

It produces messages and runs for a while, but periodically throws exceptions like:

org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
(same line repeated 11 times in total)

141292 records sent, 28258.4 records/sec (80.85 MB/sec), 551.2 ms avg latency, 1494.0 max latency.
142526 records sent, 28505.2 records/sec (81.55 MB/sec), 580.8 ms avg latency, 1513.0 max latency.
146564 records sent, 29312.8 records/sec (83.86 MB/sec), 557.9 ms avg latency, 1431.0 max latency.
146755 records sent, 29351.0 records/sec (83.97 MB/sec), 556.7 ms avg latency, 1480.0 max latency.
147963 records sent, 29592.6 records/sec (84.67 MB/sec), 556.7 ms avg latency, 1546.0 max latency.
146931 records sent, 29386.2 records/sec (84.07 MB/sec), 550.9 ms avg latency, 1715.0 max latency.
146947 records sent, 29389.4 records/sec (84.08 MB/sec), 555.1 ms avg latency, 1750.0 max latency.
146422 records sent, 29284.4 records/sec (83.78 MB/sec), 557.9 ms avg latency, 1818.0 max latency.
147516 records sent, 29503.2 records/sec (84.41 MB/sec), 555.6 ms avg latency, 1806.0 max latency.
147877 records sent, 29575.4 records/sec (84.62 MB/sec), 552.1 ms avg latency, 1821.0 max latency.
147201 records sent, 29440.2 records/sec (84.23 MB/sec), 554.5 ms avg latency, 1826.0 max latency.
148317 records sent, 29663.4 records/sec (84.87 MB/sec), 558.1 ms avg latency, 1792.0 max latency.
147756 records sent, 29551.2 records/sec (84.55 MB/sec), 550.9 ms avg latency, 1806.0 max latency.

Then it goes back to a normal processing state. Is that because of a rebalance? thanks -- Alec Li
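NotLeaderForPartitionException is a retriable error: it appears when a partition's leadership moves while the producer still holds stale metadata, after which the client refreshes and carries on. One way to gauge the impact is to pull the records/sec figures out of the tool's progress lines and look for dips around the exception bursts. A sketch over sample lines copied from the runs in this thread:

```shell
# Extract records/sec from ProducerPerformance progress lines to
# spot throughput dips around leader changes. $4 is the rate field:
# "<n> records sent, <rate> records/sec (...)".
perf_log='141292 records sent, 28258.4 records/sec (80.85 MB/sec), 551.2 ms avg latency, 1494.0 max latency.
160403 records sent, 32080.6 records/sec (91.78 MB/sec), 507.0 ms avg latency, 2418.0 max latency.
100315 records sent, 19995.0 records/sec (57.21 MB/sec), 774.8 ms avg latency, 3858.0 max latency.'

rates=$(printf '%s\n' "$perf_log" | awk '{print $4}')
printf '%s\n' "$rates"
```

In practice you would pipe the live tool output through the same awk one-liner instead of a hard-coded sample.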
Re: NotLeaderForPartitionException while doing performance test
see this type of error again, back to normal in a few secs:

[2015-01-07 20:19:49,744] WARN Error in I/O with harmful-jar.master/10.100.98.102 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:232)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
        at java.lang.Thread.run(Thread.java:745)
[2015-01-07 20:19:49,754] WARN Error in I/O with harmful-jar.master/10.100.98.102 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
        (same stack trace)
[2015-01-07 20:19:49,764] WARN Error in I/O with harmful-jar.master/10.100.98.102 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
        (same stack trace)

160403 records sent, 32080.6 records/sec (91.78 MB/sec), 507.0 ms avg latency, 2418.0 max latency.
109882 records sent, 21976.4 records/sec (62.87 MB/sec), 672.7 ms avg latency, 3529.0 max latency.
100315 records sent, 19995.0 records/sec (57.21 MB/sec), 774.8 ms avg latency, 3858.0 max latency.

On Wed, Jan 7, 2015 at 12:07 PM, Sa Li sal...@gmail.com wrote: [...]
connection error among nodes
Hi, Experts. Ours is a 3-node cluster, and I simply test the producer locally:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test-rep-three 100 3000 -1 acks=1 bootstrap.servers=10.100.98.100:9092 buffer.memory=67108864 batch.size=8196

But I got the error below. I do think this is a critical issue: it just temporarily loses the connection and then gets it back. What is the reason for this?

[2015-01-07 21:44:14,180] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:232)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
        at java.lang.Thread.run(Thread.java:745)
[2015-01-07 21:44:14,190] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
        (same stack trace, repeated every ~10 ms through 21:44:14,240)
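Connection refused (as opposed to a timeout) means nothing was listening on the broker port at that instant, which usually points at a broker process being down or restarting on that node. When the warnings come in bursts, a per-host tally helps localize which broker is flapping. A sketch over sample lines mirroring the log above; in practice you would feed the real producer log in instead of the hard-coded sample:

```shell
# Tally "Error in I/O" warnings per host from producer log output.
# $8 in each WARN line is the "<host>/<ip>" token.
warn_log='[2015-01-07 21:44:14,180] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
[2015-01-07 21:44:14,190] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
[2015-01-07 20:19:49,744] WARN Error in I/O with harmful-jar.master/10.100.98.102 (org.apache.kafka.common.network.Selector)'

counts=$(printf '%s\n' "$warn_log" |
  awk '/Error in I\/O with/ {print $8}' | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

If one host dominates the tally, check whether its broker process is actually up (`ps`, broker logs) during the burst rather than suspecting the network first.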
Re: NotLeaderForPartitionException while doing performance test
I checked the topic config; the ISR changes dynamically:

root@voluminous-mass:/srv/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.101:2181 --topic test-rep-three
Topic: test-rep-three  PartitionCount: 8  ReplicationFactor: 3  Configs:
  Topic: test-rep-three  Partition: 0  Leader: 100  Replicas: 100,102,101  Isr: 100
  Topic: test-rep-three  Partition: 1  Leader: 100  Replicas: 101,100,102  Isr: 100,101,102
  Topic: test-rep-three  Partition: 2  Leader: 102  Replicas: 102,101,100  Isr: 101,102
  Topic: test-rep-three  Partition: 3  Leader: 100  Replicas: 100,101,102  Isr: 100
  Topic: test-rep-three  Partition: 4  Leader: 100  Replicas: 101,102,100  Isr: 100
  Topic: test-rep-three  Partition: 5  Leader: 102  Replicas: 102,100,101  Isr: 100,102,101
  Topic: test-rep-three  Partition: 6  Leader: 100  Replicas: 100,102,101  Isr: 100,102,101
  Topic: test-rep-three  Partition: 7  Leader: 100  Replicas: 101,100,102  Isr: 100

root@voluminous-mass:/srv/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.101:2181 --topic test-rep-three
Topic: test-rep-three  PartitionCount: 8  ReplicationFactor: 3  Configs:
  Topic: test-rep-three  Partition: 0  Leader: 100  Replicas: 100,102,101  Isr: 102,100,101
  Topic: test-rep-three  Partition: 1  Leader: 101  Replicas: 101,100,102  Isr: 101,102,100
  Topic: test-rep-three  Partition: 2  Leader: 102  Replicas: 102,101,100  Isr: 101,102
  Topic: test-rep-three  Partition: 3  Leader: 100  Replicas: 100,101,102  Isr: 101,100,102
  Topic: test-rep-three  Partition: 4  Leader: 101  Replicas: 101,102,100  Isr: 101,102,100
  Topic: test-rep-three  Partition: 5  Leader: 102  Replicas: 102,100,101  Isr: 102,101,100
  Topic: test-rep-three  Partition: 6  Leader: 102  Replicas: 100,102,101  Isr: 102,101
  Topic: test-rep-three  Partition: 7  Leader: 101  Replicas: 101,100,102  Isr: 101,100,102

Why does that happen? thanks

On Wed, Jan 7, 2015 at 12:21 PM, Sa Li sal...@gmail.com wrote: [...]

-- Alec Li
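A shrinking ISR like this usually means follower replicas fell behind (long GC pauses, an overloaded broker, or a broker bouncing) and were dropped from the ISR, then caught up and rejoined; leadership can move along with it, which matches the NotLeaderForPartitionException bursts. One way to watch for it is to flag partitions whose ISR is smaller than the replica list in the --describe output (newer kafka-topics.sh versions also accept an --under-replicated-partitions option). A sketch over sample lines taken from the output above:

```shell
# Flag partitions whose ISR is smaller than the replica set in
# `kafka-topics.sh --describe` output. A shrunken ISR means a
# replica fell behind or its broker dropped out. Field layout:
# Topic: <t> Partition: <p> Leader: <l> Replicas: <r,..> Isr: <i,..>
describe='Topic: test-rep-three Partition: 0 Leader: 100 Replicas: 100,102,101 Isr: 100
Topic: test-rep-three Partition: 1 Leader: 100 Replicas: 101,100,102 Isr: 100,101,102
Topic: test-rep-three Partition: 2 Leader: 102 Replicas: 102,101,100 Isr: 101,102'

under=$(printf '%s\n' "$describe" | awk '
  { nrep = split($8, r, ","); nisr = split($10, i, ",")
    if (nisr < nrep) print "partition", $4, "isr", $10 }')
printf '%s\n' "$under"
```

In practice, pipe the live describe output through the awk program and run it in a loop to watch ISR churn during the performance test.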
Re: connection error among nodes
What bothers me is that sometimes the errors don't pop up and sometimes they do. Why?

On Wed, Jan 7, 2015 at 1:49 PM, Sa Li <sal...@gmail.com> wrote:

Hi, Experts

Ours is a 3-node cluster, and I am simply testing the producer locally:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test-rep-three 100 3000 -1 acks=1 bootstrap.servers=10.100.98.100:9092 buffer.memory=67108864 batch.size=8196

But I get the errors below. I do think this is a critical issue: it temporarily loses the connection and then gets it back. What is the reason for this?

[2015-01-07 21:44:14,180] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
  at org.apache.kafka.common.network.Selector.poll(Selector.java:232)
  at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
  at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
  at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
  at java.lang.Thread.run(Thread.java:745)
[2015-01-07 21:44:14,190] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
  (identical stack trace)
[2015-01-07 21:44:14,200] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
  (identical stack trace)
[2015-01-07 21:44:14,210] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
  (identical stack trace)
[2015-01-07 21:44:14,220] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
  (identical stack trace)
[2015-01-07 21:44:14,230] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
  (identical stack trace)
[2015-01-07 21:44:14,240] WARN Error in I/O with voluminous-mass.master/10.100.98.101 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection
question about jmxtrans to get kafka metrics
Hi, All

I installed jmxtrans and Graphite, hoping to graph metrics from Kafka. But first, when I start jmxtrans (using the example graphite.json), I get these errors:

./jmxtrans.sh start graphite.json

[07 Jan 2015 17:55:58] [ServerScheduler_Worker-4] 180214 DEBUG (com.googlecode.jmxtrans.jobs.ServerJob:31) - + Started server job: Server [host=w2, port=1099, url=service:jmx:rmi:///jndi/rmi://w2:1099/jmxrmi, cronExpression=null, numQueryThreads=null]
[07 Jan 2015 17:55:58] [ServerScheduler_Worker-4] 180217 ERROR (com.googlecode.jmxtrans.jobs.ServerJob:39) - Error
java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ConfigurationException [Root exception is java.rmi.UnknownHostException: Unknown host: w2; nested exception is: java.net.UnknownHostException: w2]
  at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369)
  at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:268)
  at com.googlecode.jmxtrans.util.JmxUtils.getServerConnection(JmxUtils.java:351)
  at com.googlecode.jmxtrans.util.JmxConnectionFactory.makeObject(JmxConnectionFactory.java:31)
  at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1212)
  at com.googlecode.jmxtrans.jobs.ServerJob.execute(ServerJob.java:36)
  at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
  at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
Caused by: javax.naming.ConfigurationException [Root exception is java.rmi.UnknownHostException: Unknown host: w2; nested exception is: java.net.UnknownHostException: w2]
  at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:118)
  at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:203)
  at javax.naming.InitialContext.lookup(InitialContext.java:411)
  at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1929)
  at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1896)
  at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:286)
  ... 7 more
Caused by: java.rmi.UnknownHostException: Unknown host: w2; nested exception is: java.net.UnknownHostException: w2
  at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:616)
  at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
  at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
  at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:341)
  at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
  at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:114)
  ... 12 more
Caused by: java.net.UnknownHostException: w2
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:579)
  at java.net.Socket.connect(Socket.java:528)
  at java.net.Socket.<init>(Socket.java:425)
  at java.net.Socket.<init>(Socket.java:208)
  at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
  at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:147)
  at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
  ... 17 more

The graphite.json:

{
  "servers": [
    {
      "port": 1099,
      "host": "w2",
      "queries": [
        {
          "obj": "java.lang:type=Memory",
          "attr": ["HeapMemoryUsage", "NonHeapMemoryUsage"],
          "outputWriters": [
            {
              "@class": "com.googlecode.jmxtrans.model.output.GraphiteWriter",
              "settings": {
                "port": 2003,
                "host": "10.100.70.128"
              }
            }
          ]
        }
      ]
    }
  ]
}

Can anyone help me diagnose this problem?

thanks

-- Alec Li
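The root cause in the trace above is `java.net.UnknownHostException: w2`: the host name `w2` from graphite.json does not resolve on the machine running jmxtrans, so the JMX/RMI connection never even starts. A quick way to confirm is a plain resolution check before blaming JMX; a minimal stdlib-only sketch (host names here are just examples):

```python
import socket

def check_resolves(host: str) -> bool:
    """Return True if `host` resolves to an IP address on this machine."""
    try:
        ip = socket.gethostbyname(host)
        print(f"{host} -> {ip}")
        return True
    except socket.gaierror:
        print(f"{host} does not resolve; add it to /etc/hosts or use an IP/FQDN")
        return False

# "localhost" should always resolve; "w2" only resolves if /etc/hosts or DNS knows it.
check_resolves("localhost")
```

If the check fails for `w2`, either add a `w2` entry to /etc/hosts on the jmxtrans host or put the broker's IP or fully-qualified name in graphite.json.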
no space left error
Hi, All

I am running a performance test on our new Kafka production server. After sending some messages (even fake messages via bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance), connection errors appear and the brokers shut down. After that I see errors such as:

-su: cannot create temp file for here-document: No space left on device

How can I fix this? I am concerned this will happen once we start publishing real messages to Kafka. Should I create a cron job to regularly clean certain directories?

thanks

-- Alec Li
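The "No space left on device" in the here-document error comes from the shell and the JVM's temp directory filling up, not from Kafka's own code, so the first thing to check is free space on the filesystems involved (the log directory and /tmp). A small stdlib sketch for that check (the paths are examples):

```python
import shutil

def free_gb(path: str) -> float:
    """Free space on the filesystem containing `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)

# Check the shell/JVM temp dir and the root filesystem.
for p in ("/tmp", "/"):
    print(f"{p}: {free_gb(p):.1f} GiB free")
```

If /tmp is the full one, the JVM warning in the later messages already hints at the workaround: point `-Djava.io.tmpdir=` at a roomier location.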
Re: no space left error
Continuing this issue: when I restart the server with

bin/kafka-server-start.sh config/server.properties

it fails to start:

[2015-01-06 20:00:55,441] FATAL Fatal error during KafkaServerStable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
  at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
  at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:188)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:165)
  at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
  at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
  at kafka.log.LogSegment.recover(LogSegment.scala:165)
  at kafka.log.Log.recoverLog(Log.scala:179)
  at kafka.log.Log.loadSegments(Log.scala:155)
  at kafka.log.Log.<init>(Log.scala:64)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
  at kafka.log.LogManager.loadLogs(LogManager.scala:105)
  at kafka.log.LogManager.<init>(LogManager.scala:57)
  at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
  at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
  at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
  at kafka.Kafka$.main(Kafka.scala:46)
  at kafka.Kafka.main(Kafka.scala)
[2015-01-06 20:00:55,443] INFO [Kafka Server 100], shutting down (kafka.server.KafkaServer)
[2015-01-06 20:00:55,444] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2015-01-06 20:00:55,446] INFO Session: 0x684a5ed9da3a1a0f closed (org.apache.zookeeper.ZooKeeper)
[2015-01-06 20:00:55,446] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)
[2015-01-06 20:00:55,447] INFO [Kafka Server 100], shut down completed (kafka.server.KafkaServer)
[2015-01-06 20:00:55,447] INFO [Kafka Server 100], shutting down (kafka.server.KafkaServer)

Any ideas?

On Tue, Jan 6, 2015 at 12:00 PM, Sa Li <sal...@gmail.com> wrote:

The complete error message:

-su: cannot create temp file for here-document: No space left on device
OpenJDK 64-Bit Server VM warning: Insufficient space for shared memory file: /tmp/hsperfdata_root/19721
Try using the -Djava.io.tmpdir= option to select an alternate temp location.
[2015-01-06 19:50:49,244] FATAL (kafka.Kafka$)
java.io.FileNotFoundException: conf (No such file or directory)
  at java.io.FileInputStream.open(Native Method)
  at java.io.FileInputStream.<init>(FileInputStream.java:146)
  at java.io.FileInputStream.<init>(FileInputStream.java:101)
  at kafka.utils.Utils$.loadProps(Utils.scala:144)
  at kafka.Kafka$.main(Kafka.scala:34)
  at kafka.Kafka.main(Kafka.scala)

On Tue, Jan 6, 2015 at 11:58 AM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I am running a performance test on our new Kafka production server. After sending some messages (even fake messages via bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance), connection errors appear and the brokers shut down. After that I see errors such as:

-su: cannot create temp file for here-document: No space left on device

How can I fix this? I am concerned this will happen once we start publishing real messages to Kafka. Should I create a cron job to regularly clean certain directories?

thanks

-- Alec Li

-- Alec Li

-- Alec Li
Re: no space left error
Thanks for the reply. The disk is not full:

root@exemplary-birds:~# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda2   133G  3.4G  123G     3%  /
none        4.0K     0  4.0K     0%  /sys/fs/cgroup
udev         32G  4.0K   32G     1%  /dev
tmpfs       6.3G  764K  6.3G     1%  /run
none        5.0M     0  5.0M     0%  /run/lock
none         32G     0   32G     0%  /run/shm
none        100M     0  100M     0%  /run/user
/dev/sdb1    14T   15G   14T     1%  /srv

Neither is the memory:

root@exemplary-birds:~# free
                 total      used      free  shared  buffers   cached
Mem:          65963372   9698380  56264992     776   170668  7863812
-/+ buffers/cache:       1663900  64299472
Swap:           997372         0    997372

thanks

On Tue, Jan 6, 2015 at 12:10 PM, David Birdsong <david.birds...@gmail.com> wrote:

I'm keen to hear how to work one's way out of a filled partition, since I've run into this many times after having tuned retention bytes or retention (time?) incorrectly. The proper path to resolving this isn't obvious from my many harried searches through the documentation. I often end up stopping the particular broker, picking an unlucky topic/partition, deleting it, modifying any topics that consumed too much space by lowering their retention bytes, and restarting.

On Tue, Jan 6, 2015 at 12:02 PM, Sa Li <sal...@gmail.com> wrote:

Continuing this issue: when I restart the server with

bin/kafka-server-start.sh config/server.properties

it fails to start:

[2015-01-06 20:00:55,441] FATAL Fatal error during KafkaServerStable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
  at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
  at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:188)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:165)
  at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
  at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
  at kafka.log.LogSegment.recover(LogSegment.scala:165)
  at kafka.log.Log.recoverLog(Log.scala:179)
  at kafka.log.Log.loadSegments(Log.scala:155)
  at kafka.log.Log.<init>(Log.scala:64)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
  at kafka.log.LogManager.loadLogs(LogManager.scala:105)
  at kafka.log.LogManager.<init>(LogManager.scala:57)
  at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
  at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
  at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
  at kafka.Kafka$.main(Kafka.scala:46)
  at kafka.Kafka.main(Kafka.scala)
[2015-01-06 20:00:55,443] INFO [Kafka Server 100], shutting down (kafka.server.KafkaServer)
[2015-01-06 20:00:55,444] INFO Terminate ZkClient event thread.

Any ideas?

On Tue, Jan 6, 2015 at 12:00 PM, Sa Li <sal...@gmail.com> wrote:

The complete error message:

-su: cannot create temp file for here-document: No space left on device
OpenJDK 64-Bit Server VM warning: Insufficient space for shared memory file: /tmp/hsperfdata_root/19721
Try using the -Djava.io.tmpdir= option to select an alternate temp location.
[2015-01-06 19:50:49,244] FATAL (kafka.Kafka$)
java.io.FileNotFoundException: conf (No such file or directory)
  at java.io.FileInputStream.open(Native Method)
  at java.io.FileInputStream.<init>(FileInputStream.java:146)
  at java.io.FileInputStream.<init>(FileInputStream.java:101
some connection errors happen in performance test
Hi, All

I am running a performance test on Kafka with the command

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test-rep-three 500 100 -1 acks=1 bootstrap.servers=10.100.10.101:9092 buffer.memory=67108864 batch.size=8196

We send 50 billion records to the brokers; it was mostly OK, but errors like this pop out periodically:

[2015-01-06 19:38:32,127] WARN Error in I/O with exemplary-birds.master/127.0.1.1 (org.apache.kafka.common.network.Selector)
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
  at org.apache.kafka.common.network.Selector.poll(Selector.java:232)
  at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
  at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
  at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
  at java.lang.Thread.run(Thread.java:745)

1950 records sent, 224.4 records/sec (0.02 MB/sec), 611.4 ms avg latency, 9259.0 max latency.
2899650 records sent, 579930.0 records/sec (55.31 MB/sec), 2399.5 ms avg latency, 9505.0 max latency.
3170219 records sent, 634043.8 records/sec (60.47 MB/sec), 568.7 ms avg latency, 1201.0 max latency.

And the errors seem to be happening more often. Our Kafka cluster is a three-node cluster.

thanks

-- Alec Li
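"Connection refused" means the TCP connect reached a machine but nothing was listening on that port at that moment; notably, the trace above shows the broker host name resolving to 127.0.1.1, a loopback-style address, which on Debian/Ubuntu often comes from the default /etc/hosts entry rather than the host's real IP. A simple reachability probe can separate "broker briefly down" from "wrong address"; a stdlib sketch (the broker address below is just this thread's example):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unresolvable
        return False

# Probe each broker address the producer is configured with.
for hp in ("10.100.10.101:9092",):
    host, port = hp.split(":")
    state = "reachable" if port_open(host, int(port)) else "refused/unreachable"
    print(hp, state)
```

Running this in a loop during the test would show whether the refusals line up with a broker restarting or with a bad hostname-to-IP mapping.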
config for consumer and producer
Hi, All

I am testing and making changes to server.properties, and I wonder whether I also need to change values in the consumer and producer properties files. Here is consumer.properties:

zookeeper.connect=10.100.98.100:2181,10.100.98.101:2181,10.100.98.102:2181
# timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=100
# consumer group id
group.id=test-consumer-group
# consumer timeout
#consumer.timeout.ms=5000

I use the defaults for most parameters. group.id is defined as "A string that uniquely identifies the group of consumer processes to which this consumer belongs. By setting the same group id, multiple processes indicate that they are all part of the same consumer group." Do I need to define many consumer groups here?

As for the producer: we are not using a Java client; a C# client sends messages to Kafka, so the producer properties shouldn't matter (except when I run producer tests locally), right?

producer.type=sync
compression.codec=none

Thanks

-- Alec Li
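On the group.id question: you only need multiple groups if multiple independent applications should each receive a full copy of the stream. Consumers sharing one group.id split a topic's partitions among themselves; different group ids each consume everything. A toy sketch of that partition-splitting idea (illustrative only, not Kafka's actual assignor code; names are made up):

```python
from collections import defaultdict

def assign(partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin a topic's partitions across the consumers in one group."""
    out = defaultdict(list)
    for p in range(partitions):
        out[consumers[p % len(consumers)]].append(p)
    return dict(out)

# 8 partitions (num.partitions=8 elsewhere in this archive) shared by 2 consumers:
print(assign(8, ["c1", "c2"]))
```

With one consumer in the group it would own all 8 partitions; a second group with its own group.id would independently read all 8 as well.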
java.io.IOException: Connection reset by peer
Hi, All

I am running a C# producer to send messages to Kafka (a 3-node cluster), but I get these errors:

[2015-01-06 16:09:51,143] ERROR Closing socket for /10.100.70.128 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
  at sun.nio.ch.IOUtil.read(IOUtil.java:197)
  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
  at kafka.utils.Utils$.read(Utils.scala:380)
  at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
  at kafka.network.Processor.read(SocketServer.scala:444)
  at kafka.network.Processor.run(SocketServer.scala:340)
  at java.lang.Thread.run(Thread.java:745)
[2015-01-06 16:09:51,144] ERROR Closing socket for /10.100.70.128 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
  (identical stack trace)
[2015-01-06 16:09:56,138] INFO Closing socket connection to /10.100.70.28. (kafka.network.Processor)
[2015-01-06 16:10:07,685] INFO Closing socket connection to /10.100.70.28. (kafka.network.Processor)
[2015-01-06 16:10:31,423] INFO Closing socket connection to /10.100.70.28. (kafka.network.Processor)
[2015-01-06 16:11:08,077] INFO Closing socket connection to /10.100.70.28. (kafka.network.Processor)
[2015-01-06 16:11:43,990] INFO Closing socket connection to /10.100.70.28. (kafka.network.Processor)
[2015-01-06 16:12:24,168] INFO Closing socket connection to /10.100.70.128. (kafka.network.Processor)

But I do see messages arriving on the brokers. Any ideas?

thanks

-- Alec Li
Re: messages lost
Hi, experts

Again, we are still losing data: we send 5000 records but find only 4500 records on the brokers. We did set required.acks to -1 to make sure all brokers ack; that only added long latency, it did not cure the data loss.

thanks

On Mon, Jan 5, 2015 at 9:55 AM, Xiaoyu Wang <xw...@rocketfuel.com> wrote:

@Sa, required.acks is a producer-side configuration. Setting it to -1 means requiring acks from all brokers.

On Fri, Jan 2, 2015 at 1:51 PM, Sa Li <sal...@gmail.com> wrote:

Thanks a lot, Tim. This is the brokers' config:

--
broker.id=1
port=9092
host.name=10.100.70.128
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
auto.leader.rebalance.enable=true
auto.create.topics.enable=true
default.replication.factor=3
log.dirs=/tmp/kafka-logs-1
num.partitions=8
log.flush.interval.messages=1
log.flush.interval.ms=1000
log.retention.hours=168
log.segment.bytes=536870912
log.cleanup.interval.mins=1
zookeeper.connect=10.100.70.128:2181,10.100.70.28:2181,10.100.70.29:2181
zookeeper.connection.timeout.ms=100
---

We actually played around with request.required.acks in the producer config: -1 causes long latency, and 1 is the setting under which messages are lost. But I am not sure whether this is the reason we lose records.

thanks

AL

On Fri, Jan 2, 2015 at 9:59 AM, Timothy Chen <tnac...@gmail.com> wrote:

What's your configured required.acks? And also, are you waiting for all your messages to be acknowledged? The new producer returns futures, but you still need to wait for the futures to complete.

Tim

On Fri, Jan 2, 2015 at 9:54 AM, Sa Li <sal...@gmail.com> wrote:

Hi, all

We are sending messages from a producer: we send 10 records but see only 99573 records for that topic (we confirm this by consuming the topic and checking the log size in the Kafka web console). Any ideas on the message loss? What could cause it?

thanks

-- Alec Li

-- Alec Li

-- Alec Li
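Tim's point in the quoted thread is easy to miss: with the new producer, each send returns a future, and records not yet acknowledged are silently dropped if the process exits without waiting on those futures, regardless of the acks setting. The shape of the fix, sketched with Python's stdlib futures rather than a real Kafka client (`send_record` is a stand-in, not an actual producer API):

```python
from concurrent.futures import ThreadPoolExecutor

def send_record(record: str) -> str:
    """Stand-in for an async producer send; a real client returns a future per send."""
    return f"acked:{record}"

records = [f"msg-{i}" for i in range(100)]
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(send_record, r) for r in records]
    # The crucial step: block until every send is acknowledged (or raises)
    # before declaring the batch done -- otherwise in-flight records are lost.
    acked = [f.result() for f in futures]

print(f"{len(acked)} of {len(records)} records acknowledged")
```

With a real Kafka producer the same principle applies: keep the returned futures and call their blocking get/result (or flush) before shutting down, and treat a raised exception as a record that needs retrying.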
Re: no space left error
BTW, I found that /kafka/logs is also getting bigger and bigger, e.g. controller.log and state-change.log. Should I launch a cron job to clean them up regularly, or is there a way to have them deleted automatically?

thanks

AL

On Tue, Jan 6, 2015 at 2:01 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

We fixed the problem; I'd like to share what it was in case someone comes across a similar issue. We added a data drive /dev/sdb1 on each node but specified the wrong path in server.properties, which means the data was written to the wrong drive, /dev/sda2, and quickly ate up all the space on sda2; we have now changed the path. sdb1 has 15 TB, which lets us store data for a while, and data will be deleted after 1-2 weeks as the config specifies.

But I am curious about David's comment: "... after having tuned retention bytes or retention (time?) incorrectly ..." How do you set log.retention.bytes? I set log.retention.hours=336 (2 weeks); should I leave log.retention.bytes at its default of -1, or set some other amount?

thanks

AL

On Tue, Jan 6, 2015 at 12:43 PM, Sa Li <sal...@gmail.com> wrote:

Thanks for the reply. The disk is not full:

root@exemplary-birds:~# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda2   133G  3.4G  123G     3%  /
none        4.0K     0  4.0K     0%  /sys/fs/cgroup
udev         32G  4.0K   32G     1%  /dev
tmpfs       6.3G  764K  6.3G     1%  /run
none        5.0M     0  5.0M     0%  /run/lock
none         32G     0   32G     0%  /run/shm
none        100M     0  100M     0%  /run/user
/dev/sdb1    14T   15G   14T     1%  /srv

Neither is the memory:

root@exemplary-birds:~# free
                 total      used      free  shared  buffers   cached
Mem:          65963372   9698380  56264992     776   170668  7863812
-/+ buffers/cache:       1663900  64299472
Swap:           997372         0    997372

thanks

On Tue, Jan 6, 2015 at 12:10 PM, David Birdsong <david.birds...@gmail.com> wrote:

I'm keen to hear how to work one's way out of a filled partition, since I've run into this many times after having tuned retention bytes or retention (time?) incorrectly. The proper path to resolving this isn't obvious from my many harried searches through the documentation. I often end up stopping the particular broker, picking an unlucky topic/partition, deleting it, modifying any topics that consumed too much space by lowering their retention bytes, and restarting.

On Tue, Jan 6, 2015 at 12:02 PM, Sa Li <sal...@gmail.com> wrote:

Continuing this issue: when I restart the server with

bin/kafka-server-start.sh config/server.properties

it fails to start:

[2015-01-06 20:00:55,441] FATAL Fatal error during KafkaServerStable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
  at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
  at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:188)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:165)
  at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
  at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
  at kafka.log.LogSegment.recover(LogSegment.scala:165)
  at kafka.log.Log.recoverLog(Log.scala:179)
  at kafka.log.Log.loadSegments(Log.scala:155)
  at kafka.log.Log.<init>(Log.scala:64)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
  at kafka.log.LogManager.loadLogs(LogManager.scala:105)
  at kafka.log.LogManager.<init>(LogManager.scala:57)
  at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
  at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
  at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
  at kafka.Kafka$.main(Kafka.scala:46)
  at kafka.Kafka.main(Kafka.scala)
[2015-01-06 20:00:55,443] INFO [Kafka Server 100], shutting down (kafka.server.KafkaServer)
[2015-01-06 20:00:55,444] INFO Terminate ZkClient event thread
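On the retention question above: log.retention.hours and log.retention.bytes are independent limits, and whichever is hit first triggers segment deletion; -1 for bytes means no size cap, which is safe only if time-based retention keeps usage below the disk size. A back-of-envelope check per partition (the rates below are made-up examples, not measurements from this cluster):

```python
def retention_bytes_needed(msgs_per_sec: float, msg_bytes: int, hours: int) -> int:
    """Bytes one partition accumulates before time-based retention starts deleting."""
    return int(msgs_per_sec * msg_bytes * 3600 * hours)

# e.g. 1000 msg/s of 500-byte messages kept for 336 h (2 weeks):
need = retention_bytes_needed(1000, 500, 336)
print(f"{need / 1024**4:.2f} TiB per partition")
```

If that figure times the number of partition replicas on a broker approaches the data drive's capacity, a size cap via log.retention.bytes (which is per partition) acts as a safety net before the disk fills.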
Re: no space left error
Hi, All

We fixed the problem; I'd like to share what it was in case someone comes across a similar issue. We added a data drive /dev/sdb1 on each node but specified the wrong path in server.properties, which means the data was written to the wrong drive, /dev/sda2, and quickly ate up all the space on sda2; we have now changed the path. sdb1 has 15 TB, which lets us store data for a while, and data will be deleted after 1-2 weeks as the config specifies.

But I am curious about David's comment: "... after having tuned retention bytes or retention (time?) incorrectly ..." How do you set log.retention.bytes? I set log.retention.hours=336 (2 weeks); should I leave log.retention.bytes at its default of -1, or set some other amount?

thanks

AL

On Tue, Jan 6, 2015 at 12:43 PM, Sa Li <sal...@gmail.com> wrote:

Thanks for the reply. The disk is not full:

root@exemplary-birds:~# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda2   133G  3.4G  123G     3%  /
none        4.0K     0  4.0K     0%  /sys/fs/cgroup
udev         32G  4.0K   32G     1%  /dev
tmpfs       6.3G  764K  6.3G     1%  /run
none        5.0M     0  5.0M     0%  /run/lock
none         32G     0   32G     0%  /run/shm
none        100M     0  100M     0%  /run/user
/dev/sdb1    14T   15G   14T     1%  /srv

Neither is the memory:

root@exemplary-birds:~# free
                 total      used      free  shared  buffers   cached
Mem:          65963372   9698380  56264992     776   170668  7863812
-/+ buffers/cache:       1663900  64299472
Swap:           997372         0    997372

thanks

On Tue, Jan 6, 2015 at 12:10 PM, David Birdsong <david.birds...@gmail.com> wrote:

I'm keen to hear how to work one's way out of a filled partition, since I've run into this many times after having tuned retention bytes or retention (time?) incorrectly. The proper path to resolving this isn't obvious from my many harried searches through the documentation. I often end up stopping the particular broker, picking an unlucky topic/partition, deleting it, modifying any topics that consumed too much space by lowering their retention bytes, and restarting.

On Tue, Jan 6, 2015 at 12:02 PM, Sa Li <sal...@gmail.com> wrote:

Continuing this issue: when I restart the server with

bin/kafka-server-start.sh config/server.properties

it fails to start:

[2015-01-06 20:00:55,441] FATAL Fatal error during KafkaServerStable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
  at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
  at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:188)
  at kafka.log.FileMessageSet$$anon$1.makeNext(FileMessageSet.scala:165)
  at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
  at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
  at kafka.log.LogSegment.recover(LogSegment.scala:165)
  at kafka.log.Log.recoverLog(Log.scala:179)
  at kafka.log.Log.loadSegments(Log.scala:155)
  at kafka.log.Log.<init>(Log.scala:64)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
  at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
  at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
  at kafka.log.LogManager.loadLogs(LogManager.scala:105)
  at kafka.log.LogManager.<init>(LogManager.scala:57)
  at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
  at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
  at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
  at kafka.Kafka$.main(Kafka.scala:46)
  at kafka.Kafka.main(Kafka.scala)
[2015-01-06 20:00:55,443] INFO [Kafka Server 100], shutting down (kafka.server.KafkaServer)
[2015-01-06 20:00:55,444] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2015-01-06 20:00:55,446] INFO Session: 0x684a5ed9da3a1a0f closed (org.apache.zookeeper.ZooKeeper)
[2015-01-06 20:00:55,446] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)
[2015-01-06 20:00:55,447] INFO [Kafka Server 100], shut down completed
Re: no space left error
The complete error message:

-su: cannot create temp file for here-document: No space left on device
OpenJDK 64-Bit Server VM warning: Insufficient space for shared memory file: /tmp/hsperfdata_root/19721
Try using the -Djava.io.tmpdir= option to select an alternate temp location.
[2015-01-06 19:50:49,244] FATAL (kafka.Kafka$)
java.io.FileNotFoundException: conf (No such file or directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:146)
        at java.io.FileInputStream.<init>(FileInputStream.java:101)
        at kafka.utils.Utils$.loadProps(Utils.scala:144)
        at kafka.Kafka$.main(Kafka.scala:34)
        at kafka.Kafka.main(Kafka.scala)

On Tue, Jan 6, 2015 at 11:58 AM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I am doing a performance test on our new kafka production server, but after sending some messages (even fake messages via bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance), it comes out with a connection error and shuts down the brokers. After that, I see such errors:

conf-su: cannot create temp file for here-document: No space left on device

How can I fix it? I am concerned this will happen again when we start to publish real messages to kafka; should I create a cron job to regularly clean certain directories?

thanks

-- Alec Li

-- Alec Li
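The errors in this thread all trace back to the filesystem holding /tmp filling up. As a minimal sketch (not part of Kafka; `check_free_space` is a hypothetical helper), you can verify free space on the log and temp directories before starting the broker:

```python
import shutil

def check_free_space(path, min_free_bytes):
    """Return (free_bytes, ok) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free, usage.free >= min_free_bytes

# Example: require at least 1 GB free on /tmp before launching Kafka.
free, ok = check_free_space("/tmp", 1 * 1024**3)
if not ok:
    print("WARNING: less than 1 GB free on /tmp; Kafka may fail to start")
```

A cron job that deletes old files is a workaround; the more durable fix (discussed later in this archive) is pointing log.dirs and java.io.tmpdir at a partition with enough space.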
Re: messages lost
Thanks a lot, Tim. This is the broker config:

--
broker.id=1
port=9092
host.name=10.100.70.128
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
auto.leader.rebalance.enable=true
auto.create.topics.enable=true
default.replication.factor=3
log.dirs=/tmp/kafka-logs-1
num.partitions=8
log.flush.interval.messages=1
log.flush.interval.ms=1000
log.retention.hours=168
log.segment.bytes=536870912
log.cleanup.interval.mins=1
zookeeper.connect=10.100.70.128:2181,10.100.70.28:2181,10.100.70.29:2181
zookeeper.connection.timeout.ms=100
---

We actually played around with request.required.acks in the producer config: -1 causes long latency, and 1 is the setting that loses messages. But I am not sure this is the reason the records are lost.

thanks

AL

On Fri, Jan 2, 2015 at 9:59 AM, Timothy Chen <tnac...@gmail.com> wrote:

What's your configured required.acks? And also, are you waiting for all your messages to be acknowledged as well? The new producer returns futures back, but you still need to wait for the futures to complete.

Tim

On Fri, Jan 2, 2015 at 9:54 AM, Sa Li <sal...@gmail.com> wrote:

Hi, all

We are sending messages from a producer; we send 10 records, but we see only 99573 records for that topic. We confirmed this by consuming the topic and checking the log size in the kafka web console. Any ideas why messages are lost, and what could cause this?

thanks

-- Alec Li

-- Alec Li
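Tim's point is that the new producer is asynchronous: a send that has not been awaited can be silently dropped when the process exits. A minimal Python sketch of the "wait for every future" pattern, written against a duck-typed producer (`send_and_wait` is an illustrative helper, not Kafka API; with kafka-python you would pass a real KafkaProducer, whose `send()` also returns an object with `.get(timeout=...)`):

```python
def send_and_wait(producer, topic, records, timeout=30):
    """Send every record, then block until each future resolves.

    `producer.send(topic, value)` must return a future-like object
    exposing .get(timeout=...), as kafka-python's FutureRecordMetadata does.
    Returns the number of acknowledged records; .get() raises if a
    record was never acknowledged, so losses surface as exceptions.
    """
    futures = [producer.send(topic, value=r) for r in records]
    acked = 0
    for f in futures:
        f.get(timeout=timeout)  # raises on broker error or timeout
        acked += 1
    return acked
```

With a real kafka-python producer you would also call `producer.flush()` before closing, so buffered batches are pushed out rather than discarded.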
messages lost
Hi, all

We are sending messages from a producer; we send 10 records, but we see only 99573 records for that topic. We confirmed this by consuming the topic and checking the log size in the kafka web console. Any ideas why messages are lost, and what could cause this?

thanks

-- Alec Li
Re: kafka logs gone after reboot the server
One more question: when I set log.dirs on the different nodes in the cluster, should I give each a different name, say kafka-logs-1 associated with the broker id, or can I use the same directory name, like /var/log/kafka, on every node (assuming one broker per server)?

thanks

On Fri, Jan 2, 2015 at 2:20 PM, Sa Li <sal...@gmail.com> wrote:

Thanks a lot!

On Fri, Jan 2, 2015 at 12:15 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

Nice catch Joe--several people have complained about this as a problem and we were a bit mystified as to what kind of bug could lead to all their logs getting deleted and re-replicated when they bounced the server. We assumed bounced meant restarted the app, but I think likely what is happening is what you describe--the logs were in /tmp and bouncing the server meant restarting.

-Jay

On Fri, Jan 2, 2015 at 11:02 AM, Joe Stein <joe.st...@stealth.ly> wrote:

That is because your logs are in /tmp, which you can change by setting log.dirs to something else.

Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop

On Fri, Jan 2, 2015 at 1:58 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I've just noticed one thing: when I experience some errors on the Kafka servers, I reboot the dev servers (not a good way). After reboot I get into zkCli and can see all the topics still exist, but when I get into the kafka log directory, I found all the data gone, see

root@DO-mq-dev:/tmp/kafka-logs-1/ui_test_topic_4-0# ll
total 8
drwxr-xr-x  2 root root     4096 Jan  2 09:39 ./
drwxr-xr-x 46 root root     4096 Jan  2 10:46 ../
-rw-r--r--  1 root root 10485760 Jan  2 09:39 .index
-rw-r--r--  1 root root        0 Jan  2 09:39 .log

I wonder: if for some reason the server goes down and is restarted, will all the data on the hard drive be gone?

thanks

-- Alec Li

-- Alec Li

-- Alec Li
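Following Joe's advice, the key change is moving log.dirs out of /tmp; the same path is fine on every node, since each broker only ever reads its own local disk. A sketch of the relevant server.properties lines (the path below is just an example):

```
# server.properties on each broker -- same log path can be used cluster-wide
broker.id=1                    # must be unique per broker
log.dirs=/var/lib/kafka-logs   # any persistent disk outside /tmp
```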
kafka-web-console error
Hi, all

I am running kafka-web-console; I periodically get the following error, which brings the UI down:

! @6kldaf9lj - Internal server error, for (GET) [/assets/images/zookeeper_small.gif] -
play.api.Application$$anon$1: Execution exception[[FileNotFoundException: /vagrant/kafka-web-console-master/target/scala-2.10/classes/public/images/zookeeper_small.gif (Too many open files)]]
        at play.api.Application$class.handleError(Application.scala:293) ~[play_2.10-2.2.1.jar:2.2.1]
        at play.api.DefaultApplication.handleError(Application.scala:399) [play_2.10-2.2.1.jar:2.2.1]
        at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$12$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:165) [play_2.10-2.2.1.jar:2.2.1]
        at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$12$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:162) [play_2.10-2.2.1.jar:2.2.1]
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33) [scala-library-2.10.2.jar:na]
        at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:185) [scala-library-2.10.2.jar:na]
Caused by: java.io.FileNotFoundException: /vagrant/kafka-web-console-master/target/scala-2.10/classes/public/images/zookeeper_small.gif (Too many open files)
        at java.io.FileInputStream.open(Native Method) ~[na:1.7.0_65]
        at java.io.FileInputStream.<init>(FileInputStream.java:146) ~[na:1.7.0_65]
        at java.io.FileInputStream.<init>(FileInputStream.java:101) ~[na:1.7.0_65]
        at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90) ~[na:1.7.0_65]
        at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188) ~[na:1.7.0_65]
        at java.net.URL.openStream(URL.java:1037) ~[na:1.7.0_65]
[debug] application - Getting partition offsets for topic ui_test_topic_6
[warn] application - Could not connect to partition leader exemplary-birds.master:9092. Error message: Failed to open a socket.
[debug] application - Getting partition offsets for topic ui_test_topic_5
[warn] application - Could not connect to partition leader exemplary-birds.master:9092. Error message: Failed to open a socket.
[warn] application - Could not connect to partition leader harmful-jar.master:9092. Error message: Failed to open a socket.
[warn] application - Could not connect to partition leader voluminous-mass.master:9092. Error message: Failed to open a socket.
(the same "Could not connect to partition leader ... Failed to open a socket" warning repeats for each of the three leaders on every partition)
[debug] application - Getting partition offsets for topic PofApiTest
[warn] application - Could not connect to partition leader harmful-jar.master:9092. Error message: Failed to open a socket.
[warn] application - Could not connect to partition leader voluminous-mass.master:9092. Error message: Failed to open a socket.
[warn] application - Could not connect to partition leader exemplary-birds.master:9092. Error message: Failed to open a socket.
[debug] application - Getting partition offsets for topic ui_test_topic_4

What does "Too many open files" mean — does it indicate insufficient local memory?

thanks

-- Alec Li
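"Too many open files" is not about memory: it means the process hit its file-descriptor limit (the same limit `ulimit -n` shows), with every open file and socket counting against it. A minimal Python sketch to inspect that limit on Linux/macOS:

```python
import resource

# RLIMIT_NOFILE is the per-process cap on open file descriptors.
# Soft limit: enforced now; hard limit: ceiling the soft limit may be raised to.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")

# A long-running broker or web console typically needs a much higher soft
# limit; raise it via `ulimit -n` / limits.conf (or resource.setrlimit)
# before starting the process.
```

Leaked descriptors (here, the console repeatedly opening sockets/files without closing them) exhaust the soft limit regardless of how much memory is free.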
Re: kafka logs gone after reboot the server
Thanks a lot!

On Fri, Jan 2, 2015 at 12:15 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

Nice catch Joe--several people have complained about this as a problem and we were a bit mystified as to what kind of bug could lead to all their logs getting deleted and re-replicated when they bounced the server. We assumed bounced meant restarted the app, but I think likely what is happening is what you describe--the logs were in /tmp and bouncing the server meant restarting.

-Jay

On Fri, Jan 2, 2015 at 11:02 AM, Joe Stein <joe.st...@stealth.ly> wrote:

That is because your logs are in /tmp, which you can change by setting log.dirs to something else.

Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop

On Fri, Jan 2, 2015 at 1:58 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I've just noticed one thing: when I experience some errors on the Kafka servers, I reboot the dev servers (not a good way). After reboot I get into zkCli and can see all the topics still exist, but when I get into the kafka log directory, I found all the data gone, see

root@DO-mq-dev:/tmp/kafka-logs-1/ui_test_topic_4-0# ll
total 8
drwxr-xr-x  2 root root     4096 Jan  2 09:39 ./
drwxr-xr-x 46 root root     4096 Jan  2 10:46 ../
-rw-r--r--  1 root root 10485760 Jan  2 09:39 .index
-rw-r--r--  1 root root        0 Jan  2 09:39 .log

I wonder: if for some reason the server goes down and is restarted, will all the data on the hard drive be gone?

thanks

-- Alec Li

-- Alec Li
kafka logs gone after reboot the server
Hi, All

I've just noticed one thing: when I experience some errors on the Kafka servers, I reboot the dev servers (not a good way). After reboot I get into zkCli and can see all the topics still exist, but when I get into the kafka log directory, I found all the data gone, see

root@DO-mq-dev:/tmp/kafka-logs-1/ui_test_topic_4-0# ll
total 8
drwxr-xr-x  2 root root     4096 Jan  2 09:39 ./
drwxr-xr-x 46 root root     4096 Jan  2 10:46 ../
-rw-r--r--  1 root root 10485760 Jan  2 09:39 .index
-rw-r--r--  1 root root        0 Jan  2 09:39 .log

I wonder: if for some reason the server goes down and is restarted, will all the data on the hard drive be gone?

thanks

-- Alec Li
auto.create.topics.enable in config file
Hi, all

I added auto.create.topics.enable=true to the server.properties file, but I get this error when I start the kafka server:

java.lang.IllegalArgumentException: requirement failed: Unacceptable value for property 'auto.create.topics.enable', boolean values must be either 'true' or 'false

Any clue for this?

thanks

-- Alec Li
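Kafka's old property parser accepts only the exact lowercase strings "true" and "false", so a stray character (a trailing space, a smart quote from copy-paste, or "True") triggers this error even when the value looks right. A Python sketch of an equivalently strict check you could run over a properties file before starting the broker (`strict_bool` and `check_boolean_props` are hypothetical helpers mimicking that behavior):

```python
def strict_bool(value):
    """Accept only the exact strings 'true'/'false', like Kafka's parser."""
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"Unacceptable boolean value: {value!r}")

def check_boolean_props(lines, keys):
    """Yield (key, error) for boolean properties that would fail to parse."""
    for line in lines:
        if "=" not in line or line.lstrip().startswith("#"):
            continue
        key, _, raw = line.partition("=")
        if key.strip() in keys:
            try:
                strict_bool(raw.rstrip("\n"))  # no .strip(): spaces count
            except ValueError as e:
                yield key.strip(), str(e)

bad = list(check_boolean_props(
    ["auto.create.topics.enable=true "],   # trailing space -> rejected
    {"auto.create.topics.enable"},
))
```

In the example above the single trailing space after "true" is enough to make the property invalid, which is the most common cause of this exception.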
the impact of partition number
Hi, All

I've run bin/kafka-producer-perf-test.sh on our kafka production cluster, and I found the number of partitions really has a huge impact on producer performance, see:

start.time, end.time, compression, message.size, batch.size, total.data.sent.in.MB, MB.sec, total.data.sent.in.nMsg, nMsg.sec
2014-12-22 19:53:27:392, 2014-12-22 19:54:25:581, 1, 3000, 200, 2861.02, 49.1678, 100, 17185.3787
2014-12-22 19:55:27:048, 2014-12-22 19:56:23:318, 1, 3000, 200, 2861.02, 50.8446, 100, 17771.4590
2014-12-22 19:58:09:466, 2014-12-22 19:59:05:068, 1, 3000, 200, 2861.02, 51.4554, 100, 17984.9646
2014-12-22 19:59:40:389, 2014-12-22 20:00:28:646, 1, 3000, 200, 2861.02, 59.2872, 100, 20722.3822
2014-12-22 20:02:41:993, 2014-12-22 20:03:22:481, 1, 3000, 200, 2861.02, 70.6635, 100, 24698.6762
2014-12-22 20:03:47:594, 2014-12-22 20:04:26:238, 1, 3000, 200, 2861.02, 74.0354, 100, 25877.2384
2014-12-22 20:11:49:492, 2014-12-22 20:12:25:843, 1, 3000, 200, 2861.02, 78.7055, 100, 27509.5596
2014-12-22 20:12:53:290, 2014-12-22 20:13:29:746, 1, 3000, 200, 2861.02, 78.4788, 100, 27430.3270
2014-12-22 20:13:53:194, 2014-12-22 20:14:29:470, 1, 3000, 200, 2861.02, 78.8682, 100, 27566.4351
2014-12-22 20:14:51:491, 2014-12-22 20:15:25:451, 1, 3000, 200, 2861.02, 84.2468, 100, 29446.4075
2014-12-22 20:16:51:369, 2014-12-22 20:17:27:452, 1, 3000, 200, 2861.02, 79.2901, 100, 27713.8819
2014-12-22 20:17:57:882, 2014-12-22 20:18:33:957, 1, 3000, 200, 2861.02, 79.3076, 100, 27720.0277

The rows above use 1 to 12 partitions respectively; why is there such a big difference?

thanks

-- Alec Li
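The size of the effect can be read straight off the nMsg.sec column. A small Python sketch, using the rates from the perf-test output above and assuming the rows correspond to 1..12 partitions in order, that computes the speedup of the best run over the single-partition run:

```python
rows = [
    # (partitions, nMsg.sec) -- values taken from the perf-test output above,
    # assuming the rows were run with 1..12 partitions in order
    (1, 17185.3787), (2, 17771.4590), (3, 17984.9646), (4, 20722.3822),
    (5, 24698.6762), (6, 25877.2384), (7, 27509.5596), (8, 27430.3270),
    (9, 27566.4351), (10, 29446.4075), (11, 27713.8819), (12, 27720.0277),
]

base = rows[0][1]                                  # 1-partition throughput
best_partitions, best_rate = max(rows, key=lambda r: r[1])
speedup = best_rate / base
print(f"best: {best_partitions} partitions at {best_rate:.0f} msg/s "
      f"({speedup:.2f}x the single-partition rate)")
```

The gain flattens out past roughly the broker/disk parallelism of the cluster, which is consistent with the rates plateauing around 27-29k msg/s from 7 partitions onward.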
leader and isr were not set when create the topic
Hi, All

I created a topic with replication factor 3 and 6 partitions, but when I describe this topic, it seems no leader and no isr were set for it, see

bin/kafka-topics.sh --create --zookeeper 10.100.98.100:2181 --replication-factor 3 --partitions 6 --topic perf_producer_p6_test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Created topic perf_producer_p6_test.
root@precise64:/etc/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.100:2181 --topic perf_producer_p6_test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:perf_producer_p6_test  PartitionCount:6  ReplicationFactor:3  Configs:
    Topic: perf_producer_p6_test  Partition: 0  Leader: none  Replicas: 100,101,102  Isr:
    Topic: perf_producer_p6_test  Partition: 1  Leader: none  Replicas: 101,102,100  Isr:
    Topic: perf_producer_p6_test  Partition: 2  Leader: none  Replicas: 102,100,101  Isr:
    Topic: perf_producer_p6_test  Partition: 3  Leader: none  Replicas: 100,102,101  Isr:
    Topic: perf_producer_p6_test  Partition: 4  Leader: none  Replicas: 101,100,102  Isr:
    Topic: perf_producer_p6_test  Partition: 5  Leader: none  Replicas: 102,101,100  Isr:

Is there a way to explicitly set the leader and isr from the command line? It is strange that when I create a topic with 5 partitions, it does have a leader and isr:

root@precise64:/etc/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.100:2181 --topic perf_producer_p5_test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:perf_producer_p5_test  PartitionCount:5  ReplicationFactor:3  Configs:
    Topic: perf_producer_p5_test  Partition: 0  Leader: 102  Replicas: 102,100,101  Isr: 102,100,101
    Topic: perf_producer_p5_test  Partition: 1  Leader: 102  Replicas: 100,101,102  Isr: 102,101
    Topic: perf_producer_p5_test  Partition: 2  Leader: 101  Replicas: 101,102,100  Isr: 101,102,100
    Topic: perf_producer_p5_test  Partition: 3  Leader: 102  Replicas: 102,101,100  Isr: 102,101,100
    Topic: perf_producer_p5_test  Partition: 4  Leader: 102  Replicas: 100,102,101  Isr: 102,101

Any ideas?

thanks

-- Alec Li
Re: leader and isr were not set when create the topic
I restarted the kafka server and it is the same thing; sometimes nothing is listed for ISR or leader. I checked the state-change log:

[2014-12-22 23:46:38,164] TRACE Broker 100 cached leader info (LeaderAndIsrInfo:(Leader:101,ISR:101,102,100,LeaderEpoch:0,ControllerEpoch:4),ReplicationFactor:3),AllReplicas:101,102,100) for partition [perf_producer_p8_test,1] in response to UpdateMetadata request sent by controller 101 epoch 4 with correlation id 138 (state.change.logger)

On Mon, Dec 22, 2014 at 2:46 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I created a topic with replication factor 3 and 6 partitions, but when I describe this topic, it seems no leader and no isr were set for it, see

bin/kafka-topics.sh --create --zookeeper 10.100.98.100:2181 --replication-factor 3 --partitions 6 --topic perf_producer_p6_test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Created topic perf_producer_p6_test.
root@precise64:/etc/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.100:2181 --topic perf_producer_p6_test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:perf_producer_p6_test  PartitionCount:6  ReplicationFactor:3  Configs:
    Topic: perf_producer_p6_test  Partition: 0  Leader: none  Replicas: 100,101,102  Isr:
    Topic: perf_producer_p6_test  Partition: 1  Leader: none  Replicas: 101,102,100  Isr:
    Topic: perf_producer_p6_test  Partition: 2  Leader: none  Replicas: 102,100,101  Isr:
    Topic: perf_producer_p6_test  Partition: 3  Leader: none  Replicas: 100,102,101  Isr:
    Topic: perf_producer_p6_test  Partition: 4  Leader: none  Replicas: 101,100,102  Isr:
    Topic: perf_producer_p6_test  Partition: 5  Leader: none  Replicas: 102,101,100  Isr:

Is there a way to explicitly set the leader and isr from the command line? It is strange that when I create a topic with 5 partitions, it does have a leader and isr:

root@precise64:/etc/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.100:2181 --topic perf_producer_p5_test
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:perf_producer_p5_test  PartitionCount:5  ReplicationFactor:3  Configs:
    Topic: perf_producer_p5_test  Partition: 0  Leader: 102  Replicas: 102,100,101  Isr: 102,100,101
    Topic: perf_producer_p5_test  Partition: 1  Leader: 102  Replicas: 100,101,102  Isr: 102,101
    Topic: perf_producer_p5_test  Partition: 2  Leader: 101  Replicas: 101,102,100  Isr: 101,102,100
    Topic: perf_producer_p5_test  Partition: 3  Leader: 102  Replicas: 102,101,100  Isr: 102,101,100
    Topic: perf_producer_p5_test  Partition: 4  Leader: 102  Replicas: 100,102,101  Isr: 102,101

Any ideas?

thanks

-- Alec Li

-- Alec Li
Re: leader and isr were not set when create the topic
Hello, Neha This is the error from server.log [2014-12-22 23:53:25,663] WARN [KafkaApi-100] Fetch request with correlation id 1227732 from client ReplicaFetcherThread-0-100 on partition [perf_producer_p8_test,1] failed due to Leader not local for partition [perf_producer_p8_test,1] on broker 100 (kafka.server.KafkaApis) On Mon, Dec 22, 2014 at 3:50 PM, Sa Li sal...@gmail.com wrote: I restart the kafka server, it is the same thing, sometime nothing listed on ISR, leader, I checked the state-change log [2014-12-22 23:46:38,164] TRACE Broker 100 cached leader info (LeaderAndIsrInfo:(Leader:101,ISR:101,102,100,LeaderEpoch:0,ControllerEpoch:4),ReplicationFactor:3),AllReplicas:101,102,100) for partition [perf_producer_p8_test,1] in response to UpdateMetadata request sent by controller 101 epoch 4 with correlation id 138 (state.change.logger) On Mon, Dec 22, 2014 at 2:46 PM, Sa Li sal...@gmail.com wrote: Hi, All I created a topic with 3 replications and 6 partitions, but when I check this topic, seems there is no leader and isr were set for this topic, see bin/kafka-topics.sh --create --zookeeper 10.100.98.100:2181 --replication-factor 3 --partitions 6 --topic perf_producer_p6_test SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Created topic perf_producer_p6_test. root@precise64:/etc/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.100:2181 --topic perf_producer_p6_test SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Topic:perf_producer_p6_test PartitionCount:6 ReplicationFactor:3 Configs: Topic: perf_producer_p6_testPartition: 0Leader: none Replicas: 100,101,102 Isr: Topic: perf_producer_p6_testPartition: 1Leader: none Replicas: 101,102,100 Isr: Topic: perf_producer_p6_testPartition: 2Leader: none Replicas: 102,100,101 Isr: Topic: perf_producer_p6_testPartition: 3Leader: none Replicas: 100,102,101 Isr: Topic: perf_producer_p6_testPartition: 4Leader: none Replicas: 101,100,102 Isr: Topic: perf_producer_p6_testPartition: 5Leader: none Replicas: 102,101,100 Isr: Is there a way to specifically set leader and isr in command line, it is strange when I create the topic with 5 partitions, it has leader and isr: root@precise64:/etc/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.100:2181 --topic perf_producer_p5_test SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Topic:perf_producer_p5_test PartitionCount:5 ReplicationFactor:3 Configs: Topic: perf_producer_p5_testPartition: 0Leader: 102 Replicas: 102,100,101 Isr: 102,100,101 Topic: perf_producer_p5_testPartition: 1Leader: 102 Replicas: 100,101,102 Isr: 102,101 Topic: perf_producer_p5_testPartition: 2Leader: 101 Replicas: 101,102,100 Isr: 101,102,100 Topic: perf_producer_p5_testPartition: 3Leader: 102 Replicas: 102,101,100 Isr: 102,101,100 Topic: perf_producer_p5_testPartition: 4Leader: 102 Replicas: 100,102,101 Isr: 102,101 Any ideas? thanks -- Alec Li -- Alec Li -- Alec Li
Re: leader and isr were not set when create the topic
I have three nodes: 100, 101, and 102. When I restarted all of them, everything seems OK now, but I would like to paste the error messages I got from server.log on each node, in case you can help me understand what the problem was.

On node 100:

[2014-12-23 00:04:39,401] ERROR [KafkaApi-100] Error when processing fetch request for partition [perf_producer_p8_test,7] offset 125000 from follower with correlation id 0 (kafka.server.KafkaApis)
kafka.common.OffsetOutOfRangeException: Request for offset 125000 but we only have log segments in the range 0 to 0.
        at kafka.log.Log.read(Log.scala:380)
        at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.scala:530)
        at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:476)
        at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:471)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.immutable.Map$Map3.foreach(Map.scala:154)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.AbstractTraversable.map(Traversable.scala:105)
        ..
        ..

On nodes 101 and 102:

[2014-12-23 00:04:39,440] ERROR [ReplicaFetcherThread-0-100], Current offset 125000 for partition [perf_producer_p8_test,1] out of range; reset offset to 0 (kafka.server.ReplicaFetcherThread)
[2014-12-23 00:04:39,442] INFO Truncating log perf_producer_p8_test-7 to offset 0.
(kafka.log.Log) [2014-12-23 00:04:39,452] WARN [ReplicaFetcherThread-0-100], Replica 102 for partition [perf_producer_p8_test,7] reset its fetch offset to current leader 100's latest offset 0 (kafka.server.ReplicaFetcherThread) On Mon, Dec 22, 2014 at 3:55 PM, Sa Li sal...@gmail.com wrote: Hello, Neha This is the error from server.log [2014-12-22 23:53:25,663] WARN [KafkaApi-100] Fetch request with correlation id 1227732 from client ReplicaFetcherThread-0-100 on partition [perf_producer_p8_test,1] failed due to Leader not local for partition [perf_producer_p8_test,1] on broker 100 (kafka.server.KafkaApis) On Mon, Dec 22, 2014 at 3:50 PM, Sa Li sal...@gmail.com wrote: I restart the kafka server, it is the same thing, sometime nothing listed on ISR, leader, I checked the state-change log [2014-12-22 23:46:38,164] TRACE Broker 100 cached leader info (LeaderAndIsrInfo:(Leader:101,ISR:101,102,100,LeaderEpoch:0,ControllerEpoch:4),ReplicationFactor:3),AllReplicas:101,102,100) for partition [perf_producer_p8_test,1] in response to UpdateMetadata request sent by controller 101 epoch 4 with correlation id 138 (state.change.logger) On Mon, Dec 22, 2014 at 2:46 PM, Sa Li sal...@gmail.com wrote: Hi, All I created a topic with 3 replications and 6 partitions, but when I check this topic, seems there is no leader and isr were set for this topic, see bin/kafka-topics.sh --create --zookeeper 10.100.98.100:2181 --replication-factor 3 --partitions 6 --topic perf_producer_p6_test SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Created topic perf_producer_p6_test. root@precise64:/etc/kafka# bin/kafka-topics.sh --describe --zookeeper 10.100.98.100:2181 --topic perf_producer_p6_test SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Topic:perf_producer_p6_test PartitionCount:6 ReplicationFactor:3 Configs: Topic: perf_producer_p6_testPartition: 0Leader: none Replicas: 100,101,102 Isr: Topic: perf_producer_p6_testPartition: 1Leader: none Replicas: 101,102,100 Isr: Topic: perf_producer_p6_testPartition: 2Leader: none Replicas: 102,100,101 Isr: Topic: perf_producer_p6_testPartition: 3Leader: none Replicas: 100,102,101 Isr: Topic: perf_producer_p6_test
kafka monitoring system
Hi, all

I am thinking of building a reliable monitoring system for our kafka production cluster. I read this in the documents:

"Kafka uses Yammer Metrics for metrics reporting in both the server and the client. This can be configured to report stats using pluggable stats reporters to hook up to your monitoring system. The easiest way to see the available metrics is to fire up jconsole and point it at a running kafka client or server; this will allow browsing all metrics with JMX. We do graphing and alerting on the following metrics: .."

I am wondering if anyone has used JConsole to monitor kafka, or can recommend a good monitoring tool for a kafka production cluster.

thanks

-- Alec Li
can't produce message in kafka production
Dear all

We just built a kafka production cluster. I can create topics in the kafka production cluster from another host, but when I send a very simple message as a producer, it generates these errors:

root@precise64:/etc/kafka# bin/kafka-console-producer.sh --broker-list 10.100.98.100:9092 --topic my-replicated-topic-production
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
my test message 1
[2014-12-18 21:44:25,830] WARN Failed to send producer request with correlation id 2 to broker 101 with data for partitions [my-replicated-topic-production,1] (kafka.producer.async.DefaultEventHandler)
java.nio.channels.ClosedChannelException
        at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
        at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
        at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.producer.SyncProducer.send(SyncProducer.scala:101)
        at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:256)
        at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:107)
        at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:99)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:99)
        at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:72)
        at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
        at scala.collection.immutable.Stream.foreach(Stream.scala:547)
        at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
        at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
[2014-12-18 21:44:25,948] WARN Failed to send producer request with correlation id 5 to broker 101 with data for partitions [my-replicated-topic-production,1] (kafka.producer.async.DefaultEventHandler)
java.nio.channels.ClosedChannelException
        at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
        at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
        at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
        at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
        at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102)
        at
Re: can't produce message in kafka production
Thanks, Gwen, I telnetted it: root@precise64:/etc/kafka# telnet 10.100.98.100 9092 Trying 10.100.98.100... Connected to 10.100.98.100. Escape character is '^]'. It seems it connected, and I checked with the system operations people; netstat shows 9092 is listening. I assume this is a connection issue, since I can run the same command against my dev cluster (10.100.70.128:9092) with no problem at all. Just in case, could it possibly be caused by another type of issue? thanks Alec On Thu, Dec 18, 2014 at 2:33 PM, Gwen Shapira gshap...@cloudera.com wrote: Looks like you can't connect to: 10.100.98.100:9092 I'd validate that this is the issue using telnet and then check the firewall / ipfilters settings. On Thu, Dec 18, 2014 at 2:21 PM, Sa Li sal...@gmail.com wrote: Dear all We just build a kafka production cluster, I can create topics in kafka production from another host. But when I am send very simple message as producer, it generate such errors: root@precise64:/etc/kafka# bin/kafka-console-producer.sh --broker-list 10.100.98.100:9092 --topic my-replicated-topic-production SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/etc/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] my test message 1 [2014-12-18 21:44:25,830] WARN Failed to send producer request with correlation id 2 to broker 101 with data for partitions [my-replicated-topic-production,1] (kafka.producer.async.DefaultEventHandler) java.nio.channels.ClosedChannelException at kafka.network.BlockingChannel.send(BlockingChannel.scala:100) at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73) at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72) at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103) at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103) at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102) at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) at kafka.producer.SyncProducer.send(SyncProducer.scala:101) at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:256) at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:107) at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$2.apply(DefaultEventHandler.scala:99) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at 
scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:99) at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:72) at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105) at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88) at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68) at scala.collection.immutable.Stream.foreach(Stream.scala:547) at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67) at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45) [2014-12-18 21:44:25,948] WARN Failed to send producer request with correlation id 5 to broker 101 with data for partitions [my-replicated-topic-production,1] (kafka.producer.async.DefaultEventHandler
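Beyond the firewall check, a ClosedChannelException from the old producer often means the broker's advertised metadata points clients at an address they cannot reach, even though the initial bootstrap connection succeeds. A guess worth checking rather than a confirmed diagnosis for this cluster, using the 0.8.x-era setting names:

```
# config/server.properties on each broker (Kafka 0.8.x-era names).
# By default the broker advertises its own hostname, which may not
# resolve from the producer machine; advertise a reachable address.
advertised.host.name=10.100.98.100
advertised.port=9092
```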
Re: kafka consumer to write into DB
Thank you very much for the reply, Neha. I have a question about the consumer: I consume data from kafka and write it into a DB, and of course I have to create a hash map in memory, load data into memory, and bulk copy to the DB instead of inserting into the DB line by line. Does that mean I need to ack each message while loading it into memory? thanks On Thu, Dec 4, 2014 at 1:21 PM, Neha Narkhede n...@confluent.io wrote: This is specific for pentaho but may be useful - https://github.com/RuckusWirelessIL/pentaho-kafka-consumer On Thu, Dec 4, 2014 at 12:58 PM, Sa Li sal...@gmail.com wrote: Hello, all I never developed a kafka consumer, I want to be able to make an advanced kafka consumer in java to consume the data and continuously write the data into postgresql DB. I am thinking to create a map in memory and getting a predefined number of messages in memory then write into DB in batch, is there a API or sample code to allow me to do this? thanks -- Alec Li -- Thanks, Neha -- Alec Li
Re: kafka consumer to write into DB
Thanks, Neha, is there a java version batch consumer? thanks On Fri, Dec 5, 2014 at 9:41 AM, Scott Clasen sc...@heroku.com wrote: if you are using scala/akka this will handle the batching and acks for you. https://github.com/sclasen/akka-kafka#akkabatchconsumer On Fri, Dec 5, 2014 at 9:21 AM, Sa Li sal...@gmail.com wrote: Thank you very much for the reply, Neha, I have a question about consumer, I consume the data from kafka and write into DB, of course I have to create a hash map in memory, load data into memory and bulk copy to DB instead of insert into DB line by line. Does it mean I need to ack each message while load to memory? thanks On Thu, Dec 4, 2014 at 1:21 PM, Neha Narkhede n...@confluent.io wrote: This is specific for pentaho but may be useful - https://github.com/RuckusWirelessIL/pentaho-kafka-consumer On Thu, Dec 4, 2014 at 12:58 PM, Sa Li sal...@gmail.com wrote: Hello, all I never developed a kafka consumer, I want to be able to make an advanced kafka consumer in java to consume the data and continuously write the data into postgresql DB. I am thinking to create a map in memory and getting a predefined number of messages in memory then write into DB in batch, is there a API or sample code to allow me to do this? thanks -- Alec Li -- Thanks, Neha -- Alec Li -- Alec Li
kafka consumer to write into DB
Hello, all I have never developed a kafka consumer. I want to make an advanced kafka consumer in java that consumes the data and continuously writes it into a postgresql DB. I am thinking of creating a map in memory, collecting a predefined number of messages there, then writing them into the DB in batch. Is there an API or sample code that allows me to do this? thanks -- Alec Li
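As a rough way to prototype the batch-load idea before writing a custom Java consumer, the console consumer can be piped straight into Postgres bulk COPY. A sketch only; the topic, database, and table names are made up for illustration:

```
# Stream messages from a topic into Postgres via COPY, avoiding
# row-by-row INSERT overhead (topic/db/table names are hypothetical):
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic events \
  | psql -d eventsdb -c "\copy events_raw(payload) FROM STDIN"
```

This sketch has no offset management or failure handling; a real consumer should commit offsets only after each batch has been flushed to the database.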
Re: how many brokers to set in kafka
thanks a lot On Nov 29, 2014, at 8:29 AM, Jun Rao jun...@gmail.com wrote: Typically, you will just have one broker per server. If you do want to set up multiple brokers on the same server, ideally you need to give each broker dedicated storage. Thanks, Jun On Thu, Nov 27, 2014 at 11:09 AM, Sa Li sal...@gmail.com wrote: Hi, all We are having 3 production server to setup for kafka cluster, I wonder how many brokers to configure for each server. thanks -- Alec Li
rule to set number of brokers for each server
Dear all I am provisioning a production kafka cluster, which has 3 servers, and I am wondering how many brokers I should set up on each server. I set 3 brokers per node in the dev cluster, but I really don't know the advantage of running more than 1 broker per server. What about 1 broker per server, 3 brokers in total, instead of 9? thanks -- Alec Li
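If you stay with one broker per server (the usual layout), the only per-host differences are the broker id and, ideally, dedicated storage. A sketch of the three server.properties files, with example ids and paths:

```
# host 1 - config/server.properties
broker.id=0
log.dirs=/data/kafka

# host 2: broker.id=1    host 3: broker.id=2
# (same log.dirs layout, each broker on its own dedicated disk)
```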
how many brokers to set in kafka
Hi, all We have 3 production servers to set up as a kafka cluster, and I wonder how many brokers to configure on each server. thanks -- Alec Li
Re: how many brokers to set in kafka
Are there any rules to determine or optimize the number of brokers? On Thu, Nov 27, 2014 at 11:09 AM, Sa Li sal...@gmail.com wrote: Hi, all We are having 3 production server to setup for kafka cluster, I wonder how many brokers to configure for each server. thanks -- Alec Li -- Alec Li
Re: kafka web console running error
I am using https://github.com/claudemamo/kafka-web-console version, and do you mind to tell where about to make such modification? thanks Alec On Mon, Nov 24, 2014 at 11:16 PM, Yang Fang franklin.f...@gmail.com wrote: do you see error msg Too many open files? it tips you should modify nofile On Tue, Nov 25, 2014 at 1:26 PM, Jun Rao jun...@gmail.com wrote: Which web console are you using? Thanks, Jun On Fri, Nov 21, 2014 at 8:34 AM, Sa Li sal...@gmail.com wrote: Hi, all I am trying to get kafka web console work, but seems it only works few hours and fails afterwards, below is the error messages on the screen. I am assuming something wrong with the DB, I used to swap H2 to mysql, but didn't help. Anyone has similar problem? - . . at sun.misc.Resource.getByteBuffer(Resource.java:160) ~[na:1.7.0_65] at java.net.URLClassLoader.defineClass(URLClassLoader.java:436) ~[na:1.7.0_65] at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59) at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72) at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [ERROR] Failed to construct terminal; falling back to unsupported java.io.IOException: Cannot run program sh: error=24, 
Too many open files at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047) at java.lang.Runtime.exec(Runtime.java:617) at java.lang.Runtime.exec(Runtime.java:485) at jline.internal.TerminalLineSettings.exec(TerminalLineSettings.java:183) at jline.internal.TerminalLineSettings.exec(TerminalLineSettings.java:173) at jline.internal.TerminalLineSettings.stty(TerminalLineSettings.java:168) at jline.internal.TerminalLineSettings.get(TerminalLineSettings.java:72) at jline.internal.TerminalLineSettings.init(TerminalLineSettings.java:52) at jline.UnixTerminal.init(UnixTerminal.java:31) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at java.lang.Class.newInstance(Class.java:379) [error] a.a.ActorSystemImpl - Uncaught error from thread [play-akka.actor.default-dispatcher-944] shutting down JVM since 'akka.jvm-exit-on-fatal-error' java.lang.NoClassDefFoundError: common/Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1$$anonfun$applyOrElse$1 at common.Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1.applyOrElse(Util.scala:75) ~[na:na] at common.Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1.applyOrElse(Util.scala:74) ~[na:na] at scala.runtime.AbstractPartialFunction$mcJL$sp.apply$mcJL$sp(AbstractPartialFunction.scala:33) ~[scala-library.jar:na] at scala.runtime.AbstractPartialFunction$mcJL$sp.apply(AbstractPartialFunction.scala:33) ~[scala-library.jar:na] at scala.runtime.AbstractPartialFunction$mcJL$sp.apply(AbstractPartialFunction.scala:25) ~[scala-library.jar:na] at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:185) ~[scala-library.jar:na] at jline.TerminalFactory.getFlavor(TerminalFactory.java:168) at 
jline.TerminalFactory.create(TerminalFactory.java:81) at jline.TerminalFactory.get(TerminalFactory.java:159) at sbt.MainLoop$$anon$1.run(MainLoop.scala:19) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: error=24, Too many open files
maximum number of open file handles
Hi, all I read the comments at http://www.michael-noll.com/tutorials/running-multi-node-storm-cluster/, where Michael mentions: to increase the maximum number of open file handles for the user kafka to 98,304 (change kafka to whatever user you are running the Kafka daemons with – this can be your own user account, of course) you must add the following line to /etc/security/limits.conf: kafka - nofile 98304 I am installing the latest version of kafka; do I need to do the same thing? thanks -- Alec Li
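The advice still applies to current Kafka versions, since brokers hold a file handle per log segment plus one per socket. A sketch of the limits.conf entry and how to verify it took effect (the 98304 value is the tutorial's suggestion, not a Kafka requirement):

```shell
# Entry for /etc/security/limits.conf (replace "kafka" with whatever
# account runs the broker):
#   kafka  -  nofile  98304
# After logging back in as that user, check the effective soft limit:
ulimit -n
```

For an already-running broker, `cat /proc/<pid>/limits` shows the limits the process actually started with, since limits.conf only affects new login sessions.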
kafka web console running error
Hi, all I am trying to get kafka web console work, but seems it only works few hours and fails afterwards, below is the error messages on the screen. I am assuming something wrong with the DB, I used to swap H2 to mysql, but didn't help. Anyone has similar problem? - . . at sun.misc.Resource.getByteBuffer(Resource.java:160) ~[na:1.7.0_65] at java.net.URLClassLoader.defineClass(URLClassLoader.java:436) ~[na:1.7.0_65] at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59) at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72) at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [ERROR] Failed to construct terminal; falling back to unsupported java.io.IOException: Cannot run program sh: error=24, Too many open files at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047) at java.lang.Runtime.exec(Runtime.java:617) at java.lang.Runtime.exec(Runtime.java:485) at jline.internal.TerminalLineSettings.exec(TerminalLineSettings.java:183) at jline.internal.TerminalLineSettings.exec(TerminalLineSettings.java:173) at jline.internal.TerminalLineSettings.stty(TerminalLineSettings.java:168) at jline.internal.TerminalLineSettings.get(TerminalLineSettings.java:72) 
at jline.internal.TerminalLineSettings.init(TerminalLineSettings.java:52) at jline.UnixTerminal.init(UnixTerminal.java:31) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at java.lang.Class.newInstance(Class.java:379) [error] a.a.ActorSystemImpl - Uncaught error from thread [play-akka.actor.default-dispatcher-944] shutting down JVM since 'akka.jvm-exit-on-fatal-error' java.lang.NoClassDefFoundError: common/Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1$$anonfun$applyOrElse$1 at common.Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1.applyOrElse(Util.scala:75) ~[na:na] at common.Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1.applyOrElse(Util.scala:74) ~[na:na] at scala.runtime.AbstractPartialFunction$mcJL$sp.apply$mcJL$sp(AbstractPartialFunction.scala:33) ~[scala-library.jar:na] at scala.runtime.AbstractPartialFunction$mcJL$sp.apply(AbstractPartialFunction.scala:33) ~[scala-library.jar:na] at scala.runtime.AbstractPartialFunction$mcJL$sp.apply(AbstractPartialFunction.scala:25) ~[scala-library.jar:na] at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:185) ~[scala-library.jar:na] at jline.TerminalFactory.getFlavor(TerminalFactory.java:168) at jline.TerminalFactory.create(TerminalFactory.java:81) at jline.TerminalFactory.get(TerminalFactory.java:159) at sbt.MainLoop$$anon$1.run(MainLoop.scala:19) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: error=24, Too many open files at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.init(UNIXProcess.java:186) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at 
java.lang.ProcessBuilder.start(ProcessBuilder.java:1028) ... 18 more [error] a.a.ActorSystemImpl - Uncaught error from thread [play-akka.actor.default-dispatcher-943] shutting down JVM since 'akka.jvm-exit-on-fatal-error' java.lang.NoClassDefFoundError: common/Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1$$anonfun$applyOrElse$1 at common.Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1.applyOrElse(Util.scala:75) ~[na:na] at common.Util$$anonfun$getPartitionsLogSize$3$$anonfun$apply$19$$anonfun$apply$1.applyOrElse(Util.scala:74) ~[na:na] at
kafka producer example
Hi, All I am running the kafka producer code:

import java.util.*;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class TestProducer {
    public static void main(String[] args) {
        long events = Long.parseLong(args[0]);
        Random rnd = new Random();
        Properties props = new Properties();
        props.put("metadata.broker.list", "10.100.70.128:9092,10.100.70.128:9093,10.100.70.128:9094");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("partitioner.class", "example.producer.SimplePartitioner");
        props.put("request.required.acks", "1");
        ProducerConfig config = new ProducerConfig(props);
        Producer<String, String> producer = new Producer<String, String>(config);
        for (long nEvents = 0; nEvents < events; nEvents++) {
            long runtime = new Date().getTime();
            String ip = "192.168.2." + rnd.nextInt(255);
            String msg = runtime + ",www.example.com," + ip;
            KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", ip, msg);
            producer.send(data);
        }
        producer.close();
    }
}

It should be straightforward, but when I compile it in IntelliJ IDEA, I get this error: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project kafka-producer: Compilation failure [ERROR] C:\Users\sa\Desktop\Workload\kafka\kafkaprj\kafka-json-producer\src\main\java\kafka\example\TestProducer.java:[35,20] error: cannot access Serializable, which is the producer object. Any idea of this? thanks -- Alec Li
Re: postgresql consumer
Hi, all I've just made a 3-node kafka cluster (9 brokers, 3 per node), and the performance test is OK. Now I am using tridentKafkaSpout and am able to get data from the producer, see

BrokerHosts zk = new ZkHosts("10.100.70.128:2181");
TridentKafkaConfig spoutConf = new TridentKafkaConfig(zk, "topictest");
spoutConf.scheme = new SchemeAsMultiScheme(new StringScheme());
OpaqueTridentKafkaSpout kafkaSpout = new OpaqueTridentKafkaSpout(spoutConf);
// TransactionalTridentKafkaSpout kafkaSpout = new TransactionalTridentKafkaSpout(spoutConf);
TridentTopology topology = new TridentTopology();
Stream stream = topology.newStream("topictestspout", kafkaSpout).shuffle()
    .each(new Fields("str"), new PrintStream(), new Fields("event_object"))
    .parallelismHint(16);

With the above code, I can print out the json objects published to the brokers. Instead of printing the messages, I would like to simply populate them into a postgresql DB. I downloaded the code from https://github.com/geoforce/storm-postgresql Here are the problems I have: 1. When I run the storm-postgresql code with messages generated from a RandomTupleSpout(), I am only able to write 100 rows into the postgresql DB, regardless of how I change the PostgresqlStateConfig. 2. Now I want to write the json messages into the postgresql DB; things seem simple, just 2 columns in the DB table, id and events, which stores the json messages. Forgive my dullness, but I couldn't get it to work with storm-postgresql. I wonder if anyone has done a similar job: getting data from tridentKafkaSpout and writing it exactly into a postgresql DB. In addition, once the writer starts to work, if it stops and restarts for some reason, I would like the writer to resume consuming from the stop point instead of from the very beginning; how do I manage the offset and restart writing into the DB?
thanks Alec Hi, All I setup a kafka cluster, and plan to publish the messages from Web to kafka, the messages are in the form of json, I want to implement a consumer to write the message I consumer to postgresql DB, not aggregation at all. I was thinking to use KafkaSpout in storm to make it happen, now I want to simplify the step, just use kafka consumer to populate message into postgresql. This consumer should have the functions of consumer data, write into postgresql DB in batch, if servers down, consumer can retrieve the data it stored in hard drive with no redundancy and can consume the data from where it stopped once the server up. Is there any sample code for this? thanks a lot Alec
Re: kafka-web-console
All, Again, I am still unable to install; it seems to be stuck on ivy.lock. Any ideas on how to continue? thanks Alec On Oct 12, 2014, at 7:38 PM, Sa Li sal...@gmail.com wrote: Hi
Re: kafka-web-console
Hi, Palak really? I terminated it since I truly thought it was stuck there, I will run it again. thanks Alec On Oct 12, 2014, at 7:35 PM, Palak Shah spala...@gmail.com wrote: Hi, I am sure you must have got it running by now, but in case you gave up earlier, just have patience and it will start. Even I had faced this issue. The console takes a lot of time to start, but eventually it does. So this is not an error :) Hope this helped, -Palak On Sat, Oct 11, 2014 at 9:00 AM, Sa Li sal...@gmail.com wrote: Hi, all I am installing kafka-web-console on ubuntu server, when I sbt package it, it stuck on waiting for ivy.lock root@DO-mq-dev:/home/stuser/kafkaprj/kafka-web-console# sbt package Loading /usr/share/sbt/bin/sbt-launch-lib.bash [info] Loading project definition from /home/stuser/kafkaprj/kafka-web-console/project Waiting for lock on /root/.ivy2/.sbt.ivy.lock to be available... any idea? and any suggestion while further install. thanks Alec
kafka-web-console
Hi, all I am installing kafka-web-console on an ubuntu server; when I sbt package it, it gets stuck waiting for ivy.lock: root@DO-mq-dev:/home/stuser/kafkaprj/kafka-web-console# sbt package Loading /usr/share/sbt/bin/sbt-launch-lib.bash [info] Loading project definition from /home/stuser/kafkaprj/kafka-web-console/project Waiting for lock on /root/.ivy2/.sbt.ivy.lock to be available... Any idea? And any suggestions for the rest of the install. thanks Alec
create topic in multiple node kafka cluster
Hi, All I set up a 3-node kafka cluster on top of a 3-node zk ensemble. Now I launch 1 broker on each node, but the brokers are randomly distributed across the zk ensemble, see DO-mq-dev.1 [zk: localhost:2181(CONNECTED) 1] ls /brokers/ids [0, 1] pof-kstorm-dev1.2 [zk: localhost:2181(CONNECTED) 1] ls /brokers/ids [] pof-kstorm-dev2.3 [zk: localhost:2181(CONNECTED) 1] ls /brokers/ids [2] which means zk1 hosts 2 brokers and zk3 hosts 1 broker. That raises a problem: I am unable to create a topic with replication, say 3; it throws this exception: Error while executing topic command replication factor: 3 larger than available brokers: 0 Is there any way to create a topic that is replicated throughout the entire zk ensemble? As I understand it, we would have to run more than 1 broker on a single zk server if we want to be able to create replicated topics? thanks -- Alec Li
Re: create topic in multiple node kafka cluster
Hi, I kinda doubt whether I make it as an ensemble, since it shows root@DO-mq-dev:/etc/zookeeper/conf# zkServer.sh status JMX enabled by default Using config: /etc/zookeeper/conf/zoo.cfg Mode: standalone Mode is standalone instead of something else, here is my zoo.cfg, I did follow the instruction to config it # http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html # The number of milliseconds of each tick tickTime=2000 # The number of ticks that the initial # synchronization phase can take initLimit=10 # The number of ticks that can pass between # sending a request and getting an acknowledgement syncLimit=5 # the directory where the snapshot is stored. dataDir=/var/lib/zookeeper # Place the dataLogDir to a separate physical disc for better performance # dataLogDir=/disk2/zookeeper # the port at which the clients will connect clientPort=2181 # specify all zookeeper servers # The fist port is used by followers to connect to the leader # The second one is used for leader election DO-mq-dev.1=10.100.70.128:2888:3888 pof-kstorm-dev1.2=10.100.70.28:2888:3888 pof-kstorm-dev2.3=10.100.70.29:2888:3888 # To avoid seeks ZooKeeper allocates space in the transaction log file in # blocks of preAllocSize kilobytes. The default block size is 64M. One reason # for changing the size of the blocks is to reduce the block size if snapshots # are taken more often. (Also, see snapCount). #preAllocSize=65536 # Clients can submit requests faster than ZooKeeper can process them, # especially if there are a lot of clients. To prevent ZooKeeper from running # out of memory due to queued requests, ZooKeeper will throttle clients so that # there is no more than globalOutstandingLimit outstanding requests in the # system. The default limit is 1,000.ZooKeeper logs transactions to a # transaction log. After snapCount transactions are written to a log file a # snapshot is started and a new transaction log file is started. The default # snapCount is 10,000. 
#snapCount=1000 # If this option is defined, requests will be will logged to a trace file named # traceFile.year.month.day. #traceFile= # Leader accepts client connections. Default value is yes. The leader machine # coordinates updates. For higher update throughput at thes slight expense of # read throughput the leader can be configured to not accept clients and focus # on coordination. leaderServes=yes # Enable regular purging of old data and transaction logs every 24 hours autopurge.purgeInterval=24 autopurge.snapRetainCount=5 And myid in dataDir is At 10.100.70.128 /var/lib/zookeeper contains1 At 10.100.70.28 /var/lib/zookeeper contains2 At 10.100.70.29 /var/lib/zookeeper contains3 I did make myid as 1, 2, 3 corresponding 3 nodes before, but seeing something make such myid, it might be more accurate. Is there anything wrong or missing to not able to make it an ensemble? thanks Alec On Thu, Oct 9, 2014 at 12:06 PM, Guozhang Wang wangg...@gmail.com wrote: Sa, Usually you would not want to set up kafka brokers at the same machines with zk nodes, as that will add depending failures to the server cluster. Back to your original question, it seems your zk nodes do not form an ensemble, since otherwise their zk data should be the same. Guozhang On Thu, Oct 9, 2014 at 11:37 AM, Sa Li sal...@gmail.com wrote: Hi, All I setup a 3-node kafka cluster on top of 3-node zk ensemble. 
Now I launch 1 broker on each node, the brokers will be randomly distributed to zk ensemble, see DO-mq-dev.1 [zk: localhost:2181(CONNECTED) 1] ls /brokers/ids [0, 1] pof-kstorm-dev1.2 [zk: localhost:2181(CONNECTED) 1] ls /brokers/ids [] pof-kstorm-dev2.3 [zk: localhost:2181(CONNECTED) 1] ls /brokers/ids [2] which means zk1 hosts 2 brokers, zk3 hosts 1 brokers, that will raise a problem, that I am unable to create a topic with replications, say 3, it will throw such exceptions Error while executing topic command replication factor: 3 larger than available brokers: 0 Is there any ways that I can create a topic which can be replicated throughout entire zk ensemble as I know we will have to introduce more than 1 broker in single zk Server if we want to be able to create replicated topics/ thanks -- Alec Li -- -- Guozhang -- Alec Li
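One likely reason zkServer.sh reports Mode: standalone is the member lines in the quoted zoo.cfg: ZooKeeper only recognizes keys of the literal form server.<myid>, so hostname-style keys like DO-mq-dev.1=... are silently ignored and each node starts on its own. A corrected fragment for this ensemble, assuming the myid files 1, 2, 3 match these entries:

```
# zoo.cfg, identical on all three nodes; server.<N> must match the
# number in each node's /var/lib/zookeeper/myid file
server.1=10.100.70.128:2888:3888
server.2=10.100.70.28:2888:3888
server.3=10.100.70.29:2888:3888
```

Once all three nodes form a quorum, zkServer.sh status should report leader or follower instead of standalone, and the brokers registered under /brokers/ids will be visible from any node.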
postgresql consumer
Hi, All I set up a kafka cluster and plan to publish messages from the Web to kafka; the messages are in the form of json, and I want to implement a consumer that writes the messages to a postgresql DB, no aggregation at all. I was thinking of using KafkaSpout in storm to make it happen, but now I want to simplify the step: just use a kafka consumer to populate the messages into postgresql. This consumer should consume data and write it into postgresql in batches; if a server goes down, the consumer should be able to retrieve the data it stored on the hard drive without redundancy and resume consuming from where it stopped once the server is back up. Is there any sample code for this? thanks a lot Alec
Re: kafka producer performance test
Thanks, Jay,

Here is what I did this morning: I git cloned the latest version of kafka (I am currently using kafka 0.8.0; trunk is now 0.8.1.1), and it uses gradle to build the project. I am having trouble building it. I installed gradle and ran ./gradlew jar in the kafka root directory, and it comes out:

Error: Could not find or load main class org.gradle.wrapper.GradleWrapperMain

Any idea about this?

Thanks

Alec

On Wed, Oct 1, 2014 at 9:21 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

Hi Sa,

That script was developed with the new producer that is included on trunk. Check out trunk and build, and it should be there.

-Jay

On Wed, Oct 1, 2014 at 7:55 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I built a 3-node kafka cluster and want to run performance tests. I found someone posted the following thread, which is exactly the problem I have:

While testing kafka producer performance, I found 2 testing scripts.

1) performance testing script in the kafka distribution

bin/kafka-producer-perf-test.sh --broker-list localhost:9092 --messages 1000 --topic test --threads 10 --message-size 100 --batch-size 1 --compression-codec 1

2) performance testing script mentioned in https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test6 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

based on org.apache.kafka.clients.producer.Producer.

I was unable to run either of the above; I figure the commands are outdated. Can anyone point me to how to do such a test with the new commands?

thanks

Alec

-- 
Alec Li
can't run kafka example code
Hi, all

I want to run the example code shipped with the kafka package. I ran it as the README says:

To run the demo using scripts:

1. Start Zookeeper and the Kafka server
2. For the simple consumer demo, run bin/java-simple-consumer-demo.sh
3. For the unlimited producer-consumer run, run bin/java-producer-consumer-demo.sh

but I got this error:

:bin/../../project/boot/scala-2.8.0/lib/*.jar:bin/../../core/lib_managed/scala_2.8.0/compile/*.jar:bin/../../core/lib/*.jar:bin/../../core/target/scala_2.8.0/*.jar:bin/../../examples/target/scala_2.8.0/*.jar
Error: Could not find or load main class kafka.examples.SimpleConsumerDemo

But I already built the package under the kafka directory, and I can see the class in examples/target/classes/kafka/examples. Any idea about this issue?

thanks

-- 
Alec Li
Re: kafka producer performance test
Thanks Guozhang

I tried this as in KAFKA-1490:

git clone https://git-wip-us.apache.org/repos/asf/kafka.git
cd kafka
gradle

but it fails to build:

FAILURE: Build failed with an exception.

* Where:
Script '/home/stuser/trunk/gradle/license.gradle' line: 2

* What went wrong:
A problem occurred evaluating script.
> Could not find method create() for arguments [downloadLicenses, class nl.javadude.gradle.plugins.license.DownloadLicenses] on task set.

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Seems it is really not that straightforward to build.

thanks

On Thu, Oct 2, 2014 at 12:56 PM, Guozhang Wang <wangg...@gmail.com> wrote:

Hello Sa,

KAFKA-1490 introduces a new step of downloading the wrapper; details are included in the latest README file.

Guozhang

On Thu, Oct 2, 2014 at 11:00 AM, Sa Li <sal...@gmail.com> wrote:

Thanks, Jay,

Here is what I did this morning: I git cloned the latest version of kafka (I am currently using kafka 0.8.0; trunk is now 0.8.1.1), and it uses gradle to build the project. I am having trouble building it. I installed gradle and ran ./gradlew jar in the kafka root directory, and it comes out:

Error: Could not find or load main class org.gradle.wrapper.GradleWrapperMain

Any idea about this?

Thanks

Alec

On Wed, Oct 1, 2014 at 9:21 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

Hi Sa,

That script was developed with the new producer that is included on trunk. Check out trunk and build, and it should be there.

-Jay

On Wed, Oct 1, 2014 at 7:55 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I built a 3-node kafka cluster and want to run performance tests. I found someone posted the following thread, which is exactly the problem I have:

While testing kafka producer performance, I found 2 testing scripts.

1) performance testing script in the kafka distribution

bin/kafka-producer-perf-test.sh --broker-list localhost:9092 --messages 1000 --topic test --threads 10 --message-size 100 --batch-size 1 --compression-codec 1

2) performance testing script mentioned in https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test6 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

based on org.apache.kafka.clients.producer.Producer.

I was unable to run either of the above; I figure the commands are outdated. Can anyone point me to how to do such a test with the new commands?

thanks

Alec

-- 
Alec Li

-- 
-- Guozhang

-- 
Alec Li
can't gradle
I git cloned the latest kafka package; why can't I build it?

gradle

FAILURE: Build failed with an exception.

* Where:
Script '/home/ubuntu/kafka/gradle/license.gradle' line: 2

* What went wrong:
A problem occurred evaluating script.
> Could not find method create() for arguments [downloadLicenses, class nl.javadude.gradle.plugins.license.DownloadLicenses] on task set.

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Thanks

-- 
Alec Li
Re: kafka producer performance test
I still can't get gradle to build it, even after cloning the latest trunk; is anyone else having the same issue?

On Thu, Oct 2, 2014 at 1:55 PM, Sa Li <sal...@gmail.com> wrote:

Thanks Guozhang

I tried this as in KAFKA-1490:

git clone https://git-wip-us.apache.org/repos/asf/kafka.git
cd kafka
gradle

but it fails to build:

FAILURE: Build failed with an exception.

* Where:
Script '/home/stuser/trunk/gradle/license.gradle' line: 2

* What went wrong:
A problem occurred evaluating script.
> Could not find method create() for arguments [downloadLicenses, class nl.javadude.gradle.plugins.license.DownloadLicenses] on task set.

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Seems it is really not that straightforward to build.

thanks

On Thu, Oct 2, 2014 at 12:56 PM, Guozhang Wang <wangg...@gmail.com> wrote:

Hello Sa,

KAFKA-1490 introduces a new step of downloading the wrapper; details are included in the latest README file.

Guozhang

On Thu, Oct 2, 2014 at 11:00 AM, Sa Li <sal...@gmail.com> wrote:

Thanks, Jay,

Here is what I did this morning: I git cloned the latest version of kafka (I am currently using kafka 0.8.0; trunk is now 0.8.1.1), and it uses gradle to build the project. I am having trouble building it. I installed gradle and ran ./gradlew jar in the kafka root directory, and it comes out:

Error: Could not find or load main class org.gradle.wrapper.GradleWrapperMain

Any idea about this?

Thanks

Alec

On Wed, Oct 1, 2014 at 9:21 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

Hi Sa,

That script was developed with the new producer that is included on trunk. Check out trunk and build, and it should be there.

-Jay

On Wed, Oct 1, 2014 at 7:55 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I built a 3-node kafka cluster and want to run performance tests. I found someone posted the following thread, which is exactly the problem I have:

While testing kafka producer performance, I found 2 testing scripts.

1) performance testing script in the kafka distribution

bin/kafka-producer-perf-test.sh --broker-list localhost:9092 --messages 1000 --topic test --threads 10 --message-size 100 --batch-size 1 --compression-codec 1

2) performance testing script mentioned in https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test6 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

based on org.apache.kafka.clients.producer.Producer.

I was unable to run either of the above; I figure the commands are outdated. Can anyone point me to how to do such a test with the new commands?

thanks

Alec

-- 
Alec Li

-- 
-- Guozhang

-- 
Alec Li

-- 
Alec Li
Re: multi-node and multi-broker kafka cluster setup
Daniel, thanks for the reply.

Setting up the cluster is still a learning curve for me; ultimately we want to connect the kafka cluster to a storm cluster. As you mentioned, a single broker per node seems more efficient, but is that also good for handling multiple topics? In my case, if I build a 3-node kafka cluster with three brokers, that will certainly limit the replication factor, since as far as I understand the number of brokers must be greater than or equal to the replication factor.

For the zk servers, my understanding after playing around is: I should run a zk server on each kafka node. I could point zk.connect in the kafka server.properties at a single zk server, and all the broker info would be stored in that zk server. But I thought it might be better to store each broker's info in its local zk server, so that with zkCli.sh we can see things under /brokers/ids. Is that a good solution? That is the architecture I am using now.

thanks

On Tue, Sep 30, 2014 at 1:02 PM, Daniel Compton <d...@danielcompton.net> wrote:

Hi Sa

While it's possible to run multiple brokers on a single machine, I would be interested to hear why you would want to. Kafka is very efficient and can use all of the system resources under load. Running multiple brokers would increase zookeeper load, force resource sharing between the Kafka processes, and require more admin overhead.

Additionally, you almost certainly want to run three Zookeepers. Two Zookeepers give you no more reliability than one, because ZK voting is based on a majority vote; if neither ZK can reach a majority on its own, it will fail. More info at http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A7

Daniel.

On 1/10/2014, at 4:35 am, Guozhang Wang <wangg...@gmail.com> wrote:

Hello,

In general it is not required to have the kafka brokers installed on the same nodes as the zk servers, and each node can host multiple kafka brokers: you just need to make sure they do not share the same port or the same data dir.

Guozhang

On Mon, Sep 29, 2014 at 8:31 PM, Sa Li <sal...@gmail.com> wrote:

Hi,

I am kind of a newbie to kafka. I plan to build a cluster with multiple nodes and multiple brokers on each node. I can find tutorials for setting up a multi-broker cluster on a single node, e.g. http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/ and some instructions for multi-node setups, but with a single broker on each node. I have not seen any documents that teach how to set up a multi-node cluster with multiple brokers on each node.

Some documents point out that we should install kafka on each node, which makes sense, and that all the brokers on each node should connect to the same zookeeper. I am confused, since I thought I could set up a zookeeper ensemble separately and have all the brokers connect to that cluster; the zk cluster doesn't have to be on the servers hosting kafka, but some tutorials say I should install zookeeper on each kafka node. Here is my plan:

- I have three nodes: kfserver1, kfserver2, kfserver3.
- kfserver1 and kfserver2 are configured as the zookeeper ensemble, which I have done: zk.connect=kfserver1:2181,kfserver2:2181
- broker1, broker2, broker3 are on kfserver1; broker4, broker5, broker6 are on kfserver2; broker7, broker8, broker9 are on kfserver3.

When configuring, the zk dataDir is a local directory on each node rather than a directory on the zk ensemble; is that correct? So far I could not make the above scheme work. Has anyone ever made a multi-node, multi-broker kafka cluster setup?

thanks

Alec

-- 
-- Guozhang

-- 
Alec Li
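(Editor's note: Guozhang's "same port and same data dir" advice comes down to a few per-broker properties. A sketch of one broker's server.properties under the naming from this thread — broker ids, ports, and paths here are made up for illustration; the emails write the zookeeper setting as zk.connect, while 0.8-era server.properties calls it zookeeper.connect:)

```properties
# server-1.properties -- a second broker on the same node would differ
# only in broker.id, port, and log.dirs
broker.id=1
port=9092
log.dirs=/tmp/kafka-logs-1

# Every broker on every node points at the same ensemble connect string,
# rather than each broker at its local zk server.
zookeeper.connect=kfserver1:2181,kfserver2:2181,kfserver3:2181
```

With all brokers sharing one connect string, `ls /brokers/ids` in zkCli.sh lists every broker regardless of which ensemble member you query.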
Re: multi-node and multi-broker kafka cluster setup
Just to clarify: I am using a 3-node zkServer ensemble, myid: 1, 2, 3. But in the server.properties of each broker, I point zk.connect at localhost, which means each broker's info is stored in its local zkServer. I know it is a bit weird, rather than letting the zkServer leader manage the broker info automatically.

On Thu, Oct 2, 2014 at 2:25 PM, Sa Li <sal...@gmail.com> wrote:

Daniel, thanks for the reply.

Setting up the cluster is still a learning curve for me; ultimately we want to connect the kafka cluster to a storm cluster. As you mentioned, a single broker per node seems more efficient, but is that also good for handling multiple topics? In my case, if I build a 3-node kafka cluster with three brokers, that will certainly limit the replication factor, since as far as I understand the number of brokers must be greater than or equal to the replication factor.

For the zk servers, my understanding after playing around is: I should run a zk server on each kafka node. I could point zk.connect in the kafka server.properties at a single zk server, and all the broker info would be stored in that zk server. But I thought it might be better to store each broker's info in its local zk server, so that with zkCli.sh we can see things under /brokers/ids. Is that a good solution? That is the architecture I am using now.

thanks

On Tue, Sep 30, 2014 at 1:02 PM, Daniel Compton <d...@danielcompton.net> wrote:

Hi Sa

While it's possible to run multiple brokers on a single machine, I would be interested to hear why you would want to. Kafka is very efficient and can use all of the system resources under load. Running multiple brokers would increase zookeeper load, force resource sharing between the Kafka processes, and require more admin overhead.

Additionally, you almost certainly want to run three Zookeepers. Two Zookeepers give you no more reliability than one, because ZK voting is based on a majority vote; if neither ZK can reach a majority on its own, it will fail. More info at http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A7

Daniel.

On 1/10/2014, at 4:35 am, Guozhang Wang <wangg...@gmail.com> wrote:

Hello,

In general it is not required to have the kafka brokers installed on the same nodes as the zk servers, and each node can host multiple kafka brokers: you just need to make sure they do not share the same port or the same data dir.

Guozhang

On Mon, Sep 29, 2014 at 8:31 PM, Sa Li <sal...@gmail.com> wrote:

Hi,

I am kind of a newbie to kafka. I plan to build a cluster with multiple nodes and multiple brokers on each node. I can find tutorials for setting up a multi-broker cluster on a single node, e.g. http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/ and some instructions for multi-node setups, but with a single broker on each node. I have not seen any documents that teach how to set up a multi-node cluster with multiple brokers on each node.

Some documents point out that we should install kafka on each node, which makes sense, and that all the brokers on each node should connect to the same zookeeper. I am confused, since I thought I could set up a zookeeper ensemble separately and have all the brokers connect to that cluster; the zk cluster doesn't have to be on the servers hosting kafka, but some tutorials say I should install zookeeper on each kafka node. Here is my plan:

- I have three nodes: kfserver1, kfserver2, kfserver3.
- kfserver1 and kfserver2 are configured as the zookeeper ensemble, which I have done: zk.connect=kfserver1:2181,kfserver2:2181
- broker1, broker2, broker3 are on kfserver1; broker4, broker5, broker6 are on kfserver2; broker7, broker8, broker9 are on kfserver3.

When configuring, the zk dataDir is a local directory on each node rather than a directory on the zk ensemble; is that correct? So far I could not make the above scheme work. Has anyone ever made a multi-node, multi-broker kafka cluster setup?

thanks

Alec

-- 
-- Guozhang

-- 
Alec Li

-- 
Alec Li
Re: can't gradle
Thank you all, I am able to build with gradle now. Here was my mistake: I had installed gradle both via apt-get and from the gradle website, but the system automatically picked the apt-get gradle, and that version is quite outdated. What I did was apt-get remove gradle and add the newer gradle to the PATH in /etc/environment; now it works. Hope this helps anyone who has a similar problem: always download and install the latest version.

On Thu, Oct 2, 2014 at 3:02 PM, Jun Rao <jun...@gmail.com> wrote:

Hmm, not sure what the issue is. You can also just copy the following files from the 0.8.1 branch:

gradle/wrapper/
  gradle-wrapper.jar
  gradle-wrapper.properties

Thanks,
Jun

On Thu, Oct 2, 2014 at 2:05 PM, Sa Li <sal...@gmail.com> wrote:

I git cloned the latest kafka package; why can't I build it?

gradle

FAILURE: Build failed with an exception.

* Where:
Script '/home/ubuntu/kafka/gradle/license.gradle' line: 2

* What went wrong:
A problem occurred evaluating script.
> Could not find method create() for arguments [downloadLicenses, class nl.javadude.gradle.plugins.license.DownloadLicenses] on task set.

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Thanks

-- 
Alec Li

-- 
Alec Li
kafka producer performance test
Hi, All

I built a 3-node kafka cluster and want to run performance tests. I found someone posted the following thread, which is exactly the problem I have:

While testing kafka producer performance, I found 2 testing scripts.

1) performance testing script in the kafka distribution

bin/kafka-producer-perf-test.sh --broker-list localhost:9092 --messages 1000 --topic test --threads 10 --message-size 100 --batch-size 1 --compression-codec 1

2) performance testing script mentioned in https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test6 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

based on org.apache.kafka.clients.producer.Producer.

I was unable to run either of the above; I figure the commands are outdated. Can anyone point me to how to do such a test with the new commands?

thanks

Alec
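(Editor's note: whichever script ends up working, the measurement itself is just "send N messages of a fixed size and divide by elapsed time", which is what both tools report. A stdlib-only sketch of that harness; the no-op `send_fn` below stands in for a real producer's send call — with a Python client you would pass the client's send function instead, and all names here are illustrative assumptions:)

```python
import time

def measure_throughput(send_fn, num_messages=1000, message_size=100):
    """Send num_messages payloads of message_size bytes; report rates."""
    payload = b"x" * message_size
    start = time.time()
    for _ in range(num_messages):
        send_fn(payload)
    elapsed = max(time.time() - start, 1e-9)  # guard against zero division
    return {
        "records_per_sec": num_messages / elapsed,
        "mb_per_sec": num_messages * message_size / elapsed / (1024 * 1024),
    }

if __name__ == "__main__":
    stats = measure_throughput(lambda msg: None)  # no-op send: harness only
    print("%.0f records/sec, %.2f MB/sec"
          % (stats["records_per_sec"], stats["mb_per_sec"]))
```

Note that a loop like this measures the client-side send path only; acks and batching settings (as in the LinkedIn command above) decide what "sent" actually means.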
Re: kafka producer performance test
Hi, Ravi

Thanks for the reply. This is how I built the kafka 0.8 package:

$ git clone https://git-wip-us.apache.org/repos/asf/kafka.git
$ cd /etc/kafka
$ git checkout -b 0.8 remotes/origin/0.8
$ ./sbt update
$ ./sbt package
$ ./sbt assembly-package-dependency

So I believe I have already built it, but I'm still not able to run it; any clues?

thanks

Alec

On Oct 1, 2014, at 9:13 PM, ravi singh <rrs120...@gmail.com> wrote:

It is available with the Kafka package containing the source code. Download the package, build it, and run the above command.

Regards,
Ravi

On Wed, Oct 1, 2014 at 7:55 PM, Sa Li <sal...@gmail.com> wrote:

Hi, All

I built a 3-node kafka cluster and want to run performance tests. I found someone posted the following thread, which is exactly the problem I have:

While testing kafka producer performance, I found 2 testing scripts.

1) performance testing script in the kafka distribution

bin/kafka-producer-perf-test.sh --broker-list localhost:9092 --messages 1000 --topic test --threads 10 --message-size 100 --batch-size 1 --compression-codec 1

2) performance testing script mentioned in https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test6 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

based on org.apache.kafka.clients.producer.Producer.

I was unable to run either of the above; I figure the commands are outdated. Can anyone point me to how to do such a test with the new commands?

thanks

Alec

-- 
*Regards,*
*Ravi*
multi-node and multi-broker kafka cluster setup
Hi,

I am kind of a newbie to kafka. I plan to build a cluster with multiple nodes and multiple brokers on each node. I can find tutorials for setting up a multi-broker cluster on a single node, e.g. http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/ and some instructions for multi-node setups, but with a single broker on each node. I have not seen any documents that teach how to set up a multi-node cluster with multiple brokers on each node.

Some documents point out that we should install kafka on each node, which makes sense, and that all the brokers on each node should connect to the same zookeeper. I am confused, since I thought I could set up a zookeeper ensemble separately and have all the brokers connect to that cluster; the zk cluster doesn't have to be on the servers hosting kafka, but some tutorials say I should install zookeeper on each kafka node. Here is my plan:

- I have three nodes: kfserver1, kfserver2, kfserver3.
- kfserver1 and kfserver2 are configured as the zookeeper ensemble, which I have done: zk.connect=kfserver1:2181,kfserver2:2181
- broker1, broker2, broker3 are on kfserver1; broker4, broker5, broker6 are on kfserver2; broker7, broker8, broker9 are on kfserver3.

When configuring, the zk dataDir is a local directory on each node rather than a directory on the zk ensemble; is that correct? So far I could not make the above scheme work. Has anyone ever made a multi-node, multi-broker kafka cluster setup?

thanks

Alec