why kafka producer api use cpu so high?
I wrote a very simple program, like this:

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class LogProducer {

        private Producer<String, String> inner;

        public LogProducer() throws Exception {
            // Load the producer settings from producer.properties on the classpath.
            Properties properties = new Properties();
            properties.load(ClassLoader.getSystemResourceAsStream("producer.properties"));
            ProducerConfig config = new ProducerConfig(properties);
            inner = new Producer<String, String>(config);
        }

        public void send(String topicName, String message) {
            if (topicName == null || message == null) {
                return;
            }
            KeyedMessage<String, String> km = new KeyedMessage<String, String>(topicName, message);
            inner.send(km);
        }

        public void close() {
            inner.close();
        }

        /**
         * @param args
         */
        public static void main(String[] args) {
            LogProducer producer = null;
            try {
                producer = new LogProducer();
                int i = 0;
                // Send the same message forever, as fast as possible.
                while (true) {
                    producer.send("test", "this is a sample");
                }
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                if (producer != null) {
                    producer.close();
                }
            }
        }
    }

and the producer.properties looks like this:

    metadata.broker.list=127.0.0.1:9092
    producer.type=async
    serializer.class=kafka.serializer.StringEncoder
    batch.num.messages=200
    compression.codec=snappy

I run this program on Linux, on a machine with a 4-core CPU and 16 GB of memory.
The program uses one CPU core completely. This is the output of the "top" command:

    [root@localhost ~]# top
    top - 13:51:09 up 5 days, 13:27, 3 users, load average: 0.96, 0.48, 0.35
    Tasks: 367 total, 3 running, 364 sleeping, 0 stopped, 0 zombie
    Cpu0 : 7.0%us, 0.3%sy, 0.0%ni, 92.0%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu1 : 5.0%us, 0.0%sy, 0.0%ni, 95.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu2 : 5.0%us, 0.0%sy, 0.0%ni, 95.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu3 : 99.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 16307528k total, 9398376k used, 6909152k free, 249952k buffers
    Swap: 8224760k total, 0k used, 8224760k free, 6071348k cached

Why does the producer API use so much CPU? Or am I doing something wrong?

By the way, the Kafka version is 0.8.0.
Re: why kafka producer api use cpu so high?
What is your compression configuration for your producer?

One of the biggest sources of CPU usage in the producer is compression, along with checksumming.

Tim

On Sun, May 11, 2014 at 12:24 AM, wrote:
> why producer api use cpu so high ? or maybe I make something wrong ?
>
> by the way , the kafka version 0.8.0 .
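To get a rough feel for the per-message work mentioned above, here is a minimal sketch that times a CRC32 checksum and a compression pass over a 1 KB payload. It uses only the JDK. `java.util.zip.Deflater` stands in for snappy purely so the example has no dependencies, so the absolute numbers will differ from what snappy costs. The class `PerMessageCost` and its methods are hypothetical names for illustration, not part of Kafka.

```java
import java.util.Arrays;
import java.util.zip.CRC32;
import java.util.zip.Deflater;

/** Rough per-message cost of checksumming and compressing a payload. */
class PerMessageCost {

    /** Returns the CRC32 checksum of the payload. */
    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue();
    }

    /** Compresses the payload with Deflater (a stand-in for snappy) and returns the compressed size in bytes. */
    static int compressedSize(byte[] payload) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(payload);
            deflater.finish();
            byte[] buffer = new byte[1024];
            int total = 0;
            // Drain the compressor; the output may need more than one buffer.
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release native resources
        }
    }

    /** Runs the work the given number of times and returns the average nanoseconds per call. */
    static long nanosPerCall(Runnable work, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            work.run();
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        // 1 KB payload, matching the record size discussed in this thread.
        byte[] payload = new byte[1024];
        Arrays.fill(payload, (byte) 'a');
        int iterations = 100_000;
        System.out.println("checksum ns/msg: " + nanosPerCall(() -> checksum(payload), iterations));
        System.out.println("compress ns/msg: " + nanosPerCall(() -> compressedSize(payload), iterations));
    }
}
```

Multiplying the per-message figures by the target message rate gives a very rough lower bound on the CPU time this work needs per second. Real log records compress differently from a payload of repeated bytes, so treat the result as an order-of-magnitude hint only.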
Re: why kafka producer api use cpu so high?
This code says to send this message infinitely, as fast as the machine can, thereby consuming as much of one CPU as possible. You may want to consider an alternate test, perhaps one that records the number of messages sent in a given time period.

> > public static void main(String[] args) {
> >     LogProducer producer = null;
> >     try {
> >         producer = new LogProducer();
> >         int i = 0;
> >         while (true) {
> >             producer.send("test", "this is a sample");
> >         }
> >     } catch (Exception e) {
> >         e.printStackTrace();
> >     } finally {
> >         if (producer != null) {
> >             producer.close();
> >         }
> >     }
> > }
> >
> > }
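One way the alternate test suggested above might look is sketched here: instead of looping forever, it sends for a fixed time window and reports how many messages went out. `ThroughputProbe` and `countSends` are hypothetical names. The sender is passed in as a plain `Consumer<String>` so the sketch runs without a broker; plugging in the real producer is left as a comment.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Counts how many messages a sender accepts within a fixed time window. */
class ThroughputProbe {

    /**
     * Calls the sender repeatedly for roughly durationMillis and returns
     * the number of completed calls.
     */
    static long countSends(Consumer<String> sender, String message, long durationMillis) {
        long windowNanos = TimeUnit.MILLISECONDS.toNanos(durationMillis);
        long start = System.nanoTime();
        long sent = 0;
        // Compare elapsed time rather than absolute timestamps, as nanoTime may wrap.
        while (System.nanoTime() - start < windowNanos) {
            sender.accept(message);
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        // Stand-in sender that discards the message. To measure the real producer,
        // something like msg -> producer.send("test", msg) would go here instead.
        Consumer<String> noop = msg -> { };
        long windowMillis = 1000;
        long sent = countSends(noop, "this is a sample", windowMillis);
        System.out.println(sent + " messages in " + windowMillis + " ms");
    }
}
```

This loop still uses a full core while it runs, but it ends, and the count it prints can be compared across configurations (with and without compression, different batch sizes).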
Re: Re: why kafka producer api use cpu so high?
My app can generate 50 MB of log data every second, and one log record is about 1 KB, so I must send the log as fast as the machine can.

This is the difficulty: on one hand I want to send the log as fast as possible; on the other hand I want the Kafka producer API to use as little CPU as possible. If the Kafka API uses this much CPU, it will impact my app.

So can Kafka solve this problem: send 50 MB of log data to the Kafka server every second while using little CPU?

From: cac...@gmail.com
Date: 2014-05-11 16:52
To: users
Subject: Re: why kafka producer api use cpu so high?
This code says to send this message infinitely as fast as the machine can thereby consuming as much of one CPU as possible. You may want to consider an alternate test, perhaps one that records the number of messages sent in a given time period.
Re: Re: why kafka producer api use cpu so high?
I use snappy for compression.
But even without compression, this program uses 50% of one CPU core.

With snappy, it uses 100% of one core.

From: Timothy Chen
Date: 2014-05-11 15:53
To: users@kafka.apache.org
Subject: Re: why kafka producer api use cpu so high?
What is your compression configuration for your producer?

One of the biggest CPU source for the producer is doing compression and also checksumming.

Tim
Re: Re: why kafka producer api use cpu so high?
If a process is CPU bound (which this producer almost certainly will be), it's going to consume as much CPU as it can to do what it does. The test is flawed. Because there's no end state, the while loop is just going to burn CPU and, because it's single-threaded, it will take a single core.

A better test is to find out roughly how many events per second your process needs to produce and write the test accordingly. That will tell you how much CPU the producer chews up when producing ~50MB/sec worth of events.

The other thing worth pointing out is that sending a single event at a time comes with a fair bit of overhead which, in turn, naturally drives up CPU time. If you use the list form of send(), you're going to amortize the cost of the RPC and other internal bits, leading to more efficient use of system resources. Again, it may still burn a full core because what you're doing is CPU bound, but it will do more during that time.

On Sun, May 11, 2014 at 1:04 AM, wrote:
> I use snappy for compression.
> but even without compression, this procedure also use 50% one core cpu.
>
> when using snappy ,this procedure use 100% one core cpu.

--
E. Sammer
CTO - ScalingData
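The two suggestions in the reply above (test at the rate you actually need, and send lists rather than single events) can be sketched roughly as follows. `PacedBatchTest`, `requiredRate` and `sendPaced` are hypothetical names, and the sink is a plain `Consumer<List<String>>` so the sketch runs without a broker. With the 0.8 Java producer, the sink would presumably build a `List<KeyedMessage<String, String>>` and pass it to the list form of `send()`; that wiring is an assumption and is shown only as a comment.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.function.Consumer;

/** Sends messages in lists at a target rate instead of one at a time in a tight loop. */
class PacedBatchTest {

    /** Messages per second needed to carry bytesPerSecond when each message is bytesPerMessage long. */
    static long requiredRate(long bytesPerSecond, long bytesPerMessage) {
        return bytesPerSecond / bytesPerMessage;
    }

    /**
     * Hands totalMessages copies of message to the sink in lists of at most batchSize,
     * pausing between lists so the average rate stays near messagesPerSecond.
     * Returns the number of lists handed to the sink.
     */
    static int sendPaced(Consumer<List<String>> sink, String message,
                         long totalMessages, int batchSize, long messagesPerSecond) {
        if (batchSize <= 0 || messagesPerSecond <= 0) {
            throw new IllegalArgumentException("batchSize and messagesPerSecond must be positive");
        }
        long nanosPerBatch = TimeUnit.SECONDS.toNanos(1) * batchSize / messagesPerSecond;
        long start = System.nanoTime();
        int batches = 0;
        long sent = 0;
        while (sent < totalMessages) {
            int n = (int) Math.min(batchSize, totalMessages - sent);
            sink.accept(Collections.nCopies(n, message));
            sent += n;
            batches++;
            // If we are ahead of schedule, pause until this batch's slot has elapsed.
            long ahead = batches * nanosPerBatch - (System.nanoTime() - start);
            if (ahead > 0) {
                LockSupport.parkNanos(ahead); // approximate pacing; may wake early
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        // 50 MB/s of 1 KB records, the figures from this thread.
        long rate = requiredRate(50L * 1024 * 1024, 1024);
        long[] received = {0};
        // Stand-in sink that only counts. With the real producer, this is where the
        // list would be converted to KeyedMessage objects and sent in one call.
        int batches = sendPaced(batch -> received[0] += batch.size(),
                "this is a sample", rate, 200, rate);
        System.out.println(received[0] + " messages in " + batches + " batches at ~" + rate + "/s");
    }
}
```

Running this with the real producer as the sink while watching `top` shows how much CPU the target rate costs, rather than how much CPU an unbounded loop can burn.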