I copied what I think might be important below (tell me if you need any other
info):
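
(For reference, I attached the profiler roughly as suggested, by adding
-agentlib:hprof=cpu=samples,depth=15 to the java command line that
./bin/kafka-producer-perf-test.sh ends up running; the exact wiring may
differ on other setups.)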
*One broker:*
rank   self     accum    count  trace   method
  1   77.55%   77.55%   32651  301152  sun.nio.ch.FileDispatcherImpl.writev0
  2    9.69%   87.24%    4080  300446  java.net.PlainSocketImpl.socketAccept
  3    1.50%   88.75%     633  301478  sun.misc.Unsafe.unpark
  4    0.62%   89.37%     260  301005  kafka.perf.ProducerPerformance$ProducerThread.run
  5    0.52%   89.89%     219  301123  scala.collection.LinearSeqOptimized$class.length
TRACE 301152:
sun.nio.ch.FileDispatcherImpl.writev0(FileDispatcherImpl.java:Unknown line)
sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
sun.nio.ch.IOUtil.write(IOUtil.java:142)
sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:483)
java.nio.channels.SocketChannel.write(SocketChannel.java:493)
kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:49)
kafka.network.Send$class.writeCompletely(Transmission.scala:73)
kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:95)
kafka.producer.SyncProducer.send(SyncProducer.scala:94)
kafka.producer.SyncProducer.multiSend(SyncProducer.scala:135)
kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:50)
kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:44)
kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:113)
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:92)
TRACE 300446:
java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
java.net.ServerSocket.implAccept(ServerSocket.java:522)
java.net.ServerSocket.accept(ServerSocket.java:490)
sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
java.lang.Thread.run(Thread.java:722)
TRACE 301478:
sun.misc.Unsafe.unpark(Unsafe.java:Unknown line)
java.util.concurrent.locks.LockSupport.unpark(LockSupport.java:152)
java.util.concurrent.locks.AbstractQueuedSynchronizer.unparkSuccessor(AbstractQueuedSynchronizer.java:662)
java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1263)
java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:460)
java.util.concurrent.LinkedBlockingQueue.signalNotFull(LinkedBlockingQueue.java:187)
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:477)
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$1.apply(ProducerSendThread.scala:69)
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$1.apply(ProducerSendThread.scala:69)
scala.collection.immutable.Stream$$anonfun$continually$1.apply(Stream.scala:598)
scala.collection.immutable.Stream$$anonfun$continually$1.apply(Stream.scala:598)
scala.collection.immutable.Stream$Cons.tail(Stream.scala:555)
scala.collection.immutable.Stream$Cons.tail(Stream.scala:549)
scala.collection.immutable.Stream$$anonfun$takeWhile$1.apply(Stream.scala:394)
scala.collection.immutable.Stream$$anonfun$takeWhile$1.apply(Stream.scala:394)
*Two brokers:*
rank   self     accum    count  trace   method
  1   59.73%   59.73%   13226  301374  sun.nio.ch.FileDispatcherImpl.writev0
  2    6.00%   65.72%    1328  300436  java.net.PlainSocketImpl.socketAccept
  3    5.07%   70.80%    1123  300925  sun.misc.Unsafe.unpark
  4    1.75%   72.54%     387  301001  kafka.perf.ProducerPerformance$ProducerThread.run
  5    0.93%   73.47%     205  301075  kafka.message.ByteBufferMessageSet.kafka$message$ByteBufferMessageSet$$internalIterator
*Three brokers:*
rank   self     accum    count  trace   method
  1   40.88%   40.88%    8471  300939  sun.misc.Unsafe.unpark
  2    9.63%   50.51%    1995  301329  sun.nio.ch.FileDispatcherImpl.writev0
  3    6.36%   56.87%    1317  300447  java.net.PlainSocketImpl.socketAccept
  4    2.85%   59.72%     591  301828  sun.misc.Unsafe.park
  5    2.13%   61.85%     442  301011  kafka.perf.ProducerPerformance$ProducerThread.run
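
Reading these: with one broker nearly all samples sit in the socket writev
(TRACE 301152), but as brokers are added the samples shift to
sun.misc.Unsafe.unpark, which TRACE 301478 shows coming from the
LinkedBlockingQueue poll inside ProducerSendThread. If I understand it right,
the async path is basically a producer/consumer handoff over a blocking
queue, roughly like this sketch (my own illustration in plain Java, not the
actual Kafka source; the class and method names here are made up):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Rough sketch of the async handoff pattern, illustrative only.
public class AsyncHandoffSketch {
    private final LinkedBlockingQueue<byte[]> queue =
            new LinkedBlockingQueue<byte[]>(10000);

    // Caller threads enqueue messages; each put can signal (unpark) a
    // thread waiting on the queue.
    public void enqueue(byte[] message) throws InterruptedException {
        queue.put(message);
    }

    // Single send thread: poll with a timeout, then do the blocking socket
    // write. The poll -> signalNotFull path is where the Unsafe.unpark
    // samples come from; the socket write is where the writev0 samples
    // come from.
    public void sendLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            byte[] message = queue.poll(200, TimeUnit.MILLISECONDS);
            if (message != null) {
                // batch up and write to the broker socket here
            }
        }
    }
}

If that reading is right, the extra CPU with more brokers would mostly be
lock signalling on that queue rather than useful work; please correct me if
I'm misreading the traces.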
On Thu, Mar 29, 2012 at 1:20 PM, Jay Kreps <[email protected]> wrote:
> Can you run with the profiler enabled so we can see where the cpu is
> going? Add something like the following to the java command line:
>
> -agentlib:hprof=cpu=samples,depth=15
>
> -Jay
>
> On Wed, Mar 28, 2012 at 8:34 PM, 刘明敏 <[email protected]> wrote:
> >
> > Wow, seems that's the case:
> >
> > When running the async producer, the CPU utilization looks like this (I use
> > vmstat to check the CPU usage):
> >
> > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
> >  r  b   swpd    free   buff   cache  si  so  bi    bo    in      cs us sy id wa st
> > 29  0  37472  336768  73560 2555980   0   0   0  2224  9137  424623 79 20  1  0  0
> > 21  0  37472 1260664  62384 1608728   0   0   0  4412 17336  422321 53 29 16  1  0
> > 14  0  37472 1252836  62424 1612268   0   0   0  7720 11486 1043389 57 40  3  0  0
> > 12  0  37472 1255632  62496 1621888   0   0   0  9920 10556 1394823 53 46  1  0  0
> > 13  2  37472 1259244  62548 1631272   0   0   0  9920 10335 1383344 51 47  2  0  0
> > 11  0  37472 1267484  61660 1601680   0   0   0  9932 16050 1360485 49 48  4  0  0
> > 14  1  37472 1257632  61732 1611536   0   0   0 10460  9847 1431470 51 48  1  0  0
> > 12  0  37472 1247524  61800 1620844   0   0   0  8804 13658 1191618 54 42  4  0  0
> > 13  0  37472 1239464  61868 1631208   0   0   0 10984 14603 1443709 50 48  2  1  0
> > 15  2  37472 1230156  61936 1641628   0   0   0 10980 11569 1480576 49 50  1  0  0
> > 12  0  37472 1220236  61992 1652180   0   0   0  9912 11564 1475147 50 49  1  1  0
> > 12  1  37472 1210812  62056 1662000   0   0   0 10964 10469 1455622 49 49  2  0  0
> > 13  0  37472 1201636  62132 1672208   0   0   0  9904 12127 1467242 49 49  1  0  0
> > 12  2  37472 1191964  62208 1682308   0   0   0 10980 12817 1448184 50 50  1  0  0
> > 12  0  37472 1181796  62264 1692868   0   0   0  9884 15399 1452468 49 49  2  0  0
> > 14  0  37472 1179980  61824 1695388   0   0   0 10988 14860 1303783 50 46  3  1  0
> > 12  2  37472 1202232  59256 1653940   0   0   0 10984 14979 1344551 47 52  2  0  0
> > 12  0  37472 1191568  59320 1664252   0   0   0  9884 10081 1488794 49 51  1  0  0
> > 14  2  37472 1181276  59396 1674340   0   0   0 10984 12535 1433143 48 48  3  1  0
> > 16  1  37472 1170788  59468 1685128   0   0   0 10960 11839 1468341 48 50  2  0  0
> > 13  0  37472 1161612  59548 1695324   0   0   0  9968 15067 1467583 48 50  1  0  0
> > 11  2  37472 1151940  59624 1705468   0   0   0 10984 13112 1482698 48 51  1  0  0
> > 15  2  37472 1141524  59696 1716104   0   0   0 10984 14507 1444776 50 48  2  1  0
> >  4  2  37472 1152144  57940 1677388   0   0   0 10992 13915 1029274 51 39  9  2  0
> >
> > while the sync producer's CPU usage is much lower, like this:
> >
> > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
> >  r  b   swpd    free   buff   cache  si  so  bi    bo   in   cs us sy id wa st
> >  1  2  37472 1148876  52496 1729876   0   0   0 21964 9423 3919 29  5 45 21  0
> >  0  2  37472 1130836  52652 1751268   0   0   0 21976 7931 4185 15  3 56 25  0
> >  0  2  37472 1112360  52820 1774020   0   0   0 23104 7275 4437 14  3 50 32  0
> >  0  2  37472 1094876  52984 1796472   0   0   0 23060 7201 3801 13  4 53 31  0
> >  3  2  37472 1078572  53140 1819200   0   0   0 23064 6965 4418 13  4 57 26  0
> >  4  0  37472 1061400  53288 1840004   0   0   0 20864 7027 3930 12  3 58 26  0
> >  3  0  37472 1043916  53444 1861068   0   0   0 21964 6957 4379 14  3 61 21  0
> >  3  2  37472 1025324  53616 1883332   0   0   0 23068 7062 4104 14  4 64 18  0
> >  4  2  37472 1008708  53764 1904044   0   0   0 20868 6246 3802 12  3 57 29  0
> >
> > Can you help me understand why the async producer is so much more CPU
> > intensive than the sync producer?
> >
> > On Thu, Mar 29, 2012 at 12:04 AM, Jay Kreps <[email protected]> wrote:
> >
> > > Ah, but what is the CPU utilization? Can you run "top" or something during
> > > the test? My suspicion is that the producer is maybe just CPU bound.
> > >
> > > -Jay
> > >
> > > On Wed, Mar 28, 2012 at 3:57 AM, 刘明敏 <[email protected]> wrote:
> > >
> > > > 4 CPUs on the producer:
> > > >
> > > > processor : 0
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 30
> > > > model name : Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
> > > > stepping : 5
> > > > cpu MHz : 2393.580
> > > > cache size : 8192 KB
> > > > physical id : 0
> > > > siblings : 4
> > > > core id : 0
> > > > cpu cores : 4
> > > > apicid : 0
> > > > initial apicid : 0
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 11
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
> > > > clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
> > > > constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc
> > > > aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm
> > > > sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
> > > > bogomips : 4787.16
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 36 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > > processor : 1
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 30
> > > > model name : Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
> > > > stepping : 5
> > > > cpu MHz : 2393.580
> > > > cache size : 8192 KB
> > > > physical id : 0
> > > > siblings : 4
> > > > core id : 1
> > > > cpu cores : 4
> > > > apicid : 2
> > > > initial apicid : 2
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 11
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
> > > > clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
> > > > constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc
> > > > aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm
> > > > sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
> > > > bogomips : 4787.17
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 36 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > > processor : 2
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 30
> > > > model name : Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
> > > > stepping : 5
> > > > cpu MHz : 2393.580
> > > > cache size : 8192 KB
> > > > physical id : 0
> > > > siblings : 4
> > > > core id : 2
> > > > cpu cores : 4
> > > > apicid : 4
> > > > initial apicid : 4
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 11
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
> > > > clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
> > > > constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc
> > > > aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm
> > > > sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
> > > > bogomips : 4787.17
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 36 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > > processor : 3
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 30
> > > > model name : Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
> > > > stepping : 5
> > > > cpu MHz : 2393.580
> > > > cache size : 8192 KB
> > > > physical id : 0
> > > > siblings : 4
> > > > core id : 3
> > > > cpu cores : 4
> > > > apicid : 6
> > > > initial apicid : 6
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 11
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
> > > > clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
> > > > constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc
> > > > aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm
> > > > sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
> > > > bogomips : 4787.17
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 36 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > > ===
> > > > Not sure if I am sending to the right place this time.
> > > >
> > > > 2012/3/28 Jay Kreps <[email protected]>
> > > >
> > > > > What does CPU look like on the producer?
> > > > >
> > > > > Also, can you please use the Apache mailing list (cc'd). This Google
> > > > > group is from our pre-Apache days and is long dead.
> > > > >
> > > > > -Jay
> > > > >
> > > > > On Tue, Mar 27, 2012 at 7:42 PM, Kafka <[email protected]> wrote:
> > > > > > A message has been sent to the kafka-dev group
> > > > > > <http://groups.google.com/group/kafka-dev?hl=en_US> and is awaiting
> > > > > > approval. We've included this message for your review.
> > > > > >
> > > > > > The message requires moderation because the user does not have
> > > > > > permission to post.
> > > > > >
> > > > > > You can approve or reject this message
> > > > > > <http://groups.google.com/group/kafka-dev/pendmsg?hl=en_US>, or you can
> > > > > > approve it by replying to this email.
> > > > > >
> > > > > >
> > > > > > ---------- Forwarded message ----------
> > > > > > From: 刘明敏 <[email protected]>
> > > > > > To: [email protected]
> > > > > > Cc: [email protected]
> > > > > > Date: Wed, 28 Mar 2012 10:42:11 +0800
> > > > > > Subject: Re: [kafka-dev] performance of async producer stops increase
> > > > > > when broker is more than 2
> > > > > > I have 1 producer on 1 machine.
> > > > > >
> > > > > > The perf of the sync producer is:
> > > > > >
> > > > > > 1 broker:
> > > > > > *20.3277 (M/s)*, *213151.4441 (#/s)* (sync)
> > > > > > 2 brokers:
> > > > > > *39.5995*, *415230.6606* (sync)
> > > > > > 3 brokers:
> > > > > > *58.0836*, *609050.4903* (sync)
> > > > > >
> > > > > > compared with the async producer:
> > > > > >
> > > > > > when 1 broker:
> > > > > > 20.3164 M/s, 213033.3823 #/s (async)
> > > > > >
> > > > > > when 2 brokers:
> > > > > > 37.4754 M/s, 392958.1892 #/s (async)
> > > > > >
> > > > > > when 3 brokers (perf stops increasing):
> > > > > > 36.6925 M/s, 384748.5668 #/s (async)
> > > > > >
> > > > > > The sync producer reached 600K #/s (far beyond the async producer's
> > > > > > 384K #/s), so I doubt it is a case of maxing out the network or the
> > > > > > producer machine.
> > > > > >
> > > > > > ===
> > > > > > and here is the thread from the Google group:
> > > > > > https://groups.google.com/forum/?fromgroups#!forum/kafka-dev
> > > > > >
> > > > > > On Wed, Mar 28, 2012 at 3:25 AM, Jay Kreps <[email protected]> wrote:
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >> How many producers are you using and on how many machines? Also, what
> > > > > >> is the network speed (1 Gbit)? Likely you are maxing out either the
> > > > > >> producer machine(s) or the network itself.
> > > > > >>
> > > > > >> Also, quick question: how did you come across this address? This address
> > > > > >> has been dead for a while now, and the new mailing list is on Apache, so
> > > > > >> I just want to make sure we don't have a dangling pointer somewhere. Would
> > > > > >> be good to move the discussion to the Apache list so others can find it.
> > > > > >>
> > > > > >> -Jay
> > > > > >>
> > > > > >> On Tue, Mar 27, 2012 at 2:10 AM, mmLiu <[email protected]> wrote:
> > > > > >>
> > > > > >> > I tested the performance of the producer with
> > > > > >> > ./bin/kafka-producer-perf-test.sh
> > > > > >> >
> > > > > >> > when 1 broker:
> > > > > >> > 20.3164 M/s, 213033.3823 #/s (async)
> > > > > >> >
> > > > > >> > when 2 brokers:
> > > > > >> > 37.4754 M/s, 392958.1892 #/s (async)
> > > > > >> >
> > > > > >> > when 3 brokers (perf stops increasing):
> > > > > >> > 36.6925 M/s, 384748.5668 #/s (async)
> > > > > >> >
> > > > > >> > when 4 brokers (it even decreases):
> > > > > >> > 33.2244 M/s, 348383.5006 #/s (async)
> > > > > >> >
> > > > > >> >
> > > > > >> > Any idea why?
> > > > > >> >
> > > > > >> > FYI: the performance of the sync producer increases (almost) linearly
> > > > > >> > as the number of brokers increases.
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best Regards
> > > > > >
> > > > > > ----------------------
> > > > > > 刘明敏 | mmLiu
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards
> > > >
> > > > ----------------------
> > > > 刘明敏 | mmLiu
> > > >
> > >
> >
> >
> >
> > --
> > Best Regards
> >
> > ----------------------
> > 刘明敏 | mmLiu
>
--
Best Regards
----------------------
刘明敏 | mmLiu