Hi,

I started to trace the TC cluster code with DTrace. As a first simple experiment I examined the mcast subcomplex. CPU usage and especially elapsed times might be slightly larger than expected, because I used the more intrusive extended DTrace probes.
All tests were done with JDK 1.6.0 build 72 on Solaris 10 SPARC, using a T2000 system (8 cores with 4 threads each).

Method                            allocation  elapsed   cpu
mcast/McastServiceImpl.send()     1328B       40ms      6.7ms
mcast/McastServiceImpl.receive()  744-760B    23-510ms  3.5ms

So this does not look very bad, but I inspected where the CPU time is used and where the objects get allocated:

1) send

CPU time, elapsed time and allocation, per send:

cpu     elapsed   alloc
6.7ms   40.3ms    1328B  McastServiceImpl.send()
6.3ms   37.6ms    1296B  McastMember.getData()
1.3ms    7.6ms     136B  ... getName().getBytes();
0.9ms    5.3ms     120B  ... getDomain().getBytes();
3.6ms   21.2ms     880B  ... java.net.InetAddress.getByName(host)...

It's interesting that the last three statements are responsible for 85% of the allocation and 90% of the CPU usage. This seems unnecessary: for sending the multicast heartbeat, name, domain and host do not change, so the serialized data could easily be cached (see the first sketch at the end of this mail).

2) receive

CPU time, elapsed time and allocation, per receive:

cpu     elapsed    alloc
3.5ms   23-510ms   744-760B  McastServiceImpl.receive()
0.1ms   3-485ms    0B        socket.receive()
2.8ms   17.9ms     640B      McastMember.getMember()
1.4ms    8.1ms     160B      new String(named),
1.0ms    5.6ms     128B      new String(domaind),
0.1ms    0.6ms     112B      addressToString(addr),
0.3ms    2.7ms      16B      ... membership.memberAlive(m) [only 50%]
0.2ms    2.3ms      32B      ... membership.expire(timeToExpiration);
...

The calls to memberAlive are much faster in about 50% of the cases. Most interesting here is the fact that the string construction takes 70% of the total CPU time. I inspected why this is so. For the next trace I had to increase the depth I traced, so the elapsed times go up a little. The columns are:

- ">" or "<": ">" means start of a method, "<" its end
- stack depth relative to the receive method
- cpu time
- elapsed time
- allocated bytes
- class and method
> 3   -      -      -    java/lang/String.<init>
> 4   -      -      -    java/lang/String.<init>
> 5   -      -      -    java/lang/StringCoding.decode
> 6   -      -      -    java/lang/StringCoding.decode
> 7   -      -      -    java/lang/StringCoding.deref
< 7   0.1ms  0.7ms  0    java/lang/StringCoding.deref
> 7   -      -      -    java/lang/StringCoding$StringDecoder.decode
> 8   -      -      -    java/nio/ByteBuffer.wrap
< 8   0.1ms  0.5ms  48   java/nio/ByteBuffer.wrap
> 8   -      -      -    java/nio/CharBuffer.wrap
< 8   0.1ms  0.5ms  48   java/nio/CharBuffer.wrap
> 8   -      -      -    java/nio/charset/CharsetDecoder.decode
< 8   0.7ms  4.4ms   0    java/nio/charset/CharsetDecoder.decode
< 7   1.1ms  7.8ms   160  java/lang/StringCoding$StringDecoder.decode
< 6   1.2ms  9.1ms   160  java/lang/StringCoding.decode
< 5   1.2ms  9.5ms   160  java/lang/StringCoding.decode
< 4   1.3ms  10.1ms  160  java/lang/String.<init>
< 3   1.3ms  10.2ms  160  java/lang/String.<init>

So we can see that most of the CPU time is needed inside:

- java/nio/charset/CharsetDecoder.decode: 0.7ms
- java/lang/StringCoding.deref:           0.1ms
- java/nio/ByteBuffer.wrap:               0.1ms
- java/nio/CharBuffer.wrap:               0.1ms

Maybe it would help to use a more byte-oriented format in the heartbeat messages and to not convert from it on every receive. The string conversions could then be restricted to the case where we log in debug mode. Two rough sketches of what I mean follow.
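First the send side. This is rough and untested, and the class, field and helper names are made up for illustration (the real McastMember.getData() builds a more complete packet with lengths, port etc.); the point is only that the three expensive conversions can run once per member instead of once per send:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.InetAddress;

public class CachedMemberData {
    private final String name;
    private final String domain;
    private final String host;
    private volatile byte[] cachedData; // heartbeat payload, immutable per member

    public CachedMemberData(String name, String domain, String host) {
        this.name = name;
        this.domain = domain;
        this.host = host;
    }

    // Replaces the per-send work of McastMember.getData():
    // getBytes() and InetAddress.getByName() now run only on the
    // first call, later sends reuse the cached bytes.
    public byte[] getData() throws IOException {
        byte[] data = cachedData;
        if (data == null) {
            byte[] named = name.getBytes();
            byte[] domaind = domain.getBytes();
            byte[] addr = InetAddress.getByName(host).getAddress();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            buf.write(named);
            buf.write(domaind);
            buf.write(addr);
            // (simplified: the real packet layout is more complete)
            data = buf.toByteArray();
            cachedData = data; // benign race: every thread computes the same bytes
        }
        return data;
    }
}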
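And a sketch for the receive side: keep the member data from getMember() as raw bytes and convert to String only on demand, e.g. when logging in debug mode. Again the names are illustrative, this is not the real McastMember code:

import java.util.Arrays;

public class LazyMember {
    private final byte[] named;
    private final byte[] domaind;
    private final byte[] addr;
    private String name;   // created on demand
    private String domain; // created on demand

    public LazyMember(byte[] named, byte[] domaind, byte[] addr) {
        this.named = named;
        this.domaind = domaind;
        this.addr = addr;
    }

    // Identity checks (e.g. for membership.memberAlive) can work on
    // the raw bytes, no String construction needed per packet.
    public boolean sameMember(LazyMember other) {
        return Arrays.equals(named, other.named)
            && Arrays.equals(domaind, other.domaind)
            && Arrays.equals(addr, other.addr);
    }

    // The expensive decode only happens if somebody actually asks,
    // e.g. when logging in debug mode.
    public String getName() {
        if (name == null) {
            name = new String(named);
        }
        return name;
    }

    public String getDomain() {
        if (domain == null) {
            domain = new String(domaind);
        }
        return domain;
    }
}

Cheers,

Rainer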