If IPv6 is enabled on your Hadoop cluster, it may cause a severe performance hit!
When I ran the gridmix2 benchmark on a newly constructed cluster, it took 30% more time than the baseline obtained on a similar cluster. I noticed that some task processes on some machines took 3+ minutes to initialize. After examining these processes in detail, I found that they were stuck at socket initialization time, as shown in the following stack:

"main" prio=10 tid=0x0805b400 nid=0x4681 runnable [0xf7fbb000..0xf7fbc208]
   java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.initProto(Native Method)
    at java.net.PlainSocketImpl.<clinit>(PlainSocketImpl.java:84)
    at java.net.Socket.setImpl(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:68)
    at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
    at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
    at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
    - locked <0xf17a38c8> (a java.lang.Object)
    at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:298)
    - locked <0xf1795db0> (a org.apache.hadoop.ipc.Client$Connection)
    at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:178)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:820)
    at org.apache.hadoop.ipc.Client.call(Client.java:705)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:335)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:372)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2188)

I searched the web and found that this is due to a known Java bug related to IPv6.
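To check whether a particular machine shows the same stall, something like the following rough sketch may help. It times the creation of the first unconnected Socket in a fresh JVM, which on the JDK shown in the stack above is where the socket implementation's static initialization runs. The class name SocketInitTimer and its helper method are made up for illustration, and newer JDKs use a different socket implementation, so a fast result there does not prove much.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

// Hypothetical diagnostic, not part of Hadoop: times socket creation in this JVM.
class SocketInitTimer {

    // Creates and closes one unconnected Socket and returns the elapsed milliseconds.
    static long timeSocketCreationMillis() {
        long start = System.nanoTime();
        Socket socket = new Socket(); // unconnected; the first call loads the socket implementation
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        try {
            socket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Prints "null" when the property is not set on the command line.
        System.out.println("java.net.preferIPv4Stack="
                + System.getProperty("java.net.preferIPv4Stack"));
        long first = timeSocketCreationMillis();   // includes one-time initialization
        long second = timeSocketCreationMillis();  // initialization already done
        System.out.println("first socket:  " + first + " ms");
        System.out.println("second socket: " + second + " ms");
    }
}
```

Running it once as is and once with -Djava.net.preferIPv4Stack=true on the java command line gives a quick comparison; a first-socket time in the range of minutes without the flag would be consistent with the hang described here.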
More information about the bug can be found on the following two pages:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6483406
http://edocs.bea.com/jrockit/releases/5026x/relnotes/relnotes.html

{quote}
Slow startup because of a hang in java.net.PlainSocketImpl.initProto(), which typically is called when creating the first Socket or ServerSocket. In BEA JRockit 5.0 R26 the network stack is configured so that IPv6 is used in preference to IPv4 when it is present. During initialization of the network stack, the network code connects a socket to its own loopback interface to set up some data structures. Blocking this connection (e.g. with a firewall) will cause the initialization code to wait for a socket timeout, after which the system falls back on using IPv4.
{quote}

Suggested workaround: either set -Djava.net.preferIPv4Stack=true in the child process options, which forces Java to use IPv4 instead, or disable IPv6 entirely on the system. The proper fix is to allow IPv6 traffic from localhost to localhost.

For more information, see the Sun documentation:
http://java.sun.com/j2se/1.4.2/docs/guide/net/ipv6_guide/#ipv6-networking

Runping