Hi,
The job froze after the filesystem hung on a machine that had already completed a map task successfully.
Is there a flag to enable rescheduling of such a task?


jstack output of the JobTracker:

"SocketListener0-2" prio=10 tid=0x08916000 nid=0x4a4f runnable [0x4d05c000..0x4d05ce30]
  java.lang.Thread.State: RUNNABLE
       at java.net.SocketInputStream.socketRead0(Native Method)
       at java.net.SocketInputStream.read(SocketInputStream.java:129)
       at org.mortbay.util.LineInput.fill(LineInput.java:469)
       at org.mortbay.util.LineInput.fillLine(LineInput.java:547)
       at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293)
       at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277)
       at org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238)
       at org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861)
       at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907)
       at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
       at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
       at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
       at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

  Locked ownable synchronizers:
       - None


"SocketListener0-1" prio=10 tid=0x4da8c800 nid=0xeeb runnable [0x4d266000..0x4d2670b0]
  java.lang.Thread.State: RUNNABLE
       at java.net.SocketInputStream.socketRead0(Native Method)
       at java.net.SocketInputStream.read(SocketInputStream.java:129)
       at org.mortbay.util.LineInput.fill(LineInput.java:469)
       at org.mortbay.util.LineInput.fillLine(LineInput.java:547)
       at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293)
       at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277)
       at org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238)
       at org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861)
       at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907)
       at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
       at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
       at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
       at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

"IPC Server listener on 54311" daemon prio=10 tid=0x4df70400 nid=0xe86 runnable [0x4d9fe000..0x4d9feeb0]
  java.lang.Thread.State: RUNNABLE
       at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
       at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
       at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
       at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
       - locked <0x54fb4320> (a sun.nio.ch.Util$1)
       - locked <0x54fb4310> (a java.util.Collections$UnmodifiableSet)
       - locked <0x54fb40b8> (a sun.nio.ch.EPollSelectorImpl)
       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
       at org.apache.hadoop.ipc.Server$Listener.run(Server.java:296)

  Locked ownable synchronizers:
       - None

"IPC Server Responder" daemon prio=10 tid=0x4da22800 nid=0xe85 runnable [0x4db75000..0x4db75e30]
  java.lang.Thread.State: RUNNABLE
       at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
       at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
       at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
       at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
       - locked <0x54fdddd0> (a sun.nio.ch.Util$1)
       - locked <0x54fdce10> (a java.util.Collections$UnmodifiableSet)
       - locked <0x54fdcc18> (a sun.nio.ch.EPollSelectorImpl)
       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
       at org.apache.hadoop.ipc.Server$Responder.run(Server.java:455)

  Locked ownable synchronizers:
       - None

"RMI TCP Accept-0" daemon prio=10 tid=0x4da13400 nid=0xe31 runnable [0x4de55000..0x4de56130]
  java.lang.Thread.State: RUNNABLE
       at java.net.PlainSocketImpl.socketAccept(Native Method)
       at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
       - locked <0x54f6dae0> (a java.net.SocksSocketImpl)
       at java.net.ServerSocket.implAccept(ServerSocket.java:453)
       at java.net.ServerSocket.accept(ServerSocket.java:421)
       at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34)
       at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
       at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
       at java.lang.Thread.run(Thread.java:619)

  Locked ownable synchronizers:
       - None

-Sagar
