I'm seeing some strange behavior on my cluster. Jobs finish (that is, all tasks complete), but the job still shows as "running". This state persists for minutes and is really killing my throughput.

I'm seeing errors (warnings) in the jobtracker log that look like this:

2009-02-06 12:37:08,425 WARN /: /taskgraph?type=reduce&jobid=job_200902061117_0012:
java.lang.ArrayIndexOutOfBoundsException: 3
        at org.apache.hadoop.mapred.StatusHttpServer$TaskGraphServlet.getReduceAvarageProgresses(StatusHttpServer.java:228)
        at org.apache.hadoop.mapred.StatusHttpServer$TaskGraphServlet.doGet(StatusHttpServer.java:159)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
        at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
        at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
        at org.mortbay.http.HttpServer.service(HttpServer.java:954)
        at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
        at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
        at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
        at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
        at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
        at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)


I'm running hadoop-0.19.0. Any ideas?

-Bryan
