Hi - I don't think the indexing stage is reached at all, judging from the
MapOutputServlet frames in the stack trace. We sometimes see this happen during
the shuffle stage. Some mapred limits need to be adjusted to overcome it, but I
don't remember which. You can always decrease the size of each job and just run
more jobs; that is probably much more efficient in overall throughput, because
the shuffle stage is always expensive.
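
A rough way to check that the resets happen between TaskTrackers rather than on
the Solr connection is to tally the clienttrace entries by host pair. The
sketch below is only an illustration: the names CLIENTTRACE and shuffle_pairs
are made up for this example, and the pattern is assumed from the single
clienttrace entry in the quoted log, so it may need adjusting for other
Hadoop versions.

```python
import re
from collections import Counter

# Pattern assumed from the clienttrace entry in the quoted TaskTracker log:
# "src: <ip>:<port>, dest: <ip>:<port>, bytes: <n>, op: <OP>, ..."
CLIENTTRACE = re.compile(
    r"src:\s*(?P<src>[\d.]+):\d+,\s*dest:\s*(?P<dest>[\d.]+):\d+,"
    r"\s*bytes:\s*(?P<bytes>\d+),\s*op:\s*(?P<op>\w+)"
)


def shuffle_pairs(log_text):
    """Count MAPRED_SHUFFLE transfers per (source host, destination host)."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in CLIENTTRACE.finditer(flat):
        if match.group("op") == "MAPRED_SHUFFLE":
            counts[(match.group("src"), match.group("dest"))] += 1
    return counts


# The clienttrace entry from the quoted log, wrapped as it was in the mail.
sample = """2014-09-25 00:40:25,580 INFO
org.apache.hadoop.mapred.TaskTracker.clienttrace: src:
172.31.36.63:50060,
dest: 172.31.36.65:53836, bytes: 720896, op: MAPRED_SHUFFLE, cliID:
attempt_201409241419_0033_m_000018_0, duration: 5555977"""

for (src, dest), n in shuffle_pairs(sample).items():
    print(src, "->", dest, n)
```

If most entries involve the same one or two hosts, that would point at those
nodes rather than at Solr.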

Markus

 
 
-----Original message-----
> From:Talat Uyarer <ta...@uyarer.com>
> Sent: Monday 29th September 2014 6:57
> To: user@nutch.apache.org
> Subject: Re: Solr Indexer Reduce Tasks "fail to report status"
> 
> Hi Jonathan,
> 
> Sorry for the late response.
> 
> I guess your commit size is too high for your Solr server. Maybe you have
> large web pages, and because of them your commits take a long time. Can you
> try decreasing your commit size, and can you check the HTTP content limit?
> 
> Talat
> On Sep 26, 2014 4:37 PM, "Jonathan Cooper-Ellis" <j...@ziftr.com> wrote:
> 
> > Hi Talat,
> >
> > Thanks for the reply. I looked in the solr logs as well and nothing jumped
> > out at me. I didn't notice anything interesting in the JobTracker logs
> > either. The logs I included in the original message are the logs from the
> > TaskTracker when it fails, unless you're talking about a different task log
> > that I don't know about.
> >
> > Do you have any other ideas?
> >
> > Thanks,
> > jce
> >
> > On Fri, Sep 26, 2014 at 12:41 AM, Talat Uyarer <ta...@uyarer.com> wrote:
> >
> > > Hi Jonathan,
> > >
> > > Did you check your Solr log file? Something may be going wrong on the
> > > Solr side. Another question: did you check the task log of your failed
> > > attempt? These logs are useful for debugging.
> > >
> > > Talat
> > > On Sep 25, 2014 10:59 PM, "Jonathan Cooper-Ellis" <j...@ziftr.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > I have been running Nutch 1.9 on Hadoop 1.2.1 using the deploy/bin/crawl
> > > > script for a little while with no problems. However, I just increased the
> > > > scope of the crawl pretty significantly, and now *most* of my Indexer jobs
> > > > are failing on the reduce task showing the error "Task
> > > > attempt_201409241419_0046_r_000000_3 failed to report status for 600
> > > > seconds. Killing!". From the TT logs, the main issue seems to be "Caused
> > > > by: java.io.IOException: Connection reset by peer".
> > > >
> > > > I found some suggestions that these errors could be caused by somaxconn
> > > > being too low, so I increased from 128 to 256 on the node running Solr and
> > > > the JT and it didn't help. I also bumped the memory for MR tasks up to
> > > > 1024m from 700-something which doesn't seem to have helped either.
> > > >
> > > > Has anyone seen this before? Or have any idea what could cause this?
> > > >
> > > > Here is the relevant excerpt from the TT logs:
> > > >
> > > > 2014-09-25 00:40:25,580 WARN org.apache.hadoop.mapred.TaskTracker:
> > > > getMapOutput(attempt_201409241419_0033_m_000018_0,0) failed :
> > > > org.mortbay.jetty.EofException
> > > > at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:551)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:572)
> > > > at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:651)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
> > > > at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4125)
> > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > > > at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > > > at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> > > > at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
> > > > at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > > > at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > > > at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > > > at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > > > at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > > > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > > > at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > > > at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > > > at org.mortbay.jetty.Server.handle(Server.java:326)
> > > > at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > > > at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > > > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> > > > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > > > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > > > at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> > > > at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> > > > Caused by: java.io.IOException: Connection reset by peer
> > > > at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> > > > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> > > > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> > > > at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> > > > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
> > > > at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:170)
> > > > at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
> > > > at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:725)
> > > > ... 27 more
> > > >
> > > > 2014-09-25 00:40:25,580 WARN org.mortbay.log: Committed before 410
> > > > getMapOutput(attempt_201409241419_0033_m_000018_0,0) failed :
> > > > org.mortbay.jetty.EofException
> > > > at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:551)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:572)
> > > > at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:651)
> > > > at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
> > > > at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4125)
> > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > > > at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > > > at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> > > > at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
> > > > at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > > > at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > > > at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > > > at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > > > at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > > > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > > > at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > > > at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > > > at org.mortbay.jetty.Server.handle(Server.java:326)
> > > > at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > > > at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > > > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> > > > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > > > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > > > at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> > > > at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> > > > Caused by: java.io.IOException: Connection reset by peer
> > > > at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> > > > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> > > > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> > > > at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> > > > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
> > > > at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:170)
> > > > at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
> > > > at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:725)
> > > > ... 27 more
> > > >
> > > > 2014-09-25 00:40:25,580 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace:
> > > > src: 172.31.36.63:50060, dest: 172.31.36.65:53836, bytes: 720896, op: MAPRED_SHUFFLE,
> > > > cliID: attempt_201409241419_0033_m_000018_0, duration: 5555977
> > > > 2014-09-25 00:40:25,581 ERROR org.mortbay.log: /mapOutput
> > > > java.lang.IllegalStateException: Committed
> > > > at org.mortbay.jetty.Response.resetBuffer(Response.java:1023)
> > > > at org.mortbay.jetty.Response.sendError(Response.java:240)
> > > > at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4162)
> > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > > > at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > > > at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> > > > at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
> > > > at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > > > at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > > > at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > > > at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > > > at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > > > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > > > at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > > > at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > > > at org.mortbay.jetty.Server.handle(Server.java:326)
> > > > at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > > > at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > > > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> > > > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > > > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > > > at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> > > > at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> > > >
> > > >
> > > > Best,
> > > > jce
> > > >
> > >
> >
> 
