In plugin.includes? I'm using protocol-httpclient.

On Thu, Dec 5, 2013 at 12:08 PM, Nguyen Manh Tien <[email protected]> wrote:

> Which protocol are you using, Amit?
>
>
> On Wed, Dec 4, 2013 at 10:46 PM, Amit Sela <[email protected]> wrote:
>
> > In my case, the fetch got to that point in 45 minutes and has been stuck
> > for another 75 minutes with those mappers.
> > The log just keeps printing:
> >
> > org.apache.nutch.fetcher.Fetcher: -activeThreads=2, spinWaiting=0,
> > fetchQueues.totalSize=0
> >
> > org.apache.nutch.fetcher.Fetcher: -activeThreads=2, spinWaiting=0,
> > fetchQueues.totalSize=0
> >
> > org.apache.nutch.fetcher.Fetcher: -activeThreads=2, spinWaiting=0,
> > fetchQueues.totalSize=0
> >
> > ....
> >
> >
> > On Wed, Dec 4, 2013 at 4:31 PM, feng lu <[email protected]> wrote:
> >
> > > I see that it uses a while loop to wait for the threads to exit, and
> > > it waits 1 second between each check. So even if the fetcher threads
> > > have finished, the whole fetcher process will take a little longer to
> > > exit.
> > >
> > > The code structure is like this:
> > >
> > >     do {                      // wait for threads to exit
> > >       pagesLastSec = pages.get();
> > >       bytesLastSec = (int)bytes.get();
> > >
> > >       try {
> > >         Thread.sleep(1000);
> > >       } catch (InterruptedException e) {}
> > >
> > >       ....
> > >       // your print output is coming from here
> > >       reportStatus(pagesLastSec, bytesLastSec);
> > >
> > >       LOG.info("-activeThreads=" + activeThreads + ", spinWaiting="
> > >           + spinWaiting.get()
> > >           + ", fetchQueues.totalSize=" + fetchQueues.getTotalSize());
> > >
> > >       if (!feeder.isAlive() && fetchQueues.getTotalSize() < 5) {
> > >         fetchQueues.dump();
> > >       }
> > >       ....
> > >       // check timelimit
> > >       if (!feeder.isAlive()) {
> > >         int hitByTimeLimit = fetchQueues.checkTimelimit();
> > >         if (hitByTimeLimit != 0) reporter.incrCounter("FetcherStatus",
> > >             "hitByTimeLimit", hitByTimeLimit);
> > >       }
> > >
> > >       // some requests seem to hang, despite all intentions
> > >       if ((System.currentTimeMillis() - lastRequestStart.get()) > timeout) {
> > >         if (LOG.isWarnEnabled()) {
> > >           LOG.warn("Aborting with "+activeThreads+" hung threads.");
> > >         }
> > >         return;
> > >       }
> > >
> > >     } while (activeThreads.get() > 0);
> > >
> > >
> > > On Wed, Dec 4, 2013 at 7:57 PM, Amit Sela <[email protected]> wrote:
> > >
> > > > In the fetch phase, I notice that some of the mappers take much
> > > > longer to finish.
> > > > In the running task MapReduce admin screen it shows
> > > >
> > > > *1 threads, 1 queues, 0 URLs queued, *
> > > >
> > > > So why are those tasks not complete?
> > > >
> > >
> > >
> > > --
> > > Don't Grow Old, Grow Up... :-)
> > >
> >
>
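
For reference, here is a minimal standalone sketch of the two exit conditions in the loop quoted in this thread. It is not Nutch's Fetcher code: the class name FetcherWaitLoopSketch, the Outcome enum, the waitForThreads method and its timeoutMs/pollMs parameters are all hypothetical names made up for illustration. Only the shape of the loop (sleep, compare the time since lastRequestStart against a timeout, repeat while activeThreads > 0) is taken from the quoted snippet.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not part of Apache Nutch.
public class FetcherWaitLoopSketch {

    /** The ways this sketch's wait loop can end. */
    enum Outcome { ALL_THREADS_EXITED, ABORTED_ON_TIMEOUT, INTERRUPTED }

    /**
     * Polls every pollMs milliseconds until no threads are active, or until
     * more than timeoutMs milliseconds have passed since lastRequestStart.
     */
    static Outcome waitForThreads(AtomicInteger activeThreads,
                                  AtomicLong lastRequestStart,
                                  long timeoutMs,
                                  long pollMs) {
        do {
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return Outcome.INTERRUPTED;
            }
            // Same comparison as the "some requests seem to hang" check.
            if (System.currentTimeMillis() - lastRequestStart.get() > timeoutMs) {
                return Outcome.ABORTED_ON_TIMEOUT;
            }
        } while (activeThreads.get() > 0);
        return Outcome.ALL_THREADS_EXITED;
    }

    public static void main(String[] args) {
        // Two threads that never finish and never start a new request.
        AtomicInteger hung = new AtomicInteger(2);
        AtomicLong hungStart = new AtomicLong(System.currentTimeMillis());
        System.out.println(waitForThreads(hung, hungStart, 200, 20));

        // No active threads: the loop ends after a single poll.
        AtomicInteger done = new AtomicInteger(0);
        AtomicLong doneStart = new AtomicLong(System.currentTimeMillis());
        System.out.println(waitForThreads(done, doneStart, 200, 20));
    }
}
```

In this sketch, a state like activeThreads=2 with an empty queue keeps the loop running until either both threads exit or the time since the last request start exceeds the timeout, so the loop itself only reports the stall rather than causing it.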

