[ https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yogendra Kumar Soni updated NUTCH-2124:
---------------------------------------
          Flags: Important
         Labels: db_gone fetcher redirect  (was: )
    Description:
Hello, followredirect is not working in trunk. Please see the log below.

Fetcher: throughput threshold retries: 5
fetcher.maxNum.threads can't be < than 50 : using 50 instead
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=1
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=1
fetching http://www.wikipedia.com/wiki/URL_redirection (queue crawl delay=5000ms)
fetching http://www.wikipedia.com/wiki/URL_redirection (queue crawl delay=5000ms)
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=2
fetching http://www.wikipedia.com/wiki/URL_redirection (queue crawl delay=5000ms)
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=2
fetching http://www.wikipedia.com/wiki/URL_redirection (queue crawl delay=5000ms)
fetching http://www.wikipedia.com/wiki/URL_redirection (queue crawl delay=5000ms)
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=2
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=2
'''- redirect count exceeded http://www.wikipedia.com/wiki/URL_redirection'''
Thread FetcherThread has no more work available
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=2
-activeThreads=0
Fetcher: finished at 2015-09-28 19:32:05, elapsed: 00:00:09
Parsing : 20150928193153

> redirect following same link again and again, max redirect exceed and went db_gone
> -----------------------------------------------------------------------------------
>
>                 Key: NUTCH-2124
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2124
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.11
>            Reporter: Yogendra Kumar Soni
>              Labels: db_gone, fetcher, redirect

--
This message was sent by Atlassian JIRA (v6.3.4#6332)