[ https://issues.apache.org/jira/browse/NUTCH-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney resolved NUTCH-2550.
-----------------------------------------
    Resolution: Fixed

> Fetcher fails to follow redirects
> ---------------------------------
>
>                 Key: NUTCH-2550
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2550
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.15
>            Reporter: Hans Brende
>            Priority: Blocker
>             Fix For: 1.15
>
>
> As I detailed in this github
> [comment|https://github.com/apache/nutch/commit/c93d908bb635d3c5b59f8c8a22e0584ebf588794#r28470348],
> it appears that PR #221 broke redirects. The fetcher will repeatedly fetch
> the *original url* rather than the one it's supposed to be redirecting to
> until {{http.redirect.max}} is exceeded, and then end with
> {{STATUS_FETCH_GONE}}.
> I noticed this issue when I was trying to crawl a site with a 301 MOVED
> PERMANENTLY status code.
> Should be pretty easy to fix though: I was able to get redirects working
> again simply by inserting the code {code:java}url = fit.url{code}
> [here|https://github.com/apache/nutch/blob/8682b96c3b84018f187eabaadc096ceded34f250/src/java/org/apache/nutch/fetcher/FetcherThread.java#L388]
> and
> [here|https://github.com/apache/nutch/blob/8682b96c3b84018f187eabaadc096ceded34f250/src/java/org/apache/nutch/fetcher/FetcherThread.java#L409].



--
This message was sent by Atlassian JIRA (v7.6.3#76005)
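
For readers who want to see the failure mode in isolation, here is a minimal sketch of the behaviour the report describes. It is not the Nutch FetcherThread code: the class name, the fetch method, the updateUrl flag, the example.com URLs and the in-memory redirect table are all hypothetical stand-ins. It only illustrates why a loop that never advances its url variable to the redirect target keeps hitting the original URL until the redirect limit is exceeded, and why a single reassignment (analogous to the reported url = fit.url) lets it terminate.

```java
import java.util.Map;

class RedirectLoopSketch {
    // Hypothetical stand-in for a remote server: maps a URL to its redirect target.
    // A URL that is absent from the map serves content directly.
    static final Map<String, String> REDIRECTS =
        Map.of("http://example.com/old", "http://example.com/new");

    /**
     * Follows redirects starting at {@code start}.
     * Returns the URL that finally served content, or null when more than
     * {@code maxRedirects} redirects were seen (the "give up" case).
     * {@code updateUrl} is an illustrative switch: false reproduces the
     * reported bug, true applies the analogue of the reported fix.
     */
    static String fetch(String start, int maxRedirects, boolean updateUrl) {
        String url = start;
        int redirectCount = 0;
        while (true) {
            String target = REDIRECTS.get(url);
            if (target == null) {
                return url; // no redirect: content fetched from this URL
            }
            redirectCount++;
            if (redirectCount > maxRedirects) {
                return null; // redirect limit exceeded
            }
            if (updateUrl) {
                url = target; // advance to the redirect target
            }
            // Without the assignment above, the next iteration requests
            // the original URL again and receives the same redirect.
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch("http://example.com/old", 3, false));
        System.out.println(fetch("http://example.com/old", 3, true));
    }
}
```

With the flag off, every iteration requests the same URL, so the outcome depends only on the redirect limit and never on where the server points. With the flag on, the loop ends as soon as the target serves content.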