[ https://issues.apache.org/jira/browse/NUTCH-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Kubes updated NUTCH-613:
-------------------------------

    Attachment: NUTCH-613-1-20080219.patch

This patch checks the hit details for an orig field and uses that as the url field if it exists.  This allows the system to correctly find the summary and cached contents.  I don't know if this solves the entire problem of redirects and how they are stored but it does solve the symptom of summaries not showing up and cached pages erroring.

> Empty Summaries and Cached Pages
> --------------------------------
>
>                 Key: NUTCH-613
>                 URL: https://issues.apache.org/jira/browse/NUTCH-613
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher, searcher, web gui
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>             Fix For: 0.9.0, 1.0.0
>
>         Attachments: NUTCH-613-1-20080219.patch
>
>
> There is a bug where some search results do not have summaries and viewing
> their cached pages causes a NullPointer.  This bug is due to redirects
> getting stored under the new url and the getURL method of FetchedSegments
> getting the wrong (old) url which is stored in crawldb but has no content or
> parse objects.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
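
For readers without the attachment at hand, the lookup order described in the patch comment above (use the orig field of the hit details if it exists, otherwise the url field) can be sketched roughly as follows. This is not the patch itself: the class OrigUrlSketch, the method resolveUrl, and the plain Map standing in for the hit details are all hypothetical, and the example URLs are placeholders. Only the field names "orig" and "url" come from the comment.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the code from NUTCH-613-1-20080219.patch.
public class OrigUrlSketch {

    // Hypothetical helper: a Map stands in for the stored fields of a hit.
    // Returns the "orig" field when it is present, otherwise the "url" field.
    static String resolveUrl(Map<String, String> hitFields) {
        String orig = hitFields.get("orig");
        if (orig != null) {
            return orig;
        }
        return hitFields.get("url");
    }

    public static void main(String[] args) {
        // A hit that carries only a "url" field.
        Map<String, String> plainHit = new HashMap<>();
        plainHit.put("url", "http://example.com/a");

        // A hit that carries both fields (placeholder values).
        Map<String, String> hitWithOrig = new HashMap<>();
        hitWithOrig.put("url", "http://example.com/a");
        hitWithOrig.put("orig", "http://example.com/b");

        System.out.println(resolveUrl(plainHit));
        System.out.println(resolveUrl(hitWithOrig));
    }
}
```

The resolved value would then be the key used to look up the summary and cached content. As the comment notes, this addresses the symptom and may not cover every way redirects are stored.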