[ 
https://issues.apache.org/jira/browse/NUTCH-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Kubes updated NUTCH-613:
-------------------------------

    Attachment: NUTCH-613-1-20080219.patch

This patch checks the hit details for an orig field and, if it exists, uses 
it as the url field.  This allows the system to find the summary and cached 
content correctly.  I don't know whether this solves the entire problem of 
redirects and how they are stored, but it does fix the symptoms: summaries 
not showing up and cached pages erroring.
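
As a rough illustration of the fallback described above (not the patch 
itself), here is a minimal sketch.  It assumes the hit details can be viewed 
as a simple field-name-to-value map; the class OrigUrlFallback, the method 
resolveUrl, and the example URLs are hypothetical and are not part of Nutch.

```java
import java.util.HashMap;
import java.util.Map;

public class OrigUrlFallback {

    // Hypothetical stand-in for the hit details: field name -> field value.
    // Returns the "orig" field when it is present and non-empty,
    // otherwise falls back to the regular "url" field.
    static String resolveUrl(Map<String, String> details) {
        String orig = details.get("orig");
        if (orig != null && !orig.isEmpty()) {
            return orig;
        }
        return details.get("url");
    }

    public static void main(String[] args) {
        // A hit that carries only a url field.
        Map<String, String> plain = new HashMap<>();
        plain.put("url", "http://example.com/a");
        System.out.println(resolveUrl(plain));

        // A hit that also carries an orig field, which takes precedence.
        Map<String, String> withOrig = new HashMap<>();
        withOrig.put("url", "http://example.com/a");
        withOrig.put("orig", "http://example.com/b");
        System.out.println(resolveUrl(withOrig));
    }
}
```

The resolved value would then be the key used to look up the summary and 
cached content, so hits without an orig field behave exactly as before.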

> Empty Summaries and Cached Pages
> --------------------------------
>
>                 Key: NUTCH-613
>                 URL: https://issues.apache.org/jira/browse/NUTCH-613
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher, searcher, web gui
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>             Fix For: 0.9.0, 1.0.0
>
>         Attachments: NUTCH-613-1-20080219.patch
>
>
> There is a bug where some search results do not have summaries and viewing 
> their cached pages causes a NullPointerException.  The cause is that 
> redirects get stored under the new url, while the getURL method of 
> FetchedSegments returns the wrong (old) url, which is stored in crawldb but 
> has no content or parse objects.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
