Markus Jelsma created NUTCH-1932:
------------------------------------

             Summary: Automatically remove orphaned pages
                 Key: NUTCH-1932
                 URL: https://issues.apache.org/jira/browse/NUTCH-1932
             Project: Nutch
          Issue Type: New Feature
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
            Priority: Minor
             Fix For: 1.11

Nutch should be able to automatically remove orphaned pages, such as old 404s, and stop revisiting them. This requires NUTCH-1913. An inlink count of 1 is enough.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)