Indexer to use webgraph inlinks
-------------------------------

                 Key: NUTCH-1181
                 URL: https://issues.apache.org/jira/browse/NUTCH-1181
             Project: Nutch
          Issue Type: New Feature
          Components: indexer
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
             Fix For: 1.5

Indexers currently rely on the LinkDB for anchor indexing, while the WebGraph provides the same data as an inverted link DB. An inlinkDB created by the WebGraph program with non-zero LinkRank scores on the nodes also provides an improved set, ordered by popularity.

This issue must:
- let IndexerMapReduce understand the new format;
- allow for indexing only popular anchors.

The goal is to deprecate all code associated with invertlinks and ultimately remove it from the codebase.