[ https://issues.apache.org/jira/browse/NUTCH-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dennis Kubes updated NUTCH-614:
-------------------------------

    Attachment: NUTCH-614-1-20080219.patch

Orders inlinks by the parent page's OPIC score.

> Order Inlinks by OPIC score of parent page
> ------------------------------------------
>
>                 Key: NUTCH-614
>                 URL: https://issues.apache.org/jira/browse/NUTCH-614
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>             Fix For: 0.9.0, 1.0.0
>
>         Attachments: NUTCH-614-1-20080219.patch
>
>
> Currently, when inlinks are saved, only a configurable maximum number of
> them is kept, and very little logic goes into deciding which ones. This
> patch uses the OPIC score of the page containing each link to assign a
> score to that inlink. Inlinks are then sorted by score in descending order
> and the best inlinks are saved first. The rationale is that pages with
> higher OPIC scores should point to better links.
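
For readers who have not opened the patch, the selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in NUTCH-614-1-20080219.patch: the class InlinkRanker, the ScoredInlink type, and the selectBest method are hypothetical names, and the parent page's OPIC score is assumed to be already available for each inlink.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; these are NOT Nutch's own Inlink/Inlinks classes.
class InlinkRanker {

    /** Minimal stand-in for an inlink: source URL, anchor text, and a score. */
    static final class ScoredInlink {
        final String fromUrl;
        final String anchor;
        // Assumption: the OPIC score of the page that contains the link.
        final float parentScore;

        ScoredInlink(String fromUrl, String anchor, float parentScore) {
            this.fromUrl = fromUrl;
            this.anchor = anchor;
            this.parentScore = parentScore;
        }
    }

    /** Returns at most maxInlinks entries, highest parent score first. */
    static List<ScoredInlink> selectBest(List<ScoredInlink> inlinks, int maxInlinks) {
        // Copy so the caller's list is left untouched.
        List<ScoredInlink> sorted = new ArrayList<>(inlinks);
        // Descending sort by parent score. List.sort is stable, so inlinks
        // with equal scores keep their original relative order.
        sorted.sort(Comparator
                .comparingDouble((ScoredInlink in) -> in.parentScore)
                .reversed());
        // Keep only the best entries, up to the configured maximum.
        int keep = Math.max(0, Math.min(maxInlinks, sorted.size()));
        return new ArrayList<>(sorted.subList(0, keep));
    }

    public static void main(String[] args) {
        List<ScoredInlink> inlinks = new ArrayList<>();
        inlinks.add(new ScoredInlink("http://a.example/", "low", 0.2f));
        inlinks.add(new ScoredInlink("http://b.example/", "high", 1.5f));
        inlinks.add(new ScoredInlink("http://c.example/", "mid", 0.7f));

        for (ScoredInlink in : selectBest(inlinks, 2)) {
            System.out.println(in.parentScore + " " + in.fromUrl);
        }
    }
}
```

Sorting a copy and truncating keeps the sketch short. With a very large number of inlinks per page, a bounded priority queue would avoid sorting entries that are going to be discarded anyway.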