Nguyen Manh Tien created NUTCH-1672:
---------------------------------------

             Summary: Inlinks are added twice in DbUpdateReducer
                 Key: NUTCH-1672
                 URL: https://issues.apache.org/jira/browse/NUTCH-1672
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 2.2.1
            Reporter: Nguyen Manh Tien
            Priority: Minor
         Attachments: NUTCH-1672.patch

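The first for loop is redundant. Both loops below iterate over inlinkedScoreData and make the same page.putToInlinks call, so every inlink is put twice: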

for (ScoreDatum inlink : inlinkedScoreData) {
  page.putToInlinks(new Utf8(inlink.getUrl()), new Utf8(inlink.getAnchor()));
}
...
for (ScoreDatum inlink : inlinkedScoreData) {
  int inlinkDist = inlink.getDistance();
  if (inlinkDist < smallestDist) {
    smallestDist = inlinkDist;
  }
  page.putToInlinks(new Utf8(inlink.getUrl()), new Utf8(inlink.getAnchor()));
}
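
To illustrate why dropping the first loop should be safe, here is a minimal standalone sketch. It is not the Nutch code and not the attached patch. ScoreDatum and Result are hypothetical stand-ins, a plain Map replaces the page's inlinks field, and smallestDist is assumed to start at Integer.MAX_VALUE. It also assumes that nothing in the elided code between the two loops reads the inlinks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InlinkMergeSketch {

    /** Hypothetical stand-in for ScoreDatum, holding only the fields the loops read. */
    public static final class ScoreDatum {
        private final String url;
        private final String anchor;
        private final int distance;

        public ScoreDatum(String url, String anchor, int distance) {
            this.url = url;
            this.anchor = anchor;
            this.distance = distance;
        }

        public String getUrl() { return url; }
        public String getAnchor() { return anchor; }
        public int getDistance() { return distance; }
    }

    /** Hypothetical stand-in for the page state, plus a counter of put calls. */
    public static final class Result {
        public final Map<String, String> inlinks = new LinkedHashMap<>();
        public int smallestDist = Integer.MAX_VALUE; // assumed initial value
        public int putCalls = 0;

        void putToInlinks(String url, String anchor) {
            inlinks.put(url, anchor); // a repeated put only overwrites the same key
            putCalls++;
        }
    }

    /** Mirrors the reported shape: one loop that only puts, then one that also tracks distance. */
    public static Result twoLoops(List<ScoreDatum> inlinkedScoreData) {
        Result page = new Result();
        for (ScoreDatum inlink : inlinkedScoreData) {
            page.putToInlinks(inlink.getUrl(), inlink.getAnchor());
        }
        for (ScoreDatum inlink : inlinkedScoreData) {
            int inlinkDist = inlink.getDistance();
            if (inlinkDist < page.smallestDist) {
                page.smallestDist = inlinkDist;
            }
            page.putToInlinks(inlink.getUrl(), inlink.getAnchor());
        }
        return page;
    }

    /** Same work with the first loop removed. */
    public static Result oneLoop(List<ScoreDatum> inlinkedScoreData) {
        Result page = new Result();
        for (ScoreDatum inlink : inlinkedScoreData) {
            int inlinkDist = inlink.getDistance();
            if (inlinkDist < page.smallestDist) {
                page.smallestDist = inlinkDist;
            }
            page.putToInlinks(inlink.getUrl(), inlink.getAnchor());
        }
        return page;
    }

    public static void main(String[] args) {
        List<ScoreDatum> data = new ArrayList<>();
        data.add(new ScoreDatum("http://a.example/", "A", 3));
        data.add(new ScoreDatum("http://b.example/", "B", 1));

        Result two = twoLoops(data);
        Result one = oneLoop(data);
        System.out.println("same inlinks:      " + two.inlinks.equals(one.inlinks));
        System.out.println("same smallestDist: " + (two.smallestDist == one.smallestDist));
        System.out.println("put calls:         " + two.putCalls + " vs " + one.putCalls);
    }
}
```

Under these assumptions both versions end with the same inlinks and the same smallest distance; the two-loop version only makes twice as many put calls.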



--
This message was sent by Atlassian JIRA
(v6.1#6144)
