Nguyen Manh Tien created NUTCH-1672:
---------------------------------------

             Summary: Inlinks are added twice in DbUpdateReducer
                 Key: NUTCH-1672
                 URL: https://issues.apache.org/jira/browse/NUTCH-1672
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 2.2.1
            Reporter: Nguyen Manh Tien
            Priority: Minor
         Attachments: NUTCH-1672.patch


The first for loop is redundant: the second loop already calls putToInlinks for every inlink, so each inlink is added twice.

    for (ScoreDatum inlink : inlinkedScoreData) {
      page.putToInlinks(new Utf8(inlink.getUrl()), new Utf8(inlink.getAnchor()));
    }
    ...
    for (ScoreDatum inlink : inlinkedScoreData) {
      int inlinkDist = inlink.getDistance();
      if (inlinkDist < smallestDist) {
        smallestDist=inlinkDist;
      }
      page.putToInlinks(new Utf8(inlink.getUrl()), new Utf8(inlink.getAnchor()));
    }
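
For illustration only, a minimal sketch of what the reducer logic could look like once the first loop is dropped. It is not the attached patch. The Inlink and Page classes are hypothetical stand-ins for Nutch's ScoreDatum and WebPage, so the sketch compiles without Nutch on the classpath. Initialising smallestDist to Integer.MAX_VALUE is also an assumption, since the original initialisation sits in the elided part of the snippet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; these are NOT the real Nutch classes.
final class InlinkSketch {

    /** Stand-in for ScoreDatum: inlink URL, anchor text, and link distance. */
    static final class Inlink {
        final String url;
        final String anchor;
        final int distance;

        Inlink(String url, String anchor, int distance) {
            this.url = url;
            this.anchor = anchor;
            this.distance = distance;
        }
    }

    /** Stand-in for WebPage that also counts calls to putToInlinks. */
    static final class Page {
        final Map<String, String> inlinks = new LinkedHashMap<>();
        int putCalls = 0;

        void putToInlinks(String url, String anchor) {
            inlinks.put(url, anchor);
            putCalls++;
        }
    }

    /**
     * Single pass over the inlinks: stores each one exactly once and
     * returns the smallest distance seen.
     * Assumption: Integer.MAX_VALUE is used as the starting value.
     */
    static int updateInlinks(Page page, List<Inlink> inlinkedScoreData) {
        int smallestDist = Integer.MAX_VALUE;
        for (Inlink inlink : inlinkedScoreData) {
            if (inlink.distance < smallestDist) {
                smallestDist = inlink.distance;
            }
            page.putToInlinks(inlink.url, inlink.anchor);
        }
        return smallestDist;
    }

    public static void main(String[] args) {
        List<Inlink> data = new ArrayList<>();
        data.add(new Inlink("http://a.example/", "A", 3));
        data.add(new Inlink("http://b.example/", "B", 1));

        Page page = new Page();
        int smallest = updateInlinks(page, data);
        System.out.println(page.putCalls + " puts, " + page.inlinks.size()
            + " inlinks, smallest distance " + smallest);
    }
}
```

With one loop, putToInlinks runs once per inlink. With both loops from the report it would run twice per inlink; if the inlinks are held in a map keyed by URL (as in this stand-in), the stored result would be the same and only the work would be duplicated.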