[jira] [Commented] (NUTCH-1671) indexchecker to add digest field
[ https://issues.apache.org/jira/browse/NUTCH-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938662#comment-13938662 ]

Hudson commented on NUTCH-1671:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #2568 (See [https://builds.apache.org/job/Nutch-trunk/2568/])
NUTCH-1671 indexchecker to add digest field (snagel: http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1578616)
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java

> indexchecker to add digest field
> --------------------------------
>
>                 Key: NUTCH-1671
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1671
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.7, 2.2.1
>            Reporter: Sebastian Nagel
>            Priority: Trivial
>             Fix For: 2.3, 1.9
>
>         Attachments: NUTCH-1671-2x.patch, NUTCH-1671-trunk.patch
>
>
> IndexingFiltersChecker does not add field "digest" as done by
> IndexerMapReduce. Digest/signature could be also used by indexing filters
> which then may fail.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (NUTCH-1671) indexchecker to add digest field
[ https://issues.apache.org/jira/browse/NUTCH-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938653#comment-13938653 ]

Hudson commented on NUTCH-1671:
-------------------------------

SUCCESS: Integrated in Nutch-nutchgora #957 (See [https://builds.apache.org/job/Nutch-nutchgora/957/])
NUTCH-1671 indexchecker to add digest field (snagel: http://svn.apache.org/viewvc/nutch/branches/2.x/?view=rev&rev=1578620)
* /nutch/branches/2.x/CHANGES.txt
* /nutch/branches/2.x/src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
[jira] [Commented] (NUTCH-1671) indexchecker to add digest field
[ https://issues.apache.org/jira/browse/NUTCH-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908790#comment-13908790 ]

Sebastian Nagel commented on NUTCH-1671:

Hi [~amuseme],

you mean a check for null as done in ParseChecker (trunk and 2.x)? Looks plausible at first glance. However, ParseUtil.parse() catches all possible errors and then returns an "empty" parse result. So, I'll commit the patches as is. If an NPE is observed, we can fix it later.

> indexchecker to add digest field
> --------------------------------
>
>                 Key: NUTCH-1671
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1671
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.7, 2.2.1
>            Reporter: Sebastian Nagel
>            Priority: Trivial
>             Fix For: 2.3, 1.8
>
>         Attachments: NUTCH-1671-2x.patch, NUTCH-1671-trunk.patch
>
>
> IndexingFiltersChecker does not add field "digest" as done by
> IndexerMapReduce. Digest/signature could be also used by indexing filters
> which then may fail.
[jira] [Commented] (NUTCH-1671) indexchecker to add digest field
[ https://issues.apache.org/jira/browse/NUTCH-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830530#comment-13830530 ]

lufeng commented on NUTCH-1671:
-------------------------------

Yes, this field can be used by indexing filters. +1

Another question: should we add a check after parsing the content, like this?
{code:java}
ParseResult parseResult = new ParseUtil(conf).parse(content);

if (parseResult == null) {
  LOG.error("Problem with parse - check log");
  return (-1);
}
{code}
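
For readers without the attached patches at hand, the idea behind the issue (a checker tool should put a "digest" field on the document so that indexing filters relying on it do not fail) can be sketched roughly as follows. This is not the committed Nutch code: the class DigestFieldSketch, the helpers md5Hex and buildDoc, and the plain Map standing in for the index document are all hypothetical, and MD5 is only an assumed stand-in for whatever signature implementation is actually configured.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, self-contained illustration; it does not use any Nutch classes.
class DigestFieldSketch {

  /** Returns the lowercase hex encoding of the MD5 hash of the content bytes. */
  static String md5Hex(byte[] content) {
    try {
      byte[] hash = MessageDigest.getInstance("MD5").digest(content);
      StringBuilder sb = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required on every Java platform, so this should not happen.
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  /**
   * Builds a stand-in "index document" (a plain map here) and adds the
   * "digest" field up front, so that later filter steps can read it.
   */
  static Map<String, String> buildDoc(String url, byte[] content) {
    Map<String, String> doc = new LinkedHashMap<>();
    doc.put("url", url);
    doc.put("digest", md5Hex(content));
    return doc;
  }

  public static void main(String[] args) {
    byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
    Map<String, String> doc = buildDoc("http://example.com/", content);
    // A filter that expects the field can now read it without a null value.
    System.out.println("digest = " + doc.get("digest"));
  }
}
```

The point of the sketch is only the ordering: the digest is computed and attached before any filter runs, which mirrors what the issue description says IndexerMapReduce already does.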