[ https://issues.apache.org/jira/browse/NUTCH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma updated NUTCH-897:
--------------------------------

    Attachment: NUTCH-897.patch

Attached a tested fix; it is confirmed to work and does not break existing configurations. The patch works for 1.3 and trunk.

> Subcollection requires blacklist element
> ----------------------------------------
>
>                 Key: NUTCH-897
>                 URL: https://issues.apache.org/jira/browse/NUTCH-897
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.2, 1.3, 2.0
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Trivial
>             Fix For: 1.3, 2.0
>
>         Attachments: NUTCH-897.patch
>
>
> This is a very minor issue in Subcollection.java. It throws an error if
> the (empty) blacklist element is omitted. I think it should either
> tolerate an omitted blacklist element instead of failing, or throw a
> clear error message stating that the blacklist element is required. The
> following exception is thrown if the blacklist element is omitted in a
> subcollection block:
>
> 2010-09-06 13:32:30,438 INFO collection.CollectionManager - Instantiating CollectionManager
> 2010-09-06 13:32:30,438 INFO collection.CollectionManager - initializing CollectionManager
> 2010-09-06 13:32:30,451 INFO collection.CollectionManager - file has1 elements
> 2010-09-06 13:32:30,456 WARN collection.CollectionManager - Error occured:java.lang.NullPointerException
> 2010-09-06 13:32:30,469 WARN collection.CollectionManager - java.lang.NullPointerException
> 2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.Subcollection.initialize(Subcollection.java:173)
> 2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.parse(CollectionManager.java:98)
> 2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.init(CollectionManager.java:75)
> 2010-09-06 13:32:30,470 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.<init>(CollectionManager.java:56)
> 2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.collection.CollectionManager.getCollectionManager(CollectionManager.java:115)
> 2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.subcollection.SubcollectionIndexingFilter.addSubCollectionField(SubcollectionIndexingFilter.java:65)
> 2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.subcollection.SubcollectionIndexingFilter.filter(SubcollectionIndexingFilter.java:71)
> 2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:109)
> 2010-09-06 13:32:30,471 WARN collection.CollectionManager - at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:134)
> 2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
> 2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
> 2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
> 2010-09-06 13:32:30,472 WARN collection.CollectionManager - at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
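
For readers following NUTCH-897: the NullPointerException reported above is the kind of failure that typically occurs when code dereferences the first match of an XML element lookup without checking that a match exists. The sketch below is not the attached NUTCH-897.patch and is not taken from Subcollection.java; it is only a minimal, standalone illustration of treating an element as optional. The class name OptionalBlacklistSketch, the helper optionalChildText, and the sample XML (including the whitelist element) are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; not the code from NUTCH-897.patch.
public class OptionalBlacklistSketch {

    /**
     * Returns the trimmed text of the first descendant element named {@code tag},
     * or an empty string when no such element exists. Checking the list length
     * first avoids dereferencing a null node.
     */
    static String optionalChildText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            return ""; // element omitted: treat it as empty instead of failing
        }
        String text = nodes.item(0).getTextContent();
        return text == null ? "" : text.trim();
    }

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample block with no blacklist element at all.
        Element sub = parse(
                "<subcollection><whitelist>example.org</whitelist></subcollection>");
        System.out.println("whitelist=" + optionalChildText(sub, "whitelist"));
        System.out.println("blacklist=[" + optionalChildText(sub, "blacklist") + "]");
    }
}
```

The alternative suggested in the issue description, rejecting the configuration with a clear message, would replace the early return with an exception that names the missing element.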