[ https://issues.apache.org/jira/browse/NUTCH-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche resolved NUTCH-1100.
----------------------------------

    Resolution: Fixed

Committed revision 1540758. We'll probably move to a more generic approach in NUTCH-656 but in the meantime this is a good patch to have. Thanks!

> SolrDedup broken
> ----------------
>
>                 Key: NUTCH-1100
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1100
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.4
>            Reporter: Markus Jelsma
>             Fix For: 1.9
>
>         Attachments: NUTCH-1100-1.6-1.patch
>
>
> Some Solr indices are unable to be deduped from Nutch. For unknown reasons
> Nutch will throw the exception below. There are no peculiarities to be found
> in the Solr logs, the queries are normal and seem to succeed.
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.io.Text.encode(Text.java:388)
>         at org.apache.hadoop.io.Text.set(Text.java:178)
>         at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:272)
>         at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:243)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)