    [ https://issues.apache.org/jira/browse/NUTCH-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584550#comment-13584550 ]

Sebastian Nagel commented on NUTCH-1535:
----------------------------------------

Presumably this is caused by incompatible field mappings defined in solrindex-mapping.xml (Nutch) and schema.xml (Solr). Please check these files and carefully follow http://wiki.apache.org/nutch/NutchTutorial

If the problem persists, can you attach:
- the Solr log files
- your configuration files

The nutch-user mailing list is also a good starting point for getting help.
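To illustrate the kind of mismatch (a minimal sketch only; the field names and types below are examples, not the complete files shipped with Nutch and Solr): every dest field that Nutch writes according to solrindex-mapping.xml must be declared in Solr's schema.xml. A document carrying a field unknown to the schema is typically rejected by Solr with a 400 "Bad Request" like the one in the log below.

conf/solrindex-mapping.xml (Nutch side):

  <mapping>
    <fields>
      <!-- every dest field named here must exist in Solr's schema.xml -->
      <field dest="title" source="title"/>
      <field dest="content" source="content"/>
      <field dest="digest" source="digest"/>
      <field dest="boost" source="boost"/>
      <field dest="tstamp" source="tstamp"/>
    </fields>
    <!-- the unique key Solr uses for updates; SolrDeleteDuplicates reads it back, too -->
    <uniqueKey>id</uniqueKey>
  </mapping>

Matching declarations in Solr's schema.xml (the types are illustrative and must be defined in the same schema; the simplest route is to copy the schema.xml shipped in Nutch's conf/ directory into Solr, as the tutorial describes):

  <fields>
    <field name="id" type="string" stored="true" indexed="true"/>
    <field name="title" type="text" stored="true" indexed="true"/>
    <field name="content" type="text" stored="false" indexed="true"/>
    <field name="digest" type="string" stored="true" indexed="true"/>
    <field name="boost" type="float" stored="true" indexed="false"/>
    <field name="tstamp" type="date" stored="true" indexed="false"/>
  </fields>
  <uniqueKey>id</uniqueKey>

Note that SolrDeleteDuplicates reads id, boost, tstamp and digest back from the index, so a digest field that is missing or not stored is a plausible cause of the NullPointerException in the second stack trace below.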
Thanks,
Sebastian

> Crawl crashes with java.io.IOException
> --------------------------------------
>
>                 Key: NUTCH-1535
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1535
>             Project: Nutch
>          Issue Type: Bug
>         Environment: Ubuntu 12.04
>            Reporter: Adam Pioch
>
> I started crawling a website from the command line with Nutch 1.6, and it crashed after indexing started. This is the error I get:
>
> 2013-02-20 19:34:05,335 INFO solr.SolrWriter - Indexing 5 documents
> 2013-02-20 19:34:05,685 WARN mapred.LocalJobRunner - job_local_0019
> org.apache.solr.common.SolrException: Bad Request
> Bad Request
> request: http://localhost:8983/solr/update?wt=javabin&version=2
>   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
>   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>   at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>   at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:142)
>   at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
>   at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:466)
>   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
> 2013-02-20 19:34:05,956 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
> 2013-02-20 19:34:05,959 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2013-02-20 19:34:05
> 2013-02-20 19:34:05,960 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
> 2013-02-20 19:34:06,180 WARN mapred.FileOutputCommitter - Output path is null in cleanup
> 2013-02-20 19:34:06,180 WARN mapred.LocalJobRunner - job_local_0020
> java.lang.NullPointerException
>   at org.apache.hadoop.io.Text.encode(Text.java:388)
>   at org.apache.hadoop.io.Text.set(Text.java:178)
>   at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
>   at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>
> I'm new to Nutch and tried to crawl some web pages and index them into Solr. I'm asking for an explanation that is easy to understand. I'll be thankful for any help.