    [ https://issues.apache.org/jira/browse/NUTCH-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584550#comment-13584550 ]

Sebastian Nagel commented on NUTCH-1535:
----------------------------------------

Presumably this is caused by incompatible field mappings defined in solrindex-mapping.xml (Nutch) and schema.xml (Solr). Please check these files and carefully follow http://wiki.apache.org/nutch/NutchTutorial

If the problem persists, can you attach:
- the Solr log files
- your configuration files

The nutch-user mailing list is also a good starting point for getting help.
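To illustrate the kind of mismatch (a minimal sketch only; the field names and types below are examples, not the complete files shipped with Nutch and Solr): every dest field that Nutch writes according to solrindex-mapping.xml must be declared in Solr's schema.xml. A document carrying a field unknown to the schema is typically rejected by Solr with a 400 "Bad Request" like the one in the log below.

conf/solrindex-mapping.xml (Nutch side):

  <mapping>
    <fields>
      <!-- every dest field named here must exist in Solr's schema.xml -->
      <field dest="title" source="title"/>
      <field dest="content" source="content"/>
      <field dest="digest" source="digest"/>
      <field dest="boost" source="boost"/>
      <field dest="tstamp" source="tstamp"/>
    </fields>
    <!-- the unique key Solr uses for updates; SolrDeleteDuplicates reads it back, too -->
    <uniqueKey>id</uniqueKey>
  </mapping>

Matching declarations in Solr's schema.xml (the types are illustrative and must be defined in the same schema; the simplest route is to copy the schema.xml shipped in Nutch's conf/ directory into Solr, as the tutorial describes):

  <fields>
    <field name="id" type="string" stored="true" indexed="true"/>
    <field name="title" type="text" stored="true" indexed="true"/>
    <field name="content" type="text" stored="false" indexed="true"/>
    <field name="digest" type="string" stored="true" indexed="true"/>
    <field name="boost" type="float" stored="true" indexed="false"/>
    <field name="tstamp" type="date" stored="true" indexed="false"/>
  </fields>
  <uniqueKey>id</uniqueKey>

Note that SolrDeleteDuplicates reads id, boost, tstamp and digest back from the index, so a digest field that is missing or not stored is a plausible cause of the NullPointerException in the second stack trace below.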
Thanks,
Sebastian

> Crawl crashes with java.io.IOException
> --------------------------------------
>
>                 Key: NUTCH-1535
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1535
>             Project: Nutch
>          Issue Type: Bug
>         Environment: Ubuntu 12.04
>            Reporter: Adam Pioch
>
> I started crawling a website from the command line with Nutch 1.6, and it crashed after indexing started. This is the error I get:
>
> 2013-02-20 19:34:05,335 INFO solr.SolrWriter - Indexing 5 documents
> 2013-02-20 19:34:05,685 WARN mapred.LocalJobRunner - job_local_0019
> org.apache.solr.common.SolrException: Bad Request
> Bad Request
> request: http://localhost:8983/solr/update?wt=javabin&version=2
>   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
>   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>   at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>   at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:142)
>   at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
>   at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:466)
>   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
> 2013-02-20 19:34:05,956 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
> 2013-02-20 19:34:05,959 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2013-02-20 19:34:05
> 2013-02-20 19:34:05,960 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
> 2013-02-20 19:34:06,180 WARN mapred.FileOutputCommitter - Output path is null in cleanup
> 2013-02-20 19:34:06,180 WARN mapred.LocalJobRunner - job_local_0020
> java.lang.NullPointerException
>   at org.apache.hadoop.io.Text.encode(Text.java:388)
>   at org.apache.hadoop.io.Text.set(Text.java:178)
>   at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
>   at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>
> I'm new to Nutch and tried to crawl some web pages and index them into Solr. I'm asking for an explanation that is easy to understand. I'll be thankful for any help.