Greetings,
Here is the hadoop.log output from my crash - any ideas?
2007-07-31 19:06:50,702 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2007-07-31 19:06:50,799 INFO indexer.Indexer - Optimizing index.
2007-07-31 19:06:51,497 INFO indexer.Indexer - Indexer: done
2007-07-31 19:06:51,498 INFO indexer.DeleteDuplicates - Dedup: starting
2007-07-31 19:06:51,510 INFO indexer.DeleteDuplicates - Dedup: adding indexes in: /var/webindex/data/indexes
2007-07-31 19:06:51,733 WARN mapred.LocalJobRunner - job_2xsg2o
java.lang.ArrayIndexOutOfBoundsException: -1
at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
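
Could one of the part indexes under /var/webindex/data/indexes be empty or unreadable? Would it be worth sanity-checking them with something like the following? This is just a rough sketch against the Lucene 2.x API that Nutch bundles; the directory layout is a guess based on my paths above.

import java.io.File;
import org.apache.lucene.index.IndexReader;

// Rough sketch: open each part index that Dedup reads and print its
// doc counts, to spot a part that is empty or fails to open.
public class CheckParts {
  public static void main(String[] args) throws Exception {
    File indexesDir = new File("/var/webindex/data/indexes"); // my index dir
    File[] parts = indexesDir.listFiles();
    if (parts == null) throw new Exception("not a directory: " + indexesDir);
    for (File part : parts) {
      if (!part.isDirectory()) continue; // each subdirectory is a Lucene index
      IndexReader reader = IndexReader.open(part);
      System.out.println(part.getName() + ": maxDoc=" + reader.maxDoc()
          + " numDocs=" + reader.numDocs());
      reader.close();
    }
  }
}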
On Jul 30, 2007, at 2:02 PM, DES wrote:
> Look in logs/hadoop.log for the actual reason for this exception. The
> console message is not really helpful.
>
> On 7/30/07, Micah Vivion <[EMAIL PROTECTED]> wrote:
>>>> Exception in thread "main" java.io.IOException: Job failed!
>>>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>>>> at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
>>>> at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)