Thanks for the reminder, as I believe this is an actual issue! I've got some 
indices from Nutch that cannot be deduplicated and fail without giving a proper 
clue.


I'll reproduce and report back on it. I know it's not a problem of missing 
STORED markings, since that one index has all fields used by dedup marked as 
STORED.

Strange...
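For anyone hitting the same wall: the underlying error usually lands in hadoop.log rather than on the console. A quick way to pull it out (a sketch assuming Nutch's default `logs/hadoop.log` location; adjust the path if your setup differs):

```shell
# Look for dedup-related log lines near the failure.
# logs/hadoop.log is Nutch's default log file; the path is an assumption here.
grep -n 'Dedup\|DeleteDuplicates' logs/hadoop.log | tail -n 20

# Print exception lines with a few lines of following context,
# which typically captures the full stack trace.
grep -n -A 10 'Exception' logs/hadoop.log | tail -n 40
```

The `-A` context flag is available in GNU grep; BSD grep supports it as well.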

> Hi Zhao,
> 
> Do you have any more verbose log info from hadoop.log? I have never worked
> with Nutch 0.9, but could you at least indicate whether you get something
> like
> 
> LOG: info Dedup: starting ... blah blah blah
> 
> Taking this to a larger context, I am not particularly happy with the
> verbosity of logging when there are errors with indexing commands. When
> we experience an error during any of the index-related commands, all we
> get back is "Job failed!". It would be nice to get back a reason for the
> failure that is clearer than a stack trace.
> 
> Finally, from a personal point of view, I would highly recommend that you
> upgrade to a newer version of Nutch (1.3) if you are using this in
> production. There are significant improvements in functionality.
> 
> Lewis
> 
> On Mon, Aug 29, 2011 at 3:24 AM, zhao <[email protected]> wrote:
> > Dear all,
> > I am using Nutch 0.9 and have a question. A detailed description of the
> > problem is:
> > 
> >       Exception in thread "main" java.io.IOException: Job failed!
> >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
> >         at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
> >         at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> > 
> > Thank you for your help
> > 
> >  zhao
> > 
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/a-question-about-job-failed-tp3291669p3291669.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.
