nutch crashes for unknown reason

2011-07-12 Thread Paul van Hoven
I'm just getting started with Nutch and ran a simple job as described in the Nutch tutorial. After a while I get the following error: CrawlDb update: URL filtering: true CrawlDb update: Merging segment data into db. CrawlDb update: finished at 2011-07-12 12:32:03, elapsed: 00:00:03 LinkDb: starting at

Re: nutch crashes for unknown reason

2011-07-12 Thread Paul van Hoven
Actually, I'm not sure I'm looking at the right log lines. Please explain in more detail what exactly I should look for. Anyway, I found the following line just before the error: Error parsing: http://eu.apachecon.com/js/jquery.akslideshow.js: failed(2,0): Can't retrieve Tika parser for

Re: nutch crashes for unknown reason

2011-07-12 Thread Markus Jelsma
Actually, I'm not sure I'm looking at the right log lines. Please explain in more detail what exactly I should look for. Anyway, I found the following line just before the error: Error parsing: http://eu.apachecon.com/js/jquery.akslideshow.js: failed(2,0): Can't retrieve Tika parser for

Re: nutch crashes for unknown reason

2011-07-12 Thread Paul van Hoven
I'm not sure I understood you correctly. Here is the complete output of my crawl: tom:bin toom$ ./nutch crawl /Users/toom/Downloads/nutch-1.3/crawled -dir /Users/toom/Downloads/nutch-1.3/sites -depth 3 -topN 50 solrUrl is not set, indexing will be skipped... crawl started in:

Re: nutch crashes for unknown reason

2011-07-12 Thread Markus Jelsma
I don't see this segment 20110712114256 being parsed. On Tuesday 12 July 2011 13:38:35 Paul van Hoven wrote: I'm not sure I understood you correctly. Here is the complete output of my crawl: tom:bin toom$ ./nutch crawl /Users/toom/Downloads/nutch-1.3/crawled -dir

Re: nutch crashes for unknown reason

2011-07-12 Thread Paul van Hoven
Okay, and what does that mean? How can I fix the error? 2011/7/12 Markus Jelsma markus.jel...@openindex.io: I don't see this segment 20110712114256 being parsed. On Tuesday 12 July 2011 13:38:35 Paul van Hoven wrote: I'm not sure I understood you correctly. Here is the complete output of

Re: nutch crashes for unknown reason

2011-07-12 Thread lewis john mcgibbney
From the looks of it, you need to parse all segments before attempting to index them. As Markus has pointed out, that specific segment hasn't been parsed. Try parsing it as described at the following link: http://wiki.apache.org/nutch/bin/nutch_parse On Tue, Jul 12, 2011 at 1:50 PM, Paul van Hoven
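
For anyone hitting the same problem, here is a rough sketch of how one might find segments that were fetched but never parsed. It assumes the usual Nutch 1.x segment layout, where a parsed segment contains a parse_data subdirectory. The function name unparsed_segments and the crawl directory path in the usage comment are hypothetical and not taken from the thread.

```shell
#!/bin/sh
# Print every segment under <crawl_dir>/segments that has no parse_data
# subdirectory, i.e. a segment that was apparently never parsed.
# Assumption: the standard Nutch 1.x segment layout (parse_data is written
# by the parse step).
unparsed_segments() {
  crawl_dir=$1
  for seg in "$crawl_dir"/segments/*; do
    # Skip anything that is not a directory (including an unmatched glob).
    [ -d "$seg" ] || continue
    # A segment without parse_data is reported as unparsed.
    [ -d "$seg/parse_data" ] || printf '%s\n' "$seg"
  done
}

# Hypothetical usage, run from the Nutch bin directory:
#   for seg in $(unparsed_segments /path/to/crawl); do
#     ./nutch parse "$seg"
#   done
```

Each path it prints could then be passed to the parse command described on the wiki page linked above before indexing is attempted again.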