Note: I mistakenly set the reply-to address to nutch-user. Feel free to reply to either nutch-dev or nutch-user, as I monitor both of them :-) Also, can anybody tell me how to easily change the reply-to value in Gmail? I struggle with this all the time, especially when replying to multiple mailing lists.
On 1/4/06, Lukas Vlcek <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am trying to use the latest nutch-trunk version but I am facing an
> unexpected "Job failed!" exception. It seems that all crawling work
> has already been done but some threads are hung, which results in an
> exception after some timeout.
>
> I am not sure whether this is a real nutch issue or just my
> misunderstanding of the proper configuration.
>
> The following are the details:
> I am trying to run the nutch-trunk version on one machine (Linux). I
> used the latest svn and produced a fresh installation package using
> "ant tar". Then I modified nutch-site.xml only (see attachment) - I
> believe I didn't change anything special. I also modified
> [fetcher.threads.fetch] and [fetcher.threads.per.host], but this
> didn't seem to help.
>
> Typically, the nutch crawl process seemed to work fine and it crawled
> all documents on my local apache server (both nutch and apache run on
> the same machine), but then it didn't stop and kept waiting for
> something to finish. From then on it just produced lines like
> [060103 231602 16 pages, 0 errors, 0.4 pages/s, 305 kb/s, ]
> in the log, where the latter two numbers (pages/s, kb/s) were
> decreasing as time went by (that is logical).
>
> Then I receive the following exception:
> Sometimes the log even contains a message saying
> "Aborting with "+activeThreads+" hung threads." where activeThreads
> was some number (this number differs based on conf setup).
>
> ...
> (see crawl.log attachment file for whole log)
> 060103 231602 16 pages, 0 errors, 0.4 pages/s, 305 kb/s,
> 060103 231602 16 pages, 0 errors, 0.4 pages/s, 305 kb/s,
> java.lang.NullPointerException
>     at java.lang.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:980)
>     at java.lang.Float.parseFloat(Float.java:222)
>     at org.apache.nutch.parse.ParseOutputFormat$1.write(ParseOutputFormat.java:84)
>     at org.apache.nutch.fetcher.FetcherOutputFormat$1.write(FetcherOutputFormat.java:80)
>     at org.apache.nutch.mapred.ReduceTask$2.collect(ReduceTask.java:247)
>     at org.apache.nutch.mapred.lib.IdentityReducer.reduce(IdentityReducer.java:41)
>     at org.apache.nutch.mapred.ReduceTask.run(ReduceTask.java:260)
>     at org.apache.nutch.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:90)
> 060103 231603 map 100%
> Exception in thread "main" java.io.IOException: Job failed!
>     at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:308)
>     at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:344)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:111)
>
> Does anybody know what is wrong?
>
> Regards,
> Lukas
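
One observation on the quoted trace: the NullPointerException is raised inside Float.parseFloat, which ParseOutputFormat calls while writing output. In plain Java, Float.parseFloat throws NullPointerException when handed a null string, so it looks as if some string value reached that call as null. Which value that was is not visible in the trace. The sketch below only reproduces the JDK behaviour and shows one defensive way to parse such a value; the class ScoreParsing and the method parseScoreOrDefault are hypothetical names made up for illustration and are not part of Nutch.

```java
// Hypothetical illustration, not Nutch code: shows why a null string
// passed to Float.parseFloat ends in a NullPointerException, and one
// possible null-safe alternative.
final class ScoreParsing {

    // Assumed helper: returns the fallback when the raw value is
    // missing (null) or is not a valid float literal.
    static float parseScoreOrDefault(String raw, float fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            return Float.parseFloat(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        try {
            // Same JDK call that appears in the stack trace.
            Float.parseFloat(null);
        } catch (NullPointerException e) {
            System.out.println("Float.parseFloat(null) threw NullPointerException");
        }

        // The defensive variant falls back instead of failing the job.
        System.out.println(parseScoreOrDefault(null, 1.0f));
        System.out.println(parseScoreOrDefault("0.5", 1.0f));
    }
}
```

If this guess is right, checking which configuration or metadata value is missing would be the real fix; a fallback like the one above would only hide the problem.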