Hi,

On 7/3/07, Jason Ma <[EMAIL PROTECTED]> wrote:
> I'm running Nutch on RedHat Linux with Java 1.6.0_01. I have
> successfully crawled and indexed smaller quantities of data in the
> past. However, after I tried to scale up the crawling, Nutch would
> give an exception when indexing (the bottom of the log included
> below). Please let me know if there's more information I should
> provide.
>
> I'd be very grateful for any suggestions or advice you may have.
>
> Thanks in advance,
> Jason Ma
>
> ....
>
> Indexing [http://64.13.133.31/pics/up-VC2GQ0CA9QSHHGHM-s] with
> analyzer [EMAIL PROTECTED]
> (null)
> Indexing [http://64.13.133.31/pics/up-VET16648L9TBU53B-s] with
> analyzer [EMAIL PROTECTED]
> (null)
> Indexing [http://64.13.133.31/pics/up-VHIUOB6N8CVESR52-s] with
> analyzer [EMAIL PROTECTED]
> (null)
> Indexing [http://64.13.133.31/pics/user_promo_mini.png] with analyzer
> [EMAIL PROTECTED] (null)
> Optimizing index.
> merging segments _73 (1 docs) _74 (1 docs) _75 (1 docs) _76 (1 docs)
> _77 (1 docs) _78 (1 docs) _79 (1 docs) _7a (1 docs) _7b (1 docs) _7c
> (1 docs) _7d (1 docs) _7e (1 docs) _7f (1 docs) _7g (1 docs) _7h (1
> docs) _7i (1 docs) _7j (1 docs) _7k (1 docs) _7l (1 docs) _7m (1 docs)
> _7n (1 docs) _7o (1 docs) _7p (1 docs) _7q (1 docs) into _7r (24 docs)
> merging segments _1e (50 docs) _2t (50 docs) _48 (50 docs) _5n (50
> docs) _72 (50 docs) _7r (24 docs) into _7s (274 docs)
> Exception in thread "main" java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:357)
>     at org.apache.nutch.indexer.Indexer.index(Indexer.java:296)
>     at org.apache.nutch.indexer.Indexer.main(Indexer.java:313)
>
This exception is just the job runner (JobClient.runJob in the trace
above) telling us that your job has failed. It doesn't show where the
actual problem is. Check logs/hadoop.log or your tasktracker's log
files, and you should find a more detailed log entry for the failure.

--
Doğacan Güney

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
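
For anyone following the advice above on a large log, here is a minimal Python sketch of one way to pull the likely failure out of logs/hadoop.log. It is not part of Nutch or Hadoop. The function names (find_problems, scan_file) are hypothetical, and the patterns assume log4j-style ERROR/FATAL levels and ordinary Java stack traces, which may not match every setup.

```python
import re

# Assumed markers: log4j-style levels, or a Java class name ending in
# Exception/Error (e.g. java.io.IOException).
PROBLEM = re.compile(r"\b(ERROR|FATAL)\b|\b[\w.]+(Exception|Error)\b")
# Continuation lines of a Java stack trace.
TRACE = re.compile(r"^\s+at\s|^Caused by:")


def find_problems(lines):
    """Return (line_number, text) pairs for problem lines and the
    stack-trace lines that directly follow them."""
    hits = []
    in_trace = False
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if PROBLEM.search(text):
            hits.append((number, text))
            in_trace = True
        elif in_trace and TRACE.search(text):
            hits.append((number, text))
        else:
            in_trace = False
    return hits


def scan_file(path):
    """Print every problem line found in the log file at `path`."""
    with open(path, errors="replace") as log:
        for number, text in find_problems(log):
            print(f"{number}: {text}")
```

Calling scan_file("logs/hadoop.log") from the Nutch directory (the path mentioned in the reply) prints each matching line with its line number, so the first entries are usually the ones worth reading in full.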
