Hi,

On 7/3/07, Jason Ma <[EMAIL PROTECTED]> wrote:
> I'm running Nutch on RedHat Linux with Java 1.6.0_01.  I have
> successfully crawled and indexed smaller quantities of data in the
> past.  However, after I tried to scale up the crawling, Nutch would
> give an exception when indexing (the bottom of the log included
> below).  Please let me know if there's more information I should
> provide.
>
> I'd be very grateful for any suggestions or advice you may have.
>
> Thanks in advance,
> Jason Ma
>
> ....
>
>  Indexing [http://64.13.133.31/pics/up-VC2GQ0CA9QSHHGHM-s] with
> analyzer [EMAIL PROTECTED]
> (null)
>  Indexing [http://64.13.133.31/pics/up-VET16648L9TBU53B-s] with
> analyzer [EMAIL PROTECTED]
> (null)
>  Indexing [http://64.13.133.31/pics/up-VHIUOB6N8CVESR52-s] with
> analyzer [EMAIL PROTECTED]
> (null)
>  Indexing [http://64.13.133.31/pics/user_promo_mini.png] with analyzer
> [EMAIL PROTECTED] (null)
> Optimizing index.
> merging segments _73 (1 docs) _74 (1 docs) _75 (1 docs) _76 (1 docs)
> _77 (1 docs) _78 (1 docs) _79 (1 docs) _7a (1 docs) _7b (1 docs) _7c
> (1 docs) _7d (1 docs) _7e (1 docs) _7f (1 docs) _7g (1 docs) _7h (1
> docs) _7i (1 docs) _7j (1 docs) _7k (1 docs) _7l (1 docs) _7m (1 docs)
> _7n (1 docs) _7o (1 docs) _7p (1 docs) _7q (1 docs) into _7r (24 docs)
> merging segments _1e (50 docs) _2t (50 docs) _48 (50 docs) _5n (50
> docs) _72 (50 docs) _7r (24 docs) into _7s (274 docs)
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:357)
>         at org.apache.nutch.indexer.Indexer.index(Indexer.java:296)
>         at org.apache.nutch.indexer.Indexer.main(Indexer.java:313)
>

This exception is just the job runner telling us that your job has
failed; it doesn't show where the actual problem is. Check
logs/hadoop.log or your tasktracker's log files, and you should see a
more detailed message about what went wrong.
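
If the log is large, a small script can pull out the interesting lines.
The following is only a rough sketch, not part of Nutch or Hadoop: the
names find_failures and scan_file are made up for illustration, and it
assumes log4j-style lines that carry a WARN, ERROR, or FATAL level token,
followed by ordinary Java stack-trace lines.

```python
import re
from typing import Iterable, List

# Assumption: log lines carry a log4j-style level token somewhere in the line.
LEVEL_RE = re.compile(r"\b(WARN|ERROR|FATAL)\b")

# Lines that typically belong to a Java stack trace: the exception class
# line, "at ..." frames, and "Caused by: ..." chains.
TRACE_RE = re.compile(r"^\s+at |^Caused by: |^\S*(Exception|Error)\b")


def find_failures(lines: Iterable[str]) -> List[str]:
    """Return WARN/ERROR/FATAL lines plus the trace lines right after them."""
    hits: List[str] = []
    in_trace = False
    for raw in lines:
        line = raw.rstrip("\n")
        if LEVEL_RE.search(line):
            # A new problem report starts here.
            hits.append(line)
            in_trace = True
        elif in_trace and TRACE_RE.search(line):
            # Still inside the stack trace of the last reported problem.
            hits.append(line)
        else:
            in_trace = False
    return hits


def scan_file(path: str = "logs/hadoop.log") -> List[str]:
    """Hypothetical helper: run find_failures over a log file on disk."""
    with open(path, errors="replace") as fh:
        return find_failures(fh)
```

Calling scan_file() from the Nutch directory and printing the result
should surface the underlying exception, provided the log really follows
the assumed format; otherwise adjust the two patterns.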

-- 
Doğacan Güney
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general