The information you posted won't be enough for the developers to debug the
problem, but I think I can lend a hand here. I have recently been seeing the
same problem: I can't finish the indexing step. It happens with revision
547377, which I'm currently using, and with the one before it.
 
Here is my log output:

2007-06-10 05:01:17,511 WARN  mapred.LocalJobRunner - job_dmp15q
java.lang.NullPointerException: value cannot be null
        at org.apache.lucene.document.Field.<init>(Field.java:188)
        at org.apache.lucene.document.Field.<init>(Field.java:164)
        at org.apache.nutch.indexer.Indexer.reduce(Indexer.java:199)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:313)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:155)
2007-06-10 05:01:17,551 FATAL indexer.Indexer - Indexer: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
        at org.apache.nutch.indexer.Indexer.index(Indexer.java:279)
        at org.apache.nutch.indexer.Indexer.run(Indexer.java:301)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
        at org.apache.nutch.indexer.Indexer.main(Indexer.java:284)
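
The top frames show the Lucene Field constructor rejecting a null value that
was passed in from Indexer.reduce. I have not checked which field is null at
Indexer.java:199, so the following is only a rough sketch of the kind of guard
that would avoid the exception. NullSafeFields, addStrict and addIfPresent are
hypothetical stand-ins for illustration, not the real Lucene or Nutch classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a document being indexed: field name -> value.
// This is NOT the Lucene Document/Field API, only an illustration.
class NullSafeFields {
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Mirrors the behaviour seen in the trace: a null value is rejected.
    public void addStrict(String name, String value) {
        if (value == null) {
            throw new NullPointerException("value cannot be null");
        }
        fields.put(name, value);
    }

    // Guarded variant: skip a missing value instead of failing the whole job.
    // Returns true if the field was added, false if it was skipped.
    public boolean addIfPresent(String name, String value) {
        if (value == null) {
            return false;
        }
        addStrict(name, value);
        return true;
    }

    public Map<String, String> fields() {
        return fields;
    }

    public static void main(String[] args) {
        NullSafeFields doc = new NullSafeFields();
        doc.addIfPresent("url", "http://example.com/");
        doc.addIfPresent("title", null); // skipped, no exception
        System.out.println(doc.fields());
    }
}
```

Whether skipping the field is the right fix depends on why the value is null
in the first place; logging the URL of the affected page before skipping would
help narrow that down.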


----- Original Message ----
From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Thursday, June 14, 2007 2:25:06 PM
Subject: Indexing problems in nutch-nightly


I have been experimenting with the last four nightly builds, using the
intranet method with about 400 seed sites, a depth of 4, topN 600, and one
computer.  Every time I got the following error message:

Indexing [http://200.0.198.11/Biblioteca/p-periodicas/index.htm] with analyzer
[EMAIL PROTECTED] (null)
Indexer: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
        at org.apache.nutch.indexer.Indexer.index(Indexer.java:279)
        at org.apache.nutch.indexer.Indexer.run(Indexer.java:301)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
        at org.apache.nutch.indexer.Indexer.main(Indexer.java:284)

[EMAIL PROTECTED] nutch-2007-06-14_07-21-27]#

I had gotten some "Job failed!" messages before (though never at this point
of indexing), and restarting the computer and indexing again had solved the
problem. I tried the same thing here, but with no result.

On the other hand, the same task with the 0.9 release was successful on every
try, with the same crawl settings.  So I am just pointing this out because I
think there may be a problem in the indexing phase of the newer nightly
versions.

Thanks