Hi,

The messages you show come from the namenode logs, and it looks like the
replication setting is too high.

Since you are using 2 machines, a reasonable replication factor for DFS
blocks is two.

So you need to add the following property to your conf/hadoop-site.xml:

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
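
If you want to double-check the value after editing the file, a small script
along these lines may help. It is only a rough sketch: the helper names
(read_property, replication_fits) and the SAMPLE text are made up for
illustration, and it assumes the usual <configuration> root element around the
<property> entries. It simply compares dfs.replication with the number of
datanodes you tell it about.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def replication_fits(xml_text, datanodes):
    """True when dfs.replication is set and does not exceed the datanode count."""
    value = read_property(xml_text, "dfs.replication")
    return value is not None and int(value) <= datanodes


# Hypothetical stand-in for the contents of conf/hadoop-site.xml.
SAMPLE = """<configuration>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
</configuration>"""

if __name__ == "__main__":
    print(read_property(SAMPLE, "dfs.replication"))
    print(replication_fits(SAMPLE, 2))
```

To run it against the real file, read conf/hadoop-site.xml into a string and
pass that in place of SAMPLE, with datanodes set to 2 for this cluster.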

This usually has nothing to do with generate, so please post the message
you get from the generate process itself.

HTH,

Gal

-----Original Message-----
From: Jason Culverhouse [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 02, 2007 2:28 AM
To: [email protected]
Subject: Nutch 0.9-dev trunk generate task failing/not completing

I have a 2-node testing cluster, and I can no longer run the generate
task without a log filled with errors (after about 19 successful
iterations)

running
./bin/nutch generate crawl/crawldb testsegments -topN 10

generates the following
----
2007-02-01 15:28:37,510 WARN  fs.FSNamesystem - Replication requested  
of 10 is larger than cluster size (2). Using cluster size.
2007-02-01 15:28:37,510 DEBUG dfs.StateChange - DIR*  
NameSystem.startFile: add /home/nutch/nutch/filesystem/mapred/system/ 
submit_l03eux/.job.xml.crc to pendingCreates for DFSClient_-980095089
2007-02-01 15:28:37,511 DEBUG dfs.StateChange - BLOCK*  
NameSystem.allocateBlock: /home/nutch/nutch/filesystem/mapred/system/ 
submit_l03eux/.job.xml.crc. blk_4008532917963069756 is created and  
added to pendingCreates and pendingCreateBlocks
2007-02-01 15:28:37,594 DEBUG dfs.DataNode - Number of active  
connections is: 1
2007-02-01 15:28:37,627 INFO  dfs.DataNode - Received block  
blk_4008532917963069756 from /192.168.10.95

----
After this the log is filled with endless groups of 8 fs.FSNamesystem  
- Could not find any nodes with sufficient capacity
(10 replicas requested - 2 slaves?)

This seems to put the server in a loop that eats up all the CPU while
the tasks run, leaving very little for the actual jobs
----

2007-02-01 15:28:40,786 INFO  mapred.JobClient - Running job: job_0001
2007-02-01 15:28:41,328 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,328 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,328 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,328 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,678 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,678 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,678 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,678 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:41,794 INFO  mapred.JobClient -  map 0% reduce 0%
2007-02-01 15:28:44,320 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:44,320 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:44,320 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:44,320 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:44,699 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:44,699 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:44,699 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity
2007-02-01 15:28:44,699 WARN  fs.FSNamesystem - Could not find any  
nodes with sufficient capacity


Can anyone provide any help with this problem?

These continue even after I kill the generate task

Jason
  


