I am running a crawl on about 1 million web domains. After 30% of the map
phase is done, I see the usage below. The Non DFS Used figure seems very
high, around 31 GB, which suggests Nutch is creating a lot of temporary
files locally on the nodes. Is this correct? Hoping someone will answer this
post with at least an OK / not OK.

This is the first crawl on this Hadoop cluster, no other jobs are running,
and DFS held about 10 GB of data before this job started.

Cluster Summary:

314 files and directories, 460 blocks = 774 total. Heap Size is 14.82 MB / 966.69 MB (1%)

 Configured Capacity : 377.91 GB
 DFS Used            : 60.31 GB
 Non DFS Used        : 31.58 GB
 DFS Remaining       : 286.02 GB
 DFS Used%           : 15.96 %
 DFS Remaining%      : 75.69 %
 Live Nodes          : 8
 Dead Nodes          : 0
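
As a quick sanity check on those figures (a small Python sketch, assuming
the usual definition Non DFS Used = Configured Capacity - DFS Used - DFS
Remaining, with the cluster-wide totals pasted above):

# Recompute Non DFS Used from the cluster summary above.
configured_capacity_gb = 377.91
dfs_used_gb = 60.31
dfs_remaining_gb = 286.02
live_nodes = 8

non_dfs_used_gb = configured_capacity_gb - dfs_used_gb - dfs_remaining_gb
print("Non DFS Used (derived): %.2f GB" % non_dfs_used_gb)                # ~31.58 GB
print("Average per node:       %.2f GB" % (non_dfs_used_gb / live_nodes)) # ~3.95 GB

If I read this right, the 31.58 GB is the total across all 8 nodes, so
roughly 4 GB of non-DFS data per node, and as far as I know Non DFS Used
counts everything on the DataNode partitions that is not HDFS block data
(map intermediate output, logs, the OS, etc.), not only Nutch temp files.

To see how much of that is really map-side temporary data, I could total
the bytes under the MapReduce local scratch directory on one node. A rough
sketch (the directory below is just a placeholder; the real location is
whatever mapred.local.dir / hadoop.tmp.dir is set to in hadoop-site.xml):

import os

LOCAL_DIR = "/tmp/hadoop/mapred/local"  # placeholder, adjust to your mapred.local.dir

total = 0
for root, _dirs, files in os.walk(LOCAL_DIR):
    for name in files:
        try:
            total += os.path.getsize(os.path.join(root, name))
        except OSError:
            pass  # files can vanish while the job is running

print("%.2f GB under %s" % (total / 1024.0 ** 3, LOCAL_DIR))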

