[ https://issues.apache.org/jira/browse/HADOOP-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893521#action_12893521 ]

Meng Mao commented on HADOOP-6092:
----------------------------------

We've been seeing what we think is the same thing in Hadoop 0.20.1.

Here's what the dfs health page says:
Configured Capacity : 77.22 TB
DFS Used            : 58.18 TB
Non DFS Used        : 5.69 TB
DFS Remaining       : 13.34 TB
DFS Used%           : 75.35 %
DFS Remaining%      : 17.27 %
Live Nodes          : 45
Dead Nodes          : 0
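
(For what it's worth, those figures are self-consistent: 58.18 + 5.69 + 13.34 = 77.21 TB, which matches the 77.22 TB configured capacity up to rounding.)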

Even the most occupied node has just under 80% used space.

For both Java and Streaming jobs, we've seen exceptions like the one below, which is from a streaming run:
Failed to merge on the local FS
org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:192)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.mapred.IFileOutputStream.write(IFileOutputStream.java:84)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:218)
        at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:157)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2533)
Caused by: java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:190)
        ... 10 more
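
If I'm reading the trace right, the failing write is in ReduceTask$ReduceCopier$LocalFSMerger going through RawLocalFileSystem, i.e. the reduce-side merge spilling to the tasktracker's local disks (mapred.local.dir), not to HDFS, so the DFS Remaining figures above wouldn't necessarily reflect the disk that's filling up. As a quick sanity check we can run something like the sketch below on the nodes to compare against the health page; the paths are just placeholders for whatever mapred.local.dir is actually set to in hadoop-site.xml / mapred-site.xml:

import java.io.File;

// One-shot check of total vs. usable space on each mapred.local.dir volume.
// The paths below are placeholders; substitute the real mapred.local.dir entries.
public class LocalDirSpaceCheck {
    public static void main(String[] args) {
        String[] localDirs = { "/disk1/mapred/local", "/disk2/mapred/local" };
        long gb = 1024L * 1024 * 1024;
        for (String dir : localDirs) {
            File f = new File(dir);
            System.out.printf("%s: total=%d GB, usable=%d GB%n",
                              dir, f.getTotalSpace() / gb, f.getUsableSpace() / gb);
        }
    }
}

Presumably, if a single merged spill needs more temporary space than any one local volume has free, the merge would fail like this even while HDFS shows plenty remaining.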

This job had about 100 GB of input, and I'd guess roughly that much data gets passed into the reducers. Watching dfsnodelist.jsp?whatNodes=LIVE during job execution didn't show much growth, and certainly no node getting to 100%. Though as far as I can tell, that page only tracks per-datanode DFS usage, so I'm not sure it would catch local disk filling up anyway.

AFAIK, we have the physical disk space to match what the dfs pages indicate. 

What can we do to investigate how we're running out of space?
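
Since the space used by the merge is presumably transient, a one-time check could miss the peak, so we may also try sampling free space on each node while the reducers run, something like this (again just a sketch with the same placeholder paths):

import java.io.File;

// Crude poller to catch transient local-disk peaks during the shuffle/merge.
// Run on each node for the duration of the job; paths are placeholders.
public class LocalDirPoller {
    public static void main(String[] args) throws InterruptedException {
        String[] localDirs = { "/disk1/mapred/local", "/disk2/mapred/local" };
        while (true) {
            StringBuilder line = new StringBuilder();
            for (String dir : localDirs) {
                long freeMb = new File(dir).getUsableSpace() / (1024L * 1024);
                line.append(dir).append(": ").append(freeMb).append(" MB free  ");
            }
            System.out.println(System.currentTimeMillis() + "  " + line);
            Thread.sleep(5000); // sample every 5 seconds
        }
    }
}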

> No space left on device
> -----------------------
>
>                 Key: HADOOP-6092
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6092
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.19.0
>         Environment: Ubuntu 8.04
>            Reporter: mawanqiang
>
> Exception in thread "main" org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:199)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:339)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:825)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
>         at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
>         at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
> Caused by: java.io.IOException: No space left on device
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:260)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:197)
