[
https://issues.apache.org/jira/browse/WHIRR-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149220#comment-13149220
]
Jongwook Woo commented on WHIRR-413:
------------------------------------
I also ran the same HBase code on a dataset twice as large, without scan.setCaching(2000),
and it fails with the same error because of the space limit:
11/11/13 04:51:34 WARN mapred.LocalJobRunner: job_local_0001
org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:192)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.mapred.IFileOutputStream.write(IFileOutputStream.java:84)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:217)
    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:157)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1535)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:282)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:190)
    ... 15 more
jongwook@ip-10-84-69-196:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       9.9G  9.4G  1.2M 100% /
none            846M  116K  846M   1% /dev
none            879M     0  879M   0% /dev/shm
none            879M   64K  878M   1% /var/run
none            879M     0  879M   0% /var/lock
none            879M     0  879M   0% /lib/init/rw
/dev/sda2       335G  200M  318G   1% /mnt
Thus, the error is not primarily caused by "scan.setCaching(2000)". If there were a
way to configure Hadoop so that these files go to a partition larger than /dev/sda1
(the one holding /tmp), the issue could be resolved.
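For what it's worth, in stock Hadoop of this era the local job runner writes map spills and the jobcache under mapred.local.dir, which defaults to ${hadoop.tmp.dir}/mapred/local — and hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, matching the paths in the logs above. A minimal sketch of a workaround, assuming the large /dev/sda2 partition mounted at /mnt is writable and that Whirr is not already overriding these properties, would be to redirect both directories there:

```xml
<!-- core-site.xml: move Hadoop's base temp directory off the small root
     partition. /mnt/tmp is an assumed path on the 335G /dev/sda2 mount;
     it must exist and be writable by the job user. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/tmp/hadoop-${user.name}</value>
</property>

<!-- mapred-site.xml: local directory used for map output spills and the
     taskTracker/jobcache files shown in the error above. -->
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/tmp/hadoop-${user.name}/mapred/local</value>
</property>
```

These are standard Hadoop property names, but exactly where Whirr generates or templates them for a cluster is an assumption worth verifying against the deployed core-site.xml/mapred-site.xml.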
> jobcache file is stored at /tmp/ folder so that it has out of storage error
> ---------------------------------------------------------------------------
>
> Key: WHIRR-413
> URL: https://issues.apache.org/jira/browse/WHIRR-413
> Project: Whirr
> Issue Type: Bug
> Components: build, service/hadoop
> Affects Versions: 0.6.0, 0.7.0
> Environment: - Ubuntu-11.10
> - java version "1.6.0_23"
> OpenJDK Runtime Environment (IcedTea6 1.11pre) (6b23~pre10-0ubuntu5)
> OpenJDK Client VM (build 20.0-b11, mixed mode, sharing)
> - ruby 1.8.7 (2011-06-30 patchlevel 352) [i686-linux]
> - Apache Maven 3.0.3 (r1075438; 2011-02-28 09:31:09-0800)
> Maven home: /home/jongwook/apache/apache-maven-3.0.3
> Java version: 1.6.0_23, vendor: Sun Microsystems Inc.
> Java home: /usr/lib/jvm/java-6-openjdk/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "3.0.0-12-generic", arch: "i386", family: "unix"
> Reporter: Jongwook Woo
> Priority: Critical
> Labels: build
> Fix For: 0.6.0, 0.7.0
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> When I run a Hadoop job that reads/writes data from/to HBase, I get the following
> error because there is too little storage space under /tmp/.
> I assume Whirr is supposed to use /data/tmp/ to store jobcache files such as
> taskTracker/jobcache/job_local_0001/attempt_local_0001_m_0000xx_0/output/file.out,
> because /data/tmp/ has 335GB. However, they are stored under /tmp/, which has only
> 9.9G. Thus, some configuration XML file seems to be incorrect. This produces
> errors on both 0.6.0 and 0.7.0.
> -----Storage space check ---------------------------------------
> jongwook@ip-10-245-174-15:/tmp/hadoop-jongwook/mapred/local/taskTracker/jobcache/job_local_0001$
> cd /tmp
> jongwook@ip-10-245-174-15:/tmp$ df -h .
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1       9.9G  9.1G  274M  98% /
> jongwook@ip-10-245-174-15:/tmp$ df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1       9.9G  9.1G  274M  98% /
> none            846M  116K  846M   1% /dev
> none            879M     0  879M   0% /dev/shm
> none            879M   68K  878M   1% /var/run
> none            879M     0  879M   0% /var/lock
> none            879M     0  879M   0% /lib/init/rw
> /dev/sda2       335G  199M  318G   1% /mnt
> -----Error msg at the end of the Hadoop/HBase run -----------------------------
> 11/10/27 03:33:09 INFO mapred.MapTask: Finished spill 61
> 11/10/27 03:33:09 WARN mapred.LocalJobRunner: job_local_0001
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000016_0/output/file.out
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>     at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1469)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 11/10/27 03:33:09 INFO mapred.JobClient: Job complete: job_local_0001
> 11/10/27 03:33:09 INFO mapred.JobClient: Counters: 8
> 11/10/27 03:33:09 INFO mapred.JobClient: FileSystemCounters
> 11/10/27 03:33:09 INFO mapred.JobClient: FILE_BYTES_READ=103074405254
> 11/10/27 03:33:09 INFO mapred.JobClient: FILE_BYTES_WRITTEN=156390149579
> 11/10/27 03:33:09 INFO mapred.JobClient: Map-Reduce Framework
> 11/10/27 03:33:09 INFO mapred.JobClient: Combine output records=0
> 11/10/27 03:33:09 INFO mapred.JobClient: Map input records=13248198
> 11/10/27 03:33:09 INFO mapred.JobClient: Spilled Records=788109966
> 11/10/27 03:33:09 INFO mapred.JobClient: Map output bytes=5347057080
> 11/10/27 03:33:09 INFO mapred.JobClient: Combine input records=0
> 11/10/27 03:33:09 INFO mapred.JobClient: Map output records=278212138
> It takes: 1966141 msec
> 11/10/27 03:33:10 INFO zookeeper.ZooKeeper: Session: 0x13341a966cb000d closed
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira