[
https://issues.apache.org/jira/browse/WHIRR-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138597#comment-13138597
]
Jongwook Woo commented on WHIRR-413:
------------------------------------
Yes, that is what I pointed out in this issue. Whirr 0.6.0 and 0.7.0 create
/data/tmp, but the jobcache files are stored under /tmp while running HBase code
with Hadoop, and no files are generated under /data/tmp.
I did not test CDH. For Whirr 0.6.0 I downloaded a release and ran it; for Whirr
0.7.0 I checked the code out of SVN and ran it.
I may also take a look at the Whirr code. If you have any ideas on the following
questions, it will be much easier to resolve the issue:
(1) Is "whirr-hbase-default.properties" the only file that determines the
location of the jobcache files?
(2) Do you remember any other configuration files or code that specify the
location of the jobcache?
(3) Do you think my code - or any TableMapper Hadoop code that scans HBase -
can itself specify /tmp or /data/tmp as the folder for jobcache files?
> jobcache file is stored at /tmp/ folder so that it has out of storage error
> ---------------------------------------------------------------------------
>
> Key: WHIRR-413
> URL: https://issues.apache.org/jira/browse/WHIRR-413
> Project: Whirr
> Issue Type: Bug
> Components: build, service/hadoop
> Affects Versions: 0.6.0, 0.7.0
> Environment: - Ubuntu-11.10
> - java version "1.6.0_23"
> OpenJDK Runtime Environment (IcedTea6 1.11pre) (6b23~pre10-0ubuntu5)
> OpenJDK Client VM (build 20.0-b11, mixed mode, sharing)
> - ruby 1.8.7 (2011-06-30 patchlevel 352) [i686-linux]
> - Apache Maven 3.0.3 (r1075438; 2011-02-28 09:31:09-0800)
> Maven home: /home/jongwook/apache/apache-maven-3.0.3
> Java version: 1.6.0_23, vendor: Sun Microsystems Inc.
> Java home: /usr/lib/jvm/java-6-openjdk/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "3.0.0-12-generic", arch: "i386", family: "unix"
> Reporter: Jongwook Woo
> Priority: Critical
> Labels: build
> Fix For: 0.6.0, 0.7.0
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> When I run Hadoop to read/write data from/to HBase, I get the following error
> because of the limited storage space at /tmp/.
> I assume Whirr is supposed to use /data/tmp/ to store jobcache files such as
> taskTracker/jobcache/job_local_0001/attempt_local_0001_m_0000xx_0/output/file.out
> because /data/tmp/ has 335GB. However, they are stored under /tmp/, which has
> only 9.9GB, so some configuration XML file seems to be incorrect. This
> generates errors on both 0.6.0 and 0.7.0.
> ----- Storage space check --------------------------------------------------
> jongwook@ip-10-245-174-15:/tmp/hadoop-jongwook/mapred/local/taskTracker/jobcache/job_local_0001$
> cd /tmp
> jongwook@ip-10-245-174-15:/tmp$ df -h .
> Filesystem Size Used Avail Use% Mounted on
> /dev/sda1 9.9G 9.1G 274M 98% /
> jongwook@ip-10-245-174-15:/tmp$ df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/sda1 9.9G 9.1G 274M 98% /
> none 846M 116K 846M 1% /dev
> none 879M 0 879M 0% /dev/shm
> none 879M 68K 878M 1% /var/run
> none 879M 0 879M 0% /var/lock
> none 879M 0 879M 0% /lib/init/rw
> /dev/sda2 335G 199M 318G 1% /mnt
> ----- Error msg at the end of the Hadoop/HBase run -------------------------
> 11/10/27 03:33:09 INFO mapred.MapTask: Finished spill 61
> 11/10/27 03:33:09 WARN mapred.LocalJobRunner: job_local_0001
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any
> valid local directory for
> taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000016_0/output/file.out
> at
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
> at
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
> at
> org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
> at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1469)
> at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
> at
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 11/10/27 03:33:09 INFO mapred.JobClient: Job complete: job_local_0001
> 11/10/27 03:33:09 INFO mapred.JobClient: Counters: 8
> 11/10/27 03:33:09 INFO mapred.JobClient: FileSystemCounters
> 11/10/27 03:33:09 INFO mapred.JobClient: FILE_BYTES_READ=103074405254
> 11/10/27 03:33:09 INFO mapred.JobClient: FILE_BYTES_WRITTEN=156390149579
> 11/10/27 03:33:09 INFO mapred.JobClient: Map-Reduce Framework
> 11/10/27 03:33:09 INFO mapred.JobClient: Combine output records=0
> 11/10/27 03:33:09 INFO mapred.JobClient: Map input records=13248198
> 11/10/27 03:33:09 INFO mapred.JobClient: Spilled Records=788109966
> 11/10/27 03:33:09 INFO mapred.JobClient: Map output bytes=5347057080
> 11/10/27 03:33:09 INFO mapred.JobClient: Combine input records=0
> 11/10/27 03:33:09 INFO mapred.JobClient: Map output records=278212138
> It takes: 1966141 msec
> 11/10/27 03:33:10 INFO zookeeper.ZooKeeper: Session: 0x13341a966cb000d closed