[ https://issues.apache.org/jira/browse/WHIRR-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138597#comment-13138597 ]

Jongwook Woo commented on WHIRR-413:
------------------------------------

Yes, that is what I pointed out in this issue. Whirr 0.6.0 and 0.7.0 create 
/data/tmp, but while running HBase code with Hadoop the jobcache files are 
stored under /tmp, and no files are generated under /data/tmp.

I have not tested CDH. For Whirr 0.6.0, I simply downloaded a release and ran 
it; for Whirr 0.7.0, I checked the source out of SVN and ran it.

I may also take a look at the Whirr code. If you have any ideas on the 
following questions, it will be much easier to resolve the issue:
(1) Is "whirr-hbase-default.properties" the only file that determines the 
location of the jobcache files? 
(2) Do you remember any other configuration files or code that specify the 
jobcache location? 
(3) Do you think my code, or any TableMapper Hadoop code that scans HBase, 
can itself specify /tmp or /data/tmp as the folder for jobcache files?
 
                
> jobcache file is stored at /tmp/ folder so that it has out of storage error
> ---------------------------------------------------------------------------
>
>                 Key: WHIRR-413
>                 URL: https://issues.apache.org/jira/browse/WHIRR-413
>             Project: Whirr
>          Issue Type: Bug
>          Components: build, service/hadoop
>    Affects Versions: 0.6.0, 0.7.0
>         Environment: - Ubuntu-11.10
> - java version "1.6.0_23"
> OpenJDK Runtime Environment (IcedTea6 1.11pre) (6b23~pre10-0ubuntu5)
> OpenJDK Client VM (build 20.0-b11, mixed mode, sharing)
> - ruby 1.8.7 (2011-06-30 patchlevel 352) [i686-linux]
> - Apache Maven 3.0.3 (r1075438; 2011-02-28 09:31:09-0800)
> Maven home: /home/jongwook/apache/apache-maven-3.0.3
> Java version: 1.6.0_23, vendor: Sun Microsystems Inc.
> Java home: /usr/lib/jvm/java-6-openjdk/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "3.0.0-12-generic", arch: "i386", family: "unix"
>            Reporter: Jongwook Woo
>            Priority: Critical
>              Labels: build
>             Fix For: 0.6.0, 0.7.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When I run Hadoop to read/write data from/to HBase, I get the following 
> error because of insufficient storage space under /tmp/.
> I assume Whirr is supposed to use /data/tmp/ to store jobcache files such as 
> taskTracker/jobcache/job_local_0001/attempt_local_0001_m_0000xx_0/output/file.out
>  because /data/tmp/ has 335 GB. However, they are stored under /tmp/, which 
> has only 9.9 GB. Thus some configuration XML file appears to be incorrect. 
> This generates errors in both 0.6.0 and 0.7.0.
> -----Storage space check ---------------------------------------
> jongwook@ip-10-245-174-15:/tmp/hadoop-jongwook/mapred/local/taskTracker/jobcache/job_local_0001$
>  cd /tmp
> jongwook@ip-10-245-174-15:/tmp$ df -h .
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda1             9.9G  9.1G  274M  98% /
> jongwook@ip-10-245-174-15:/tmp$ df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda1             9.9G  9.1G  274M  98% /
> none                  846M  116K  846M   1% /dev
> none                  879M     0  879M   0% /dev/shm
> none                  879M   68K  878M   1% /var/run
> none                  879M     0  879M   0% /var/lock
> none                  879M     0  879M   0% /lib/init/rw
> /dev/sda2             335G  199M  318G   1% /mnt
> -----Error msg at the end of hadoop/hbase code 
> -------------------------------------------------------
> 11/10/27 03:33:09 INFO mapred.MapTask: Finished spill 61
> 11/10/27 03:33:09 WARN mapred.LocalJobRunner: job_local_0001
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any 
> valid local directory for 
> taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000016_0/output/file.out
>       at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>       at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>       at 
> org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1469)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
>       at 
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>       at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 11/10/27 03:33:09 INFO mapred.JobClient: Job complete: job_local_0001
> 11/10/27 03:33:09 INFO mapred.JobClient: Counters: 8
> 11/10/27 03:33:09 INFO mapred.JobClient:   FileSystemCounters
> 11/10/27 03:33:09 INFO mapred.JobClient:     FILE_BYTES_READ=103074405254
> 11/10/27 03:33:09 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=156390149579
> 11/10/27 03:33:09 INFO mapred.JobClient:   Map-Reduce Framework
> 11/10/27 03:33:09 INFO mapred.JobClient:     Combine output records=0
> 11/10/27 03:33:09 INFO mapred.JobClient:     Map input records=13248198
> 11/10/27 03:33:09 INFO mapred.JobClient:     Spilled Records=788109966
> 11/10/27 03:33:09 INFO mapred.JobClient:     Map output bytes=5347057080
> 11/10/27 03:33:09 INFO mapred.JobClient:     Combine input records=0
> 11/10/27 03:33:09 INFO mapred.JobClient:     Map output records=278212138
> It takes: 1966141 msec
> 11/10/27 03:33:10 INFO zookeeper.ZooKeeper: Session: 0x13341a966cb000d closed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
