You can use mapred.local.dir for this purpose. It accepts a comma-separated list of directories that tasks may use for their local intermediate data, much as dfs.data.dir spreads block reads and writes across multiple disks.
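
For example, here is a minimal sketch of the mapred-site.xml entry, assuming the two drives are mounted at /disk1 and /disk2 (hypothetical paths, substitute your own mount points):

```xml
<!-- mapred-site.xml on each TaskTracker node -->
<property>
  <name>mapred.local.dir</name>
  <!-- One directory per physical disk, comma-separated, no spaces.
       /disk1 and /disk2 are assumed mount points, not taken from your setup. -->
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
```

The directories should exist and be writable by the user running the TaskTracker, and the TaskTrackers will most likely need a restart before the change takes effect.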

On Sun, Apr 22, 2012 at 12:50 PM, mete <efk...@gmail.com> wrote:
> Hello folks,
>
> I have a job that processes text files from hdfs on local fs (temp
> directory) and then copies those back to hdfs.
> I added another drive to each server to have better io performance, but as
> far as i could see hadoop.tmp.dir will not benefit from multiple disks, even
> if i setup two different folders on different disks. (dfs.data.dir works
> fine). As a result the disk with temp folder set is highly utilized, where
> the other one is a little bit idle.
> Does anyone have an idea on what to do? (i am using cdh3u3)
>
> Thanks in advance
> Mete

--
Harsh J