The directory returned by getWorkOutputPath is a task-specific directory, to
be used for files that should be part of the final output of the job.
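
One quick way to tell which file system a path belongs to is to look at its
URI scheme. The sketch below is only an illustration using the plain JDK, not
the Hadoop API; the class and method names are made up for this example, and
the two paths are the ones quoted later in this thread.

```java
import java.net.URI;

public class PathSchemeCheck {
    /** Returns true when the path has no scheme or the "file" scheme, i.e. it names a local path. */
    static boolean isLocal(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null || scheme.equals("file");
    }

    public static void main(String[] args) {
        // Work output path reported in the thread: scheme is "hdfs", so it is not a local path.
        String workOutput = "hdfs://localhost:9000/user/bon/keys_fil.txt"
                + "/_temporary/_attempt_200907011515_0009_m_000000_0";
        // dfs.data.dir value reported in the thread: no scheme, so it is a local path.
        String dataDir = "/home/bon/my_hdfiles/temp_0.19.1/dfs/data";

        System.out.println("work output path local? " + isLocal(workOutput));
        System.out.println("dfs.data.dir local?     " + isLocal(dataDir));
    }
}
```

A path with the hdfs scheme is resolved by the distributed file system, so its
blocks may end up on any datanode in the cluster.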

If you want to write to the task-local directory, use the local file system
API and paths relative to '.'.
The parameter mapred.local.dir holds the name of the local directory.
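
As a rough sketch of the "paths relative to '.'" suggestion, the code below
writes a scratch file into the current working directory using only the JDK's
java.nio API. The class name, method name and file name are hypothetical, and
it assumes the process working directory is the directory you want; inside a
Hadoop task you could use Hadoop's own local file system object
(FileSystem.getLocal) instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalScratch {
    /** Writes text to a file under the current working directory and returns its relative path. */
    static Path writeScratch(String name, String text) throws IOException {
        Path p = Paths.get(".", name); // relative to '.', the process working directory
        Files.write(p, text.getBytes(StandardCharsets.UTF_8));
        return p;
    }

    public static void main(String[] args) throws IOException {
        Path p = writeScratch("scratch.tmp", "intermediate data\n");
        // Show where the relative path actually landed on the local disk.
        System.out.println("wrote " + p.toAbsolutePath().normalize());
        Files.deleteIfExists(p); // scratch data is not part of the job output
    }
}
```

Files written this way stay on the node's local disk and are not part of the
job's final output.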


On Wed, Jul 1, 2009 at 9:19 AM, bonito perdo <bonito.pe...@googlemail.com> wrote:

> Thank you for your immediate response.
> In this case, what is the difference with the path obtained from
> FileOutputFormat.getWorkOutputPath(job)? This path refers to HDFS...
>
> Thank you.
>
>
> On Wed, Jul 1, 2009 at 5:13 PM, jason hadoop <jason.had...@gmail.com>
> wrote:
>
> > The parameter mapred.local.dir controls the directory used by the task
> > tracker for map/reduce jobs' local files.
> >
> > The dfs.data.dir parameter is for the datanode.
> >
> > On Wed, Jul 1, 2009 at 8:56 AM, bonito <bonito.pe...@gmail.com> wrote:
> >
> > >
> > > Hello,
> > > I am a bit confused about the local directories where each
> > > map/reduce task can store data.
> > > According to what I have read,
> > > dfs.data.dir - is the path on the local file system in which the
> > > DataNode instance should store its data. That is, since we have a
> > > number of individual nodes, this is the place where each node can
> > > store its own data. Right?
> > > This data may be part of a-let's say- file stored under the hdfs
> > > namespace?
> > > The value of this property for my configuration is:
> > >                          /home/bon/my_hdfiles/temp_0.19.1/dfs/data.
> > > As far as I can understand this path refers to the local "disk" of
> > > each node.
> > >
> > > Moreover, calling FileOutputFormat.getWorkOutputPath(job) we obtain
> > > the Path to the task's temporary output directory for the
> > > map-reduce job. This path is totally different than the previous
> > > which confuses me since the temporary output of each task should be
> > > written locally in the node's disk. The path I retrieve is:
> > >
> > > hdfs://localhost:9000/user/bon/keys_fil.txt/_temporary/_attempt_200907011515_0009_m_000000_0
> > >
> > > Does this path refer to the local disk (node)? Or is it possible
> > > that it may refer to another node in the cluster?
> > >
> > > Any clarification would be of great help.
> > >
> > > Thank you.
> > > --
> > > View this message in context:
> > > http://www.nabble.com/local-directory-tp24292289p24292289.html
> > > Sent from the Hadoop core-user mailing list archive at Nabble.com.
> > >
> > >
> >
> >
> > --
> > Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> > http://www.amazon.com/dp/1430219424?tag=jewlerymall
> > www.prohadoopbook.com a community for Hadoop Professionals
> >
>



