I'm not sure what you're going for here, but my impression of what this
parameter does is that if multiple machines write to the same directory,
things will explode. That would not be the case, however, if the
developers coded MapReduce so that each machine names its directories or
files after itself (e.g., by incorporating the node name or the node's
MAC address).
Something you could do in the event that the contents of
${yarn.app.mapreduce.am.staging-dir} are /not/ named in a node-specific
way would be to incorporate that yourself like so:
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>file:/share_mnt/tmp/${HOSTNAME}</value>
</property>
Then you'll wind up with each node's files side by side on that share.
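One caveat about that value, as far as I understand Hadoop's variable
expansion: ${...} is resolved against other configuration properties and
Java system properties, and Hadoop 3.x additionally accepts environment
variables in the form ${env.NAME}. A bare ${HOSTNAME} is therefore left
as literal text unless something defines a property with that name. A
sketch of the environment-variable form, assuming Hadoop 3.x and assuming
HOSTNAME is actually exported in the environment of the process that
loads the configuration (bash sets it but does not export it by default):

```xml
<!-- mapred-site.xml: sketch only, assumes Hadoop 3.x ${env.NAME} expansion -->
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <!-- HOSTNAME must be exported in the environment of the loading process -->
  <value>file:/share_mnt/tmp/${env.HOSTNAME}</value>
</property>
```

The hostname is expanded wherever the configuration is loaded, so it is
worth checking that whatever reads the staging directory resolves the
same path as whatever wrote to it.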
I'd ask, though: are you sure you want to do this? What mechanism uses
this staging directory? Is it every worker's NodeManager daemon? If it
is, do you really want every node writing to the same network share
during a job?
On 6/28/18 10:36 PM, Ascot Moss wrote:
Hi,
Can "yarn.app.mapreduce.am.staging-dir" be set to use a Linux shared
mount point? If yes, is the following correct?
mapred-site.xml
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>file:/share_mnt/tmp</value>
</property>
where "share_mnt" is the shared folder that can be accessed by all nodes.
Please help!