I'm not sure what you're going for here, but my impression of this parameter is that if multiple machines write to the same directory, things will explode. That would not be the case, however, if the developers coded MapReduce so that each machine names its directories or files in a machine-specific way (e.g., by incorporating the node name or the node's MAC address).

If the contents of ${yarn.app.mapreduce.am.staging-dir} are /not/ named in a node-specific way, you could incorporate that yourself like so:

   <property>
      <name>yarn.app.mapreduce.am.staging-dir</name>
      <value>file:/share_mnt/tmp/${HOSTNAME}</value>
   </property>

Then you'll wind up with each node's files side by side on that share.
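
If ${HOSTNAME} turns out not to be expanded when the XML is read (worth checking on your Hadoop version), one alternative would be to build the per-node path in code, say in a job driver. The following is only a rough Java sketch of that idea, not something taken from Hadoop itself: PerNodeStagingDir, forHost, and localHostName are names made up for illustration, and only java.net.InetAddress is real JDK API. Whether a per-node staging directory actually works for job submission is a separate question that this sketch does not answer.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (not part of Hadoop): builds a per-node staging path.
class PerNodeStagingDir {

    // Appends the host name to the base URI, avoiding a doubled slash.
    static String forHost(String baseUri, String hostName) {
        String base = baseUri.endsWith("/")
                ? baseUri.substring(0, baseUri.length() - 1)
                : baseUri;
        return base + "/" + hostName;
    }

    // Looks up this machine's host name through the JDK.
    static String localHostName() throws UnknownHostException {
        return InetAddress.getLocalHost().getHostName();
    }

    public static void main(String[] args) throws UnknownHostException {
        String dir = forHost("file:/share_mnt/tmp", localHostName());
        // Assumption: in a job driver, this string would be the value set
        // for yarn.app.mapreduce.am.staging-dir on the job's configuration.
        System.out.println(dir);
    }
}
```

Keeping the path construction in a small pure function makes it easy to check without a cluster; only localHostName() depends on the machine it runs on.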

I'd ask, though: are you sure you want to do this? What mechanism uses this staging directory? Is it every worker's NodeManager daemon? If it is, do you really want every node writing to the same network share during a job?



On 6/28/18 10:36 PM, Ascot Moss wrote:
Hi,

Can "yarn.app.mapreduce.am.staging-dir" be set to use a Linux shared mount point? If yes, is the following correct?

mapred-site.xml

<property>
   <name>yarn.app.mapreduce.am.staging-dir</name>
   <value>file:/share_mnt/tmp</value>
</property>

where "share_mnt" is the shared folder that can be accessed by all nodes.

Please help!




