Hi. I'm trying to set up YARN log aggregation on a cluster that uses a shared NFS filesystem (no HDFS). The issue is that the user directories created under yarn.nodemanager.remote-app-log-dir are owned by the node manager's owner rather than by the submitting user, who therefore can't access their own logs. I've also tried running the node manager as root, to no avail.
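For reference, a minimal sketch of the kind of yarn-site.xml setup described above. The two property names are the standard YARN ones; the NFS path is a hypothetical placeholder, and the sketch assumes the cluster's default filesystem is file:/// so that the path resolves to the local (NFS-mounted) filesystem.

```xml
<!-- yarn-site.xml (fragment); the path below is a placeholder, not a real mount point -->
<configuration>
  <!-- Turn on aggregation of container logs after an application finishes -->
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <!-- Root directory for aggregated logs; per-user directories are created beneath it.
       Assumed here to resolve against a file:/// default filesystem on the NFS mount. -->
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/mnt/nfs/yarn-logs</value>
  </property>
</configuration>
```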
Looking into the sources, it seems that the LogWriter creates the file within a UserGroupInformation.doAs block. I'm guessing that under HDFS this means "impersonating" the submitting user, so the file gets created with their ownership. However, if yarn.nodemanager.remote-app-log-dir is on a file:/// filesystem rather than HDFS, this doesn't happen, and the file ends up owned by whoever runs the node manager process.

Can anyone confirm this, and/or suggest a workaround? Is it perhaps possible to access the aggregated logs via some sort of REST API?

Thanks,
Shay