Hello,
I would like to force all MapReduce jobs launched from the Hive shell to run as
the HDFS user who started them instead of as the "hive" user. For instance, the
HDFS user testuser1 is logged into the edge node under the Unix account of the
same name. This user starts a Hive shell and kicks off a Hive query. The
resulting job always runs as the "hive" user in HDFS, so all of its temporary
files land in the hive user's HDFS directory. This is a problem.
I'm using the following for the temp file location: hive.exec.scratchdir =
/user/${user.name}/.hive-temp/ . The intent is for every user to write their
temp files and intermediate job files under their own user directory so
that we can track per-user disk usage correctly. Because Hive jobs run as the
"hive" user, the temp files always end up in /user/hive/.hive-temp/ instead.
Is it possible to have Hive write its temp files under the HDFS directory of
the user who runs the Hive job?
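For reference, here is a sketch of the scratch directory setting described
above, written as a hive-site.xml property. It assumes the value is set in
hive-site.xml rather than per session; only the property name and value come
from the setup described in this message.

```xml
<!-- Sketch only: assumes the setting lives in hive-site.xml. -->
<property>
  <name>hive.exec.scratchdir</name>
  <!-- Intended result for testuser1: /user/testuser1/.hive-temp/ -->
  <!-- Observed result:               /user/hive/.hive-temp/      -->
  <value>/user/${user.name}/.hive-temp/</value>
</property>
```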
Thanks,
-Shawn
Shawn Higgins
Systems Engineer
Thomson Reuters
[email protected]
thomsonreuters.com