Re: Submitting Hadoop jobs from a shell action leads to the lost of the user identity

Alejandro Abdelnur Thu, 21 Feb 2013 09:02:09 -0800

Clément,

Oozie provides MR and Pig actions that propagate the user identity
correctly. A shell action, on the other hand, ends up running as the Unix
user that runs the task. In an unsecure cluster that is typically the
mapred user. In a secure cluster it is the same user that submitted the
job to Oozie (this requires provisioning the user on all nodes in the
cluster), but then you'll face another problem: your user on the nodes
will not have a Kerberos session. This means you are back to square one.
If you want to run jobs against the cluster as the correct user, you
should use the actions Oozie provides for that purpose (MR, Pig, Sqoop,
Hive, distcp, etc.).
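
As a rough sketch of what that could look like for one of the Pig scripts
called from the shell script, here is a minimal workflow with a Pig action.
The workflow and node names, the script name (myscript.pig), the INPUT
parameter, the ${jobTracker}/${nameNode}/${inputDir} properties and the
schema version are illustrative assumptions, not taken from your setup:

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="pig-as-user-wf">
    <start to="pig-node"/>

    <!-- Pig action: Oozie submits the Pig job on behalf of the user who
         submitted the workflow, instead of the task's Unix user. -->
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Hypothetical script, expected in the workflow app directory -->
            <script>myscript.pig</script>
            <!-- Hypothetical parameter passed to the Pig script -->
            <param>INPUT=${inputDir}</param>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Pig failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Each Pig script or MR job started by the shell script would become its own
action in the workflow, chained through the ok/error transitions.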


Thx


On Thu, Feb 21, 2013 at 7:44 AM, Clément MATHIEU <clem...@unportant.info> wrote:

> Hi all,
>
> I'm currently troubleshooting an issue involving tasks run under the
> default mapred account rather than the user account and would greatly
> appreciate any help or suggestions.
>
>
> The guilty action can be summarized as follows:
>   - The workflow contains a shell action
>   - The shell action starts a Bash shell script
>   - The shell script executes several Pig scripts and starts some map
> reduce tasks
>
> The issue is that the Pig scripts and MR tasks are submitted to the
> jobtracker and executed as the mapred Hadoop user rather than as the same
> user that is used to run the shell action.
>
>
> I believe that this issue is easily explained: Hadoop relies on the Unix
> username of the process which invoked submit and, if security is not
> enabled, Hadoop uses a shared mapred Unix user to execute the tasks.
> How can I configure Oozie, or at least my action, to avoid such an escape?
>
>
>
> PS: Using shell scripts to do Oozie's job can be seen as dumb, but the
> explanation is quite rational ;)
>
>
> Thanks,
>
> -- Clément
>



-- 
Alejandro
