I agree. But how do we use APPLICATION_PATH for this purpose since we need a Yes/No flag to specify new vs old behavior?
So we have to use a new setting for this (something like
USE_RUNTIME_USER_APPLICATION_PATH?)

On Fri, May 19, 2017 at 7:57 AM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

> I wouldn't necessarily consider the current behavior a bug and the default
> is fine the way it is today, especially because the user launching the app
> is not the user. APPLICATION_PATH can be used as the setting.
>
> On Fri, May 19, 2017 at 7:43 AM, Vlad Rozov <v.ro...@datatorrent.com>
> wrote:
>
> > Do I understand correctly that the question is regarding
> > DAGContext.APPLICATION_PATH attribute value in case it is not defined? In
> > this case, I would treat the current behavior as a bug and +1 the proposal
> > to set it to the impersonated user B DFS home directory. As
> > APPLICATION_PATH can be explicitly set I do not see a need to provide
> > another setting to preserve the current behavior.
> >
> > Thank you,
> >
> > Vlad
> >
> > On 5/18/17 15:46, Pramod Immaneni wrote:
> >
> >> Sorry typo in sentence "as we are not asking for permissions for a lower
> >> privilege", please read as "as we are now asking for permissions for a
> >> lower privilege".
> >>
> >> On Thu, May 18, 2017 at 3:44 PM, Pramod Immaneni <pra...@datatorrent.com>
> >> wrote:
> >>
> >>> Apex cli supports impersonation in secure mode. With impersonation, the
> >>> user running the cli or the user authenticating with hadoop (henceforth
> >>> referred to as login user) can be different from the effective user with
> >>> which the actions are performed under hadoop. An example for this is an
> >>> application can be launched by user A to run in hadoop as user B. This is
> >>> kind of like the sudo functionality in unix. You can find more details
> >>> about the functionality here
> >>> https://apex.apache.org/docs/apex/security/ in the Impersonation section.
> >>>
> >>> What happens today with launching an application with impersonation,
> >>> using the above launch example, is that even though the application runs
> >>> as user B it still uses user A's hdfs path for the application path. The
> >>> application path is where the artifacts necessary to run the application
> >>> are stored and where the runtime files like checkpoints are stored. This
> >>> means that user B needs to have read and write access to user A's
> >>> application path folders.
> >>>
> >>> This may not be allowed in certain environments as it may be a policy
> >>> violation for the following reason. Because user A is able to impersonate
> >>> as user B to launch the application, A is considered to be a higher
> >>> privileged user than B and is given necessary privileges in hadoop to do
> >>> so. But after launch B needs to access folders belonging to A which could
> >>> constitute a violation as we are not asking for permissions for a lower
> >>> privilege user to access resources of a higher privilege user.
> >>>
> >>> I would like to propose adding a configuration setting, which when set
> >>> will use the application path in the impersonated user's home directory
> >>> (user B) as opposed to impersonating user's home directory (user A). If
> >>> this setting is not specified then the behavior can default to what it is
> >>> today for backwards compatibility.
> >>>
> >>> Comments, suggestions, concerns?
> >>>
> >>> Thanks
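
To make the options discussed in this thread concrete, here is a rough sketch of how the application path could be resolved. It is only an illustration, not Apex code: the class and method names are hypothetical, the boolean parameter stands in for the proposed flag (called USE_RUNTIME_USER_APPLICATION_PATH above, which does not exist today), the "/user/<name>" home directory layout is an assumption, and the "apps" subfolder is a placeholder.

```java
public class ApplicationPathSketch {

    // Assumption: DFS home directories follow the "/user/<name>" layout.
    static String homeDir(String user) {
        return "/user/" + user;
    }

    /**
     * Hypothetical resolution order:
     * 1. an explicitly configured APPLICATION_PATH always wins;
     * 2. otherwise, if the proposed flag is set, use the impersonated
     *    (runtime) user's home directory;
     * 3. otherwise keep today's behavior and use the login user's home.
     */
    static String resolve(String explicitPath, boolean useRuntimeUserPath,
                          String loginUser, String runtimeUser) {
        if (explicitPath != null && !explicitPath.isEmpty()) {
            return explicitPath;
        }
        String owner = useRuntimeUserPath ? runtimeUser : loginUser;
        // "apps" is a placeholder subfolder, not the actual Apex layout.
        return homeDir(owner) + "/apps";
    }

    public static void main(String[] args) {
        // User A launches the application to run as user B.
        System.out.println(resolve(null, false, "A", "B"));        // /user/A/apps
        System.out.println(resolve(null, true, "A", "B"));         // /user/B/apps
        System.out.println(resolve("/custom/path", true, "A", "B")); // /custom/path
    }
}
```

With a resolution order like this, an explicit APPLICATION_PATH keeps working as an override, while the flag only changes which home directory is used when nothing is set.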