Hi all,

I wanted to start a (hopefully short) discussion around the treatment of the
hadoop.job.ugi configuration in Hadoop 0.22 and beyond (as well as the
secure 0.20 branch). In the current security implementation, the following
incompatible changes have been made even for users who are sticking with
"simple" security.

1) Group resolution now happens on the server side, where it used to happen
on the client. Thus, all Hadoop users must exist on the NN/JT machines in
order for group mapping to succeed (or the user must write a custom group
mapper).
2) The hadoop.job.ugi parameter is ignored; instead the user has to use the
new UserGroupInformation.createRemoteUser("foo").doAs() API, even in simple
security.
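
For anyone who hasn't hit 2) yet, here is a minimal sketch of what client
code has to look like under the new API. It assumes the Hadoop client jars
are on the classpath; the class name DoAsExample, the user "foo", the group
"foogroup" and the path /user/foo are placeholders for illustration only.

```java
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsExample {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();

    // Old style, now ignored ("foo" and "foogroup" are placeholders):
    //   conf.set("hadoop.job.ugi", "foo,foogroup");

    // New style: name the user explicitly and run the client code in doAs().
    UserGroupInformation ugi = UserGroupInformation.createRemoteUser("foo");
    boolean exists = ugi.doAs(new PrivilegedExceptionAction<Boolean>() {
      @Override
      public Boolean run() throws Exception {
        // The FileSystem is obtained inside doAs() so that its RPC proxy
        // is created on behalf of "foo" rather than the OS-level user.
        FileSystem fs = FileSystem.get(conf);
        return fs.exists(new Path("/user/foo")); // placeholder path
      }
    });
    System.out.println("/user/foo exists: " + exists);
  }
}
```

The point is that every call site that used to rely on the configuration
value now needs its own wrapper like this one.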

I'm curious whether the general user community feels these are acceptable
breaking changes. The potential solutions I can see are:

For 1) Add a configuration like hadoop.security.simple.groupmappinglocation
-> "client" or "server". If it's set to "client", group mapping would happen
on the client side, as it did in prior versions.
For 2) If security is "simple", we can have the FileSystem and JobClient
constructors check for this parameter. If it's set, and there is no Subject
object associated with the current AccessControlContext, wrap the creation
of the RPC proxy with the correct doAs() call.
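
To make the two proposals concrete, here is a rough sketch of the decision
logic only, written against a plain Map instead of Hadoop's Configuration so
that it runs on its own. Everything in it is an assumption for illustration:
the class and method names are made up, the "server" default for the
proposed groupmappinglocation key is a guess, and hadoop.job.ugi is treated
as a comma-separated "user,group1,group2" string as in earlier releases.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimpleSecurityCompat {

  /** User and groups parsed from a legacy hadoop.job.ugi value. */
  static final class LegacyUgi {
    final String user;
    final List<String> groups;

    LegacyUgi(String user, List<String> groups) {
      this.user = user;
      this.groups = Collections.unmodifiableList(groups);
    }
  }

  /** Treats a missing hadoop.security.authentication as "simple". */
  static boolean isSimpleSecurity(Map<String, String> conf) {
    String auth = conf.get("hadoop.security.authentication");
    return auth == null || auth.trim().equalsIgnoreCase("simple");
  }

  /** Parses "user,group1,group2"; returns null if no user is present. */
  static LegacyUgi parseJobUgi(String value) {
    if (value == null) {
      return null;
    }
    String[] parts = value.split(",");
    if (parts.length == 0 || parts[0].trim().isEmpty()) {
      return null;
    }
    List<String> groups = new ArrayList<String>();
    for (int i = 1; i < parts.length; i++) {
      String group = parts[i].trim();
      if (!group.isEmpty()) {
        groups.add(group);
      }
    }
    return new LegacyUgi(parts[0].trim(), groups);
  }

  /** Proposal 1: map groups on the client only if explicitly requested. */
  static boolean resolveGroupsOnClient(Map<String, String> conf) {
    String location = conf.get("hadoop.security.simple.groupmappinglocation");
    return isSimpleSecurity(conf) && "client".equalsIgnoreCase(location);
  }

  /**
   * Proposal 2: wrap RPC proxy creation in doAs() only when security is
   * simple, hadoop.job.ugi is set, and the caller has no Subject already.
   */
  static boolean shouldWrapInDoAs(Map<String, String> conf,
                                  boolean subjectPresent) {
    return isSimpleSecurity(conf)
        && !subjectPresent
        && parseJobUgi(conf.get("hadoop.job.ugi")) != null;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<String, String>();
    conf.put("hadoop.job.ugi", "foo,staff,dev");
    conf.put("hadoop.security.simple.groupmappinglocation", "client");

    System.out.println("wrap in doAs: " + shouldWrapInDoAs(conf, false));
    System.out.println("client-side groups: " + resolveGroupsOnClient(conf));
  }
}
```

In a real patch, subjectPresent would come from checking the current
AccessControlContext, and the parsed user would be passed to
createRemoteUser() before the proxy is created.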

Although security is obviously an absolute necessity for many organizations,
I know of a lot of people who have small clusters and small teams who don't
have any plans to deploy it. For these people, I imagine the above
backward-compatibility layer may be very helpful as they adopt the next
releases of Hadoop. If we don't want to support these options going forward,
we can of course emit deprecation warnings when they are in effect and
remove the compatibility layer in the next major release.

Any thoughts here? Do people often make use of the hadoop.job.ugi variable
to such an extent that this breaking change would block your organization
from upgrading?

Thanks
-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera
