The cluster KDC should be set up to trust the Active Directory KDC (a cross-realm trust, in Kerberos terms). This covers user authentication when a user talks directly to a server in the cluster (e.g., user->namenode). The GID and other user attributes are usually stored in LDAP, and the cluster nodes are set up to talk to a cluster-specific LDAP server.
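
To make the split concrete, here is a rough Python sketch of the two separate lookups: authentication accepts principals from any trusted realm, while GID and group information come from the cluster's own directory. The realm names, the user "alice", and the in-memory dictionary standing in for the LDAP server are all made up for illustration. This is not Hadoop's actual implementation.

```python
# Hypothetical realm names, for illustration only.
CLUSTER_REALM = "CLUSTER.EXAMPLE.COM"
AD_REALM = "AD.EXAMPLE.COM"

# Realms the cluster accepts tickets from: its own realm, plus the
# Active Directory realm through the cross-realm trust.
TRUSTED_REALMS = {CLUSTER_REALM, AD_REALM}

# Stand-in for the cluster-specific LDAP server: short name -> (gid, groups).
CLUSTER_LDAP = {
    "alice": (5001, ["analysts", "hadoop-users"]),
}


def short_name(principal):
    """Map a principal of the form primary[/instance]@REALM to a short user name."""
    name, sep, realm = principal.rpartition("@")
    if not sep or not name or realm not in TRUSTED_REALMS:
        raise ValueError("untrusted or malformed principal: %r" % principal)
    # Drop the optional instance part, e.g. "nn/host1" -> "nn".
    return name.split("/", 1)[0]


def user_attributes(principal):
    """Resolve GID and groups from the cluster directory, not from any KDC."""
    user = short_name(principal)
    if user not in CLUSTER_LDAP:
        raise KeyError("no directory entry for user %r" % user)
    return CLUSTER_LDAP[user]
```

With these assumptions, user_attributes("alice@AD.EXAMPLE.COM") is answered entirely from the cluster-side directory, so the worker nodes never need to contact Active Directory for group information.
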

On Sep 30, 2011, at 7:19 PM, bigbibguy father wrote:

> We are planning to enable secure Hadoop using Kerberos.
>
> Our users reside in Active Directory. We read that there are two options
> to use Kerberos for securing Hadoop.
>
> 1) You run Kerberos on a machine local to the cluster and create service
> principals there
> 2) Use Active Directory itself as the Kerberos KDC and create service
> principals also in Active Directory.
>
> It seems Cloudera and the industry in general recommend option 1 of running a
> local KDC for authenticating service principals.
> https://ccp.cloudera.com/display/CDHDOC/Integrating+Hadoop+Security+with+Active+Directory
>
> I read that the tasktrackers run tasks as the user who submitted the job.
> In that case, don't the TaskTracker nodes need to talk to Active
> Directory to get user details like the GID etc.?
>
> So does this mean that every node (tasktrackers, job tracker and namenode)
> will be interacting with Active Directory anyway?
>
> If so, option 1 doesn't seem to be superior, since each node has to talk to
> two KDCs: the local Kerberos for authenticating service principals, and Active
> Directory to get the user details and group information.
>
> Please correct me if I am wrong in my assumptions.
>
> Thanks and Regards,
>
> BBG