We are planning to enable secure Hadoop using Kerberos. Our users reside in Active Directory. We have read that there are two options for using Kerberos to secure Hadoop:
1) Run a Kerberos KDC on a machine local to the cluster and create the service principals there.
2) Use Active Directory itself as the Kerberos KDC and create the service principals in Active Directory as well.

It seems Cloudera, and the industry in general, recommend option 1: running a local KDC for authenticating service principals.
https://ccp.cloudera.com/display/CDHDOC/Integrating+Hadoop+Security+with+Active+Directory

I read that the TaskTrackers run tasks as the user who submitted the job. In that case, don't the TaskTracker nodes need to talk to Active Directory to get user details such as the gid? Does this mean that every node (TaskTrackers, JobTracker and NameNode) will be interacting with Active Directory anyway?

If so, option 1 doesn't seem to be superior, since each node has to talk to two KDCs: the local Kerberos KDC for authenticating service principals, and Active Directory for user details and group information.

Please correct me if I am wrong in my assumptions.

Thanks and Regards,
BBG