[ 
https://issues.apache.org/jira/browse/TAJO-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026680#comment-14026680
 ] 

Min Zhou commented on TAJO-858:
-------------------------------

Hi Prafulla,

Thank you for the explanation. It makes more sense to me now.

It seems the original token comes from TajoMaster, right? And TajoWorker gets 
that token from TajoMaster through an RPC call, right? Unfortunately, the 
current RPC implementation in Tajo lacks a SASL authentication mechanism. In 
other words, it's not safe. Anyone who can talk to TajoMaster through that 
RPC can impersonate a TajoWorker and get those delegation tokens from 
TajoMaster without any identification. Do you agree?

Since you don't want to differentiate between multiple Tajo users, I think we 
can design Tajo's security the way HBase does.

1. For standalone mode, each TajoWorker is a long-running daemon, and together 
these daemons make up the whole Tajo cluster. How about keeping keytab files 
such as tajo.keytab in a directory on each machine, where only the OS user 
named "tajo" has access to them? This design isolates the keytab from other 
users: no one except tajo can touch that file. Furthermore, it avoids passing 
tokens through an unsafe RPC. This is similar to what HBase does.

2. For YARN mode, YARN can download a credentials file for each container and 
set an environment variable named "HADOOP_TOKEN_FILE_LOCATION" that points to 
that file. UserGroupInformation.getCurrentUser() can then automatically load 
those credentials according to that variable. See:
{noformat}
/**
   * Log in a user using the given subject
   * @param subject the subject to use when logging in a user, or null to 
   * create a new subject.
   * @throws IOException if login fails
   */
  @InterfaceAudience.Public
  @InterfaceStability.Evolving
  public synchronized 
  static void loginUserFromSubject(Subject subject) throws IOException {
   ...
      String fileLocation = System.getenv(HADOOP_TOKEN_FILE_LOCATION);
      if (fileLocation != null) {
        // Load the token storage file and put all of the tokens into the
        // user. Don't use the FileSystem API for reading since it has a lock
        // cycle (HADOOP-9212).
        Credentials cred = Credentials.readTokenStorageFile(
            new File(fileLocation), conf);
        loginUser.addCredentials(cred);
      }
  ...
}
{noformat}
This is the reason why my patch worked. This mode also avoids passing tokens 
through an unsafe RPC call.
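
To check whether a container actually received a credentials file, a small 
diagnostic like the one below may help. It is only a sketch: the class name 
TokenFileLocator and its helper method are hypothetical and not part of Tajo 
or Hadoop. It mirrors the null check in the snippet above and uses only the 
JDK, so it does not read or parse the tokens themselves.

```java
import java.io.File;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Tajo or Hadoop.
class TokenFileLocator {

    // Name of the environment variable mentioned above.
    static final String HADOOP_TOKEN_FILE_LOCATION = "HADOOP_TOKEN_FILE_LOCATION";

    // Returns the credentials file named by the environment, if any.
    // An empty value is treated as unset here; that is an extra guard
    // in this sketch and is not taken from the Hadoop snippet.
    static Optional<File> tokenFile(Map<String, String> env) {
        String location = env.get(HADOOP_TOKEN_FILE_LOCATION);
        if (location == null || location.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new File(location));
    }

    public static void main(String[] args) {
        Optional<File> file = tokenFile(System.getenv());
        if (file.isPresent()) {
            System.out.println("Token file: " + file.get()
                + " (readable: " + file.get().canRead() + ")");
        } else {
            System.out.println(HADOOP_TOKEN_FILE_LOCATION + " is not set");
        }
    }
}
```

Run inside a worker container, this prints the path YARN handed over, or 
reports that the variable is missing, which would suggest that the process 
was not launched with a credentials file.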

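For the standalone-mode proposal in item 1, the "only tajo can touch the 
keytab" rule could be verified at worker startup with a check along these 
lines. This is a minimal sketch under assumptions: the class name 
KeytabPermissionCheck, the default path /etc/tajo/tajo.keytab, and the choice 
of owner read/write as the allowed permissions are all illustrative, and the 
check only works on file systems that expose POSIX permissions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not part of Tajo or Hadoop.
class KeytabPermissionCheck {

    // Permissions this sketch accepts for a keytab: owner read/write only.
    private static final Set<PosixFilePermission> OWNER_ONLY =
        EnumSet.of(PosixFilePermission.OWNER_READ, PosixFilePermission.OWNER_WRITE);

    // True if no permission beyond owner read/write is granted.
    static boolean isOwnerOnly(Set<PosixFilePermission> perms) {
        return OWNER_ONLY.containsAll(perms);
    }

    // True if the file is owned by expectedOwner and closed to everyone else.
    // Throws UnsupportedOperationException on non-POSIX file systems.
    static boolean isPrivateTo(Path keytab, String expectedOwner) throws IOException {
        String owner = Files.getOwner(keytab).getName();
        return owner.equals(expectedOwner)
            && isOwnerOnly(Files.getPosixFilePermissions(keytab));
    }

    public static void main(String[] args) throws IOException {
        // The default path is an assumption made for illustration only.
        Path keytab = Paths.get(args.length > 0 ? args[0] : "/etc/tajo/tajo.keytab");
        System.out.println(keytab + " private to tajo: " + isPrivateTo(keytab, "tajo"));
    }
}
```

A worker could refuse to start when the check fails, so that a keytab 
readable by other users is noticed before it is used to log in.
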
References
http://hortonworks.com/wp-content/uploads/2011/10/security-design_withCover-1.pdf
http://hbase.apache.org/book/security.html







> Support for hadoop kerberos authentication in Tajo
> --------------------------------------------------
>
>                 Key: TAJO-858
>                 URL: https://issues.apache.org/jira/browse/TAJO-858
>             Project: Tajo
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Min Zhou
>            Assignee: Prafulla T
>         Attachments: TAJO-858.patch
>
>
> The Hadoop cluster is configured to use Kerberos as its authentication 
> mechanism. The exception is listed below. It seems that when opening an HDFS 
> file, Tajo can't read the security-related config items from core-site.xml, 
> so it still uses SIMPLE authentication.
> {noformat}
> 2014-05-29 01:00:40,269 WARN  security.UserGroupInformation 
> (UserGroupInformation.java:doAs(1551)) - PriviledgedActionException as:mzhou 
> (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client 
> cannot authenticate via:[TOKEN, KERBEROS]
> 2014-05-29 01:00:40,270 WARN  ipc.Client (Client.java:run(669)) - 
> Exception encountered while connecting to the server : 
> org.apache.hadoop.security.AccessControlException: Client cannot 
> authenticate via:[TOKEN, KERBEROS]
> 2014-05-29 01:00:40,270 WARN  security.UserGroupInformation 
> (UserGroupInformation.java:doAs(1551)) - PriviledgedActionException as:mzhou 
> (auth:SIMPLE) cause:java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]
> 2014-05-29 01:00:40,278 ERROR worker.Task (Task.java:run(393)) - 
> java.io.IOException: Failed on local exception: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]; Host Details : local host is: 
> "host4.grid.domain.com/172.20.1.34"; destination host is: 
> "host1.grid.domain.com":9000; 
>       at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>       at com.sun.proxy.$Proxy9.getBlockLocations(Unknown Source)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>       at com.sun.proxy.$Proxy9.getBlockLocations(Unknown Source)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:206)
>       at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1131)
>       at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1121)
>       at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1111)
>       at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:272)
>       at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:239)
>       at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:232)
>       at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1279)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:292)
>       at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:292)
>       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:765)
>       at org.apache.tajo.storage.CSVFile$CSVScanner.init(CSVFile.java:303)
>       at 
> org.apache.tajo.engine.planner.physical.SeqScanExec.initScanner(SeqScanExec.java:197)
>       at 
> org.apache.tajo.engine.planner.physical.SeqScanExec.init(SeqScanExec.java:179)
>       at 
> org.apache.tajo.engine.planner.physical.UnaryPhysicalExec.init(UnaryPhysicalExec.java:52)
>       at 
> org.apache.tajo.engine.planner.physical.UnaryPhysicalExec.init(UnaryPhysicalExec.java:52)
>       at 
> org.apache.tajo.engine.planner.physical.HashShuffleFileWriteExec.init(HashShuffleFileWriteExec.java:81)
>       at org.apache.tajo.worker.Task.run(Task.java:383)
>       at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:391)
>       at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]
>       at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:674)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>       at 
> org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:637)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:721)
>       at org.apache.hadoop.ipc.Client$Connection.access$2700(Client.java:367)
>       at org.apache.hadoop.ipc.Client.getConnection(Client.java:1458)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>       ... 32 more
> Caused by: org.apache.hadoop.security.AccessControlException: Client cannot 
> authenticate via:[TOKEN, KERBEROS]
>       at 
> org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:170)
>       at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:387)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:547)
>       at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)
>       at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:713)
>       at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:709)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:708)
>       ... 35 more
> 2014-05-29 01:00:40,278 INFO  worker.TaskAttemptContext 
> (TaskAttemptContext.java:setState(110)) - Query status of 
> ta_1401325188560_0001_000001_000000_00 is changed to TA_FAILED
> 2014-05-29 01:00:40,281 INFO  worker.Task (Task.java:run(447)) - Task Counter 
> - total:1, succeeded: 0, killed: 0, failed: 1
> 2014-05-29 01:00:40,282 INFO  worker.TaskRunner (TaskRunner.java:run(332)) - 
> Request GetTask: 
> eb_1401325188560_0001_000001,container_1401325188560_0001_01_000001
> 2014-05-29 01:00:40,305 INFO  worker.TaskRunner (TaskRunner.java:run(370)) - 
> Accumulated Received Task: 2
> 2014-05-29 01:00:40,305 INFO  worker.TaskRunner (TaskRunner.java:run(379)) - 
> Initializing: ta_1401325188560_0001_000001_000000_01
> 2014-05-29 01:00:40,316 INFO  worker.TaskAttemptContext 
> (TaskAttemptContext.java:setState(110)) - Query status of 
> ta_1401325188560_0001_000001_000000_01 is changed to TA_PENDING
> 2014-05-29 01:00:40,316 INFO  worker.Task (Task.java:<init>(188)) - 
> ==================================
> 2014-05-29 01:00:40,318 INFO  worker.Task (Task.java:<init>(189)) - * 
> Subquery ta_1401325188560_0001_000001_000000_01 is initialized
> 2014-05-29 01:00:40,318 INFO  worker.Task (Task.java:<init>(190)) - * 
> InterQuery: true, Use HASH_SHUFFLE shuffle
> 2014-05-29 01:00:40,318 INFO  worker.Task (Task.java:<init>(193)) - * 
> Fragments (num: 1)
> 2014-05-29 01:00:40,318 INFO  worker.Task (Task.java:<init>(194)) - * Fetches 
> (total:0) :
> 2014-05-29 01:00:40,318 INFO  worker.Task (Task.java:<init>(198)) - * Local 
> task dir: 
> file:/grid/d/tmp/yarn/usercache/mzhou/appcache/application_1400096295333_0092/container_1400096295333_0092_01_000004/${LOCAL_DIRS}/q_1401325188560_0001/output/1/0_1
> 2014-05-29 01:00:40,318 INFO  worker.Task (Task.java:<init>(203)) - 
> ==================================
> 2014-05-29 01:00:40,319 INFO  worker.TaskAttemptContext 
> (TaskAttemptContext.java:setState(110)) - Query status of 
> ta_1401325188560_0001_000001_000000_01 is changed to TA_RUNNING
> 2014-05-29 01:00:40,319 INFO  planner.PhysicalPlannerImpl 
> (PhysicalPlannerImpl.java:createInMemoryHashAggregation(901)) - The planner 
> chooses [Hash Aggregation]
> 2014-05-29 01:00:40,325 WARN  security.UserGroupInformation 
> (UserGroupInformation.java:doAs(1551)) - PriviledgedActionException as:mzhou 
> (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client 
> cannot authenticate via:[TOKEN, KERBEROS]
> 2014-05-29 01:00:40,326 WARN  ipc.Client (Client.java:run(669)) - Exception 
> encountered while connecting to the server : 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]
> 2014-05-29 01:00:40,326 WARN  security.UserGroupInformation 
> (UserGroupInformation.java:doAs(1551)) - PriviledgedActionException as:mzhou 
> (auth:SIMPLE) cause:java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]
> 2014-05-29 01:00:40,328 ERROR worker.Task (Task.java:run(393)) - 
> java.io.IOException: Failed on local exception: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]; Host Details : local host is: 
> "host4.grid.domain.com/172.20.1.34"; destination host is: 
> "host1.grid.domain.com":9000; 
>       at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>       at com.sun.proxy.$Proxy9.getBlockLocations(Unknown Source)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>       at com.sun.proxy.$Proxy9.getBlockLocations(Unknown Source)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:206)
>       at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1131)
>       at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1121)
>       at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1111)
>       at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:272)
>       at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:239)
>       at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:232)
>       at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1279)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:292)
>       at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:292)
>       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:765)
>       at org.apache.tajo.storage.CSVFile$CSVScanner.init(CSVFile.java:303)
>       at 
> org.apache.tajo.engine.planner.physical.SeqScanExec.initScanner(SeqScanExec.java:197)
>       at 
> org.apache.tajo.engine.planner.physical.SeqScanExec.init(SeqScanExec.java:179)
>       at 
> org.apache.tajo.engine.planner.physical.UnaryPhysicalExec.init(UnaryPhysicalExec.java:52)
>       at 
> org.apache.tajo.engine.planner.physical.UnaryPhysicalExec.init(UnaryPhysicalExec.java:52)
>       at 
> org.apache.tajo.engine.planner.physical.HashShuffleFileWriteExec.init(HashShuffleFileWriteExec.java:81)
>       at org.apache.tajo.worker.Task.run(Task.java:383)
>       at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:391)
>       at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]
>       at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:674)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>       at 
> org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:637)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:721)
>       at org.apache.hadoop.ipc.Client$Connection.access$2700(Client.java:367)
>       at org.apache.hadoop.ipc.Client.getConnection(Client.java:1458)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>       ... 32 more
> Caused by: org.apache.hadoop.security.AccessControlException: Client cannot 
> authenticate via:[TOKEN, KERBEROS]
>       at 
> org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:170)
>       at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:387)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:547)
>       at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)
>       at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:713)
>       at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:709)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:708)
>       ... 35 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
