Hari Sekhon created HDFS-7488:
---------------------------------

             Summary: HDFS Windows CIFS Gateway
                 Key: HDFS-7488
                 URL: https://issues.apache.org/jira/browse/HDFS-7488
             Project: Hadoop HDFS
          Issue Type: New Feature
    Affects Versions: 2.4.0
         Environment: HDP 2.1
            Reporter: Hari Sekhon


Stakeholders are pressing us for native Windows file share access to our Hadoop 
clusters.

I've used the NFS gateway several times, and while it's theoretically viable for 
users now that UID mapping is implemented in 2.5, insecure NFS makes the security 
of our fully Kerberized clusters pointless.

We really need CIFS gateway access to enforce authentication, which NFSv3 
doesn't (NFSv4?).

I've even tried Samba over an NFS gateway loopback mount point (don't laugh - they 
want it that badly), and set the HDFS atime precision to an hour to prevent 
FSNamesystem.setTimes() Java exceptions in the gateway logs, but the NFS server 
still doesn't like the Windows CIFS client's actions (a sketch of the setup follows 
the log below):

{code}2014-12-08 16:31:38,053 ERROR nfs3.RpcProgramNfs3 
(RpcProgramNfs3.java:setattr(346)) - Setting file size is not supported when 
setattr, fileId: 25597
2014-12-08 16:31:38,065 INFO  nfs3.WriteManager 
(WriteManager.java:handleWrite(136)) - No opened stream for fileId:25597
2014-12-08 16:31:38,122 INFO  nfs3.OpenFileCtx 
(OpenFileCtx.java:receivedNewWriteInternal(624)) - Have to change stable write 
to unstable write:FILE_SYNC
{code}
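
For context, the setup is roughly the following - a minimal sketch only, assuming 
the NFS gateway runs on the Samba host, a loopback mount at /hdfs_nfs and a Samba 
share named [hdfs]; hostnames, paths, drive letter and share name are illustrative:

{code}
# 1) Loopback-mount the HDFS NFS gateway on the Samba host
#    (mount options as suggested in the Hadoop NFS gateway docs)
mount -t nfs -o vers=3,proto=tcp,nolock localhost:/ /hdfs_nfs

# 2) /etc/samba/smb.conf - minimal share re-exporting that mount to Windows
[hdfs]
   path = /hdfs_nfs
   browseable = yes
   read only = no
   guest ok = no

# 3) Windows client - map the share and copy a file into HDFS /tmp
#    net use Z: \\sambahost\hdfs
#    copy somefile.txt Z:\tmp\
{code}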

A debug trace of the Samba server shows it trying to set metadata timestamps, 
which hangs indefinitely and results in a zero-byte file when copying a file into 
HDFS /tmp via the Windows mapped drive:

{code}
...
 smb_set_file_time: setting utimes to modified values.
file_ntime: actime: Thu Jan  1 01:00:00 1970
file_ntime: modtime: Mon Dec  8 16:31:38 2014
file_ntime: ctime: Thu Jan  1 01:00:00 1970
file_ntime: createtime: Thu Jan  1 01:00:00 1970
{code}

This is the traceback from the NFS gateway log when the HDFS atime precision was set to 0:

{code}org.apache.hadoop.ipc.RemoteException(java.io.IOException): Access time 
for hdfs is not configured.  Please set dfs.namenode.accesstime.precision 
configuration parameter.
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setTimes(FSNamesystem.java:1960)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setTimes(NameNodeRpcServer.java:950)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setTimes(ClientNamenodeProtocolServerSideTranslatorPB.java:833)
        at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
...
{code}
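
For completeness, this is the property the exception refers to - a minimal 
hdfs-site.xml sketch restoring the Hadoop default of one hour (3600000 ms) rather 
than 0, which is what stopped the setTimes() exceptions here:

{code}
<property>
  <name>dfs.namenode.accesstime.precision</name>
  <value>3600000</value>
  <!-- 1 hour (the Hadoop default); 0 disables access times entirely,
       which triggers the setTimes() exception above via the NFS gateway -->
</property>
{code}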

Regards,

Hari Sekhon
http://www.linkedin.com/in/harisekhon


