[jira] [Created] (HDFS-7989) NFS gateway should shut down when it can't bind the service ports
Brandon Li created HDFS-7989: Summary: NFS gateway should shut down when it can't bind the service ports Key: HDFS-7989 URL: https://issues.apache.org/jira/browse/HDFS-7989 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Unlike Portmap, the Nfs3 class does not shut down even when the service can't start. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
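The fail-fast behavior the issue asks for can be sketched as follows. This is an illustration only, not the actual Nfs3 startup code; the class and method names are hypothetical. If a service port can't be bound, the daemon should surface the error and terminate rather than keep running half-started:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch: bind a service port, or fail so the caller can
// shut the daemon down instead of leaving it half-started.
public class BindOrDie {
  public static ServerSocket bindOrFail(int port) throws IOException {
    ServerSocket s = new ServerSocket();
    s.bind(new InetSocketAddress(port));  // throws BindException if the port is taken
    return s;
  }

  // A daemon startup path would then do something like:
  //   try { bindOrFail(port); } catch (IOException e) { System.exit(1); }
}
```

The point is that the bind failure propagates to the startup path instead of being swallowed.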
[jira] [Created] (HDFS-7976) Update NFS user guide for mount option sync to minimize or avoid reordered writes
Brandon Li created HDFS-7976: Summary: Update NFS user guide for mount option sync to minimize or avoid reordered writes Key: HDFS-7976 URL: https://issues.apache.org/jira/browse/HDFS-7976 Project: Hadoop HDFS Issue Type: Improvement Components: documentation, nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li The mount option sync is critical: I observed that this mount option can minimize or avoid reordered writes. The sync option can have some negative performance impact on file uploading. However, it makes the performance much more predictable and can also reduce the possibility of failures caused by file dumping. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7925) truncate RPC should not be considered idempotent
Brandon Li created HDFS-7925: Summary: truncate RPC should not be considered idempotent Key: HDFS-7925 URL: https://issues.apache.org/jira/browse/HDFS-7925 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Brandon Li Currently truncate is considered an idempotent call in ClientProtocol. However, the retried RPC request could get a lease error like the following: 2015-03-12 11:45:47,320 INFO ipc.Server (Server.java:run(2053)) - IPC Server handler 6 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.truncate from 192.168.76.4:49763 Call#1 Retry#1: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to TRUNCATE_FILE /user/hrt_qa/testFileTr for DFSClient_NONMAPREDUCE_171671673_1 on 192.168.76.4 because DFSClient_NONMAPREDUCE_171671673_1 is already the current lease holder. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
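One common remedy for this class of bug, sketched below in toy form (in the spirit of the NameNode's retry cache, but not its actual code), is to record the first response of a non-idempotent call and replay it on retry instead of re-executing the operation and hitting errors like the lease conflict above:

```java
import java.util.HashMap;
import java.util.Map;

// Toy retry cache: a non-idempotent RPC keyed by call id returns the
// recorded first response on retry instead of re-running the operation.
// Illustrative sketch only.
public class RetryCacheSketch {
  private final Map<Long, String> cache = new HashMap<>();
  private int executions = 0;

  public String truncate(long callId, String path) {
    String prior = cache.get(callId);
    if (prior != null) {
      return prior;            // retry: replay the recorded result
    }
    executions++;              // first attempt: actually run the operation
    String result = "truncated:" + path;
    cache.put(callId, result);
    return result;
  }

  public int executions() { return executions; }
}
```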
[jira] [Resolved] (HDFS-6446) NFS: Different error messages for appending/writing data from read only mount
[ https://issues.apache.org/jira/browse/HDFS-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-6446. -- Resolution: Duplicate NFS: Different error messages for appending/writing data from read only mount - Key: HDFS-6446 URL: https://issues.apache.org/jira/browse/HDFS-6446 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Yesha Vora Assignee: Brandon Li steps: 1) set dfs.nfs.exports.allowed.hosts = nfs_client ro 2) Restart the nfs server 3) Append data to a file on hdfs from the read-only mount point {noformat} bash$ cat /tmp/tmp_10MB.txt >> /tmp/tmp_mnt/expected_data_stream cat: write error: Input/output error {noformat} 4) Write data from the read-only mount point {noformat} bash$ cp /tmp/tmp_10MB.txt /tmp/tmp_mnt/tmp/ cp: cannot create regular file `/tmp/tmp_mnt/tmp/tmp_10MB.txt': Permission denied {noformat} The two operations are treated differently: copying data returns a valid error message ('Permission denied'), while appending does not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-6445) NFS: Add a log message 'Permission denied' while writing data from read only mountpoint
[ https://issues.apache.org/jira/browse/HDFS-6445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-6445. -- Resolution: Duplicate NFS: Add a log message 'Permission denied' while writing data from read only mountpoint --- Key: HDFS-6445 URL: https://issues.apache.org/jira/browse/HDFS-6445 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Yesha Vora Assignee: Brandon Li Add a log message in the NFS log file when a write operation is performed on a read-only mount point. steps: 1) set dfs.nfs.exports.allowed.hosts = nfsclient ro 2) Restart the nfs server 3) Append data to a file on hdfs {noformat} bash$ cat /tmp/tmp_10MB.txt >> /tmp/tmp_mnt/expected_data_stream cat: write error: Input/output error {noformat} The real reason for the append failure is permission denied; that should be printed in the nfs logs. Currently, the nfs log prints the messages below. {noformat} 2014-05-22 21:50:56,068 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:write(731)) - NFS WRITE fileId: 16493 offset: 7340032 length:1048576 stableHow:0 xid:1904385849 2014-05-22 21:50:56,076 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:handleInternal(1936)) - WRITE_RPC_CALL_START1921163065 2014-05-22 21:50:56,078 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:write(731)) - NFS WRITE fileId: 16493 offset: 8388608 length:1048576 stableHow:0 xid:1921163065 2014-05-22 21:50:56,086 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:handleInternal(1936)) - WRITE_RPC_CALL_START1937940281 2014-05-22 21:50:56,087 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:write(731)) - NFS WRITE fileId: 16493 offset: 9437184 length:1048576 stableHow:0 xid:1937940281 2014-05-22 21:50:56,091 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:handleInternal(1936)) - WRITE_RPC_CALL_START1954717497 2014-05-22 21:50:56,091 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:write(731)) - NFS WRITE fileId: 16493 offset: 10485760 length:168 stableHow:0 xid:1954717497 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7640) print NFS Client in the NFS log
Brandon Li created HDFS-7640: Summary: print NFS Client in the NFS log Key: HDFS-7640 URL: https://issues.apache.org/jira/browse/HDFS-7640 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Priority: Trivial Currently the hdfs-nfs logs do not have any information about nfs clients. When multiple clients are using nfs, it becomes hard to distinguish which request came from which client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7578) NFS WRITE and COMMIT responses should always use the channel pipeline
Brandon Li created HDFS-7578: Summary: NFS WRITE and COMMIT responses should always use the channel pipeline Key: HDFS-7578 URL: https://issues.apache.org/jira/browse/HDFS-7578 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-7578.001.patch Write and Commit responses directly write data to the channel instead of pushing it through the processing pipeline. This could cause the NFS handler thread to be blocked waiting for the response to be flushed to the network before it can return to serve a different request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7516) Fix findbugs warnings in hdfs-nfs project
Brandon Li created HDFS-7516: Summary: Fix findbugs warnings in hdfs-nfs project Key: HDFS-7516 URL: https://issues.apache.org/jira/browse/HDFS-7516 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li Assignee: Brandon Li -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7502) Fix findbugs warning in hdfs-nfs project
Brandon Li created HDFS-7502: Summary: Fix findbugs warning in hdfs-nfs project Key: HDFS-7502 URL: https://issues.apache.org/jira/browse/HDFS-7502 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li Assignee: Brandon Li -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write
Brandon Li created HDFS-7387: Summary: NFS may only do partial commit due to a race between COMMIT and write Key: HDFS-7387 URL: https://issues.apache.org/jira/browse/HDFS-7387 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li Assignee: Brandon Li Priority: Critical The requested range may not be committed when the following happens: 1. the last pending write is removed from the queue to be written to hdfs 2. a commit request arrives; NFS sees there is no pending write, so it does a sync 3. this sync request could flush only part of the last write to hdfs 4. if a file read happens immediately after the above steps, the user may not see all the data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
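One way to reason about a fix, as a hedged sketch (the enum and method below are illustrative, not the real OpenFileCtx code): the COMMIT decision should compare the durably flushed offset against the offset the COMMIT request covers, rather than relying only on whether the pending-write queue happens to be empty at that instant:

```java
// Sketch of an offset-based COMMIT decision. Status names and the
// method signature are hypothetical, modeled loosely on the idea of
// the NFS gateway's commit handling, not copied from it.
public class CommitCheck {
  public enum Status { COMMIT_DO_SYNC, COMMIT_WAIT, COMMIT_FINISHED }

  public static Status check(long commitOffset, long flushedOffset,
                             boolean hasPendingWrites) {
    if (flushedOffset >= commitOffset) {
      return Status.COMMIT_FINISHED;  // already durable up to the requested range
    }
    // Data in the range is still in flight: either queued (wait for it)
    // or partially flushed (force a sync and re-check the offset).
    return hasPendingWrites ? Status.COMMIT_WAIT : Status.COMMIT_DO_SYNC;
  }
}
```

Checking the flushed offset again after the sync closes the window in step 3 above, where a sync flushes only part of the last write.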
[jira] [Created] (HDFS-7259) Unresponsive NFS mount point due to deferred COMMIT response
Brandon Li created HDFS-7259: Summary: Unresponsive NFS mount point due to deferred COMMIT response Key: HDFS-7259 URL: https://issues.apache.org/jira/browse/HDFS-7259 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Since the gateway can't commit random writes, it caches the COMMIT requests in a queue and sends back a response only when the data can be committed or the stream times out (a failure in the latter case). This could cause problems in two patterns: (1) file uploading failure (2) the mount dir is stuck on the same client, but other NFS clients can still access the NFS gateway. Error pattern (2) occurs because there are too many COMMIT requests pending, so the NFS client can't send any other requests (e.g., for ls) to the NFS gateway within its pending-request limit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-7215) Add gc log to NFS gateway
[ https://issues.apache.org/jira/browse/HDFS-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li reopened HDFS-7215: -- Add gc log to NFS gateway - Key: HDFS-7215 URL: https://issues.apache.org/jira/browse/HDFS-7215 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li Assignee: Brandon Li Like NN/DN, a GC log would help debug issues in NFS gateway. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7215) Add gc log to NFS gateway
Brandon Li created HDFS-7215: Summary: Add gc log to NFS gateway Key: HDFS-7215 URL: https://issues.apache.org/jira/browse/HDFS-7215 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li Assignee: Brandon Li Like NN/DN, a GC log would help debug issues in NFS gateway. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7215) Add gc log to NFS gateway
[ https://issues.apache.org/jira/browse/HDFS-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-7215. -- Resolution: Invalid Add gc log to NFS gateway - Key: HDFS-7215 URL: https://issues.apache.org/jira/browse/HDFS-7215 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li Assignee: Brandon Li Like NN/DN, a GC log would help debug issues in NFS gateway. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7094) [ HDFS NFS ] TYPO in NFS configurations from documentation.
[ https://issues.apache.org/jira/browse/HDFS-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-7094. -- Resolution: Not a Problem [ HDFS NFS ] TYPO in NFS configurations from documentation. --- Key: HDFS-7094 URL: https://issues.apache.org/jira/browse/HDFS-7094 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Critical Fix For: 2.5.1 Config from Documentation ( https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html ): {noformat} <property> <name>nfs.keytab.file</name> <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the nfs gateway keytab --> </property> <property> <name>nfs.kerberos.principal</name> <value>nfsserver/_h...@your-realm.com</value> </property> {noformat} Config from Code: {code} public static final String DFS_NFS_KEYTAB_FILE_KEY = "dfs.nfs.keytab.file"; public static final String DFS_NFS_USER_NAME_KEY = "dfs.nfs.kerberos.principal"; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-6949) Add NFS-ACL protocol support
Brandon Li created HDFS-6949: Summary: Add NFS-ACL protocol support Key: HDFS-6949 URL: https://issues.apache.org/jira/browse/HDFS-6949 Project: Hadoop HDFS Issue Type: New Feature Components: nfs Reporter: Brandon Li This is the umbrella JIRA to track the effort of adding NFS ACL support. ACL support for NFSv3 is known as NFSACL. It is a separate out-of-band protocol (for NFSv3) to support ACL operations (GETACL and SETACL). There is no formal documentation or RFC on this protocol. The NFSACL program number is 100227 and the version is 3. The program listens on tcp port 38467. More references: http://lwn.net/Articles/120338/ http://cateee.net/lkddb/web-lkddb/NFS_V3_ACL.html -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6941) Add shell based end-to-end NFS test
Brandon Li created HDFS-6941: Summary: Add shell based end-to-end NFS test Key: HDFS-6941 URL: https://issues.apache.org/jira/browse/HDFS-6941 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li As [~jayunit100] pointed out in HDFS-5135, we can create a similar e2e NFS test to the apache bigtop HCFS fuse mount tests (BIGTOP-1221). Another example is: https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hadoop/src/main/groovy/org/apache/bigtop/itest/hadoop/hcfs/TestFuseHCFS.groovy -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6916) Like NameNode, NFS gateway should do name-id mapping with multiple sources
Brandon Li created HDFS-6916: Summary: Like NameNode, NFS gateway should do name-id mapping with multiple sources Key: HDFS-6916 URL: https://issues.apache.org/jira/browse/HDFS-6916 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li Like what's already done in the NameNode, NFS should do the name-id mapping in a similar way, e.g., with shell/ldap/composite mappings. The difference here is that the NN maps from user name to group lists, while NFS maps from name to id. Some problems have been found with the current name-id mapping: the LDAP server has lots of user accounts and it returns a limited number of entries for each search request. The current code (IdUserGroup) uses a shell command to retrieve user accounts. One shell command might not get the complete list, e.g., due to some limit set in the LDAP server. Even if it does, it's not necessary to cache all user accounts in memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
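For illustration, a shell-based source like the one IdUserGroup uses boils down to parsing passwd-style output into a name-to-id map; a pluggable design would hide sources such as shell, LDAP, or a composite of both behind one interface. The class below is a hypothetical sketch of just the parsing step, not the actual Hadoop code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: parse `getent passwd`-style output
// (name:passwd:uid:gid:gecos:home:shell) into a name -> uid map.
public class PasswdParser {
  public static Map<String, Integer> parse(String passwdOutput) {
    Map<String, Integer> nameToId = new HashMap<>();
    for (String line : passwdOutput.split("\n")) {
      String[] fields = line.split(":");
      if (fields.length >= 3) {
        nameToId.put(fields[0], Integer.parseInt(fields[2]));
      }
    }
    return nameToId;
  }
}
```

A composite mapping would consult several such sources in order and merge the results, which also avoids caching every account in memory at once.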
[jira] [Created] (HDFS-6890) NFS readdirplus doesn't return dotdot attributes
Brandon Li created HDFS-6890: Summary: NFS readdirplus doesn't return dotdot attributes Key: HDFS-6890 URL: https://issues.apache.org/jira/browse/HDFS-6890 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li In RpcProgramNfs3#readdirplus(): {noformat} entries[1] = new READDIRPLUS3Response.EntryPlus3(dotdotFileId, .., dotdotFileId, postOpDirAttr, new FileHandle(dotdotFileId)); {noformat} It should return the directory's parent attribute instead of postOpDirAttr. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6892) Add XDR packaging method for each NFS/Mount request
Brandon Li created HDFS-6892: Summary: Add XDR packaging method for each NFS/Mount request Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li The method can be used for unit tests. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6894) Add XDR parser method for each NFS/Mount response
Brandon Li created HDFS-6894: Summary: Add XDR parser method for each NFS/Mount response Key: HDFS-6894 URL: https://issues.apache.org/jira/browse/HDFS-6894 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li This can be an abstract method in NFS3Response to force the subclasses to implement it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6895) Add XDR parser method for each Mount response
Brandon Li created HDFS-6895: Summary: Add XDR parser method for each Mount response Key: HDFS-6895 URL: https://issues.apache.org/jira/browse/HDFS-6895 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6896) Add XDR packaging method for each Mount request
Brandon Li created HDFS-6896: Summary: Add XDR packaging method for each Mount request Key: HDFS-6896 URL: https://issues.apache.org/jira/browse/HDFS-6896 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6868) portmap and nfs3 are documented as hadoop commands instead of hdfs
[ https://issues.apache.org/jira/browse/HDFS-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-6868. -- Resolution: Fixed Hadoop Flags: Reviewed portmap and nfs3 are documented as hadoop commands instead of hdfs -- Key: HDFS-6868 URL: https://issues.apache.org/jira/browse/HDFS-6868 Project: Hadoop HDFS Issue Type: Bug Components: documentation, nfs Affects Versions: 3.0.0, 2.3.0, 2.4.0, 2.5.0, 2.4.1, 2.6.0 Reporter: Allen Wittenauer Assignee: Brandon Li Attachments: HDFS-6868.patch The NFS guide says to use 'hadoop portmap' and 'hadoop nfs3' even though these are deprecated options. Instead this should say 'hdfs portmap' and 'hdfs nfs3'. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6855) Add a different end-to-end non-manual NFS test to replace TestOutOfOrderWrite
Brandon Li created HDFS-6855: Summary: Add a different end-to-end non-manual NFS test to replace TestOutOfOrderWrite Key: HDFS-6855 URL: https://issues.apache.org/jira/browse/HDFS-6855 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li TestOutOfOrderWrite is an end-to-end test with a TCP client. However, it's a manual test, and out-of-order write is covered by the newly added test in HDFS-6850. This JIRA is to track the effort of adding a new end-to-end test with more test cases to replace TestOutOfOrderWrite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6717) Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config
[ https://issues.apache.org/jira/browse/HDFS-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-6717. -- Resolution: Fixed Hadoop Flags: Reviewed Jira HDFS-5804 breaks default nfs-gateway behavior for unsecured config --- Key: HDFS-6717 URL: https://issues.apache.org/jira/browse/HDFS-6717 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.4.0 Reporter: Jeff Hansen Assignee: Brandon Li Priority: Minor Attachments: HDFS-6717.001.patch, HdfsNfsGateway.html I believe this is just a matter of needing to update documentation. As a result of https://issues.apache.org/jira/browse/HDFS-5804, the secure and unsecure code paths appear to have been merged -- this is great because it means less code to test. However, it means that the default unsecure behavior requires additional configuration that needs to be documented. I'm not the first to have trouble following the instructions documented in http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html I kept hitting a RemoteException with the message that hdfs user cannot impersonate root -- apparently under the old code, there was no impersonation going on, so the nfs3 service could and should be run under the same user id that runs hadoop (I assumed this meant the user id hdfs). However, with the new unified code path, that would require hdfs to be able to impersonate root (because root is always the local user that mounts a drive). The comments in jira hdfs-5804 seem to indicate nobody has a problem with requiring the nfsserver user to impersonate root -- if that means it's necessary for the configuration to include root as a user nfsserver can impersonate, that should be included in the setup instructions. More to the point, it appears to be absolutely necessary now to provision a user named nfsserver in order to be able to give that nfsserver ability to impersonate other users. 
Alternatively I think we'd need to configure hdfs to be able to proxy other users. I'm not really sure what the best practice should be, but it should be documented since it wasn't needed in the past. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6732) Fix inconsistent description in NFS user guide doc for proxy user
Brandon Li created HDFS-6732: Summary: Fix inconsistent description in NFS user guide doc for proxy user Key: HDFS-6732 URL: https://issues.apache.org/jira/browse/HDFS-6732 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation, nfs Reporter: Brandon Li Assignee: Brandon Li -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6706) ZKFailoverController failed to recognize the quorum is not met
[ https://issues.apache.org/jira/browse/HDFS-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-6706. -- Resolution: Invalid ZKFailoverController failed to recognize the quorum is not met -- Key: HDFS-6706 URL: https://issues.apache.org/jira/browse/HDFS-6706 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li Assignee: Brandon Li Thanks Kenny Zhang for finding this problem. The zkfc cannot start up because the ha.zookeeper.quorum requirement is not met, and zkfc -format doesn't log the real problem. The user then sees the following error messages instead of the real issue when starting zkfc: 2014-07-01 17:08:17,528 FATAL ha.ZKFailoverController (ZKFailoverController.java:doRun(213)) - Unable to start failover controller. Parent znode does not exist. Run with -formatZK flag to initialize ZooKeeper. 2014-07-01 16:00:48,678 FATAL ha.ZKFailoverController (ZKFailoverController.java:fatalError(365)) - Fatal error occurred:Received create error from Zookeeper. code:NONODE for path /hadoop-ha/prodcluster/ActiveStandbyElectorLock 2014-07-01 17:24:44,202 - INFO ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@627 - Got user-level KeeperException when processing sessionid:0x346f36191250005 type:create cxid:0x2 zxid:0xf0033 txntype:-1 reqpath:n/a Error Path:/hadoop-ha/prodcluster/ActiveStandbyElectorLock Error:KeeperErrorCode = NodeExists for /hadoop-ha/prodcluster/ActiveStandbyElectorLock To reproduce the problem: 1. use an HDFS cluster with automatic HA enabled and set ha.zookeeper.quorum to 3. 2. start two zookeeper servers. 3. do hdfs zkfc -format, and then hdfs zkfc -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6706) ZKFailoverController failed to recognize the quorum is not met
Brandon Li created HDFS-6706: Summary: ZKFailoverController failed to recognize the quorum is not met Key: HDFS-6706 URL: https://issues.apache.org/jira/browse/HDFS-6706 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li Assignee: Brandon Li Thanks Kenny Zhang for finding this problem. The zkfc cannot start up because the ha.zookeeper.quorum requirement is not met, and zkfc -format doesn't log the real problem. The user then sees the following error messages instead of the real issue when starting zkfc. 2014-07-01 17:08:17,528 FATAL ha.ZKFailoverController (ZKFailoverController.java:doRun(213)) - Unable to start failover controller. Parent znode does not exist. Run with -formatZK flag to initialize ZooKeeper. 2014-07-01 16:00:48,678 FATAL ha.ZKFailoverController (ZKFailoverController.java:fatalError(365)) - Fatal error occurred:Received create error from Zookeeper. code:NONODE for path /hadoop-ha/prodcluster/ActiveStandbyElectorLock 2014-07-01 17:24:44,202 - INFO ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@627 - Got user-level KeeperException when processing sessionid:0x346f36191250005 type:create cxid:0x2 zxid:0xf0033 txntype:-1 reqpath:n/a Error Path:/hadoop-ha/prodcluster/ActiveStandbyElectorLock Error:KeeperErrorCode = NodeExists for /hadoop-ha/prodcluster/ActiveStandbyElectorLock To reproduce the problem: 1. use an HDFS cluster with automatic HA enabled and set ha.zookeeper.quorum to 3. 2. start two zookeeper servers. 3. do hdfs zkfc -format, and then hdfs zkfc -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6623) Provide a CLI command to refresh NFS user group list
Brandon Li created HDFS-6623: Summary: Provide a CLI command to refresh NFS user group list Key: HDFS-6623 URL: https://issues.apache.org/jira/browse/HDFS-6623 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li Currently, NFS updates the user/group id cache every 15 minutes by default. However, it's preferable to have an option to refresh the list without restarting the NFS gateway. This is especially useful in environments where user accounts change frequently. Otherwise, users can't access the HDFS mount immediately after their accounts are created in LDAP, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
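The proposed behavior can be modeled as a cache with both a timed refresh and a manual refresh hook that the CLI command would invoke. The class below is an illustrative sketch under that assumption, not the actual IdUserGroup code:

```java
// Hypothetical sketch: a cache that reloads on a timer, plus a
// forceRefresh() entry point an admin CLI command could call to pick
// up newly created accounts immediately.
public class RefreshableCache {
  private final long intervalMs;
  private long lastRefreshMs;
  private boolean loaded = false;
  private int refreshCount = 0;

  public RefreshableCache(long intervalMs) { this.intervalMs = intervalMs; }

  // Called on every lookup: reload only if never loaded or interval elapsed.
  public void maybeRefresh(long nowMs) {
    if (!loaded || nowMs - lastRefreshMs >= intervalMs) {
      doRefresh(nowMs);
    }
  }

  // What the proposed "refresh" CLI command would invoke directly.
  public void forceRefresh(long nowMs) { doRefresh(nowMs); }

  private void doRefresh(long nowMs) {
    loaded = true;
    lastRefreshMs = nowMs;
    refreshCount++;  // stand-in for re-reading accounts from LDAP/shell
  }

  public int refreshCount() { return refreshCount; }
}
```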
[jira] [Created] (HDFS-6569) OOB message can't be sent to the client when DataNode shuts down for upgrade
Brandon Li created HDFS-6569: Summary: OOB message can't be sent to the client when DataNode shuts down for upgrade Key: HDFS-6569 URL: https://issues.apache.org/jira/browse/HDFS-6569 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Brandon Li The socket is closed too early, before the OOB message can be sent to the client, which causes the write pipeline failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6498) Support squash and range in NFS static user id mapping
Brandon Li created HDFS-6498: Summary: Support squash and range in NFS static user id mapping Key: HDFS-6498 URL: https://issues.apache.org/jira/browse/HDFS-6498 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li HDFS-6435 adds static user group name id mapping. The mapping is a one to one mapping. What makes this feature easy to use is to support squash and range based mapping as in traditional NFS configuration (e.g., http://manpages.ubuntu.com/manpages/hardy/man5/exports.5.html) {noformat} # Mapping for client foobar: # remote local uid 0-99 - # squash these uid 100-500 1000 # map 100-500 to 1000-1500 gid 0-49 - # squash these gid 50-100 700 # map 50-100 to 700-750 {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
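The squash/range semantics in the example above can be sketched as a pure mapping function. The concrete ranges below follow the uid lines of the example (squash 0-99, map 100-500 starting at local id 1000), and 65534 is the traditional anonymous id; this is an illustrative sketch, not a proposed implementation:

```java
// Hypothetical sketch of range-based id mapping with squashing:
// ids in a squash range collapse to the anonymous id; ids in a mapped
// range shift to a new local base; everything else passes through.
public class RangeIdMap {
  public static final int ANON = 65534;  // conventional "nobody" id

  public static int mapUid(int uid) {
    if (uid >= 0 && uid <= 99) {
      return ANON;                 // squash these
    }
    if (uid >= 100 && uid <= 500) {
      return uid - 100 + 1000;     // map range 100-500 starting at 1000
    }
    return uid;                    // identity for everything else
  }
}
```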
[jira] [Resolved] (HDFS-6458) NFS: stale NFS file handle Error for previous mount point
[ https://issues.apache.org/jira/browse/HDFS-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-6458. -- Resolution: Duplicate NFS: stale NFS file handle Error for previous mount point - Key: HDFS-6458 URL: https://issues.apache.org/jira/browse/HDFS-6458 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Yesha Vora Assignee: Brandon Li Steps to reproduce: 1) Set dfs.nfs.exports.allowed.hosts = Gateway rw 2) mount nfs on Gateway 3) Set dfs.nfs.exports.allowed.hosts = Datanode rw 4) mount nfs on Datanode Then try to access the NFS mount point at the Gateway; it can no longer be accessed: {noformat} bash$ ls /tmp/tmp_mnt ls: cannot access /tmp/tmp_mnt: Stale NFS file handle {noformat} Expected: The mount point from the previous config should be accessible if it was not unmounted before the config change. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6459) Add function to refresh export table for NFS gateway
Brandon Li created HDFS-6459: Summary: Add function to refresh export table for NFS gateway Key: HDFS-6459 URL: https://issues.apache.org/jira/browse/HDFS-6459 Project: Hadoop HDFS Issue Type: New Feature Components: nfs Reporter: Brandon Li Currently the NFS gateway has to be restarted to refresh the export table configuration. This JIRA is to track the effort to provide a function to refresh the export table without rebooting the NFS gateway. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
Brandon Li created HDFS-6451: Summary: NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_PERM instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCache to avoid system clock bugs
Brandon Li created HDFS-6416: Summary: Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCache to avoid system clock bugs Key: HDFS-6416 URL: https://issues.apache.org/jira/browse/HDFS-6416 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li Priority: Minor As [~cnauroth] pointed out in HADOOP-10612, Time#monotonicNow is the preferred method to use since it isn't subject to system clock bugs (i.e., someone resets the clock to a time in the past, and then updates don't happen for a long time). -- This message was sent by Atlassian JIRA (v6.2#6252)
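A minimal sketch of what Time#monotonicNow provides: a System.nanoTime-based clock that is immune to wall-clock resets and meaningful only for measuring elapsed time, never as a calendar timestamp. The helper class below is illustrative, not Hadoop's actual Time class:

```java
// Sketch of a monotonic timestamp helper in the spirit of Hadoop's
// Time#monotonicNow: values have an arbitrary origin, so only
// differences between two readings are meaningful.
public class MonotonicTime {
  public static long monotonicNowMs() {
    return System.nanoTime() / 1_000_000L;  // nanoTime is monotonic
  }

  public static long elapsedMs(long startMs) {
    return monotonicNowMs() - startMs;
  }
}
```

Using this for stream timeouts means a clock set backwards can no longer stall the timeout logic indefinitely.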
[jira] [Created] (HDFS-6378) NFS: when portmap/rpcbind is not available, NFS registration should timeout instead of hanging
Brandon Li created HDFS-6378: Summary: NFS: when portmap/rpcbind is not available, NFS registration should timeout instead of hanging Key: HDFS-6378 URL: https://issues.apache.org/jira/browse/HDFS-6378 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6181) Fix property name error in NFS user guide
Brandon Li created HDFS-6181: Summary: Fix property name error in NFS user guide Key: HDFS-6181 URL: https://issues.apache.org/jira/browse/HDFS-6181 Project: Hadoop HDFS Issue Type: Bug Components: documentation, nfs Reporter: Brandon Li Assignee: Brandon Li Priority: Trivial A couple of property names are wrong in the NFS user guide and should be fixed as follows: {noformat} <property> - <name>dfs.nfsgateway.keytab.file</name> + <name>dfs.nfs.keytab.file</name> <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the nfs gateway keytab --> </property> <property> - <name>dfs.nfsgateway.kerberos.principal</name> + <name>dfs.nfs.kerberos.principal</name> <value>nfsserver/_h...@your-realm.com</value> </property> {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6060) NameNode should not check DataNode layout version
Brandon Li created HDFS-6060: Summary: NameNode should not check DataNode layout version Key: HDFS-6060 URL: https://issues.apache.org/jira/browse/HDFS-6060 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Assignee: Brandon Li In the current code, the NameNode allows the DataNode layout version to be different only when the NameNode is in rolling upgrade mode. The DataNode can't register with the NameNode when only the DataNode is to be upgraded with a layout version different from that on the NameNode. The NameNode should not check the DataNode layout version in any case. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6054) MiniQJMHACluster should not use static port to avoid binding failure in unit test
Brandon Li created HDFS-6054: Summary: MiniQJMHACluster should not use static port to avoid binding failure in unit test Key: HDFS-6054 URL: https://issues.apache.org/jira/browse/HDFS-6054 Project: Hadoop HDFS Issue Type: Improvement Reporter: Brandon Li One example of the test failures: TestFailureToReadEdits {noformat} Error Message Port in use: localhost:10003 Stacktrace java.net.BindException: Port in use: localhost:10003 at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:845) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:786) at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:132) at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:593) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:492) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:650) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:635) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1283) at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:966) at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:851) at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:697) at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:374) at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:355) at org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits.setUpCluster(TestFailureToReadEdits.java:108) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
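The usual fix, sketched below with a hypothetical helper rather than the actual MiniQJMHACluster change, is to bind to port 0 so the OS assigns a free ephemeral port, then read the real port back, instead of hard-coding a port like 10003 that another test may already hold:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch: let the OS pick a free ephemeral port by binding
// to port 0, then report which port was actually assigned.
public class EphemeralPort {
  public static int bindEphemeral() throws IOException {
    try (ServerSocket s = new ServerSocket()) {
      s.bind(new InetSocketAddress("localhost", 0));
      return s.getLocalPort();  // the concrete port the OS assigned
    }
  }
}
```

A test harness would feed the returned port into its cluster configuration, so concurrent test runs can never collide on a static port.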
[jira] [Created] (HDFS-6043) Give HDFS daemons NFS3 and Portmap their own OPTS
Brandon Li created HDFS-6043: Summary: Give HDFS daemons NFS3 and Portmap their own OPTS Key: HDFS-6043 URL: https://issues.apache.org/jira/browse/HDFS-6043 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li Assignee: Brandon Li As with other HDFS daemons, dedicated OPTS variables make it easier for users to adjust resource-related settings for the NFS gateway. -- This message was sent by Atlassian JIRA (v6.2#6252)
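In hadoop-env.sh terms, the request amounts to something like the following fragment. The variable names shown here are illustrative assumptions; the actual names depend on the committed patch:

```sh
# hadoop-env.sh fragment (illustrative variable names): give the NFS3 and
# Portmap daemons their own JVM settings, separate from the shared HADOOP_OPTS.
export HADOOP_NFS3_OPTS="-Xmx1024m $HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"
```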
[jira] [Created] (HDFS-6044) Add property for setting the NFS look up time for users
Brandon Li created HDFS-6044: Summary: Add property for setting the NFS look up time for users Key: HDFS-6044 URL: https://issues.apache.org/jira/browse/HDFS-6044 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li Assignee: Brandon Li Priority: Minor Currently the NFS gateway refreshes the user account mapping every 15 minutes. Add a property to make the interval tunable for different environments. -- This message was sent by Atlassian JIRA (v6.2#6252)
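In configuration terms, the request is something like the following core-site.xml fragment. The property name is hypothetical; the value mirrors the 15-minute interval that is currently hard-coded:

```xml
<!-- Hypothetical property name; the current hard-coded refresh is 15 minutes. -->
<property>
  <name>hadoop.nfs.usergroup.update.millis</name>
  <value>900000</value> <!-- 15 minutes, in milliseconds -->
</property>
```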
[jira] [Resolved] (HDFS-5874) Should not compare DataNode current layout version with that of NameNode in DataStorage
[ https://issues.apache.org/jira/browse/HDFS-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-5874. -- Resolution: Fixed Should not compare DataNode current layout version with that of NameNode in DataStorage Key: HDFS-5874 URL: https://issues.apache.org/jira/browse/HDFS-5874 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Brandon Li Assignee: Brandon Li Fix For: HDFS-5535 (Rolling upgrades) Attachments: HDFS-5874.001.patch As [~vinayrpet] pointed out in HDFS-5754: in DataStorage, DATANODE_LAYOUT_VERSION should no longer be compared with the NameNode layout version. {noformat} if (DataNodeLayoutVersion.supports( LayoutVersion.Feature.FEDERATION, HdfsConstants.DATANODE_LAYOUT_VERSION) && HdfsConstants.DATANODE_LAYOUT_VERSION == nsInfo.getLayoutVersion()) { readProperties(sd, nsInfo.getLayoutVersion()); {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5874) Should not compare DataNode current layout version with that of NameNode in DataStorage
Brandon Li created HDFS-5874: Summary: Should not compare DataNode current layout version with that of NameNode in DataStorage Key: HDFS-5874 URL: https://issues.apache.org/jira/browse/HDFS-5874 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Brandon Li As [~vinayrpet] pointed out in HDFS-5754: in DataStorage, DATANODE_LAYOUT_VERSION should no longer be compared with the NameNode layout version. {noformat} if (DataNodeLayoutVersion.supports( LayoutVersion.Feature.FEDERATION, HdfsConstants.DATANODE_LAYOUT_VERSION) && HdfsConstants.DATANODE_LAYOUT_VERSION == nsInfo.getLayoutVersion()) { readProperties(sd, nsInfo.getLayoutVersion()); {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Reopened] (HDFS-5767) Nfs implementation assumes userName userId mapping to be unique, which is not true sometimes
[ https://issues.apache.org/jira/browse/HDFS-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li reopened HDFS-5767: -- Nfs implementation assumes userName userId mapping to be unique, which is not true sometimes Key: HDFS-5767 URL: https://issues.apache.org/jira/browse/HDFS-5767 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.3.0 Environment: With LDAP enabled Reporter: Yongjun Zhang Assignee: Brandon Li I'm seeing that the NFS implementation assumes a unique userName/userId pair to be returned by the getent passwd command. That is, for a given userName there should be a single userId, and for a given userId there should be a single userName. The reason is explained in the following message: private static final String DUPLICATE_NAME_ID_DEBUG_INFO = NFS gateway can't start with duplicate name or id on the host system.\n + This is because HDFS (non-kerberos cluster) uses name as the only way to identify a user or group.\n + The host system with duplicated user/group name or id might work fine most of the time by itself.\n + However when NFS gateway talks to HDFS, HDFS accepts only user and group name.\n + Therefore, same name means the same user or same group. To find the duplicated names/ids, one can do:\n + getent passwd | cut -d: -f1,3 and getent group | cut -d: -f1,3 on Linux systems,\n + dscl . -list /Users UniqueID and dscl . -list /Groups PrimaryGroupID on MacOS.; This requirement cannot always be met (e.g. because of the use of LDAP). Let's do some examination. What exists in /etc/passwd: $ more /etc/passwd | grep ^bin bin:x:2:2:bin:/bin:/bin/sh $ more /etc/passwd | grep ^daemon daemon:x:1:1:daemon:/usr/sbin:/bin/sh The above result says userName bin has userId 2, and daemon has userId 1. 
What we can see with the getent passwd command due to LDAP: $ getent passwd | grep ^bin bin:x:2:2:bin:/bin:/bin/sh bin:x:1:1:bin:/bin:/sbin/nologin $ getent passwd | grep ^daemon daemon:x:1:1:daemon:/usr/sbin:/bin/sh daemon:x:2:2:daemon:/sbin:/sbin/nologin We can see that there are multiple entries for the same userName with different userIds, and the same userId can be associated with different userNames. So the assumption stated in the above DEBUG_INFO message cannot be met here. The DEBUG_INFO also states that HDFS uses name as the only way to identify user/group. I'm filing this JIRA for a solution. Hi [~brandonli], since you implemented most of the nfs feature, would you please comment? Thanks. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (HDFS-5767) Nfs implementation assumes userName userId mapping to be unique, which is not true sometimes
[ https://issues.apache.org/jira/browse/HDFS-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-5767. -- Resolution: Invalid Assignee: Brandon Li Nfs implementation assumes userName userId mapping to be unique, which is not true sometimes Key: HDFS-5767 URL: https://issues.apache.org/jira/browse/HDFS-5767 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.3.0 Environment: With LDAP enabled Reporter: Yongjun Zhang Assignee: Brandon Li I'm seeing that the NFS implementation assumes a unique userName/userId pair to be returned by the getent passwd command. That is, for a given userName there should be a single userId, and for a given userId there should be a single userName. The reason is explained in the following message: private static final String DUPLICATE_NAME_ID_DEBUG_INFO = NFS gateway can't start with duplicate name or id on the host system.\n + This is because HDFS (non-kerberos cluster) uses name as the only way to identify a user or group.\n + The host system with duplicated user/group name or id might work fine most of the time by itself.\n + However when NFS gateway talks to HDFS, HDFS accepts only user and group name.\n + Therefore, same name means the same user or same group. To find the duplicated names/ids, one can do:\n + getent passwd | cut -d: -f1,3 and getent group | cut -d: -f1,3 on Linux systems,\n + dscl . -list /Users UniqueID and dscl . -list /Groups PrimaryGroupID on MacOS.; This requirement cannot always be met (e.g. because of the use of LDAP). Let's do some examination. What exists in /etc/passwd: $ more /etc/passwd | grep ^bin bin:x:2:2:bin:/bin:/bin/sh $ more /etc/passwd | grep ^daemon daemon:x:1:1:daemon:/usr/sbin:/bin/sh The above result says userName bin has userId 2, and daemon has userId 1. 
What we can see with the getent passwd command due to LDAP: $ getent passwd | grep ^bin bin:x:2:2:bin:/bin:/bin/sh bin:x:1:1:bin:/bin:/sbin/nologin $ getent passwd | grep ^daemon daemon:x:1:1:daemon:/usr/sbin:/bin/sh daemon:x:2:2:daemon:/sbin:/sbin/nologin We can see that there are multiple entries for the same userName with different userIds, and the same userId can be associated with different userNames. So the assumption stated in the above DEBUG_INFO message cannot be met here. The DEBUG_INFO also states that HDFS uses name as the only way to identify user/group. I'm filing this JIRA for a solution. Hi [~brandonli], since you implemented most of the nfs feature, would you please comment? Thanks. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5795) RemoteBlockReader2#checkSuccess() should print error status
Brandon Li created HDFS-5795: Summary: RemoteBlockReader2#checkSuccess() should print error status Key: HDFS-5795 URL: https://issues.apache.org/jira/browse/HDFS-5795 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Brandon Li Priority: Trivial RemoteBlockReader2#checkSuccess() doesn't print the error status, which makes debugging harder when the client can't read from a DataNode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (HDFS-5712) ViewFS should check the existence of the mapped namespace directories in the mount table
[ https://issues.apache.org/jira/browse/HDFS-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-5712. -- Resolution: Invalid ViewFS should check the existence of the mapped namespace directories in the mount table Key: HDFS-5712 URL: https://issues.apache.org/jira/browse/HDFS-5712 Project: Hadoop HDFS Issue Type: Bug Components: federation Affects Versions: 3.0.0 Reporter: Brandon Li ViewFS doesn't validate the mount table mapping. Even if the mapped directory doesn't exist on the NameNode, directory listings and the dfs -ls command still show the mapped directory. This confuses users and applications when they try to create files under the mapped directories: they get a file-not-found error even though ViewFS shows the directory as existing. It would be less misleading if ViewFS validated the mount table and reported any errors found. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5712) ViewFS should check the existence of the mapped namespace directories in the mount table
Brandon Li created HDFS-5712: Summary: ViewFS should check the existence of the mapped namespace directories in the mount table Key: HDFS-5712 URL: https://issues.apache.org/jira/browse/HDFS-5712 Project: Hadoop HDFS Issue Type: Bug Components: federation Affects Versions: 3.0.0 Reporter: Brandon Li ViewFS doesn't validate the mount table mapping. Even if the mapped directory doesn't exist on the NameNode, directory listings and the dfs -ls command still show the mapped directory. This confuses users and applications when they try to create files under the mapped directories: they get a file-not-found error even though ViewFS shows the directory as existing. It would be less misleading if ViewFS validated the mount table and reported any errors found. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5713) ViewFS doesn't work with -lsr command
Brandon Li created HDFS-5713: Summary: ViewFS doesn't work with -lsr command Key: HDFS-5713 URL: https://issues.apache.org/jira/browse/HDFS-5713 Project: Hadoop HDFS Issue Type: Bug Components: federation Affects Versions: 3.0.0 Reporter: Brandon Li -lsr doesn't show the namespace subtree; it only shows the top-level directory and file objects. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-5662) Can't decommission a DataNode due to file's replication factor larger than the rest of the cluster size
Brandon Li created HDFS-5662: Summary: Can't decommission a DataNode due to file's replication factor larger than the rest of the cluster size Key: HDFS-5662 URL: https://issues.apache.org/jira/browse/HDFS-5662 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li A DataNode can't be decommissioned if it holds a replica of a file whose replication factor is larger than the size of the remaining cluster. One way to fix this is to introduce some kind of minimum replication factor setting, so that any DataNode can be decommissioned regardless of the largest replication factor among the files it holds replicas for. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Resolved] (HDFS-5572) Hadoop NFS rpc service port is not configurable
[ https://issues.apache.org/jira/browse/HDFS-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-5572. -- Resolution: Duplicate Hadoop NFS rpc service port is not configurable --- Key: HDFS-5572 URL: https://issues.apache.org/jira/browse/HDFS-5572 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.2.0 Reporter: Gordon Wang HDFS-5246 provides a patch to make the Hadoop NFS port configurable. However, when I set nfs3.server.port in core-site.xml, the configuration does not take effect: the RPC port is still 2049. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HDFS-5657) race condition causes writeback state error in NFS gateway
Brandon Li created HDFS-5657: Summary: race condition causes writeback state error in NFS gateway Key: HDFS-5657 URL: https://issues.apache.org/jira/browse/HDFS-5657 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li Assignee: Brandon Li A race condition between NFS gateway writeback executor thread and new write handler thread can cause writeback state check failure, e.g., {noformat} 2013-11-26 10:34:07,859 DEBUG nfs3.RpcProgramNfs3 (Nfs3Utils.java:writeChannel(113)) - WRITE_RPC_CALL_END__957880843 2013-11-26 10:34:07,863 DEBUG nfs3.OpenFileCtx (OpenFileCtx.java:offerNextToWrite(832)) - The asyn write task has no pending writes, fileId: 30938 2013-11-26 10:34:07,871 ERROR nfs3.AsyncDataService (AsyncDataService.java:run(136)) - Asyn data service got error:java.lang.IllegalStateException: The openFileCtx has false async status at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx.executeWriteBack(OpenFileCtx.java:890) at org.apache.hadoop.hdfs.nfs.nfs3.AsyncDataService$WriteBackTask.run(AsyncDataService.java:134) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 2013-11-26 10:34:07,901 DEBUG nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:write(707)) - requesed offset=917504 and current filesize=917504 2013-11-26 10:34:07,902 DEBUG nfs3.WriteManager (WriteManager.java:handleWrite(131)) - handleWrite fileId: 30938 offset: 917504 length:65536 stableHow:0 {noformat} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
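The failed Preconditions.checkState above is a classic check-then-act race: one thread observes the async status flag while another flips it before the first thread acts. A minimal sketch of the atomic-claim alternative, using a hypothetical WritebackClaim class rather than the actual OpenFileCtx code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class WritebackClaim {
    // Simplified stand-in for OpenFileCtx's "async status" flag. Instead of
    // checking the flag and then setting it (two steps that a concurrent
    // thread can interleave with), claim it in one atomic step so that at
    // most one writeback task can be active for the file at a time.
    private final AtomicBoolean asyncActive = new AtomicBoolean(false);

    // Returns true only for the single thread that wins the claim.
    public boolean tryStartWriteback() {
        return asyncActive.compareAndSet(false, true);
    }

    public void finishWriteback() {
        asyncActive.set(false);
    }
}
```

With this shape, a writeback task that loses the race simply skips the work instead of tripping an IllegalStateException.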
[jira] [Created] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names
Brandon Li created HDFS-5587: Summary: add debug information when NFS fails to start with duplicate user or group names Key: HDFS-5587 URL: https://issues.apache.org/jira/browse/HDFS-5587 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li Assignee: Brandon Li When the host provides duplicate user or group names, NFS will not start and prints errors like the following: {noformat} ... ... 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for [TERM, HUP, INT] Exception in thread main java.lang.IllegalArgumentException: value already present: s-iss at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115) at com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112) at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96) at com.google.common.collect.HashBiMap.put(HashBiMap.java:85) at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85) at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110) at org.apache.hadoop.nfs.nfs3.IdUserGroup.init(IdUserGroup.java:54) at org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.init(RpcProgramNfs3.java:172) at org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.init(RpcProgramNfs3.java:164) at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.init(Nfs3.java:41) at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52) 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG: ... ... {noformat} The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses the name as the only way to identify a user. A Linux box can have two users with the same name but different user IDs, and Linux itself might work fine with that most of the time. However, when the NFS gateway talks to HDFS, HDFS accepts only the user name. That is, from HDFS's point of view, these two different users are the same user even though they are different on the Linux box. 
Duplicate names on Linux systems are sometimes caused by legacy system configurations or combined name services. Regardless, the NFS gateway should print some helpful information so the user can understand the error and remove the duplicate names before restarting NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
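For illustration, the bidirectional-map constraint that makes IdUserGroup refuse duplicates can be sketched as follows. This is a simplified stand-in, not the actual Hadoop code; the input is assumed to be the name:uid output of getent passwd | cut -d: -f1,3:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateIdCheck {
    // Each name must map to exactly one id and each id to exactly one name;
    // violating either direction is what makes the real gateway's BiMap throw
    // "value already present". Input lines are in "name:uid" form.
    public static List<String> findConflicts(List<String> nameIdPairs) {
        Map<String, String> nameToId = new HashMap<>();
        Map<String, String> idToName = new HashMap<>();
        List<String> conflicts = new ArrayList<>();
        for (String pair : nameIdPairs) {
            String[] parts = pair.split(":");
            String name = parts[0], id = parts[1];
            String oldId = nameToId.putIfAbsent(name, id);
            if (oldId != null && !oldId.equals(id)) {
                conflicts.add("duplicate name: " + name);
            }
            String oldName = idToName.putIfAbsent(id, name);
            if (oldName != null && !oldName.equals(name)) {
                conflicts.add("duplicate id: " + id);
            }
        }
        return conflicts;
    }
}
```

Feeding it the LDAP-merged example from HDFS-5767 (bin:2, bin:1, daemon:1, daemon:2) reports both names and both ids as duplicated.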
[jira] [Created] (HDFS-5577) NFS user guide update
Brandon Li created HDFS-5577: Summary: NFS user guide update Key: HDFS-5577 URL: https://issues.apache.org/jira/browse/HDFS-5577 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li Assignee: Brandon Li Priority: Trivial dfs.access.time.precision is deprecated and the doc should use dfs.namenode.accesstime.precision instead. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5563) NFS gateway should commit the buffered data when read request comes after write to the same file
Brandon Li created HDFS-5563: Summary: NFS gateway should commit the buffered data when read request comes after write to the same file Key: HDFS-5563 URL: https://issues.apache.org/jira/browse/HDFS-5563 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Brandon Li Assignee: Brandon Li HDFS write is asynchronous and data may not be available to read immediately after a write. One of the main reasons is that DFSClient doesn't flush data to the DataNode until its local buffer is full. To work around this problem, when a read comes after a write to the same file, the NFS gateway should sync the data so the read request can see the latest content. The drawback is that frequent hsync() calls can slow down writes. -- This message was sent by Atlassian JIRA (v6.1#6144)
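The sync-on-read idea can be sketched with a toy in-memory model (hypothetical class and method names; in the real gateway, "hsync" is the HDFS output stream sync and "stored" is what DataNodes can serve):

```java
public class SyncBeforeRead {
    private final StringBuilder pending = new StringBuilder(); // buffered, not yet visible
    private final StringBuilder stored = new StringBuilder();  // visible to readers

    // Writes land in the local buffer first, like DFSClient's buffer.
    public void write(String data) {
        pending.append(data);
    }

    // Stand-in for hsync(): push buffered bytes to the visible store.
    public void hsync() {
        stored.append(pending);
        pending.setLength(0);
    }

    // The read path forces a sync first, trading write throughput for
    // read-your-writes consistency, exactly the trade-off described above.
    public String read() {
        if (pending.length() > 0) {
            hsync();
        }
        return stored.toString();
    }
}
```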
[jira] [Created] (HDFS-5539) NFS gateway security enhancement
Brandon Li created HDFS-5539: Summary: NFS gateway security enhancement Key: HDFS-5539 URL: https://issues.apache.org/jira/browse/HDFS-5539 Project: Hadoop HDFS Issue Type: New Feature Reporter: Brandon Li Currently, the NFS gateway only supports AUTH_UNIX RPC authentication. AUTH_UNIX is easy to deploy and use but lacks strong security support. This JIRA is to track the effort of NFS gateway security enhancement, such as the RPCSEC_GSS framework and end-to-end Kerberos support. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5523) Support subdirectory mount and multiple exports in HDFS-NFS gateway
Brandon Li created HDFS-5523: Summary: Support subdirectory mount and multiple exports in HDFS-NFS gateway Key: HDFS-5523 URL: https://issues.apache.org/jira/browse/HDFS-5523 Project: Hadoop HDFS Issue Type: New Feature Reporter: Brandon Li Supporting multiple exports and subdirectory mounts usually makes data and security management easier for the HDFS-NFS client. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5519) COMMIT handler should update the commit status after sync
Brandon Li created HDFS-5519: Summary: COMMIT handler should update the commit status after sync Key: HDFS-5519 URL: https://issues.apache.org/jira/browse/HDFS-5519 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Assignee: Brandon Li Priority: Minor The problem was found during testing with a Windows NFS client. After hsync, OpenFileCtx#checkCommit() should update COMMIT_DO_SYNC to COMMIT_FINISHED. Otherwise, the caller can throw a runtime exception since COMMIT_DO_SYNC is not an expected status. -- This message was sent by Atlassian JIRA (v6.1#6144)
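A minimal sketch of the intended state transition. The enum values follow the names in the report; the surrounding class and method are hypothetical:

```java
public class CommitState {
    public enum Status { COMMIT_WAIT, COMMIT_DO_SYNC, COMMIT_FINISHED, COMMIT_ERROR }

    // Once the hsync has actually been performed, the handler must report
    // COMMIT_FINISHED; handing COMMIT_DO_SYNC back to the caller after the
    // sync is what triggered the unexpected-status runtime exception.
    public static Status statusAfterSync(Status before) {
        return before == Status.COMMIT_DO_SYNC ? Status.COMMIT_FINISHED : before;
    }
}
```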
[jira] [Resolved] (HDFS-5172) Handle race condition for writes
[ https://issues.apache.org/jira/browse/HDFS-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-5172. -- Resolution: Fixed This issue is fixed along with the fix to HDFS-5364. Resolving it as a duplicate. Handle race condition for writes Key: HDFS-5172 URL: https://issues.apache.org/jira/browse/HDFS-5172 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li When an unstable write arrives, the following happens: 1. retrieve the OpenFileCtx 2. create an async task to write it to HDFS The race is that the OpenFileCtx could be closed by the StreamMonitor, in which case step 2 simply returns an error to the client. This is OK before streaming is supported. To support data streaming, the file needs to be reopened. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5469) Add configuration property for the sub-directory export path
Brandon Li created HDFS-5469: Summary: Add configuration property for the sub-directory export path Key: HDFS-5469 URL: https://issues.apache.org/jira/browse/HDFS-5469 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Assignee: Brandon Li Currently only the HDFS root is exported. Adding this property is the first step to support sub-directory mounting. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5364) Add OpenFileCtx cache
Brandon Li created HDFS-5364: Summary: Add OpenFileCtx cache Key: HDFS-5364 URL: https://issues.apache.org/jira/browse/HDFS-5364 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Assignee: Brandon Li The NFS gateway can run out of memory when the stream timeout is set to a relatively long period (e.g., 1 minute) and a user uploads thousands of files in parallel. DFSClient creates a DataStreamer thread for each stream, so the gateway eventually runs out of memory by creating too many threads. The NFS gateway should have an OpenFileCtx cache to limit the total number of open files. -- This message was sent by Atlassian JIRA (v6.1#6144)
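A bounded, access-ordered cache of the kind proposed can be sketched with LinkedHashMap. This is a generic LRU illustration, not the actual OpenFileCtxCache; a real implementation would also close the evicted file's stream, which this sketch only drops:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OpenFileCache<K, V> extends LinkedHashMap<K, V> {
    // Bound the number of cached open-file contexts so each context's
    // DataStreamer thread cannot accumulate without limit. Access-ordered,
    // so the least recently used entry is evicted first.
    private final int maxEntries;

    public OpenFileCache(int maxEntries) {
        super(16, 0.75f, true /* access order */);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put(); returning true drops the LRU entry.
        return size() > maxEntries;
    }
}
```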
[jira] [Created] (HDFS-5347) add HDFS NFS user guide
Brandon Li created HDFS-5347: Summary: add HDFS NFS user guide Key: HDFS-5347 URL: https://issues.apache.org/jira/browse/HDFS-5347 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Reporter: Brandon Li Assignee: Brandon Li -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5330) fix readdir and readdirplus for large directories
Brandon Li created HDFS-5330: Summary: fix readdir and readdirplus for large directories Key: HDFS-5330 URL: https://issues.apache.org/jira/browse/HDFS-5330 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li These two calls need to use cookies to make multiple round trips to the NameNode to get the complete list of dirents. The current implementation passes an inode path as startAfter for listPath(); however, the NameNode doesn't resolve startAfter as an inode path. It is better to use the file name as startAfter. -- This message was sent by Atlassian JIRA (v6.1#6144)
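The name-as-cookie paging scheme can be illustrated like this (a simplified sketch over a sorted name set; listAfter is a hypothetical stand-in for listPath with startAfter):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

public class DirPager {
    // Page through a sorted directory listing the way readdir/readdirplus
    // must: the cookie handed back to the client is the file *name* of the
    // last entry returned, and the next call resumes strictly after that
    // name. Pass "" to start from the beginning.
    public static List<String> listAfter(NavigableSet<String> dir,
                                         String startAfter, int max) {
        List<String> page = new ArrayList<>();
        for (String name : dir.tailSet(startAfter, false /* exclusive */)) {
            if (page.size() == max) {
                break;
            }
            page.add(name);
        }
        return page;
    }
}
```

Repeated calls, each feeding the last returned name back in, walk the whole directory without ever needing the NameNode to interpret an inode path.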
[jira] [Created] (HDFS-5337) should do a sync for a commit request even when there are no pending writes
Brandon Li created HDFS-5337: Summary: should do a sync for a commit request even when there are no pending writes Key: HDFS-5337 URL: https://issues.apache.org/jira/browse/HDFS-5337 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Assignee: Brandon Li HDFS-5281 introduced a regression: hsync is not executed when a commit request arrives with no pending writes for the same file. This JIRA is to track the fix. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5281) COMMIT request should not block
Brandon Li created HDFS-5281: Summary: COMMIT request should not block Key: HDFS-5281 URL: https://issues.apache.org/jira/browse/HDFS-5281 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Currently a COMMIT request is handled synchronously, blocking up to 30 seconds before timeout. This JIRA is to make it asynchronous so it won't block other requests coming from the same channel. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5268) NFS write commit verifier is not set in a few places
Brandon Li created HDFS-5268: Summary: NFS write commit verifier is not set in a few places Key: HDFS-5268 URL: https://issues.apache.org/jira/browse/HDFS-5268 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li Assignee: Brandon Li After each server reboot, a WRITE_COMMIT_VERF is created and passed to the NFS client with the write or commit response. If this verifier is not correctly set in the response, some NFS clients (especially the Linux client) can keep resending the same request, because they think the write/commit didn't succeed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5249) Fix dumper thread which may die silently
Brandon Li created HDFS-5249: Summary: Fix dumper thread which may die silently Key: HDFS-5249 URL: https://issues.apache.org/jira/browse/HDFS-5249 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li The dumper thread can get an NPE when the WriteCtx it's about to work on has just been deleted by the write-back thread. A dead dumper thread can cause an out-of-memory error when too many pending writes accumulate for one open file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5252) Do unstable write only when stable write can't be honored
Brandon Li created HDFS-5252: Summary: Do unstable write only when stable write can't be honored Key: HDFS-5252 URL: https://issues.apache.org/jira/browse/HDFS-5252 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li When the client asks for a stable write but the prerequisite writes have not been transferred to the NFS gateway, the stability guarantee can't be honored. The NFS gateway has to treat the write as an unstable write. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5199) Add more debug trace for NFS READ and WRITE
Brandon Li created HDFS-5199: Summary: Add more debug trace for NFS READ and WRITE Key: HDFS-5199 URL: https://issues.apache.org/jira/browse/HDFS-5199 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Before a more sophisticated utility is added, simple traces marking the start and end of request serving can help debug errors and collect statistics. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5171) NFS should create input stream for a file and try to share it with multiple read requests
Brandon Li created HDFS-5171: Summary: NFS should create input stream for a file and try to share it with multiple read requests Key: HDFS-5171 URL: https://issues.apache.org/jira/browse/HDFS-5171 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Currently, NFS creates an input stream for each read request and closes it after the request is served. With lots of read requests, the overhead is significant. As with write requests, NFS should create one input stream per file and try to share it among multiple read requests. The stream can be closed if there is no read request for a certain amount of time (e.g., 10 sec). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5172) Handle race condition for writes
Brandon Li created HDFS-5172: Summary: Handle race condition for writes Key: HDFS-5172 URL: https://issues.apache.org/jira/browse/HDFS-5172 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Assignee: Brandon Li When an unstable write arrives, the following happens: 1. retrieve the OpenFileCtx 2. create an async task to write it to HDFS The race is that the OpenFileCtx could be closed by the StreamMonitor, in which case step 2 simply returns an error to the client. This is OK before streaming is supported. To support data streaming, the file needs to be reopened. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5135) Create a test framework to enable NFS end to end unit test
Brandon Li created HDFS-5135: Summary: Create a test framework to enable NFS end to end unit test Key: HDFS-5135 URL: https://issues.apache.org/jira/browse/HDFS-5135 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Currently, we have to manually start portmap and nfs3 processes to test patches and new functionality. This JIRA is to track the effort to introduce a test framework that enables NFS unit tests without starting standalone nfs3 processes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5136) MNT EXPORT should give the full group list which can mount the exports
Brandon Li created HDFS-5136: Summary: MNT EXPORT should give the full group list which can mount the exports Key: HDFS-5136 URL: https://issues.apache.org/jira/browse/HDFS-5136 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Currently the MNT.EXPORT command returns an empty group list, which is interpreted as meaning anyone can mount the export. It should return the correctly configured group list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5107) Fix array copy error in Readdir and Readdirplus responses
Brandon Li created HDFS-5107: Summary: Fix array copy error in Readdir and Readdirplus responses Key: HDFS-5107 URL: https://issues.apache.org/jira/browse/HDFS-5107 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li System.arraycopy(this.entries, 0, entries, 0, entries.length); it should be System.arraycopy(entries, 0, this.entries, 0, entries.length); This caused NFS to fail to return directory content. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
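To make the direction of the fix concrete, here is a self-contained illustration of the corrected copy (the class here is hypothetical; the real code lives in the Readdir/Readdirplus response classes):

```java
public class ArrayCopyFix {
    // The bug copied from the freshly allocated (empty) destination field into
    // the source argument; src and dest were swapped. The correct direction
    // copies the argument into the field.
    private final String[] entries;

    public ArrayCopyFix(String[] source) {
        this.entries = new String[source.length];
        // was: System.arraycopy(this.entries, 0, entries, 0, entries.length);
        System.arraycopy(source, 0, this.entries, 0, source.length);
    }

    public String[] getEntries() {
        return entries.clone();
    }
}
```

With the arguments swapped, the field stayed all-null, which is why NFS returned empty directory content.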
[jira] [Created] (HDFS-5104) Support dotdot name in NFS LOOKUP operation
Brandon Li created HDFS-5104: Summary: Support dotdot name in NFS LOOKUP operation Key: HDFS-5104 URL: https://issues.apache.org/jira/browse/HDFS-5104 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Procedure LOOKUP searches a directory for a specific name and returns the file handle for the corresponding file system object. The NFS client sets the filename to .. to get the parent directory information. Currently .. is considered an invalid name component; we only allow .. when the path is an inodeID path such as /.reserved/.inodes/.. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5084) Add namespace ID and snapshot ID into fileHandle to support Federation and Snapshot
Brandon Li created HDFS-5084: Summary: Add namespace ID and snapshot ID into fileHandle to support Federation and Snapshot Key: HDFS-5084 URL: https://issues.apache.org/jira/browse/HDFS-5084 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li
[jira] [Created] (HDFS-5086) Support RPCSEC_GSS authentication in NFSv3 gateway
Brandon Li created HDFS-5086: Summary: Support RPCSEC_GSS authentication in NFSv3 gateway Key: HDFS-5086 URL: https://issues.apache.org/jira/browse/HDFS-5086 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li
[jira] [Created] (HDFS-5085) Support Kerberos authentication in NFSv3 gateway
Brandon Li created HDFS-5085: Summary: Support Kerberos authentication in NFSv3 gateway Key: HDFS-5085 URL: https://issues.apache.org/jira/browse/HDFS-5085 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li
[jira] [Created] (HDFS-5078) Support file append in NFSv3 gateway to enable data streaming to HDFS
Brandon Li created HDFS-5078: Summary: Support file append in NFSv3 gateway to enable data streaming to HDFS Key: HDFS-5078 URL: https://issues.apache.org/jira/browse/HDFS-5078 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li
[jira] [Created] (HDFS-5067) Support symlink operations
Brandon Li created HDFS-5067: Summary: Support symlink operations Key: HDFS-5067 URL: https://issues.apache.org/jira/browse/HDFS-5067 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li Given that the symlink issues (e.g., HDFS-4765) are getting fixed, NFS can support the symlink-related requests, which include the NFSv3 calls SYMLINK and READLINK.
[jira] [Created] (HDFS-5069) Include hadoop-nfs jar file into hadoop-common tar ball, and hdfs-nfs into hadoop-hdfs tar file for easier NFS deployment
Brandon Li created HDFS-5069: Summary: Include hadoop-nfs jar file into hadoop-common tar ball, and hdfs-nfs into hadoop-hdfs tar file for easier NFS deployment Key: HDFS-5069 URL: https://issues.apache.org/jira/browse/HDFS-5069 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li
[jira] [Created] (HDFS-5054) PortmapInterface should check if the procedure is out-of-range
Brandon Li created HDFS-5054: Summary: PortmapInterface should check if the procedure is out-of-range Key: HDFS-5054 URL: https://issues.apache.org/jira/browse/HDFS-5054 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li
[jira] [Created] (HDFS-5043) For HdfsFileStatus, set default value of childrenNum to -1 instead of 0 to avoid confusing applications
Brandon Li created HDFS-5043: Summary: For HdfsFileStatus, set default value of childrenNum to -1 instead of 0 to avoid confusing applications Key: HDFS-5043 URL: https://issues.apache.org/jira/browse/HDFS-5043 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Per the discussion in HDFS-4772, the default value 0 can confuse an application, since it cannot tell whether the server does not support childrenNum or the directory simply has no children. Use -1 instead to avoid this ambiguity.
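The sentinel semantics can be sketched as follows (a hypothetical helper class, not the actual HdfsFileStatus API): -1 means the server did not report a child count, while 0 genuinely means an empty directory.

```java
// Illustrative sketch of the -1 sentinel for childrenNum:
// -1  => server did not supply the value (e.g., an older server),
//  0+ => actual number of children.
public class ChildrenNumView {
    private final int childrenNum;

    public ChildrenNumView(int childrenNum) {
        this.childrenNum = childrenNum;
    }

    public boolean isSupported() {
        return childrenNum >= 0;
    }

    public int get() {
        if (childrenNum < 0) {
            throw new IllegalStateException("server did not report childrenNum");
        }
        return childrenNum;
    }
}
```

The key point is that with 0 as the default, the two cases above would be indistinguishable to the client.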
[jira] [Created] (HDFS-4954) compile failure in branch-2: getFlushedOffset should catch or rethrow IOException
Brandon Li created HDFS-4954: Summary: compile failure in branch-2: getFlushedOffset should catch or rethrow IOException Key: HDFS-4954 URL: https://issues.apache.org/jira/browse/HDFS-4954 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Brandon Li Assignee: Brandon Li This was caused by merging HDFS-4762 from trunk to branch-2. Unlike in trunk, FSDataOutputStream.getPos() throws IOException in branch-2, so getFlushedOffset should catch or rethrow the IOException.
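The two options named in the summary can be sketched like this (the interface and class names are hypothetical stand-ins for the branch-2 stream): either propagate the checked IOException or catch it and convert it at the boundary.

```java
import java.io.IOException;

// Hypothetical reduction: in branch-2, getPos() declares IOException,
// so getFlushedOffset must either rethrow or catch it to compile.
public class OffsetTracker {
    public interface PosSource {
        long getPos() throws IOException;
    }

    private final PosSource out;

    public OffsetTracker(PosSource out) {
        this.out = out;
    }

    // Option 1: rethrow, letting the caller decide how to recover.
    public long getFlushedOffset() throws IOException {
        return out.getPos();
    }

    // Option 2: catch and wrap in an unchecked exception.
    public long getFlushedOffsetUnchecked() {
        try {
            return out.getPos();
        } catch (IOException e) {
            throw new IllegalStateException("getPos failed", e);
        }
    }
}
```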
[jira] [Created] (HDFS-4947) Add NFS server export table to control export by hostname or IP range
Brandon Li created HDFS-4947: Summary: Add NFS server export table to control export by hostname or IP range Key: HDFS-4947 URL: https://issues.apache.org/jira/browse/HDFS-4947 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Jing Zhao
[jira] [Created] (HDFS-4928) Use fileId instead of src to get INode in complete() and addBlock()
Brandon Li created HDFS-4928: Summary: Use fileId instead of src to get INode in complete() and addBlock() Key: HDFS-4928 URL: https://issues.apache.org/jira/browse/HDFS-4928 Project: Hadoop HDFS Issue Type: Bug Reporter: Brandon Li When a valid fileId is provided, complete() and addBlock() should use it for getting INodes instead of src.
[jira] [Created] (HDFS-4920) ClientNamenodeProtocolServerSideTranslatorPB.addBlock() should check if fileId exists for backward compatibility support
Brandon Li created HDFS-4920: Summary: ClientNamenodeProtocolServerSideTranslatorPB.addBlock() should check if fileId exists for backward compatibility support Key: HDFS-4920 URL: https://issues.apache.org/jira/browse/HDFS-4920 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Brandon Li Instead of calling req.getFileId() unconditionally, it should first check whether fileId is present via req.hasFileId().
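The presence check can be sketched as below; the request class and the fallback constant are hypothetical stand-ins for the generated protobuf type, where reading an unset optional field silently yields the field's default value rather than failing.

```java
// Illustrative model of an optional protobuf field: null == "not set".
class AddBlockReq {
    private final Long fileId;

    AddBlockReq(Long fileId) {
        this.fileId = fileId;
    }

    boolean hasFileId() {
        return fileId != null;
    }

    long getFileId() {
        // Mirrors protobuf behavior: an unset field reads as its default.
        return fileId == null ? 0L : fileId;
    }
}

class FileIdResolver {
    // Hypothetical placeholder meaning "no id supplied by an older client".
    static final long NO_FILE_ID = -1L;

    static long resolve(AddBlockReq req) {
        // Check presence first, as the JIRA recommends, so a field default
        // is never mistaken for a real id sent by the client.
        return req.hasFileId() ? req.getFileId() : NO_FILE_ID;
    }
}
```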
[jira] [Created] (HDFS-4900) Print user when services are started
Brandon Li created HDFS-4900: Summary: Print user when services are started Key: HDFS-4900 URL: https://issues.apache.org/jira/browse/HDFS-4900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0, 1.3.0 Reporter: Brandon Li Priority: Trivial Printing the user name during startup can help debug access-permission issues, e.g., when the namenode storage directory is not accessible by the user who starts the service. The message could look like: STARTUP_MSG: Starting NameNode by hdfs ...
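A minimal sketch of such a banner line (the class and method names are illustrative; the JVM's user.name property stands in for however the service determines its effective user):

```java
// Illustrative: embed the effective user in the startup banner so that
// permission problems can be traced to the wrong launching user.
class StartupBanner {
    static String banner(String service) {
        String user = System.getProperty("user.name");
        return "STARTUP_MSG: Starting " + service + " by " + user;
    }
}
```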
[jira] [Created] (HDFS-4784) NPE in FSDirectory.resolvePath()
Brandon Li created HDFS-4784: Summary: NPE in FSDirectory.resolvePath() Key: HDFS-4784 URL: https://issues.apache.org/jira/browse/HDFS-4784 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li The NN can get an NPE when resolving an inode-ID path for a nonexistent file.
[jira] [Created] (HDFS-4771) Provide a way to set symlink attributes
Brandon Li created HDFS-4771: Summary: Provide a way to set symlink attributes Key: HDFS-4771 URL: https://issues.apache.org/jira/browse/HDFS-4771 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Currently HDFS always resolves the symlink when setting certain file attributes, such as in setPermission and setTime, and thus the client can't set those attributes on the symlink itself.
[jira] [Created] (HDFS-4772) Add number of children in HdfsFileStatus
Brandon Li created HDFS-4772: Summary: Add number of children in HdfsFileStatus Key: HDFS-4772 URL: https://issues.apache.org/jira/browse/HDFS-4772 Project: Hadoop HDFS Issue Type: Improvement Reporter: Brandon Li Assignee: Brandon Li This JIRA is to track the change to return the number of children for a directory, so the client doesn't need to make a getListing() call to calculate the number of dirents. This makes it convenient for the client to check for directory size changes.
[jira] [Created] (HDFS-4763) Add script changes/utility for starting NFS gateway
Brandon Li created HDFS-4763: Summary: Add script changes/utility for starting NFS gateway Key: HDFS-4763 URL: https://issues.apache.org/jira/browse/HDFS-4763 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li
[jira] [Created] (HDFS-4756) Implement ONCRPC and XDR
Brandon Li created HDFS-4756: Summary: Implement ONCRPC and XDR Key: HDFS-4756 URL: https://issues.apache.org/jira/browse/HDFS-4756 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li This is to track the implementation of ONC RPC (RFC 5531) and XDR (RFC 4506).
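To give a flavor of what XDR (RFC 4506) encoding involves, here is a hedged sketch rather than the classes this JIRA produced: 32-bit integers are written big-endian, and variable-length data is length-prefixed and zero-padded to a four-byte boundary.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Minimal XDR encoder sketch per RFC 4506 (illustrative only):
// ints are 4-byte big-endian; strings are length-prefixed opaque data,
// zero-padded so each item ends on a 4-byte boundary.
class XdrSketch {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    void writeInt(int v) {
        buf.write((v >>> 24) & 0xff);
        buf.write((v >>> 16) & 0xff);
        buf.write((v >>> 8) & 0xff);
        buf.write(v & 0xff);
    }

    void writeString(String s) {
        byte[] b = s.getBytes(StandardCharsets.US_ASCII);
        writeInt(b.length);                  // length prefix
        buf.write(b, 0, b.length);           // payload
        for (int pad = (4 - b.length % 4) % 4; pad > 0; pad--) {
            buf.write(0);                    // pad to 4-byte boundary
        }
    }

    byte[] toBytes() {
        return buf.toByteArray();
    }
}
```

For example, encoding the string "abc" yields 4 length bytes, 3 payload bytes and 1 pad byte, 8 bytes total.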
[jira] [Created] (HDFS-4750) Support NFSv3 interface to HDFS
Brandon Li created HDFS-4750: Summary: Support NFSv3 interface to HDFS Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Accessing HDFS is usually done through the HDFS client or WebHDFS. The lack of seamless integration with the client's file system makes it difficult for users, and impossible for some applications, to access HDFS. NFS interface support is one way for HDFS to enable such easy integration. This JIRA is to track NFS protocol support for accessing HDFS. With the HDFS client, WebHDFS and the NFS interface, HDFS will be easier to access and able to support more applications and use cases. We will upload the design document and the initial implementation.
[jira] [Resolved] (HDFS-252) Export the HDFS file system through a NFS protocol
[ https://issues.apache.org/jira/browse/HDFS-252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-252. - Resolution: Duplicate Export the HDFS file system through a NFS protocol -- Key: HDFS-252 URL: https://issues.apache.org/jira/browse/HDFS-252 Project: Hadoop HDFS Issue Type: New Feature Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: nfshadoop.tar.gz It would be nice if we could expose the HDFS filesystem using the NFS protocol. There are a couple of options that I could find: 1. Use a user-space C-language implementation of an NFS server and then use the libhdfs API to integrate that code with Hadoop. There is such an implementation available at http://sourceforge.net/project/showfiles.php?group_id=66203. 2. Use a user-space Java implementation of an NFS server and then integrate it with HDFS using the Java API. There is such an implementation of an NFS server at http://void.org/~steven/jnfs/. I have experimented with Option 2 and have written a first version of the Hadoop integration. I am attaching the code for your preliminary feedback. This implementation of the Java NFS server has one limitation: it supports UDP only. Some licensing issues will have to be sorted out before it can be used. Steve (the writer of the NFS server implementation) has told me that he can change the licensing of the code if needed.
[jira] [Resolved] (HDFS-487) HDFS should expose a fileid to uniquely identify a file
[ https://issues.apache.org/jira/browse/HDFS-487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-487. - Resolution: Duplicate HDFS should expose a fileid to uniquely identify a file --- Key: HDFS-487 URL: https://issues.apache.org/jira/browse/HDFS-487 Project: Hadoop HDFS Issue Type: New Feature Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: fileid1.txt HDFS should expose an id that uniquely identifies a file. This helps in developing applications that work correctly even when files are moved from one directory to another. A typical use-case is to make the Pluggable Block Placement Policy (HDFS-385) use fileid instead of filename.
[jira] [Resolved] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-4489. -- Resolution: Fixed Closing this JIRA since all its sub-issues have been resolved. Use InodeID as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li The benefits of using InodeID to uniquely identify a file are manifold. Here are a few of them: 1. uniquely identifying a file across renames; related JIRAs include HDFS-4258 and HDFS-4437. 2. modification checks in tools like distcp: since a file could have been replaced or renamed, the file name and size combination is not reliable, but the combination of file ID and size is unique. 3. ID-based protocol support (e.g., NFS). 4. making the pluggable block placement policy use fileid instead of filename (HDFS-385).
[jira] [Reopened] (HDFS-3538) TestBlocksWithNotEnoughRacks fails
[ https://issues.apache.org/jira/browse/HDFS-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li reopened HDFS-3538: -- TestBlocksWithNotEnoughRacks fails -- Key: HDFS-3538 URL: https://issues.apache.org/jira/browse/HDFS-3538 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.24.0 Reporter: Brandon Li It has been failing for a few days in the Jenkins tests.
[jira] [Resolved] (HDFS-4654) FileNotFoundException: ID mismatch
[ https://issues.apache.org/jira/browse/HDFS-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-4654. -- Resolution: Not A Problem Assignee: Brandon Li Given that HDFS-4339 has been committed, this JIRA is no longer a problem. FileNotFoundException: ID mismatch -- Key: HDFS-4654 URL: https://issues.apache.org/jira/browse/HDFS-4654 Project: Hadoop HDFS Issue Type: Bug Components: ha, namenode Affects Versions: 3.0.0 Reporter: Fengdong Yu Assignee: Brandon Li Fix For: 3.0.0 My cluster was built from source code trunk r1463074. I got the following exception when I put a file to HDFS: 13/04/01 09:33:45 WARN retry.RetryInvocationHandler: Exception while invoking addBlock of class ClientNamenodeProtocolTranslatorPB. Trying to fail over immediately. 13/04/01 09:33:45 WARN hdfs.DFSClient: DataStreamer Exception java.io.FileNotFoundException: ID mismatch. Request id and saved id: 1073 , 1050 at org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:51) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2501) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2298) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2212) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:498) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:356) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40979) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:526) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1018) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1818) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1814) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1489) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1812) To reproduce: run hdfs dfs -put test.data /user/data/test.data; after this command starts to run, kill the active namenode process. I have only three nodes (A, B, C) for testing. A and B are namenodes. B and C are datanodes. ZK is deployed on A, B and C. A, B and C are all journal nodes. Thanks.