[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010841#comment-14010841 ] Hangjun Ye commented on HDFS-6382:
----------------------------------

Thanks Haohui for your reply. Let me confirm that I got your point: your suggestion is that we'd be better off with a general mechanism/framework for running a job (perhaps periodically) over the namespace inside the NN, with the TTL policy being just one specific job that might be implemented by the user? That's an interesting direction; we will think about it. We are heavy users of Hadoop and also make some in-house improvements per our business requirements. We definitely want to contribute those improvements back to the community, as long as they are helpful.

> HDFS File/Directory TTL
> -----------------------
>
>                 Key: HDFS-6382
>                 URL: https://issues.apache.org/jira/browse/HDFS-6382
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client, namenode
>    Affects Versions: 2.4.0
>            Reporter: Zesheng Wu
>            Assignee: Zesheng Wu
>
> In a production environment we often have a scenario like this: we want to keep backup files on HDFS for some time and then delete them automatically. For example, we keep only 1 day's logs on local disk due to limited disk space, but we need about 1 month's logs in order to debug program bugs, so we keep all the logs on HDFS and delete logs older than 1 month. This is a typical HDFS TTL scenario, so we propose that HDFS support TTL.
> Details of this proposal:
> 1. HDFS can support TTL on a specified file or directory.
> 2. If a TTL is set on a file, the file is deleted automatically after the TTL expires.
> 3. If a TTL is set on a directory, its child files and directories are deleted automatically after the TTL expires.
> 4. A child file/directory's TTL configuration overrides its parent directory's.
> 5. A global configuration controls whether the deleted files/directories go to the trash.
> 6. A global configuration controls whether a directory with a TTL is itself deleted once the TTL mechanism has emptied it.

-- This message was sent by Atlassian JIRA (v6.2#6252)
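The expiry and override semantics in points 2-4 of the proposal can be sketched in a few lines. This is a hypothetical illustration of the proposed behavior, not code from any patch on this JIRA; the names `effectiveTtl` and `isExpired` are invented for the sketch, and it assumes expiry is measured against the modification time.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the proposed TTL semantics: a file expires when
// now - modificationTime exceeds its effective TTL, where a file's own TTL
// overrides the one inherited from its parent directory.
public class TtlPolicy {
    // Effective TTL: the file's own TTL if set, otherwise the parent's.
    static Duration effectiveTtl(Duration fileTtl, Duration parentTtl) {
        return fileTtl != null ? fileTtl : parentTtl;
    }

    // A null TTL means "keep forever".
    static boolean isExpired(Instant modificationTime, Duration ttl, Instant now) {
        return ttl != null && modificationTime.plus(ttl).isBefore(now);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2014-05-28T00:00:00Z");
        Instant mtime = now.minus(Duration.ofDays(40));   // a 40-day-old log file
        Duration parentTtl = Duration.ofDays(30);          // directory-level TTL

        // No per-file TTL set: the parent's 30-day TTL applies, so it is expired.
        System.out.println(isExpired(mtime, effectiveTtl(null, parentTtl), now));
        // A per-file 60-day TTL overrides the parent's 30 days: not expired.
        System.out.println(isExpired(mtime, effectiveTtl(Duration.ofDays(60), parentTtl), now));
    }
}
```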
[jira] [Updated] (HDFS-6453) use Time#monotonicNow to avoid system clock reset
[ https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6453:
----------------------------
    Status: Patch Available  (was: Open)

> use Time#monotonicNow to avoid system clock reset
> -------------------------------------------------
>
>                 Key: HDFS-6453
>                 URL: https://issues.apache.org/jira/browse/HDFS-6453
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-6453.txt
>
> Similar to hadoop-common, let's re-check and replace System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6453) use Time#monotonicNow to avoid system clock reset
[ https://issues.apache.org/jira/browse/HDFS-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6453:
----------------------------
    Attachment: HDFS-6453.txt

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6453) use Time#monotonicNow to avoid system clock reset
Liang Xie created HDFS-6453:
---------------------------

             Summary: use Time#monotonicNow to avoid system clock reset
                 Key: HDFS-6453
                 URL: https://issues.apache.org/jira/browse/HDFS-6453
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: datanode, namenode
    Affects Versions: 3.0.0
            Reporter: Liang Xie
            Assignee: Liang Xie

Similar to hadoop-common, let's re-check and replace System#currentTimeMillis with Time#monotonicNow in the HDFS project as well.

-- This message was sent by Atlassian JIRA (v6.2#6252)
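The motivation for this JIRA can be shown with a small JDK-only sketch. Hadoop's `org.apache.hadoop.util.Time#monotonicNow` derives interval measurements from `System.nanoTime()`, which is immune to wall-clock resets (NTP steps, manual `date` changes), unlike `System.currentTimeMillis()`. The class below re-implements the idea for self-containment; it is not the Hadoop class itself.

```java
// Sketch of the idea behind Time#monotonicNow: a millisecond counter based on
// System.nanoTime(), useful only for measuring elapsed intervals, never for
// dates. Subtracting two monotonic values can never go negative, even if the
// system clock is stepped backwards in between.
public class MonotonicClock {
    // Equivalent in spirit to org.apache.hadoop.util.Time.monotonicNow().
    static long monotonicNow() {
        return System.nanoTime() / 1_000_000L;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = monotonicNow();
        Thread.sleep(50);
        long elapsed = monotonicNow() - start;
        // With currentTimeMillis(), a clock reset during the sleep could make
        // this negative or wildly large; with a monotonic source it cannot.
        System.out.println("slept ~50 ms, measured " + elapsed + " ms");
    }
}
```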
[jira] [Commented] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010822#comment-14010822 ] Guo Ruijing commented on HDFS-4167:
-----------------------------------

I agree that it's better to keep it as a standalone HDFS metadata operation. It is easy to restore any snapshot if the block is copied out for append, as in the HDFS-6087 proposal.

What changes in append?
1. File f1 includes (block1, block2, block3).
2. Append to f1:
   a) client requests block3 information from the namenode
   b) client asks the datanode to copy block3 as block4
   c) client appends to block4
   d) client commits block4 to the namenode

What happens with snapshots?
snap1: f1 includes (block1, block2, block3)
snap2: f1 includes (block1, block2, block4)

How do we restore a snapshot? Just restore snap1 as the current file, since no partial blocks are shared between different snapshots.

> Add support for restoring/rolling back to a snapshot
> ----------------------------------------------------
>
>                 Key: HDFS-4167
>                 URL: https://issues.apache.org/jira/browse/HDFS-4167
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Suresh Srinivas
>            Assignee: Jing Zhao
>         Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch
>
> This jira tracks work related to restoring a directory/file to a snapshot.

-- This message was sent by Atlassian JIRA (v6.2#6252)
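The block-list bookkeeping in the comment above can be modeled with plain lists. This is a toy illustration of the copy-on-append idea (the method names are invented, not HDFS APIs): because an append writes into a copy of the last block, no partial block is ever shared between snapshots, so rollback is a pure metadata swap.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of copy-on-append and snapshot rollback as described in the
// comment. Blocks are represented as strings; "appending" copies the last
// block and writes into the copy, leaving the original block untouched.
public class SnapshotModel {
    // Append under the copy-on-append proposal: replace the last block with
    // the freshly copied block that the new data was written into.
    static List<String> appendWithCopy(List<String> blocks, String copiedBlock) {
        List<String> out = new ArrayList<>(blocks.subList(0, blocks.size() - 1));
        out.add(copiedBlock);
        return out;
    }

    // Rollback is pure metadata: the snapshot's block list becomes current.
    static List<String> restore(List<String> snapshotBlocks) {
        return snapshotBlocks;
    }

    public static void main(String[] args) {
        List<String> snap1 = List.of("block1", "block2", "block3"); // before append
        List<String> snap2 = appendWithCopy(snap1, "block4");       // after append
        // block3 was never modified by the append, so restoring snap1 is safe.
        System.out.println(restore(snap1));
        System.out.println(snap2);
    }
}
```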
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010820#comment-14010820 ] Haohui Mai commented on HDFS-6382:
----------------------------------

bq. TTL is a very simple (but general) policy and we might even consider it as an attribute of a file, like the number of replicas. It seems it wouldn't introduce much complexity to handle it in the NN.

bq. Another benefit of having it inside the NN is that we don't have to handle the authentication/authorization problem in a separate system. For example, we have a shared HDFS cluster for many internal users, and we don't want someone to set a TTL policy on someone else's files. The NN could handle this easily with its own authentication/authorization mechanism.

I agree that running jobs over the namespace without MR should be the direction to go. However, I think the main obstacle here is that the design mixes the mechanism (running jobs over the namespace without MR) and the policy (TTL) together. As [~cmccabe] pointed out earlier, every user has his/her own policy. Given that HDFS has a wide range of users, this type of design/implementation is unlikely to fly in the ecosystem. Currently HDFS does not have the above mechanism; you're more than welcome to contribute a patch.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6404) HttpFS should use a 000 umask for mkdir and create operations
[ https://issues.apache.org/jira/browse/HDFS-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Yoder reassigned HDFS-6404:
--------------------------------
    Assignee: Mike Yoder  (was: Alejandro Abdelnur)

> HttpFS should use a 000 umask for mkdir and create operations
> -------------------------------------------------------------
>
>                 Key: HDFS-6404
>                 URL: https://issues.apache.org/jira/browse/HDFS-6404
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Mike Yoder
>
> The FileSystem created by HttpFS should use a 000 umask so as not to affect the permissions set by the client, since it is the client's responsibility to resolve the right permissions based on the client's umask.

-- This message was sent by Atlassian JIRA (v6.2#6252)
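The umask arithmetic behind this bug is worth spelling out. A filesystem creates files with `effective = requested & ~umask`; if the HttpFS server-side FileSystem applies a non-zero umask, it strips permission bits from a mode the remote client has already resolved against its own umask. With a 000 umask the server applies no mask at all and the client's mode survives. A minimal sketch (the helper name is invented for illustration):

```java
// Demonstrates why HttpFS wants a 000 server-side umask: any non-zero umask
// silently removes bits from the mode the client already computed.
public class UmaskDemo {
    // POSIX-style mode creation: effective = requested & ~umask.
    static int applyUmask(int requestedOctal, int umaskOctal) {
        return requestedOctal & ~umaskOctal;
    }

    public static void main(String[] args) {
        int requested = 0775;  // the mode the HttpFS client asked for
        // A default 022 umask drops the group-write bit: 775 becomes 755.
        System.out.printf("with 022 umask: %o%n", applyUmask(requested, 022));
        // A 000 umask is a no-op, preserving the client's permissions.
        System.out.printf("with 000 umask: %o%n", applyUmask(requested, 0));
    }
}
```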
[jira] [Commented] (HDFS-6310) PBImageXmlWriter should output information about Delegation Tokens
[ https://issues.apache.org/jira/browse/HDFS-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010757#comment-14010757 ] Akira AJISAKA commented on HDFS-6310:
-------------------------------------

bq. If the attacker has access to the image, it's already game over whether oiv accurately dumps the image or not.

I agree with you. [~wheat9], what do you think? If you agree with that, could you review the patch?

> PBImageXmlWriter should output information about Delegation Tokens
> ------------------------------------------------------------------
>
>                 Key: HDFS-6310
>                 URL: https://issues.apache.org/jira/browse/HDFS-6310
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 2.4.0
>            Reporter: Akira AJISAKA
>            Assignee: Akira AJISAKA
>         Attachments: HDFS-6310.patch
>
> Separated from HDFS-6293.
> The 2.4.0 pb-fsimage does contain tokens, but OfflineImageViewer with the -XML option does not show any tokens.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010734#comment-14010734 ] Hangjun Ye commented on HDFS-6382:
----------------------------------

Implementing it outside the NN is definitely another option, and I agree with Colin that it's not feasible to implement a complex clean-up policy (like one based on storage space) inside the NN. TTL is a very simple (but general) policy and we might even consider it as an attribute of a file, like the number of replicas. It seems it wouldn't introduce much complexity to handle it in the NN. Another benefit of having it inside the NN is that we don't have to handle the authentication/authorization problem in a separate system. For example, we have a shared HDFS cluster for many internal users, and we don't want someone to set a TTL policy on someone else's files. The NN could handle this easily with its own authentication/authorization mechanism. So far a TTL-based clean-up policy is good enough for our scenario (Zesheng and I are from the same company, and we support our company's internal usage of Hadoop), and it would be nice to have a simple and workable solution in HDFS.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6286) adding a timeout setting for local read io
[ https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie resolved HDFS-6286.
-----------------------------
    Resolution: Duplicate

> adding a timeout setting for local read io
> ------------------------------------------
>
>                 Key: HDFS-6286
>                 URL: https://issues.apache.org/jira/browse/HDFS-6286
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> Currently, if a write or remote read request hits a sick disk, DFSClient.hdfsTimeout helps the caller return within a bounded time, but it doesn't work for local reads. Take an HBase scan for example:
> DFSInputStream.read -> readWithStrategy -> readBuffer -> BlockReaderLocal.read -> dataIn.read -> FileChannelImpl.read
> If it hits a bad disk, the slow read I/O can take tens of seconds, and what's worse, DFSInputStream.read holds a lock the whole time.
> To my knowledge there's no good mechanism to cancel a running read I/O (please correct me if that's wrong), so my proposal is to wrap the read request in a future with a timeout; if the threshold is reached, we could probably add the local node to the dead-node list...
> Any thoughts?

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6442) Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts
[ https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010723#comment-14010723 ] Hadoop QA commented on HDFS-6442:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12647022/HDFS-6442.1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6992//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6992//console

This message is automatically generated.

> Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-6442
>                 URL: https://issues.apache.org/jira/browse/HDFS-6442
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 2.4.0
>            Reporter: Zesheng Wu
>            Assignee: Zesheng Wu
>            Priority: Minor
>         Attachments: HDFS-6442.1.patch, HDFS-6442.patch
>
> TestEditLogAutoroll and TestStandbyCheckpoints both use ports 10061 and 10062 to set up the mini-cluster; this may result in occasional test failures when running tests with -Pparallel-tests.
-- This message was sent by Atlassian JIRA (v6.2#6252)
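A common idiom for the hard-coded-port collisions described above is to bind to port 0 and let the OS pick a free ephemeral port, then hand that port to the mini-cluster. Whether HDFS-6442 takes this route or simply assigns non-overlapping fixed ports is up to the patch; this sketch only shows the general technique.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Finds a port the OS just proved was free, avoiding two test suites both
// grabbing the same fixed ports (10061/10062) under -Pparallel-tests.
public class FreePort {
    static int findFreePort() throws IOException {
        // Binding to port 0 asks the kernel for any free ephemeral port.
        try (ServerSocket socket = new ServerSocket(0)) {
            socket.setReuseAddress(true);
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        // Two parallel test runs would each get their own (very likely
        // distinct) port instead of colliding on a hard-coded one.
        System.out.println("test run A port: " + findFreePort());
        System.out.println("test run B port: " + findFreePort());
    }
}
```

Note the small race: the port is released before the mini-cluster rebinds it, so another process could grab it in between. In practice this is rare, and retrying on bind failure covers it.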
[jira] [Commented] (HDFS-6286) adding a timeout setting for local read io
[ https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010718#comment-14010718 ] Liang Xie commented on HDFS-6286:
---------------------------------

bq. There is a high overhead to adding communication between threads to every read, and I don't think we want this in short-circuit reads (which is an optimization, after all)

Indeed, I am fine with my prototype not being in the community codebase; this was just a friendly heads-up about this corner case :) It doesn't help regular request performance, only the long-tail requests.

bq. If we create an extra thread per DFSInputStream using SCR

I used a thread pool, so the overhead should be acceptable, and when the timeout/execution exception is caught, the upper layer treats that DN as a dead node immediately, so no pool exhaustion should be observed, per my understanding.

bq. I am going to create a JIRA to implement hedged reads for the non-pread case. I think that will be a better general solution that doesn't have the above-mentioned problems.

Cool. I share some of your concerns, and I totally agree that we need a more general solution in the community code, like hedged reads for regular reads. Let's work on HDFS-6450 now and close this one.

-- This message was sent by Atlassian JIRA (v6.2#6252)
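The prototype pattern discussed in this thread, a shared thread pool plus `Future#get` with a timeout around the read, can be sketched with the JDK alone. This is a hedged illustration of the approach, not the actual prototype; on timeout the caller would mark the local replica as dead, and the thread stuck on the sick disk remains the cost that motivated the more general hedged-reads solution.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Bounds a potentially hanging local read with a Future timeout, as in the
// prototype described above.
public class BoundedRead {
    // Daemon threads so idle pool workers never keep the JVM alive.
    static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    static int readWithTimeout(Callable<Integer> readTask, long timeoutMs)
            throws Exception {
        Future<Integer> f = POOL.submit(readTask);
        try {
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true);  // best effort; a read stuck in disk I/O may not stop
            return -1;       // caller would add the local node to the dead list
        }
    }

    public static void main(String[] args) throws Exception {
        // A healthy read completes normally...
        System.out.println(readWithTimeout(() -> 42, 1000));
        // ...while a read stuck on a sick disk is abandoned after the timeout.
        System.out.println(readWithTimeout(() -> { Thread.sleep(5000); return 0; }, 100));
    }
}
```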
[jira] [Commented] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010717#comment-14010717 ] Hadoop QA commented on HDFS-6056:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12647017/HDFS-6056.009.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 11 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-nfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6991//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6991//console

This message is automatically generated.

> Clean up NFS config settings
> ----------------------------
>
>                 Key: HDFS-6056
>                 URL: https://issues.apache.org/jira/browse/HDFS-6056
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>    Affects Versions: 2.3.0
>            Reporter: Aaron T. Myers
>            Assignee: Brandon Li
>         Attachments: HDFS-6056.001.patch, HDFS-6056.002.patch, HDFS-6056.003.patch, HDFS-6056.004.patch, HDFS-6056.005.patch, HDFS-6056.006.patch, HDFS-6056.007.patch, HDFS-6056.008.patch, HDFS-6056.009.patch
>
> As discussed on HDFS-6050, there are a few opportunities to improve the config settings related to NFS. This JIRA is to implement those changes, which include: moving hdfs-nfs related properties into the hadoop-hdfs-nfs project, and replacing 'nfs3' with 'nfs' in the property names.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6448) change BlockReaderLocalLegacy timeout detail
[ https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010701#comment-14010701 ] Liang Xie commented on HDFS-6448:
---------------------------------

Thanks Colin for your comment; now I begin to understand why the BlockReaderLocalLegacy class is still in trunk :) and I am also glad to see this timeout issue doesn't exist in the HDFS-347 SCR.

> change BlockReaderLocalLegacy timeout detail
> --------------------------------------------
>
>                 Key: HDFS-6448
>                 URL: https://issues.apache.org/jira/browse/HDFS-6448
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-6448.txt
>
> Our HBase is deployed on hadoop-2.0. In one incident we hit HDFS-5016 on the HDFS side, but we also found, on the HBase side, that the DFS client was hung at getBlockReader. After reading the code, we found there is a timeout setting in the current codebase, but the default hdfsTimeout value is "-1" (from Client.java:getTimeout(conf)), which means no timeout...
> The hung stack trace looks like the following:
> at $Proxy21.getBlockLocalPathInfo(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215)
> at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267)
> at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812)
> One feasible fix is replacing the hdfsTimeout with socketTimeout; see the attached patch. Most of the credit should go to [~liushaohui].

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6452) ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization
[ https://issues.apache.org/jira/browse/HDFS-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010672#comment-14010672 ] Gera Shegalov commented on HDFS-6452:
-------------------------------------

Lohit is correct: when we implement a "readable standby" similar to what is provided by some database systems, the fraction of failed requests even in the "normal case" will be well below 50%. Making randomization optional is a good idea.

> ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-6452
>                 URL: https://issues.apache.org/jira/browse/HDFS-6452
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ha, hdfs-client
>    Affects Versions: 2.4.0
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>
> We observe that clients iterate proxies in a fixed order. Depending on the order of namenodes in dfs.ha.namenodes. (e.g. 'nn1,nn2') and the current standby (nn1), all clients will hit nn1 first and then fail over to nn2. Chatting with [~lohit], we think we can simply select the initial value of {{currentProxyIndex}} randomly and keep the logic of {{performFailover}} iterating from left to right. This should halve the unnecessary load on the standby NN.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6452) ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization
[ https://issues.apache.org/jira/browse/HDFS-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010660#comment-14010660 ] Lohit Vijayarenu commented on HDFS-6452:
----------------------------------------

On our clusters we see that about 70-75% of the load is read-only (getFileInfo, listStatus, getBlockLocation). Because of this we have been thinking about enabling NameNode stale reads. If we do that, then having clients pick a random NameNode would distribute clients across both NameNodes. How about an option for the client to randomize which NN to talk to first? By default ConfiguredFailoverProxyProvider would behave as it does today, but an option to randomize would be useful.

-- This message was sent by Atlassian JIRA (v6.2#6252)
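The proposal in this thread (an optional random initial index, with `performFailover` keeping its left-to-right rotation) can be sketched as below. Field and method names only loosely mirror `ConfiguredFailoverProxyProvider`; this is not the patch, just an illustration of the selection logic.

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of optional randomized initial proxy selection. With randomization
// off, every client starts at index 0 and hammers nn1 first (today's
// behavior); with it on, clients spread their first attempt across NNs.
public class RandomizedProxySelection {
    private final String[] namenodes;
    private int currentProxyIndex;

    RandomizedProxySelection(String[] namenodes, boolean randomize) {
        this.namenodes = namenodes;
        this.currentProxyIndex =
            randomize ? ThreadLocalRandom.current().nextInt(namenodes.length) : 0;
    }

    String currentProxy() {
        return namenodes[currentProxyIndex];
    }

    // Unchanged left-to-right rotation on failover.
    void performFailover() {
        currentProxyIndex = (currentProxyIndex + 1) % namenodes.length;
    }

    public static void main(String[] args) {
        RandomizedProxySelection p =
            new RandomizedProxySelection(new String[] {"nn1", "nn2"}, true);
        String first = p.currentProxy();   // nn1 or nn2, at random
        p.performFailover();
        // After one failover the client is on the other NN.
        System.out.println(first + " -> " + p.currentProxy());
    }
}
```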
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010657#comment-14010657 ] Jian Wang commented on HDFS-6382:
---------------------------------

I think it would be better to provide a (backup & clean-up) platform for your users; you can implement many clean-up strategies for the users in your company. This can eliminate a lot of repeated work.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010632#comment-14010632 ] Colin Patrick McCabe commented on HDFS-6382:
--------------------------------------------

bq. Why do you think that putting the cleanup mechanism into the NameNode seems questionable, can you point out some details?

Andrew and Chris commented about this earlier. See: https://issues.apache.org/jira/browse/HDFS-6382?focusedCommentId=13998933&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13998933

I would add to that:
* Every user of this is going to want a slightly different deletion policy. It's just way too much configuration for the NameNode to reasonably handle; much easier to do it in a user process. For example, maybe you want to keep at least 100 GB of logs, 100 GB of "foo" data, and 1000 GB of "bar" data. It's easy to handle this complexity in a user process, and incredibly complex and frustrating to handle it in the NameNode.
* Your nightly MR job (or whatever) also needs to be able to do things like email sysadmins when the disks are filling up, which the NameNode can't reasonably be expected to do.
* I don't see a big advantage to doing this in the NameNode, and I see a lot of disadvantages (more complexity to maintain, difficult configuration, need to restart to update config).

Maybe I could be convinced otherwise, but so far the only argument I've seen for doing it in the NN is that it would be re-usable, and that could just as easily apply to an implementation outside the NN. For example, as I pointed out earlier, DistCp is reusable without being in the NameNode.
-- This message was sent by Atlassian JIRA (v6.2#6252)
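The user-process approach argued for in this thread, a periodic job that applies retention policies and deletes expired files, can be sketched as below. For self-containment this runs against `java.nio.file` on a local directory; a real cleanup job would walk HDFS with `org.apache.hadoop.fs.FileSystem#listStatus` and check `FileStatus#getModificationTime` instead. The class and method names are invented for the sketch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;

// Sketch of an out-of-NameNode TTL cleanup job: delete every regular file
// under `root` whose modification time is older than `ttl`. Each dataset can
// run this with its own TTL (or any other policy) without NN involvement.
public class TtlCleanupJob {
    static int cleanup(Path root, Duration ttl, Instant now) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(root)) {
            for (Path f : files) {
                if (!Files.isRegularFile(f)) continue;
                FileTime mtime = Files.getLastModifiedTime(f);
                if (mtime.toInstant().plus(ttl).isBefore(now)) {
                    Files.delete(f);  // a real job might move to trash instead
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("ttl-demo");
        Path old = Files.createFile(dir.resolve("old.log"));
        Path fresh = Files.createFile(dir.resolve("fresh.log"));
        // Backdate one file by 40 days, then apply a 30-day retention policy.
        Files.setLastModifiedTime(old,
            FileTime.from(Instant.now().minus(Duration.ofDays(40))));
        int n = cleanup(dir, Duration.ofDays(30), Instant.now());
        System.out.println(n + " deleted; fresh.log kept: " + Files.exists(fresh));
    }
}
```

Run periodically (cron, Oozie, or a nightly MR job), this keeps the policy and its alerting in user space, which is the crux of the argument above.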
[jira] [Commented] (HDFS-6452) ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization
[ https://issues.apache.org/jira/browse/HDFS-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010631#comment-14010631 ] Gera Shegalov commented on HDFS-6452:
-------------------------------------

Hi Aaron, the net improvement should be in the smoothness of the overhead over time. E.g., we will smooth out the storm of {{StandbyException: Operation category READ is not supported in state standby}}. [~jingzhao], this is targeted at deployments with automatic failover, where the emphasis is on not trying to track which NN is active all the time.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6452) ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization
[ https://issues.apache.org/jira/browse/HDFS-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010623#comment-14010623 ] Jing Zhao commented on HDFS-6452: - I agree with [~atm]. NameNode failover is not common in practice, and the administrator can easily control which NN is the active one while starting the cluster. Given that, randomizing currentProxyIndex on client initialization will actually increase the number of RPCs in normal cases. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6442) Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts
[ https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010609#comment-14010609 ] Arpit Agarwal commented on HDFS-6442: - +1 pending Jenkins. Thanks for incorporating the suggestion. > Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port > conflicts > -- > > Key: HDFS-6442 > URL: https://issues.apache.org/jira/browse/HDFS-6442 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Minor > Attachments: HDFS-6442.1.patch, HDFS-6442.patch > > > TestEditLogAutoroll and TestStandbyCheckpoints both use ports 10061 and 10062 to > set up the mini-cluster, which may result in occasional test failures when > running tests with -Pparallel-tests. -- This message was sent by Atlassian JIRA (v6.2#6252)
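The underlying fix is to stop hard-coding ports like 10061/10062 in tests. A common pattern (a sketch of the general technique, not the actual patch, which may differ) is to bind port 0 and let the OS hand back a free ephemeral port before wiring it into the mini-cluster configuration:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Ask the OS for a currently unused port by binding to port 0.
// Closing the socket frees the port for the test server to reuse;
// there is a small race window, but it avoids fixed-port collisions
// between test classes running in parallel.
public class FreePortFinder {
    public static int freePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            socket.setReuseAddress(true);
            return socket.getLocalPort(); // OS-assigned ephemeral port
        }
    }
}
```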
[jira] [Commented] (HDFS-6452) ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization
[ https://issues.apache.org/jira/browse/HDFS-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010598#comment-14010598 ] Aaron T. Myers commented on HDFS-6452: -- Hi Gera, while this will obviously halve the amount of errant RPCs made to a standby NN in the situation where all clients connect to the standby NN first, it will of course also cut in half the number of RPCs that connect to the active the first time in the situation where all clients connect to the active first. Given that, is this proposal really a net improvement? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6452) ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization
Gera Shegalov created HDFS-6452: --- Summary: ConfiguredFailoverProxyProvider should randomize currentProxyIndex on initialization Key: HDFS-6452 URL: https://issues.apache.org/jira/browse/HDFS-6452 Project: Hadoop HDFS Issue Type: Improvement Components: ha, hdfs-client Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov We observe that the clients iterate proxies in the fixed order. Depending on the order of namenodes in dfs.ha.namenodes. (e.g. 'nn1,nn2') and the current standby (nn1), all the clients will hit nn1 first, and then failover to nn2. Chatting with [~lohit] we think we can simply select the initial value of {{currentProxyIndex}} randomly, and keep the logic of {{performFailover}} of iterating from left-to-right. This should halve the unnecessary load on standby NN. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6442) Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts
[ https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6442: - Summary: Fix TestEditLogAutoroll and TestStandbyCheckpoints failure caused by port conflicts (was: Fix TestEditLogAutoroll failure caused by port conflicts) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6442) Fix TestEditLogAutoroll failure caused by port conflicts
[ https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6442: - Attachment: HDFS-6442.1.patch Updated the patch to address Arpit's comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6056: - Attachment: HDFS-6056.009.patch Rebased the patch. > Clean up NFS config settings > > > Key: HDFS-6056 > URL: https://issues.apache.org/jira/browse/HDFS-6056 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.3.0 >Reporter: Aaron T. Myers >Assignee: Brandon Li > Attachments: HDFS-6056.001.patch, HDFS-6056.002.patch, > HDFS-6056.003.patch, HDFS-6056.004.patch, HDFS-6056.005.patch, > HDFS-6056.006.patch, HDFS-6056.007.patch, HDFS-6056.008.patch, > HDFS-6056.009.patch > > > As discussed on HDFS-6050, there's a few opportunities to improve the config > settings related to NFS. This JIRA is to implement those changes, which > include: moving hdfs-nfs related properties into hadoop-hdfs-nfs project, and > replacing 'nfs3' with 'nfs' in the property names. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6056: - Attachment: (was: HDFS-6056.009.patch) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010545#comment-14010545 ] Hadoop QA commented on HDFS-6056: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647014/HDFS-6056.009.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6990//console This message is automatically generated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6056: - Attachment: HDFS-6056.009.patch Uploaded a new patch to address Aaron's comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010520#comment-14010520 ] Zesheng Wu commented on HDFS-6382: -- bq. Like I said, we should write such a tool and add it to the base Hadoop distribution. This is similar to what we did with DistCp. Then users would not need to write their own versions of this stuff. Sure, this is another good option. bq. It's important to distinguish between creating a tool to handle deleting old files (which we all agree we should do), and putting this into the NameNode (which seems questionable). Why do you think that putting the cleanup mechanism into the NameNode seems questionable? Can you point out some details? -- This message was sent by Atlassian JIRA (v6.2#6252)
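The "write a tool, like DistCp" option discussed here can be sketched as a sweep over a directory that deletes entries older than a TTL. The sketch below uses java.nio against a local directory purely for illustration; a real tool would use Hadoop's FileSystem API (listStatus/delete) against HDFS, recurse into subdirectories, and honor the trash and empty-directory settings from the proposal:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

// Minimal TTL sweep: scan one directory (non-recursive) and delete
// regular files whose modification time is older than the TTL cutoff.
public class TtlSweeper {
    public static int sweep(Path root, Duration ttl) throws IOException {
        Instant cutoff = Instant.now().minus(ttl);
        int deleted = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(root)) {
            for (Path p : entries) {
                if (Files.isRegularFile(p)
                        && Files.getLastModifiedTime(p).toInstant().isBefore(cutoff)) {
                    Files.delete(p); // a real tool might move to trash instead
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

Run periodically (e.g. from cron or an Oozie coordinator), this covers the "delete logs older than 1 month" scenario without any NameNode change.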
[jira] [Commented] (HDFS-5682) Heterogeneous Storage phase 2 - APIs to expose Storage Types
[ https://issues.apache.org/jira/browse/HDFS-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010512#comment-14010512 ] Zesheng Wu commented on HDFS-5682: -- Thanks for the responses, [~arpitagarwal]. bq. The function name should communicate that this is disk space quota for a specific storage type, as opposed to the overall quotas which are set with setQuota. If the proposed name is hard to follow, how about get/setQuotaByStorageType? Yes, get/setQuotaByStorageType will be clearer. bq. Let's defer this for now. The API and protocol can both be easily extended in a backwards compatible manner in the future without affecting existing applications. OK bq. We have to differentiate between quota unavailability vs disk space availability. The former will result in a quota violation exception, the latter will result in the behavior you described. We discuss the reasons for this in the HDFS-2832 design doc. Got it, thanks. I will look into the HDFS-2832 doc for more details. > Heterogeneous Storage phase 2 - APIs to expose Storage Types > > > Key: HDFS-5682 > URL: https://issues.apache.org/jira/browse/HDFS-5682 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: 20140522-Heterogeneous-Storages-API.pdf > > > Phase 1 (HDFS-2832) added support to present the DataNode as a collection of > discrete storages of different types. > This Jira is to track phase 2 of the Heterogeneous Storage work which > involves exposing Storage Types to applications and adding Quota Management > support for administrators. > This phase will also include tools support for administrators/users. -- This message was sent by Atlassian JIRA (v6.2#6252)
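The distinction drawn above between a quota violation and plain disk-space unavailability can be illustrated with a toy per-storage-type quota tracker. The method names follow the get/setQuotaByStorageType naming discussed in the thread, but the whole class is hypothetical, not the real HDFS API:

```java
import java.util.EnumMap;
import java.util.Map;

// Toy model: quotas and usage tracked per storage type. Exceeding the
// quota is a "quota violation" (tryAllocate returns false), which is a
// different condition from a DataNode simply having no space of that type.
public class StorageTypeQuota {
    public enum StorageType { DISK, SSD, ARCHIVE }

    private final Map<StorageType, Long> quota = new EnumMap<>(StorageType.class);
    private final Map<StorageType, Long> used = new EnumMap<>(StorageType.class);

    public void setQuotaByStorageType(StorageType t, long bytes) {
        quota.put(t, bytes);
    }

    public long getQuotaByStorageType(StorageType t) {
        return quota.getOrDefault(t, Long.MAX_VALUE); // unset means unlimited
    }

    // False when the allocation would exceed the quota for this type.
    public boolean tryAllocate(StorageType t, long bytes) {
        long u = used.getOrDefault(t, 0L);
        if (u + bytes > getQuotaByStorageType(t)) return false;
        used.put(t, u + bytes);
        return true;
    }
}
```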
[jira] [Commented] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010503#comment-14010503 ] Brandon Li commented on HDFS-6451: -- From [~zhongyi-altiscale]: {quote}Hi Jing Zhao, it's definitely good to have a single exception handler instead of replicating the same code everywhere, but since each server procedure (ACCESS, GETATTR, FSSTAT, etc.) might have its own private data that needs to be written out, the child NFS3Response classes still need to override writeHeaderAndResponse anyway for AccessControlException. Do you mean we need to catch it together with AuthorizationException in RpcProgramNfs3.java? Or do you mean we need to examine the whole codebase, looking for every function that could potentially throw AccessControlException, and make sure the error code is set correctly in the catch clause?{quote} > NFS should not return NFS3ERR_IO for AccessControlException > > > Key: HDFS-6451 > URL: https://issues.apache.org/jira/browse/HDFS-6451 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Reporter: Brandon Li > > As [~jingzhao] pointed out in HDFS-6411, we need to catch the > AccessControlException from the HDFS calls, and return NFS3ERR_PERM instead > of NFS3ERR_IO for it. > Another possible improvement is to have a single class/method for the common > exception handling process, instead of repeating the same exception handling > process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
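The "single class/method for the common exception handling" improvement can be sketched as one mapper from exception type to NFS3 status code. The numeric values below are the standard RFC 1813 nfsstat3 codes; the class name and the string-based type matching are illustrative only (real code would catch the concrete Hadoop exception types directly):

```java
// Centralized exception-to-status mapping, so each NFS procedure
// (ACCESS, GETATTR, FSSTAT, ...) reports permission failures uniformly
// instead of falling back to a misleading NFS3ERR_IO.
public class Nfs3ErrorMapper {
    public static final int NFS3_OK = 0;
    public static final int NFS3ERR_PERM = 1;    // caller is not the owner / not permitted
    public static final int NFS3ERR_IO = 5;      // generic fallback
    public static final int NFS3ERR_ACCES = 13;  // permission denied

    public static int statusFor(Throwable t) {
        // Match by simple class name to keep this sketch dependency-free;
        // real code would catch org.apache.hadoop exception types.
        String name = t.getClass().getSimpleName();
        if (name.equals("AccessControlException")) return NFS3ERR_PERM;
        if (name.equals("AuthorizationException")) return NFS3ERR_ACCES;
        return NFS3ERR_IO;
    }
}
```

Each procedure would then build its own response payload but route the thrown exception through this one mapper, which answers the duplication concern without forcing every NFS3Response subclass to repeat the catch logic.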
[jira] [Commented] (HDFS-6442) Fix TestEditLogAutoroll failure caused by port conflicts
[ https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010488#comment-14010488 ] Zesheng Wu commented on HDFS-6442: -- Thanks [~arpitagarwal], OK, I will make it more general as you suggested. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010486#comment-14010486 ] Hudson commented on HDFS-6411: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5613 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5613/]) HDFS-6411. nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597895) * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/response/ACCESS3Response.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Fix For: 2.4.1 > > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.004.patch, HDFS-6411.patch, > tcpdump-HDFS-6411-Brandon.out > > > We use the nfs-hdfs gateway to expose hdfs thru nfs. > 0) login as root, run nfs-hdfs gateway as a user, say, nfsserver. 
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs > backups hive mr-history system tmp user > 1) add a user nfs-test: adduser nfs-test(make sure that this user is not a > proxyuser of nfsserver > 2) switch to test user: su - nfs-test > 3) access hdfs nfs gateway > [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs > ls: cannot open directory /hdfs: Input/output error > retry: > [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs > ls: cannot access /hdfs: Stale NFS file handle > 4) switch back to root and access hdfs nfs gateway > [nfs-test@zhongyi-test-cluster-desktop ~]$ exit > logout > [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs > ls: cannot access /hdfs: Stale NFS file handle > the nfsserver log indicates we hit an authorization error in the rpc handler; > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): > User: nfsserver is not allowed to impersonate nfs-test > and NFS3ERR_IO is returned, which explains why we see input/output error. > One can catch the authorizationexception and return the correct error: > NFS3ERR_ACCES to fix the error message on the client side but that doesn't > seem to solve the mount hang issue though. When the mount hang happens, it > stops printing nfsserver log which makes it more difficult to figure out the > real cause of the hang. According to jstack and debugger, the nfsserver seems > to be waiting for client requests -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010480#comment-14010480 ] Hadoop QA commented on HDFS-6376: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647011/HDFS-6376-4-branch-2.4.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6989//console This message is automatically generated. > Distcp data between two HA clusters requires another configuration > -- > > Key: HDFS-6376 > URL: https://issues.apache.org/jira/browse/HDFS-6376 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, federation, hdfs-client >Affects Versions: 2.3.0, 2.4.0 > Environment: Hadoop 2.3.0 >Reporter: Dave Marion > Fix For: 2.4.1 > > Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, > HDFS-6376-4-branch-2.4.patch, HDFS-6376-branch-2.4.patch, > HDFS-6376-patch-1.patch > > > User has to create a third set of configuration files for distcp when > transferring data between two HA clusters. > Consider the scenario in [1]. You cannot put all of the required properties > in core-site.xml and hdfs-site.xml for the client to resolve the location of > both active namenodes. If you do, then the datanodes from cluster A may join > cluster B. I can not find a configuration option that tells the datanodes to > federate blocks for only one of the clusters in the configuration. > [1] > http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6411: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6411: - Fix Version/s: 2.4.1 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010479#comment-14010479 ] Brandon Li commented on HDFS-6411: -- Thank you, guys. I've committed the patch. Let's move further discussions of the code optimization to HDFS-6451. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Marion updated HDFS-6376: -- Status: Open (was: Patch Available) forgot --no-prefix in patch 3 > Distcp data between two HA clusters requires another configuration > -- > > Key: HDFS-6376 > URL: https://issues.apache.org/jira/browse/HDFS-6376 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, federation, hdfs-client >Affects Versions: 2.4.0, 2.3.0 > Environment: Hadoop 2.3.0 >Reporter: Dave Marion > Fix For: 2.4.1 > > Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, > HDFS-6376-4-branch-2.4.patch, HDFS-6376-branch-2.4.patch, > HDFS-6376-patch-1.patch > > > User has to create a third set of configuration files for distcp when > transferring data between two HA clusters. > Consider the scenario in [1]. You cannot put all of the required properties > in core-site.xml and hdfs-site.xml for the client to resolve the location of > both active namenodes. If you do, then the datanodes from cluster A may join > cluster B. I can not find a configuration option that tells the datanodes to > federate blocks for only one of the clusters in the configuration. > [1] > http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Marion updated HDFS-6376: -- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010472#comment-14010472 ] Zhongyi Xie commented on HDFS-6411: --- Hi [~jingzhao], it's definitely good to have a single exception handler instead of replicating the same code everywhere. However, since each server procedure (ACCESS, GETATTR, FSSTAT, etc.) may have private data that needs to be written out, the child NFS3Response classes still need to override writeHeaderAndResponse anyway. For AccessControlException, do you mean we should catch it together with AuthorizationException in RpcProgramNfs3.java, or that we should examine the whole codebase for every function that could potentially throw AccessControlException and make sure the error code is set correctly in the catch clause? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6376) Distcp data between two HA clusters requires another configuration
[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Marion updated HDFS-6376: -- Attachment: HDFS-6376-4-branch-2.4.patch fix patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010460#comment-14010460 ] Aaron T. Myers commented on HDFS-6056: -- bq. OK. Let's further simplify the config for HDFS. Other NFS implementations can add their own prefix if they need. Seems fine to me. Consistency of the HDFS configs/docs is really all I'm concerned with. {quote} I am actually not very concerned about the deprecation of keys in Common hadoop-nfs. The reason is that, 1) most of them are hdfs-nfs related, 2) the rest of them are all hidden keys and used for debug purpose except "dfs.nfs.exports.allowed.hosts". Even for "dfs.nfs.exports.allowed.hosts", we can add the deprecation declaration into Configuration#defaultDeprecations and remove it from Configuration after a couple releases. I will update the patch if this sounds ok to you. {quote} Sure, that sounds fine. Not the prettiest solution in the world, but certainly seems like it should work. > Clean up NFS config settings > > > Key: HDFS-6056 > URL: https://issues.apache.org/jira/browse/HDFS-6056 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.3.0 >Reporter: Aaron T. Myers >Assignee: Brandon Li > Attachments: HDFS-6056.001.patch, HDFS-6056.002.patch, > HDFS-6056.003.patch, HDFS-6056.004.patch, HDFS-6056.005.patch, > HDFS-6056.006.patch, HDFS-6056.007.patch, HDFS-6056.008.patch > > > As discussed on HDFS-6050, there's a few opportunities to improve the config > settings related to NFS. This JIRA is to implement those changes, which > include: moving hdfs-nfs related properties into hadoop-hdfs-nfs project, and > replacing 'nfs3' with 'nfs' in the property names. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010455#comment-14010455 ] Brandon Li commented on HDFS-6411: -- Thank you, [~jingzhao], for the review. I've filed HDFS-6451 to track the improvement you described. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6451) NFS should not return NFS3ERR_IO for AccessControlException
Brandon Li created HDFS-6451: Summary: NFS should not return NFS3ERR_IO for AccessControlException Key: HDFS-6451 URL: https://issues.apache.org/jira/browse/HDFS-6451 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Brandon Li As [~jingzhao] pointed out in HDFS-6411, we need to catch the AccessControlException from the HDFS calls, and return NFS3ERR_PERM instead of NFS3ERR_IO for it. Another possible improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling process in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
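The "single class/method for the common exception handling process" idea proposed in HDFS-6451 can be sketched as a small translation table from exception types to NFS3 status codes. This is only an illustrative sketch: the class and method names are hypothetical, `AccessControlException` below is a local stand-in for `org.apache.hadoop.security.AccessControlException`, and only the status-code values are standard (they follow RFC 1813, the NFSv3 specification).

```java
import java.io.IOException;

public class Nfs3ErrorMapper {
    /** Local stand-in for Hadoop's AccessControlException (hypothetical). */
    public static class AccessControlException extends IOException {
        public AccessControlException(String msg) { super(msg); }
    }

    // Status code values per RFC 1813 (NFS version 3 protocol).
    public static final int NFS3_OK = 0;
    public static final int NFS3ERR_PERM = 1;   // operation not permitted
    public static final int NFS3ERR_IO = 5;     // hard I/O error
    public static final int NFS3ERR_ACCES = 13; // permission denied

    /**
     * Single place to translate server-side exceptions into NFS3 status
     * codes, instead of repeating the same catch clauses in every NFS
     * procedure (ACCESS, GETATTR, FSSTAT, ...).
     */
    public static int mapException(Exception e) {
        if (e instanceof AccessControlException) {
            return NFS3ERR_PERM; // previously surfaced as NFS3ERR_IO
        }
        if (e instanceof IOException) {
            return NFS3ERR_IO;
        }
        return NFS3ERR_IO; // conservative default for unknown failures
    }
}
```

Each NFS procedure would then call the shared mapper in its catch block rather than choosing a status code inline.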
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010439#comment-14010439 ] Jing Zhao commented on HDFS-6411: - The current patch looks good to me. +1 One issue with the current code is that we may also want to catch the AccessControlException from the HDFS calls and return NFS3ERR_PERM instead of NFS3ERR_IO for it, but I guess we can do that in a separate jira. Another possible future improvement is to have a single class/method for the common exception handling process, instead of repeating the same exception handling in different NFS methods. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6442) Fix TestEditLogAutoroll failure caused by port conflicts
[ https://issues.apache.org/jira/browse/HDFS-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010427#comment-14010427 ] Arpit Agarwal commented on HDFS-6442: - Hi [~wuzesheng], the patch looks good, but would you consider an approach like HDFS-6443, i.e. randomized port selection + retries? > Fix TestEditLogAutoroll failure caused by port conflicts > --- > > Key: HDFS-6442 > URL: https://issues.apache.org/jira/browse/HDFS-6442 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu >Priority: Minor > Attachments: HDFS-6442.patch > > > TestEditLogAutoroll and TestStandbyCheckpoints both use ports 10061 and 10062 to > set up the mini-cluster, which may occasionally result in test failures when > running tests with -Pparallel-tests. -- This message was sent by Atlassian JIRA (v6.2#6252)
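The "randomized port selection + retries" approach suggested above can be sketched with the standard library: instead of hard-coding ports like 10061, a test picks a random port from the ephemeral range and retries on bind failure. The helper name is made up; this is not the actual MiniDFSCluster or HDFS-6443 code.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Random;

public class RandomPortPicker {
    private static final Random RAND = new Random();

    /**
     * Try up to maxRetries random ports in [49152, 65535] (the IANA dynamic
     * range) until one binds successfully, so parallel test runs do not
     * collide on a fixed port.
     */
    public static int pickFreePort(int maxRetries) throws IOException {
        IOException last = null;
        for (int i = 0; i < maxRetries; i++) {
            int candidate = 49152 + RAND.nextInt(65536 - 49152);
            try (ServerSocket s = new ServerSocket(candidate)) {
                return s.getLocalPort(); // bind succeeded; port was free
            } catch (IOException e) {
                last = e; // port in use; retry with a new random port
            }
        }
        throw new IOException("no free port after " + maxRetries + " tries", last);
    }
}
```

The socket is closed immediately after the probe, so there is a small race between picking the port and the test server binding it; retrying in the test setup covers that as well.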
[jira] [Commented] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010419#comment-14010419 ] Brandon Li commented on HDFS-6056: -- Thanks, Aaron. {quote}move IdUserGroup#NFS_STATIC_MAPPING_FILE_KEY out of IdUserGroup and put it with all the other config names{quote} Sure. {quote}...so I would anticipate user confusion of which configs do and do not start with "dfs."{quote} OK. Let's further simplify the config for HDFS. Other NFS implementations can add their own prefix if they need to. {quote}...if there were some other project which only depended upon the Common hadoop-nfs project, the config deprecations would not be loaded. {quote} I am actually not very concerned about the deprecation of keys in Common hadoop-nfs. The reasons are that 1) most of them are hdfs-nfs related, and 2) the rest are all hidden keys used for debug purposes, except "dfs.nfs.exports.allowed.hosts". Even for "dfs.nfs.exports.allowed.hosts", we can add the deprecation declaration into Configuration#defaultDeprecations and remove it from Configuration after a couple of releases. I will update the patch if this sounds ok to you. -- This message was sent by Atlassian JIRA (v6.2#6252)
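The deprecation mechanism discussed above works by mapping old key names to their replacements so that existing configs keep working for a couple of releases (in Hadoop this lives in `Configuration#addDeprecations` with `DeprecationDelta` entries). A minimal stdlib-only sketch of the idea, with an illustrative class name and the one key rename from this discussion:

```java
import java.util.HashMap;
import java.util.Map;

public class DeprecatedKeys {
    // old key -> replacement key, e.g. dropping the "dfs." prefix per HDFS-6056
    private static final Map<String, String> DEPRECATIONS = new HashMap<>();
    static {
        DEPRECATIONS.put("dfs.nfs.exports.allowed.hosts", "nfs.exports.allowed.hosts");
    }

    /**
     * Resolve a possibly-deprecated key to its current name, warning once
     * per lookup so admins know to migrate their configs.
     */
    public static String resolve(String key) {
        String replacement = DEPRECATIONS.get(key);
        if (replacement != null) {
            System.err.println("WARN: " + key + " is deprecated; use " + replacement);
            return replacement;
        }
        return key;
    }
}
```

The concern raised in the review is exactly about where this table is registered: if it is only populated by a class in hadoop-hdfs-nfs, a project depending solely on hadoop-nfs never loads it.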
[jira] [Updated] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu updated HDFS-6447: -- Attachment: HDFS-6447.002.patch There is no new test since the change just adds a timestamp to a log message. Sample output: May 27, 2014 2:20:25 PM Balancing took 3.087 seconds > balancer should timestamp the completion message > > > Key: HDFS-6447 > URL: https://issues.apache.org/jira/browse/HDFS-6447 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Reporter: Allen Wittenauer >Assignee: Juan Yu >Priority: Trivial > Labels: newbie > Attachments: HDFS-6447.002.patch, HDFS-6447.patch.001 > > > When the balancer finishes, it doesn't report the time it finished. It > should do this so that users have a better sense of how long it took to > complete. -- This message was sent by Atlassian JIRA (v6.2#6252)
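Producing the timestamped completion line shown in the sample output is straightforward; a hedged sketch (the class and method names are illustrative, not the actual Balancer code, and the date pattern is inferred from the "May 27, 2014 2:20:25 PM" sample):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class BalancerMessage {
    /** Prefix the completion message with a human-readable timestamp. */
    public static String completionMessage(Date when, double seconds) {
        // "MMM d, yyyy h:mm:ss a" renders e.g. "May 27, 2014 2:20:25 PM"
        SimpleDateFormat fmt = new SimpleDateFormat("MMM d, yyyy h:mm:ss a", Locale.US);
        return fmt.format(when) + "  Balancing took " + seconds + " seconds";
    }
}
```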
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010267#comment-14010267 ] Zhongyi Xie commented on HDFS-6411: --- Hi [~brandonli], I've tested it on my VM and it looks like the problem is fixed. However, I did see something interesting: [alti-test-02@alexie-dt root]$ mkdir /hdfs/tmp/dir mkdir: cannot create directory `/hdfs/tmp/dir': Permission denied [alti-test-02@alexie-dt root]$ rmdir /hdfs/tmp rmdir: failed to remove `/hdfs/tmp': Permission denied [alti-test-02@alexie-dt root]$ rmdir /hdfs/ rmdir: failed to remove `/hdfs/': Permission denied [alti-test-02@alexie-dt root]$ ls /hdfs ls: cannot access /hdfs: Stale file handle But once I log out of the alti-test-02 user back to root, the NFS handle still works: [root@alexie-dt ~]# ls /hdfs backups hive mr-history system tmp user When I retried these steps the problem went away (i.e. I didn't see the Stale file handle again), so unless there are consistent repro steps and the handle hangs again, I won't worry about that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010265#comment-14010265 ] Aaron T. Myers commented on HDFS-6056: -- Hey Brandon, latest patch looks pretty good to me. A few comments for you: # Seems like we should move {{IdUserGroup#NFS_STATIC_MAPPING_FILE_KEY}} out of {{IdUserGroup}} and put it with all the other config names. # I still find it unfortunate that we now have some configs which start with "nfs." and others which start with "dfs.nfs.". From the user's perspective there's no good reason for this, since they'll only be using NFS to access HDFS, so I would anticipate user confusion over which configs do and do not start with "dfs.". # I think there may be a bit of a problem with having the {{NfsConfiguration}} class in the hadoop-hdfs-nfs project, since it is also responsible for adding the {{DeprecationDeltas}} for NFS config settings which only exist in the hadoop-nfs (Common) project. This means that, though no such project exists today, if there were some other project which depended only upon the Common hadoop-nfs project, the config deprecations would not be loaded. This seems like it might be another argument in favor of moving all of this code into the single hadoop-hdfs-nfs project. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs
[ https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010249#comment-14010249 ] Hudson commented on HDFS-6416: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5612 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5612/]) HDFS-6416. Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs. Contributed by Abhiraj Butala (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597868) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtxCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system > clock bugs > > > Key: HDFS-6416 > URL: https://issues.apache.org/jira/browse/HDFS-6416 > Project: Hadoop HDFS > Issue Type: Improvement > Components: nfs >Affects Versions: 2.4.0 >Reporter: Brandon Li >Assignee: Abhiraj Butala >Priority: Minor > Fix For: 2.5.0 > > Attachments: HDFS-6416.patch > > > As [~cnauroth] pointed out in HADOOP-10612, Time#monotonicNow is a more > preferred method to use since this isn't subject to system clock bugs (i.e. > Someone resets the clock to a time in the past, and then updates don't happen > for a long time.) -- This message was sent by Atlassian JIRA (v6.2#6252)
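The rationale above is worth making concrete: elapsed time measured with the wall clock (`System.currentTimeMillis`) can jump backwards if someone resets the system clock, whereas a monotonic source cannot. Hadoop's `Time#monotonicNow` is essentially a millisecond wrapper over `System.nanoTime`; a minimal sketch of the same idea:

```java
public class MonotonicTime {
    /**
     * Milliseconds from an arbitrary origin. Only differences between two
     * calls are meaningful, but those differences never go negative even if
     * the system clock is reset, which is the property HDFS-6416 relies on.
     */
    public static long monotonicNow() {
        return System.nanoTime() / 1_000_000L;
    }
}
```

Timeouts computed as `monotonicNow() - start` are therefore immune to the "someone resets the clock to a time in the past" failure described in the issue.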
[jira] [Commented] (HDFS-6441) Add ability to exclude/include few datanodes while balancing
[ https://issues.apache.org/jira/browse/HDFS-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010250#comment-14010250 ] Benoy Antony commented on HDFS-6441: {quote} What happens if neither option is given? It appears to maybe ignore all hosts? {quote} If neither option is given, all the nodes will be included. This is the behavior without this patch. The internal boolean variable (exclude) is set to true, but the list of nodes to exclude will be empty. {quote} If both options are given, it appears to build a union of the include/exclude hosts, then use the last argument to determine if the union is exclude or not? {quote} If both options are given, the last option will be effective. {quote} I seem to recall getHostName is (or used to be) a bit peculiar and can return a DN self-reported name, hence the getPeerHostName which is guaranteed to return the actual hostname. You should check and match the NN's behavior on use of peer name or reported name. {quote} I believe you are right. I'll check and test with _getPeerHostName_. {quote} _DEFALUT is misspelled {quote} I'll fix this. > Add ability to exclude/include few datanodes while balancing > > > Key: HDFS-6441 > URL: https://issues.apache.org/jira/browse/HDFS-6441 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Affects Versions: 2.4.0 >Reporter: Benoy Antony >Assignee: Benoy Antony > Attachments: HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, > HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, > HDFS-6441.patch, HDFS-6441.patch > > > In some use cases, it is desirable to ignore a few data nodes while > balancing. The administrator should be able to specify a list of data nodes > in a file similar to the hosts file and the balancer should ignore these data > nodes while balancing so that no blocks are added/removed on these nodes. 
> Similarly it will be beneficial to specify that only a particular list of > datanodes should be considered for balancing. -- This message was sent by Atlassian JIRA (v6.2#6252)
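The semantics described in the comments above — with neither option every datanode participates (an empty exclusion list), and when both options are passed the last one wins — can be sketched as a small host filter. The names are hypothetical, not the actual Balancer CLI code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class BalancerHostFilter {
    private final Set<String> hosts;
    private final boolean exclude; // true: hosts is an exclusion list

    public BalancerHostFilter(Set<String> hosts, boolean exclude) {
        this.hosts = hosts;
        this.exclude = exclude;
    }

    /** Default when neither option is given: empty exclusion list, so every
     *  datanode is considered for balancing (the pre-patch behavior). */
    public static BalancerHostFilter allNodes() {
        return new BalancerHostFilter(new HashSet<String>(), true);
    }

    /** Should this datanode have blocks moved on/off it? */
    public boolean shouldBalance(String hostname) {
        return exclude ? !hosts.contains(hostname) : hosts.contains(hostname);
    }
}
```

"Last option wins" falls out naturally if the CLI parser simply overwrites the filter each time it sees -exclude or -include while walking the arguments in order.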
[jira] [Commented] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs
[ https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010219#comment-14010219 ] Brandon Li commented on HDFS-6416: -- Thank you, [~abutala]. I've committed the patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs
[ https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6416: - Fix Version/s: 2.5.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs
[ https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6416: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010217#comment-14010217 ] Zhongyi Xie commented on HDFS-6411: --- [~brandonli], it looks good, but I haven't got a chance to test it out since my VM is broken today, will run the test cases once my VM is back to normal, will let you know by then, thanks! > nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.004.patch, HDFS-6411.patch, > tcpdump-HDFS-6411-Brandon.out > > > We use the nfs-hdfs gateway to expose hdfs thru nfs. > 0) login as root, run nfs-hdfs gateway as a user, say, nfsserver. > [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs > backups hive mr-history system tmp user > 1) add a user nfs-test: adduser nfs-test(make sure that this user is not a > proxyuser of nfsserver > 2) switch to test user: su - nfs-test > 3) access hdfs nfs gateway > [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs > ls: cannot open directory /hdfs: Input/output error > retry: > [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs > ls: cannot access /hdfs: Stale NFS file handle > 4) switch back to root and access hdfs nfs gateway > [nfs-test@zhongyi-test-cluster-desktop ~]$ exit > logout > [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs > ls: cannot access /hdfs: Stale NFS file handle > the nfsserver log indicates we hit an authorization error in the rpc handler; > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): > User: nfsserver is not allowed to impersonate nfs-test > and NFS3ERR_IO is returned, which explains why we see input/output error. 
> One can catch the authorizationexception and return the correct error: > NFS3ERR_ACCES to fix the error message on the client side but that doesn't > seem to solve the mount hang issue though. When the mount hang happens, it > stops printing nfsserver log which makes it more difficult to figure out the > real cause of the hang. According to jstack and debugger, the nfsserver seems > to be waiting for client requests -- This message was sent by Atlassian JIRA (v6.2#6252)
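The mitigation described above (returning NFS3ERR_ACCES instead of NFS3ERR_IO when impersonation fails) can be sketched in isolation. The exception class, status constants, and handler below are simplified stand-ins for the real Hadoop and NFS gateway types, not the actual patch:

```java
public class Nfs3ErrorMapping {
    // Stand-in for org.apache.hadoop.security.authorize.AuthorizationException
    static class AuthorizationException extends Exception {
        AuthorizationException(String msg) { super(msg); }
    }

    // Stand-ins for the relevant NFSv3 status codes
    static final int NFS3_OK = 0;
    static final int NFS3ERR_IO = 5;      // generic I/O error (what clients saw)
    static final int NFS3ERR_ACCES = 13;  // permission denied (what they should see)

    // Simulated RPC handler: an unauthorized user gets ACCES, not IO
    static int handleRequest(String user, boolean isProxyUser) {
        try {
            if (!isProxyUser) {
                throw new AuthorizationException(
                        "User: nfsserver is not allowed to impersonate " + user);
            }
            return NFS3_OK;
        } catch (AuthorizationException e) {
            // Without this catch the error falls through to a generic handler
            // and surfaces as NFS3ERR_IO ("Input/output error") on the client.
            return NFS3ERR_ACCES;
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("nfs-test", false)); // 13 (NFS3ERR_ACCES)
        System.out.println(handleRequest("root", true));      // 0 (NFS3_OK)
    }
}
```

As the comment notes, this only corrects the client-side error message; it does not by itself explain or fix the subsequent mount hang.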
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010214#comment-14010214 ] Brandon Li commented on HDFS-6411: -- [~zhongyi-altiscale], how does the new patch look? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6450) Support non-positional hedged reads in HDFS
Colin Patrick McCabe created HDFS-6450: -- Summary: Support non-positional hedged reads in HDFS Key: HDFS-6450 URL: https://issues.apache.org/jira/browse/HDFS-6450 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe HDFS-5776 added support for hedged positional reads. We should also support hedged non-positional reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.2#6252)
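For context, the hedged-read pattern that HDFS-5776 introduced for preads looks roughly like the following when applied to a generic blocking read: start the primary read, and if it has not completed within a threshold, launch a second ("hedged") read against another replica and take whichever finishes first. The names and simulated delays are illustrative, not the DFSClient implementation:

```java
import java.util.concurrent.*;

public class HedgedRead {
    static byte[] readFromReplica(String replica, long delayMs) throws Exception {
        Thread.sleep(delayMs);                  // simulate a slow or fast replica
        return ("data-from-" + replica).getBytes();
    }

    static byte[] hedgedRead(ExecutorService pool, long hedgeAfterMs) throws Exception {
        CompletionService<byte[]> cs = new ExecutorCompletionService<>(pool);
        cs.submit(() -> readFromReplica("replica-1", 500));  // slow primary
        // Wait briefly for the primary; if it hasn't finished, hedge.
        Future<byte[]> first = cs.poll(hedgeAfterMs, TimeUnit.MILLISECONDS);
        if (first == null) {
            cs.submit(() -> readFromReplica("replica-2", 10));  // hedge request
            first = cs.take();                  // whichever read finishes first wins
        }
        return first.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        System.out.println(new String(hedgedRead(pool, 50)));
        pool.shutdownNow();                     // abandon the straggling primary
    }
}
```

The non-positional case is harder than pread because a regular read advances the stream position, so the losing read's side effects must be discarded; this sketch sidesteps that by making the reads stateless.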
[jira] [Commented] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs
[ https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010193#comment-14010193 ] Hadoop QA commented on HDFS-6416: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646189/HDFS-6416.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6987//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6987//console This message is automatically generated. 
> Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system > clock bugs > > > Key: HDFS-6416 > URL: https://issues.apache.org/jira/browse/HDFS-6416 > Project: Hadoop HDFS > Issue Type: Improvement > Components: nfs >Affects Versions: 2.4.0 >Reporter: Brandon Li >Assignee: Abhiraj Butala >Priority: Minor > Attachments: HDFS-6416.patch > > > As [~cnauroth] pointed out in HADOOP-10612, Time#monotonicNow is a more > preferred method to use since this isn't subject to system clock bugs (i.e. > Someone resets the clock to a time in the past, and then updates don't happen > for a long time.) -- This message was sent by Atlassian JIRA (v6.2#6252)
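The rationale for Time#monotonicNow can be shown with a self-contained sketch. This mirrors the idea behind Hadoop's org.apache.hadoop.util.Time (a System.nanoTime-backed clock usable only for measuring intervals), not its exact source:

```java
public class MonotonicClock {
    /**
     * Milliseconds since an arbitrary origin. Unlike System.currentTimeMillis,
     * System.nanoTime cannot jump backwards when an operator resets the system
     * clock, so differences between two calls are always meaningful.
     */
    public static long monotonicNow() {
        return System.nanoTime() / 1_000_000L;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = monotonicNow();
        Thread.sleep(60);
        long elapsed = monotonicNow() - start;
        // Even if the wall clock were set back an hour during the sleep,
        // this elapsed value would still be ~60 ms, never negative.
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

In OpenFileCtx, using a wall-clock timestamp for "last activity" means a backwards clock jump can make streams look idle (or active) forever; a monotonic clock avoids that failure mode entirely.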
[jira] [Assigned] (HDFS-6379) HTTPFS - Implement ACLs support
[ https://issues.apache.org/jira/browse/HDFS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur reassigned HDFS-6379: Assignee: Mike Yoder (was: Alejandro Abdelnur) > HTTPFS - Implement ACLs support > --- > > Key: HDFS-6379 > URL: https://issues.apache.org/jira/browse/HDFS-6379 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Alejandro Abdelnur >Assignee: Mike Yoder > Fix For: 2.4.0 > > > HDFS-4685 added ACL support to WebHDFS but missed adding it to HttpFS. > This JIRA tracks adding that support. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5682) Heterogeneous Storage phase 2 - APIs to expose Storage Types
[ https://issues.apache.org/jira/browse/HDFS-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010157#comment-14010157 ] Arpit Agarwal commented on HDFS-5682: - Thanks for the feedback [~wuzesheng]. My responses are below. bq. 1. About the storage type, because I didn't participate in the discussion in HDFS-2832, I am confused by the current storage types DISK and SSD. I think SSD is also one type of disk, DISK and SSD are not orthogonal. Can we change the storage types to HDD and SSD? This would be more straightforward. Good point, I'll look into making the names clearer. In a subsequent revision of the API we would like to eliminate the hard-coded names from code altogether. bq. 2. About setStorageTypeSpaceQuota/getStorageTypeSpaceQuota, these two names are not very natural. From the literal meaning, it sounds like setting/getting space quota on some storage type rather than on some type of storage. I would suggest that setStorageSpaceQuota/getStorageSpaceQuota would be better. I am not a native English speaker, if I am wrong, just ignore this. The function name should communicate that this is disk space quota for a specific storage type, as opposed to the overall quotas which are set with {{setQuota}}. If the proposed name is hard to follow, how about {{getQuotaByStorageType}}/{{setQuotaByStorageType}}? bq. 3. About the command line, hdfs dfsadmin -get(set)StorageTypeSpaceQuota, I think get(set)ting one storage type at a time is simple and straightforward; if we get(set) more than one at a time, because there's no atomicity guarantee, it's complicated to handle failure. Yes I think we can simplify the command line as you suggested. bq. 4. About the StoragePreference class, as you said in the design doc in HDFS-2832, in the future HDFS will support placing replicas on different storages, such as 1 on SSD, and 2 on HDD.
I would suggest that the StoragePreference class support specifying the storage type of each replica now; in this way, we can easily support the above feature in the future. Let's defer this for now. The API and protocol can both be easily extended in a backwards compatible manner in the future without affecting existing applications. bq. 5. About the create-file semantics, as you said in the doc "During file creation there must be sufficient quota to place at least one block times the replication factor on the target storage type, otherwise the request is failed immediately with QuotaExceededException", I think it would be more natural and friendly to first create the file on the default storage (HDD) if there's not enough space of the desired storage type, and then let the namenode replicate the block to the desired storage lazily when there's enough space available. We have to differentiate between quota unavailability vs disk space unavailability. The former will result in a quota violation exception, the latter will result in the behavior you described. We discuss the reasons for this in the HDFS-2832 design doc. > Heterogeneous Storage phase 2 - APIs to expose Storage Types > > > Key: HDFS-5682 > URL: https://issues.apache.org/jira/browse/HDFS-5682 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: 20140522-Heterogeneous-Storages-API.pdf > > > Phase 1 (HDFS-2832) added support to present the DataNode as a collection of > discrete storages of different types. > This Jira is to track phase 2 of the Heterogeneous Storage work which > involves exposing Storage Types to applications and adding Quota Management > support for administrators. > This phase will also include tools support for administrators/users. -- This message was sent by Atlassian JIRA (v6.2#6252)
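The quota-vs-space distinction in point 5 can be sketched as follows. The class and method names here are hypothetical illustrations of the semantics being discussed, not the NameNode implementation: exceeding the per-storage-type quota fails the create immediately, while a mere physical-space shortage on the preferred storage falls back to default storage.

```java
import java.util.HashMap;
import java.util.Map;

public class StorageTypeQuota {
    static class QuotaExceededException extends RuntimeException {
        QuotaExceededException(String m) { super(m); }
    }

    private final Map<String, Long> quota = new HashMap<>(); // type -> bytes allowed
    private final Map<String, Long> used = new HashMap<>();  // type -> bytes consumed
    private final Map<String, Long> free = new HashMap<>();  // type -> physical bytes free

    void setQuotaByStorageType(String type, long bytes) { quota.put(type, bytes); }
    void setFreeSpace(String type, long bytes) { free.put(type, bytes); }

    /** Returns the storage type actually chosen for the file's first block. */
    String create(String preferredType, long blockSize, int replication) {
        long need = blockSize * replication;   // one block times replication factor
        long u = used.getOrDefault(preferredType, 0L);
        Long q = quota.get(preferredType);
        if (q != null && u + need > q) {
            // Quota unavailability: fail the request immediately.
            throw new QuotaExceededException("quota exceeded on " + preferredType);
        }
        if (free.getOrDefault(preferredType, 0L) < need) {
            // Disk-space unavailability: fall back to default storage; blocks
            // could later be migrated to the preferred type lazily.
            return "DISK";
        }
        used.put(preferredType, u + need);
        return preferredType;
    }

    public static void main(String[] args) {
        StorageTypeQuota fs = new StorageTypeQuota();
        fs.setQuotaByStorageType("SSD", 1024);
        fs.setFreeSpace("SSD", 4096);
        System.out.println(fs.create("SSD", 128, 3)); // within quota and space -> SSD
    }
}
```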
[jira] [Commented] (HDFS-6416) Use Time#monotonicNow in OpenFileCtx and OpenFileCtxCatch to avoid system clock bugs
[ https://issues.apache.org/jira/browse/HDFS-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010154#comment-14010154 ] Hadoop QA commented on HDFS-6416: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646189/HDFS-6416.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6986//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6986//console This message is automatically generated. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010138#comment-14010138 ] Hadoop QA commented on HDFS-6447: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646925/HDFS-6447.patch.001 against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6981//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6981//console This message is automatically generated. > balancer should timestamp the completion message > > > Key: HDFS-6447 > URL: https://issues.apache.org/jira/browse/HDFS-6447 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Reporter: Allen Wittenauer >Assignee: Juan Yu >Priority: Trivial > Labels: newbie > Attachments: HDFS-6447.patch.001 > > > When the balancer finishes, it doesn't report the time it finished. 
It > should do this so that users have a better sense of how long it took to > complete. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010147#comment-14010147 ] Allen Wittenauer commented on HDFS-6447: it might be nice to format it similarly to the rest of the balancer output, where the timestamp comes first. But other than that, yup, this is pretty much what I'm looking to see added. :D -- This message was sent by Atlassian JIRA (v6.2#6252)
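A minimal sketch of the requested change: prefix the balancer's completion message with a timestamp, timestamp first so it lines up with the per-iteration report lines. The format pattern below is illustrative, not copied from Balancer.java:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class BalancerExitMessage {
    /** Prefix a status message with a timestamp, timestamp first. */
    static String stamp(Date when, String message) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        return fmt.format(when) + "  " + message;
    }

    public static void main(String[] args) {
        // e.g. "2014-05-27 10:42:05  The cluster is balanced. Exiting..."
        System.out.println(stamp(new Date(), "The cluster is balanced. Exiting..."));
    }
}
```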
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010090#comment-14010090 ] Hadoop QA commented on HDFS-6411: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646956/HDFS-6411.004.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6985//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6985//console This message is automatically generated. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3493) Replication is not happened for the block (which is recovered and in finalized) to the Datanode which has got the same block with old generation timestamp in RBW
[ https://issues.apache.org/jira/browse/HDFS-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010073#comment-14010073 ] Juan Yu commented on HDFS-3493: --- Hi Vinay, Would you mind if I take it over and finish it? Thanks, Juan > Replication is not happened for the block (which is recovered and in > finalized) to the Datanode which has got the same block with old generation > timestamp in RBW > - > > Key: HDFS-3493 > URL: https://issues.apache.org/jira/browse/HDFS-3493 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha, 2.0.5-alpha >Reporter: J.Andreina >Assignee: Vinayakumar B > Attachments: HDFS-3493.patch > > > replication factor= 3, block report interval= 1min and start NN and 3DN > Step 1:Write a file without close and do hflush (Dn1,DN2,DN3 has blk_ts1) > Step 2:Stopped DN3 > Step 3:recovery happens and time stamp updated(blk_ts2) > Step 4:close the file > Step 5:blk_ts2 is finalized and available in DN1 and Dn2 > Step 6:now restarted DN3(which has got blk_ts1 in rbw) > From the NN side there is no cmd issued to DN3 to delete the blk_ts1 . But > ask DN3 to make the block as corrupt . > Replication of blk_ts2 to DN3 did not happen.
> NN logs: > > {noformat} > INFO org.apache.hadoop.hdfs.StateChange: BLOCK > NameSystem.addToCorruptReplicasMap: duplicate requested for > blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by > /XX.XX.XX.XX because reported RWR replica with genstamp 1007 does not match > COMPLETE block's genstamp in block map 1008 > INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from > DatanodeRegistration(XX.XX.XX.XX, > storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, > ipcPort=50277, > storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0), > blocks: 2, processing time: 1 msecs > INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block > blk_3927215081484173742_1008 from neededReplications as it has enough > replicas. > INFO org.apache.hadoop.hdfs.StateChange: BLOCK > NameSystem.addToCorruptReplicasMap: duplicate requested for > blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by > /XX.XX.XX.XX because reported RWR replica with genstamp 1007 does not match > COMPLETE block's genstamp in block map 1008 > INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from > DatanodeRegistration(XX.XX.XX.XX, > storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, > ipcPort=50277, > storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0), > blocks: 2, processing time: 1 msecs > WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not > able to place enough replicas, still in need of 1 to reach 1 > For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy > {noformat} > fsck Report > === > {noformat} > /file21: Under replicated > BP-1008469586-XX.XX.XX.XX-1336829603103:blk_3927215081484173742_1008. Target > Replicas is 3 but found 2 replica(s). > .Status: HEALTHY > Total size: 495 B > Total dirs: 1 > Total files: 3 > Total blocks (validated):3 (avg. 
block size 165 B) > Minimally replicated blocks: 3 (100.0 %) > Over-replicated blocks: 0 (0.0 %) > Under-replicated blocks: 1 (33.32 %) > Mis-replicated blocks: 0 (0.0 %) > Default replication factor: 1 > Average block replication: 2.0 > Corrupt blocks: 0 > Missing replicas:1 (14.285714 %) > Number of data-nodes:3 > Number of racks: 1 > FSCK ended at Sun May 13 09:49:05 IST 2012 in 9 milliseconds > The filesystem under path '/' is HEALTHY > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010053#comment-14010053 ] Colin Patrick McCabe commented on HDFS-6382: bq. But if there's no internal cleanup mechanism of HDFS, all users(across companies) need to write their own cleanup tools respectively, lots of repeated work. Like I said, we should write such a tool and add it to the base Hadoop distribution. This is similar to what we did with {{DistCp}}. Then users would not need to write their own versions of this stuff. It's important to distinguish between creating a tool to handle deleting old files (which we all agree we should do), and putting this into the NameNode (which seems questionable). -- This message was sent by Atlassian JIRA (v6.2#6252)
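The standalone cleanup tool suggested above might look like the following. This sketch runs against the local filesystem, with java.nio.file standing in for Hadoop's FileSystem API; a real tool would use FileStatus#getModificationTime and FileSystem#delete (or move expired paths to the trash, per the global configuration discussed in the proposal).

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.List;

public class TtlCleaner {
    /** Delete regular files under root older than ttlMillis; return deleted paths. */
    static List<Path> deleteExpired(Path root, long ttlMillis) throws IOException {
        long cutoff = System.currentTimeMillis() - ttlMillis;
        List<Path> deleted = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(root)) {
            for (Path p : entries) {
                if (!Files.isRegularFile(p)) continue;  // this sketch skips directories
                if (Files.getLastModifiedTime(p).toMillis() < cutoff) {
                    Files.delete(p);                    // a real tool might use trash instead
                    deleted.add(p);
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("ttl-demo");
        Path old = Files.createFile(dir.resolve("old.log"));
        Files.setLastModifiedTime(old, FileTime.fromMillis(
                System.currentTimeMillis() - 40L * 24 * 3600 * 1000)); // ~40 days old
        Files.createFile(dir.resolve("fresh.log"));
        List<Path> gone = deleteExpired(dir, 30L * 24 * 3600 * 1000);  // 30-day TTL
        System.out.println(gone.size() + " expired file(s) deleted");  // 1
    }
}
```

Run periodically (e.g. from cron or Oozie), this gives the log-retention behavior from the description without putting TTL enforcement inside the NameNode.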
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when a unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010034#comment-14010034 ] Hadoop QA commented on HDFS-6411: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646945/HDFS-6411.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6982//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6982//console This message is automatically generated. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6286) adding a timeout setting for local read io
[ https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010032#comment-14010032 ] Colin Patrick McCabe commented on HDFS-6286: I understand your motivation here, but I'm afraid I am -1 on this at the moment. There is a high overhead to adding communication between threads to every {{read}}, and I don't think we want this in short-circuit reads (which is an optimization, after all). Any way you look at this, it is problematic. If we create an extra thread per DFSInputStream using SCR, we might completely blow the thread budget of an application like HBase; that would be hundreds or thousands of extra threads (since HBase has a lot of open local files). If we have a fixed-size thread pool, slow disks will cause the thread pool to grind to a halt and bottleneck system performance. I am open to ideas here, but I just can't see a way to resolve those problems. Maybe I am missing something. In the meantime, I am going to create a JIRA to implement hedged reads for the non-pread case. I think that will be a better general solution that doesn't have the above-mentioned problems. > adding a timeout setting for local read io > -- > > Key: HDFS-6286 > URL: https://issues.apache.org/jira/browse/HDFS-6286 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0, 2.4.0 >Reporter: Liang Xie >Assignee: Liang Xie > > Currently, if a write or remote read is requested against a sick disk, > DFSClient.hdfsTimeout can help the caller have a guaranteed time cost to > return, but it doesn't work on local read. Take HBase scan for example: > DFSInputStream.read -> readWithStrategy -> readBuffer -> > BlockReaderLocal.read -> dataIn.read -> FileChannelImpl.read > If it hits a bad disk, the slow read io probably takes tens of seconds, and > what's worse is, "DFSInputStream.read" holds a lock the whole time.
> To my knowledge, there's no good mechanism to cancel a running read > io (please correct me if I'm wrong), so my suggestion is to add a future around > the read request, and we could set a timeout there; if the threshold is reached, > we could probably add the local node to the dead nodes... > Any thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
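The future-with-timeout idea proposed in the description can be sketched as below. Names are illustrative; this also makes visible the per-read thread handoff whose overhead the -1 comment objects to, since every read now crosses an executor boundary:

```java
import java.util.concurrent.*;

public class TimedLocalRead {
    static int readWithTimeout(ExecutorService pool, Callable<Integer> readCall,
                               long timeoutMs) throws Exception {
        Future<Integer> f = pool.submit(readCall);   // hand the blocking read to a worker
        try {
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true);   // note: the underlying disk read may not actually stop
            return -1;        // the caller could mark the local node dead here
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // A fast read completes normally; a "sick disk" read hits the timeout.
        System.out.println(readWithTimeout(pool, () -> 42, 200));                        // 42
        System.out.println(readWithTimeout(pool,
                () -> { Thread.sleep(5_000); return 42; }, 100));                        // -1
        pool.shutdownNow();
    }
}
```

The cancel-on-timeout caveat in the comment above is the crux: interrupting the future does not cancel the in-flight FileChannelImpl.read, so the worker thread stays tied up, which is exactly the fixed-pool starvation problem raised in the -1.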
[jira] [Updated] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6411: - Attachment: HDFS-6411.004.patch > nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.004.patch, HDFS-6411.patch, > tcpdump-HDFS-6411-Brandon.out > > > We use the nfs-hdfs gateway to expose hdfs through nfs. > 0) log in as root, run the nfs-hdfs gateway as a user, say, nfsserver. > [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs > backups hive mr-history system tmp user > 1) add a user nfs-test: adduser nfs-test (make sure that this user is not a > proxyuser of nfsserver) > 2) switch to the test user: su - nfs-test > 3) access the hdfs nfs gateway > [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs > ls: cannot open directory /hdfs: Input/output error > retry: > [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs > ls: cannot access /hdfs: Stale NFS file handle > 4) switch back to root and access the hdfs nfs gateway > [nfs-test@zhongyi-test-cluster-desktop ~]$ exit > logout > [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs > ls: cannot access /hdfs: Stale NFS file handle > The nfsserver log indicates we hit an authorization error in the rpc handler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): > User: nfsserver is not allowed to impersonate nfs-test > and NFS3ERR_IO is returned, which explains the input/output error. > One can catch the AuthorizationException and return the correct error, > NFS3ERR_ACCES, to fix the error message on the client side, but that doesn't > seem to solve the mount hang issue. When the mount hang happens, the gateway > stops writing to the nfsserver log, which makes it more difficult to figure out the > real cause of the hang. According to jstack and the debugger, the nfsserver seems > to be waiting for client requests. -- This message was sent by Atlassian JIRA (v6.2#6252)
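The error-mapping part of the fix discussed above can be sketched as follows (a minimal illustration; only the NFS3 status values follow RFC 1813, everything else is an assumed stand-in for the gateway's RPC handler):

```java
public class NfsStatusSketch {
    // NFS3 status values from RFC 1813.
    static final int NFS3_OK = 0;
    static final int NFS3ERR_PERM = 1;
    static final int NFS3ERR_IO = 5;
    static final int NFS3ERR_ACCES = 13;

    // Hypothetical stand-in for the gateway's handler logic: the real code
    // would unwrap org.apache.hadoop.security.authorize.AuthorizationException
    // from an ipc RemoteException; SecurityException stands in for it here.
    static int statusFor(Throwable unwrapped) {
        if (unwrapped instanceof SecurityException) {
            return NFS3ERR_ACCES; // "permission denied" on the client, not an I/O error
        }
        return NFS3ERR_IO; // previous behavior for all failures
    }
}
```

As the description notes, this only corrects the client-visible error message; it does not by itself address the mount hang.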
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010011#comment-14010011 ] Brandon Li commented on HDFS-6411: -- Uploaded the patch to address Zhongyi's comments. Thanks! > nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.004.patch, HDFS-6411.patch, > tcpdump-HDFS-6411-Brandon.out -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6222) Remove background token renewer from webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010007#comment-14010007 ] Hadoop QA commented on HDFS-6222: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646923/HDFS-6222.trunk.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6980//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6980//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6980//console This message is automatically generated. > Remove background token renewer from webhdfs > > > Key: HDFS-6222 > URL: https://issues.apache.org/jira/browse/HDFS-6222 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HDFS-6222.branch-2.patch, HDFS-6222.trunk.patch > > > The background token renewer is a source of problems for long-running > daemons. 
Webhdfs should lazy fetch a new token when it receives an > InvalidToken exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010008#comment-14010008 ] Hadoop QA commented on HDFS-6411: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646945/HDFS-6411.003.patch against trunk revision . {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6984//console This message is automatically generated. > nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.patch, tcpdump-HDFS-6411-Brandon.out -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6448) change BlockReaderLocalLegacy timeout detail
[ https://issues.apache.org/jira/browse/HDFS-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010003#comment-14010003 ] Colin Patrick McCabe commented on HDFS-6448: Socket timeout seems reasonable to me. DFSInputStream uses socketTimeout to get a proxy to talk to the DN, in code like this: {code} /** Read the block length from one of the datanodes. */ private long readBlockLength(LocatedBlock locatedblock) throws IOException { ... try { cdp = DFSUtil.createClientDatanodeProtocolProxy(datanode, dfsClient.getConfiguration(), dfsClient.getConf().socketTimeout, dfsClient.getConf().connectToDnViaHostname, locatedblock); {code} So I am +1 on this patch. bq. Yes, we employed hadoop 2.0, with only the legacy HDFS-2246 available. I took a quick look at the HDFS-347 SCR code while making the patch and did not find the same issue (to be honest, I am not familiar with this piece of code, so I probably just missed it). I think Colin Patrick McCabe has the exact answer. Just as a note, we kept around the legacy block reader local only because HDFS-347 wasn't implemented on Windows. If you are not using Windows, then I would recommend upgrading and using the new one ASAP... HDFS-2246 has a lot of problems besides this (its failure handling code is fairly buggy, especially in older releases). bq. Do you know if this is only an issue in HDFS-2246 SCR? Is it present in HDFS-347 SCRs? HDFS-347 uses {{socketTimeout}}. The relevant code is in {{BlockReaderFactory#nextDomainPeer}}. > change BlockReaderLocalLegacy timeout detail > > > Key: HDFS-6448 > URL: https://issues.apache.org/jira/browse/HDFS-6448 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.0.0, 2.4.0 >Reporter: Liang Xie >Assignee: Liang Xie > Attachments: HDFS-6448.txt > > > Our HBase is deployed on hadoop 2.0. In one incident, we hit HDFS-5016 on the > HDFS side, but we also found, from the HBase side, that the dfs client was hung at > getBlockReader. After reading the code, we found there is a timeout setting in > the current codebase, but the default hdfsTimeout value is "-1" (from > Client.java:getTimeout(conf)), which means no timeout... > The hung stack trace looks like the following: > at $Proxy21.getBlockLocalPathInfo(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolTranslatorPB.java:215) > at > org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:267) > at > org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:180) > at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:812) > One feasible fix is replacing hdfsTimeout with socketTimeout; see the attached > patch. Most of the credit should go to [~liushaohui]. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6449) Incorrect counting in ContentSummaryComputationContext in 0.23.
[ https://issues.apache.org/jira/browse/HDFS-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-6449. -- Resolution: Fixed Fix Version/s: 0.23.11 Hadoop Flags: Reviewed Thanks for the review, Daryn. I've committed this to branch-0.23. > Incorrect counting in ContentSummaryComputationContext in 0.23. > --- > > Key: HDFS-6449 > URL: https://issues.apache.org/jira/browse/HDFS-6449 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.10 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Fix For: 0.23.11 > > Attachments: HDFS-6449.branch-0.23.patch > > > In {{ContentSummaryComputationContext}}, the content counting in {{yield()}} > is incorrect. The result is still correct, but it ends up yielding more > frequently. Trunk and branch-2 do not have this bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6227) ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have been closed by java interrupts
[ https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009967#comment-14009967 ] Colin Patrick McCabe commented on HDFS-6227: Thanks, Jing. Test failure appears to be HDFS-6257, not related. > ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have > been closed by java interrupts > --- > > Key: HDFS-6227 > URL: https://issues.apache.org/jira/browse/HDFS-6227 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Jing Zhao >Assignee: Colin Patrick McCabe > Fix For: 2.5.0 > > Attachments: HDFS-6227.000.patch, HDFS-6227.001.patch, > HDFS-6227.002.patch, ShortCircuitReadInterruption.test.patch > > > While running tests in a single node cluster, where short circuit read is > enabled and multiple threads may read the same file concurrently, one of the > reads got a ClosedChannelException and failed. For the full exception trace, see the comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009985#comment-14009985 ] Zhongyi Xie commented on HDFS-6411: --- [~brandonli], can you please also add an "else" clause in the getattr function, as you did in access, just in case the unwrapped exception happens to be something other than AuthorizationException? There is another related issue with fsstat, where NFS3ERR_IO could also be replaced with NFS3ERR_ACCES. > nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.patch, tcpdump-HDFS-6411-Brandon.out -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009971#comment-14009971 ] Brandon Li commented on HDFS-6411: -- Thank you, [~aw] and [~zhongyi-altiscale]. The error message printed by the shell is always "permission denied" in my tests, with either NFS3ERR_ACCES or NFS3ERR_PERM. Regardless, I agree that NFS3ERR_ACCES is a better error status than NFS3ERR_PERM in this case. I've uploaded a new patch with the update. > nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.patch, tcpdump-HDFS-6411-Brandon.out -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6449) Incorrect counting in ContentSummaryComputationContext in 0.23.
[ https://issues.apache.org/jira/browse/HDFS-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009962#comment-14009962 ] Daryn Sharp commented on HDFS-6449: --- +1. Proven to work, since this is the version of the patch run internally. > Incorrect counting in ContentSummaryComputationContext in 0.23. > --- > > Key: HDFS-6449 > URL: https://issues.apache.org/jira/browse/HDFS-6449 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.10 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Attachments: HDFS-6449.branch-0.23.patch > > > In {{ContentSummaryComputationContext}}, the content counting in {{yield()}} > is incorrect. The result is still correct, but it ends up yielding more > frequently. Trunk and branch-2 do not have this bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009960#comment-14009960 ] Juan Yu commented on HDFS-6447: --- Oops, sorry about the patch name. > balancer should timestamp the completion message > > > Key: HDFS-6447 > URL: https://issues.apache.org/jira/browse/HDFS-6447 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Reporter: Allen Wittenauer >Assignee: Juan Yu >Priority: Trivial > Labels: newbie > Attachments: HDFS-6447.patch.001 > > > When the balancer finishes, it doesn't report the time it finished. It > should do this so that users have a better sense of how long it took to > complete. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6227) ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have been closed by java interrupts
[ https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009955#comment-14009955 ] Hudson commented on HDFS-6227: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5611 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5611/]) HDFS-6227. ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have been closed by java interrupts. Contributed by Colin Patrick McCabe. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1597829) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java > ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have > been closed by java interrupts > --- > > Key: HDFS-6227 > URL: https://issues.apache.org/jira/browse/HDFS-6227 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Jing Zhao >Assignee: Colin Patrick McCabe > Fix For: 2.5.0 > > Attachments: HDFS-6227.000.patch, HDFS-6227.001.patch, > HDFS-6227.002.patch, ShortCircuitReadInterruption.test.patch > > > While running tests in a single node cluster, where short circuit read is > enabled and multiple threads may read the same file concurrently, one of the > reads got a ClosedChannelException and failed. For the full exception trace, see the comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6411) nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user attempts to access it
[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6411: - Attachment: HDFS-6411.003.patch > nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user > attempts to access it > > > Key: HDFS-6411 > URL: https://issues.apache.org/jira/browse/HDFS-6411 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.2.0 >Reporter: Zhongyi Xie >Assignee: Brandon Li > Attachments: HDFS-6411-branch-2.2.patch, HDFS-6411.002.patch, > HDFS-6411.003.patch, HDFS-6411.patch, tcpdump-HDFS-6411-Brandon.out -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009933#comment-14009933 ] Andrew Wang commented on HDFS-6447: --- +1 LGTM, thanks Juan. Typically we name the patches such that they end in .patch, e.g. "hdfs-6447.001.patch", but that's a nit :) [~aw], is this basically what you had in mind? > balancer should timestamp the completion message > > > Key: HDFS-6447 > URL: https://issues.apache.org/jira/browse/HDFS-6447 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Reporter: Allen Wittenauer >Assignee: Juan Yu >Priority: Trivial > Labels: newbie > Attachments: HDFS-6447.patch.001 > > > When the balancer finishes, it doesn't report the time it finished. It > should do this so that users have a better sense of how long it took to > complete. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6227) ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have been closed by java interrupts
[ https://issues.apache.org/jira/browse/HDFS-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6227: Resolution: Fixed Fix Version/s: 2.5.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. Thanks for the fix [~cmccabe]! > ShortCircuitCache#unref should purge ShortCircuitReplicas whose streams have > been closed by java interrupts > --- > > Key: HDFS-6227 > URL: https://issues.apache.org/jira/browse/HDFS-6227 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Jing Zhao >Assignee: Colin Patrick McCabe > Fix For: 2.5.0 > > Attachments: HDFS-6227.000.patch, HDFS-6227.001.patch, > HDFS-6227.002.patch, ShortCircuitReadInterruption.test.patch > > > While running tests in a single node cluster, where short circuit read is > enabled and multiple threads may read the same file concurrently, one of the > reads got a ClosedChannelException and failed. For the full exception trace, see the comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6449) Incorrect counting in ContentSummaryComputationContext in 0.23.
[ https://issues.apache.org/jira/browse/HDFS-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6449: - Attachment: HDFS-6449.branch-0.23.patch Attaching patch for branch-0.23. > Incorrect counting in ContentSummaryComputationContext in 0.23. > --- > > Key: HDFS-6449 > URL: https://issues.apache.org/jira/browse/HDFS-6449 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.10 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Attachments: HDFS-6449.branch-0.23.patch > > > In {{ContentSummaryComputationContext}}, the content counting in {{yield()}} > is incorrect. The result is still correct, but it ends up yielding more > frequently. Trunk and branch-2 do not have this bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6449) Incorrect counting in ContentSummaryComputationContext in 0.23.
Kihwal Lee created HDFS-6449: Summary: Incorrect counting in ContentSummaryComputationContext in 0.23. Key: HDFS-6449 URL: https://issues.apache.org/jira/browse/HDFS-6449 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.10 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical In {{ContentSummaryComputationContext}}, the content counting in {{yield()}} is incorrect. The result is still correct, but it ends up yielding more frequently. Trunk and branch-2 do not have this bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
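The throttling pattern being described, counting processed content and yielding the lock once a limit is reached, can be sketched like this (names are assumptions, not the actual 0.23 code; the report says only the counting was off, which changes yield frequency but not the final summary):

```java
public class YieldCounterSketch {
    private final long limit;             // how much work between lock yields
    private long countSinceLastYield = 0;
    int yields = 0;                       // exposed for illustration only

    YieldCounterSketch(long limit) { this.limit = limit; }

    // Accumulate processed content; once the limit is reached, reset the
    // counter and "yield" (in the NameNode this would release and reacquire
    // the namesystem lock so other operations can make progress).
    boolean yieldIfNeeded(long processed) {
        countSinceLastYield += processed;
        if (countSinceLastYield < limit) {
            return false;
        }
        countSinceLastYield = 0;
        yields++;
        return true;
    }
}
```

An under-counting or over-counting bug in `countSinceLastYield` would make `yieldIfNeeded` fire more or less often than intended, exactly the symptom the issue describes.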
[jira] [Updated] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu updated HDFS-6447: -- Status: Patch Available (was: In Progress) > balancer should timestamp the completion message > > > Key: HDFS-6447 > URL: https://issues.apache.org/jira/browse/HDFS-6447 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Reporter: Allen Wittenauer >Assignee: Juan Yu >Priority: Trivial > Labels: newbie > Attachments: HDFS-6447.patch.001 > > > When the balancer finishes, it doesn't report the time it finished. It > should do this so that users have a better sense of how long it took to > complete. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Yu updated HDFS-6447: -- Attachment: HDFS-6447.patch.001 patch to report balancer finish time. > balancer should timestamp the completion message > > > Key: HDFS-6447 > URL: https://issues.apache.org/jira/browse/HDFS-6447 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Reporter: Allen Wittenauer >Assignee: Juan Yu >Priority: Trivial > Labels: newbie > Attachments: HDFS-6447.patch.001 > > > When the balancer finishes, it doesn't report the time it finished. It > should do this so that users have a better sense of how long it took to > complete. -- This message was sent by Atlassian JIRA (v6.2#6252)
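What the requested change could look like, as a hedged sketch (assumed message format and method name, not the actual patch):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class BalancerTimestampSketch {
    // Build a completion message carrying both the elapsed time and the
    // wall-clock finish time, so users can see when balancing ended and
    // how long it took.
    static String completionMessage(long startMs, long endMs) {
        String finishedAt =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date(endMs));
        return "Balancing took " + ((endMs - startMs) / 1000)
            + " seconds, finished at " + finishedAt;
    }
}
```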
[jira] [Work started] (HDFS-6447) balancer should timestamp the completion message
[ https://issues.apache.org/jira/browse/HDFS-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6447 started by Juan Yu. > balancer should timestamp the completion message > > > Key: HDFS-6447 > URL: https://issues.apache.org/jira/browse/HDFS-6447 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Reporter: Allen Wittenauer >Assignee: Juan Yu >Priority: Trivial > Labels: newbie > > When the balancer finishes, it doesn't report the time it finished. It > should do this so that users have a better sense of how long it took to > complete. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6222) Remove background token renewer from webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-6222: -- Status: Patch Available (was: Open) > Remove background token renewer from webhdfs > > > Key: HDFS-6222 > URL: https://issues.apache.org/jira/browse/HDFS-6222 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HDFS-6222.branch-2.patch, HDFS-6222.trunk.patch > > > The background token renewer is a source of problems for long-running > daemons. Webhdfs should lazy fetch a new token when it receives an > InvalidToken exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6222) Remove background token renewer from webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-6222: -- Attachment: HDFS-6222.trunk.patch HDFS-6222.branch-2.patch Adds lazy re-fetch of expired tokens. Internally tested on secure clusters. The only difference between the patches is a conflicting import. > Remove background token renewer from webhdfs > > > Key: HDFS-6222 > URL: https://issues.apache.org/jira/browse/HDFS-6222 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HDFS-6222.branch-2.patch, HDFS-6222.trunk.patch > > > The background token renewer is a source of problems for long-running > daemons. Webhdfs should lazy fetch a new token when it receives an > InvalidToken exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
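The lazy-fetch pattern described here replaces a background renewer thread with a single retry: run the operation with the cached token, and only when the server rejects it, fetch a fresh token and try once more. A hedged sketch of that pattern — class, interface, and exception names are illustrative stand-ins, not the actual WebHdfsFileSystem API:

```java
import java.util.concurrent.Callable;

// Hedged sketch of the lazy-fetch pattern from HDFS-6222: no background
// renewer; a fresh token is fetched only when the cached one is rejected.
public class LazyTokenClient {

    // Stands in for Hadoop's InvalidToken: the server rejected the token.
    static class InvalidTokenException extends Exception {}

    // An operation that needs a valid token to run.
    interface TokenOp<T> { T call(String token) throws Exception; }

    private String token;                 // cached delegation token (opaque here)
    private final Callable<String> fetch; // fetches a new token from the NN

    LazyTokenClient(Callable<String> fetch) { this.fetch = fetch; }

    // Run op with the cached token; on InvalidTokenException, lazily
    // re-fetch a token once and retry, instead of renewing proactively.
    <T> T run(TokenOp<T> op) throws Exception {
        if (token == null) token = fetch.call();
        try {
            return op.call(token);
        } catch (InvalidTokenException e) {
            token = fetch.call(); // lazy re-fetch on rejection
            return op.call(token);
        }
    }
}
```

The benefit for long-running daemons is that no thread has to outlive token lifetimes; expiry is handled at the moment it is observed.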
[jira] [Commented] (HDFS-6441) Add ability to exclude/include few datanodes while balancing
[ https://issues.apache.org/jira/browse/HDFS-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009688#comment-14009688 ] Daryn Sharp commented on HDFS-6441: --- I've only quickly skimmed the raw patch; a few questions/comments: # What happens if neither option is given? It appears to maybe ignore all hosts? # If both options are given, it appears to build a union of the include/exclude hosts, then use the last argument to determine whether the union is an exclude list or not? # I seem to recall {{getHostName}} is (or used to be) a bit peculiar and can return a DN's self-reported name, hence {{getPeerHostName}}, which is guaranteed to return the actual hostname. You should check and match the NN's behavior on use of peer name or reported name. # {{DEFALUT}} is misspelled > Add ability to exclude/include few datanodes while balancing > > > Key: HDFS-6441 > URL: https://issues.apache.org/jira/browse/HDFS-6441 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Affects Versions: 2.4.0 >Reporter: Benoy Antony >Assignee: Benoy Antony > Attachments: HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, > HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, > HDFS-6441.patch, HDFS-6441.patch > > > In some use cases, it is desirable to ignore a few data nodes while > balancing. The administrator should be able to specify a list of data nodes > in a file similar to the hosts file and the balancer should ignore these data > nodes while balancing so that no blocks are added/removed on these nodes. > Similarly it will be beneficial to specify that only a particular list of > datanodes should be considered for balancing. -- This message was sent by Atlassian JIRA (v6.2#6252)
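The semantics the issue describes — exclude listed hosts, or restrict balancing to an include list — come down to a simple predicate per datanode. A sketch under the usual hosts-file convention (an empty include list means "no restriction"); names are illustrative, not the actual Balancer code:

```java
import java.util.Set;

// Sketch of HDFS-6441's include/exclude semantics: a datanode takes part
// in balancing only if it is not excluded and, when an include list was
// supplied, only if it is on that list. Hypothetical class/method names.
public class BalancerHostFilter {
    private final Set<String> include; // empty => every datanode is eligible
    private final Set<String> exclude; // listed hosts are never balanced

    BalancerHostFilter(Set<String> include, Set<String> exclude) {
        this.include = include;
        this.exclude = exclude;
    }

    // True if blocks may be added to or removed from this datanode.
    boolean shouldBalance(String host) {
        if (exclude.contains(host)) return false;
        return include.isEmpty() || include.contains(host);
    }
}
```

Daryn's first two questions fall out of this shape: the behavior with neither or both options given must be pinned down explicitly rather than left to argument order.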
[jira] [Commented] (HDFS-6354) NN startup does not fail when it fails to login with the spnego principal
[ https://issues.apache.org/jira/browse/HDFS-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009656#comment-14009656 ] Daryn Sharp commented on HDFS-6354: --- The multiple-principal spnego support feature should indirectly uncover a misconfiguration, since it has to read the keytab (which may throw) and then throws if no HTTP principals are found. Perhaps we need to extend the auth handler to verify that an explicit principal is in the keytab too. > NN startup does not fail when it fails to login with the spnego principal > - > > Key: HDFS-6354 > URL: https://issues.apache.org/jira/browse/HDFS-6354 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Arpit Gupta > > I have noticed cases where the NN startup did not report any issues but the > login fails because either the keytab is wrong or the principal does not > exist, etc. This can be misleading and lead to authentication failures when a > client tries to authenticate to the spnego principal. -- This message was sent by Atlassian JIRA (v6.2#6252)
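The fail-fast check being discussed amounts to: after reading the keytab, refuse to start if no `HTTP/` principal is present, instead of deferring the failure to the first client SPNEGO attempt. A sketch with keytab parsing stubbed out — the principal list is assumed to have already been read from the keytab file, and the class name is hypothetical:

```java
import java.util.List;

// Sketch of the verification suggested in HDFS-6354's discussion: fail NN
// startup early when the keytab holds no HTTP/ principal for SPNEGO.
// Keytab parsing is omitted; only the presence check is shown.
public class SpnegoKeytabCheck {
    static void verifyHttpPrincipal(List<String> keytabPrincipals) {
        boolean found = keytabPrincipals.stream().anyMatch(p -> p.startsWith("HTTP/"));
        if (!found) {
            throw new IllegalStateException(
                "keytab contains no HTTP/ principal; SPNEGO login would fail");
        }
    }
}
```

Surfacing the error at startup, where an operator is watching the logs, avoids the misleading late failures the issue describes.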