[jira] [Commented] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962634#comment-13962634 ] Akira AJISAKA commented on HDFS-6169: -
Thanks for the comments! Attaching a new patch.
bq. How to differentiate the url /webhdfs/v1?op=LISTSTATUS and /wrongurl?
I started a NameNode and sent an HTTP request to /webhdfs/v1?op=liststatus:
{code}
$ curl -i http://trunk:50070/webhdfs/v1?op=liststatus
HTTP/1.1 404 Not Found
Cache-Control: no-cache
Expires: Tue, 08 Apr 2014 04:36:24 GMT
Date: Tue, 08 Apr 2014 04:36:24 GMT
Pragma: no-cache
Expires: Tue, 08 Apr 2014 04:36:24 GMT
Date: Tue, 08 Apr 2014 04:36:24 GMT
Pragma: no-cache
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)

{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File /webhdfs/v1 does not exist."}}
{code}
Since both /webhdfs/v1 and /wrongurl return 404, I think there is no need to differentiate them.

Move the address in WebImageViewer
--
Key: HDFS-6169
URL: https://issues.apache.org/jira/browse/HDFS-6169
Project: Hadoop HDFS
Issue Type: Sub-task
Components: tools
Affects Versions: 2.5.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Attachments: HDFS-6169.2.patch, HDFS-6169.3.patch, HDFS-6169.4.patch, HDFS-6169.5.patch, HDFS-6169.patch

Move the endpoint of WebImageViewer from http://hostname:port/ to http://hostname:port/webhdfs/v1/ to support {{hdfs dfs -ls}} against WebImageViewer.

--
This message was sent by Atlassian JIRA (v6.2#6252)
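The 404 behavior shown in the curl exchange above can be consumed programmatically by parsing the JSON {{RemoteException}} body. The sketch below is a hypothetical client-side helper; the function and variable names are assumptions for illustration, not code from the patch:

```python
import json

def parse_remote_exception(body):
    """Extract the exception class and message from a WebHDFS-style
    JSON error body, as returned alongside the 404 shown above."""
    exc = json.loads(body)["RemoteException"]
    return exc["exception"], exc["message"]

# The body from the curl output above, with its JSON quoting intact.
body = ('{"RemoteException":{"exception":"FileNotFoundException",'
        '"javaClassName":"java.io.FileNotFoundException",'
        '"message":"File /webhdfs/v1 does not exist."}}')
name, msg = parse_remote_exception(body)
assert name == "FileNotFoundException"
```

Since /webhdfs/v1 and /wrongurl both produce this 404 shape, a client distinguishes them (if at all) only by the message text.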
[jira] [Updated] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6169: Attachment: HDFS-6169.6.patch
[jira] [Commented] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962652#comment-13962652 ] Akira AJISAKA commented on HDFS-6169: - The new patch changes the warning messages in {{FSImageLoader}} to clarify whether the path prefix /webhdfs/v1/ is set correctly or not. By the way, it would be better to return a JSON-formatted {{RemoteException}} in the response body if the request fails. I'll file a separate jira.
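The JSON-formatted {{RemoteException}} proposed in the comment above could be built along these lines. This is a hypothetical sketch that follows the WebHDFS error-body layout; the helper itself is not from any attached patch:

```python
import json

def remote_exception_body(exception, java_class_name, message):
    """Serialize a WebHDFS-style RemoteException error body.
    The field names follow the WebHDFS REST API; the helper is
    illustrative only."""
    return json.dumps({
        "RemoteException": {
            "exception": exception,
            "javaClassName": java_class_name,
            "message": message,
        }
    })

body = remote_exception_body(
    "FileNotFoundException",
    "java.io.FileNotFoundException",
    "File /nonexistent does not exist.",
)
```

Returning this body with the 404 would let WebImageViewer clients reuse the same error-handling path they already have for a live NameNode.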
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962655#comment-13962655 ] Hadoop QA commented on HDFS-6200: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639135/HDFS-6200.004.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6615//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6615//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6615//console
This message is automatically generated.
Create a separate jar for hdfs-client
-
Key: HDFS-6200
URL: https://issues.apache.org/jira/browse/HDFS-6200
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch

Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs client. As discussed on the hdfs-dev mailing list (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), downstream projects are forced to bring in additional dependencies in order to access hdfs. The additional dependencies can be difficult to manage for projects like Apache Falcon and Apache Oozie. This jira proposes to create a new project, hadoop-hdfs-client, which contains the client side of the hdfs code. Downstream projects can use this jar instead of hadoop-hdfs to avoid unnecessary dependencies. Note that it does not break the compatibility of downstream projects, because old downstream projects implicitly depend on hadoop-hdfs-client through the hadoop-hdfs jar.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6169: Attachment: HDFS-6169.7.patch Update the patch to handle /webhdfs/v1 prefix as invalid.
[jira] [Updated] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6200: - Attachment: HDFS-6200.005.patch
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962692#comment-13962692 ] Haohui Mai commented on HDFS-6200: -- The v5 patch fixes the audit warning.
[jira] [Updated] (HDFS-6180) dead node count / listing is very broken in JMX and old GUI
[ https://issues.apache.org/jira/browse/HDFS-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6180: - Attachment: HDFS-6180.005.patch

dead node count / listing is very broken in JMX and old GUI
---
Key: HDFS-6180
URL: https://issues.apache.org/jira/browse/HDFS-6180
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Travis Thompson
Assignee: Haohui Mai
Priority: Blocker
Fix For: 2.5.0
Attachments: HDFS-6180.000.patch, HDFS-6180.001.patch, HDFS-6180.002.patch, HDFS-6180.003.patch, HDFS-6180.004.patch, dn.log

After bringing up a 578 node cluster with 13 dead nodes, 0 were reported on the new GUI, but showed up properly in the datanodes tab. Some nodes are also being double reported in the deadnode and inservice section (22 show up dead, 565 show up alive, 9 duplicated nodes). From /jmx (confirmed that it's the same in jconsole):
{noformat}
{
  "name" : "Hadoop:service=NameNode,name=FSNamesystemState",
  "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
  "CapacityTotal" : 5477748687372288,
  "CapacityUsed" : 24825720407,
  "CapacityRemaining" : 5477723861651881,
  "TotalLoad" : 565,
  "SnapshotStats" : "{\"SnapshottableDirectories\":0,\"Snapshots\":0}",
  "BlocksTotal" : 21065,
  "MaxObjects" : 0,
  "FilesTotal" : 25454,
  "PendingReplicationBlocks" : 0,
  "UnderReplicatedBlocks" : 0,
  "ScheduledReplicationBlocks" : 0,
  "FSState" : "Operational",
  "NumLiveDataNodes" : 565,
  "NumDeadDataNodes" : 0,
  "NumDecomLiveDataNodes" : 0,
  "NumDecomDeadDataNodes" : 0,
  "NumDecommissioningDataNodes" : 0,
  "NumStaleDataNodes" : 1
},
{noformat}
I'm not going to include deadnode/livenodes because the list is huge, but I've confirmed there are 9 nodes showing up in both deadnodes and livenodes.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6180) dead node count / listing is very broken in JMX and old GUI
[ https://issues.apache.org/jira/browse/HDFS-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6180: - Attachment: (was: HDFS-6180.005.patch)
[jira] [Commented] (HDFS-6180) dead node count / listing is very broken in JMX and old GUI
[ https://issues.apache.org/jira/browse/HDFS-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962695#comment-13962695 ] Haohui Mai commented on HDFS-6180: -- Sorry, uploaded the wrong patch.
[jira] [Commented] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962744#comment-13962744 ] Hadoop QA commented on HDFS-6169: -
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639152/HDFS-6169.7.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6617//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6617//console
This message is automatically generated.
[jira] [Commented] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962747#comment-13962747 ] Hadoop QA commented on HDFS-6169: -
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639152/HDFS-6169.7.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6616//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6616//console
This message is automatically generated.
[jira] [Commented] (HDFS-6198) DataNode rolling upgrade does not correctly identify current block pool directory and replace with trash on Windows.
[ https://issues.apache.org/jira/browse/HDFS-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962774#comment-13962774 ] Hudson commented on HDFS-6198: -- FAILURE: Integrated in Hadoop-Yarn-trunk #533 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/533/])
HDFS-6198. DataNode rolling upgrade does not correctly identify current block pool directory and replace with trash on Windows. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585627)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockPoolSliceStorage.java

DataNode rolling upgrade does not correctly identify current block pool directory and replace with trash on Windows.
Key: HDFS-6198
URL: https://issues.apache.org/jira/browse/HDFS-6198
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Affects Versions: 3.0.0, 2.4.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Fix For: 3.0.0, 2.4.1
Attachments: HDFS-6198.1.patch, HDFS-6198.2.patch

For rolling upgrade, the DataNode uses a regex to identify the current block pool directory and replace it with the equivalent trash directory. This regex hard-codes '/' as the file path separator, which doesn't work on Windows.

--
This message was sent by Atlassian JIRA (v6.2#6252)
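The separator bug described in the issue summary can be demonstrated with a small sketch. The regex below is a hypothetical simplification of the one in BlockPoolSliceStorage, shown only to illustrate why escaping the platform separator matters:

```python
import re

def current_dir_pattern(sep):
    # Escape the separator so the same pattern works for '/' (Unix)
    # and '\' (Windows); a pattern hard-coded with '/' never matches
    # a Windows path.
    s = re.escape(sep)
    return re.compile(r"^(.*)(" + s + "current)(" + s + ".*)$")

unix = current_dir_pattern("/")
win = current_dir_pattern("\\")

assert unix.match("/data/bp-1/current/finalized")
assert win.match(r"E:\data\bp-1\current\finalized")
# The hard-coded Unix pattern misses the Windows-style path entirely:
assert unix.match(r"E:\data\bp-1\current\finalized") is None
```

The last assertion is the failure mode on Windows: the block pool's current directory is never recognized, so it is never swapped for the trash directory.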
[jira] [Commented] (HDFS-6180) dead node count / listing is very broken in JMX and old GUI
[ https://issues.apache.org/jira/browse/HDFS-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962776#comment-13962776 ] Hudson commented on HDFS-6180: -- FAILURE: Integrated in Hadoop-Yarn-trunk #533 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/533/])
HDFS-6180. Dead node count / listing is very broken in JMX and old GUI. Contributed by Haohui Mai. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585625)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HostFileManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeRegistration.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962773#comment-13962773 ] Hudson commented on HDFS-6143: -- FAILURE: Integrated in Hadoop-Yarn-trunk #533 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/533/])
HDFS-6143. WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths. Contributed by Gera Shegalov. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585639)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/ByteRangeInputStream.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java

WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
Key: HDFS-6143
URL: https://issues.apache.org/jira/browse/HDFS-6143
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
Priority: Blocker
Fix For: 2.5.0
Attachments: HDFS-6143-branch-2.4.0.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v02.patch, HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch, HDFS-6143.v04.patch, HDFS-6143.v05.patch, HDFS-6143.v06.patch

WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths.
- 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; that is deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of code that's broken because of this.
- On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths.

--
This message was sent by Atlassian JIRA (v6.2#6252)
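The deferred-error behavior described in the first bullet can be contrasted with eager checking in a small sketch. Here `fetch` and the helper names are hypothetical stand-ins for WebHdfsFileSystem's connection logic, not the actual fix:

```python
def open_eagerly(fetch, path):
    """Contact the server at open time so a missing path raises
    FileNotFoundError immediately, matching the POSIX ENOENT-on-open
    semantics described above. 'fetch' abstracts the HTTP GET."""
    status, body = fetch(path)       # issue the request now, not on read()
    if status == 404:
        raise FileNotFoundError(path)
    return body

# Simulated server for illustration: only /exists is present.
def fake_fetch(path):
    return (200, b"data") if path == "/exists" else (404, b"")

assert open_eagerly(fake_fetch, "/exists") == b"data"

missing_raises = False
try:
    open_eagerly(fake_fetch, "/missing")
except FileNotFoundError:
    missing_raises = True
assert missing_raises
```

Deferring the request until the first read, by contrast, makes the error surface far from the call site, which is what broke callers like LzoInputFormat.getSplits.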
[jira] [Commented] (HDFS-6181) Fix the wrong property names in NFS user guide
[ https://issues.apache.org/jira/browse/HDFS-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962770#comment-13962770 ] Hudson commented on HDFS-6181: -- FAILURE: Integrated in Hadoop-Yarn-trunk #533 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/533/])
HDFS-6181. Fix the wrong property names in NFS user guide. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585563)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolPB.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolPB.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolPB.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolPB.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/JournalProtocolPB.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolPB.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/protocol/QJournalProtocol.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/protocolPB/QJournalProtocolPB.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/GetJournalEditServlet.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNodeHttpServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/JournalProtocol.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeProtocol.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSHAAdmin.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetImageServlet.java
*
[jira] [Commented] (HDFS-6191) Disable quota checks when replaying edit log.
[ https://issues.apache.org/jira/browse/HDFS-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962777#comment-13962777 ] Hudson commented on HDFS-6191: -- FAILURE: Integrated in Hadoop-Yarn-trunk #533 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/533/]) HDFS-6191. Disable quota checks when replaying edit log. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585544) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirectory.java Disable quota checks when replaying edit log. - Key: HDFS-6191 URL: https://issues.apache.org/jira/browse/HDFS-6191 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 0.23.11, 2.5.0 Attachments: HDFS-6191.branch-0.23.patch, HDFS-6191.patch, HDFS-6191.patch, HDFS-6191.patch Since the serving NN does quota checks before logging edits, performing quota checks is unnecessary while replaying edits. I propose disabling quota checks for 2NN and SBN. -- This message was sent by Atlassian JIRA (v6.2#6252)
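The idea above — the active NameNode already enforces quotas before an edit is logged, so replay on the 2NN/SBN can safely skip the check — can be sketched with a small illustrative guard. All class and method names below are assumptions for illustration, not the actual FSDirectory API:

```java
// Illustrative sketch of skipping quota verification during edit-log
// replay; the real patch toggles a similar flag for the BackupNode/2NN/SBN,
// but every name in this sketch is an assumption.
class QuotaChecker {
    private boolean quotaChecksEnabled = true;

    // Called on nodes that only replay edits (2NN/SBN): the serving
    // NameNode already verified quotas before logging each edit.
    void disableQuotaChecks() {
        quotaChecksEnabled = false;
    }

    // Returns true when the update is admitted under the quota.
    boolean verifyQuota(long used, long limit, long delta) {
        if (!quotaChecksEnabled) {
            return true; // replaying: trust the original check
        }
        return used + delta <= limit;
    }
}
```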
[jira] [Commented] (HDFS-6197) Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists.
[ https://issues.apache.org/jira/browse/HDFS-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962772#comment-13962772 ] Hudson commented on HDFS-6197: -- FAILURE: Integrated in Hadoop-Yarn-trunk #533 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/533/]) HDFS-6197. Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585586) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists. -- Key: HDFS-6197 URL: https://issues.apache.org/jira/browse/HDFS-6197 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Fix For: 3.0.0, 2.4.1 Attachments: HDFS-6197.1.patch As part of a rollback from a rolling upgrade in progress, we discard edit log segments by renaming the file with suffix .trash. If no new edits arrive in between multiple attempts, then the .trash file may still exist. On Windows, {{java.io.File#renameTo}} fails if the destination already exists though. This is visible right now as a failure in {{TestRollingUpgrade#testRollback}} when running on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
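The fix pattern described above can be sketched as follows — a minimal, illustrative version (not the actual FileJournalManager code) that deletes a stale .trash destination before renaming, since {{java.io.File#renameTo}} fails on Windows when the destination already exists:

```java
import java.io.File;
import java.io.IOException;

// Illustrative Windows-safe rename: java.io.File#renameTo fails on
// Windows if the destination exists, so a stale ".trash" target left
// over from a previous rollback attempt is deleted first.
class TrashRename {
    static void renameToTrash(File segment, File trash) throws IOException {
        if (trash.exists() && !trash.delete()) {
            throw new IOException("Could not delete stale trash file " + trash);
        }
        if (!segment.renameTo(trash)) {
            throw new IOException("Could not rename " + segment + " to " + trash);
        }
    }
}
```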
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962802#comment-13962802 ] Hadoop QA commented on HDFS-6200: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639158/HDFS-6200.005.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6618//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6618//console This message is automatically generated. 
Create a separate jar for hdfs-client - Key: HDFS-6200 URL: https://issues.apache.org/jira/browse/HDFS-6200 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, HDFS-6200.005.patch Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs client. As discussed in the hdfs-dev mailing list (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), downstream projects are forced to bring in additional dependencies in order to access hdfs. These additional dependencies can sometimes be difficult to manage for projects like Apache Falcon and Apache Oozie. This jira proposes to create a new project, hadoop-hdfs-client, which contains the client side of the hdfs code. Downstream projects can use this jar instead of hadoop-hdfs to avoid unnecessary dependencies. Note that it does not break the compatibility of downstream projects. This is because old downstream projects implicitly depend on hadoop-hdfs-client through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.2#6252)
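Once such a hadoop-hdfs-client artifact exists, a downstream project would declare it in place of hadoop-hdfs. A hypothetical Maven dependency might look like the following — the jira only proposes the split, so the exact coordinates and version here are assumptions:

```xml
<!-- Hypothetical coordinates: the jira proposes a hadoop-hdfs-client
     artifact, but the groupId/version shown here are assumptions. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs-client</artifactId>
  <version>2.5.0</version>
</dependency>
```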
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962963#comment-13962963 ] Hudson commented on HDFS-6143: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1751 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1751/]) HDFS-6143. WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths. Contributed by Gera Shegalov. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585639) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/ByteRangeInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Fix For: 2.5.0 Attachments: HDFS-6143-branch-2.4.0.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v02.patch, HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch, HDFS-6143.v04.patch, HDFS-6143.v05.patch, HDFS-6143.v06.patch WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; the error is deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. 
[LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of the code that's broken because of this. - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
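The POSIX-like behavior the report asks for — failing at open rather than at first read — can be illustrated with a small mock file system. This is not the WebHdfsFileSystem code; all names here are hypothetical:

```java
import java.io.FileNotFoundException;
import java.util.Set;

// Mock of the desired semantics: open() probes for existence up front,
// so a missing path surfaces immediately as FileNotFoundException (like
// ENOENT from POSIX open) instead of on the first read. Illustrative only.
class EagerOpenFs {
    private final Set<String> existingPaths;

    EagerOpenFs(Set<String> existingPaths) {
        this.existingPaths = existingPaths;
    }

    // Stands in for returning an input stream; fails fast when absent.
    String open(String path) throws FileNotFoundException {
        if (!existingPaths.contains(path)) {
            throw new FileNotFoundException("File " + path + " does not exist.");
        }
        return "stream:" + path;
    }
}
```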
[jira] [Commented] (HDFS-6197) Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists.
[ https://issues.apache.org/jira/browse/HDFS-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962962#comment-13962962 ] Hudson commented on HDFS-6197: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1751 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1751/]) HDFS-6197. Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585586) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists. -- Key: HDFS-6197 URL: https://issues.apache.org/jira/browse/HDFS-6197 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Fix For: 3.0.0, 2.4.1 Attachments: HDFS-6197.1.patch As part of a rollback from a rolling upgrade in progress, we discard edit log segments by renaming the file with suffix .trash. If no new edits arrive in between multiple attempts, then the .trash file may still exist. On Windows, {{java.io.File#renameTo}} fails if the destination already exists though. This is visible right now as a failure in {{TestRollingUpgrade#testRollback}} when running on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6181) Fix the wrong property names in NFS user guide
[ https://issues.apache.org/jira/browse/HDFS-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962960#comment-13962960 ] Hudson commented on HDFS-6181: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1751 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1751/]) HDFS-6181. Fix the wrong property names in NFS user guide. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585563) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/JournalProtocolPB.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/protocol/QJournalProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/protocolPB/QJournalProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/GetJournalEditServlet.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNodeHttpServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/JournalProtocol.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSHAAdmin.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetImageServlet.java *
[jira] [Commented] (HDFS-6180) dead node count / listing is very broken in JMX and old GUI
[ https://issues.apache.org/jira/browse/HDFS-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962966#comment-13962966 ] Hudson commented on HDFS-6180: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1751 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1751/]) HDFS-6180. Dead node count / listing is very broken in JMX and old GUI. Contributed by Haohui Mai. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585625) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HostFileManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeRegistration.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java dead node count / listing is very broken in JMX and old GUI --- Key: HDFS-6180 URL: https://issues.apache.org/jira/browse/HDFS-6180 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Haohui Mai Priority: Blocker Fix For: 2.5.0 Attachments: HDFS-6180.000.patch, HDFS-6180.001.patch, HDFS-6180.002.patch, HDFS-6180.003.patch, HDFS-6180.004.patch, dn.log After bringing up a 578 node cluster with 13 dead nodes, 0 were reported on the new GUI, but showed up properly in the 
datanodes tab. Some nodes are also being double reported in the deadnode and inservice section (22 show up dead, 565 show up alive, 9 duplicated nodes). From /jmx (confirmed that it's the same in jconsole):
{noformat}
{
  "name" : "Hadoop:service=NameNode,name=FSNamesystemState",
  "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
  "CapacityTotal" : 5477748687372288,
  "CapacityUsed" : 24825720407,
  "CapacityRemaining" : 5477723861651881,
  "TotalLoad" : 565,
  "SnapshotStats" : "{\"SnapshottableDirectories\":0,\"Snapshots\":0}",
  "BlocksTotal" : 21065,
  "MaxObjects" : 0,
  "FilesTotal" : 25454,
  "PendingReplicationBlocks" : 0,
  "UnderReplicatedBlocks" : 0,
  "ScheduledReplicationBlocks" : 0,
  "FSState" : "Operational",
  "NumLiveDataNodes" : 565,
  "NumDeadDataNodes" : 0,
  "NumDecomLiveDataNodes" : 0,
  "NumDecomDeadDataNodes" : 0,
  "NumDecommissioningDataNodes" : 0,
  "NumStaleDataNodes" : 1
},
{noformat}
I'm not going to include deadnode/livenodes because the list is huge, but I've confirmed there are 9 nodes showing up in both deadnodes and livenodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
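The inconsistency described above — nodes reported in both the dead and live lists — is easy to detect mechanically. A sketch of such a sanity check follows, using hypothetical names rather than any actual HDFS API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sanity check for the double-reporting symptom above: a DataNode must
// never appear in both the live and dead sets. Illustrative names only.
class NodeReportCheck {
    static List<String> doubleReported(Set<String> live, Set<String> dead) {
        List<String> dup = new ArrayList<>(live);
        dup.retainAll(dead); // nodes present in both lists, as in the bug
        return dup;
    }
}
```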
[jira] [Commented] (HDFS-6191) Disable quota checks when replaying edit log.
[ https://issues.apache.org/jira/browse/HDFS-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962967#comment-13962967 ] Hudson commented on HDFS-6191: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1751 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1751/]) HDFS-6191. Disable quota checks when replaying edit log. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585544) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirectory.java Disable quota checks when replaying edit log. - Key: HDFS-6191 URL: https://issues.apache.org/jira/browse/HDFS-6191 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 0.23.11, 2.5.0 Attachments: HDFS-6191.branch-0.23.patch, HDFS-6191.patch, HDFS-6191.patch, HDFS-6191.patch Since the serving NN does quota checks before logging edits, performing quota checks is unnecessary while replaying edits. I propose disabling quota checks for 2NN and SBN. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6191) Disable quota checks when replaying edit log.
[ https://issues.apache.org/jira/browse/HDFS-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962978#comment-13962978 ] Hudson commented on HDFS-6191: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1725/]) HDFS-6191. Disable quota checks when replaying edit log. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585544) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirectory.java Disable quota checks when replaying edit log. - Key: HDFS-6191 URL: https://issues.apache.org/jira/browse/HDFS-6191 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 0.23.11, 2.5.0 Attachments: HDFS-6191.branch-0.23.patch, HDFS-6191.patch, HDFS-6191.patch, HDFS-6191.patch Since the serving NN does quota checks before logging edits, performing quota checks is unnecessary while replaying edits. I propose disabling quota checks for 2NN and SBN. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962974#comment-13962974 ] Hudson commented on HDFS-6143: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1725/]) HDFS-6143. WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths. Contributed by Gera Shegalov. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585639) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/ByteRangeInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Fix For: 2.5.0 Attachments: HDFS-6143-branch-2.4.0.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v02.patch, HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch, HDFS-6143.v04.patch, HDFS-6143.v05.patch, HDFS-6143.v06.patch WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; the error is deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. 
[LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of the code that's broken because of this. - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6180) dead node count / listing is very broken in JMX and old GUI
[ https://issues.apache.org/jira/browse/HDFS-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962977#comment-13962977 ] Hudson commented on HDFS-6180: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1725/]) HDFS-6180. Dead node count / listing is very broken in JMX and old GUI. Contributed by Haohui Mai. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585625) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HostFileManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeRegistration.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java dead node count / listing is very broken in JMX and old GUI --- Key: HDFS-6180 URL: https://issues.apache.org/jira/browse/HDFS-6180 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Haohui Mai Priority: Blocker Fix For: 2.5.0 Attachments: HDFS-6180.000.patch, HDFS-6180.001.patch, HDFS-6180.002.patch, HDFS-6180.003.patch, HDFS-6180.004.patch, dn.log After bringing up a 578 node cluster with 13 dead nodes, 0 were reported on the new GUI, but showed up properly in the datanodes tab. 
Some nodes are also being double reported in the deadnode and inservice section (22 show up dead, 565 show up alive, 9 duplicated nodes). From /jmx (confirmed that it's the same in jconsole):
{noformat}
{
  "name" : "Hadoop:service=NameNode,name=FSNamesystemState",
  "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
  "CapacityTotal" : 5477748687372288,
  "CapacityUsed" : 24825720407,
  "CapacityRemaining" : 5477723861651881,
  "TotalLoad" : 565,
  "SnapshotStats" : "{\"SnapshottableDirectories\":0,\"Snapshots\":0}",
  "BlocksTotal" : 21065,
  "MaxObjects" : 0,
  "FilesTotal" : 25454,
  "PendingReplicationBlocks" : 0,
  "UnderReplicatedBlocks" : 0,
  "ScheduledReplicationBlocks" : 0,
  "FSState" : "Operational",
  "NumLiveDataNodes" : 565,
  "NumDeadDataNodes" : 0,
  "NumDecomLiveDataNodes" : 0,
  "NumDecomDeadDataNodes" : 0,
  "NumDecommissioningDataNodes" : 0,
  "NumStaleDataNodes" : 1
},
{noformat}
I'm not going to include deadnode/livenodes because the list is huge, but I've confirmed there are 9 nodes showing up in both deadnodes and livenodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6181) Fix the wrong property names in NFS user guide
[ https://issues.apache.org/jira/browse/HDFS-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962971#comment-13962971 ] Hudson commented on HDFS-6181: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1725/]) HDFS-6181. Fix the wrong property names in NFS user guide. Contributed by Brandon Li (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585563) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/JournalProtocolPB.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/protocol/QJournalProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/protocolPB/QJournalProtocolPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/GetJournalEditServlet.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNodeHttpServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InterDatanodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/JournalProtocol.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSHAAdmin.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetImageServlet.java *
[jira] [Commented] (HDFS-6197) Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists.
[ https://issues.apache.org/jira/browse/HDFS-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962973#comment-13962973 ] Hudson commented on HDFS-6197: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1725/]) HDFS-6197. Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585586) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java Rolling upgrade rollback on Windows can fail attempting to rename edit log segment files to a destination that already exists. -- Key: HDFS-6197 URL: https://issues.apache.org/jira/browse/HDFS-6197 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Fix For: 3.0.0, 2.4.1 Attachments: HDFS-6197.1.patch As part of a rollback from a rolling upgrade in progress, we discard edit log segments by renaming the file with suffix .trash. If no new edits arrive in between multiple attempts, then the .trash file may still exist. On Windows, {{java.io.File#renameTo}} fails if the destination already exists, though. This is visible right now as a failure in {{TestRollingUpgrade#testRollback}} when running on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
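The platform difference described above ({{java.io.File#renameTo}} failing on Windows when the destination exists) can be avoided with NIO's {{Files.move}} and {{REPLACE_EXISTING}}, which behaves the same on all platforms. The sketch below is illustrative only, with a hypothetical {{moveToTrash}} helper and file names; it is not the actual HDFS-6197 patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TrashRename {
    // Move an edit log segment to its ".trash" name, overwriting any
    // leftover from a previous rollback attempt. Unlike File#renameTo,
    // Files.move with REPLACE_EXISTING is specified to replace an
    // existing destination on Windows as well as on Unix.
    static void moveToTrash(Path segment) throws IOException {
        Path trash = segment.resolveSibling(segment.getFileName() + ".trash");
        Files.move(segment, trash, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("edits");
        Path seg = Files.write(dir.resolve("edits_001"), "new".getBytes());
        // Simulate a stale .trash file left over from an earlier attempt.
        Files.write(dir.resolve("edits_001.trash"), "old".getBytes());
        moveToTrash(seg);
        System.out.println(new String(Files.readAllBytes(dir.resolve("edits_001.trash"))));
    }
}
```

Running the main method prints "new": the stale trash copy is replaced rather than causing a rename failure.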
[jira] [Commented] (HDFS-6198) DataNode rolling upgrade does not correctly identify current block pool directory and replace with trash on Windows.
[ https://issues.apache.org/jira/browse/HDFS-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962975#comment-13962975 ] Hudson commented on HDFS-6198: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1725/]) HDFS-6198. DataNode rolling upgrade does not correctly identify current block pool directory and replace with trash on Windows. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585627) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockPoolSliceStorage.java DataNode rolling upgrade does not correctly identify current block pool directory and replace with trash on Windows. Key: HDFS-6198 URL: https://issues.apache.org/jira/browse/HDFS-6198 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 3.0.0, 2.4.1 Attachments: HDFS-6198.1.patch, HDFS-6198.2.patch For rolling upgrade, the DataNode uses a regex to identify the current block pool directory and replace it with the equivalent trash directory. This regex hard-codes '/' as the file path separator, which doesn't work on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
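The separator bug described above can be made concrete with a small sketch. The pattern and helper below are hypothetical, not the actual {{BlockPoolSliceStorage}} code: the point is that a regex using the character class {{[/\\]}} accepts both '/' and '\' and so works on Windows paths, where a hard-coded '/' does not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockPoolPathMatcher {
    // Accept either separator via [/\\] (the Java string doubles each
    // backslash). A pattern written with a literal '/' would never match
    // a Windows-style path such as C:\dfs\current\BP-1\finalized.
    private static final Pattern BLOCK_POOL_CURRENT =
        Pattern.compile("^(.*)([/\\\\])(current)([/\\\\]BP-[^/\\\\]+)(.*)$");

    // Rewrite ".../current/BP-.../..." to ".../trash/BP-.../...",
    // preserving whichever separator the input used.
    static String replaceCurrentWithTrash(String path) {
        Matcher m = BLOCK_POOL_CURRENT.matcher(path);
        if (!m.matches()) {
            return path;  // not a block pool "current" path; leave unchanged
        }
        return m.group(1) + m.group(2) + "trash" + m.group(4) + m.group(5);
    }
}
```

With this pattern, both "/data/dfs/current/BP-1/finalized" and "C:\dfs\current\BP-1\finalized" map into the corresponding trash directory.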
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962999#comment-13962999 ] Daryn Sharp commented on HDFS-6143: --- Unfortunately this appears to introduce a performance penalty for jobs using webhdfs. A file used as job input will open the file and immediately seek to the split location. Currently, the lazy open will only begin streaming from the seek offset. Doesn't this patch cause every map to begin streaming from offset 0, followed by the seek which closes the stream, and then re-opening and streaming from the new offset? If yes, this adds unnecessary load, additional latency for an unnecessary connection that will be closed, wastes bandwidth on data that will be ignored, etc. I think the patch should be reverted. WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Fix For: 2.5.0 Attachments: HDFS-6143-branch-2.4.0.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v02.patch, HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch, HDFS-6143.v04.patch, HDFS-6143.v05.patch, HDFS-6143.v06.patch WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; the error is deferred until the next read. This is counterintuitive and not how the local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of code that is broken because of this.
- On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
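The open/seek interaction Daryn describes can be modeled with a toy stream that merely counts connections. This is not WebHDFS code; the {{Stream}} class below is a deliberately simplified stand-in to show why a lazy open (connect on first read, at the final offset) costs one connection per map task, while an eager open followed by a seek costs two.

```java
public class LazyOpenModel {
    // Toy stand-in for a WebHDFS input stream. connect() models the HTTP
    // OPEN request to the datanode at the current offset; seek() models
    // dropping any open stream so the next read must reconnect.
    static class Stream {
        long offset;
        int connects;
        boolean connected;
        void connect() { connects++; connected = true; }
        void seek(long off) { offset = off; connected = false; }
        int read() { if (!connected) connect(); return 0; }
    }

    public static void main(String[] args) {
        Stream lazy = new Stream();         // current behavior: open() is a no-op
        lazy.seek(1 << 20);                 // map task seeks to its split offset
        lazy.read();
        System.out.println(lazy.connects);  // one connection, at the split offset

        Stream eager = new Stream();
        eager.connect();                    // eager open streams from offset 0
        eager.seek(1 << 20);                // closes that stream
        eager.read();                       // reconnects at the new offset
        System.out.println(eager.connects); // two connections: one wasted
    }
}
```

The main method prints 1 and then 2, which is the extra load, latency, and wasted bandwidth the comment objects to.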
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963005#comment-13963005 ] Daryn Sharp commented on HDFS-6143: --- If I understand the intent of this jira correctly, I believe you want to do something similar to the manual redirect resolution ala the two-step write: upon creating the data stream, issue the open to the NN with follow redirects disabled, which will elicit an immediate 404. Then upon read use the resolved url. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6201) Get EncryptionKey from NN only if data transfer encryption is required
Benoy Antony created HDFS-6201: -- Summary: Get EncryptionKey from NN only if data transfer encryption is required Key: HDFS-6201 URL: https://issues.apache.org/jira/browse/HDFS-6201 Project: Hadoop HDFS Issue Type: Improvement Components: security Reporter: Benoy Antony Assignee: Benoy Antony HDFS-5910 allowed data transfer encryption to be decided by custom logic based on the IP address of the client and datanode. This is on top of the _dfs.encrypt.data.transfer_ flag. There are some invocations where the encryption key is fetched first and the datanode is identified later. In these cases, the encryption key is fetched after invoking the custom logic without the IP address of the datanode. This might result in fetching the encryption key when it is not required and vice versa. To correct this, a refactoring is required so that the encryption key is fetched only when it is required. Per [~arpitagarwal] on HDFS-5910 {quote} For the usage in getDataEncryptionKey(), we can refactor to pass a functor as the encryption key to e.g. getFileChecksum. However I am okay with doing the refactoring in a separate change. We can leave the parameter-less overload of isTrusted for now and just use it from getEncryptionKey and file a separate Jira to fix it. {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
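The "pass a functor" refactoring suggested in the quote can be sketched with a {{Supplier}}. Everything here is hypothetical stand-in code ({{fetchKeyFromNameNode}}, the {{String}} key, the {{transfer}} signature), not the HDFS API; it only shows the shape of the change: the caller hands over a way to fetch the key, and the NN is contacted only on the encrypted path.

```java
import java.util.function.Supplier;

public class LazyEncryptionKey {
    static int nnFetches = 0;  // counts simulated NameNode RPCs

    // Hypothetical stand-in for fetching a DataEncryptionKey from the NN.
    static String fetchKeyFromNameNode() {
        nnFetches++;
        return "key";
    }

    // Instead of taking the key itself, take a Supplier (the "functor"):
    // the fetch happens inside, and only when encryption is required.
    static void transfer(boolean encryptionRequired, Supplier<String> keySupplier) {
        if (encryptionRequired) {
            String key = keySupplier.get();  // NN contacted here, on demand
            // ... set up encrypted data transfer with the key ...
        }
        // ... unencrypted path never touches the NN for a key ...
    }

    public static void main(String[] args) {
        transfer(false, LazyEncryptionKey::fetchKeyFromNameNode);
        System.out.println(nnFetches);  // 0: no fetch when encryption is off
        transfer(true, LazyEncryptionKey::fetchKeyFromNameNode);
        System.out.println(nnFetches);  // 1: fetched exactly once, on demand
    }
}
```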
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963139#comment-13963139 ] Gera Shegalov commented on HDFS-6143: - [~daryn], thanks for the review. I agree that there is a performance cost associated with streaming too early if we are going to seek right away, e.g., to a split offset. What do you think of just invoking {{getFileStatus}} as the first thing in {{WebHdfsFileSystem.open}}? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6056: - Attachment: HDFS-6056.002.patch Forgot to update the test configuration file. Uploaded a new patch to fix it. Clean up NFS config settings Key: HDFS-6056 URL: https://issues.apache.org/jira/browse/HDFS-6056 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.3.0 Reporter: Aaron T. Myers Assignee: Brandon Li Attachments: HDFS-6056.001.patch, HDFS-6056.002.patch As discussed on HDFS-6050, there are a few opportunities to improve the config settings related to NFS. This JIRA is to implement those changes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6202) Replace direct usage of ProxyUser Constants with function
Benoy Antony created HDFS-6202: -- Summary: Replace direct usage of ProxyUser Constants with function Key: HDFS-6202 URL: https://issues.apache.org/jira/browse/HDFS-6202 Project: Hadoop HDFS Issue Type: Improvement Components: security Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor This is a pre-requisite for HADOOP-10471 to reduce the visibility of constants of ProxyUsers. The _TestJspHelper_ directly uses the constants in _ProxyUsers_. This can be replaced with the corresponding functions. -- This message was sent by Atlassian JIRA (v6.2#6252)
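The refactoring described above, replacing direct reads of a public constant with a function call so the constant's visibility can later shrink, can be sketched as follows. The class, constant name, and method name below are illustrative, not the actual {{ProxyUsers}} code.

```java
public class ProxyUsersSketch {
    // Before the refactoring, test code read a constant like this directly,
    // which forces it to stay public. After the change, only the accessor
    // below is public and the constant can become private.
    private static final String CONF_PREFIX = "hadoop.proxyuser.";

    // Function-based access: callers build config keys through this method
    // instead of concatenating the constant themselves.
    public static String getProxySuperuserUserConfKey(String superuser) {
        return CONF_PREFIX + superuser + ".users";
    }
}
```

A caller such as a test then writes {{getProxySuperuserUserConfKey("oozie")}} rather than {{CONF_PREFIX + "oozie" + ".users"}}, which is the behavior-preserving substitution the issue proposes.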
[jira] [Commented] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963232#comment-13963232 ] Haohui Mai commented on HDFS-6169: -- +1 Move the address in WebImageViewer -- Key: HDFS-6169 URL: https://issues.apache.org/jira/browse/HDFS-6169 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: 2.5.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Attachments: HDFS-6169.2.patch, HDFS-6169.3.patch, HDFS-6169.4.patch, HDFS-6169.5.patch, HDFS-6169.6.patch, HDFS-6169.7.patch, HDFS-6169.patch Move the endpoint of WebImageViewer from http://hostname:port/ to http://hostname:port/webhdfs/v1/ to support {{hdfs dfs -ls}} to WebImageViewer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6202) Replace direct usage of ProxyUser Constants with function
[ https://issues.apache.org/jira/browse/HDFS-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-6202: --- Attachment: HDFS-6202.patch Attaching a patch. No tests are added since this is a minor refactoring with no behavioral change. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6202) Replace direct usage of ProxyUser Constants with function
[ https://issues.apache.org/jira/browse/HDFS-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-6202: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6169: - Resolution: Fixed Fix Version/s: 2.5.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~ajisakaa] for the contribution. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963331#comment-13963331 ] Hudson commented on HDFS-6169: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5470 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5470/]) HDFS-6169. Move the address in WebImageViewer. Contributed by Akira Ajisaka. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585802) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963345#comment-13963345 ] Haohui Mai commented on HDFS-6200: -- [~szetszwo], [~tucu00], and [~ste...@apache.org], can you please take a look at this patch? Create a separate jar for hdfs-client - Key: HDFS-6200 URL: https://issues.apache.org/jira/browse/HDFS-6200 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, HDFS-6200.005.patch Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs client. As discussed in the hdfs-dev mailing list (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), downstream projects are forced to bring in additional dependencies in order to access hdfs. The additional dependencies can sometimes be difficult to manage for projects like Apache Falcon and Apache Oozie. This jira proposes to create a new project, hadoop-hdfs-client, which contains the client side of the hdfs code. Downstream projects can use this jar instead of hadoop-hdfs to avoid the unnecessary dependencies. Note that it does not break the compatibility of downstream projects, because old downstream projects implicitly depend on hadoop-hdfs-client through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-5983) TestSafeMode#testInitializeReplQueuesEarly fails
[ https://issues.apache.org/jira/browse/HDFS-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned HDFS-5983: - Assignee: Chen He TestSafeMode#testInitializeReplQueuesEarly fails Key: HDFS-5983 URL: https://issues.apache.org/jira/browse/HDFS-5983 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Chen He Attachments: testlog.txt It was seen from one of the precommit builds of HDFS-5962. The test case creates 15 blocks and then shuts down all datanodes. Then the namenode is restarted with a low safe block threshold and one datanode is restarted. The idea is that the initial block report from the restarted datanode will make the namenode leave the safemode and initialize the replication queues. According to the log, the datanode reported 3 blocks, but slightly before that the namenode did repl queue init with 1 block. I will attach the log. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963356#comment-13963356 ] Daryn Sharp commented on HDFS-6143: --- A {{getFileStatus}} call will add more load to the NN, and more importantly I don't see how it will address the issue. The NN shouldn't be redirecting you if the file doesn't exist. If it is, that's the bug that must be fixed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963359#comment-13963359 ] Daryn Sharp commented on HDFS-6143: --- I'm sorry, my brain is elsewhere, ignore previous comment. I think the client should do the two-step redirect resolve so the client will get FNF immediately, but the datanode stream will still lazy open. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963391#comment-13963391 ] Hadoop QA commented on HDFS-6056: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639222/HDFS-6056.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-nfs: org.apache.hadoop.hdfs.TestSafeMode {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6619//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6619//console This message is automatically generated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5983) TestSafeMode#testInitializeReplQueuesEarly fails
[ https://issues.apache.org/jira/browse/HDFS-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-5983: -- Attachment: HDFS-5983.patch Kihwal, Chen, I started working on this yesterday and just found out it has been assigned to Chen. Here is the patch. There are basically some timing conditions that can make the test invalid: 1. Based on the MiniDFSCluster setting, the DN will send two blockReports, one for each storage. If the second blockReport comes in after NameNodeAdapter.getSafeModeSafeBlocks and before BlockManagerTestUtil.updateState, the test will fail. 2. processMisReplicatedBlocks is async. To make sure it completes before the test gets the metrics, the test can wait until it completes. Sorry, hope Chen hasn't spent much time on this. While the root causes have been identified, there are other ways to fix the test. Feel free to continue the work if necessary. Thanks. -- This message was sent by Atlassian JIRA (v6.2#6252)
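The "wait until it completes" fix for the async processMisReplicatedBlocks race boils down to polling a condition with a timeout, which is what Hadoop's test utility GenericTestUtils.waitFor provides. A minimal standalone version of that idea, with a hypothetical method name, looks like this:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitFor {
    // Poll the condition until it holds or the timeout expires, instead of
    // asserting immediately on a metric that an async task (such as
    // processMisReplicatedBlocks) may not have updated yet.
    static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!check.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException("condition not met within " + timeoutMs + " ms");
            }
            Thread.sleep(intervalMs);
        }
    }
}
```

A test would then call something like `waitFor(() -> misReplicatedScanDone(), 100, 30000)` before reading the safe-mode metrics, rather than racing the background thread.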
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963419#comment-13963419 ] Tsz Wo Nicholas Sze commented on HDFS-6143: --- bq. ... believe you want to do something similar to the manual redirect resolution ala the two-step write: upon creating the data stream, issue the open to the NN with follow redirects disabled which will elicit an immediate 404. Then upon read use the resolved url. It may not work since the seek offset is missing in open. There is no way to calculate the redirect. [~jira.shegalov], could you explain how this breaks LzoInputFormat.getSplits? I am also concerned about the performance penalty. I wonder if there are other solutions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6203) check other namenode's state before HAadmin transitionToActive
patrick white created HDFS-6203: --- Summary: check other namenode's state before HAadmin transitionToActive Key: HDFS-6203 URL: https://issues.apache.org/jira/browse/HDFS-6203 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.3.0 Reporter: patrick white Current behavior is that the HAadmin -transitionToActive command will complete the transition to Active even if the other namenode is already in the Active state. This is not an allowed condition and should be handled by fencing; however, setting both namenodes Active can happen accidentally with relative ease, especially in a production environment when performing manual maintenance operations. If this situation does occur it is very serious and will likely cause data loss, or in the best case require a difficult recovery to avoid data loss. This is requesting an enhancement to haadmin's -transitionToActive command, to have HAadmin check the Active state of the other namenode before completing the transition. If the other namenode is Active, fail the request because the other NN is already Active. Not sure if there is a scenario where both namenodes being Active is valid or desired, but to maintain functional compatibility a 'force' parameter could be added to override this check and allow the previous behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
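The proposed guard can be sketched as follows. The {{Peer}} interface and method names below are hypothetical stand-ins for querying the other namenode's HA state (in Hadoop this would go through the HA service status RPC); the sketch only shows the check-then-refuse shape with a 'force' override, not the actual HAAdmin code.

```java
public class TransitionGuard {
    enum HAState { ACTIVE, STANDBY }

    // Hypothetical view of the other namenode's reported HA state.
    interface Peer {
        HAState getState();
    }

    // Refuse the transition when the other NN already reports Active,
    // unless the caller explicitly forces the old behavior.
    static void transitionToActive(Peer other, boolean force) {
        if (!force && other.getState() == HAState.ACTIVE) {
            throw new IllegalStateException(
                "refusing transitionToActive: other NN is already active (use force to override)");
        }
        // ... proceed with the actual transition ...
    }
}
```

With this guard, an accidental second -transitionToActive fails fast instead of producing two Active namenodes, while `force` preserves the previous behavior for any scenario that genuinely needs it.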
[jira] [Commented] (HDFS-6178) Decommission on standby NN couldn't finish
[ https://issues.apache.org/jira/browse/HDFS-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963464#comment-13963464 ] Jing Zhao commented on HDFS-6178: - The patch looks pretty good to me. One question: do we also want to add the isPopulatingReplQueues() check in BlockManager#isReplicationInProgress, when it adds the block to neededReplications? Decommission on standby NN couldn't finish -- Key: HDFS-6178 URL: https://issues.apache.org/jira/browse/HDFS-6178 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Ming Ma Attachments: HDFS-6178.patch Currently decommissioning machines in an HA-enabled cluster requires running refreshNodes on both the active and standby nodes. Sometimes decommissioning won't finish from the standby NN's point of view. Here is the diagnosis of why it could happen. The standby NN's blockManager manages block replication and block invalidation as if it were the active NN, even though DNs will ignore block commands coming from the standby NN. When the standby NN makes block operation decisions, such as the target of block replication and the node to remove excess blocks from, the decision is independent of the active NN. So the active NN and standby NN could have different states. When we try to decommission nodes on the standby NN, such state inconsistency might prevent the standby NN from making progress. Here is an example involving machines A through H. 1. For a given block, both active and standby have 5 replicas on machines A, B, C, D, E. So both active and standby decide to pick excess nodes to invalidate. Active picked D and E as excess DNs. After the next block reports from D and E, the active NN has 3 active replicas (A, B, C), 0 excess replicas.
{noformat} 2014-03-27 01:50:14,410 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:50:15,539 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (D:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} Standby picked C, E as excess DNs. Given that DNs ignore commands from the standby, after the next block reports from C, D, E, the standby has 2 active replicas (A, B) and 1 excess replica (C). {noformat} 2014-03-27 01:51:49,543 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:51:49,894 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (C:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} 2. Machine A's decomm request was sent to the standby. The standby only had one live replica and picked machines G, H as targets, but since standby commands were ignored by DNs, G and H remained in the pending replication queue until they timed out. At this point, you have one decommissioning replica (A), 1 active replica (B), and one excess replica (C). {noformat} 2014-03-27 04:42:52,258 INFO BlockStateChange: BLOCK* ask A:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) G:50010 H:50010 {noformat} 3. Machine A's decomm request was sent to the active NN. The active NN picked machine F as the target. It finished properly. So the active NN had 3 active replicas (B, C, F) and one decommissioned replica (A). {noformat} 2014-03-27 04:44:15,239 INFO BlockStateChange: BLOCK* ask 10.42.246.110:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) F:50010 2014-03-27 04:44:16,083 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: F:50010 is added to blk_-5207804474559026159_121186764 size 7100065 {noformat} 4. The standby NN picked up F as a new replica. 
Thus standby had one decommissioning replica (A), 2 active replicas (B, F), one excess replica (C). Standby NN kept trying to schedule replication work, but DNs ignored the commands. {noformat} 2014-03-27 04:44:16,084 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: F:50010 is added to blk_-5207804474559026159_121186764 size 7100065 2014-03-28 23:06:11,970 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Block: blk_-5207804474559026159_121186764, Expected Replicas: 3, live replicas: 2, corrupt replicas: 0, decommissioned replicas: 1, excess replicas: 1, Is Open File: false, Datanodes having this block: C:50010 B:50010 A:50010 F:50010 , Current Datanode: A:50010, Is current datanode decommissioning: true {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
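Jing Zhao's question above — whether BlockManager#isReplicationInProgress should also check isPopulatingReplQueues() before adding a block to neededReplications — amounts to the guard sketched below. The class and helper names are invented for illustration; only isPopulatingReplQueues and neededReplications come from the discussion.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch, not the actual BlockManager: only queue
// under-replicated blocks when the NN is populating replication
// queues, i.e. behaving as the active NN.
class ReplQueueSketch {
    private final Queue<String> neededReplications = new ArrayDeque<>();
    private boolean populatingReplQueues; // false on a standby NN

    void setPopulatingReplQueues(boolean b) { populatingReplQueues = b; }

    /** Mirrors the proposed check before adding to neededReplications. */
    boolean maybeQueueForReplication(String blockId) {
        if (!populatingReplQueues) {
            return false; // standby: skip; DNs ignore its commands anyway
        }
        neededReplications.add(blockId);
        return true;
    }

    int queued() { return neededReplications.size(); }
}
```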
[jira] [Commented] (HDFS-6203) check other namenode's state before HAadmin transitionToActive
[ https://issues.apache.org/jira/browse/HDFS-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963476#comment-13963476 ] Jing Zhao commented on HDFS-6203: - Currently, if users enable automatic failover, the manual transitionToActive operation is disallowed unless a forcemanual option is given and the user confirms the operation. Also, the prompt message states the risk of the operation. Do you mean we should also do this for the non-automatic failover case? As for checking the other NN's state: even if we add the check into the transitionToActive method, we still cannot guarantee that the other NN will not transition to active after the check. Thus I think the check here will not be very useful. check other namenode's state before HAadmin transitionToActive -- Key: HDFS-6203 URL: https://issues.apache.org/jira/browse/HDFS-6203 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.3.0 Reporter: patrick white Current behavior is that the HAadmin -transitionToActive command will complete the transition to Active even if the other namenode is already in the Active state. This is not an allowed condition and should be handled by fencing; however, setting both namenodes active can happen accidentally with relative ease, especially in a production environment when performing manual maintenance operations. If this situation does occur it is very serious and will likely cause data loss or, at best, require a difficult recovery to avoid data loss. This is a request to enhance haadmin's -transitionToActive command so that HAadmin checks the Active state of the other namenode before completing the transition. If the other namenode is Active, then fail the request because the other NN is already active. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6202) Replace direct usage of ProxyUser Constants with function
[ https://issues.apache.org/jira/browse/HDFS-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963497#comment-13963497 ] Hadoop QA commented on HDFS-6202: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639236/HDFS-6202.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6620//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6620//console This message is automatically generated. Replace direct usage of ProxyUser Constants with function - Key: HDFS-6202 URL: https://issues.apache.org/jira/browse/HDFS-6202 Project: Hadoop HDFS Issue Type: Improvement Components: security Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Attachments: HDFS-6202.patch This is a pre-requisite for HADOOP-10471 to reduce the visibility of constants of ProxyUsers. The _TestJspHelper_ directly uses the constants in _ProxyUsers_. This can be replaced with the corresponding functions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-4051) TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN failed
[ https://issues.apache.org/jira/browse/HDFS-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-4051. --- Resolution: Not a Problem TestRBWBlockInvalidation is changed. Cannot find the related code anymore. Resolving ... TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN failed - Key: HDFS-4051 URL: https://issues.apache.org/jira/browse/HDFS-4051 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.0-alpha Reporter: Eli Collins Labels: test-fail Attachments: test.log.gz TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN failed with the following on a recent trunk job. Has been seen before in HDFS-3391 as well on branch-2. {noformat} java.lang.AssertionError: There should be 1 replica in the corruptReplicasMap expected:1 but was:0 at org.junit.Assert.fail(Assert.java:91) at org.junit.Assert.failNotEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:126) at org.junit.Assert.assertEquals(Assert.java:470) at org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN(TestRBWBlockInvalidation.java:99) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6203) check other namenode's state before HAadmin transitionToActive
[ https://issues.apache.org/jira/browse/HDFS-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963514#comment-13963514 ] patrick white commented on HDFS-6203: - Right, I suggest this check for the non-automatic case. Good point, and understood: without some form of locking there is certainly the possibility of a race. This check would still have value, though, specifically for the case where manual operations are being performed. In this case it's possible for an operator to lose track of state or mistakenly use the wrong serviceId in a command. Arguably this is a 'don't do that' situation, but since it is an easy mistake to make, with very severe consequences in a production environment, a check like this would be valuable. check other namenode's state before HAadmin transitionToActive -- Key: HDFS-6203 URL: https://issues.apache.org/jira/browse/HDFS-6203 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.3.0 Reporter: patrick white Current behavior is that the HAadmin -transitionToActive command will complete the transition to Active even if the other namenode is already in the Active state. This is not an allowed condition and should be handled by fencing; however, setting both namenodes active can happen accidentally with relative ease, especially in a production environment when performing manual maintenance operations. If this situation does occur it is very serious and will likely cause data loss or, at best, require a difficult recovery to avoid data loss. This is a request to enhance haadmin's -transitionToActive command so that HAadmin checks the Active state of the other namenode before completing the transition. If the other namenode is Active, then fail the request because the other NN is already active. Not sure if there is a scenario where both namenodes being Active is valid or desired, but to maintain functional compatibility a 'force' parameter could be added to override this check and allow the previous behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6016) Update datanode replacement policy to make writes more robust
[ https://issues.apache.org/jira/browse/HDFS-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963516#comment-13963516 ] Tsz Wo Nicholas Sze commented on HDFS-6016: --- ... Making it also add an additional DN when r == 2 and the current number of DNs is 1. When creating a file with r == 2, the user actually cares more about performance. (Otherwise, one should use r == 3.) Also, if datanode replacement is preferred, one could set the policy to ALWAYS. So I suggest not making this change. Update datanode replacement policy to make writes more robust - Key: HDFS-6016 URL: https://issues.apache.org/jira/browse/HDFS-6016 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, ha, hdfs-client, namenode Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-6016.patch, HDFS-6016.patch As discussed in HDFS-5924, writers that are down to only one node due to node failures can suffer if a DN does not restart in time. We do not worry about writes that began with a single replica. -- This message was sent by Atlassian JIRA (v6.2#6252)
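For context, the DEFAULT datanode-replacement condition under discussion can be sketched as follows. This paraphrases the documented behavior of dfs.client.block.write.replace-datanode-on-failure.policy; the class and parameter names are illustrative, not the ReplaceDatanodeOnFailure source.

```java
// Sketch of the documented DEFAULT policy: with replication r and n
// remaining datanodes in the pipeline, add a replacement DN only if
// r >= 3 and either floor(r/2) >= n, or r > n and the block was
// hflushed/appended. The rejected proposal would have extended this
// to r == 2, n == 1.
class ReplaceDatanodeSketch {
    static boolean shouldAddDatanode(int r, int n, boolean hflushedOrAppended) {
        if (r < 3) {
            // An r == 2 writer has chosen performance over durability;
            // the ALWAYS policy exists if replacement is preferred.
            return false;
        }
        return (r / 2 >= n) || (r > n && hflushedOrAppended);
    }
}
```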
[jira] [Created] (HDFS-6204) TestRBWBlockInvalidation may fail
Tsz Wo Nicholas Sze created HDFS-6204: - Summary: TestRBWBlockInvalidation may fail Key: HDFS-6204 URL: https://issues.apache.org/jira/browse/HDFS-6204 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor {code} java.lang.AssertionError: There should not be any replica in the corruptReplicasMap expected:0 but was:1 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN(TestRBWBlockInvalidation.java:137) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6205) FsShell setfacl can throw ArrayIndexOutOfBoundsException when no perm is specified
Stephen Chu created HDFS-6205: - Summary: FsShell setfacl can throw ArrayIndexOutOfBoundsException when no perm is specified Key: HDFS-6205 URL: https://issues.apache.org/jira/browse/HDFS-6205 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 3.0.0, 2.4.0 Reporter: Stephen Chu Priority: Minor If users don't specify the perm of an acl when using the FsShell's setfacl command, a fatal internal error ArrayIndexOutOfBoundsException will be thrown. {code} [root@hdfs-nfs ~]# hdfs dfs -setfacl -m user:bob: /user/hdfs/td1 -setfacl: Fatal internal error java.lang.ArrayIndexOutOfBoundsException: 2 at org.apache.hadoop.fs.permission.AclEntry.parseAclEntry(AclEntry.java:285) at org.apache.hadoop.fs.permission.AclEntry.parseAclSpec(AclEntry.java:221) at org.apache.hadoop.fs.shell.AclCommands$SetfaclCommand.processOptions(AclCommands.java:260) at org.apache.hadoop.fs.shell.Command.run(Command.java:154) at org.apache.hadoop.fs.FsShell.run(FsShell.java:255) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.fs.FsShell.main(FsShell.java:308) [root@hdfs-nfs ~]# {code} An improvement would be if it returned something like this: {code} [root@hdfs-nfs ~]# hdfs dfs -setfacl -m user:bob:rww /user/hdfs/td1 -setfacl: Invalid permission in aclSpec : user:bob:rww Usage: hadoop fs [generic options] -setfacl [-R] [{-b|-k} {-m|-x acl_spec} path]|[--set acl_spec path] [root@hdfs-nfs ~]# {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
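The exception comes from indexing into the split ACL entry without first validating it. A hedged sketch of the kind of validation that would produce the friendlier error shown above (the class name and message text are invented for illustration, modeled on the desired output; this is not the actual AclEntry.parseAclEntry code):

```java
// Illustrative sketch: validate the split before indexing into it,
// instead of letting split[2] throw ArrayIndexOutOfBoundsException.
class AclSpecSketch {
    /** Returns the permission field, or throws a user-friendly error. */
    static String parsePerm(String aclEntry) {
        String[] split = aclEntry.split(":");
        // "user:bob:" splits to length 2 (Java drops trailing empty
        // fields), so a missing perm is caught here rather than at split[2].
        if (split.length < 3) {
            throw new IllegalArgumentException(
                "Invalid <aclSpec> : " + aclEntry);
        }
        String perm = split[2];
        // Accept only an ordered subset of "rwx" (e.g. "rwx", "r-x", "rw").
        if (perm.isEmpty() || !perm.matches("[r-]?[w-]?[x-]?")) {
            throw new IllegalArgumentException(
                "Invalid permission in aclSpec : " + aclEntry);
        }
        return perm;
    }
}
```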
[jira] [Updated] (HDFS-6204) TestRBWBlockInvalidation may fail
[ https://issues.apache.org/jira/browse/HDFS-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6204: -- Status: Patch Available (was: Open) TestRBWBlockInvalidation may fail - Key: HDFS-6204 URL: https://issues.apache.org/jira/browse/HDFS-6204 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h6204_20140408.patch {code} java.lang.AssertionError: There should not be any replica in the corruptReplicasMap expected:0 but was:1 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN(TestRBWBlockInvalidation.java:137) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6204) TestRBWBlockInvalidation may fail
[ https://issues.apache.org/jira/browse/HDFS-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6204: -- Attachment: h6204_20140408.patch h6204_20140408.patch: changes the last sleep in testBlockInvalidationWhenRBWReplicaMissedInDN to a loop. TestRBWBlockInvalidation may fail - Key: HDFS-6204 URL: https://issues.apache.org/jira/browse/HDFS-6204 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h6204_20140408.patch {code} java.lang.AssertionError: There should not be any replica in the corruptReplicasMap expected:0 but was:1 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN(TestRBWBlockInvalidation.java:137) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
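Replacing a fixed sleep with a bounded polling loop, as the patch description says, follows a common test-dehardening pattern. This generic sketch is illustrative only; it is not the attached patch, which presumably polls the corruptReplicasMap count via Hadoop's own test utilities.

```java
import java.util.function.BooleanSupplier;

// Generic poll-until-true loop: check the condition at a fixed interval
// until it holds or the timeout elapses, instead of sleeping once and
// hoping the background work has finished.
class WaitSketch {
    static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        // One last check at the deadline before giving up.
        return check.getAsBoolean();
    }
}
```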
[jira] [Commented] (HDFS-6206) DFSUtil.substituteForWildcardAddress may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963534#comment-13963534 ] Tsz Wo Nicholas Sze commented on HDFS-6206: --- TestClientReportBadBlock may fail. {noformat} java.lang.NullPointerException: null at org.apache.hadoop.hdfs.DFSUtil.substituteForWildcardAddress(DFSUtil.java:1145) at org.apache.hadoop.hdfs.DFSUtil.getInfoServer(DFSUtil.java:1088) at org.apache.hadoop.hdfs.tools.DFSck.getCurrentNamenodeAddress(DFSck.java:248) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:255) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:148) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:145) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:144) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.TestClientReportBadBlock.runFsck(TestClientReportBadBlock.java:346) at org.apache.hadoop.hdfs.TestClientReportBadBlock.verifyFsckHealth(TestClientReportBadBlock.java:320) at org.apache.hadoop.hdfs.TestClientReportBadBlock.testCorruptAllOfThreeReplicas(TestClientReportBadBlock.java:148) {noformat} DFSUtil.substituteForWildcardAddress may throw NPE -- Key: HDFS-6206 URL: https://issues.apache.org/jira/browse/HDFS-6206 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze InetSocketAddress.getAddress() may return null if the address is unresolved. In such a case, DFSUtil.substituteForWildcardAddress may throw NPE. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6206) DFSUtil.substituteForWildcardAddress may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6206: -- Status: Patch Available (was: Open) DFSUtil.substituteForWildcardAddress may throw NPE -- Key: HDFS-6206 URL: https://issues.apache.org/jira/browse/HDFS-6206 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6206_20140408.patch InetSocketAddress.getAddress() may return null if the address is unresolved. In such a case, DFSUtil.substituteForWildcardAddress may throw NPE. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6206) DFSUtil.substituteForWildcardAddress may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6206: -- Attachment: h6206_20140408.patch h6206_20140408.patch: check for null. DFSUtil.substituteForWildcardAddress may throw NPE -- Key: HDFS-6206 URL: https://issues.apache.org/jira/browse/HDFS-6206 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6206_20140408.patch InetSocketAddress.getAddress() may return null if the address is unresolved. In such a case, DFSUtil.substituteForWildcardAddress may throw NPE. -- This message was sent by Atlassian JIRA (v6.2#6252)
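The failure mode and the shape of a "check null" fix can be sketched without Hadoop classes. All names below are invented for illustration; only the InetSocketAddress.getAddress() behavior is from the report, and this is not the attached h6206_20140408.patch.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Illustrative sketch: getAddress() returns null for an unresolved
// address, so calling a method like isAnyLocalAddress() on the result
// without a null check is exactly the NPE described in HDFS-6206.
class WildcardSketch {
    static String hostPort(InetSocketAddress addr, String wildcardReplacement) {
        InetAddress ip = addr.getAddress();
        if (ip == null) {
            // Unresolved: keep the configured host instead of dereferencing null.
            return addr.getHostString() + ":" + addr.getPort();
        }
        String host = ip.isAnyLocalAddress() ? wildcardReplacement
                                             : addr.getHostString();
        return host + ":" + addr.getPort();
    }
}
```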
[jira] [Created] (HDFS-6207) ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
Arpit Gupta created HDFS-6207: - Summary: ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken() Key: HDFS-6207 URL: https://issues.apache.org/jira/browse/HDFS-6207 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta While running a Hive job on an HA cluster, we saw a ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6207) ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
[ https://issues.apache.org/jira/browse/HDFS-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-6207: -- Assignee: Jing Zhao ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken() Key: HDFS-6207 URL: https://issues.apache.org/jira/browse/HDFS-6207 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jing Zhao While running a hive job on a HA cluster saw ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken() -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6207) ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
[ https://issues.apache.org/jira/browse/HDFS-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963550#comment-13963550 ] Arpit Gupta commented on HDFS-6207: --- {code} Caused by: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894) at java.util.HashMap$ValueIterator.next(HashMap.java:922) at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1067) at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSelector.selectToken(AbstractDelegationTokenSelector.java:53) at org.apache.hadoop.hdfs.HAUtil.cloneDelegationTokenForLogicalUri(HAUtil.java:260) at org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.init {code} ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken() Key: HDFS-6207 URL: https://issues.apache.org/jira/browse/HDFS-6207 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta While running a hive job on a HA cluster saw ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken() -- This message was sent by Atlassian JIRA (v6.2#6252)
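The stack trace shows an iterator over a HashMap view failing while another thread mutates the map. A common fix for this class of bug is to iterate over a snapshot of the collection; the following is only a generic sketch of that pattern, not the actual HDFS-6207 patch, and the names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Illustrative sketch: take a defensive copy before iterating, so a
// concurrent add/remove on the underlying token collection cannot
// invalidate the iterator mid-scan.
class TokenSelectSketch {
    static String selectToken(String kind, Collection<String> tokens) {
        // Iterating 'tokens' directly would throw
        // ConcurrentModificationException under concurrent mutation.
        List<String> snapshot = new ArrayList<>(tokens);
        for (String t : snapshot) {
            if (t.startsWith(kind + ":")) {
                return t;
            }
        }
        return null;
    }
}
```

The copy trades a small allocation per lookup for safety; alternatives are synchronizing on the collection or storing tokens in a concurrent map.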
[jira] [Commented] (HDFS-6204) TestRBWBlockInvalidation may fail
[ https://issues.apache.org/jira/browse/HDFS-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963557#comment-13963557 ] Arpit Agarwal commented on HDFS-6204: - +1 for the patch. TestRBWBlockInvalidation may fail - Key: HDFS-6204 URL: https://issues.apache.org/jira/browse/HDFS-6204 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h6204_20140408.patch {code} java.lang.AssertionError: There should not be any replica in the corruptReplicasMap expected:0 but was:1 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN(TestRBWBlockInvalidation.java:137) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6206) DFSUtil.substituteForWildcardAddress may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963559#comment-13963559 ] Arpit Agarwal commented on HDFS-6206: - +1 for the patch. DFSUtil.substituteForWildcardAddress may throw NPE -- Key: HDFS-6206 URL: https://issues.apache.org/jira/browse/HDFS-6206 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6206_20140408.patch InetSocketAddress.getAddress() may return null if the address is unresolved. In such a case, DFSUtil.substituteForWildcardAddress may throw NPE. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6206) DFSUtil.substituteForWildcardAddress may throw NPE
Tsz Wo Nicholas Sze created HDFS-6206: - Summary: DFSUtil.substituteForWildcardAddress may throw NPE Key: HDFS-6206 URL: https://issues.apache.org/jira/browse/HDFS-6206 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze InetSocketAddress.getAddress() may return null if the address is unresolved. In such a case, DFSUtil.substituteForWildcardAddress may throw NPE. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6178) Decommission on standby NN couldn't finish
[ https://issues.apache.org/jira/browse/HDFS-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-6178: -- Attachment: HDFS-6178-2.patch Thanks, Jing. Updated patch per suggestion. Decommission on standby NN couldn't finish -- Key: HDFS-6178 URL: https://issues.apache.org/jira/browse/HDFS-6178 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Ming Ma Attachments: HDFS-6178-2.patch, HDFS-6178.patch Currently decommissioning machines in HA-enabled cluster requires running refreshNodes in both active and standby nodes. Sometimes decommissioning won't finish from standby NN's point of view. Here is the diagnosis of why it could happen. Standby NN's blockManager manages blocks replication and block invalidation as if it is the active NN; even though DNs will ignore block commands coming from standby NN. When standby NN makes block operation decisions such as the target of block replication and the node to remove excess blocks from, the decision is independent of active NN. So active NN and standby NN could have different states. When we try to decommission nodes on standby nodes; such state inconsistency might prevent standby NN from making progress. Here is an example. Machine A Machine B Machine C Machine D Machine E Machine F Machine G Machine H 1. For a given block, both active and standby have 5 replicas on machine A, B, C, D, E. So both active and standby decide to pick excess nodes to invalidate. Active picked D and E as excess DNs. After the next block reports from D and E, active NN has 3 active replicas (A, B, C), 0 excess replica. {noformat} 2014-03-27 01:50:14,410 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:50:15,539 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (D:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} Standby pick C, E as excess DNs. 
Given DNs ignore commands from standby, After the next block reports from C, D, E, standby has 2 active replicas (A, B), 1 excess replica (C). {noformat} 2014-03-27 01:51:49,543 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:51:49,894 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (C:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} 2. Machine A decomm request was sent to standby. Standby only had one live replica and picked machine G, H as targets, but given standby commands was ignored by DNs, G, H remained in pending replication queue until they are timed out. At this point, you have one decommissioning replica (A), 1 active replica (B), one excess replica (C). {noformat} 2014-03-27 04:42:52,258 INFO BlockStateChange: BLOCK* ask A:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) G:50010 H:50010 {noformat} 3. Machine A decomm request was sent to active NN. Active NN picked machine F as the target. It finished properly. So active NN had 3 active replicas (B, C, F), one decommissioned replica (A). {noformat} 2014-03-27 04:44:15,239 INFO BlockStateChange: BLOCK* ask 10.42.246.110:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) F:50010 2014-03-27 04:44:16,083 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: F:50010 is added to blk_-5207804474559026159_121186764 size 7100065 {noformat} 4. Standby NN picked up F as a new replica. Thus standby had one decommissioning replica (A), 2 active replicas (B, F), one excess replica (C). Standby NN kept trying to schedule replication work, but DNs ignored the commands. 
{noformat} 2014-03-27 04:44:16,084 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: F:50010 is added to blk_-5207804474559026159_121186764 size 7100065 2014-03-28 23:06:11,970 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Block: blk_-5207804474559026159_121186764, Expected Replicas: 3, live replicas: 2, corrupt replicas: 0, decommissioned replicas: 1, excess replicas: 1, Is Open File: false, Datanodes having this block: C:50010 B:50010 A:50010 F:50010 , Current Datanode: A:50010, Is current datanode decommissioning: true {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
Please unsubscribe me from the list..thanks
On Wed, Apr 9, 2014 at 8:55 AM, Ming Ma (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/HDFS-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel] Ming Ma updated HDFS-6178: -- Attachment: HDFS-6178-2.patch Thanks, Jing. Updated patch per suggestion. Decommission on standby NN couldn't finish -- Key: HDFS-6178 URL: https://issues.apache.org/jira/browse/HDFS-6178 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Ming Ma Attachments: HDFS-6178-2.patch, HDFS-6178.patch Currently decommissioning machines in HA-enabled cluster requires running refreshNodes in both active and standby nodes. Sometimes decommissioning won't finish from standby NN's point of view. Here is the diagnosis of why it could happen. Standby NN's blockManager manages blocks replication and block invalidation as if it is the active NN; even though DNs will ignore block commands coming from standby NN. When standby NN makes block operation decisions such as the target of block replication and the node to remove excess blocks from, the decision is independent of active NN. So active NN and standby NN could have different states. When we try to decommission nodes on standby nodes; such state inconsistency might prevent standby NN from making progress. Here is an example. Machine A Machine B Machine C Machine D Machine E Machine F Machine G Machine H 1. For a given block, both active and standby have 5 replicas on machine A, B, C, D, E. So both active and standby decide to pick excess nodes to invalidate. Active picked D and E as excess DNs. After the next block reports from D and E, active NN has 3 active replicas (A, B, C), 0 excess replica. 
{noformat} 2014-03-27 01:50:14,410 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:50:15,539 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (D:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} Standby pick C, E as excess DNs. Given DNs ignore commands from standby, After the next block reports from C, D, E, standby has 2 active replicas (A, B), 1 excess replica (C). {noformat} 2014-03-27 01:51:49,543 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:51:49,894 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (C:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} 2. Machine A decomm request was sent to standby. Standby only had one live replica and picked machine G, H as targets, but given standby commands was ignored by DNs, G, H remained in pending replication queue until they are timed out. At this point, you have one decommissioning replica (A), 1 active replica (B), one excess replica (C). {noformat} 2014-03-27 04:42:52,258 INFO BlockStateChange: BLOCK* ask A:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) G:50010 H:50010 {noformat} 3. Machine A decomm request was sent to active NN. Active NN picked machine F as the target. It finished properly. So active NN had 3 active replicas (B, C, F), one decommissioned replica (A). {noformat} 2014-03-27 04:44:15,239 INFO BlockStateChange: BLOCK* ask 10.42.246.110:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) F:50010 2014-03-27 04:44:16,083 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: F:50010 is added to blk_-5207804474559026159_121186764 size 7100065 {noformat} 4. Standby NN picked up F as a new replica. 
Thus the standby had one decommissioning replica (A), 2 active replicas (B, F), and one excess replica (C). The standby NN kept trying to schedule replication work, but DNs ignored the commands. {noformat} 2014-03-27 04:44:16,084 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: F:50010 is added to blk_-5207804474559026159_121186764 size 7100065 2014-03-28 23:06:11,970 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Block: blk_-5207804474559026159_121186764, Expected Replicas: 3, live replicas: 2, corrupt replicas: 0, decommissioned replicas: 1, excess replicas: 1, Is Open File: false, Datanodes having this block: C:50010 B:50010 A:50010 F:50010 , Current Datanode: A:50010, Is current datanode decommissioning: true {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963576#comment-13963576 ] Gera Shegalov commented on HDFS-6143: - [~daryn], I think calling {{getFileStatus}} in open to get FNF, while keeping the old lazy open, will have the same effect. {code}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
index bdf744a..bad1534 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
@@ -898,6 +898,7 @@ public boolean delete(Path f, boolean recursive) throws IOException {
   @Override
   public FSDataInputStream open(final Path f, final int buffersize
       ) throws IOException {
+    getFileStatus(f);
     statistics.incrementReadOps(1);
     final HttpOpParam.Op op = GetOpParam.Op.OPEN;
     final URL url = toUrl(op, f, new BufferSizeParam(buffersize));
{code} WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Fix For: 2.5.0 Attachments: HDFS-6143-branch-2.4.0.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v02.patch, HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch, HDFS-6143.v04.patch, HDFS-6143.v05.patch, HDFS-6143.v06.patch WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; the error is deferred until the next read. It's counterintuitive and not how the local FS or HDFS work.
In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of code that's broken because of this. - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
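The contract being asked for can be illustrated with a minimal plain-Java sketch (hypothetical code, not the actual WebHdfsFileSystem implementation): the existence check happens eagerly at open time, the way the proposed {{getFileStatus(f)}} call does, so callers get FileNotFoundException immediately rather than on the first read.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: an open() with an up-front existence check,
// mirroring the eager-FNF semantics proposed for WebHdfsFileSystem.open.
public class EagerOpenSketch {
  public static InputStream open(String path) throws IOException {
    File f = new File(path);
    if (!f.isFile()) {
      // POSIX-like: ENOENT surfaces at open time, not on the first read.
      throw new FileNotFoundException("File " + path + " does not exist.");
    }
    return new FileInputStream(f);
  }
}
```

Code like hadoop-lzo's {{LzoIndex.readIndex}} relies on exactly this contract: it opens the index file and treats FNF as "no index," instead of issuing a separate {{exists}} RPC first.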
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963595#comment-13963595 ] Gera Shegalov commented on HDFS-6143: - [~szetszwo], it's easy to understand the issue if you review the unit test modifications in the patch. As for our production use case, which works fine on LocalFileSystem and HDFS, please review splittable LZO in elephant-bird. Hadoop-lzo's {{LzoIndex.readIndex}} reads the index file and exploits the fact that the FS should [error out opening a non-existing file|https://github.com/twitter/hadoop-lzo/blob/master/src/main/java/com/hadoop/compression/lzo/LzoIndex.java#L176], instead of using an RPC to get the file status directly or via {{exists}}. WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Fix For: 2.5.0 Attachments: HDFS-6143-branch-2.4.0.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v02.patch, HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch, HDFS-6143.v04.patch, HDFS-6143.v05.patch, HDFS-6143.v06.patch WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; the error is deferred until the next read. It's counterintuitive and not how the local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of code that's broken because of this.
- On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0
[ https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963596#comment-13963596 ] Haohui Mai commented on HDFS-4114: -- Rebased Deprecate the BackupNode and CheckpointNode in 2.0 -- Key: HDFS-4114 URL: https://issues.apache.org/jira/browse/HDFS-4114 Project: Hadoop HDFS Issue Type: Bug Reporter: Eli Collins Assignee: Suresh Srinivas Attachments: HDFS-4114.000.patch, HDFS-4114.patch Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the BackupNode and CheckpointNode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0
[ https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-4114: - Attachment: HDFS-4114.000.patch Deprecate the BackupNode and CheckpointNode in 2.0 -- Key: HDFS-4114 URL: https://issues.apache.org/jira/browse/HDFS-4114 Project: Hadoop HDFS Issue Type: Bug Reporter: Eli Collins Assignee: Suresh Srinivas Attachments: HDFS-4114.000.patch, HDFS-4114.patch Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the BackupNode and CheckpointNode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6208) DataNode caching can leak file descriptors.
Chris Nauroth created HDFS-6208: --- Summary: DataNode caching can leak file descriptors. Key: HDFS-6208 URL: https://issues.apache.org/jira/browse/HDFS-6208 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth In the DataNode, management of mmap'd/mlock'd block files can leak file descriptors. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6202) Replace direct usage of ProxyUser Constants with function
[ https://issues.apache.org/jira/browse/HDFS-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963624#comment-13963624 ] Benoy Antony commented on HDFS-6202: [~sureshms] [~arpitagarwal] , could one of you please review this ? Replace direct usage of ProxyUser Constants with function - Key: HDFS-6202 URL: https://issues.apache.org/jira/browse/HDFS-6202 Project: Hadoop HDFS Issue Type: Improvement Components: security Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Attachments: HDFS-6202.patch This is a pre-requisite for HADOOP-10471 to reduce the visibility of constants of ProxyUsers. The _TestJspHelper_ directly uses the constants in _ProxyUsers_. This can be replaced with the corresponding functions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6208) DataNode caching can leak file descriptors.
[ https://issues.apache.org/jira/browse/HDFS-6208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963627#comment-13963627 ] Chris Nauroth commented on HDFS-6208: - There are 2 problems: # The block file stream and checksum file stream are only closed if there is some kind of I/O error. They do not get closed if everything succeeds. mmap/mlock does not rely on keeping the source file descriptor open, so we can change our code to close these. # In test runs, all cached blocks remaining after a test finishes are kept mmap'd into the process. We can handle this by explicitly uncaching at shutdown time. I spotted these issues during runs of {{TestCacheDirectives}} on Windows, where leaked file descriptors and memory-mapped regions cause subsequent tests to fail, because locks are still held on the underlying block files in the test data directory. I have a patch in progress to fix this. DataNode caching can leak file descriptors. --- Key: HDFS-6208 URL: https://issues.apache.org/jira/browse/HDFS-6208 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth In the DataNode, management of mmap'd/mlock'd block files can leak file descriptors. -- This message was sent by Atlassian JIRA (v6.2#6252)
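The first problem above can be sketched in a few lines (hypothetical code, not the actual DataNode patch): a memory mapping created from a FileChannel remains valid after the channel is closed, so the descriptor can be released unconditionally with try-with-resources instead of only on the error path.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

// Hypothetical sketch: map a block file and close the channel immediately.
// The mapping outlives the channel, so keeping the fd open only leaks it.
public class MmapFdSketch {
  public static MappedByteBuffer mapAndClose(Path blockFile) throws IOException {
    try (FileChannel ch = FileChannel.open(blockFile)) {
      return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    } // fd released here whether or not map() succeeded
  }
}
```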
[jira] [Updated] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6160: Attachment: HDFS-6160.01.patch Fix a couple of race conditions with the test case: # The {{waitFor}} predicate was just checking that _safeModeBlocks > 0_, which can happen before block reports from all storages attached to the DN are processed. If the subsequent block reports come after this check and before we sample {{getSafeModeSafeBlocks}}, then {{safe}} will be wrong. Fixed it by adding a metric for storageBlockReportOps and checking it. # Put the check for {{getUnderReplicatedBlocks}} in a loop since the processing is async and may be delayed. TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ : {code} java.lang.AssertionError: expected:13 but was:0 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
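The second fix amounts to polling an asynchronously-updated value until it converges, instead of asserting it once. A generic sketch of that pattern (hypothetical helper; Hadoop's test code has {{GenericTestUtils.waitFor}} for this purpose):

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: retry a condition until it holds or a timeout expires,
// rather than sampling it once while the NN is still processing block reports.
public class WaitForSketch {
  public static boolean waitFor(BooleanSupplier check, long intervalMs,
      long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (check.getAsBoolean()) {
        return true;
      }
      Thread.sleep(intervalMs);
    }
    return check.getAsBoolean(); // one last sample at the deadline
  }
}
```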
[jira] [Assigned] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reassigned HDFS-6160: --- Assignee: Arpit Agarwal TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ : {code} java.lang.AssertionError: expected:13 but was:0 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6160: Status: Patch Available (was: Open) TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ : {code} java.lang.AssertionError: expected:13 but was:0 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6160: Issue Type: Bug (was: Test) TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ : {code} java.lang.AssertionError: expected:13 but was:0 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6160: Affects Version/s: 2.4.0 TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ : {code} java.lang.AssertionError: expected:13 but was:0 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6160: Component/s: test TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ : {code} java.lang.AssertionError: expected:13 but was:0 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6160: Priority: Major (was: Minor) TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Arpit Agarwal Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ : {code} java.lang.AssertionError: expected:13 but was:0 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6209) Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK
Arpit Agarwal created HDFS-6209: --- Summary: Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK Key: HDFS-6209 URL: https://issues.apache.org/jira/browse/HDFS-6209 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal The test depends on hard-coded port numbers being available. It should retry if the chosen port is in use. Exception details below in a comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6209) Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK
[ https://issues.apache.org/jira/browse/HDFS-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963674#comment-13963674 ] Arpit Agarwal commented on HDFS-6209: - Exception details: {code} java.net.BindException: Port in use: localhost:9000 at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:344) at sun.nio.ch.Net.bind(Net.java:336) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:199) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:854) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:795) at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:132) at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:613) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:509) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:670) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:655) at org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK(TestValidateConfigurationSettings.java:86) {code} Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK -- Key: HDFS-6209 URL: https://issues.apache.org/jira/browse/HDFS-6209 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal The test depends on hard-coded port numbers being available. It should retry if the chosen port is in use. Exception details below in a comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6169) Move the address in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963687#comment-13963687 ] Akira AJISAKA commented on HDFS-6169: - Thanks Haohui for the review and the commit! Move the address in WebImageViewer -- Key: HDFS-6169 URL: https://issues.apache.org/jira/browse/HDFS-6169 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: 2.5.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Fix For: 2.5.0 Attachments: HDFS-6169.2.patch, HDFS-6169.3.patch, HDFS-6169.4.patch, HDFS-6169.5.patch, HDFS-6169.6.patch, HDFS-6169.7.patch, HDFS-6169.patch Move the endpoint of WebImageViewer from http://hostname:port/ to http://hostname:port/webhdfs/v1/ to support {{hdfs dfs -ls}} to WebImageViewer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6204) TestRBWBlockInvalidation may fail
[ https://issues.apache.org/jira/browse/HDFS-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963692#comment-13963692 ] Hadoop QA commented on HDFS-6204: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639279/h6204_20140408.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6621//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6621//console This message is automatically generated. 
TestRBWBlockInvalidation may fail - Key: HDFS-6204 URL: https://issues.apache.org/jira/browse/HDFS-6204 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h6204_20140408.patch {code} java.lang.AssertionError: There should not be any replica in the corruptReplicasMap expected:0 but was:1 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation.testBlockInvalidationWhenRBWReplicaMissedInDN(TestRBWBlockInvalidation.java:137) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6209) Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK
[ https://issues.apache.org/jira/browse/HDFS-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6209: Status: Patch Available (was: Open) Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK -- Key: HDFS-6209 URL: https://issues.apache.org/jira/browse/HDFS-6209 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6209.01.patch The test depends on hard-coded port numbers being available. It should retry if the chosen port is in use. Exception details below in a comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6209) Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK
[ https://issues.apache.org/jira/browse/HDFS-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6209: Attachment: HDFS-6209.01.patch Patch with the following fixes: # Randomize port selection and add a retry in case the initial choice is in use. # Each test case releases acquired ports. Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK -- Key: HDFS-6209 URL: https://issues.apache.org/jira/browse/HDFS-6209 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6209.01.patch The test depends on hard-coded port numbers being available. It should retry if the chosen port is in use. Exception details below in a comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
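The port-selection half of the fix can be sketched as follows (hypothetical helper, not the attached patch): bind to port 0 so the OS hands back a free ephemeral port, and let the caller retry on BindException if the port is taken between release and reuse.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical sketch: ask the OS for an unused port instead of hard-coding
// one such as 9000, which may already be bound by another process.
public class FreePortSketch {
  public static int findFreePort() throws IOException {
    try (ServerSocket ss = new ServerSocket(0)) {
      // The port is released when ss closes; a later bind may still race,
      // which is why the test also retries.
      return ss.getLocalPort();
    }
  }
}
```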
[jira] [Commented] (HDFS-6056) Clean up NFS config settings
[ https://issues.apache.org/jira/browse/HDFS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963701#comment-13963701 ] Jing Zhao commented on HDFS-6056: - Since nfs is in another project, do we still want to move all the configuration keys to hadoop-common and hdfs? Maybe in this jira we can first make the conf names more consistent and unified, and consider the moving part in a separate jira? Clean up NFS config settings Key: HDFS-6056 URL: https://issues.apache.org/jira/browse/HDFS-6056 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.3.0 Reporter: Aaron T. Myers Assignee: Brandon Li Attachments: HDFS-6056.001.patch, HDFS-6056.002.patch As discussed on HDFS-6050, there's a few opportunities to improve the config settings related to NFS. This JIRA is to implement those changes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6206) DFSUtil.substituteForWildcardAddress may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963706#comment-13963706 ] Hadoop QA commented on HDFS-6206: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639282/h6206_20140408.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6622//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6622//console This message is automatically generated. DFSUtil.substituteForWildcardAddress may throw NPE -- Key: HDFS-6206 URL: https://issues.apache.org/jira/browse/HDFS-6206 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6206_20140408.patch InetSocketAddress.getAddress() may return null if the address null is unresolved. In such case, DFSUtil.substituteForWildcardAddress may throw NPE. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6178) Decommission on standby NN couldn't finish
[ https://issues.apache.org/jira/browse/HDFS-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963717#comment-13963717 ] Hadoop QA commented on HDFS-6178: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639289/HDFS-6178-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6623//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6623//console This message is automatically generated. Decommission on standby NN couldn't finish -- Key: HDFS-6178 URL: https://issues.apache.org/jira/browse/HDFS-6178 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Ming Ma Attachments: HDFS-6178-2.patch, HDFS-6178.patch Currently decommissioning machines in HA-enabled cluster requires running refreshNodes in both active and standby nodes. Sometimes decommissioning won't finish from standby NN's point of view. Here is the diagnosis of why it could happen. 
The standby NN's blockManager manages block replication and block invalidation as if it were the active NN, even though DNs ignore block commands coming from a standby NN. When the standby NN makes block operation decisions, such as the target of block replication or the node to remove excess blocks from, the decision is independent of the active NN. So the active NN and standby NN can end up with different states. When we try to decommission nodes via the standby NN, such state inconsistency might prevent the standby NN from making progress. Here is an example with machines A through H. 1. For a given block, both active and standby have 5 replicas, on machines A, B, C, D, E. So both active and standby decide to pick excess nodes to invalidate. The active picked D and E as excess DNs. After the next block reports from D and E, the active NN has 3 active replicas (A, B, C) and 0 excess replicas. {noformat} 2014-03-27 01:50:14,410 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:50:15,539 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (D:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} The standby picked C and E as excess DNs. Given that DNs ignore commands from the standby, after the next block reports from C, D, and E, the standby has 2 active replicas (A, B) and 1 excess replica (C). {noformat} 2014-03-27 01:51:49,543 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (E:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set 2014-03-27 01:51:49,894 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (C:50010, blk_-5207804474559026159_121186764) is added to invalidated blocks set {noformat} 2. Machine A's decommission request was sent to the standby.
The standby only had one live replica left and picked machines G and H as replication targets, but since the standby's commands were ignored by DNs, G and H remained in the pending replication queue until they timed out. At this point, the standby had one decommissioning replica (A), one active replica (B), and one excess replica (C).
{noformat}
2014-03-27 04:42:52,258 INFO BlockStateChange: BLOCK* ask A:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) G:50010 H:50010
{noformat}
3. Machine A's decomm request was sent to the active NN. The active NN picked machine F as the target, and the replication finished properly, so the active NN had 3 active replicas (B, C, F) and one decommissioned replica (A).
{noformat}
2014-03-27 04:44:15,239 INFO BlockStateChange: BLOCK* ask 10.42.246.110:50010 to replicate blk_-5207804474559026159_121186764 to datanode(s) F:50010
2014-03-27 04:44:16,083 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: F:50010 is added to
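The divergence described above can be illustrated with a toy sketch (plain Python, not HDFS code; `choose_excess` is a hypothetical stand-in for the BlockManager's excess-replica selection): both NNs run the same selection logic, but with independent internal state, and since DNs honor only the active NN's invalidation commands, the standby's view of which replicas are live can drift away from the real cluster state.

```python
import random

def choose_excess(replicas, replication, seed):
    """Toy stand-in for excess-replica selection: from the current
    replica set, pick (count - replication) nodes to invalidate."""
    rng = random.Random(seed)
    return set(rng.sample(sorted(replicas), len(replicas) - replication))

replicas = {"A", "B", "C", "D", "E"}  # 5 replicas, target replication 3

# Active and standby run the same logic but with independent state
# (modeled here as different seeds), so their picks can differ.
active_excess = choose_excess(replicas, 3, seed=1)
standby_excess = choose_excess(replicas, 3, seed=2)

# DNs only honor the active NN's invalidation commands, so the real
# cluster loses exactly the active NN's picks.
live_on_cluster = replicas - active_excess

# The standby, however, accounts its own picks as excess; its notion
# of "active replicas" can be smaller than what actually survives.
standby_live = live_on_cluster - standby_excess
print(sorted(live_on_cluster), sorted(standby_live))
```

This is only a schematic of why independent decisions plus one-sided command delivery produce inconsistent replica accounting, which is what then stalls decommissioning on the standby.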
[jira] [Commented] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0
[ https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963721#comment-13963721 ] Hadoop QA commented on HDFS-4114: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639295/HDFS-4114.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6624//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6624//console This message is automatically generated. Deprecate the BackupNode and CheckpointNode in 2.0 -- Key: HDFS-4114 URL: https://issues.apache.org/jira/browse/HDFS-4114 Project: Hadoop HDFS Issue Type: Bug Reporter: Eli Collins Assignee: Suresh Srinivas Attachments: HDFS-4114.000.patch, HDFS-4114.patch Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the BackupNode and CheckpointNode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6170) Support GETFILESTATUS operation in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6170: Attachment: HDFS-6170.patch Attaching a patch. Support GETFILESTATUS operation in WebImageViewer - Key: HDFS-6170 URL: https://issues.apache.org/jira/browse/HDFS-6170 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: 2.5.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-6170.patch WebImageViewer was created in HDFS-5978 but currently supports only the {{LISTSTATUS}} operation. The {{GETFILESTATUS}} operation is required for users to run hdfs dfs -ls webhdfs://foo against WebImageViewer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6170) Support GETFILESTATUS operation in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6170: Target Version/s: 2.5.0 Status: Patch Available (was: Open) Support GETFILESTATUS operation in WebImageViewer - Key: HDFS-6170 URL: https://issues.apache.org/jira/browse/HDFS-6170 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: 2.5.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-6170.patch WebImageViewer was created in HDFS-5978 but currently supports only the {{LISTSTATUS}} operation. The {{GETFILESTATUS}} operation is required for users to run hdfs dfs -ls webhdfs://foo against WebImageViewer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6170) Support GETFILESTATUS operation in WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963778#comment-13963778 ] Akira AJISAKA commented on HDFS-6170: - I tried to execute hdfs dfs -ls webhdfs://foo to get the listing with the patch, but it failed because {{Ls}} actually calls the {{GETACLSTATUS}} operation.
{code}
[root@trunk ~]# hdfs dfs -ls webhdfs://localhost:5978/
14/04/09 11:53:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
ls: Unexpected HTTP response: code=400 != 200, op=GETACLSTATUS, message=Bad Request
{code}
I'll create a separate jira to handle the {{GETACLSTATUS}} operation. Support GETFILESTATUS operation in WebImageViewer - Key: HDFS-6170 URL: https://issues.apache.org/jira/browse/HDFS-6170 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: 2.5.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-6170.patch WebImageViewer was created in HDFS-5978 but currently supports only the {{LISTSTATUS}} operation. The {{GETFILESTATUS}} operation is required for users to run hdfs dfs -ls webhdfs://foo against WebImageViewer. -- This message was sent by Atlassian JIRA (v6.2#6252)
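For context, WebHDFS exposes file status as {{GET /webhdfs/v1/<path>?op=GETFILESTATUS}}, returning a {{FileStatus}} JSON object, which is the request shape WebImageViewer would need to answer. A minimal client-side sketch (the helper names and the abridged sample response are illustrative, not taken from the patch):

```python
import json
from urllib.parse import quote

def file_status_url(host, port, path):
    """Build a WebHDFS GETFILESTATUS URL for a path such as "/" or "/user"."""
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?op=GETFILESTATUS"

def parse_file_status(body):
    """Extract the FileStatus object from a GETFILESTATUS response body."""
    return json.loads(body)["FileStatus"]

# A WebImageViewer listening on port 5978 (as in the comment above)
# would be queried like this:
print(file_status_url("localhost", 5978, "/"))

# Abridged response body in the shape documented for WebHDFS:
sample = '{"FileStatus":{"type":"DIRECTORY","length":0,"permission":"755"}}'
print(parse_file_status(sample)["type"])
```

The same URL scheme is why the endpoint move in HDFS-6169 matters: {{hdfs dfs -ls webhdfs://...}} issues its requests under the {{/webhdfs/v1}} prefix.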
[jira] [Created] (HDFS-6210) Support GETACLSTATUS operation in WebImageViewer
Akira AJISAKA created HDFS-6210: --- Summary: Support GETACLSTATUS operation in WebImageViewer Key: HDFS-6210 URL: https://issues.apache.org/jira/browse/HDFS-6210 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: 2.5.0 Reporter: Akira AJISAKA In HDFS-6170, I found that {{GETACLSTATUS}} operation support is also required to execute hdfs dfs -ls against WebImageViewer.
{code}
[root@trunk ~]# hdfs dfs -ls webhdfs://localhost:5978/
14/04/09 11:53:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
ls: Unexpected HTTP response: code=400 != 200, op=GETACLSTATUS, message=Bad Request
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6209) Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK
[ https://issues.apache.org/jira/browse/HDFS-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963798#comment-13963798 ] Hadoop QA commented on HDFS-6209: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639319/HDFS-6209.01.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestSafeMode {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6626//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6626//console This message is automatically generated. Fix flaky test TestValidateConfigurationSettings.testThatDifferentRPCandHttpPortsAreOK -- Key: HDFS-6209 URL: https://issues.apache.org/jira/browse/HDFS-6209 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6209.01.patch The test depends on hard-coded port numbers being available. It should retry if the chosen port is in use. Exception details below in a comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
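The fix direction the issue describes, not depending on a hard-coded port being available, is commonly implemented by letting the OS pick a free ephemeral port via a bind to port 0. A sketch of that idea (in Python for brevity; the actual Java test would do the equivalent with a {{ServerSocket}}, and this is not the code from the attached patch):

```python
import socket

def find_free_port():
    """Bind to port 0 so the OS assigns a currently-free ephemeral port,
    then return that port number instead of a hard-coded one."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

# A test would then configure the NameNode's RPC/HTTP addresses with
# ports obtained this way, avoiding "address already in use" flakiness.
print(find_free_port())
```

Note there is still a small race window between releasing the probe socket and the server binding the port, which is why the issue also mentions retrying if the chosen port turns out to be in use.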
[jira] [Commented] (HDFS-6160) TestSafeMode occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963799#comment-13963799 ] Hadoop QA commented on HDFS-6160: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639311/HDFS-6160.01.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6625//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6625//console This message is automatically generated. 
TestSafeMode occasionally fails --- Key: HDFS-6160 URL: https://issues.apache.org/jira/browse/HDFS-6160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Ted Yu Assignee: Arpit Agarwal Attachments: HDFS-6160.01.patch From https://builds.apache.org/job/PreCommit-HDFS-Build/6511//testReport/org.apache.hadoop.hdfs/TestSafeMode/testInitializeReplQueuesEarly/ :
{code}
java.lang.AssertionError: expected:13 but was:0
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.failNotEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:128)
	at org.junit.Assert.assertEquals(Assert.java:472)
	at org.junit.Assert.assertEquals(Assert.java:456)
	at org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:212)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)