[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297130#comment-14297130 ] Xiaoyu Yao commented on HDFS-7584: -- org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer needs the new binary editsStored in V8; I will upload a separate editsStored file. org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream needs to bump up the number of editOps from 49 to 50. I will post a new patch that fixes the core tests and the javac/findbugs warnings shortly. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.patch, editsStored Phase II of the Heterogeneous Storage features was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
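For illustration, a minimal sketch of how the proposed feature would be used from the Java API, based on the design summary above (the setQuotaByStorageType method and StorageType value are taken from this patch series, so treat the exact signature as an assumption until the patch lands):

{code}
// Limit SSD usage under /hot to 10 GB. Other storage types remain
// governed by the traditional directory space quota.
Configuration conf = new Configuration();
DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
dfs.setQuotaByStorageType(new Path("/hot"), StorageType.SSD,
    10L * 1024 * 1024 * 1024);
{code}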
[jira] [Updated] (HDFS-6884) Include the hostname in HTTPFS log filenames
[ https://issues.apache.org/jira/browse/HDFS-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-6884: --- Status: Open (was: Patch Available) Include the hostname in HTTPFS log filenames Key: HDFS-6884 URL: https://issues.apache.org/jira/browse/HDFS-6884 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Alejandro Abdelnur It'd be good to include the hostname in the httpfs log filenames. Right now we have httpfs.log and httpfs-audit.log; it'd be nice to have e.g. httpfs-${hostname}.log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
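A minimal sketch of one way to do this, assuming the log4j appender path is parameterized with a system property named httpfs.hostname (the property and appender names here are illustrative, not the committed approach):

{code}
// Resolve the local hostname before log4j initializes, so that a config
// line such as
//   log4j.appender.httpfs.File=${httpfs.log.dir}/httpfs-${httpfs.hostname}.log
// expands to e.g. httpfs-myhost.log.
System.setProperty("httpfs.hostname",
    java.net.InetAddress.getLocalHost().getHostName());
{code}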
[jira] [Commented] (HDFS-7314) Aborted DFSClient's impact on long running service like YARN
[ https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297133#comment-14297133 ] Ming Ma commented on HDFS-7314: --- Thanks, [~jira.shegalov]. That is interesting. That might work when applications request a new FileSystem object. However, there is the scenario where applications still hold a reference to the aborted FileSystem object and want to use it to create files; those applications would then need to be modified to catch the exception and recreate the FileSystem object? At the beginning of the jira, one of the 3 solutions proposed is to keep DistributedFileSystem alive and recreate DFSClient. Regardless of the approach, it would be good to keep it transparent to applications. Aborted DFSClient's impact on long running service like YARN Key: HDFS-7314 URL: https://issues.apache.org/jira/browse/HDFS-7314 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, HDFS-7314-5.patch, HDFS-7314-6.patch, HDFS-7314-7.patch, HDFS-7314.patch It happened in the YARN nodemanager scenario. But it could happen to any long-running service that uses a cached instance of DistributedFileSystem. 1. Active NN is under heavy load, so it became unavailable for 10 minutes; any DFSClient request will get ConnectTimeoutException. 2. YARN nodemanager uses DFSClient for certain write operations such as the log aggregator or the shared cache in YARN-1492. The renewLease RPC of the DFSClient used by the YARN NM got ConnectTimeoutException. {noformat} 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds. Aborting ... {noformat} 3. After DFSClient is in the Aborted state, YARN NM can't use that cached instance of DistributedFileSystem. {noformat} 2014-10-29 20:26:23,991 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc... java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780) at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124) at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} We can make YARN or DFSClient more tolerant to temporary NN unavailability. Given the call stack is YARN - DistributedFileSystem - DFSClient, this can be addressed at different layers. * YARN closes the DistributedFileSystem object when it receives some well-defined exception. Then the next HDFS call will create a new instance of DistributedFileSystem.
We have to fix all the places in YARN, plus other HDFS applications need to address this as well. * DistributedFileSystem detects the aborted DFSClient and creates a new instance of DFSClient. We will need to fix all the places DistributedFileSystem calls DFSClient. * After DFSClient gets into the Aborted state, it doesn't have to reject all requests; instead it can retry. If the NN is available again it can transition back to the healthy state. Comments? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
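For the first option above, a minimal application-side sketch (fs, path, and conf come from the surrounding application; matching on the exception message string is fragile and shown only to illustrate the idea):

{code}
FileStatus stat;
try {
  stat = fs.getFileStatus(path);
} catch (IOException e) {
  if ("Filesystem closed".equals(e.getMessage())) {
    // FileSystem.newInstance bypasses the FileSystem cache, so the
    // aborted instance is not handed back again.
    fs = FileSystem.newInstance(path.toUri(), conf);
    stat = fs.getFileStatus(path);
  } else {
    throw e;
  }
}
{code}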
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7584: - Attachment: editsStored Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, editsStored, editsStored Phase II of the Heterogeneous Storage features was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7584: - Attachment: HDFS-7584.9.1.patch Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, editsStored, editsStored Phase II of the Heterogeneous Storage features was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297157#comment-14297157 ] Hadoop QA commented on HDFS-7584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695300/editsStored against trunk revision fe2188a. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9372//console This message is automatically generated. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, editsStored, editsStored Phase II of the Heterogeneous Storage features was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7584: - Attachment: HDFS-7584.9a.patch Reordered the patch name so that Jenkins picks up the latest. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, editsStored, editsStored Phase II of the Heterogeneous Storage features was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck
[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297190#comment-14297190 ] Subbu commented on HDFS-7175: - The problem that we face is that if we turn on showprogress, then the fsck command takes much longer (about 50% longer), not to mention the gazillion dots printed out. If we disable the dots, the timeout problem happens. We did some quick performance analysis on what is causing the 50% extra time, and it turns out that it is actually the printing of dots to the tty. From my earlier experiment with tcpdump, it seems that we need to send something on the channel to keep it alive. So, here is a proposed solution: * Change the server to disregard the showprogress option, and send out dots every N (=10) seconds no matter what. * Change the client to filter out any line that has only dots in it, if the showprogress option is not specified. * Maybe take N as an additional option (e.g. progressFrequencySec), make it configurable in hdfs-site.xml, or leave it at 10 (for now at least). If this sounds fine, I can work on a patch to do this. I am also fine if Akira wants to work on the patch, or has alternative solutions. Client-side SocketTimeoutException during Fsck -- Key: HDFS-7175 URL: https://issues.apache.org/jira/browse/HDFS-7175 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Carl Steinbach Assignee: Akira AJISAKA Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch HDFS-2538 disabled status reporting for the fsck command (it can optionally be enabled with the -showprogress option). We have observed that without status reporting the client will abort with a read timeout: {noformat} [hdfs@lva1-hcl0030 ~]$ hdfs fsck / Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070 14/09/30 06:03:41 WARN security.UserGroupInformation: PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) cause:java.net.SocketTimeoutException: Read timed out Exception in thread "main" java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346) {noformat} Since there's nothing for the client to read, it will abort if the time required to complete the fsck operation is longer than the client's read timeout setting. I can think of a couple ways to fix this: # Set an infinite read timeout on the client side (not a good idea!). # Have the server-side write (and flush) zeros to the wire and instruct the client to ignore these characters instead of echoing them. # It's possible that flushing an empty buffer on the server-side will trigger an HTTP response with a zero-length payload. This may be enough to keep the client from hanging up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
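A minimal sketch of the keep-alive idea from the proposals above, assuming out is the PrintWriter of the fsck HTTP response (progressFrequencySec is the hypothetical option name mentioned in the comment; the committed fix may differ):

{code}
final long progressFrequencySec = 10;
ScheduledExecutorService keepAlive =
    Executors.newSingleThreadScheduledExecutor();
// Server side: emit one dot every N seconds regardless of -showprogress,
// so the HTTP connection never goes idle long enough to time out.
keepAlive.scheduleAtFixedRate(new Runnable() {
  @Override
  public void run() {
    out.print('.');
    out.flush();
  }
}, progressFrequencySec, progressFrequencySec, TimeUnit.SECONDS);
// Client side: when -showprogress is not set, suppress lines consisting
// solely of dots before echoing the server output.
{code}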
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296627#comment-14296627 ] Hadoop QA commented on HDFS-7584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695195/HDFS-7584.9.patch against trunk revision 7882bc0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1193 javac compiler warnings (more than the trunk's current 1191 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9370//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9370//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9370//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9370//console This message is automatically generated. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.patch, editsStored Phase II of the Heterogeneous Storage features was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7702) Move metadata across namenode - Effort to a real distributed namenode
ray zhang created HDFS-7702: --- Summary: Move metadata across namenode - Effort to a real distributed namenode Key: HDFS-7702 URL: https://issues.apache.org/jira/browse/HDFS-7702 Project: Hadoop HDFS Issue Type: New Feature Reporter: ray zhang Assignee: ray zhang Implement a tool that can show the in-memory namespace tree structure with weights (sizes), and an API that can move metadata across different namenodes. The purpose is to move metadata efficiently and faster, without moving blocks on the datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
[ https://issues.apache.org/jira/browse/HDFS-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296719#comment-14296719 ] Hudson commented on HDFS-7681: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #88 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/88/]) HDFS-7681. Change ReplicaInputStreams constructor to take InputStream(s) instead of FileDescriptor(s). Contributed by Joe Pallas (szetszwo: rev 5a0051f4da6e102846d795a7965a6a18216d74f7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaInputStreams.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix ReplicaInputStream constructor to take InputStreams --- Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas Fix For: 3.0.0 Attachments: HDFS-7681.patch As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
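A simplified sketch of the type change being discussed (the real class also manages checksum streams and volume state; this shows only the constructor signature change):

{code}
class ReplicaInputStreams implements java.io.Closeable {
  private final java.io.InputStream dataIn;
  private final java.io.InputStream checksumIn;

  // Before HDFS-7681 this constructor took FileDescriptors that were
  // immediately re-wrapped in FileInputStreams; now callers such as
  // FsDatasetImpl pass the streams they already hold.
  ReplicaInputStreams(java.io.InputStream dataIn, java.io.InputStream checksumIn) {
    this.dataIn = dataIn;
    this.checksumIn = checksumIn;
  }

  @Override
  public void close() throws java.io.IOException {
    dataIn.close();
    checksumIn.close();
  }
}
{code}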
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296723#comment-14296723 ] Hudson commented on HDFS-7423: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #88 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/88/]) HDFS-7423. various typos and message formatting fixes in nfs daemon and doc. (Charles Lamb via yliu) (yliu: rev f37849188b05a6251584de1aed5e66d5dfa7da4f) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296722#comment-14296722 ] Hudson commented on HDFS-7611: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #88 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/88/]) HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. Contributed by Jing Zhao and Byron Wong. (jing9: rev d244574d03903b0514b0308da85d2f06c2384524) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirDeleteOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Jing Zhao Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safeMode, and could cause a memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6673) Add delimited format support to PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296718#comment-14296718 ] Hudson commented on HDFS-6673: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #88 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/88/]) HDFS-6673. Add delimited format support to PB OIV tool. Contributed by Eddy Xu. (wang: rev caf7298e49f646a40023af999f62d61846fde5e2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageDelimitedTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/pom.xml Add delimited format support to PB OIV tool --- Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
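For reference, a typical invocation of the new processor (the fsimage file name is illustrative; see the oiv usage text for the full option list):

{noformat}
hdfs oiv -p Delimited -delimiter "," -i fsimage_0000000000000000042 -o fsimage.csv
{noformat}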
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296733#comment-14296733 ] Hudson commented on HDFS-7611: -- FAILURE: Integrated in Hadoop-Yarn-trunk #822 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/822/]) HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. Contributed by Jing Zhao and Byron Wong. (jing9: rev d244574d03903b0514b0308da85d2f06c2384524) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirDeleteOp.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Jing Zhao Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safeMode, and could cause a memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6673) Add delimited format support to PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296729#comment-14296729 ] Hudson commented on HDFS-6673: -- FAILURE: Integrated in Hadoop-Yarn-trunk #822 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/822/]) HDFS-6673. Add delimited format support to PB OIV tool. Contributed by Eddy Xu. (wang: rev caf7298e49f646a40023af999f62d61846fde5e2) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageDelimitedTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/pom.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java Add delimited format support to PB OIV tool --- Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296734#comment-14296734 ] Hudson commented on HDFS-7423: -- FAILURE: Integrated in Hadoop-Yarn-trunk #822 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/822/]) HDFS-7423. various typos and message formatting fixes in nfs daemon and doc. (Charles Lamb via yliu) (yliu: rev f37849188b05a6251584de1aed5e66d5dfa7da4f) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
[ https://issues.apache.org/jira/browse/HDFS-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296730#comment-14296730 ] Hudson commented on HDFS-7681: -- FAILURE: Integrated in Hadoop-Yarn-trunk #822 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/822/]) HDFS-7681. Change ReplicaInputStreams constructor to take InputStream(s) instead of FileDescriptor(s). Contributed by Joe Pallas (szetszwo: rev 5a0051f4da6e102846d795a7965a6a18216d74f7) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaInputStreams.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java Fix ReplicaInputStream constructor to take InputStreams --- Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas Fix For: 3.0.0 Attachments: HDFS-7681.patch As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296801#comment-14296801 ] Charles Lamb commented on HDFS-7423: bq. Is it correct? Yes, it's ok because statistics is only declared and never used (except there). Hence, it's always null. Probably a better change would have been to just eliminate statistics completely from the file. Thanks for the commit [~hitliuyi]. various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6673) Add delimited format support to PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297001#comment-14297001 ] Hudson commented on HDFS-6673: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/]) HDFS-6673. Add delimited format support to PB OIV tool. Contributed by Eddy Xu. (wang: rev caf7298e49f646a40023af999f62d61846fde5e2) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageDelimitedTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/pom.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageTextWriter.java Add delimited format support to PB OIV tool --- Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297005#comment-14297005 ] Hudson commented on HDFS-7423: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/]) HDFS-7423. various typos and message formatting fixes in nfs daemon and doc. (Charles Lamb via yliu) (yliu: rev f37849188b05a6251584de1aed5e66d5dfa7da4f) * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
[ https://issues.apache.org/jira/browse/HDFS-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297002#comment-14297002 ] Hudson commented on HDFS-7681: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/]) HDFS-7681. Change ReplicaInputStreams constructor to take InputStream(s) instead of FileDescriptor(s). Contributed by Joe Pallas (szetszwo: rev 5a0051f4da6e102846d795a7965a6a18216d74f7) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaInputStreams.java Fix ReplicaInputStream constructor to take InputStreams --- Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas Fix For: 3.0.0 Attachments: HDFS-7681.patch As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297004#comment-14297004 ] Hudson commented on HDFS-7611: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2039/]) HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. Contributed by Jing Zhao and Byron Wong. (jing9: rev d244574d03903b0514b0308da85d2f06c2384524) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirDeleteOp.java deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Jing Zhao Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safeMode, and could cause a memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
Rushabh S Shah created HDFS-7704: Summary: DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah There are a couple of synchronous calls in BPOfferService (i.e., reportBadBlocks and trySendErrorReport) which wait for both of the actor threads to process them. These calls are made with the write lock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous; since these reports don't have any specific deadlines, an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
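A hypothetical sketch of the requested improvement, queueing the report on an executor instead of making the RPC synchronously under the write lock (class and method names are illustrative, not the committed fix):

{code}
private final ExecutorService reportExecutor =
    Executors.newSingleThreadExecutor();

void reportBadBlocksAsync(final ExtendedBlock block) {
  reportExecutor.submit(new Runnable() {
    @Override
    public void run() {
      // Runs outside the BPOfferService write lock, so a slow or
      // unreachable standby NN can no longer block heartbeat processing.
      reportBadBlocksToBothActors(block); // illustrative helper
    }
  });
}
{code}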
[jira] [Commented] (HDFS-6673) Add delimited format support to PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296804#comment-14296804 ] Hudson commented on HDFS-6673: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #89 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/89/]) HDFS-6673. Add delimited format support to PB OIV tool. Contributed by Eddy Xu. (wang: rev caf7298e49f646a40023af999f62d61846fde5e2) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageDelimitedTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/pom.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java Add delimited format support to PB OIV tool --- Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296807#comment-14296807 ] Hudson commented on HDFS-7611: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #89 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/89/]) HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. Contributed by Jing Zhao and Byron Wong. (jing9: rev d244574d03903b0514b0308da85d2f06c2384524) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirDeleteOp.java deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Jing Zhao Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safeMode, and could cause a memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7703) Support favouredNodes for the append for new blocks
Vinayakumar B created HDFS-7703: --- Summary: Support favouredNodes for the append for new blocks Key: HDFS-7703 URL: https://issues.apache.org/jira/browse/HDFS-7703 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Currently favoredNodes are supported for new file creation, and these nodes are applicable to all blocks of the file. The same support should be available when a file is opened for append. Note that even though the original file may not have used favored nodes, the favoredNodes passed to append will be used only for the new blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
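For context, a sketch of the existing create-time favoredNodes overload that this issue proposes to mirror for append (dfs is a DistributedFileSystem; check the source for the exact signature):

{code}
long blockSize = 128L * 1024 * 1024;
InetSocketAddress[] favoredNodes = {
    new InetSocketAddress("dn1.example.com", 50010),
    new InetSocketAddress("dn2.example.com", 50010) };
FSDataOutputStream out = dfs.create(new Path("/data/f"),
    FsPermission.getFileDefault(), true /* overwrite */,
    4096, (short) 3, blockSize, null /* progress */, favoredNodes);
{code}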
[jira] [Updated] (HDFS-7703) Support favouredNodes for the append for new blocks
[ https://issues.apache.org/jira/browse/HDFS-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7703: Affects Version/s: 2.6.0 Support favouredNodes for the append for new blocks --- Key: HDFS-7703 URL: https://issues.apache.org/jira/browse/HDFS-7703 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Currently favoredNodes are supported for new file creation, and these nodes are applicable to all blocks of the file. The same support should be available when a file is opened for append. Note that even though the original file may not have used favored nodes, the favoredNodes passed to append will be used only for the new blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
[ https://issues.apache.org/jira/browse/HDFS-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296805#comment-14296805 ] Hudson commented on HDFS-7681: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #89 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/89/]) HDFS-7681. Change ReplicaInputStreams constructor to take InputStream(s) instead of FileDescriptor(s). Contributed by Joe Pallas (szetszwo: rev 5a0051f4da6e102846d795a7965a6a18216d74f7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaInputStreams.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix ReplicaInputStream constructor to take InputStreams --- Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas Fix For: 3.0.0 Attachments: HDFS-7681.patch As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296808#comment-14296808 ] Hudson commented on HDFS-7423: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #89 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/89/]) HDFS-7423. various typos and message formatting fixes in nfs daemon and doc. (Charles Lamb via yliu) (yliu: rev f37849188b05a6251584de1aed5e66d5dfa7da4f) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck
[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297207#comment-14297207 ] Subbu commented on HDFS-7175: - I tried on jdk7. Note that the timeout happens only on large clusters (that take more than a minute to scan). [~ajisakaa] did you try out tcpdump? Client-side SocketTimeoutException during Fsck -- Key: HDFS-7175 URL: https://issues.apache.org/jira/browse/HDFS-7175 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Carl Steinbach Assignee: Akira AJISAKA Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch HDFS-2538 disabled status reporting for the fsck command (it can optionally be enabled with the -showprogress option). We have observed that without status reporting the client will abort with a read timeout: {noformat} [hdfs@lva1-hcl0030 ~]$ hdfs fsck / Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070 14/09/30 06:03:41 WARN security.UserGroupInformation: PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) cause:java.net.SocketTimeoutException: Read timed out Exception in thread "main" java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346) {noformat} Since there's nothing for the client to read, it will abort if the time required to complete the fsck operation is longer than the client's read timeout setting. I can think of a couple ways to fix this: # Set an infinite read timeout on the client side (not a good idea!). # Have the server-side write (and flush) zeros to the wire and instruct the client to ignore these characters instead of echoing them. # It's possible that flushing an empty buffer on the server-side will trigger an HTTP response with a zero-length payload. This may be enough to keep the client from hanging up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4265) BKJM doesn't take advantage of speculative reads
[ https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-4265: --- Attachment: 0005-HDFS-4265.patch BKJM doesn't take advantage of speculative reads Key: HDFS-4265 URL: https://issues.apache.org/jira/browse/HDFS-4265 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: 2.2.0 Reporter: Ivan Kelly Assignee: Rakesh R Attachments: 0005-HDFS-4265.patch, 001-HDFS-4265.patch, 002-HDFS-4265.patch, 003-HDFS-4265.patch, 004-HDFS-4265.patch BookKeeperEditLogInputStream reads one entry at a time, so it doesn't take advantage of the speculative read mechanism introduced by BOOKKEEPER-336. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4265) BKJM doesn't take advantage of speculative reads
[ https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297298#comment-14297298 ] Rakesh R commented on HDFS-4265: I've rebased the fix on trunk and attached a new patch. There are no extra changes compared to the previous patch. BKJM doesn't take advantage of speculative reads Key: HDFS-4265 URL: https://issues.apache.org/jira/browse/HDFS-4265 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: 2.2.0 Reporter: Ivan Kelly Assignee: Rakesh R Attachments: 0005-HDFS-4265.patch, 001-HDFS-4265.patch, 002-HDFS-4265.patch, 003-HDFS-4265.patch, 004-HDFS-4265.patch BookKeeperEditLogInputStream reads one entry at a time, so it doesn't take advantage of the speculative read mechanism introduced by BOOKKEEPER-336. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
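For context on the change: the gist is to read entries in windows rather than one at a time, so the BookKeeper client's speculative reads (BOOKKEEPER-336) can race replicas across the whole batch. A minimal sketch of that idea against the standard LedgerHandle API, with an illustrative READ_AHEAD constant — this is the shape of the approach, not the patch itself:
{code}
import java.util.Enumeration;
import org.apache.bookkeeper.client.LedgerEntry;
import org.apache.bookkeeper.client.LedgerHandle;

class BatchedLedgerReaderSketch {
  private static final int READ_AHEAD = 64; // assumed batch size

  static Enumeration<LedgerEntry> readBatch(LedgerHandle lh, long nextEntry)
      throws Exception {
    // Clamp the window to the last entry known to be safely replicated.
    long last = Math.min(nextEntry + READ_AHEAD - 1, lh.getLastAddConfirmed());
    // One batched read; the edit log input stream would then serve entries
    // from this enumeration until it is exhausted.
    return lh.readEntries(nextEntry, last);
  }
}
{code}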
[jira] [Commented] (HDFS-7603) The background replication queue initialization may not let others run
[ https://issues.apache.org/jira/browse/HDFS-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297302#comment-14297302 ] Daryn Sharp commented on HDFS-7603: --- +1 Looks good; understood it's too hard to write a test to prove/disprove unfair lock starvation under large namespaces. Tested internally. The background replication queue initialization may not let others run -- Key: HDFS-7603 URL: https://issues.apache.org/jira/browse/HDFS-7603 Project: Hadoop HDFS Issue Type: Bug Components: rolling upgrades Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7603.patch, HDFS-7603.patch The background replication queue initialization processes a configured number of blocks at a time and releases the namesystem write lock. This was to let the namenode start serving right after a standby-to-active transition or after leaving safe mode. However, this does not allow others to run much if the lock fairness is set to unfair for higher throughput. I propose adding a delay between unlocking and re-locking in the async repl queue init thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
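For readers unfamiliar with the fix: with an unfair lock, the init thread can immediately re-acquire the write lock it just released, starving everyone else; a short sleep between unlock and re-lock gives waiters a chance. A self-contained sketch of that pattern — a plain ReentrantReadWriteLock stands in for the namesystem lock, and the chunk size and delay are illustrative, not the committed values:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReplQueueInitSketch {
  private static final long SLEEP_MS = 1;        // assumed delay value
  private final ReentrantReadWriteLock nsLock =
      new ReentrantReadWriteLock(false);         // unfair, as in the issue
  private long remainingBlocks = 1_000_000;      // stand-in for the queue size

  void initializeReplQueues() throws InterruptedException {
    while (remainingBlocks > 0) {
      nsLock.writeLock().lock();
      try {
        // Process one configured chunk of blocks under the write lock.
        remainingBlocks -= Math.min(remainingBlocks, 10_000);
      } finally {
        nsLock.writeLock().unlock();
      }
      // The proposed delay: without it, an unfair lock lets this thread
      // win the lock back immediately and starve other namenode operations.
      Thread.sleep(SLEEP_MS);
    }
  }
}
{code}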
[jira] [Commented] (HDFS-7314) Aborted DFSClient's impact on long running service like YARN
[ https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297266#comment-14297266 ] Gera Shegalov commented on HDFS-7314: - Actually I need to take #1 back; I misspoke. DFS#close does call super.close():
{code}
@Override
public void close() throws IOException {
  try {
    dfs.closeOutputStreams(false);
    super.close();
  } finally {
    dfs.close();
  }
}
{code}
So it's only about #2. Aborted DFSClient's impact on long running service like YARN Key: HDFS-7314 URL: https://issues.apache.org/jira/browse/HDFS-7314 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, HDFS-7314-5.patch, HDFS-7314-6.patch, HDFS-7314-7.patch, HDFS-7314.patch It happened in the YARN nodemanager scenario, but it could happen to any long running service that uses a cached instance of DistributedFileSystem. 1. Active NN is under heavy load, so it became unavailable for 10 minutes; any DFSClient request will get ConnectTimeoutException. 2. YARN nodemanager uses DFSClient for certain write operations such as the log aggregator or shared cache in YARN-1492. The DFSClient used by YARN NM's renewLease RPC got ConnectTimeoutException. {noformat} 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds. Aborting ... {noformat} 3. After DFSClient is in Aborted state, YARN NM can't use that cached instance of DistributedFileSystem. {noformat} 2014-10-29 20:26:23,991 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc... java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780) at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124) at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} We can make YARN or DFSClient more tolerant to temporary NN unavailability. Given the callstack is YARN - DistributedFileSystem - DFSClient, this can be addressed at different layers. * YARN closes the DistributedFileSystem object when it receives some well defined exception. Then the next HDFS call will create a new instance of DistributedFileSystem. We have to fix all the places in YARN. Plus other HDFS applications need to address this as well. * DistributedFileSystem detects an aborted DFSClient and creates a new instance of DFSClient. We will need to fix all the places DistributedFileSystem calls DFSClient.
* After DFSClient gets into Aborted state, it doesn't have to reject all requests; instead it can retry. If the NN is available again, it can transition back to a healthy state. Comments? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
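Of the options above, the second (DistributedFileSystem detecting an aborted DFSClient and recreating it) is the one that stays transparent to applications. An illustrative sketch of that shape, with hypothetical interfaces standing in for DFSClient internals — the real classes expose no such surface:
{code}
import java.io.IOException;

class SelfHealingClientHolderSketch {
  interface Client {
    boolean isAborted();              // e.g. lease renewal gave up
    void close() throws IOException;
  }
  interface ClientFactory {
    Client create() throws IOException;
  }

  private final ClientFactory factory;
  private Client client;

  SelfHealingClientHolderSketch(ClientFactory factory) throws IOException {
    this.factory = factory;
    this.client = factory.create();
  }

  // Every filesystem operation would fetch its client through here,
  // so recovery stays transparent to the application.
  synchronized Client get() throws IOException {
    if (client.isAborted()) {
      client.close();                 // drop the dead client
      client = factory.create();      // and start a fresh one
    }
    return client;
  }
}
{code}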
[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck
[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297199#comment-14297199 ] Allen Wittenauer commented on HDFS-7175: I'm talking specifically about the null not getting sent across the socket, since it sounds like a) it did work for [~ajisakaa] and b) I know that LI has mostly transitioned over to JDK8. Client-side SocketTimeoutException during Fsck -- Key: HDFS-7175 URL: https://issues.apache.org/jira/browse/HDFS-7175 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Carl Steinbach Assignee: Akira AJISAKA Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch HDFS-2538 disabled status reporting for the fsck command (it can optionally be enabled with the -showprogress option). We have observed that without status reporting the client will abort with read timeout: {noformat} [hdfs@lva1-hcl0030 ~]$ hdfs fsck / Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070 14/09/30 06:03:41 WARN security.UserGroupInformation: PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) cause:java.net.SocketTimeoutException: Read timed out Exception in thread main java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346) {noformat} Since there's nothing for the client to read it will abort if the time required to complete the fsck operation is longer than the client's read timeout setting. I can think of a couple ways to fix this: # Set an infinite read timeout on the client side (not a good idea!). # Have the server-side write (and flush) zeros to the wire and instruct the client to ignore these characters instead of echoing them. # It's possible that flushing an empty buffer on the server-side will trigger an HTTP response with a zero length payload. This may be enough to keep the client from hanging up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7603) The background replication queue initialization may not let others run
[ https://issues.apache.org/jira/browse/HDFS-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7603: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. The background replication queue initialization may not let others run -- Key: HDFS-7603 URL: https://issues.apache.org/jira/browse/HDFS-7603 Project: Hadoop HDFS Issue Type: Bug Components: rolling upgrades Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7603.patch, HDFS-7603.patch The background replication queue initialization processes a configured number of blocks at a time and releases the namesystem write lock. This was to let the namenode start serving right after a standby-to-active transition or after leaving safe mode. However, this does not allow others to run much if the lock fairness is set to unfair for higher throughput. I propose adding a delay between unlocking and re-locking in the async repl queue init thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296898#comment-14296898 ] Hudson commented on HDFS-7611: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #85 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/85/]) HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. Contributed by Jing Zhao and Byron Wong. (jing9: rev d244574d03903b0514b0308da85d2f06c2384524) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirDeleteOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Jing Zhao Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled a combination of operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, and can prevent NameNode from coming out of safeMode and could cause memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
[ https://issues.apache.org/jira/browse/HDFS-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296895#comment-14296895 ] Hudson commented on HDFS-7681: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #85 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/85/]) HDFS-7681. Change ReplicaInputStreams constructor to take InputStream(s) instead of FileDescriptor(s). Contributed by Joe Pallas (szetszwo: rev 5a0051f4da6e102846d795a7965a6a18216d74f7) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaInputStreams.java Fix ReplicaInputStream constructor to take InputStreams --- Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas Fix For: 3.0.0 Attachments: HDFS-7681.patch As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7648) Verify the datanode directory layout
[ https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R reassigned HDFS-7648: -- Assignee: Rakesh R Verify the datanode directory layout Key: HDFS-7648 URL: https://issues.apache.org/jira/browse/HDFS-7648 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Rakesh R HDFS-6482 changed datanode layout to use block ID to determine the directory to store the block. We should have some mechanism to verify it. Either DirectoryScanner or block report generation could do the check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296932#comment-14296932 ] Hudson commented on HDFS-7611: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2020 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2020/]) HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. Contributed by Jing Zhao and Byron Wong. (jing9: rev d244574d03903b0514b0308da85d2f06c2384524) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirDeleteOp.java deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Jing Zhao Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled a combination of operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, and can prevent NameNode from coming out of safeMode and could cause memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7648) Verify the datanode directory layout
[ https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296925#comment-14296925 ] Rakesh R commented on HDFS-7648: Thanks [~cmccabe] for the comments. bq. Probably what we want to do is log a warning about files that are in locations they do not belong in Do you mean DirectoryScanner should identify the blocks that are not in the expected directory path computed from the block ID, and just log a message without fixing them? As per the earlier discussion, they need to be fixed. I had an idea similar to the mechanism used in DataStorage: creating hardlinks. Please correct me if I missed anything. Verify the datanode directory layout Key: HDFS-7648 URL: https://issues.apache.org/jira/browse/HDFS-7648 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze HDFS-6482 changed the datanode layout to use the block ID to determine the directory in which to store the block. We should have some mechanism to verify it. Either DirectoryScanner or block report generation could do the check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
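For reference, the verification under discussion boils down to recomputing the directory a replica should live in from its block ID and comparing it with where the scanner actually found the file. A sketch of that check using the two-level subdir scheme introduced by HDFS-6482; the 0xff masks match the 256x256 layout as I understand it, so treat the exact constants as illustrative:
{code}
import java.io.File;

class BlockDirVerifierSketch {
  // Expected location under <bpRoot>/current/finalized for a given block ID.
  static File expectedDir(File finalizedRoot, long blockId) {
    int d1 = (int) ((blockId >> 16) & 0xff);
    int d2 = (int) ((blockId >> 8) & 0xff);
    return new File(finalizedRoot,
        "subdir" + d1 + File.separator + "subdir" + d2);
  }

  // DirectoryScanner could log a warning, or relocate the replica (e.g.
  // via the hardlink mechanism mentioned above), when this returns true.
  static boolean isMisplaced(File actualDir, File finalizedRoot, long blockId) {
    return !expectedDir(finalizedRoot, blockId).equals(actualDir);
  }
}
{code}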
[jira] [Commented] (HDFS-6673) Add delimited format support to PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296928#comment-14296928 ] Hudson commented on HDFS-6673: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2020 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2020/]) HDFS-6673. Add delimited format support to PB OIV tool. Contributed by Eddy Xu. (wang: rev caf7298e49f646a40023af999f62d61846fde5e2) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageDelimitedTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/pom.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageTextWriter.java Add delimited format support to PB OIV tool --- Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds supports of _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296933#comment-14296933 ] Hudson commented on HDFS-7423: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2020 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2020/]) HDFS-7423. various typos and message formatting fixes in nfs daemon and doc. (Charles Lamb via yliu) (yliu: rev f37849188b05a6251584de1aed5e66d5dfa7da4f) * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
[ https://issues.apache.org/jira/browse/HDFS-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296929#comment-14296929 ] Hudson commented on HDFS-7681: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2020 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2020/]) HDFS-7681. Change ReplicaInputStreams constructor to take InputStream(s) instead of FileDescriptor(s). Contributed by Joe Pallas (szetszwo: rev 5a0051f4da6e102846d795a7965a6a18216d74f7) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaInputStreams.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix ReplicaInputStream constructor to take InputStreams --- Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas Fix For: 3.0.0 Attachments: HDFS-7681.patch As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296899#comment-14296899 ] Hudson commented on HDFS-7423: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #85 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/85/]) HDFS-7423. various typos and message formatting fixes in nfs daemon and doc. (Charles Lamb via yliu) (yliu: rev f37849188b05a6251584de1aed5e66d5dfa7da4f) * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/AsyncDataService.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Fix For: 2.7.0 Attachments: HDFS-7423-branch-2.004.patch, HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch, HDFS-7423.004.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6673) Add delimited format support to PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296894#comment-14296894 ] Hudson commented on HDFS-6673: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #85 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/85/]) HDFS-6673. Add delimited format support to PB OIV tool. Contributed by Eddy Xu. (wang: rev caf7298e49f646a40023af999f62d61846fde5e2) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java * hadoop-hdfs-project/hadoop-hdfs/pom.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageTextWriter.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageDelimitedTextWriter.java Add delimited format support to PB OIV tool --- Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds supports of _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7704: - Attachment: HDFS-7704.patch DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which will wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but it takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous, since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7704: - Status: Patch Available (was: Open) To fix the issue, I created a queue for each actor thread which will enqueue the two synchronous method calls and process them at the end of the actor loop. DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which will wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but it takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous, since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
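A rough sketch of that queueing approach, with hypothetical names loosely modeled on the description above: each actor thread drains its own thread-safe queue of deferred report tasks, so the callers, and therefore heartbeat processing, never block on a slow or unreachable NN.
{code}
import java.util.concurrent.LinkedBlockingQueue;

class ActorQueueSketch {
  private final LinkedBlockingQueue<Runnable> bpThreadQueue =
      new LinkedBlockingQueue<>();

  // Called from BPOfferService, possibly with the write lock held:
  // O(1) and non-blocking, so heartbeat processing is never held up.
  void enqueueReport(Runnable report) {
    bpThreadQueue.offer(report);
  }

  // Called at the end of each actor loop iteration, outside any lock;
  // the actual NN RPCs happen here, on the actor thread.
  void processQueue() {
    Runnable task;
    while ((task = bpThreadQueue.poll()) != null) {
      task.run();
    }
  }
}
{code}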
[jira] [Commented] (HDFS-7603) The background replication queue initialization may not let others run
[ https://issues.apache.org/jira/browse/HDFS-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297315#comment-14297315 ] Hudson commented on HDFS-7603: -- FAILURE: Integrated in Hadoop-trunk-Commit #6962 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6962/]) HDFS-7603. The background replication queue initialization may not let others run. Contributed by Kihwal Lee. (kihwal: rev 89b07490f8354bb83a67b7ffc917bfe99708e615) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java The background replication queue initialization may not let others run -- Key: HDFS-7603 URL: https://issues.apache.org/jira/browse/HDFS-7603 Project: Hadoop HDFS Issue Type: Bug Components: rolling upgrades Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7603.patch, HDFS-7603.patch The background replication queue initialization processes configured number of blocks at a time and releases the namesystem write lock. This was to let namenode start serving right after a standby to active transition or leaving safe mode. However, this does not allow others to run much if the lock fairness is set to unfair for the higher throughput. I propose adding a delay between unlocking and locking in the async repl queue init thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7577) Add additional headers that includes need by Windows
[ https://issues.apache.org/jira/browse/HDFS-7577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297336#comment-14297336 ] Thanh Do commented on HDFS-7577: Thanks Colin. I'll work on the next patch soon. Best! Add additional headers that includes need by Windows Key: HDFS-7577 URL: https://issues.apache.org/jira/browse/HDFS-7577 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Thanh Do Assignee: Thanh Do Fix For: HDFS-6994 Attachments: HDFS-7577-branch-HDFS-6994-0.patch, HDFS-7577-branch-HDFS-6994-1.patch, HDFS-7577-branch-HDFS-6994-2.patch This jira involves adding a list of (mostly dummy) headers that are available on POSIX systems but not on Windows. One step towards making libhdfs3 build on Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4265) BKJM doesn't take advantage of speculative reads
[ https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297368#comment-14297368 ] Hadoop QA commented on HDFS-4265: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695322/0005-HDFS-4265.patch against trunk revision 342efa1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9374//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9374//artifact/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9374//artifact/patchprocess/newPatchFindbugsWarningsbkjournal.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9374//console This message is automatically generated. BKJM doesn't take advantage of speculative reads Key: HDFS-4265 URL: https://issues.apache.org/jira/browse/HDFS-4265 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: 2.2.0 Reporter: Ivan Kelly Assignee: Rakesh R Attachments: 0005-HDFS-4265.patch, 001-HDFS-4265.patch, 002-HDFS-4265.patch, 003-HDFS-4265.patch, 004-HDFS-4265.patch BookKeeperEditLogInputStream reads entry at a time, so it doesn't take advantage of the speculative read mechanism introduced by BOOKKEEPER-336. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297385#comment-14297385 ] Hadoop QA commented on HDFS-7704: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695290/HDFS-7704.patch against trunk revision 03a5e04. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9371//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9371//console This message is automatically generated. DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704.patch There are couple of synchronous calls in BPOfferservice (i.e reportBadBlocks and trySendErrorReport) which will wait for both of the actor threads to process this calls. This calls are made with writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but takes too long and it blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting improvement in datanode to make the above calls asynchronous since these reports don't have any specific deadlines, so extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7696) FsDatasetImpl.getTmpInputStreams(..) may leak file descriptors
[ https://issues.apache.org/jira/browse/HDFS-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297416#comment-14297416 ] Joe Pallas commented on HDFS-7696: -- That looks right; I'm embarrassed I didn't do it myself. FsDatasetImpl.getTmpInputStreams(..) may leak file descriptors -- Key: HDFS-7696 URL: https://issues.apache.org/jira/browse/HDFS-7696 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h7696_20150128.patch getTmpInputStreams(..) opens a block file and a meta file, and then returns them as ReplicaInputStreams. The caller is responsible for closing those streams. In case of errors, an exception is thrown without closing the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7647) DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs
[ https://issues.apache.org/jira/browse/HDFS-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Milan Desai updated HDFS-7647: -- Status: In Progress (was: Patch Available) DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs -- Key: HDFS-7647 URL: https://issues.apache.org/jira/browse/HDFS-7647 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Milan Desai Assignee: Milan Desai Attachments: HDFS-7647-2.patch, HDFS-7647.patch DatanodeManager.sortLocatedBlocks() sorts the array of DatanodeInfos inside each LocatedBlock, but does not touch the array of StorageIDs and StorageTypes. As a result, the DatanodeInfos and StorageIDs/StorageTypes are mismatched. The method is called by FSNamesystem.getBlockLocations(), so the client will not know which StorageID/Type corresponds to which DatanodeInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7693) libhdfs: add hdfsFile cache
[ https://issues.apache.org/jira/browse/HDFS-7693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297456#comment-14297456 ] Chris Nauroth commented on HDFS-7693: - Hi Colin. I've only taken a quick scan of the patch so far. Can you please describe the motivation for this feature? Is this intended to optimize a particular usage pattern that you're seeing? Since the cache manipulation methods are in the public header, it appears it will be the caller's responsibility to allocate and use a cache. Did you consider trying to encapsulate the cache completely behind the existing API? The presence of mutexes implies that you intend for the cache to be called from multiple threads. I also see that LRU eviction can trigger a close of the file by side effect. I didn't see any reference counting, so is there a risk that one thread triggers eviction and close of a stream that another thread is still using? Again, maybe some of this would be clearer if I first understood the intended use case. I can volunteer to take the patch for a spin on Windows too. I can see right away that test_libhdfs_cache.c is likely to fail compilation, because the Microsoft C compiler doesn't support designated initializers on structs. Thanks! libhdfs: add hdfsFile cache --- Key: HDFS-7693 URL: https://issues.apache.org/jira/browse/HDFS-7693 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Affects Versions: 2.7.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7693.001.patch Add an hdfsFile cache inside libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
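The reference-counting concern in the review above is worth spelling out: LRU eviction closes a file as a side effect, so without a per-entry count, eviction can close a stream another thread is still reading. A sketch of the usual fix, written in Java for consistency with the rest of this digest even though libhdfs itself is C, and not claimed to reflect anything in the patch as posted:
{code}
import java.util.concurrent.atomic.AtomicInteger;

class CachedFileSketch {
  private final AtomicInteger refCount = new AtomicInteger(1); // cache's ref
  private volatile boolean closed;

  // A reader takes a reference before using the cached handle.
  void retain() {
    refCount.incrementAndGet();
  }

  // Readers and the evictor both release; only the last one closes.
  void release() {
    if (refCount.decrementAndGet() == 0) {
      closed = true; // the real hdfsCloseFile(...) would happen here
    }
  }

  // LRU eviction drops only the cache's own reference, so a stream still
  // in use by another thread stays open until that thread releases it.
  void evict() {
    release();
  }

  boolean isClosed() {
    return closed;
  }
}
{code}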
[jira] [Commented] (HDFS-7702) Move metadata across namenode - Effort to a real distributed namenode
[ https://issues.apache.org/jira/browse/HDFS-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297455#comment-14297455 ] Ray commented on HDFS-7702: --- Wrote a draft proposal. https://cwiki.apache.org/confluence/display/FLINK/Metadata+Moving+Tool+Design+Proposal+-+Effort+to+a+real+distributed+namenode Move metadata across namenode - Effort to a real distributed namenode - Key: HDFS-7702 URL: https://issues.apache.org/jira/browse/HDFS-7702 Project: Hadoop HDFS Issue Type: New Feature Reporter: Ray Assignee: Ray Implement a tool that can show the in-memory namespace tree structure with weight (size), and an API that can move metadata across different namenodes. The purpose is to move data efficiently and faster, without moving blocks on the datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297476#comment-14297476 ] Rushabh S Shah commented on HDFS-7704: -- This test passes on my local setup. DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which will wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but it takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous, since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7702) Move metadata across namenode - Effort to a real distributed namenode
[ https://issues.apache.org/jira/browse/HDFS-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297745#comment-14297745 ] Ray commented on HDFS-7702: --- Is the transfer granularity blockpool only? This is the last step (Clean Up): after the metadata flip is done, the target namenode will transfer only one blockpool id to the datanodes, so a datanode won't have to report the same block to both namenodes. So from namenode to namenode we transfer a namespace sub-tree; from namenode to datanode we transfer only the blockpool id and block id. but then this statement: it will mark delete the involved sub-tree from its own namespace leads me to believe that it's sub-trees in the namespace. OK, this one is from the previous step (Metadata Flip); it won't transfer anything. The metadata still exists on the source namenode, so we just mark it deleted — not really delete it, just remove the reference — in case we have to roll back. Could you please clarify this statement: all read and write operation regarding the same namespace sub-tree is forwarding to the target namenode. Who does the forwarding, the client or the source NN? The source NN will indicate whether the metadata is on another NN, and the client will do the forwarding. I haven't mentioned the overflow table yet: the client gets metadata from a namenode first, and that metadata includes a data structure pointing out that some paths live on other namenodes. The client then connects to all the other namenodes (if any; if only the previous step ran, it doesn't have to do this) in parallel and merges all the metadata together. Please feel free to tell me if anything is still not clear. Thank you for your first response! Move metadata across namenode - Effort to a real distributed namenode - Key: HDFS-7702 URL: https://issues.apache.org/jira/browse/HDFS-7702 Project: Hadoop HDFS Issue Type: New Feature Reporter: Ray Assignee: Ray Implement a tool that can show the in-memory namespace tree structure with weight (size), and an API that can move metadata across different namenodes. The purpose is to move data efficiently and faster, without moving blocks on the datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
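An illustrative sketch of the client-side forwarding flow Ray describes; all names here are hypothetical, an interpretation of the proposal rather than its code. The client resolves against the source NN first, follows a pointer when the sub-tree has been flipped to another NN, and only then reads the metadata:
{code}
import java.util.Optional;

class FederatedLookupSketch {
  interface Namenode {
    Optional<String> resolveLocally(String path);   // metadata, if held here
    Optional<Namenode> forwardedTo(String path);    // set after a metadata flip
  }

  // The client, not the source NN, does the forwarding: it follows the
  // pointer returned by the source NN and queries the target directly.
  static String getMetadata(Namenode nn, String path) {
    Optional<Namenode> target = nn.forwardedTo(path);
    if (target.isPresent()) {
      return getMetadata(target.get(), path);
    }
    return nn.resolveLocally(path).orElseThrow(
        () -> new IllegalStateException("no metadata for " + path));
  }
}
{code}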
[jira] [Commented] (HDFS-7702) Move metadata across namenode - Effort to a real distributed namenode
[ https://issues.apache.org/jira/browse/HDFS-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297712#comment-14297712 ] Ray commented on HDFS-7702: --- Hi [~clamb], Good questions. For the failure scenarios, I will document more on them. As you can imagine, it's a transaction: if one step fails, the whole transaction rolls back. The administrator has to kick it off again, and the whole transaction should not take a long time. I chose Kryo rather than protobuf because the namespace structure is too complex to convert so many Java objects to PB (maybe I am wrong here), and I have to transfer some namespace data dynamically. I use KryoNet because it's seamless with Kryo; it's possible to use IPC instead if there is a good reason. :) And I found a deadlock while trying to serialize the INode object; Kryo seems to handle customized serialization easily as well. I am sorry for the other unclear statements; I will try to make them clearer. Move metadata across namenode - Effort to a real distributed namenode - Key: HDFS-7702 URL: https://issues.apache.org/jira/browse/HDFS-7702 Project: Hadoop HDFS Issue Type: New Feature Reporter: Ray Assignee: Ray Implement a tool that can show the in-memory namespace tree structure with weight (size), and an API that can move metadata across different namenodes. The purpose is to move data efficiently and faster, without moving blocks on the datanodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
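For readers unfamiliar with the Kryo choice mentioned above: a custom Serializer sidesteps both a hand-written protobuf mapping and the default field-walking that can recurse through a complex object graph. A minimal sketch against the Kryo 2.x/3.x API, with a stand-in NamespaceNode type rather than the real INode:
{code}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

class NamespaceSerializerSketch {
  static class NamespaceNode {
    String name;
    long weight; // subtree size, as in the proposal's "weight(size)"
  }

  static Kryo newKryo() {
    Kryo kryo = new Kryo();
    kryo.setReferences(true); // tolerate shared references in the graph
    kryo.register(NamespaceNode.class, new Serializer<NamespaceNode>() {
      @Override
      public void write(Kryo k, Output out, NamespaceNode n) {
        out.writeString(n.name);
        out.writeLong(n.weight);
      }
      @Override
      public NamespaceNode read(Kryo k, Input in, Class<NamespaceNode> type) {
        NamespaceNode n = new NamespaceNode();
        n.name = in.readString();
        n.weight = in.readLong();
        return n;
      }
    });
    return kryo;
  }
}
{code}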
[jira] [Updated] (HDFS-7705) FileSystem should expose some performance characteristics for caller (e.g., FsShell) to choose the right algorithm.
[ https://issues.apache.org/jira/browse/HDFS-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7705: Summary: FileSystem should expose some performance characteristics for caller (e.g., FsShell) to choose the right algorithm. (was: {{FileSystem}} should expose some performance characteristics for caller (e.g., FsShell) to choose right algorithm.) FileSystem should expose some performance characteristics for caller (e.g., FsShell) to choose the right algorithm. --- Key: HDFS-7705 URL: https://issues.apache.org/jira/browse/HDFS-7705 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu When running {{hadoop fs -put}}, {{FsShell}} creates a {{._COPYING_.}} file in the target directory, and then renames it to the target file when the write is done. However, for some target systems, such as S3, Azure and Swift, a partially failed write request (i.e., {{PUT}}) has no side effect, while the {{rename}} operation is expensive. {{FileSystem}} should expose some characteristics so that operations such as {{CommandWithDestination#copyStreamToTarget()}} can detect this and choose the right way to do the copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
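A sketch of the shape such a hint could take. The API below is entirely hypothetical — this JIRA only proposes the idea — but it shows the decision the shell would make: ask the target FileSystem whether rename is cheap and atomic, and only use the temp-file-plus-rename dance when it is.
{code}
import java.io.IOException;
import java.io.InputStream;

class CopyStrategySketch {
  interface Fs {
    boolean hasCheapAtomicRename();   // hypothetical capability hint
    void write(String path, InputStream in) throws IOException;
    void rename(String from, String to) throws IOException;
  }

  static void copyStreamToTarget(Fs fs, InputStream in, String target)
      throws IOException {
    if (fs.hasCheapAtomicRename()) {
      // HDFS-style: hide the partial file until the write completes.
      String tmp = target + "._COPYING_";
      fs.write(tmp, in);
      fs.rename(tmp, target);
    } else {
      // S3-style: a failed PUT has no side effect, and rename is a
      // full copy, so writing directly is both safe and cheaper.
      fs.write(target, in);
    }
  }
}
{code}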
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297658#comment-14297658 ] Hadoop QA commented on HDFS-7584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695305/HDFS-7584.9a.patch against trunk revision 57b8950. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 10 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9373//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9373//artifact/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9373//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9373//console This message is automatically generated. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, editsStored, editsStored Phase II of the Heterogeneous storage features have completed by HDFS-6584. This JIRA is opened to enable Quota support of different storage types in terms of storage space usage. This is more important for certain storage types such as SSD as it is precious and more performant. As described in the design doc of HDFS-5682, we plan to add new quotaByStorageType command and new name node RPC protocol for it. The quota by storage type feature is applied to HDFS directory level similar to traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7702) Move metadata across namenode - Effort to a real distributed namenode
[ https://issues.apache.org/jira/browse/HDFS-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297657#comment-14297657 ] Charles Lamb commented on HDFS-7702: Hi [~xiyunyue], I read over your proposal and have some high level questions. I am unclear about your proposal in the failure scenarios. If a source or target NN or one or more of the DNs fails in the middle of a migration, how are things restarted? Why use Kryo and not protobuf for serialization? Why use Kryo and not the existing Hadoop/HDFS protocols and infrastructure for network communications between the various nodes? Is the transfer granularity blockpool only? I infer that from this statement: bq. The target namenode will notify datanode remove blockpool id which belong to the source namenode, but then this statement: bq. it will mark delete the involved sub-tree from its own namespace leads me to believe that it's sub-trees in the namespace. Could you please clarify this statement: bq. all read and write operation regarding the same namespace sub-tree is forwarding to the target namenode. Who does the forwarding, the client or the source NN? Move metadata across namenode - Effort to a real distributed namenode - Key: HDFS-7702 URL: https://issues.apache.org/jira/browse/HDFS-7702 Project: Hadoop HDFS Issue Type: New Feature Reporter: Ray Assignee: Ray Implement a tool can show in memory namespace tree structure with weight(size) and a API can move metadata across different namenode. The purpose is moving data efficiently and faster, without moving blocks on datanode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7584: - Attachment: HDFS-7584.9b.patch Fix the findbugs issue. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, HDFS-7584.9b.patch, editsStored, editsStored Phase II of the Heterogeneous storage features have completed by HDFS-6584. This JIRA is opened to enable Quota support of different storage types in terms of storage space usage. This is more important for certain storage types such as SSD as it is precious and more performant. As described in the design doc of HDFS-5682, we plan to add new quotaByStorageType command and new name node RPC protocol for it. The quota by storage type feature is applied to HDFS directory level similar to traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7705) {{FileSystem}} should expose some performance characteristics for caller (e.g., FsShell) to choose right algorithm.
Lei (Eddy) Xu created HDFS-7705: --- Summary: {{FileSystem}} should expose some performance characteristics for caller (e.g., FsShell) to choose right algorithm. Key: HDFS-7705 URL: https://issues.apache.org/jira/browse/HDFS-7705 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu When running {{hadoop fs -put}}, {{FsShell}} creates a {{._COPYING_.}} file in the target directory, and then renames it to the target file when the write is done. However, for some target systems, such as S3, Azure and Swift, a partially failed write request (i.e., {{PUT}}) has no side effect, while the {{rename}} operation is expensive. {{FileSystem}} should expose some characteristics so that operations such as {{CommandWithDestination#copyStreamToTarget()}} can detect this and choose the right way to do the copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297816#comment-14297816 ] Kihwal Lee commented on HDFS-7704: -- The high-level approach seems fine, but the use of {{DatanodeCommand}} for something internal to the datanode is a bit strange. {{DatanodeCommand}} is for the namenode to instruct the datanode to do certain tasks. Also, it will be nice if the bad block reporting portion of the change can aggregate multiple such reports and make a single namenode rpc call when multiple bad block reports are present in the queue. The queue, {{bpThreadQueue}}, also needs proper synchronization. DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e. reportBadBlocks and trySendErrorReport) which will wait for both of the actor threads to process these calls. These calls are made with the writeLock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but it takes too long and blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous, since these reports don't have any specific deadlines, so an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
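A sketch of the aggregation Kihwal suggests, with illustrative names: bad-block reports are enqueued thread-safely and drained into a single reportBadBlocks RPC on the actor thread, instead of one RPC per report.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BadBlockBatcherSketch {
  static class Block { }                       // stand-in for LocatedBlock

  interface NamenodeRpc {
    void reportBadBlocks(List<Block> blocks);  // one aggregated call
  }

  // LinkedBlockingQueue provides the "proper synchronization" for free.
  private final BlockingQueue<Block> pending = new LinkedBlockingQueue<>();

  void reportBadBlock(Block b) {
    pending.offer(b);                          // non-blocking enqueue
  }

  // Run on the actor thread: drain everything queued so far and issue a
  // single RPC, rather than one call per report.
  void flush(NamenodeRpc nn) {
    List<Block> batch = new ArrayList<>();
    pending.drainTo(batch);
    if (!batch.isEmpty()) {
      nn.reportBadBlocks(batch);
    }
  }
}
{code}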
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297759#comment-14297759 ] Tsz Wo Nicholas Sze commented on HDFS-7285: --- Thanks for posting the meeting note. The meeting was very productive! ... The COMPLETE state needs to collect acks from all participating DNs in the group. It should instead collect acks from the minimum number of DNs required for reading the data; e.g., the minimum is 6 for (6,3)-Reed-Solomon. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed in Hadoop 2.0 for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are not intended to be appended to anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these drawbacks, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies, making it self-contained and independently maintained. This design lays the EC feature on top of the storage type support and is designed to be compatible with existing HDFS features like caching, snapshots, encryption, and high availability. This design will also support different EC coding schemes, implementations, and policies for different deployment scenarios. By utilizing advanced libraries (e.g., the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
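To illustrate the ack rule Nicholas describes, a tiny counting sketch; the class and method names are invented for illustration and are not from any posted patch:

{code}
/**
 * Illustrative only: a striped block group can enter the COMPLETE state
 * once the acked units guarantee readability, i.e. at least as many as
 * the data units in the EC schema (6 for (6,3)-Reed-Solomon), rather
 * than all data + parity units (9).
 */
class BlockGroupCompletion {
  private final int numDataUnits;
  private int ackedUnits;

  BlockGroupCompletion(int numDataUnits) {
    this.numDataUnits = numDataUnits;
  }

  synchronized void onDatanodeAck() {
    ackedUnits++;
  }

  synchronized boolean canComplete() {
    return ackedUnits >= numDataUnits;
  }
}
{code}

With {{new BlockGroupCompletion(6)}}, {{canComplete()}} turns true on the sixth ack, so one slow parity writer cannot hold the whole group open.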
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298027#comment-14298027 ] Hadoop QA commented on HDFS-7584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695373/HDFS-7584.9b.patch against trunk revision ad55083. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 10 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockScanner org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9375//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9375//artifact/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9375//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9375//console This message is automatically generated. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, HDFS-7584.9b.patch, editsStored, editsStored Phase II of the Heterogeneous Storage feature was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7648) Verify the datanode directory layout
[ https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297894#comment-14297894 ] Colin Patrick McCabe commented on HDFS-7648: bq. Do you mean that DirectoryScanner should identify the blocks which are not in the expected directory path computed from the block ID, and just log a message without fixing them, correct? bq. As per the earlier discussion it needs to be fixed. I had an idea similar to the mechanism used in DataStorage: create hardlinks. Please correct me if I missed anything. It's not the goal of DirectoryScanner to fix anything. It should not modify the filesystem. Verify the datanode directory layout Key: HDFS-7648 URL: https://issues.apache.org/jira/browse/HDFS-7648 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Rakesh R HDFS-6482 changed the datanode layout to use the block ID to determine the directory in which to store the block. We should have some mechanism to verify it. Either DirectoryScanner or block report generation could do the check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
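For context on what such a check would compare: HDFS-6482 derives a block's two-level subdirectory from its ID (see {{DatanodeUtil#idToBlockDir}}). A log-only verification in the spirit of Colin's comment could look roughly like the sketch below; the bit masks follow the HDFS-6482 scheme, but treat the exact values and names here as assumptions of the sketch:

{code}
import java.io.File;

class BlockDirVerifier {
  private static final String SUBDIR_PREFIX = "subdir";

  /** Expected directory for a block ID under the HDFS-6482 layout. */
  static File idToBlockDir(File finalizedDir, long blockId) {
    int d1 = (int) ((blockId >> 16) & 0xFF);
    int d2 = (int) ((blockId >> 8) & 0xFF);
    return new File(finalizedDir,
        SUBDIR_PREFIX + d1 + File.separator + SUBDIR_PREFIX + d2);
  }

  /** Log-only check: report misplaced blocks, never move or relink them. */
  static void verify(File finalizedDir, File blockFile, long blockId) {
    File expected = idToBlockDir(finalizedDir, blockId);
    if (!expected.equals(blockFile.getParentFile())) {
      System.err.println("Block " + blockId + " found in "
          + blockFile.getParent() + " but expected " + expected);
    }
  }
}
{code}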
[jira] [Created] (HDFS-7706) Switch BlockManager logging to use slf4j
Andrew Wang created HDFS-7706: - Summary: Switch BlockManager logging to use slf4j Key: HDFS-7706 URL: https://issues.apache.org/jira/browse/HDFS-7706 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Nice little refactor to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7706) Switch BlockManager logging to use slf4j
[ https://issues.apache.org/jira/browse/HDFS-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7706: -- Attachment: hdfs-7706.001.patch Patch attached. Also refactored a common logging idiom I saw in tests into a new test helper function. Switch BlockManager logging to use slf4j Key: HDFS-7706 URL: https://issues.apache.org/jira/browse/HDFS-7706 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-7706.001.patch Nice little refactor to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
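For readers unfamiliar with this kind of refactor, the change is mechanical. A before/after sketch using the standard slf4j API; the class here is a stand-in for illustration, not the actual BlockManager diff:

{code}
// Before (commons-logging):
//   private static final Log LOG = LogFactory.getLog(BlockManager.class);
//   LOG.info("Processed block report from " + node + " with " + n + " blocks");

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class Slf4jSwitchExample {
  static final Logger LOG = LoggerFactory.getLogger(Slf4jSwitchExample.class);

  void processReport(String node, int blocks) {
    // Parameterized messages skip string concatenation when INFO is off.
    LOG.info("Processed block report from {} with {} blocks", node, blocks);
    if (LOG.isDebugEnabled()) {
      LOG.debug("Detailed state: {}", expensiveDump());
    }
  }

  private String expensiveDump() {
    return "...";
  }
}
{code}

After the switch nothing in the class needs commons-logging, so the old {{Log}} import should be removed as well.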
[jira] [Commented] (HDFS-7706) Switch BlockManager logging to use slf4j
[ https://issues.apache.org/jira/browse/HDFS-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297950#comment-14297950 ] Yi Liu commented on HDFS-7706: -- Thanks Andrew, +1 pending Jenkins Switch BlockManager logging to use slf4j Key: HDFS-7706 URL: https://issues.apache.org/jira/browse/HDFS-7706 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-7706.001.patch Nice little refactor to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298016#comment-14298016 ] Hadoop QA commented on HDFS-7339: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695409/HDFS-7339-007.patch against trunk revision e36ef3b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9377//console This message is automatically generated. Allocating and persisting block groups in NameNode -- Key: HDFS-7339 URL: https://issues.apache.org/jira/browse/HDFS-7339 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch, HDFS-7339-006.patch, HDFS-7339-007.patch, Meta-striping.jpg, NN-stripping.jpg All erasure codec operations center around the concept of _block group_; they are formed in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}} is created to record the original and parity blocks in a coding group, as well as a pointer to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes, with the current mode recorded in a binary flag. An array of BlockGroups (or BlockGroup IDs) is added, which remains empty for “traditional” HDFS files with contiguous block layout. The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}} component; the attached figure has an illustration of the architecture. As a simple example, when a {_Striping+EC_} file is created and written to, it will serve requests from the client to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, {{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7339: Attachment: HDFS-7339-007.patch Updated patch based on latest [discussion | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14296210page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14296210]. Allocating and persisting block groups in NameNode -- Key: HDFS-7339 URL: https://issues.apache.org/jira/browse/HDFS-7339 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch, HDFS-7339-006.patch, HDFS-7339-007.patch, Meta-striping.jpg, NN-stripping.jpg All erasure codec operations center around the concept of _block group_; they are formed in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}} is created to record the original and parity blocks in a coding group, as well as a pointer to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes, with the current mode recorded in a binary flag. An array of BlockGroups (or BlockGroup IDs) is added, which remains empty for “traditional” HDFS files with contiguous block layout. The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}} component; the attached figure has an illustration of the architecture. As a simple example, when a {_Striping+EC_} file is created and written to, it will serve requests from the client to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, {{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
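To make the description above concrete, a minimal sketch of the {{BlockGroup}} idea, with illustrative field names (the actual patch may differ):

{code}
import org.apache.hadoop.hdfs.protocol.Block;

/**
 * Sketch of a block group: the original (data) and parity blocks of one
 * striped coding group, plus a pointer to the codec schema that formed it.
 */
class BlockGroup {
  private final long groupId;          // group-level ID allocated by the NN
  private final Block[] dataBlocks;    // e.g. 6 for (6,3)-Reed-Solomon
  private final Block[] parityBlocks;  // e.g. 3
  private final String schemaName;     // pluggable schema, see HDFS-7337

  BlockGroup(long groupId, Block[] dataBlocks, Block[] parityBlocks,
      String schemaName) {
    this.groupId = groupId;
    this.dataBlocks = dataBlocks;
    this.parityBlocks = parityBlocks;
    this.schemaName = schemaName;
  }

  long getGroupId() {
    return groupId;
  }

  int totalBlocks() {
    return dataBlocks.length + parityBlocks.length;
  }
}
{code}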
[jira] [Updated] (HDFS-7706) Switch BlockManager logging to use slf4j
[ https://issues.apache.org/jira/browse/HDFS-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7706: -- Status: Patch Available (was: Open) Switch BlockManager logging to use slf4j Key: HDFS-7706 URL: https://issues.apache.org/jira/browse/HDFS-7706 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-7706.001.patch Nice little refactor to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7704) DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
[ https://issues.apache.org/jira/browse/HDFS-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297887#comment-14297887 ] Charles Lamb commented on HDFS-7704: Hi [~shahrs87], A couple of quick comments: {code} public void bpThreadEnqueue(DatanodeCommand datanodeCommand) { if (bpThreadQueue != null) { bpThreadQueue.add(datanodeCommand); } } {code} When would bpThreadQueue be null? Don't you want to use Preconditions here? Several lines exceed the 80 char limit. s/if(/if (/ I'll wait for your second version with [~kihwal]'s comments addressed. DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out. --- Key: HDFS-7704 URL: https://issues.apache.org/jira/browse/HDFS-7704 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.5.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7704.patch There are a couple of synchronous calls in BPOfferService (i.e., reportBadBlocks and trySendErrorReport) which wait for both of the actor threads to process them. These calls are made with the write lock acquired. When reportBadBlocks() is blocked at the RPC layer due to an unreachable NN, subsequent heartbeat response processing has to wait for the write lock. It eventually gets through, but takes too long, and it blocks the next heartbeat. In our HA cluster setup, the standby namenode was taking a long time to process the request. Requesting an improvement in the datanode to make the above calls asynchronous, since these reports don't have any specific deadlines and an extra few seconds of delay should be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
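Putting Kihwal's and Charles's points together, here is a sketch of one way the queue could be made thread-safe and the bad-block reports batched into a single RPC. Everything in it, including {{BadBlockReport}} and {{reportBadBlocksBatch()}}, is an assumed name for illustration, not code from the patch:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class DeferredReportQueue {
  // A BlockingQueue is internally synchronized, so callers need no
  // external lock, and a final field removes the null check entirely.
  private final BlockingQueue<BadBlockReport> queue =
      new LinkedBlockingQueue<>();

  void enqueue(BadBlockReport report) {
    queue.add(report);
  }

  /** Runs on a reporting thread, off the heartbeat path. */
  void drainAndReport() throws InterruptedException {
    List<BadBlockReport> batch = new ArrayList<>();
    batch.add(queue.take());   // block until at least one report arrives
    queue.drainTo(batch);      // then grab everything else pending
    reportBadBlocksBatch(batch);
  }

  private void reportBadBlocksBatch(List<BadBlockReport> batch) {
    // placeholder for a single aggregated namenode RPC
  }

  static class BadBlockReport {
    // block and storage details would go here
  }
}
{code}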
[jira] [Created] (HDFS-7708) Balancer should delete its pid file when it completes rebalance
Akira AJISAKA created HDFS-7708: --- Summary: Balancer should delete its pid file when it completes rebalance Key: HDFS-7708 URL: https://issues.apache.org/jira/browse/HDFS-7708 Project: Hadoop HDFS Issue Type: Bug Components: balancer & mover Affects Versions: 2.6.0 Reporter: Akira AJISAKA When the balancer completes rebalancing and exits, it does not delete its pid file. When the balancer is started again, the startup script runs kill -0 pid to check whether the process is running. The problem is: * If another process is running with the same pid as `cat pidfile`, the balancer fails to start with the following message: {code} balancer is running as process 3443. Stop it first. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7703) Support favouredNodes for the append for new blocks
[ https://issues.apache.org/jira/browse/HDFS-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7703: Status: Patch Available (was: Open) Support favouredNodes for the append for new blocks --- Key: HDFS-7703 URL: https://issues.apache.org/jira/browse/HDFS-7703 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7703-001.patch Currently favoredNodes are supported for new file creation, and these nodes are applicable to all blocks of the file. The same support should be available when a file is opened for append. However, even though the original file may not have used favored nodes, the favoredNodes passed to append will be used only for the new blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7706) Switch BlockManager logging to use slf4j
[ https://issues.apache.org/jira/browse/HDFS-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298216#comment-14298216 ] Xiaoyu Yao commented on HDFS-7706: -- The patch looks good to me. One question: why is the {{import org.apache.commons.logging.Log;}} in BlockManager.java not removed, given that we switched to slf4j? Switch BlockManager logging to use slf4j Key: HDFS-7706 URL: https://issues.apache.org/jira/browse/HDFS-7706 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-7706.001.patch Nice little refactor to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7708) Balancer should delete its pid file when it completes rebalance
[ https://issues.apache.org/jira/browse/HDFS-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-7708: Description: When the balancer completes rebalancing and exits, it does not delete its pid file. When the balancer is started again, the startup script runs kill -0 pid to confirm that the old balancer process is not running. The problem is: if another process is running with the same pid as `cat pidfile`, the balancer fails to start with the following message: {code} balancer is running as process 3443. Stop it first. {code} was: When the balancer completes rebalancing and exits, it does not delete its pid file. When the balancer is started again, the startup script runs kill -0 pid to check whether the process is running. The problem is: * If another process is running with the same pid as `cat pidfile`, the balancer fails to start with the following message: {code} balancer is running as process 3443. Stop it first. {code} Balancer should delete its pid file when it completes rebalance --- Key: HDFS-7708 URL: https://issues.apache.org/jira/browse/HDFS-7708 Project: Hadoop HDFS Issue Type: Bug Components: balancer & mover Affects Versions: 2.6.0 Reporter: Akira AJISAKA When the balancer completes rebalancing and exits, it does not delete its pid file. When the balancer is started again, the startup script runs kill -0 pid to confirm that the old balancer process is not running. The problem is: if another process is running with the same pid as `cat pidfile`, the balancer fails to start with the following message: {code} balancer is running as process 3443. Stop it first. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
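The pid file is written and checked by the daemon shell scripts rather than by the Balancer process itself, so the real fix most likely belongs in those scripts. Purely to illustrate the clean-up-on-exit idea from the Java side, a hedged sketch:

{code}
import java.io.File;

class PidFileCleanup {
  /**
   * Illustrative only: remove the pid file when the JVM exits normally,
   * so a stale pid cannot later collide with an unrelated process. The
   * actual pid file path is owned by the hadoop-daemon scripts.
   */
  static void registerPidFileCleanup(String pidFilePath) {
    final File pidFile = new File(pidFilePath);
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        if (pidFile.exists() && !pidFile.delete()) {
          System.err.println("Failed to delete pid file " + pidFile);
        }
      }
    });
  }
}
{code}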
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7584: - Attachment: (was: HDFS-7584.9.1.patch) Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, HDFS-7584.9b.patch, HDFS-7584.9c.patch, editsStored Phase II of the Heterogeneous Storage feature was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7584: - Attachment: HDFS-7584.9c.patch Refactor addDirectoryWithQuotaFeature. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, HDFS-7584.9b.patch, HDFS-7584.9c.patch, editsStored Phase II of the Heterogeneous Storage feature was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7584: - Attachment: (was: editsStored) Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.1.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, HDFS-7584.9b.patch, HDFS-7584.9c.patch, editsStored Phase II of the Heterogeneous Storage feature was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7501) TransactionsSinceLastCheckpoint can be negative on SBNs
[ https://issues.apache.org/jira/browse/HDFS-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298235#comment-14298235 ] Harsh J commented on HDFS-7501: --- [~daryn] - The metric goes negative on the standby after the first checkpoint and continues going negative until it is in active mode again. The reason is that getEditLog().getLastWrittenTxId() freezes in standby mode, where no local edit logs are written anymore, and only the edit log tailer has the txid tracking info. We could switch to querying that; would that make sense to do just when we are in standby mode? We could expose lastLoadedTxnId in the EditLogTailer, for example. Sorry for the delay in responding. TransactionsSinceLastCheckpoint can be negative on SBNs --- Key: HDFS-7501 URL: https://issues.apache.org/jira/browse/HDFS-7501 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Harsh J Assignee: Gautam Gopalakrishnan Priority: Trivial Attachments: HDFS-7501-2.patch, HDFS-7501.patch The metric TransactionsSinceLastCheckpoint is derived as FSEditLog.txid minus NNStorage.mostRecentCheckpointTxId. In Standby mode, the former does not increment beyond the loaded or last-when-active value, but the latter does change due to the checkpoints done regularly in this mode. As a result, the SBN will eventually end up showing negative values for TransactionsSinceLastCheckpoint. This is not an issue, as the metric only makes sense to be monitored on the Active NameNode, but we should perhaps just show the value 0 by detecting if the NN is in SBN mode, as allowing a negative number is confusing to view within a chart that tracks it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
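A sketch of the branching Harsh suggests: in standby, read the tailer's last loaded txid instead of the frozen local edit log position. The accessor wiring is assumed, since exposing lastLoadedTxnId is only proposed here:

{code}
class CheckpointMetricSketch {
  /**
   * TransactionsSinceLastCheckpoint, computed from the edit log tailer's
   * position while in standby, where the local last-written txid freezes.
   * Clamped at 0 so the metric never charts below zero.
   */
  static long transactionsSinceLastCheckpoint(boolean inStandby,
      long lastWrittenTxId, long tailerLastLoadedTxnId,
      long mostRecentCheckpointTxId) {
    long currentTxId = inStandby ? tailerLastLoadedTxnId : lastWrittenTxId;
    return Math.max(0, currentTxId - mostRecentCheckpointTxId);
  }
}
{code}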
[jira] [Commented] (HDFS-7706) Switch BlockManager logging to use slf4j
[ https://issues.apache.org/jira/browse/HDFS-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298164#comment-14298164 ] Hadoop QA commented on HDFS-7706: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695402/hdfs-7706.001.patch against trunk revision e36ef3b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockScanner Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9376//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9376//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9376//console This message is automatically generated. Switch BlockManager logging to use slf4j Key: HDFS-7706 URL: https://issues.apache.org/jira/browse/HDFS-7706 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hdfs-7706.001.patch Nice little refactor to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7703) Support favouredNodes for the append for new blocks
[ https://issues.apache.org/jira/browse/HDFS-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7703: Attachment: HDFS-7703-001.patch Attaching the patch. Please review. Support favouredNodes for the append for new blocks --- Key: HDFS-7703 URL: https://issues.apache.org/jira/browse/HDFS-7703 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7703-001.patch Currently favoredNodes are supported for new file creation, and these nodes are applicable to all blocks of the file. The same support should be available when a file is opened for append. However, even though the original file may not have used favored nodes, the favoredNodes passed to append will be used only for the new blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
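For illustration, the kind of client call this would enable. The append overload shown is an assumption sketched behind a stand-in interface, so check the patch for the real signature:

{code}
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

import org.apache.hadoop.fs.Path;

class FavoredNodesAppendExample {
  /** Stand-in for the DFS client API; this overload is hypothetical. */
  interface DfsLike {
    OutputStream append(Path f, int bufferSize,
        InetSocketAddress[] favoredNodes) throws IOException;
  }

  /**
   * Existing blocks keep their current placement; only blocks allocated
   * after this call are steered toward the favored nodes.
   */
  static void appendWithFavoredNodes(DfsLike dfs, Path file,
      InetSocketAddress[] favoredNodes) throws IOException {
    OutputStream out = dfs.append(file, 4096, favoredNodes);
    try {
      out.write(new byte[] { 1, 2, 3 });  // new data lands on new blocks
    } finally {
      out.close();
    }
  }
}
{code}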
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298294#comment-14298294 ] Hadoop QA commented on HDFS-7584: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695445/HDFS-7584.9c.patch against trunk revision f2c9109. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 10 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot org.apache.hadoop.hdfs.server.datanode.TestBlockScanner org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9378//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9378//artifact/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9378//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9378//console This message is automatically generated. Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, HDFS-7584.5.patch, HDFS-7584.6.patch, HDFS-7584.7.patch, HDFS-7584.8.patch, HDFS-7584.9.patch, HDFS-7584.9a.patch, HDFS-7584.9b.patch, HDFS-7584.9c.patch, editsStored Phase II of the Heterogeneous Storage feature was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is especially important for storage types such as SSD, which are scarce and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7707) Edit log corruption due to delayed block removal again
Yongjun Zhang created HDFS-7707: --- Summary: Edit log corruption due to delayed block removal again Key: HDFS-7707 URL: https://issues.apache.org/jira/browse/HDFS-7707 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Edit log corruption is seen again, even with the fix of HDFS-6825. Prior to the HDFS-6825 fix, if dirX was deleted recursively, an OP_CLOSE could get into the edit log for a fileY under dirX, thus corrupting the edit log (restarting the NN with that edit log would fail). What HDFS-6825 does to fix this issue is to detect whether fileY has already been deleted by checking the ancestor dirs on its path: if any of them doesn't exist, then fileY has already been deleted, and no OP_CLOSE is put into the edit log for the file. For this new edit log corruption, what I found was that the client first deleted dirX recursively, then created another dir with exactly the same name as dirX right away. Because HDFS-6825 counts on the namespace check (whether dirX exists in its parent dir) to decide whether a file has been deleted, the newly created dirX defeats this check, and thus an OP_CLOSE for the already-deleted file gets into the edit log, due to delayed block removal. What we need is a more robust way to detect whether a file has been deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7707) Edit log corruption due to delayed block removal again
[ https://issues.apache.org/jira/browse/HDFS-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298003#comment-14298003 ] Yongjun Zhang commented on HDFS-7707: - One possible solution I thought about is, whenever we need to delete a fileOrDirX: * check permissions recursively, * if it's permitted to delete fileOrDirX, ** rename it to a unique name fileOrDirX_to_be_deleted that clients won't be using, ** delete fileOrDirX_to_be_deleted. This will cause some confusion in the edit log though. Also, if fileOrDirX as a whole is not permitted to be deleted, some sub dirs/files in it may still be deletable, so this operation needs to be done on each allowed dir/file in a recursive fashion, which may not be clean. Edit log corruption due to delayed block removal again -- Key: HDFS-7707 URL: https://issues.apache.org/jira/browse/HDFS-7707 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Edit log corruption is seen again, even with the fix of HDFS-6825. Prior to the HDFS-6825 fix, if dirX was deleted recursively, an OP_CLOSE could get into the edit log for a fileY under dirX, thus corrupting the edit log (restarting the NN with that edit log would fail). What HDFS-6825 does to fix this issue is to detect whether fileY has already been deleted by checking the ancestor dirs on its path: if any of them doesn't exist, then fileY has already been deleted, and no OP_CLOSE is put into the edit log for the file. For this new edit log corruption, what I found was that the client first deleted dirX recursively, then created another dir with exactly the same name as dirX right away. Because HDFS-6825 counts on the namespace check (whether dirX exists in its parent dir) to decide whether a file has been deleted, the newly created dirX defeats this check, and thus an OP_CLOSE for the already-deleted file gets into the edit log, due to delayed block removal. What we need is a more robust way to detect whether a file has been deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
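An alternative to the rename scheme, in line with the closing sentence of the description, is to test deletion by inode identity rather than by name: walk the file's own parent references and see whether the chain still reaches the root. A same-named replacement directory cannot fool this, because the deleted file's parent pointer refers to the old, unlinked inode. A rough sketch with an invented interface, not the actual INode API:

{code}
class DeletedFileCheck {
  interface INodeLike {
    INodeLike getParent();
  }

  /**
   * A file counts as deleted when following its parent references no
   * longer reaches the root directory, regardless of what names now
   * exist in the namespace.
   */
  static boolean isFileDeleted(INodeLike file, INodeLike root) {
    for (INodeLike cur = file; cur != null; cur = cur.getParent()) {
      if (cur == root) {
        return false;  // still attached to the namespace
      }
    }
    return true;       // chain fell off the root: the file was deleted
  }
}
{code}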