[jira] [Updated] (HDFS-4185) Add a metric for number of active leases
[ https://issues.apache.org/jira/browse/HDFS-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated HDFS-4185:
---------------------------
    Attachment: HDFS-4185-003.patch

> Add a metric for number of active leases
>                 Key: HDFS-4185
>                 URL: https://issues.apache.org/jira/browse/HDFS-4185
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 0.23.4, 2.0.2-alpha
>            Reporter: Kihwal Lee
>            Assignee: Rakesh R
>         Attachments: HDFS-4185-001.patch, HDFS-4185-002.patch, HDFS-4185-003.patch
>
> We have seen cases of systematic open-file leaks, which could have been detected if we had a metric showing the number of active leases.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
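A NumActiveLeases-style gauge is essentially a count of the distinct lease holders the NameNode's LeaseManager is tracking. As a rough illustration of the counting semantics (all class and method names below are hypothetical stand-ins, not the attached patch):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for LeaseManager's bookkeeping: the metric value is
// simply the number of lease holders (clients) currently tracked.
public class LeaseMetricSketch {
    private final Map<String, Set<String>> leasesByHolder = new HashMap<>();

    public synchronized void addLease(String holder, String path) {
        leasesByHolder.computeIfAbsent(holder, h -> new HashSet<>()).add(path);
    }

    public synchronized void releaseLease(String holder, String path) {
        Set<String> paths = leasesByHolder.get(holder);
        if (paths != null) {
            paths.remove(path);
            if (paths.isEmpty()) {
                leasesByHolder.remove(holder); // holder no longer active
            }
        }
    }

    // The value a "NumActiveLeases" gauge would report.
    public synchronized int getNumActiveLeases() {
        return leasesByHolder.size();
    }
}
```

In the real NameNode this value would be published through the metrics2 framework, so that a steadily growing count can flag the systematic open-file leaks the issue describes.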
[jira] [Commented] (HDFS-4185) Add a metric for number of active leases
[ https://issues.apache.org/jira/browse/HDFS-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526310#comment-14526310 ]

Rakesh R commented on HDFS-4185:
--------------------------------
Attaching another patch addressing the checkstyle comment. [~kihwal], please have a look at the patch, thanks!

FYI: one more checkstyle issue is reported, but it is not related to my patch:
{code}
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:1: File length is 8,290 lines (max allowed is 2,000).
{code}
[jira] [Commented] (HDFS-8062) Remove hard-coded values in favor of EC schema
[ https://issues.apache.org/jira/browse/HDFS-8062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526375#comment-14526375 ]

Kai Sasaki commented on HDFS-8062:
----------------------------------
I'm sorry for the delay, but I am on vacation from April 29th to May 8th. Will that be soon enough, or should the issue be reassigned? Sorry for the inconvenience.

> Remove hard-coded values in favor of EC schema
>                 Key: HDFS-8062
>                 URL: https://issues.apache.org/jira/browse/HDFS-8062
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Kai Sasaki
>         Attachments: HDFS-8062.1.patch, HDFS-8062.2.patch, HDFS-8062.3.patch, HDFS-8062.4.patch, HDFS-8062.5.patch
>
> Related issues about the EC schema on the NameNode side: HDFS-7859 changes the fsimage and editlog in the NameNode to persist EC schemas; HDFS-7866 manages EC schemas in the NameNode, loading and syncing between the ones persisted in the image and the predefined ones in XML. This issue revisits all the places in the NameNode that use hard-coded values in favor of {{ECSchema}}.
[jira] [Commented] (HDFS-8062) Remove hard-coded values in favor of EC schema
[ https://issues.apache.org/jira/browse/HDFS-8062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526405#comment-14526405 ]

Kai Zheng commented on HDFS-8062:
---------------------------------
Thanks for the update. I think it's fine if we can work on this and get it done soon after *May 8th*.
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated HDFS-8242:
---------------------------
    Attachment: HDFS-8242-HDFS-7285.05.patch

> Erasure Coding: XML based end-to-end test for ECCli commands
>                 Key: HDFS-8242
>                 URL: https://issues.apache.org/jira/browse/HDFS-8242
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, HDFS-8242-HDFS-7285.05.patch, HDFS-8242-HDFS-7285.05.patch, HDFS-8242-HDFS-7285.05.patch
>
> This JIRA is to add test cases, using the CLI test framework, for the commands present in {{ECCli}}.
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526451#comment-14526451 ]

Rakesh R commented on HDFS-8242:
--------------------------------
The test failures look like an environment problem and do not appear to be related to my patch. Re-attaching the previous patch to get a fresh Jenkins report.
{code}
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/LocatedStripedBlock
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
{code}
[jira] [Commented] (HDFS-2484) checkLease should throw FileNotFoundException when file does not exist
[ https://issues.apache.org/jira/browse/HDFS-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526455#comment-14526455 ]

Hadoop QA commented on HDFS-2484:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 40s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 2m 11s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 4s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 166m 34s | Tests failed in hadoop-hdfs. |
| | | | 209m 24s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestFileCreation |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12730023/HDFS-2484.00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / a319771 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/10774/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10774/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10774/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10774/console |

This message was automatically generated.

> checkLease should throw FileNotFoundException when file does not exist
>                 Key: HDFS-2484
>                 URL: https://issues.apache.org/jira/browse/HDFS-2484
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 0.22.0, 2.0.0-alpha
>            Reporter: Konstantin Shvachko
>            Assignee: Rakesh R
>         Attachments: HDFS-2484.00.patch
>
> When a file is deleted during its creation, {{FSNamesystem.checkLease(String src, String holder)}} throws {{LeaseExpiredException}}. It would be more informative if it threw {{FileNotFoundException}}.
[jira] [Commented] (HDFS-4185) Add a metric for number of active leases
[ https://issues.apache.org/jira/browse/HDFS-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526484#comment-14526484 ]

Hadoop QA commented on HDFS-4185:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 42s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 41s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 2m 14s | The applied patch generated 1 new checkstyle issues (total was 315, now 315). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 6s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 165m 2s | Tests failed in hadoop-hdfs. |
| | | | 208m 3s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12730125/HDFS-4185-003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3ba1836 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10775/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10775/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10775/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10775/console |

This message was automatically generated.
[jira] [Commented] (HDFS-8219) setStoragePolicy with folder behavior is different after cluster restart
[ https://issues.apache.org/jira/browse/HDFS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526502#comment-14526502 ]

surendra singh lilhore commented on HDFS-8219:
-----------------------------------------------
Can I work on this issue? I am able to reproduce it and I have found the root cause.

> setStoragePolicy with folder behavior is different after cluster restart
>                 Key: HDFS-8219
>                 URL: https://issues.apache.org/jira/browse/HDFS-8219
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Peter Shi
>            Assignee: Xiaoyu Yao
>            Priority: Minor
>         Attachments: HDFS-8219.unittest-norepro.patch
>
> Reproduce steps:
> 1) mkdir /temp
> 2) put one file A under /temp
> 3) change the /temp storage policy to COLD
> 4) use -getStoragePolicy to query file A's storage policy; it is the same as /temp's
> 5) change the /temp folder storage policy again; file A's storage policy stays the same as the parent folder's
>
> Then restart the cluster and do 3) and 4) again: file A's storage policy does not change while the parent folder's storage policy changes. The behavior is different.
>
> While debugging, I found this code in INodeFile.getStoragePolicyID:
> {code}
> public byte getStoragePolicyID() {
>   byte id = getLocalStoragePolicyID();
>   if (id == BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
>     return this.getParent() != null ? this.getParent().getStoragePolicyID() : id;
>   }
>   return id;
> }
> {code}
> If the file does not have its own storage policy, it uses the parent's. But after a cluster restart, the file turns out to have its own storage policy.
[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526509#comment-14526509 ]

Kai Zheng commented on HDFS-8137:
---------------------------------
Thanks for your update, Uma. The patch looks good; just two minor comments:

* Regarding the following code, I'm not very sure about the log output. Would it be better as:
{code}
blockLog.warn("Failed to get the ECSchema for the file {}", filename_of_the_block);
...
blockLog.warn("No EC Schema found for the file {}", filename_of_the_block);
{code}
{code}
+ECSchema ecSchema = null;
+try {
+  ecSchema = namesystem.getECSchemaForPath(block
+      .getBlockCollection().getName());
+} catch (IOException e) {
+  blockLog.warn("Failed to get the ECSchema for the blockGroup {}", block);
+}
+if (ecSchema == null) {
+  blockLog.warn("No EC Schema found for the blockGroup {},"
+      + " so ignoring the block group for EC", block);
+  // TODO: we may have to revisit later for what we can do better to
+  // handle this case.
+  continue;
+}
{code}
* In the code below, it might be better to compare ecSchema1 and ecSchema2 against the system default schema used:
{code}
+ECSchema ecSchema1 = blkECRecoveryInfo1.getECSchema();
+ECSchema ecSchema2 = blkECRecoveryInfo2.getECSchema();
+assertEquals(ecSchema1.getSchemaName(), ecSchema2.getSchemaName());
+assertEquals(ecSchema1.getNumDataUnits(), ecSchema2.getNumDataUnits());
+assertEquals(ecSchema1.getNumParityUnits(), ecSchema2.getNumParityUnits());
{code}

> Sends the EC schema to DataNode as well in EC encoding/recovering command
>                 Key: HDFS-8137
>                 URL: https://issues.apache.org/jira/browse/HDFS-8137
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Uma Maheswara Rao G
>         Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch
>
> Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task.
> Another way would be for the DataNode to request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid repeatedly asking for the same schema.
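The caching alternative mentioned above (the DataNode fetches a schema via RPC only on a cache miss) can be sketched as follows. The fetcher function stands in for the assumed RPC call, and all names here are illustrative rather than taken from the patch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of a DataNode-side schema cache: ask the NameNode for a
// schema only the first time it is seen, then serve it from the cache.
public class SchemaCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> rpcFetcher; // stands in for the RPC
    final AtomicInteger rpcCalls = new AtomicInteger(); // visible for testing

    public SchemaCacheSketch(Function<String, String> rpcFetcher) {
        this.rpcFetcher = rpcFetcher;
    }

    public String getSchema(String schemaName) {
        // computeIfAbsent makes the fetch-on-miss atomic per key.
        return cache.computeIfAbsent(schemaName, name -> {
            rpcCalls.incrementAndGet();
            return rpcFetcher.apply(name);
        });
    }
}
```

The trade-off versus sending the schema in the command itself is an extra round trip on first use, in exchange for smaller commands and no duplication when many block groups share one schema.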
[jira] [Updated] (HDFS-2484) checkLease should throw FileNotFoundException when file does not exist
[ https://issues.apache.org/jira/browse/HDFS-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated HDFS-2484:
---------------------------
    Attachment: HDFS-2484.01.patch
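The change HDFS-2484 asks for: when the file no longer exists, report FileNotFoundException instead of the misleading LeaseExpiredException. A simplified sketch of that shape (the INodeFile and LeaseExpiredException below are minimal stand-ins, not the real NameNode types):

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class CheckLeaseSketch {
    // Minimal stand-in for the NameNode's INodeFile; illustrative only.
    static class INodeFile {
        final String clientName;
        INodeFile(String clientName) { this.clientName = clientName; }
    }

    static class LeaseExpiredException extends IOException {
        LeaseExpiredException(String msg) { super(msg); }
    }

    // Proposed shape of the check: a deleted/missing file surfaces as
    // FileNotFoundException; a holder mismatch stays a lease error.
    static void checkLease(String src, String holder, INodeFile file)
            throws IOException {
        if (file == null) {
            throw new FileNotFoundException("File does not exist: " + src);
        }
        if (!file.clientName.equals(holder)) {
            throw new LeaseExpiredException(
                "Lease mismatch on " + src + ": held by " + file.clientName);
        }
    }
}
```

The point of the change is diagnosability: a client that races file creation against deletion gets an exception naming the actual condition.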
[jira] [Updated] (HDFS-8251) Move the synthetic load generator into its own package
[ https://issues.apache.org/jira/browse/HDFS-8251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

J.Andreina updated HDFS-8251:
-----------------------------
    Status: Patch Available  (was: Open)

> Move the synthetic load generator into its own package
>                 Key: HDFS-8251
>                 URL: https://issues.apache.org/jira/browse/HDFS-8251
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Allen Wittenauer
>            Assignee: J.Andreina
>         Attachments: HDFS-8251.1.patch
>
> It doesn't really make sense for the HDFS load generator to be part of the (extremely large) mapreduce jobclient package. It should be pulled out and put in its own package, probably in hadoop-tools.
[jira] [Commented] (HDFS-4185) Add a metric for number of active leases
[ https://issues.apache.org/jira/browse/HDFS-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526519#comment-14526519 ]

Rakesh R commented on HDFS-4185:
--------------------------------
The test failure and the checkstyle issue are not related to the patch:
{code}
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:1: File length is 8,290 lines (max allowed is 2,000).
{code}
[jira] [Commented] (HDFS-8062) Remove hard-coded values in favor of EC schema
[ https://issues.apache.org/jira/browse/HDFS-8062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526525#comment-14526525 ]

Kai Sasaki commented on HDFS-8062:
----------------------------------
Yes, May 8th. Thank you so much!
[jira] [Updated] (HDFS-8284) Add usage of tracing originated in DFSClient to doc
[ https://issues.apache.org/jira/browse/HDFS-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HDFS-8284:
-----------------------------------
    Attachment: HDFS-8284.001.patch

I attached a patch to fix the doc for tracing. In this patch I also moved the default config values from core-default.xml to hdfs-default.xml.

> Add usage of tracing originated in DFSClient to doc
>                 Key: HDFS-8284
>                 URL: https://issues.apache.org/jira/browse/HDFS-8284
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>         Attachments: HDFS-8284.001.patch
>
> Tracing originated in DFSClient uses configuration keys prefixed with dfs.client.htrace after HDFS-8213. Server-side tracing uses conf keys prefixed with dfs.htrace.
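The two key prefixes described above can be illustrated with a hypothetical hdfs-site.xml fragment. Only the `dfs.client.htrace` and `dfs.htrace` prefixes come from this thread; the `.sampler` suffix and the `AlwaysSampler` value are assumptions for illustration:

```xml
<!-- Hypothetical example: tracing originated in DFSClient
     (keys prefixed with dfs.client.htrace after HDFS-8213) -->
<property>
  <name>dfs.client.htrace.sampler</name>
  <value>AlwaysSampler</value>
</property>
<!-- Hypothetical example: server-side tracing
     (keys prefixed with dfs.htrace) -->
<property>
  <name>dfs.htrace.sampler</name>
  <value>AlwaysSampler</value>
</property>
```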
[jira] [Updated] (HDFS-8284) Add usage of tracing originated in DFSClient to doc
[ https://issues.apache.org/jira/browse/HDFS-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HDFS-8284:
-----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HDFS-8219) setStoragePolicy with folder behavior is different after cluster restart
[ https://issues.apache.org/jira/browse/HDFS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526563#comment-14526563 ]

surendra singh lilhore commented on HDFS-8219:
-----------------------------------------------
A small change in the scenario: before putting the file in the directory, first set the storage policy on the directory.

*Root Cause*
When we add the edit log entry for a new file, we get the storage policy from the {{INodeFile}} via {{getStoragePolicyID()}}:
{code}
AddOp op = AddOp.getInstance(cache.get())
    .setStoragePolicyId(newNode.getStoragePolicyID());
{code}
But this API returns the parent's policy if the file's storage policy is UNSPECIFIED:
{code}
public byte getStoragePolicyID() {
  byte id = getLocalStoragePolicyID();
  if (id == BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
    return this.getParent() != null ? this.getParent().getStoragePolicyID() : id;
  }
  return id;
}
{code}
So the create-file record in the edit log stores the parent's storage policy.

*Fix*
We should use {{INodeFile.getLocalStoragePolicyID()}}, which returns the {{INodeFile}}'s own storage policy.
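The distinction behind the fix is between a file's own (local) policy and its effective (possibly inherited) policy. A self-contained sketch of that distinction; the INode class below is a simplified stand-in whose fallback logic mirrors the getStoragePolicyID snippet quoted above, and the policy ids are illustrative:

```java
public class StoragePolicySketch {
    static final byte BLOCK_STORAGE_POLICY_ID_UNSPECIFIED = 0;
    static final byte COLD = 2; // illustrative policy id

    static class INode {
        byte localPolicy = BLOCK_STORAGE_POLICY_ID_UNSPECIFIED;
        INode parent;

        // What the file itself has set (what the fix proposes to persist).
        byte getLocalStoragePolicyID() {
            return localPolicy;
        }

        // Effective policy: falls back to the parent when unspecified,
        // matching the quoted getStoragePolicyID() logic.
        byte getStoragePolicyID() {
            if (localPolicy == BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
                return parent != null ? parent.getStoragePolicyID()
                                      : localPolicy;
            }
            return localPolicy;
        }
    }
}
```

Persisting the effective value freezes the inherited policy into the file, which is exactly the post-restart symptom; persisting the local value keeps the file inheriting from its parent across restarts.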
[jira] [Updated] (HDFS-8219) setStoragePolicy with folder behavior is different after cluster restart
[ https://issues.apache.org/jira/browse/HDFS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

surendra singh lilhore updated HDFS-8219:
-----------------------------------------
    Attachment: HDFS-8219.patch
[jira] [Commented] (HDFS-8219) setStoragePolicy with folder behavior is different after cluster restart
[ https://issues.apache.org/jira/browse/HDFS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526564#comment-14526564 ]

surendra singh lilhore commented on HDFS-8219:
-----------------------------------------------
Attached a patch with the fix and a test. Please review.
[jira] [Updated] (HDFS-7904) NFS hard codes ShellBasedIdMapping
[ https://issues.apache.org/jira/browse/HDFS-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brahma Reddy Battula updated HDFS-7904:
---------------------------------------
    Assignee:  (was: Brahma Reddy Battula)

> NFS hard codes ShellBasedIdMapping
>                 Key: HDFS-7904
>                 URL: https://issues.apache.org/jira/browse/HDFS-7904
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>            Reporter: Allen Wittenauer
>
> The current NFS doesn't allow one to configure an alternative to the shell-based id mapping provider.
[jira] [Updated] (HDFS-6300) Allows to run multiple balancer simultaneously
[ https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated HDFS-6300:
---------------------------
    Attachment: HDFS-6300-001.patch

> Allows to run multiple balancer simultaneously
>                 Key: HDFS-6300
>                 URL: https://issues.apache.org/jira/browse/HDFS-6300
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-6300-001.patch, HDFS-6300.patch
>
> The javadoc of Balancer.java says it will not allow a second balancer to run while the first one is in progress, but I've noticed that multiple balancers can run together; the balancer.id implementation is not safeguarding against this.
> {code}
>  * <li>Another balancer is running. Exiting...
> {code}
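The single-instance guard that the balancer.id file is meant to provide boils down to an atomic create-if-absent of the id file. A local-filesystem sketch of that idea (the real balancer writes the file to HDFS; names here are illustrative, not the patch):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BalancerLockSketch {
    // Atomically create the id file. A second balancer finds it already
    // present, fails the exclusive create, and must exit instead of
    // running alongside the first.
    public static boolean tryAcquire(Path idFile, String hostname) {
        try {
            Files.write(idFile, hostname.getBytes(),
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true; // we own the lock
        } catch (IOException alreadyExists) {
            return false; // "Another balancer is running. Exiting..."
        }
    }

    public static void release(Path idFile) {
        try {
            Files.deleteIfExists(idFile);
        } catch (IOException ignored) {
            // best-effort cleanup in this sketch
        }
    }
}
```

Without create-exclusive (CREATE_NEW) semantics, two balancers can both pass a check-then-create sequence and run simultaneously, which is the race this issue describes.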
[jira] [Updated] (HDFS-6300) Shouldn't allows to run multiple balancer simultaneously
[ https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated HDFS-6300:
---------------------------
    Summary: Shouldn't allows to run multiple balancer simultaneously  (was: Allows to run multiple balancer simultaneously)
[jira] [Commented] (HDFS-1915) fuse-dfs does not support append
[ https://issues.apache.org/jira/browse/HDFS-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526581#comment-14526581 ] Valerio Schiavoni commented on HDFS-1915: - I get the same problem with CD5.4, 1 NN and 4 DN using the hadoop-fuse-dfs client. In the linked gist you can see the the fuse client debug as well as the simple test that makes the system crash: https://gist.github.com/vschiavoni/00d70c5bce29a05f94c4 I'm using the following hadoop version: Hadoop 2.6.0 Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1 Compiled by jenkins on 2014-11-13T21:10Z Compiled with protoc 2.5.0 From source with checksum 18e43357c8f927c0695f1e9522859d6a This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.6.0.jar fuse-dfs does not support append Key: HDFS-1915 URL: https://issues.apache.org/jira/browse/HDFS-1915 Project: Hadoop HDFS Issue Type: New Feature Components: fuse-dfs Affects Versions: 0.20.2 Environment: Ubuntu 10.04 LTS on EC2 Reporter: Sampath K Environment: CloudEra CDH3, EC2 cluster with 2 data nodes and 1 name node(Using ubuntu 10.04 LTS large instances), mounted hdfs in OS using fuse-dfs. Able to do HDFS fs -put but when I try to use a FTP client(ftp PUT) to do the same, I get the following error. I am using vsFTPd on the server. Changed the mounted folder permissions to a+w to rule out any WRITE permission issues. I was able to do a FTP GET on the same mounted volume. 
Please advise FTPd Log == Tue May 10 23:45:00 2011 [pid 2] CONNECT: Client 127.0.0.1 Tue May 10 23:45:09 2011 [pid 1] [ftpuser] OK LOGIN: Client 127.0.0.1 Tue May 10 23:48:41 2011 [pid 3] [ftpuser] OK DOWNLOAD: Client 127.0.0.1, /hfsmnt/upload/counter.txt, 10 bytes, 0.42Kbyte/sec Tue May 10 23:49:24 2011 [pid 3] [ftpuser] FAIL UPLOAD: Client 127.0.0.1, /hfsmnt/upload/counter1.txt, 0.00Kbyte/sec Error in Namenode Log (I did a ftp GET on counter.txt and PUT with counter1.txt) === 2011-05-11 01:03:02,822 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=ftpuser ip=/10.32.77.36 cmd=listStatus src=/upload dst=nullperm=null 2011-05-11 01:03:02,825 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root ip=/10.32.77.36 cmd=listStatus src=/upload dst=nullperm=null 2011-05-11 01:03:20,275 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root ip=/10.32.77.36 cmd=listStatus src=/upload dst=nullperm=null 2011-05-11 01:03:20,290 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=ftpuser ip=/10.32.77.36 cmd=opensrc=/upload/counter.txt dst=null perm=null 2011-05-11 01:03:31,115 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: failed to append to non-existent file /upload/counter1.txt on client 10.32.77.36 2011-05-11 01:03:31,115 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000, call append(/upload/counter1.txt, DFSClient_1590956638) from 10.32.77.36:56454: error: java.io.FileNotFoundException: failed to append to non-existent file /upload/counter1.txt on client 10.32.77.36 java.io.FileNotFoundException: failed to append to non-existent file /upload/counter1.txt on client 10.32.77.36 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1166) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1336) at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:596) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1411) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1409) No activity shows up in datanode logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
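The failure above is the NameNode rejecting an append to a file that was never created: the FTP PUT gets translated by fuse-dfs into an append, and `startFileInternal` throws `FileNotFoundException` because the target does not exist. A plain-JDK toy model of that check (the class name `MiniNamespace` and its methods are hypothetical, not Hadoop code) can make the behavior concrete:

```java
// Simplified, hypothetical model of the append check seen in the stack trace
// above: append() requires the target file to already exist, so an FTP PUT
// translated into an append fails with FileNotFoundException.
// Plain-JDK sketch, not the actual FSNamesystem code.
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Set;

public class MiniNamespace {
    private final Set<String> files = new HashSet<>();

    public void create(String path) {
        files.add(path); // create always succeeds in this toy model
    }

    public void append(String path) throws FileNotFoundException {
        if (!files.contains(path)) {
            // Mirrors the NameNode log: "failed to append to non-existent file"
            throw new FileNotFoundException(
                "failed to append to non-existent file " + path);
        }
        // ...an append would proceed here...
    }
}
```

Under this model, appending to `/upload/counter.txt` (which was created) succeeds, while appending to `/upload/counter1.txt` (which was not) throws, matching the NameNode audit log in the report.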
[jira] [Updated] (HDFS-6300) Shouldn't allow running multiple balancers simultaneously
[ https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-6300: --- Target Version/s: 2.8.0 Shouldn't allow running multiple balancers simultaneously Key: HDFS-6300 URL: https://issues.apache.org/jira/browse/HDFS-6300 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-6300-001.patch, HDFS-6300.patch The Javadoc of Balancer.java says it will not allow a second balancer to run while the first one is in progress. But I've noticed that multiple balancers can run together; the balancer.id implementation is not safeguarding against this. {code} * <li>Another balancer is running. Exiting... {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6300) Shouldn't allow running multiple balancers simultaneously
[ https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526584#comment-14526584 ] Rakesh R commented on HDFS-6300: Thanks a lot [~aw] for the interest. Sorry for the delay; I missed your comments. I'm attaching another patch rebased on the latest trunk code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6300) Shouldn't allow running multiple balancers simultaneously
[ https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-6300: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
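The intended guard behind balancer.id is mutual exclusion via exclusive file creation: the second balancer's attempt to create the id file should fail, so it exits. A plain-JDK sketch of that idea (the class `BalancerLock` and its method names are hypothetical illustrations, not Balancer.java's actual code):

```java
// Mutual exclusion via atomic exclusive file creation -- the idea behind the
// balancer.id lock file. The second process's create fails, so it should exit
// with "Another balancer is running. Exiting...".
// Plain-JDK sketch (hypothetical), not the real Balancer implementation.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BalancerLock {
    private final Path lockFile;

    public BalancerLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Returns true if this process acquired the lock. */
    public boolean tryAcquire(String balancerId) {
        try {
            // createFile is atomic: it throws if the file already exists.
            Files.createFile(lockFile);
            Files.write(lockFile, balancerId.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException alreadyRunning) {
            return false; // another balancer holds the lock
        }
    }

    public void release() {
        try {
            Files.deleteIfExists(lockFile);
        } catch (IOException ignored) {
        }
    }
}
```

The bug described in the issue is exactly a failure of this invariant: if the create-exclusively step is not enforced (or the file is left behind and then ignored), two balancers can both believe they hold the lock.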
[jira] [Commented] (HDFS-8251) Move the synthetic load generator into its own package
[ https://issues.apache.org/jira/browse/HDFS-8251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526598#comment-14526598 ] Hadoop QA commented on HDFS-8251: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 36s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 7m 30s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 44s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 53s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | assemblies tests | 0m 11s | Tests passed in hadoop-assemblies. | | {color:green}+1{color} | mapreduce tests | 108m 29s | Tests passed in hadoop-mapreduce-client-jobclient. 
| | | | 144m 40s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729400/HDFS-8251.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bb9ddef | | hadoop-assemblies test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10777/artifact/patchprocess/testrun_hadoop-assemblies.txt | | hadoop-mapreduce-client-jobclient test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10777/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10777/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10777/console | This message was automatically generated. Move the synthetic load generator into its own package -- Key: HDFS-8251 URL: https://issues.apache.org/jira/browse/HDFS-8251 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: J.Andreina Attachments: HDFS-8251.1.patch It doesn't really make sense for the HDFS load generator to be a part of the (extremely large) mapreduce jobclient package. It should be pulled out and put in its own package, probably in hadoop-tools. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-7097: -- Target Version/s: (was: 2.6.0) Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there is a stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report are blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of the name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, heartbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The RPC calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
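The contention described above, a long-running checkpointer holding the read side of {{fsLock}} while every block-report handler queues up for the write side, can be sketched with a plain `ReentrantReadWriteLock` (the class `FsLockDemo` and its methods are illustrative names, not FSNamesystem code):

```java
// Sketch of the fsLock contention described in HDFS-7097: while the
// "checkpointer" holds the read lock, any handler needing the write lock
// blocks, and all RPC handler threads can pile up behind it.
// Plain-JDK illustration, not the actual FSNamesystem implementation.
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FsLockDemo {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

    public void checkpointStart() { fsLock.readLock().lock(); }
    public void checkpointEnd()   { fsLock.readLock().unlock(); }

    /** A block report needs the write lock; returns false if it would block. */
    public boolean tryProcessBlockReport() {
        if (fsLock.writeLock().tryLock()) {
            fsLock.writeLock().unlock();
            return true;
        }
        return false; // in the real NN the handler thread parks here
    }
}
```

The proposed fix follows from the last sentence of the report: because block reports do not modify state being saved to the fsimage, they can be allowed through even while the checkpointer holds the lock.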
[jira] [Updated] (HDFS-6348) Secondary namenode - RMI Thread prevents JVM from exiting after main() completes
[ https://issues.apache.org/jira/browse/HDFS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-6348: --- Attachment: HDFS-6348-003.patch Secondary namenode - RMI Thread prevents JVM from exiting after main() completes - Key: HDFS-6348 URL: https://issues.apache.org/jira/browse/HDFS-6348 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-6348-003.patch, HDFS-6348.patch, HDFS-6348.patch, secondaryNN_threaddump_after_exit.log The Secondary NameNode does not exit when a RuntimeException occurs during startup. Say a wrong configuration was supplied; validation failed and threw the RuntimeException shown below. But when I checked the environment, the SecondaryNameNode process was still alive. On analysis, the RMI thread is still alive; since it is not a daemon thread, the JVM is not exiting. I'm attaching a thread dump to this JIRA for more details about the thread. {code} java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1900) at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.getInstance(BlockPlacementPolicy.java:199) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.init(BlockManager.java:256) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:635) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:260) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:205) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:695) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1868) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1892) ... 6 more Caused by: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1774) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1866) ... 7 more 2014-05-07 14:27:04,666 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for active state 2014-05-07 14:27:04,666 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state 2014-05-07 14:31:04,926 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: STARTUP_MSG: {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
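The mechanism behind this bug is a JVM rule: the process exits only when all non-daemon threads have finished, so a parked non-daemon RMI thread keeps the SecondaryNameNode alive even after main() throws. A minimal plain-JDK sketch (the class `DaemonDemo` is an illustrative name; the usual fixes are marking such threads as daemons or calling System.exit on a fatal startup error):

```java
// The JVM exits only when all non-daemon threads have finished. A leftover
// non-daemon thread (like the RMI thread in this report) therefore keeps the
// process alive after main() throws. Marking the thread as a daemon, or
// calling System.exit on a fatal startup error, avoids the hang.
public class DaemonDemo {
    public static Thread startWorker(boolean daemon) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // parked forever, like an idle RMI thread
            } catch (InterruptedException ignored) {
                // interrupted: let the thread finish
            }
        });
        t.setDaemon(daemon); // must be set before start()
        t.start();
        return t;
    }
}
```

With `daemon=false` the JVM would hang at exit exactly as described in this issue; with `daemon=true` the thread no longer blocks shutdown.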
[jira] [Updated] (HDFS-6348) Secondary namenode - RMI Thread prevents JVM from exiting after main() completes
[ https://issues.apache.org/jira/browse/HDFS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-6348: --- Attachment: (was: HDFS-6348-003.patch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6348) Secondary namenode - RMI Thread prevents JVM from exiting after main() completes
[ https://issues.apache.org/jira/browse/HDFS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-6348: --- Attachment: HDFS-6348-003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6348) Secondary namenode - RMI Thread prevents JVM from exiting after main() completes
[ https://issues.apache.org/jira/browse/HDFS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526628#comment-14526628 ] Rakesh R commented on HDFS-6348: Attached a rebased (on latest trunk) patch to get the Jenkins report. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526644#comment-14526644 ] Hadoop QA commented on HDFS-8242: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 33s | Pre-patch HDFS-7285 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. | | {color:green}+1{color} | javac | 7m 27s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 38s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 12s | The patch appears to introduce 8 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 229m 45s | Tests failed in hadoop-hdfs. 
| | | | 270m 55s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time Unsynchronized access at DFSOutputStream.java:90% of time Unsynchronized access at DFSOutputStream.java:[line 142] | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:[line 105] | | | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema):in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() At ErasureCodingZoneManager.java:[line 117] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) At ErasureCodingZoneManager.java:[line 81] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:to long in 
org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int) At StripedBlockUtil.java:to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int) At StripedBlockUtil.java:[line 167] | | Failed unit tests | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.TestClose | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.server.namenode.TestDeleteRace | | Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol | | | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache | | | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | \\ \\ || Subsystem || Report/Notes || | Patch URL |
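Two of the Findbugs warnings above are "Result of integer multiplication cast to long" in StripedBlockUtil. The bug pattern is that the multiply happens in 32-bit int arithmetic and overflows before the cast widens it; the fix is to widen one operand first. A self-contained illustration (method and parameter names are hypothetical, chosen to evoke the striping arithmetic, not copied from StripedBlockUtil):

```java
// "Result of integer multiplication cast to long": the product is computed in
// 32-bit int arithmetic and overflows BEFORE the cast widens it. Widening an
// operand first makes the multiplication happen in 64-bit arithmetic.
public class CastDemo {
    static long buggy(int cellSize, int numCells) {
        return (long) (cellSize * numCells);   // overflows in int first
    }

    static long fixed(int cellSize, int numCells) {
        return (long) cellSize * numCells;     // widens before multiplying
    }
}
```

For example, with cellSize = 1048576 (1 MiB) and numCells = 4096 the true product is 2^32 = 4294967296, which the buggy form silently wraps to 0.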
[jira] [Commented] (HDFS-2484) checkLease should throw FileNotFoundException when file does not exist
[ https://issues.apache.org/jira/browse/HDFS-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526655#comment-14526655 ] Hadoop QA commented on HDFS-2484: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 55s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 47s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 14s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 7s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 23s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 165m 18s | Tests failed in hadoop-hdfs. 
| | | | 208m 58s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12730153/HDFS-2484.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bb9ddef | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10778/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10778/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10778/console | This message was automatically generated. checkLease should throw FileNotFoundException when file does not exist -- Key: HDFS-2484 URL: https://issues.apache.org/jira/browse/HDFS-2484 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 0.22.0, 2.0.0-alpha Reporter: Konstantin Shvachko Assignee: Rakesh R Attachments: HDFS-2484.00.patch, HDFS-2484.01.patch When a file is deleted during its creation, {{FSNamesystem.checkLease(String src, String holder)}} throws {{LeaseExpiredException}}. It would be more informative if it threw {{FileNotFoundException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
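The change proposed in HDFS-2484 is about picking the exception that names the real cause: "file no longer exists" should surface as `FileNotFoundException`, reserving the lease error for an actual lease mismatch. A simplified, hypothetical model of that distinction (class, map, and exception names here are illustrative, not FSNamesystem's actual code):

```java
// Simplified, hypothetical model of the checkLease change proposed above:
// when the file no longer exists, throw FileNotFoundException instead of a
// generic lease error, so the client learns the real cause.
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class LeaseChecker {
    static class LeaseExpired extends IOException {
        LeaseExpired(String m) { super(m); }
    }

    // path -> current lease holder
    private final Map<String, String> leaseHolders = new HashMap<>();

    public void open(String path, String holder) { leaseHolders.put(path, holder); }
    public void delete(String path) { leaseHolders.remove(path); }

    public void checkLease(String path, String holder) throws IOException {
        String current = leaseHolders.get(path);
        if (current == null) {
            // before the fix, this case also surfaced as a lease error
            throw new FileNotFoundException("File does not exist: " + path);
        }
        if (!current.equals(holder)) {
            throw new LeaseExpired("Lease on " + path + " held by " + current);
        }
    }
}
```

A client retrying a create-then-write after someone deleted the file now sees the informative exception instead of a misleading lease expiry.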
[jira] [Updated] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-8137: -- Attachment: HDFS-8137-2.patch Thanks a lot, Kai, for the review! I have updated the patch by addressing the comments. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch, HDFS-8137-2.patch Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task. Another way would be for the DataNode to request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid repeatedly asking for the same schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8203) Erasure Coding: Seek and other Ops in DFSStripedInputStream.
[ https://issues.apache.org/jira/browse/HDFS-8203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8203: - Attachment: HDFS-8203.001.patch Will add some tests for seek tomorrow. Erasure Coding: Seek and other Ops in DFSStripedInputStream. Key: HDFS-8203 URL: https://issues.apache.org/jira/browse/HDFS-8203 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8203.001.patch In HDFS-7782 and HDFS-8033, we handled pread and stateful read for {{DFSStripedInputStream}}; we also need to handle other operations, such as {{seek}}, zero-copy read ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8284) Add usage of tracing originated in DFSClient to doc
[ https://issues.apache.org/jira/browse/HDFS-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526773#comment-14526773 ] Hadoop QA commented on HDFS-8284: - (!) The patch artifact directory has been removed! This is a fatal error for test-patch.sh. Aborting. Jenkins (node H4) information at https://builds.apache.org/job/PreCommit-HDFS-Build/10779/ may provide some hints. Add usage of tracing originated in DFSClient to doc --- Key: HDFS-8284 URL: https://issues.apache.org/jira/browse/HDFS-8284 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Attachments: HDFS-8284.001.patch Tracing originated in DFSClient uses configuration keys prefixed with dfs.client.htrace after HDFS-8213. Server side tracing uses conf keys prefixed with dfs.htrace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6300) Shouldn't allow running multiple balancers simultaneously
[ https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526800#comment-14526800 ] Hadoop QA commented on HDFS-6300: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 27s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 38s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 2s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 184m 8s | Tests failed in hadoop-hdfs. 
| | | | 225m 18s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12730170/HDFS-6300-001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bb9ddef | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10780/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10780/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10780/console | This message was automatically generated. Shouldn't allow running multiple balancers simultaneously Key: HDFS-6300 URL: https://issues.apache.org/jira/browse/HDFS-6300 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-6300-001.patch, HDFS-6300.patch The javadoc of Balancer.java says it will not allow a second balancer to run while one is already in progress. But I've noticed that multiple balancers can run together, and the balancer.id implementation is not safeguarding against this. {code} * Another balancer is running. Exiting... {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
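The safeguard the report says is broken can be pictured with an exclusive marker file. The sketch below is a simplified, local-filesystem stand-in for the balancer.id marker the real Balancer keeps in HDFS; the class and file names are illustrative, not the actual Balancer code:

```java
import java.io.File;
import java.io.IOException;

public class BalancerLockDemo {
    // Hypothetical stand-in for the balancer.id marker file.
    static boolean tryAcquire(File idFile) throws IOException {
        // createNewFile() is atomic: it returns false if the file already
        // exists, so a second balancer instance is refused.
        return idFile.createNewFile();
    }

    public static void main(String[] args) throws IOException {
        File idFile = new File(System.getProperty("java.io.tmpdir"), "balancer.id.demo");
        idFile.delete(); // start from a clean state
        System.out.println("first balancer starts: " + tryAcquire(idFile));
        System.out.println("second balancer starts: " + tryAcquire(idFile));
        idFile.delete(); // release the lock on shutdown
    }
}
```

The bug report suggests the real implementation fails to provide this "second attempt is refused" guarantee, e.g. if the marker is not created atomically or is cleaned up too eagerly.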
[jira] [Updated] (HDFS-8311) DataStreamer.transfer() should timeout the socket InputStream.
[ https://issues.apache.org/jira/browse/HDFS-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HDFS-8311: Attachment: 0001-HDFS-8311-DataStreamer.transfer-should-timeout-the-s.patch [~kiranmr] Is it ok for you if I reassign the JIRA to me? I have an initial patch that re-uses the existing write timeout in the OutputStream and it seems to work fine. DataStreamer.transfer() should timeout the socket InputStream. -- Key: HDFS-8311 URL: https://issues.apache.org/jira/browse/HDFS-8311 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Esteban Gutierrez Assignee: Kiran Kumar M R Attachments: 0001-HDFS-8311-DataStreamer.transfer-should-timeout-the-s.patch While validating some HA failure modes we found that HDFS clients can take a long time to recover, or sometimes don't recover at all, since we don't set up the socket timeout in the InputStream: {code} private void transfer() { ... ... OutputStream unbufOut = NetUtils.getOutputStream(sock, writeTimeout); InputStream unbufIn = NetUtils.getInputStream(sock); ... } {code} The InputStream should have its own timeout in the same way as the OutputStream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
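The failure mode can be reproduced with plain `java.net` sockets, which is the layer `NetUtils` wraps. This is a self-contained sketch of the missing read timeout, not the actual DataStreamer code; the 500 ms value is arbitrary:

```java
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0);
             Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress("127.0.0.1", server.getLocalPort()), 1000);
            server.accept(); // accept but never write, mimicking a hung peer datanode
            // Without this line, the read() below would block indefinitely --
            // the "client never recovers" symptom in the report.
            sock.setSoTimeout(500);
            InputStream in = sock.getInputStream();
            try {
                in.read();
                System.out.println("unexpected data");
            } catch (SocketTimeoutException e) {
                System.out.println("read timed out as expected");
            }
        }
    }
}
```

In the HDFS code the analogous fix would be to supply a timeout when obtaining the InputStream, mirroring what is already done for the OutputStream with `writeTimeout`.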
[jira] [Commented] (HDFS-6348) Secondary namenode - RMI Thread prevents JVM from exiting after main() completes
[ https://issues.apache.org/jira/browse/HDFS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526881#comment-14526881 ] Hadoop QA commented on HDFS-6348: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 35s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 21s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 5s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 13s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 183m 49s | Tests passed in hadoop-hdfs. 
| | | | 226m 47s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12730177/HDFS-6348-003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bb9ddef | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10781/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10781/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10781/console | This message was automatically generated. Secondary namenode - RMI Thread prevents JVM from exiting after main() completes - Key: HDFS-6348 URL: https://issues.apache.org/jira/browse/HDFS-6348 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-6348-003.patch, HDFS-6348.patch, HDFS-6348.patch, secondaryNN_threaddump_after_exit.log The Secondary NameNode does not exit when a RuntimeException occurs during startup. Say a wrong configuration is supplied: validation fails and a RuntimeException is thrown as shown below, but when I check the environment the SecondaryNameNode process is still alive. On analysis, the RMI thread is still alive, and since it is not a daemon thread the JVM does not exit. I'm attaching a thread dump to this JIRA for more details about the thread. 
{code} java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1900) at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.getInstance(BlockPlacementPolicy.java:199) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.init(BlockManager.java:256) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:635) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:260) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:205) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:695) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1868) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1892) ... 6 more Caused by: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at
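The stuck-process symptom follows from Java's shutdown rules: the JVM exits only once every non-daemon thread has finished, regardless of `main()` returning. A minimal sketch (names illustrative, not the SecondaryNameNode code):

```java
public class DaemonExitDemo {
    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stands in for the long-lived RMI thread
            } catch (InterruptedException ignored) {
            }
        });
        // Comment out the next line and the JVM lingers for a minute after
        // main() returns -- the same symptom the attached thread dump shows.
        worker.setDaemon(true);
        worker.start();
        System.out.println("main() done; daemon thread does not block JVM exit");
    }
}
```

Since the RMI threads here are created by the JDK rather than by HDFS code, a fix along the lines of catching the startup exception and calling `System.exit()` is presumably more practical than marking threads as daemons.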
[jira] [Updated] (HDFS-8249) Separate HdfsConstants into the client and the server side class
[ https://issues.apache.org/jira/browse/HDFS-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8249: Description: The constants in {{HdfsConstants}} are used by both the client side and the server side. There are three types of constants in the class: # Constants that are used internally by the servers or not part of the APIs. These constants are free to evolve without breaking compatibility. For example, {{MAX_PATH_LENGTH}} is used by the NN to enforce that path lengths do not grow too long. Developers are free to change the names of these constants and to move them around if necessary. # Constants that are used by the clients, but not part of the APIs. For example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part of the wire protocol but the name is not. Developers are free to rename these constants but are not allowed to change their values. # Constants that are part of the APIs. For example, {{SafeModeAction}} is used in {{DistributedFileSystem}}. Changing the name / value of the constant will break binary compatibility, but not source code compatibility. This jira proposes to separate the above three types of constants into different classes: * Creating a new class {{HdfsConstantsServer}} to hold the first type of constants. * Move {{HdfsConstants}} into the {{hdfs-client}} package. The work of separating the second and the third types of constants will be postponed to a separate jira. was: The constants in {{HdfsConstants}} are used by both the client side and the server side. There are three types of constants in the class: 1. Constants that are used internally by the servers or not part of the APIs. These constants are free to evolve without breaking compatibility. For example, {{MAX_PATH_LENGTH}} is used by the NN to enforce that path lengths do not grow too long. Developers are free to change the names of these constants and to move them around if necessary. 1. 
Constants that are used by the clients, but not part of the APIs. For example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part of the wire protocol but the name is not. Developers are free to rename these constants but are not allowed to change their values. 1. Constants that are part of the APIs. For example, {{SafeModeAction}} is used in {{DistributedFileSystem}}. Changing the name / value of the constant will break binary compatibility, but not source code compatibility. This jira proposes to separate the above three types of constants into different classes: * Creating a new class {{HdfsConstantsServer}} to hold the first type of constants. * Move {{HdfsConstants}} into the {{hdfs-client}} package. The work of separating the second and the third types of constants will be postponed to a separate jira. Separate HdfsConstants into the client and the server side class Key: HDFS-8249 URL: https://issues.apache.org/jira/browse/HDFS-8249 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.8.0 Attachments: HDFS-8249.000.patch, HDFS-8249.001.patch, HDFS-8249.002.patch, HDFS-8249.003.patch, HDFS-8249.004.patch The constants in {{HdfsConstants}} are used by both the client side and the server side. There are three types of constants in the class: # Constants that are used internally by the servers or not part of the APIs. These constants are free to evolve without breaking compatibility. For example, {{MAX_PATH_LENGTH}} is used by the NN to enforce that path lengths do not grow too long. Developers are free to change the names of these constants and to move them around if necessary. # Constants that are used by the clients, but not part of the APIs. For example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part of the wire protocol but the name is not. 
Developers are free to rename these constants but are not allowed to change their values. # Constants that are part of the APIs. For example, {{SafeModeAction}} is used in {{DistributedFileSystem}}. Changing the name / value of the constant will break binary compatibility, but not source code compatibility. This jira proposes to separate the above three types of constants into different classes: * Creating a new class {{HdfsConstantsServer}} to hold the first type of constants. * Move {{HdfsConstants}} into the {{hdfs-client}} package. The work of separating the second and the third types of constants will be postponed to a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
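The proposed split of the three constant types can be pictured as below. The class layout follows the names in the JIRA, but the member values are illustrative placeholders, not the authoritative HDFS definitions:

```java
public class ConstantsSplitSketch {
    // Type 1: server-internal, free to evolve (the proposed HdfsConstantsServer).
    static final class ServerConstants {
        static final int MAX_PATH_LENGTH = 8000; // illustrative value
    }

    // Types 2 and 3: client-visible (HdfsConstants, moved into hdfs-client).
    static final class ClientConstants {
        // Type 2: the value crosses the wire and must never change;
        // the Java identifier itself is free to be renamed.
        static final long QUOTA_DONT_SET = Long.MAX_VALUE; // illustrative value

        // Type 3: part of the public API; renaming breaks callers.
        enum SafeModeAction { SAFEMODE_LEAVE, SAFEMODE_ENTER, SAFEMODE_GET }
    }

    public static void main(String[] args) {
        System.out.println("server-only: MAX_PATH_LENGTH=" + ServerConstants.MAX_PATH_LENGTH);
        System.out.println("wire value: QUOTA_DONT_SET=" + ClientConstants.QUOTA_DONT_SET);
        System.out.println("API enum: " + ClientConstants.SafeModeAction.SAFEMODE_GET);
    }
}
```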
[jira] [Commented] (HDFS-6348) Secondary namenode - RMI Thread prevents JVM from exiting after main() completes
[ https://issues.apache.org/jira/browse/HDFS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526896#comment-14526896 ] Hadoop QA commented on HDFS-6348: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 30s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 38s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 4s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 12s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 177m 57s | Tests passed in hadoop-hdfs. 
| | | | 219m 24s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12730178/HDFS-6348-003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bb9ddef | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10782/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10782/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10782/console | This message was automatically generated. Secondary namenode - RMI Thread prevents JVM from exiting after main() completes - Key: HDFS-6348 URL: https://issues.apache.org/jira/browse/HDFS-6348 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-6348-003.patch, HDFS-6348.patch, HDFS-6348.patch, secondaryNN_threaddump_after_exit.log The Secondary NameNode does not exit when a RuntimeException occurs during startup. Say a wrong configuration is supplied: validation fails and a RuntimeException is thrown as shown below, but when I check the environment the SecondaryNameNode process is still alive. On analysis, the RMI thread is still alive, and since it is not a daemon thread the JVM does not exit. I'm attaching a thread dump to this JIRA for more details about the thread. 
{code} java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1900) at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.getInstance(BlockPlacementPolicy.java:199) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.init(BlockManager.java:256) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:635) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:260) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:205) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:695) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1868) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1892) ... 6 more Caused by: java.lang.ClassNotFoundException: Class com.huawei.hadoop.hdfs.server.blockmanagement.MyBlockPlacementPolicy not found at
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-6407: --- Attachment: HDFS-6407-003.patch Adjusted the page size to a lower number (instead of All) to ensure that pages render quickly when there are lots of items. This is an existing problem when there are thousands of items to display. Sorting thousands of items shows no noticeable delay, so it does not require any optimization. Tested with 8500 files in one directory. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png The old UI supported clicking on a column header to sort on that column. The new UI seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanode information, directory listings and snapshots. When there are many items in the tables, it is useful to have the ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526913#comment-14526913 ] Hadoop QA commented on HDFS-6407: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. | | {color:red}-1{color} | @author | 0m 0s | The patch appears to contain 2 @author tags which the Hadoop community has agreed to not allow in code contributions. | | {color:red}-1{color} | release audit | 0m 14s | The applied patch generated 3 release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 0m 20s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12730214/HDFS-6407-003.patch | | Optional Tests | | | git revision | trunk / 8f65c79 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/10783/artifact/patchprocess/patchReleaseAuditProblems.txt | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10783/console | This message was automatically generated. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Benoy Antony Priority: Minor Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png old ui supported clicking on column header to sort on that column. 
The new UI seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanode information, directory listings and snapshots. When there are many items in the tables, it is useful to have the ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8315) Datanode does not log reads
Kihwal Lee created HDFS-8315: Summary: Datanode does not log reads Key: HDFS-8315 URL: https://issues.apache.org/jira/browse/HDFS-8315 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Priority: Critical HDFS-6836 made datanode read request logging DEBUG. There is a good reason why it was at INFO for so many years. This is very useful in debugging load issues. This jira will revert HDFS-6836. We haven't seen it being a bottleneck on busy hbase clusters, but if someone thinks it is a serious overhead, please make it configurable in a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
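Since the read-request messages were demoted to DEBUG rather than removed, operators can presumably surface them again without a code change by raising the datanode logger's level in the daemon's log4j.properties. The logger name below assumes the messages are emitted under the DataNode class logger, which is an assumption, not something stated in the report:

```properties
# log4j.properties on the datanode: surface the read logging demoted by HDFS-6836
log4j.logger.org.apache.hadoop.hdfs.server.datanode.DataNode=DEBUG
```

This is a stop-gap; the JIRA itself proposes simply reverting the level to INFO.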
[jira] [Created] (HDFS-8316) Erasure coding: refactor EC constants to be consistent with HDFS-8249
Zhe Zhang created HDFS-8316: --- Summary: Erasure coding: refactor EC constants to be consistent with HDFS-8249 Key: HDFS-8316 URL: https://issues.apache.org/jira/browse/HDFS-8316 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang HDFS-8249 separates server-internal constants and client-visible ones. This JIRA takes care of EC constants from that perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-8316) Erasure coding: refactor EC constants to be consistent with HDFS-8249
[ https://issues.apache.org/jira/browse/HDFS-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-8316 started by Zhe Zhang. --- Erasure coding: refactor EC constants to be consistent with HDFS-8249 - Key: HDFS-8316 URL: https://issues.apache.org/jira/browse/HDFS-8316 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8316-HDFS-7285.0.patch HDFS-8249 separates server-internal constants and client-visible ones. This JIRA takes care of EC constants from that perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8316) Erasure coding: refactor EC constants to be consistent with HDFS-8249
[ https://issues.apache.org/jira/browse/HDFS-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8316: Attachment: HDFS-8316-HDFS-7285.0.patch Erasure coding: refactor EC constants to be consistent with HDFS-8249 - Key: HDFS-8316 URL: https://issues.apache.org/jira/browse/HDFS-8316 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8316-HDFS-7285.0.patch HDFS-8249 separates server-internal constants and client-visible ones. This JIRA takes care of EC constants from that perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8315) Datanode does not log reads
[ https://issues.apache.org/jira/browse/HDFS-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8315: - Attachment: HDFS-8315.patch Datanode does not log reads --- Key: HDFS-8315 URL: https://issues.apache.org/jira/browse/HDFS-8315 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Priority: Critical Attachments: HDFS-8315.patch HDFS-6836 made datanode read request logging DEBUG. There is a good reason why it was at INFO for so many years. This is very useful in debugging load issues. This jira will revert HDFS-6836. We haven't seen it being a bottleneck on busy hbase clusters, but if someone thinks it is a serious overhead, please make it configurable in a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8315) Datanode does not log reads
[ https://issues.apache.org/jira/browse/HDFS-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8315: - Assignee: Kihwal Lee Status: Patch Available (was: Open) Datanode does not log reads --- Key: HDFS-8315 URL: https://issues.apache.org/jira/browse/HDFS-8315 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-8315.patch HDFS-6836 made datanode read request logging DEBUG. There is a good reason why it was at INFO for so many years. This is very useful in debugging load issues. This jira will revert HDFS-6836. We haven't seen it being a bottleneck on busy hbase clusters, but if someone thinks it is a serious overhead, please make it configurable in a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8316) Erasure coding: refactor EC constants to be consistent with HDFS-8249
[ https://issues.apache.org/jira/browse/HDFS-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526942#comment-14526942 ] Zhe Zhang commented on HDFS-8316: - [~wheat9] / [~szetszwo] Could you take a look and see if it's consistent with HDFS-8249? Thanks. Erasure coding: refactor EC constants to be consistent with HDFS-8249 - Key: HDFS-8316 URL: https://issues.apache.org/jira/browse/HDFS-8316 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8316-HDFS-7285.0.patch HDFS-8249 separates server-internal constants and client-visible ones. This JIRA takes care of EC constants from that perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8315) Datanode does not log reads
[ https://issues.apache.org/jira/browse/HDFS-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526946#comment-14526946 ] Haohui Mai commented on HDFS-8315: -- Can you elaborate how you debug the load issues based on the log? Datanode does not log reads --- Key: HDFS-8315 URL: https://issues.apache.org/jira/browse/HDFS-8315 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-8315.patch HDFS-6836 made datanode read request logging DEBUG. There is a good reason why it was at INFO for so many years. This is very useful in debugging load issues. This jira will revert HDFS-6836. We haven't seen it being a bottleneck on busy hbase clusters, but if someone thinks it is a serious overhead, please make it configurable in a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8317) test-patch.sh should be documented
Allen Wittenauer created HDFS-8317: -- Summary: test-patch.sh should be documented Key: HDFS-8317 URL: https://issues.apache.org/jira/browse/HDFS-8317 Project: Hadoop HDFS Issue Type: Bug Reporter: Allen Wittenauer It might be useful to have all of test-patch.sh's functionality documented, how to use it, power user hints, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8237) Move all protocol classes used by ClientProtocol to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8237: - Attachment: HDFS-8237.002.patch Move all protocol classes used by ClientProtocol to hdfs-client --- Key: HDFS-8237 URL: https://issues.apache.org/jira/browse/HDFS-8237 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8237.000.patch, HDFS-8237.001.patch, HDFS-8237.002.patch, HDFS-8237.002.patch This jira proposes to move the classes in the hdfs project referred by ClientProtocol into the hdfs-client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8237) Move all protocol classes used by ClientProtocol to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8237: - Attachment: HDFS-8237.002.patch Move all protocol classes used by ClientProtocol to hdfs-client --- Key: HDFS-8237 URL: https://issues.apache.org/jira/browse/HDFS-8237 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8237.000.patch, HDFS-8237.001.patch, HDFS-8237.002.patch, HDFS-8237.002.patch This jira proposes to move the classes in the hdfs project referred by ClientProtocol into the hdfs-client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8237) Move all protocol classes used by ClientProtocol to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8237: - Attachment: (was: HDFS-8237.002.patch) Move all protocol classes used by ClientProtocol to hdfs-client --- Key: HDFS-8237 URL: https://issues.apache.org/jira/browse/HDFS-8237 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8237.000.patch, HDFS-8237.001.patch, HDFS-8237.002.patch This jira proposes to move the classes in the hdfs project referred by ClientProtocol into the hdfs-client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7609) startup used too much time to load edits
[ https://issues.apache.org/jira/browse/HDFS-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-7609: -- Attachment: HDFS-7609.patch Here is a draft patch that prevents clients from polluting the retry cache when a standby is being transitioned to active. It doesn't cover other possible optimization ideas discussed above. Appreciate any input on this. startup used too much time to load edits Key: HDFS-7609 URL: https://issues.apache.org/jira/browse/HDFS-7609 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: Carrey Zhan Attachments: HDFS-7609-CreateEditsLogWithRPCIDs.patch, HDFS-7609.patch, recovery_do_not_use_retrycache.patch One day my namenode crashed because two journal nodes timed out at the same time under very high load, leaving behind about 100 million transactions in the edits log. (I still have no idea why they were not rolled into the fsimage.) I tried to restart the namenode, but it showed that almost 20 hours would be needed to finish, and it was loading fsedits most of the time. I also tried to restart the namenode in recovery mode; the loading speed was no different. I looked into the stack trace and judged that the slowness was caused by the retry cache. So I set dfs.namenode.enable.retrycache to false, and the restart process finished in half an hour. I think the retry cache is useless during startup, at least during the recovery process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
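The reporter's workaround, written out as an hdfs-site.xml fragment (the key name is taken directly from the report; note that this disables the retry cache entirely, not just during startup, so retried non-idempotent RPCs lose their duplicate protection):

```xml
<!-- hdfs-site.xml: reporter's workaround for slow edit-log replay -->
<property>
  <name>dfs.namenode.enable.retrycache</name>
  <value>false</value>
</property>
```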
[jira] [Updated] (HDFS-7609) startup used too much time to load edits
[ https://issues.apache.org/jira/browse/HDFS-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-7609: -- Assignee: Ming Ma Status: Patch Available (was: Open) startup used too much time to load edits Key: HDFS-7609 URL: https://issues.apache.org/jira/browse/HDFS-7609 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: Carrey Zhan Assignee: Ming Ma Attachments: HDFS-7609-CreateEditsLogWithRPCIDs.patch, HDFS-7609.patch, recovery_do_not_use_retrycache.patch One day my namenode crashed because two journal nodes timed out at the same time under very high load, leaving behind about 100 million transactions in the edits log. (I still have no idea why they were not rolled into the fsimage.) I tried to restart the namenode, but it showed that almost 20 hours would be needed to finish, and it was loading fsedits most of the time. I also tried to restart the namenode in recovery mode; the loading speed was no different. I looked into the stack trace and judged that the slowness was caused by the retry cache. So I set dfs.namenode.enable.retrycache to false, and the restart process finished in half an hour. I think the retry cache is useless during startup, at least during the recovery process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5640) Add snapshot methods to FileContext.
[ https://issues.apache.org/jira/browse/HDFS-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-5640: --- Attachment: HDFS-5640-004.patch Add snapshot methods to FileContext. Key: HDFS-5640 URL: https://issues.apache.org/jira/browse/HDFS-5640 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, snapshots Affects Versions: 3.0.0, 2.2.0 Reporter: Chris Nauroth Assignee: Rakesh R Attachments: HDFS-5640-001.patch, HDFS-5640-002.patch, HDFS-5640-003.patch, HDFS-5640-004.patch Currently, methods related to HDFS snapshots are defined on {{FileSystem}}. For feature parity, these methods need to be added to {{FileContext}}. This would also require updating {{AbstractFileSystem}} and the {{Hdfs}} subclass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8316) Erasure coding: refactor EC constants to be consistent with HDFS-8249
[ https://issues.apache.org/jira/browse/HDFS-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526990#comment-14526990 ] Jing Zhao commented on HDFS-8316: - Thanks for the work, Zhe! +1. Erasure coding: refactor EC constants to be consistent with HDFS-8249 - Key: HDFS-8316 URL: https://issues.apache.org/jira/browse/HDFS-8316 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8316-HDFS-7285.0.patch HDFS-8249 separates server-internal constants and client-visible ones. This JIRA takes care of EC constants from that perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8316) Erasure coding: refactor EC constants to be consistent with HDFS-8249
[ https://issues.apache.org/jira/browse/HDFS-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao resolved HDFS-8316. - Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed I've committed this to the feature branch. Erasure coding: refactor EC constants to be consistent with HDFS-8249 - Key: HDFS-8316 URL: https://issues.apache.org/jira/browse/HDFS-8316 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: HDFS-7285 Attachments: HDFS-8316-HDFS-7285.0.patch HDFS-8249 separates server-internal constants and client-visible ones. This JIRA takes care of EC constants from that perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8316) Erasure coding: refactor EC constants to be consistent with HDFS-8249
[ https://issues.apache.org/jira/browse/HDFS-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526997#comment-14526997 ] Zhe Zhang commented on HDFS-8316: - Thanks Jing for the prompt review! Erasure coding: refactor EC constants to be consistent with HDFS-8249 - Key: HDFS-8316 URL: https://issues.apache.org/jira/browse/HDFS-8316 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: HDFS-7285 Attachments: HDFS-8316-HDFS-7285.0.patch HDFS-8249 separates server-internal constants and client-visible ones. This JIRA takes care of EC constants from that perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8315) Datanode does not log reads
[ https://issues.apache.org/jira/browse/HDFS-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527002#comment-14527002 ] Kihwal Lee commented on HDFS-8315: -- If accesses to a small set of blocks create hot spots, namenode audit log may not tell the whole story. Without the read request logging, it is very difficult to know what caused saturation on particular datanodes. If we catch it in action, then jstack can show what is being served and how many dataXceiver threads are used, etc., but that is not always possible. Even if we get lucky, we may not get accurate information depending on how the state is sampled. Datanode does not log reads --- Key: HDFS-8315 URL: https://issues.apache.org/jira/browse/HDFS-8315 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-8315.patch HDFS-6836 made datanode read request logging DEBUG. There is a good reason why it was at INFO for so many years. This is very useful in debugging load issues. This jira will revert HDFS-6836. We haven't seen it being a bottleneck on busy hbase clusters, but if someone thinks it is a serious overhead, please make it configurable in a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8311) DataStreamer.transfer() should timeout the socket InputStream.
[ https://issues.apache.org/jira/browse/HDFS-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HDFS-8311: Attachment: HDFS-8311.001.patch Correctly added the timeout for both DS.createBlockOutputStream() and DS.transfer(); it now relies on DFSClient.getDatanodeReadTimeout() (both timeouts should usually be the same, but it is now semantically correct). DataStreamer.transfer() should timeout the socket InputStream. -- Key: HDFS-8311 URL: https://issues.apache.org/jira/browse/HDFS-8311 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Esteban Gutierrez Assignee: Kiran Kumar M R Attachments: 0001-HDFS-8311-DataStreamer.transfer-should-timeout-the-s.patch, HDFS-8311.001.patch While validating some HA failure modes we found that HDFS clients can take a long time to recover or sometimes don't recover at all since we don't set up the socket timeout in the InputStream: {code} private void transfer () { ... ... OutputStream unbufOut = NetUtils.getOutputStream(sock, writeTimeout); InputStream unbufIn = NetUtils.getInputStream(sock); ... } {code} The InputStream should have its own timeout in the same way as the OutputStream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
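The asymmetry in the snippet above is that the OutputStream gets writeTimeout while the InputStream gets no read timeout at all, so a hung datanode can block the reader indefinitely. A minimal, self-contained sketch of the idea using plain java.net.Socket rather than Hadoop's NetUtils (the helper name is hypothetical, not the actual patch):

```java
import java.net.Socket;
import java.net.SocketException;

public class TransferTimeoutSketch {
    // Hypothetical helper: give the socket a read timeout before its
    // InputStream is wrapped, mirroring the write timeout already applied
    // to the OutputStream in DataStreamer.transfer().
    static void applyReadTimeout(Socket sock, int readTimeoutMs) throws SocketException {
        // Reads on sock.getInputStream() now throw SocketTimeoutException
        // after readTimeoutMs instead of blocking forever.
        sock.setSoTimeout(readTimeoutMs);
    }

    public static void main(String[] args) throws Exception {
        Socket sock = new Socket();
        applyReadTimeout(sock, 60_000);
        System.out.println(sock.getSoTimeout()); // prints 60000
    }
}
```

With this in place, a failed pipeline node surfaces as a SocketTimeoutException the client can recover from, rather than a hang.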
[jira] [Commented] (HDFS-5640) Add snapshot methods to FileContext.
[ https://issues.apache.org/jira/browse/HDFS-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527004#comment-14527004 ] Rakesh R commented on HDFS-5640: Attached another patch; here I've added a few unit test cases (since {{FileContext}} delegates its calls, I've added only a few basic test cases). [~cnauroth], could you take a look at the patch and let me know your thoughts? Thanks! Add snapshot methods to FileContext. Key: HDFS-5640 URL: https://issues.apache.org/jira/browse/HDFS-5640 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, snapshots Affects Versions: 3.0.0, 2.2.0 Reporter: Chris Nauroth Assignee: Rakesh R Attachments: HDFS-5640-001.patch, HDFS-5640-002.patch, HDFS-5640-003.patch, HDFS-5640-004.patch Currently, methods related to HDFS snapshots are defined on {{FileSystem}}. For feature parity, these methods need to be added to {{FileContext}}. This would also require updating {{AbstractFileSystem}} and the {{Hdfs}} subclass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527014#comment-14527014 ] Zhe Zhang commented on HDFS-8242: - Thanks Rakesh. Since the main change is to add a new test, let's just make sure the test itself passes stably. The list of tests looks good to me. We could add tests covering zone semantics: non-empty dir, moving files between zones, etc. But maybe those could be delayed until we have a unified zone implementation (for EC, encryption, etc.). So +1 once Rakesh confirms the test. Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, HDFS-8242-HDFS-7285.05.patch, HDFS-8242-HDFS-7285.05.patch, HDFS-8242-HDFS-7285.05.patch This JIRA is to add test cases with the CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8237) Move all protocol classes used by ClientProtocol to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527016#comment-14527016 ] Brandon Li commented on HDFS-8237: -- +1 pending Jenkins. Move all protocol classes used by ClientProtocol to hdfs-client --- Key: HDFS-8237 URL: https://issues.apache.org/jira/browse/HDFS-8237 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8237.000.patch, HDFS-8237.001.patch, HDFS-8237.002.patch This jira proposes to move the classes in the hdfs project referred by ClientProtocol into the hdfs-client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name
[ https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8305: --- Description: HDFS INotify: the destination field of RenameOp should always end with the file name rather than sometimes being a directory name. Previously, in some cases when using the old rename, this was not the case. The format of OP_EDIT_LOG_RENAME_OLD allows moving /f to /d/f to be represented as RENAME(src=/f, dst=/d) or RENAME(src=/f, dst=/d/f). This change makes HDFS always use the latter form. This, in turn, ensures that inotify will always be able to consider the dst field as the full destination file name. This is a compatible change since we aren't removing the ability to handle the first form during edit log replay... we just no longer generate it. (was: HDFS INotify: the destination field of RenameOp should always end with the file name rather than sometimes being a directory name.) HDFS INotify: the destination field of RenameOp should always end with the file name Key: HDFS-8305 URL: https://issues.apache.org/jira/browse/HDFS-8305 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8305.001.patch HDFS INotify: the destination field of RenameOp should always end with the file name rather than sometimes being a directory name. Previously, in some cases when using the old rename, this was not the case. The format of OP_EDIT_LOG_RENAME_OLD allows moving /f to /d/f to be represented as RENAME(src=/f, dst=/d) or RENAME(src=/f, dst=/d/f). This change makes HDFS always use the latter form. This, in turn, ensures that inotify will always be able to consider the dst field as the full destination file name. This is a compatible change since we aren't removing the ability to handle the first form during edit log replay... we just no longer generate it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name
[ https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527023#comment-14527023 ] Colin Patrick McCabe commented on HDFS-8305: bq. can we add a description to this jira explaining why (e.g., This, in turn, ensures that inotify will always be able to consider the dst field as the full destination file name.)? added bq. can we add java doc to the void logRename( methods to say something like if the rename source is a file, the target is better to be a file too, this will ensure that inotify will always be able to confider the dst file as the full destination file name.? ok HDFS INotify: the destination field of RenameOp should always end with the file name Key: HDFS-8305 URL: https://issues.apache.org/jira/browse/HDFS-8305 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8305.001.patch HDFS INotify: the destination field of RenameOp should always end with the file name rather than sometimes being a directory name. Previously, in some cases when using the old rename, this was not the case. The format of OP_EDIT_LOG_RENAME_OLD allows moving /f to /d/f to be represented as RENAME(src=/f, dst=/d) or RENAME(src=/f, dst=/d/f). This change makes HDFS always use the latter form. This, in turn, ensures that inotify will always be able to consider the dst field as the full destination file name. This is a compatible change since we aren't removing the ability to handle the first form during edit log replay... we just no longer generate it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name
[ https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8305: --- Attachment: HDFS-8305.002.patch HDFS INotify: the destination field of RenameOp should always end with the file name Key: HDFS-8305 URL: https://issues.apache.org/jira/browse/HDFS-8305 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8305.001.patch, HDFS-8305.002.patch HDFS INotify: the destination field of RenameOp should always end with the file name rather than sometimes being a directory name. Previously, in some cases when using the old rename, this was not the case. The format of OP_EDIT_LOG_RENAME_OLD allows moving /f to /d/f to be represented as RENAME(src=/f, dst=/d) or RENAME(src=/f, dst=/d/f). This change makes HDFS always use the latter form. This, in turn, ensures that inotify will always be able to consider the dst field as the full destination file name. This is a compatible change since we aren't removing the ability to handle the first form during edit log replay... we just no longer generate it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
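The normalization the HDFS-8305 change enforces can be sketched as follows: when the destination of a rename names a directory, the edit-log entry's dst is rewritten to include the source file name, so inotify consumers can always treat dst as the full destination path. The helper name and directory flag are hypothetical illustration, not the actual patch:

```java
public class RenameDstSketch {
    // Hypothetical helper: if dst names a directory, append the source's
    // final path component so dst always ends with the file name,
    // i.e. RENAME(src=/f, dst=/d) becomes RENAME(src=/f, dst=/d/f).
    static String normalizeDst(String src, String dst, boolean dstIsDirectory) {
        if (!dstIsDirectory) {
            return dst; // already the full destination file name
        }
        String name = src.substring(src.lastIndexOf('/') + 1);
        return dst.endsWith("/") ? dst + name : dst + "/" + name;
    }

    public static void main(String[] args) {
        System.out.println(normalizeDst("/f", "/d", true));    // prints /d/f
        System.out.println(normalizeDst("/f", "/d/f", false)); // prints /d/f
    }
}
```

Since edit-log replay still accepts both forms, only the generation side needs this normalization, which is why the change is compatible.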
[jira] [Updated] (HDFS-8290) WebHDFS calls before namesystem initialization can cause NullPointerException.
[ https://issues.apache.org/jira/browse/HDFS-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-8290: Resolution: Fixed Fix Version/s: 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed this to trunk and branch-2. Jakob, thank you for the code review. WebHDFS calls before namesystem initialization can cause NullPointerException. -- Key: HDFS-8290 URL: https://issues.apache.org/jira/browse/HDFS-8290 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8290.001.patch The NameNode has a brief window of time when the HTTP server has been initialized, but the namesystem has not been initialized. During this window, a WebHDFS call can cause a {{NullPointerException}}. We can catch this condition and return a more meaningful error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
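The fix pattern for the startup window described above is straightforward: guard the WebHDFS handler with a check for the not-yet-initialized namesystem and fail with a retriable, descriptive error instead of dereferencing null. A hedged sketch; the class, guard method, and message text are illustrative, not the actual patch:

```java
public class StartupCheckSketch {
    // Stand-in for an exception type a client can safely retry on.
    static class RetriableException extends Exception {
        RetriableException(String msg) { super(msg); }
    }

    // Illustrative guard: turn the NullPointerException window into a
    // meaningful, retriable error for WebHDFS calls that arrive before
    // the namesystem is initialized.
    static void checkNamesystem(Object namesystem) throws RetriableException {
        if (namesystem == null) {
            throw new RetriableException(
                "Namenode is in startup mode; namesystem not yet initialized");
        }
    }

    public static void main(String[] args) {
        try {
            checkNamesystem(null);
        } catch (RetriableException e) {
            System.out.println(e.getMessage());
        }
    }
}
```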
[jira] [Commented] (HDFS-7609) startup used too much time to load edits
[ https://issues.apache.org/jira/browse/HDFS-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527030#comment-14527030 ] Ming Ma commented on HDFS-7609: --- #3 in the scenario description above should be "Before nn2 starts the transition to active" instead of "Even though nn2 is still tailing edit log and not active yet", because after nn2 starts tailing the edit log, it will lock the retryCache until it becomes active and thus prevent client calls from adding new entries to the retry cache during the transition. startup used too much time to load edits Key: HDFS-7609 URL: https://issues.apache.org/jira/browse/HDFS-7609 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: Carrey Zhan Assignee: Ming Ma Attachments: HDFS-7609-CreateEditsLogWithRPCIDs.patch, HDFS-7609.patch, recovery_do_not_use_retrycache.patch One day my namenode crashed because two journal nodes timed out at the same time under very high load, leaving behind about 100 million transactions in the edits log. (I still have no idea why they were not rolled into the fsimage.) I tried to restart the namenode, but it showed that almost 20 hours would be needed to finish, and it was loading fsedits most of the time. I also tried to restart the namenode in recovery mode; the loading speed was no different. I looked into the stack trace and judged that the slowness was caused by the retry cache. So I set dfs.namenode.enable.retrycache to false, and the restart process finished in half an hour. I think the retry cache is useless during startup, at least during the recovery process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5757) refreshNodes with many nodes at the same time could slow down NN
[ https://issues.apache.org/jira/browse/HDFS-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-5757: -- Resolution: Duplicate Status: Resolved (was: Patch Available) This optimization has been taken care of by HDFS-7411. refreshNodes with many nodes at the same time could slow down NN Key: HDFS-5757 URL: https://issues.apache.org/jira/browse/HDFS-5757 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-5757.patch Sometimes we need to decommission a whole rack of nodes at the same time. While the decommission is in progress, the NN is slow. The reason is that when the DecommissionManager checks the decommission status, it acquires the namesystem's writer lock and iterates through all DNs; for each DN in the decommissioning state, it checks whether replication is done for all the blocks on the machine via blockManager.isReplicationInProgress. For a large cluster, the number of blocks on the machine can be big. The fix could be to have the DecommissionManager check only several decommission-in-progress nodes each time it acquires the namesystem's writer lock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8290) WebHDFS calls before namesystem initialization can cause NullPointerException.
[ https://issues.apache.org/jira/browse/HDFS-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527039#comment-14527039 ] Hudson commented on HDFS-8290: -- FAILURE: Integrated in Hadoop-trunk-Commit #7724 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7724/]) HDFS-8290. WebHDFS calls before namesystem initialization can cause NullPointerException. Contributed by Chris Nauroth. (cnauroth: rev c4578760b67d5b5169949a1b059f4472a268ff1b) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/web/resources/TestWebHdfsDataLocality.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt WebHDFS calls before namesystem initialization can cause NullPointerException. -- Key: HDFS-8290 URL: https://issues.apache.org/jira/browse/HDFS-8290 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8290.001.patch The NameNode has a brief window of time when the HTTP server has been initialized, but the namesystem has not been initialized. During this window, a WebHDFS call can cause a {{NullPointerException}}. We can catch this condition and return a more meaningful error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6184) Capture NN's thread dump when it fails over
[ https://issues.apache.org/jira/browse/HDFS-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527042#comment-14527042 ] Ming Ma commented on HDFS-6184: --- Has anyone had time to review this? We have been running with this feature in production clusters. It has been helpful in gaining a better understanding of the namenode state when it fails over. Capture NN's thread dump when it fails over --- Key: HDFS-6184 URL: https://issues.apache.org/jira/browse/HDFS-6184 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6184-2.patch, HDFS-6184.patch We have seen several false positives in terms of when ZKFC considers the NN to be unhealthy. Some of these trigger unnecessary failovers. Examples: 1. An SBN checkpoint caused ZKFC's RPC call into the NN to time out. The consequence isn't bad; the SBN just quits ZK membership and rejoins later, but it is unnecessary. The reason is that the checkpoint acquires the NN global write lock and all RPC requests are blocked. Even though HAServiceProtocol.monitorHealth doesn't need to acquire the NN lock, it still needs to use the service RPC queue. 2. When the ANN is busy, the global lock can sometimes block other requests, and ZKFC's RPC call times out. This triggers a failover, yet even after the failover the new ANN might run into a similar issue. We can increase the ZKFC-to-NN timeout value to mitigate this to some degree. It would be useful if ZKFC could judge more accurately whether the NN is healthy and predict whether a failover will help. For example, we can: 1. Have ZKFC make decisions based on an NN thread dump. 2. Have a dedicated RPC pool for ZKFC-to-NN calls. Given that the health check doesn't need to acquire the NN global lock, it can go through even if the NN is checkpointing or very busy. Any comments? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6184) Capture NN's thread dump when it fails over
[ https://issues.apache.org/jira/browse/HDFS-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527050#comment-14527050 ] Hadoop QA commented on HDFS-6184: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12690183/HDFS-6184-2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c457876 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10789/console | This message was automatically generated. Capture NN's thread dump when it fails over --- Key: HDFS-6184 URL: https://issues.apache.org/jira/browse/HDFS-6184 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-6184-2.patch, HDFS-6184.patch We have seen several false positives in terms of when ZKFC considers the NN to be unhealthy. Some of these trigger unnecessary failovers. Examples: 1. An SBN checkpoint caused ZKFC's RPC call into the NN to time out. The consequence isn't bad; the SBN just quits ZK membership and rejoins later, but it is unnecessary. The reason is that the checkpoint acquires the NN global write lock and all RPC requests are blocked. Even though HAServiceProtocol.monitorHealth doesn't need to acquire the NN lock, it still needs to use the service RPC queue. 2. When the ANN is busy, the global lock can sometimes block other requests, and ZKFC's RPC call times out. This triggers a failover, yet even after the failover the new ANN might run into a similar issue. We can increase the ZKFC-to-NN timeout value to mitigate this to some degree. It would be useful if ZKFC could judge more accurately whether the NN is healthy and predict whether a failover will help. For example, we can: 1. Have ZKFC make decisions based on an NN thread dump. 2. Have a dedicated RPC pool for ZKFC-to-NN calls. Given that the health check doesn't need to acquire the NN global lock, it can go through even if the NN is checkpointing or very busy. Any comments? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8315) Datanode does not log reads
[ https://issues.apache.org/jira/browse/HDFS-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527052#comment-14527052 ] Arpit Agarwal commented on HDFS-8315: - Hi Kihwal, IIRC [~gopalv] had some data indicating the request logging was a bottleneck. Can we use an async appender? Datanode does not log reads --- Key: HDFS-8315 URL: https://issues.apache.org/jira/browse/HDFS-8315 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-8315.patch HDFS-6836 made datanode read request logging DEBUG. There is a good reason why it was at INFO for so many years. This is very useful in debugging load issues. This jira will revert HDFS-6836. We haven't seen it being a bottleneck on busy hbase clusters, but if someone thinks it is a serious overhead, please make it configurable in a separate jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
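If the per-read logging at INFO does turn out to be costly, the async-appender option raised above would let DataXceiver threads only enqueue log events rather than block on disk I/O per request. A hedged log4j.xml sketch (log4j 1.x requires XML, not log4j.properties, to configure AsyncAppender; the appender names here are illustrative, not Hadoop's actual defaults):

```xml
<!-- log4j.xml fragment: wrap the DataNode's existing file appender in an
     AsyncAppender so request log lines are written off the request path. -->
<appender name="ASYNC" class="org.apache.log4j.AsyncAppender">
  <param name="BufferSize" value="1024"/>
  <appender-ref ref="RFA"/> <!-- "RFA": assumed existing rolling file appender -->
</appender>
<logger name="org.apache.hadoop.hdfs.server.datanode.DataNode">
  <level value="INFO"/>
  <appender-ref ref="ASYNC"/>
</logger>
```

The trade-off is bounded buffering: under sustained overload the buffer fills and events are either dropped or the caller blocks, depending on the Blocking setting.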
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527055#comment-14527055 ] Rakesh R commented on HDFS-8242: Thanks again [~zhz] for the reviews! The [Build Report - TestErasureCodingCLI|https://builds.apache.org/job/PreCommit-HDFS-Build/10776/testReport/org.apache.hadoop.cli/TestErasureCodingCLI/] test cases are passing in CI. It looks like the CI test failures are unrelated to this patch, as this is an independent new set of test cases. bq. We could add tests covering zone semantics: non-empty dir, moving files between zones, etc. But maybe those could be delayed until we have a unified zone implementation (for EC, encryption, etc.) Probably we could keep an eye on these changes and some of us can update/add unit tests accordingly. Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, HDFS-8242-HDFS-7285.05.patch, HDFS-8242-HDFS-7285.05.patch, HDFS-8242-HDFS-7285.05.patch This JIRA is to add test cases with the CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Open (was: Patch Available) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
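The two parsing problems described in HDFS-8078 can be sketched directly: splitting host:port on the last colon (rather than the first) recovers the port from an IPv6 literal, and an IPv6 address must be bracketed before a port is appended to form a valid authority. The helper names below are hypothetical, not the actual DataNodeID/NetUtils code:

```java
public class Ipv6AddrSketch {
    // Split "host:port" on the LAST colon, so an IPv6 literal such as
    // 2401:db00:1010:70ba:face:0:8:0:50010 parses as (addr, 50010)
    // instead of breaking on the first colon.
    static String[] splitHostPort(String authority) {
        int i = authority.lastIndexOf(':');
        return new String[] { authority.substring(0, i), authority.substring(i + 1) };
    }

    // Build a host:port authority: IPv6 literals need [brackets] before the
    // port is appended (as java.net.URI requires); IPv4 does not.
    static String toAuthority(String ipAddr, int port) {
        boolean ipv6 = ipAddr.indexOf(':') >= 0;
        return (ipv6 ? "[" + ipAddr + "]" : ipAddr) + ":" + port;
    }

    public static void main(String[] args) {
        String[] hp = splitHostPort("2401:db00:1010:70ba:face:0:8:0:50010");
        System.out.println(hp[0] + " port " + hp[1]);
        System.out.println(toAuthority("2401:db00:1010:70ba:face:0:8:0", 50010));
        // prints [2401:db00:1010:70ba:face:0:8:0]:50010
    }
}
```

Using lastIndexOf also avoids the split-based scan the reporter mentions, which is why it is slightly faster as well as correct for both address families.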
[jira] [Commented] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527073#comment-14527073 ] Rakesh R commented on HDFS-7949: Thanks [~zhz] and [~drankye] for the reviews and committing the patch! FYI: Newly added test cases are passing in the build [jenkins report|https://builds.apache.org/job/PreCommit-HDFS-Build/10776/testReport/org.apache.hadoop.hdfs.tools.offlineImageViewer/TestOfflineImageViewerWithStripedBlocks/] WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Fix For: HDFS-7285 Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Attachment: (was: HDFS-8078.8.patch) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
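The two fixes described above — bracketing IPv6 literals before assembling a `host:port` authority, and splitting on the *last* colon rather than the first — can be sketched as follows. This is an illustrative helper, not the actual HDFS-8078 patch; the class and method names are hypothetical.

```java
// Hypothetical sketch of IPv6-safe host:port handling (not the HDFS patch).
public final class HostPortUtil {

  // Wrap IPv6 literals in brackets so URI-style parsing accepts them,
  // e.g. "[2401:db00:1010:70ba:face:0:8:0]:50010".
  static String toHostPort(String ipAddr, int port) {
    boolean isV6Literal = ipAddr.indexOf(':') >= 0 && !ipAddr.startsWith("[");
    return (isV6Literal ? "[" + ipAddr + "]" : ipAddr) + ":" + port;
  }

  // Split on the LAST colon: IPv6 literals contain colons themselves,
  // so splitting on the first colon truncates the address.
  static String hostOf(String hostPort) {
    int i = hostPort.lastIndexOf(':');
    String host = hostPort.substring(0, i);
    if (host.startsWith("[") && host.endsWith("]")) {
      host = host.substring(1, host.length() - 1); // strip brackets
    }
    return host;
  }

  static int portOf(String hostPort) {
    return Integer.parseInt(hostPort.substring(hostPort.lastIndexOf(':') + 1));
  }
}
```

Using `lastIndexOf` here also matches the comment's note that it is slightly cheaper than `split`.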
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Patch Available (was: Open) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.9.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Attachment: HDFS-8078.9.patch Fix checkstyle errors and whitespace; test seems to be ND. HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.9.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8281) Erasure Coding: implement parallel stateful reading for striped layout
[ https://issues.apache.org/jira/browse/HDFS-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527092#comment-14527092 ] Zhe Zhang commented on HDFS-8281: - bq. the tradeoff here is the throughput and the biggest latency of serving a single read request. I agree. To mitigate the latency issue we can consider something similar to Linux kernel readahead: when the current stripe buffer has X amount of un-accessed data, prefetch the next stripe in an async fashion. Overall it looks reasonable to me if we read one stripe unit now. I agree about revisiting possible optimizations after seeing more benchmarking results. bq. One question is why we choose 256KB as the cell size instead of the original 64KB? I remember one of the patches from [~drankye] did this update. Kai, maybe you can remind us of the reason? Thanks Jing for the explanation and for revising the patch. Erasure Coding: implement parallel stateful reading for striped layout -- Key: HDFS-8281 URL: https://issues.apache.org/jira/browse/HDFS-8281 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-8281-HDFS-7285.001.patch, HDFS-8281-HDFS-7285.001.patch, HDFS-8281.000.patch This jira aims to support parallel reading for stateful read in {{DFSStripedInputStream}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7672) Erasure Coding: handle write failure for stripping coding blocks
[ https://issues.apache.org/jira/browse/HDFS-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7672: -- Attachment: h7672_20150504.patch h7672_20150504.patch: - handles failure of (non-leading) datanode with data block during writing; - includes HDFS-8288; * leading datanode failure will be done by HDFS-8254. Erasure Coding: handle write failure for stripping coding blocks Key: HDFS-7672 URL: https://issues.apache.org/jira/browse/HDFS-7672 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h7672_20150504.patch In *stripping* case, for (6, 3)-Reed-Solomon, a client writes to 6 data blocks and 3 parity blocks concurrently. We need to handle datanode or network failures when writing a EC BlockGroup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8322) Display warning if hadoop fs -ls is showing the local filesystem
[ https://issues.apache.org/jira/browse/HDFS-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-8322: Attachment: HDFS-8322.000.patch The patch checks the file system of every expanded argument. If the file system is {{LocalFileSystem}}, a warning message is printed:
{code}
$ hdfs dfs -ls /
Warning: running hadoop fs -ls on local file system: /
Found 28 items
drwx------   - root wheel  170 2014-08-13 22:21 /.Spotlight-V100
d-wx-wx-wt   - root wheel   68 2014-08-15 23:21 /.Trashes
drwxr-xr-x   - root wheel  136 2014-10-13 12:30 /System
drwxr-xr-x   - root admin  204 2014-08-15 23:14 /Users
...
drwxrwxrwt   - root wheel  442 2015-05-04 16:26 /tmp
drwxr-xr-x   - root wheel  442 2015-02-03 12:27 /usr
drwxr-xr-x   - root wheel  884 2015-03-16 14:43 /var
{code}
Display warning if hadoop fs -ls is showing the local filesystem Key: HDFS-8322 URL: https://issues.apache.org/jira/browse/HDFS-8322 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-8322.000.patch Using {{LocalFileSystem}} is rarely the intention of running {{hadoop fs -ls}}. This JIRA proposes displaying a warning message if hadoop fs -ls is showing the local filesystem or using default fs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
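The check the patch describes can be sketched in a few lines. This is a minimal model, not the attached HDFS-8322.000.patch; the class and method names are hypothetical, and it keys off the URI scheme the way the shell would after resolving each argument against the default filesystem.

```java
// Hypothetical sketch of the "warn on local FS" check (not the real patch).
public final class LocalFsWarning {

  // A "file" scheme (or no scheme at all, falling back to the default
  // fs when fs.defaultFs is unset) means the path resolved to LocalFileSystem.
  static boolean isLocalScheme(String scheme) {
    return scheme == null || "file".equals(scheme);
  }

  // Returns the warning line to print, or null when no warning is needed.
  static String warningFor(String path, String scheme) {
    return isLocalScheme(scheme)
        ? "Warning: running hadoop fs -ls on local file system: " + path
        : null;
  }
}
```

Note Allen Wittenauer's -1 later in this thread: on deployments where a mountable DFS (Lustre, Gluster) is exposed through `LocalFileSystem`, this warning would fire on perfectly normal usage.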
[jira] [Updated] (HDFS-8322) Display warning if hadoop fs -ls is showing the local filesystem
[ https://issues.apache.org/jira/browse/HDFS-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-8322: Status: Patch Available (was: Open) Display warning if hadoop fs -ls is showing the local filesystem Key: HDFS-8322 URL: https://issues.apache.org/jira/browse/HDFS-8322 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-8322.000.patch Using {{LocalFileSystem}} is rarely the intention of running {{hadoop fs -ls}}. This JIRA proposes displaying a warning message if hadoop fs -ls is showing the local filesystem or using default fs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs
[ https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527693#comment-14527693 ] Hadoop QA commented on HDFS-8306: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 32s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 26s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 32s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 14s | The applied patch generated 2 new checkstyle issues (total was 41, now 43). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 5s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 165m 55s | Tests failed in hadoop-hdfs. 
| | | | 208m 34s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestLeaseRecovery2 | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12730270/HDFS-8306.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bf70c5a | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10793/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10793/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10793/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10793/console | This message was automatically generated. Generate ACL and Xattr outputs in OIV XML outputs - Key: HDFS-8306 URL: https://issues.apache.org/jira/browse/HDFS-8306 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-8306.000.patch, HDFS-8306.001.patch Currently, in the {{hdfs oiv}} XML outputs, not all fields of fsimage are outputs. It makes inspecting {{fsimage}} from XML outputs less practical. Also it prevents recovering a fsimage from XML file. This JIRA is adding ACL and XAttrs in the XML outputs as the first step to achieve the goal described in HDFS-8061. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527716#comment-14527716 ] Jing Zhao commented on HDFS-7678: - Thanks for the great work, Zhe! I have not finished my review yet, but it looks like the latest 008 patch will fail TestDFSStripedInputStream (it passed the pread part but failed in the stateful read part)? Besides, for {{ReadPortion}}'s public fields, I changed them to private because they are not declared as final and can be modified from outside; restricting modification to go through the setters makes it easier to track. But I don't feel strongly about this. Another quick comment: the following change looks incorrect to me. For example, if {{blkStartOffset - lb.getStartOffset()}} points to cellSize * 10 inside the block group, we should read from the 5th data block instead of the 2nd.
{code}
 int idx = (int) (((blkStartOffset - lb.getStartOffset()) / cellSize)
-    % dataBlkNum);
+    % (dataBlkNum + parityBlkNum));
{code}
Will try to finish the review and post comments later today. Erasure coding: DFSInputStream with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Li Bo Assignee: Zhe Zhang Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, HDFS-7678.000.patch, HDFS-7678.001.patch A block group reader will read data from a BlockGroup whether it is in striped or contiguous layout. The corrupt blocks can be known before reading (told by the namenode), or just be found during reading. The block group reader needs to do decoding work when some blocks are found corrupt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
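Jing's index example can be checked with a few lines of arithmetic. In a striped layout, cells round-robin over the *data* blocks only (parity blocks never hold client data), so the modulus must be `dataBlkNum`. The sketch below assumes the RS-(6, 3) schema and 256 KB cell size discussed in this thread; the class is illustrative, not HDFS code.

```java
// Worked check of the modulus question above (illustrative, not HDFS code).
public final class StripeIndexExample {

  // The cell at byte offset `offsetInGroup` is cell (offsetInGroup / cellSize),
  // and cells round-robin over the dataBlkNum data blocks only.
  static int dataBlockIndex(long offsetInGroup, int cellSize, int dataBlkNum) {
    return (int) ((offsetInGroup / cellSize) % dataBlkNum);
  }
}
```

With cellSize = 256 KB and an offset of cellSize * 10: cell 10 mod 6 = 4, i.e. the 5th data block, whereas taking the modulus over all 9 blocks gives 10 mod 9 = 1, the 2nd — exactly the discrepancy the comment flags.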
[jira] [Commented] (HDFS-8237) Move all protocol classes used by ClientProtocol to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527757#comment-14527757 ] Hudson commented on HDFS-8237: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7729 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7729/]) Move all protocol classes used by ClientProtocol to hdfs-client --- Key: HDFS-8237 URL: https://issues.apache.org/jira/browse/HDFS-8237 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.8.0 Attachments: HDFS-8237.000.patch, HDFS-8237.001.patch, HDFS-8237.002.patch This jira proposes to move the classes in the hdfs project referred by ClientProtocol into the hdfs-client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7758: Attachment: HDFS-7758.010.patch Rebase to the latest trunk. Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead - Key: HDFS-7758 URL: https://issues.apache.org/jira/browse/HDFS-7758 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch, HDFS-7758.002.patch, HDFS-7758.003.patch, HDFS-7758.004.patch, HDFS-7758.005.patch, HDFS-7758.006.patch, HDFS-7758.007.patch, HDFS-7758.008.patch, HDFS-7758.010.patch HDFS-7496 introduced reference-counting the volume instances being used to prevent race condition when hot swapping a volume. However, {{FsDatasetSpi#getVolumes()}} can still leak the volume instance without increasing its reference count. In this JIRA, we retire the {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} and etc. method to access {{FsVolume}}. Thus it makes sure that the consumer of {{FsVolume}} always has correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
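The reference-counting discipline this JIRA proposes — never hand out a raw volume, only a handle that holds a reference and releases it on close — can be modeled in miniature. The names below are hypothetical stand-ins for `FsVolume`/`FsDatasetSpi#getVolumeRefs()`, not the actual API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal model of ref-counted volume access (names are illustrative).
class Volume {
  final AtomicInteger refCount = new AtomicInteger();
}

// The only way to use a Volume: the handle takes a reference on construction
// and releases it on close, so try-with-resources cannot leak a count.
class VolumeRef implements AutoCloseable {
  final Volume volume;

  VolumeRef(Volume v) {
    v.refCount.incrementAndGet();
    this.volume = v;
  }

  @Override
  public void close() {
    volume.refCount.decrementAndGet();
  }
}
```

This is why retiring the raw `getVolumes()` accessor matters: with only ref-holding handles available, a hot-swap can safely wait for the count to drain to zero.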
[jira] [Updated] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7758: Attachment: (was: HDFS-7758.009.patch) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead - Key: HDFS-7758 URL: https://issues.apache.org/jira/browse/HDFS-7758 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch, HDFS-7758.002.patch, HDFS-7758.003.patch, HDFS-7758.004.patch, HDFS-7758.005.patch, HDFS-7758.006.patch, HDFS-7758.007.patch, HDFS-7758.008.patch HDFS-7496 introduced reference-counting the volume instances being used to prevent race condition when hot swapping a volume. However, {{FsDatasetSpi#getVolumes()}} can still leak the volume instance without increasing its reference count. In this JIRA, we retire the {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} and etc. method to access {{FsVolume}}. Thus it makes sure that the consumer of {{FsVolume}} always has correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7672) Erasure Coding: handle write failure for stripping coding blocks
[ https://issues.apache.org/jira/browse/HDFS-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7672: -- Attachment: h7672_20150504c.patch Thanks Jing for the detailed review. 1. I totally forgot to bump GenerationStamp. Let me do it in a separate JIRA since the patch is already big. 6. Let's keep using toString. I also added the block to it in the new patch. h7672_20150504c.patch: 2. checks streamers in closeImpl(); 3. adds back volatile; 4. removes end block for parity blocks; 5. moves writeParity out of CellBuffers; 7, 8, 9, 10. done Erasure Coding: handle write failure for stripping coding blocks Key: HDFS-7672 URL: https://issues.apache.org/jira/browse/HDFS-7672 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h7672_20150504.patch, h7672_20150504b.patch, h7672_20150504c.patch In *stripping* case, for (6, 3)-Reed-Solomon, a client writes to 6 data blocks and 3 parity blocks concurrently. We need to handle datanode or network failures when writing a EC BlockGroup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8322) Display warning if hadoop fs -ls is showing the local filesystem
[ https://issues.apache.org/jira/browse/HDFS-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527623#comment-14527623 ] Allen Wittenauer commented on HDFS-8322: The base assumption proposed in this JIRA is that the distributed file system in use is HDFS. If I'm using a distributed file system such as Lustre or Gluster or any number of other mountable distributed file systems, using hadoop fs -ls with LocalFileSystem is completely normal. So, -1 from me. Display warning if hadoop fs -ls is showing the local filesystem Key: HDFS-8322 URL: https://issues.apache.org/jira/browse/HDFS-8322 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-8322.000.patch Using {{LocalFileSystem}} is rarely the intention of running {{hadoop fs -ls}}. This JIRA proposes displaying a warning message if hadoop fs -ls is showing the local filesystem or using default fs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7758: Attachment: HDFS-7758.009.patch Updated the patch to fix the trailing whitespace. The checkstyle message complains that the total number of lines of {{FsDatasetImpl.java}} exceeds 2,000 (it is actually 3,000+). Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead - Key: HDFS-7758 URL: https://issues.apache.org/jira/browse/HDFS-7758 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch, HDFS-7758.002.patch, HDFS-7758.003.patch, HDFS-7758.004.patch, HDFS-7758.005.patch, HDFS-7758.006.patch, HDFS-7758.007.patch, HDFS-7758.008.patch, HDFS-7758.009.patch HDFS-7496 introduced reference-counting the volume instances being used to prevent race condition when hot swapping a volume. However, {{FsDatasetSpi#getVolumes()}} can still leak the volume instance without increasing its reference count. In this JIRA, we retire the {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} and etc. method to access {{FsVolume}}. Thus it makes sure that the consumer of {{FsVolume}} always has correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7397) Add more detail to the documentation for the conf key dfs.client.read.shortcircuit.streams.cache.size
[ https://issues.apache.org/jira/browse/HDFS-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527715#comment-14527715 ] Brahma Reddy Battula commented on HDFS-7397: Thanks a lot [~cmccabe] for reviewing and committing this patch.. Add more detail to the documentation for the conf key dfs.client.read.shortcircuit.streams.cache.size --- Key: HDFS-7397 URL: https://issues.apache.org/jira/browse/HDFS-7397 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Brahma Reddy Battula Priority: Minor Fix For: 2.8.0 Attachments: HDFS-7397-002.patch, HDFS-7397.patch For dfs.client.read.shortcircuit.streams.cache.size, is it in MB or KB? Interestingly, it is neither in MB nor KB. It is the number of shortcircuit streams. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
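Since the whole point of HDFS-7397 is that this key is a *count*, not a byte size, a configuration example makes the unit concrete. The value shown is illustrative; check the hdfs-default.xml shipped with your release for the actual default.

```xml
<!-- dfs.client.read.shortcircuit.streams.cache.size is the NUMBER of
     short-circuit read streams to cache, not a size in KB or MB. -->
<property>
  <name>dfs.client.read.shortcircuit.streams.cache.size</name>
  <value>256</value>
</property>
```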
[jira] [Commented] (HDFS-8322) Display warning if hadoop fs -ls is showing the local filesystem
[ https://issues.apache.org/jira/browse/HDFS-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527814#comment-14527814 ] Hadoop QA commented on HDFS-8322: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 54s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:red}-1{color} | javadoc | 9m 47s | The applied patch generated 62 additional warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 8s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 44s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | common tests | 23m 49s | Tests passed in hadoop-common. 
| | | | 61m 34s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-common | | | Format string should use %n rather than n in org.apache.hadoop.fs.shell.Ls.processRawArguments(LinkedList) At Ls.java:rather than n in org.apache.hadoop.fs.shell.Ls.processRawArguments(LinkedList) At Ls.java:[line 135] | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12730328/HDFS-8322.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 551615f | | javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/10801/artifact/patchprocess/diffJavadocWarnings.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/10801/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10801/artifact/patchprocess/testrun_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10801/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10801/console | This message was automatically generated. Display warning if hadoop fs -ls is showing the local filesystem Key: HDFS-8322 URL: https://issues.apache.org/jira/browse/HDFS-8322 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-8322.000.patch Using {{LocalFileSystem}} is rarely the intention of running {{hadoop fs -ls}}. This JIRA proposes displaying a warning message if hadoop fs -ls is showing the local filesystem or using default fs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8323) Bump GenerationStamp for write failure in DFSStripedOutputStream
Tsz Wo Nicholas Sze created HDFS-8323: - Summary: Bump GenerationStamp for write failure in DFSStripedOutputStream Key: HDFS-8323 URL: https://issues.apache.org/jira/browse/HDFS-8323 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8314) Move HdfsServerConstants#IO_FILE_BUFFER_SIZE and SMALL_BUFFER_SIZE to the users
[ https://issues.apache.org/jira/browse/HDFS-8314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527831#comment-14527831 ] Haohui Mai commented on HDFS-8314: -- Thanks for working on this. In hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java:
{code}
+  private static final int SMALL_BUFFER_SIZE;
+  static {
+    HdfsConfiguration conf = new HdfsConfiguration();
+    SMALL_BUFFER_SIZE = DFSUtil.getSmallBufferSize(conf);
+  }
+
{code}
The configuration is available through {{dfsClient.getConfiguration()}}. In hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java:
{code}
+  private final static HdfsConfiguration HDFS_CONF = new HdfsConfiguration();
+  private final static int IO_FILE_BUFFER_SIZE;
+  private final static int SMALL_BUFFER_SIZE;
+  static {
+    HdfsConfiguration hdfsConf = new HdfsConfiguration();
+    IO_FILE_BUFFER_SIZE = DFSUtil.getIoFileBufferSize(hdfsConf);
+    SMALL_BUFFER_SIZE = DFSUtil.getSmallBufferSize(hdfsConf);
+  }
+
{code}
The configuration is available through {{datanode.getConf()}}. The same applies to hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java. Move HdfsServerConstants#IO_FILE_BUFFER_SIZE and SMALL_BUFFER_SIZE to the users --- Key: HDFS-8314 URL: https://issues.apache.org/jira/browse/HDFS-8314 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Haohui Mai Assignee: Li Lu Attachments: HDFS-8314-trunk.001.patch, HDFS-8314-trunk.002.patch Currently HdfsServerConstants reads the configuration to set the values of IO_FILE_BUFFER_SIZE and SMALL_BUFFER_SIZE, thus they are configurable instead of being constants. This jira proposes to move these two variables to the users in the upper level so that HdfsServerConstants only stores constant values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
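The review point above — derive tunables from the configuration object the owner already holds instead of building a fresh `HdfsConfiguration` in a static initializer — can be illustrated in a toy form. Everything below is hypothetical (a plain `Map` stands in for Hadoop's `Configuration`, and the keys/defaults are made up for the example):

```java
import java.util.Map;

// Toy illustration of "pass the owner's configuration in" vs. reading a
// fresh configuration in a static initializer. Keys and defaults are
// invented for the example; a Map stands in for Hadoop's Configuration.
class BufferSizes {
  final int ioFileBufferSize;
  final int smallBufferSize;

  BufferSizes(Map<String, String> conf) {
    // getOrDefault plays the role of Configuration.getInt(key, default)
    this.ioFileBufferSize =
        Integer.parseInt(conf.getOrDefault("io.file.buffer.size", "4096"));
    this.smallBufferSize =
        Integer.parseInt(conf.getOrDefault("io.small.buffer.size", "512"));
  }
}
```

The instance form respects whatever configuration the caller was constructed with; the static-initializer form silently ignores it and re-reads defaults once per class load.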
[jira] [Updated] (HDFS-8219) setStoragePolicy with folder behavior is different after cluster restart
[ https://issues.apache.org/jira/browse/HDFS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8219: - Priority: Major (was: Minor) Changed Severity to *Major* because the NN is storing wrong edit info. setStoragePolicy with folder behavior is different after cluster restart Key: HDFS-8219 URL: https://issues.apache.org/jira/browse/HDFS-8219 Project: Hadoop HDFS Issue Type: Bug Reporter: Peter Shi Assignee: Xiaoyu Yao Attachments: HDFS-8219.patch, HDFS-8219.unittest-norepro.patch Reproduce steps:
1) mkdir named /temp
2) put one file A under /temp
3) change /temp storage policy to COLD
4) use -getStoragePolicy to query file A's storage policy; it is the same as /temp's
5) change the /temp folder storage policy again; file A's storage policy keeps the same as the parent folder's.
Then restart the cluster and do 3) and 4) again: file A's storage policy does not change while the parent folder's storage policy changes. The behavior is different. As I debugged, I found the code in INodeFile.getStoragePolicyID:
{code}
public byte getStoragePolicyID() {
  byte id = getLocalStoragePolicyID();
  if (id == BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
    return this.getParent() != null ?
        this.getParent().getStoragePolicyID() : id;
  }
  return id;
}
{code}
If the file does not have its own storage policy, it will use the parent's. But after a cluster restart, the file turns out to have its own storage policy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
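The fallback logic quoted above is what makes the bug visible: a child with an UNSPECIFIED local policy inherits its parent's policy at read time, so if a restart persists a concrete id onto the child, the inheritance silently stops. A minimal model (names are illustrative, not INode code):

```java
// Minimal model of the parent-fallback policy lookup described above.
class Node {
  static final byte UNSPECIFIED = 0;

  final Node parent;
  byte localPolicy = UNSPECIFIED; // concrete id once "materialized"

  Node(Node parent) {
    this.parent = parent;
  }

  // Mirrors INodeFile.getStoragePolicyID: fall back to the parent only
  // while the local policy is UNSPECIFIED.
  byte effectivePolicy() {
    if (localPolicy != UNSPECIFIED) {
      return localPolicy;
    }
    return parent != null ? parent.effectivePolicy() : UNSPECIFIED;
  }
}
```

Once `localPolicy` is set to a concrete value (as happens to the file after restart in this report), later changes to the parent's policy no longer affect the child.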