[jira] [Commented] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693024#comment-14693024 ] Hadoop QA commented on HDFS-8854: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750039/HDFS-8854-HDFS-7285-merge.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / fbf7e81 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11975/console | This message was automatically generated. > Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs > -- > > Key: HDFS-8854 > URL: https://issues.apache.org/jira/browse/HDFS-8854 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-8854-Consolidated-20150806.02.txt, > HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, > HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, > HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation
[ https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8887: -- Resolution: Fixed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Thanks again for reviewing Eddy, committed to trunk and branch-2. > Expose storage type and storage ID in BlockLocation > --- > > Key: HDFS-8887 > URL: https://issues.apache.org/jira/browse/HDFS-8887 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Fix For: 2.8.0 > > Attachments: HDFS-8887.001.patch, HDFS-8887.002.patch > > > Some applications schedule based on info like storage type or storage ID, > it'd be useful to expose this information in BlockLocation. It's already > included in LocatedBlock and sent over the wire. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
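[Editor's sketch] To make the scheduling use case concrete: a minimal illustration of how an application might prefer replicas on a given storage type once BlockLocation carries per-replica storage info. `BlockLoc` below is a local stand-in mirroring the shape of the accessors the patch adds (per-replica storage types and storage IDs); it is not the Hadoop `org.apache.hadoop.fs.BlockLocation` class itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local stand-in for the per-replica storage info HDFS-8887 exposes on
// BlockLocation; arrays are parallel to hosts, one entry per replica.
final class BlockLoc {
    final String[] hosts;
    final String[] storageTypes; // e.g. "DISK", "SSD"
    final String[] storageIds;   // DataNode storage IDs

    BlockLoc(String[] hosts, String[] storageTypes, String[] storageIds) {
        this.hosts = hosts;
        this.storageTypes = storageTypes;
        this.storageIds = storageIds;
    }
}

public class StorageAwarePlacement {
    /** Prefer hosts whose replica sits on the wanted storage type; fall back to all hosts. */
    static List<String> preferredHosts(BlockLoc loc, String wantedType) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < loc.hosts.length; i++) {
            if (wantedType.equals(loc.storageTypes[i])) {
                out.add(loc.hosts[i]);
            }
        }
        return out.isEmpty() ? Arrays.asList(loc.hosts) : out;
    }
}
```

A scheduler would obtain the real arrays from `FileSystem#getFileBlockLocations` and apply the same filtering.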
[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation
[ https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8887: -- Attachment: HDFS-8887.002.patch Whitespace-only changes patch attached. javac warnings are all because of the additional deprecation. checkstyle is because I followed code style of rest of BlockLocation. Thanks for reviewing Eddy, committing this since it's just whitespace changes. > Expose storage type and storage ID in BlockLocation > --- > > Key: HDFS-8887 > URL: https://issues.apache.org/jira/browse/HDFS-8887 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Fix For: 2.8.0 > > Attachments: HDFS-8887.001.patch, HDFS-8887.002.patch > > > Some applications schedule based on info like storage type or storage ID, > it'd be useful to expose this information in BlockLocation. It's already > included in LocatedBlock and sent over the wire. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8854: Attachment: HDFS-8854-HDFS-7285-merge.03.patch Thanks Walter for the update! +1 on the latest patch, pending Jenkins. Uploading renamed patch to trigger Jenkins on the HDFS-7285-merge branch. > Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs > -- > > Key: HDFS-8854 > URL: https://issues.apache.org/jira/browse/HDFS-8854 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-8854-Consolidated-20150806.02.txt, > HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, > HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, > HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
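[Editor's sketch] The core idea of this JIRA is to stop passing a schema and a cell size as two separate values and bundle them into one policy object. A simplified illustration of that bundling is below; class, field, and naming conventions here are illustrative only, not the committed Hadoop API.

```java
// Codec schema: which erasure code, and its data/parity layout.
final class ECSchema {
    final String codec;    // e.g. "rs" for Reed-Solomon
    final int dataUnits;   // data blocks per stripe
    final int parityUnits; // parity blocks per stripe

    ECSchema(String codec, int dataUnits, int parityUnits) {
        this.codec = codec;
        this.dataUnits = dataUnits;
        this.parityUnits = parityUnits;
    }
}

// Policy = schema + striping cell size, carried as one value instead of two.
public class ECPolicy {
    final ECSchema schema;
    final int cellSize; // striping cell size in bytes

    ECPolicy(ECSchema schema, int cellSize) {
        this.schema = schema;
        this.cellSize = cellSize;
    }

    /** One name now identifies what previously took a schema plus a cell size. */
    String policyName() {
        return schema.codec + "-" + schema.dataUnits + "-" + schema.parityUnits
                + "-" + (cellSize / 1024) + "k";
    }
}
```

The benefit is that every API that used to take `(schema, cellSize)` pairs can take a single policy reference, so the two can never drift apart.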
[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page
[ https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693005#comment-14693005 ] Akira AJISAKA commented on HDFS-8388: - bq. Is it possible to use moment.js to parse the date instead? No, moment.js cannot parse {{zz}} format. I'll check whether moment-timezone.js can parse {{zz}}. If it can parse the timezone, we can use it instead of adding a new metric. > Time and Date format need to be in sync in Namenode UI page > --- > > Key: HDFS-8388 > URL: https://issues.apache.org/jira/browse/HDFS-8388 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Minor > Attachments: HDFS-8388-002.patch, HDFS-8388-003.patch, > HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png > > > In NameNode UI Page, Date and Time FORMAT displayed on the page are not in > sync currently. > Started:Wed May 13 12:28:02 IST 2015 > Compiled:23 Apr 2015 12:22:59 > Block Deletion Start Time 13 May 2015 12:28:02 > We can keep a common format in all the above places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
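[Editor's sketch] To make the mismatch concrete: the two date strings in the report come from two different patterns, and the `zz` token produces short zone names like "IST" that are ambiguous to parse back (which is why moment.js alone cannot handle them). The patterns below are guesses reconstructed from the displayed strings, shown only to illustrate the inconsistency.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class UiDateFormats {
    // Reconstructed from the strings in the report (assumed, not taken from NN code):
    //   Started:              Wed May 13 12:28:02 IST 2015
    //   Block Deletion Start: 13 May 2015 12:28:02
    static final String STARTED_PATTERN = "EEE MMM dd HH:mm:ss zzz yyyy";
    static final String DELETION_PATTERN = "dd MMM yyyy HH:mm:ss";

    /** Format an epoch timestamp with the given pattern in UTC. */
    static String fmt(long epochMillis, String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(new Date(epochMillis));
    }
}
```

The same instant rendered by both patterns reads very differently, which is the sync problem the JIRA asks to fix.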
[jira] [Commented] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692990#comment-14692990 ] Hadoop QA commented on HDFS-8880: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 50s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 1s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 45s | The applied patch generated 1 new checkstyle issues (total was 6, now 7). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 5s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 174m 41s | Tests failed in hadoop-hdfs. 
| | | | 241m 46s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.ha.TestZKFailoverController | | | hadoop.net.TestNetUtils | | Failed build | hadoop-hdfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749988/HDFS-8880.02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3ae716f | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/diffcheckstylehadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11974/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11974/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11974/console | This message was automatically generated. > NameNode metrics logging > > > Key: HDFS-8880 > URL: https://issues.apache.org/jira/browse/HDFS-8880 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, > namenode-metrics.log > > > The NameNode can periodically log metrics to help debugging when the cluster > is not setup with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
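[Editor's sketch] The feature itself, periodically writing a metrics snapshot to a log so a cluster without external monitoring still leaves a trail, can be outlined as below. This is a generic sketch on a scheduled daemon thread; the patch's actual wiring (which metrics source it reads and which configuration key sets the period) is not claimed here.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MetricsLogger {
    /** Render one metrics snapshot as a single, sorted log line. */
    static String render(Map<String, ? extends Number> metrics) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Number> e : new TreeMap<String, Number>(metrics).entrySet()) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    /** Log a snapshot every periodSeconds on a daemon thread. */
    static ScheduledExecutorService start(Map<String, ? extends Number> metrics, long periodSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "metrics-logger");
            t.setDaemon(true); // never keep the NameNode alive just to log metrics
            return t;
        });
        ses.scheduleAtFixedRate(() -> System.out.println(render(metrics)),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return ses;
    }
}
```

A daemon thread and fixed-rate schedule keep the logger from interfering with shutdown or accumulating drift.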
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Release Note: Resubmitting NetUtils version of patch, with bugfixes. Older version of patch seems to need rebasing, but isn't breaking ZKFC, let's see if these fixes fix that (I can't replicate the break locally.) was:Resubmitting older (non-NetUtils) version of patch to see if NetUtils change is breaking ZK related tests, can't repeat locally. Status: Patch Available (was: Open) > HDFS client gets errors trying to to connect to IPv6 DataNode > - > > Key: HDFS-8078 > URL: https://issues.apache.org/jira/browse/HDFS-8078 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: Nate Edel >Assignee: Nate Edel > Labels: BB2015-05-TBR, ipv6 > Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, > HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.14.patch, HDFS-8078.9.patch > > > 1st exception, on put: > 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: 2401:db00:1010:70ba:face:0:8:0:50010 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) > Appears to actually stem from code in DataNodeID which assumes it's safe to > append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for > IPv6. 
NetUtils.createSocketAddr( ) assembles a Java URI object, which > requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 > Currently using InetAddress.getByName() to validate IPv6 (guava > InetAddresses.forString has been flaky) but could also use our own parsing. > (From logging this, it seems like a low-enough frequency call that the extra > object creation shouldn't be problematic, and for me the slight risk of > passing in bad input that is not actually an IPv4 or IPv6 address and thus > calling an external DNS lookup is outweighed by getting the address > normalized and avoiding rewriting parsing.) > Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() > --- > 2nd exception (on datanode) > 15/04/13 13:18:07 ERROR datanode.DataNode: > dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown > operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: > /2401:db00:11:d010:face:0:2f:0:50010 > java.io.EOFException > at java.io.DataInputStream.readShort(DataInputStream.java:315) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) > Which also comes as client error "-get: 2401 is not an IP string literal." > This one has existing parsing logic which needs to shift to the last colon > rather than the first. Should also be a tiny bit faster by using lastIndexOf > rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
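[Editor's sketch] The two fixes the description proposes, bracketing IPv6 literals before building a URI authority, and splitting host from port at the last colon rather than the first, can be sketched as below. This is an illustration of the parsing logic, not the patch's actual NetUtils/DataNodeID code; helper names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HostPort {
    /** Join an IP literal and port. IPv6 literals must be bracketed,
     *  since "ipaddr + \":\" + port" is only unambiguous for IPv4. */
    static String join(String ip, int port) {
        return ip.contains(":") ? "[" + ip + "]:" + port : ip + ":" + port;
    }

    /** Port is everything after the LAST colon (the first colon is inside an IPv6 address). */
    static int port(String authority) {
        return Integer.parseInt(authority.substring(authority.lastIndexOf(':') + 1));
    }

    /** Host is everything before the last colon, with URI brackets stripped. */
    static String host(String authority) {
        String h = authority.substring(0, authority.lastIndexOf(':'));
        return (h.startsWith("[") && h.endsWith("]")) ? h.substring(1, h.length() - 1) : h;
    }

    /** The bracketed form is what java.net.URI requires for an IPv6 authority. */
    static int uriPort(String scheme, String ip, int port) throws URISyntaxException {
        return new URI(scheme + "://" + join(ip, port)).getPort();
    }
}
```

With the address from the stack trace, the unbracketed form is exactly what makes `NetUtils.createSocketAddr()` reject the authority, while the bracketed form parses cleanly.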
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Attachment: HDFS-8078.14.patch > HDFS client gets errors trying to to connect to IPv6 DataNode > - > > Key: HDFS-8078 > URL: https://issues.apache.org/jira/browse/HDFS-8078 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: Nate Edel >Assignee: Nate Edel > Labels: BB2015-05-TBR, ipv6 > Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, > HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.14.patch, HDFS-8078.9.patch > > > 1st exception, on put: > 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: 2401:db00:1010:70ba:face:0:8:0:50010 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) > Appears to actually stem from code in DataNodeID which assumes it's safe to > append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for > IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which > requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 > Currently using InetAddress.getByName() to validate IPv6 (guava > InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra > object creation shouldn't be problematic, and for me the slight risk of > passing in bad input that is not actually an IPv4 or IPv6 address and thus > calling an external DNS lookup is outweighed by getting the address > normalized and avoiding rewriting parsing.) > Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() > --- > 2nd exception (on datanode) > 15/04/13 13:18:07 ERROR datanode.DataNode: > dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown > operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: > /2401:db00:11:d010:face:0:2f:0:50010 > java.io.EOFException > at java.io.DataInputStream.readShort(DataInputStream.java:315) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) > Which also comes as client error "-get: 2401 is not an IP string literal." > This one has existing parsing logic which needs to shift to the last colon > rather than the first. Should also be a tiny bit faster by using lastIndexOf > rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Open (was: Patch Available) > HDFS client gets errors trying to to connect to IPv6 DataNode > - > > Key: HDFS-8078 > URL: https://issues.apache.org/jira/browse/HDFS-8078 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: Nate Edel >Assignee: Nate Edel > Labels: BB2015-05-TBR, ipv6 > Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, > HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch > > > 1st exception, on put: > 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: 2401:db00:1010:70ba:face:0:8:0:50010 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) > Appears to actually stem from code in DataNodeID which assumes it's safe to > append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for > IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which > requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 > Currently using InetAddress.getByName() to validate IPv6 (guava > InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra > object creation shouldn't be problematic, and for me the slight risk of > passing in bad input that is not actually an IPv4 or IPv6 address and thus > calling an external DNS lookup is outweighed by getting the address > normalized and avoiding rewriting parsing.) > Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() > --- > 2nd exception (on datanode) > 15/04/13 13:18:07 ERROR datanode.DataNode: > dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown > operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: > /2401:db00:11:d010:face:0:2f:0:50010 > java.io.EOFException > at java.io.DataInputStream.readShort(DataInputStream.java:315) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) > Which also comes as client error "-get: 2401 is not an IP string literal." > This one has existing parsing logic which needs to shift to the last colon > rather than the first. Should also be a tiny bit faster by using lastIndexOf > rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8808: Attachment: HDFS-8808-02.patch The failed tests are unrelated and all pass locally. Updating patch to fix whitespace and checkstyle issues. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, > HDFS-8808-02.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
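[Editor's note] A bit of arithmetic shows why inheriting the checkpoint throttle hurts bootstrapStandby: the minimum transfer time grows linearly with image size once a cap is set. The figures below are illustrative, assuming a hypothetical 30 GB fsimage and a 1 MB/s limit; 0 is treated as "unthrottled", matching the common convention for such settings.

```java
public class ImageTransferTime {
    /** Minimum seconds to move `bytes` under a `bytesPerSec` cap (0 means unthrottled). */
    static long minSeconds(long bytes, long bytesPerSec) {
        if (bytesPerSec <= 0) {
            return 0; // no throttle configured; bounded only by network/disk
        }
        return (bytes + bytesPerSec - 1) / bytesPerSec; // ceiling division
    }
}
```

A 30 GB image at 1 MB/s needs 30720 seconds, over 8.5 hours, for a step that should be a quick one-off during an upgrade, which is the motivation for exempting `-bootstrapStandby` from the setting.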
[jira] [Created] (HDFS-8889) Erasure Coding: cover more test situations of datanode failure during client writing
Li Bo created HDFS-8889: --- Summary: Erasure Coding: cover more test situations of datanode failure during client writing Key: HDFS-8889 URL: https://issues.apache.org/jira/browse/HDFS-8889 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8118) Delay in checkpointing Trash can leave trash for 2 intervals before deleting
[ https://issues.apache.org/jira/browse/HDFS-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Casey Brotherton updated HDFS-8118: --- Attachment: HDFS-8118.001.patch This is a simplified patch addressing only the defect, and not the testcases. > Delay in checkpointing Trash can leave trash for 2 intervals before deleting > > > Key: HDFS-8118 > URL: https://issues.apache.org/jira/browse/HDFS-8118 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Casey Brotherton >Assignee: Casey Brotherton >Priority: Trivial > Attachments: HDFS-8118.001.patch, HDFS-8118.patch > > > When the fs.trash.checkpoint.interval and the fs.trash.interval are set > non-zero and the same, it is possible for trash to be left for two intervals. > The TrashPolicyDefault will use a floor and ceiling function to ensure that > the Trash will be checkpointed every "interval" of minutes. > Each user's trash is checkpointed individually. The time resolution of the > checkpoint timestamp is to the second. > If the seconds switch while one user is checkpointing, then the next user's > timestamp will be later. > This will cause the next user's checkpoint to not be deleted at the next > interval. > I have recreated this in a lab cluster > I also have a suggestion for a patch that I can upload later tonight after > testing it further. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
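[Editor's sketch] The timing defect is easiest to see with the floor/ceiling rounding itself. The helpers below illustrate the interval-rounding logic the description refers to (they are a simplification, not the actual TrashPolicyDefault code): if the wall clock ticks over one second while users are checkpointed in sequence, two users' timestamps round to interval boundaries a whole interval apart, so the later user's checkpoint survives an extra interval.

```java
public class TrashCheckpointTimes {
    /** Round timeMs down to a multiple of intervalMs. */
    static long floorTo(long timeMs, long intervalMs) {
        return (timeMs / intervalMs) * intervalMs;
    }

    /** Round timeMs up to a multiple of intervalMs. */
    static long ceilTo(long timeMs, long intervalMs) {
        return floorTo(timeMs + intervalMs - 1, intervalMs);
    }
}
```

With a 60-second interval, user A checkpointed at t = 59.999 s floors to 0 while user B checkpointed one millisecond later floors to 60 s: one tick of drift, one full extra interval of retained trash.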
[jira] [Updated] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-8879: - Target Version/s: 2.7.2 > Quota by storage type usage incorrectly initialized upon namenode restart > - > > Key: HDFS-8879 > URL: https://issues.apache.org/jira/browse/HDFS-8879 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Kihwal Lee >Assignee: Xiaoyu Yao > Attachments: HDFS-8879.01.patch > > > This was found by [~kihwal] as part of HDFS-8865 work in this > [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. > The unit test > testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit > failed to detect this because they were using an obsolete > FsDirectory instance. Once added the highlighted line below, the issue can be > reproed. > {code} > >fsdir = cluster.getNamesystem().getFSDirectory(); > INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
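[Editor's sketch] The test pitfall behind the missed detection, asserting through a handle captured before a restart, generalizes beyond HDFS. The toy classes below illustrate it in isolation (generic names, not the MiniDFSCluster/FSDirectory API): after a restart replaces the directory object, only a re-fetched handle observes the rebuilt state, which is exactly why the highlighted line re-fetching `getFSDirectory()` exposes the bug.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy stand-ins: a "namesystem" whose restart swaps in a fresh directory.
final class Directory {
    final AtomicLong ssdQuotaUsed = new AtomicLong();
}

final class Namesystem {
    private Directory dir = new Directory();
    Directory getDirectory() { return dir; }
    void restart() { dir = new Directory(); } // restart rebuilds the directory
}

public class StaleHandleDemo {
    /** Returns true iff a pre-restart handle can see post-restart state (it cannot). */
    static boolean staleHandleSeesUpdate() {
        Namesystem ns = new Namesystem();
        Directory before = ns.getDirectory(); // handle captured before restart
        ns.restart();
        ns.getDirectory().ssdQuotaUsed.set(42); // state written via the fresh handle
        return before.ssdQuotaUsed.get() == 42; // stale handle still reads the old object
    }
}
```

The fix for the test is simply to re-fetch every handle after the restart before asserting on it.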
[jira] [Commented] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692936#comment-14692936 ] Hadoop QA commented on HDFS-8078: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 45s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 13s | The applied patch generated 12 new checkstyle issues (total was 0, now 12). | | {color:red}-1{color} | checkstyle | 1m 55s | The applied patch generated 4 new checkstyle issues (total was 0, now 4). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 0s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 28s | Tests passed in hadoop-hdfs-client. 
| | | | 220m 23s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749997/HDFS-8078.13.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3ae716f | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11972/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11972/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11972/console | This message was automatically generated. 
> HDFS client gets errors trying to to connect to IPv6 DataNode > - > > Key: HDFS-8078 > URL: https://issues.apache.org/jira/browse/HDFS-8078 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: Nate Edel >Assignee: Nate Edel > Labels: BB2015-05-TBR, ipv6 > Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, > HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch > > > 1st exception, on put: > 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: 2401:db00:1010:70ba:face:0:8:0:50010 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) > Appears to actually stem from code in DataNodeID which assumes it's safe to > append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for > IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which > requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 > Currently using InetAddress.getByName() to validate IPv6 (guava > InetAddresses.forString has been
[jira] [Updated] (HDFS-8870) Lease is leaked on write failure
[ https://issues.apache.org/jira/browse/HDFS-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-8870: -- Target Version/s: 2.7.2, 2.6.2 (was: 2.6.1, 2.7.2) > Lease is leaked on write failure > > > Key: HDFS-8870 > URL: https://issues.apache.org/jira/browse/HDFS-8870 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Daryn Sharp > > Creating this ticket on behalf of [~daryn] > We've seen this in our of our cluster. When a long running process has a > write failure, the lease is leaked and gets renewed until the token is > expired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692908#comment-14692908 ] Xiaoyu Yao commented on HDFS-8879: -- Thanks [~arpit99] for the review. I will hold off commit until tomorrow in case [~kihwal] has additional feedback. > Quota by storage type usage incorrectly initialized upon namenode restart > - > > Key: HDFS-8879 > URL: https://issues.apache.org/jira/browse/HDFS-8879 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Kihwal Lee >Assignee: Xiaoyu Yao > Attachments: HDFS-8879.01.patch > > > This was found by [~kihwal] as part of HDFS-8865 work in this > [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. > The unit test > testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit > failed to detect this because they were using an obsolete > FsDirectory instance. Once added the highlighted line below, the issue can be > reproed. > {code} > >fsdir = cluster.getNamesystem().getFSDirectory(); > INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
[ https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692905#comment-14692905 ] Jagadesh Kiran N commented on HDFS-8886: http://zutai.blogspot.in/2014/06/build-install-and-run-hadoop-24-240-on.html?showComment=1422091525887#c2264594416650430988 > Not able to build with 'mvn compile -Pnative' > - > > Key: HDFS-8886 > URL: https://issues.apache.org/jira/browse/HDFS-8886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Puneeth P > > I am running into a problem where i am not able to compile the native parts > of hadoop-hdfs project. the problem is that it is not finding MakeFile in > ${project.build.dir}/native. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
[ https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692904#comment-14692904 ] Jagadesh Kiran N commented on HDFS-8886: Hi, you can refer to this > Not able to build with 'mvn compile -Pnative' > - > > Key: HDFS-8886 > URL: https://issues.apache.org/jira/browse/HDFS-8886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Puneeth P > > I am running into a problem where i am not able to compile the native parts > of hadoop-hdfs project. the problem is that it is not finding MakeFile in > ${project.build.dir}/native. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8887) Expose storage type and storage ID in BlockLocation
[ https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692892#comment-14692892 ] Hadoop QA commented on HDFS-8887: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 54s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:red}-1{color} | javac | 7m 47s | The applied patch generated 21 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 46s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 32s | The applied patch generated 4 new checkstyle issues (total was 25, now 29). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 6m 15s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 23s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 172m 47s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 27s | Tests passed in hadoop-hdfs-client. 
| | | | 245m 56s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.net.TestNetUtils | | | hadoop.ha.TestZKFailoverController | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749983/HDFS-8887.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7c796fd | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/diffJavacWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/diffcheckstylehadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11970/console | This message was automatically generated. > Expose storage type and storage ID in BlockLocation > --- > > Key: HDFS-8887 > URL: https://issues.apache.org/jira/browse/HDFS-8887 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: HDFS-8887.001.patch > > > Some applications schedule based on info like storage type or storage ID, > it'd be useful to expose this information in BlockLocation. 
It's already > included in LocatedBlock and sent over the wire. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692888#comment-14692888 ] Hadoop QA commented on HDFS-8826: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 3s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 49s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 37s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 19s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 10s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 171m 56s | Tests failed in hadoop-hdfs. 
| | | | 236m 54s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.net.TestNetUtils | | | hadoop.ha.TestZKFailoverController | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749979/h8826_20150811.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7c796fd | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11971/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11971/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11971/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11971/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11971/console | This message was automatically generated. > Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h8826_20150811.patch > > > Balancer is inefficient in the following case: > || Datanode || Utilization || Rack || > | D1 | 95% | A | > | D2 | 30% | B | > | D3, D4, D5 | 0% | B | > The average utilization is 25% so that D2 is within 10% threshold. However, > Balancer currently will first move blocks from D2 to D3, D4 and D5 since they > are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
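The arithmetic behind the example above can be sketched as follows. This is a hypothetical simplification (class and method names are made up for illustration, not the actual Balancer code, which also weighs rack locality): with the default 10% threshold, only D1 is over-utilized, yet intra-rack moves from D2 are still chosen first.

```java
public class BalancerThresholdSketch {
    // Over-utilization test as used conceptually by the Balancer
    // (hypothetical simplification; not the real o.a.h.hdfs.server.balancer code).
    static boolean isOverUtilized(double utilizationPercent, double avgPercent,
                                  double thresholdPercent) {
        return utilizationPercent > avgPercent + thresholdPercent;
    }

    public static void main(String[] args) {
        double[] utils = {95, 30, 0, 0, 0};  // D1..D5 from the table above
        double avg = 0;
        for (double u : utils) avg += u;
        avg /= utils.length;  // average utilization: 25.0
        // D2 (30%) is within [avg - 10, avg + 10], so it is not a node the
        // balancer must drain; only D1 (95%) is over-utilized. The reported
        // inefficiency is that blocks are still moved off D2 first because
        // D3..D5 share its rack.
        System.out.println(avg + " " + isOverUtilized(95, avg, 10)
                + " " + isOverUtilized(30, avg, 10));
    }
}
```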
[jira] [Comment Edited] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870 ] Yi Liu edited comment on HDFS-8863 at 8/12/15 3:49 AM: --- {quote} What if we let it check against storage type level sum and also make sure there is at least one storage with enough space? {quote} There is still a potential issue. For example, say we have datanode dn0 with three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. We just scheduled one block on dn0, which certainly lands on s2. Now a new block is being added and block placement checks dn0: with the current patch, it sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the sum is also satisfied, so it treats dn0 as a good target, but actually it is not. I am thinking we can do the following: sum at the storage type level, but for each storage count only the remaining space in whole multiples of the block size. For the example above, the counted remaining space of s1 and s3 is 0 and that of s2 is 1, so the sum is 1 and dn0 is not a good target. In this approach, we don't need to check the maximum either. {quote} Datanodes only care about the storage type, so checking a particular storage won't do any good. It will just cause block placement to re-pick targets more. {quote} You are right. I also had another point: when iterating storages to check the remaining space of a storage type, some storages may be {{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is still counted, right? So I think you can do these checks in {{getRemaining}}. See my JIRA HDFS-8884, which is related to this; it does a fast-fail check for the datanode. Of course, I can do this part in my JIRA if you don't do it here. > The remaining space check in BlockPlacementPolicyDefault is flawed > - > > Key: HDFS-8863 > URL: https://issues.apache.org/jira/browse/HDFS-8863 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Labels: 2.6.1-candidate > Attachments: HDFS-8863.patch, HDFS-8863.v2.patch > > > The block placement policy calls > {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block > is going to fit. Since the method is adding up all remaining spaces, the namenode > can allocate a new block on a full node. 
This causes pipeline construction > failure and {{abandonBlock}}. If the cluster is nearly full, the client might > hit this multiple times and the write can fail permanently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870 ] Yi Liu edited comment on HDFS-8863 at 8/12/15 3:50 AM: --- {quote} What if we let it check against storage type level sum and also make sure there is at least one storage with enough space? {quote} There is still a potential issue. For example, say we have datanode dn0 with three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. We just scheduled one block on dn0, which certainly lands on s2. Now a new block is being added and block placement checks dn0: with the current patch, it sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the sum is also satisfied, so it treats dn0 as a good target, but actually it is not. I am thinking we can do the following: sum at the storage type level, but for each storage count only the remaining space in whole multiples of the block size. For the example above, the counted remaining space of s1 and s3 is 0 and that of s2 is 1, so the sum is 1 and dn0 is not a good target. In this approach, we don't need to check the maximum either. {quote} Datanodes only care about the storage type, so checking a particular storage won't do any good. It will just cause block placement to re-pick targets more. {quote} You are right. I also had another point: when iterating storages to check the remaining space of a storage type, some storages may be {{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is still counted, right? So I think you can do these checks in {{getRemaining}}. See my JIRA HDFS-8884, which is related to this; it does a fast-fail check for the datanode. Of course, I can do this part in my JIRA if you don't do it here. > The remaining space check in BlockPlacementPolicyDefault is flawed > - > > Key: HDFS-8863 > URL: https://issues.apache.org/jira/browse/HDFS-8863 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Labels: 2.6.1-candidate > Attachments: HDFS-8863.patch, HDFS-8863.v2.patch > > > The block placement policy calls > {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block > is going to fit. 
Since the method is adding up all remaining spaces, namenode > can allocate a new block on a full node. This causes pipeline construction > failure and {{abandonBlock}}. If the cluster is nearly full, the client might > hit this multiple times and the write can fail permanently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870 ] Yi Liu commented on HDFS-8863: -- {quote} What if we let it check against storage type level sum and also make sure there is at least one storage with enough space? {quote} There is still a potential issue. For example, say we have datanode dn0 with three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 2/3 of a block size of remaining space, and s2 has 1+2/3 block sizes remaining. We just scheduled one block on dn0, which certainly lands on s2. Now a new block is being added and block placement checks dn0: with the current patch, it sees that the maximum remaining space is 1+2/3 block sizes (s2) and that the sum is also satisfied, so it treats dn0 as a good target, but actually it is not. I am thinking we can do the following: sum at the storage type level, but for each storage count only the remaining space in whole multiples of the block size. For the example above, the counted remaining space of s1 and s3 is 0 and that of s2 is 1, so the sum is 1 and dn0 is not a good target. {quote} Datanodes only care about the storage type, so checking a particular storage won't do any good. It will just cause block placement to re-pick targets more. {quote} You are right. I also had another point: when iterating storages to check the remaining space of a storage type, some storages may be {{State.FAILED}} or {{State.READ_ONLY_SHARED}}, yet their remaining space is still counted, right? So I think you can do these checks in {{getRemaining}}. See my JIRA HDFS-8884, which is related to this; it does a fast-fail check for the datanode. Of course, I can do this part in my JIRA if you don't do it here. 
> The remaining space check in BlockPlacementPolicyDefault is flawed > - > > Key: HDFS-8863 > URL: https://issues.apache.org/jira/browse/HDFS-8863 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Labels: 2.6.1-candidate > Attachments: HDFS-8863.patch, HDFS-8863.v2.patch > > > The block placement policy calls > {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block > is going to fit. Since the method is adding up all remaining spaces, the namenode > can allocate a new block on a full node. This causes pipeline construction > failure and {{abandonBlock}}. If the cluster is nearly full, the client might > hit this multiple times and the write can fail permanently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
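The counting rule proposed in the comment above can be sketched as follows. The method and class names are hypothetical, not the actual BlockPlacementPolicyDefault code: each storage contributes only whole multiples of the block size, so fractional tails that cannot hold a block are ignored.

```java
import java.util.List;

public class RemainingSpaceSketch {
    // Sum per-storage remaining space, counting only whole block-size multiples
    // (a sketch of the rule proposed in the comment; names are made up).
    static long usableRemaining(List<Long> perStorageRemaining, long blockSize) {
        long sum = 0;
        for (long remaining : perStorageRemaining) {
            sum += (remaining / blockSize) * blockSize;  // drop fractional tail
        }
        return sum;
    }

    public static void main(String[] args) {
        // The example from the comment with blockSize = 3 units: s1 and s3 each
        // have 2/3 of a block (2 units), s2 has 1 + 2/3 blocks (5 units).
        long usable = usableRemaining(List.of(2L, 5L, 2L), 3L);
        // Only s2 contributes one whole block, so the one block already
        // scheduled on dn0 exhausts it and dn0 is not a good target.
        System.out.println(usable);  // 3 units, i.e. exactly one block
    }
}
```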
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692863#comment-14692863 ] Arpit Agarwal commented on HDFS-8879: - +1 The test failures look unrelated. > Quota by storage type usage incorrectly initialized upon namenode restart > - > > Key: HDFS-8879 > URL: https://issues.apache.org/jira/browse/HDFS-8879 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Kihwal Lee >Assignee: Xiaoyu Yao > Attachments: HDFS-8879.01.patch > > > This was found by [~kihwal] as part of the HDFS-8865 work in this > [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. > The unit tests > testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit > failed to detect this because they were using an obsolete > FsDirectory instance. Once the highlighted line below is added, the issue can be > reproduced. > {code} > >fsdir = cluster.getNamesystem().getFSDirectory(); > INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692857#comment-14692857 ] Hadoop QA commented on HDFS-8808: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 33s | The applied patch generated 4 new checkstyle issues (total was 574, now 578). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 51s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 42s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 13s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 70m 41s | Tests failed in hadoop-hdfs. 
| | | | 116m 45s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.hdfs.TestFileCreationDelete | | | org.apache.hadoop.hdfs.TestDFSClientExcludedNodes | | | org.apache.hadoop.hdfs.TestGetBlocks | | | org.apache.hadoop.hdfs.TestRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749985/HDFS-8808-01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3ae716f | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11973/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11973/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11973/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11973/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11973/console | This message was automatically generated. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. 
The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
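For reference, the throttle under discussion is configured in hdfs-site.xml. The value below is only an illustration, not a recommendation; a value of 0 disables throttling:

```xml
<!-- hdfs-site.xml: limits the bandwidth of fsimage transfers between
     namenodes. As described above, it currently also throttles
     -bootstrapStandby as a side effect. -->
<property>
  <name>dfs.image.transfer.bandwidthPerSec</name>
  <value>4194304</value> <!-- example: 4 MB/s; 0 means no throttling -->
</property>
```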
[jira] [Updated] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8854: Attachment: HDFS-8854-HDFS-7285.03.patch HDFS-8854-HDFS-7285-merge.03.txt bq. Since ErasureCodingPolicy has cellSize, can we avoid separate variable cellSize in DFSStripedInputStream.java, DFSStripedOutputStream.java classes. It's ok to keep the separate variable because cellSize is involved in calculations. Uploaded the 03 patch, which addresses all the other issues mentioned above. Thanks again [~rakeshr] & [~zhz] > Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs > -- > > Key: HDFS-8854 > URL: https://issues.apache.org/jira/browse/HDFS-8854 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-8854-Consolidated-20150806.02.txt, > HDFS-8854-HDFS-7285-merge.03.txt, HDFS-8854-HDFS-7285.00.patch, > HDFS-8854-HDFS-7285.01.patch, HDFS-8854-HDFS-7285.02.patch, > HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8859: - Attachment: HDFS-8859.003.patch > Improve DataNode (ReplicaMap) memory footprint to save about 45% > > > Key: HDFS-8859 > URL: https://issues.apache.org/jira/browse/HDFS-8859 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yi Liu >Assignee: Yi Liu >Priority: Critical > Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, > HDFS-8859.003.patch > > > By using the following approach we can save about *45%* of the memory footprint for each > block replica in DataNode memory (this JIRA only talks about the *ReplicaMap* in > the DataNode); the details are: > In ReplicaMap, > {code} > private final Map<String, Map<Long, ReplicaInfo>> map = > new HashMap<String, Map<Long, ReplicaInfo>>(); > {code} > Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas > in memory. The key is the block id of the block replica, which is already > included in {{ReplicaInfo}}, so this memory can be saved. A HashMap Entry also > has an object overhead. We can implement a lightweight set which is similar > to {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed > size for the entries array, usually a big value, an example being > {{BlocksMap}}; this avoids full GC since there is no need to resize). We > should also still be able to get an element by its key. 
> Following is a comparison of the memory footprint if we implement a lightweight set > as described: > We can save: > {noformat} > SIZE (bytes)  ITEM > 20            The key: Long (12 bytes object overhead + 8 > bytes long) > 12            HashMap Entry object overhead > 4             reference to the key in Entry > 4             reference to the value in Entry > 4             hash in Entry > {noformat} > Total: -44 bytes > We need to add: > {noformat} > SIZE (bytes)  ITEM > 4             a reference to the next element in ReplicaInfo > {noformat} > Total: +4 bytes > So in total we can save 40 bytes for each block replica. > Currently one finalized replica needs around 46 bytes (note: we ignore > memory alignment here). > We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block replica > in the DataNode. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
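The savings claim in the description can be checked with a little arithmetic. The figures below are taken from the description itself (alignment ignored); the class is only an illustration, not part of the patch:

```java
public class ReplicaMapFootprint {
    // Per-replica byte accounting from the issue description (approximate,
    // memory alignment ignored): HashMap entry cost vs. an embedded-set link.
    static double fractionSaved() {
        int saved = 20   // boxed Long key (12-byte header + 8-byte long)
                  + 12   // HashMap.Entry object overhead
                  + 4    // reference to the key
                  + 4    // reference to the value
                  + 4;   // cached hash field -> 44 bytes total
        int added = 4;        // next-element reference inside ReplicaInfo
        int replicaInfo = 46; // approximate size of a finalized replica
        return 1.0 - (double) (added + replicaInfo) / (saved + replicaInfo);
    }

    public static void main(String[] args) {
        // 1 - 50/90 = 0.444..., i.e. roughly the ~45% quoted above.
        System.out.println(fractionSaved());
    }
}
```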
[jira] [Commented] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692809#comment-14692809 ] Yi Liu commented on HDFS-8859: -- Thanks [~szetszwo] for the review. For your first question, yes; another small difference is that {{LightWeightHashGSet}} needs to implement {{public Collection<E> values()}} like Java's HashMap, so I now add it as an interface method of {{GSet}}. For your second comment, you are right, it is better to make LightWeightHashGSet extend LightWeightGSet; I do that in the new patch. Actually, when I made the first patch I had considered making LightWeightHashGSet extend LightWeightGSet, but at that time I planned to support shrinking later, where more of the logic may differ, so I kept them independent. But I agree we should extend even so. > Improve DataNode (ReplicaMap) memory footprint to save about 45% > > > Key: HDFS-8859 > URL: https://issues.apache.org/jira/browse/HDFS-8859 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yi Liu >Assignee: Yi Liu >Priority: Critical > Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch > > > By using the following approach we can save about *45%* of the memory footprint for each > block replica in DataNode memory (this JIRA only talks about the *ReplicaMap* in > the DataNode); the details are: > In ReplicaMap, > {code} > private final Map<String, Map<Long, ReplicaInfo>> map = > new HashMap<String, Map<Long, ReplicaInfo>>(); > {code} > Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas > in memory. The key is the block id of the block replica, which is already > included in {{ReplicaInfo}}, so this memory can be saved. A HashMap Entry also > has an object overhead. We can implement a lightweight set which is similar > to {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed > size for the entries array, usually a big value, an example being > {{BlocksMap}}; this avoids full GC since there is no need to resize). We > should also still be able to get an element by its key. 
> Following is a comparison of the memory footprint if we implement a lightweight set > as described: > We can save: > {noformat} > SIZE (bytes)  ITEM > 20            The key: Long (12 bytes object overhead + 8 > bytes long) > 12            HashMap Entry object overhead > 4             reference to the key in Entry > 4             reference to the value in Entry > 4             hash in Entry > {noformat} > Total: -44 bytes > We need to add: > {noformat} > SIZE (bytes)  ITEM > 4             a reference to the next element in ReplicaInfo > {noformat} > Total: +4 bytes > So in total we can save 40 bytes for each block replica. > Currently one finalized replica needs around 46 bytes (note: we ignore > memory alignment here). > We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block replica > in the DataNode. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692805#comment-14692805 ] Bob Hansen commented on HDFS-8855: -- Do 2200 DN->NN connections seem a bit... excessive... for 50 concurrent reads? If you set the concurrent_reads environment variable to 500, do you end up with 22000 connections (and start running the NN out of ports very quickly)? If the load scales up linearly with the cluster size (a process on each node reading 50 files), will your NN run out of ports and fail? > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs > Environment: HDP 2.2 >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to reach ~25000 active connections and > fail. > It appears that each request to the webhdfs client opens a new > connection to the NameNode and keeps it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692803#comment-14692803 ] Ajith S commented on HDFS-8808: --- +1 > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692787#comment-14692787 ] Hadoop QA commented on HDFS-8823: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 12s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 21s | The applied patch generated 12 new checkstyle issues (total was 654, now 660). | | {color:green}+1{color} | whitespace | 0m 11s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 34s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 2s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 192m 45s | Tests failed in hadoop-hdfs. 
| | | | 236m 42s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestBlockStoragePolicy | | | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotReplication | | | hadoop.hdfs.server.namenode.ha.TestDNFencing | | | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication | | | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | | | hadoop.hdfs.server.namenode.TestProcessCorruptBlocks | | | hadoop.hdfs.TestSetrepDecreasing | | | hadoop.hdfs.server.namenode.TestCacheDirectives | | Timed out tests | org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749951/HDFS-8823.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7c796fd | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11968/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11968/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11968/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11968/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11968/console | This message was automatically generated. 
> Move replication factor into individual blocks > -- > > Key: HDFS-8823 > URL: https://issues.apache.org/jira/browse/HDFS-8823 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, > HDFS-8823.002.patch > > > This jira proposes to record the replication factor in the {{BlockInfo}} > class. The changes have two advantages: > * Decoupling the namespace and the block management layer. It is a > prerequisite step to move block management off the heap or to a separate > process. > * Increased flexibility on replicating blocks. Currently the replication > factors of all blocks in a file have to be the same, equal to the highest > replication factor across all snapshots. The > changes will allow blocks in a file to have different replication factors, > potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8824) Do not use small blocks for balancing the cluster
[ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692764#comment-14692764 ] Hadoop QA commented on HDFS-8824: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 27s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 44s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 52s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 23s | The applied patch generated 5 new checkstyle issues (total was 523, now 525). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 7s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 90m 5s | Tests failed in hadoop-hdfs. 
| | | | 134m 31s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.TestLeaseRecovery2 | | Timed out tests | org.apache.hadoop.hdfs.TestListFilesInFileContext | | | org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749952/h8824_20150811b.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7c796fd | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11969/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11969/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11969/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11969/console | This message was automatically generated. > Do not use small blocks for balancing the cluster > - > > Key: HDFS-8824 > URL: https://issues.apache.org/jira/browse/HDFS-8824 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h8824_20150727b.patch, h8824_20150811b.patch > > > Balancer gets datanode block lists from NN and then move the blocks in order > to balance the cluster. It should not use the blocks with small size since > moving the small blocks generates a lot of overhead and the small blocks do > not help balancing the cluster much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
[ https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated HDFS-7916: -- Labels: (was: 2.6.1-candidate) I earlier removed HDFS-7916 from the 2.6.1-candidate list given HDFS-7704 was only in 2.7.0. [~ctrezzo] added it back and so it appeared in my lists. I removed the label again. Chris, please comment on the mailing lists as to why you added it back. If you want it included, please comment there and we can add it after we figure out the why and the dependent tickets. > 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for > infinite loop > -- > > Key: HDFS-7916 > URL: https://issues.apache.org/jira/browse/HDFS-7916 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.0 >Reporter: Vinayakumar B >Assignee: Rushabh S Shah >Priority: Critical > Fix For: 2.7.1 > > Attachments: HDFS-7916-01.patch, HDFS-7916-1.patch > > > If any bad block is found, the BPSA for the StandbyNode will retry > reporting it indefinitely. > {noformat}2015-03-11 19:43:41,528 WARN > org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block > BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: > stobdtserver3/10.224.54.70:18010 > org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed > to report bad block > BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: > at > org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692658#comment-14692658 ] Hadoop QA commented on HDFS-6407: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 15s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 0s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 3m 7s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 1s | Tests failed in hadoop-hdfs. 
| | | | 213m 46s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.web.TestWebHDFS | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749927/HDFS-6407.011.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / 7c796fd | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11966/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11966/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11966/console | This message was automatically generated. > new namenode UI, lost ability to sort columns in datanode tab > - > > Key: HDFS-6407 > URL: https://issues.apache.org/jira/browse/HDFS-6407 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Nathan Roberts >Assignee: Haohui Mai >Priority: Critical > Labels: BB2015-05-TBR > Attachments: 002-datanodes-sorted-capacityUsed.png, > 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, > HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, > HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, > HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, > HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting > 2.png, sorting table.png > > > old ui supported clicking on column header to sort on that column. The new ui > seems to have dropped this very useful feature. > There are a few tables in the Namenode UI to display datanodes information, > directory listings and snapshots. 
> When there are many items in the tables, it is useful to have the ability to sort > on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692650#comment-14692650 ] Xiaobing Zhou commented on HDFS-8855: - Tested the HDFS-7597 patch; it’s working well for HDFS-8855. On my 3 nodes of local VMs, the number of ESTABLISHED connections varies from 1400 to 2200 while the load generator is running. The code path is different in the two cases; the HDFS-8855 case goes through the cache in org.apache.hadoop.ipc.connection. Let’s investigate that cache. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs > Environment: HDP 2.2 >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fail. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Release Note: Resubmitting older (non-NetUtils) version of patch to see if NetUtils change is breaking ZK related tests, can't repeat locally. (was: Fix one checkstyle bug, and found a few more tests that depended on treating ipaddress as a null string. Probably should fix the tests, but avoiding breaking on null input is OK here...) Status: Patch Available (was: Open) > HDFS client gets errors trying to to connect to IPv6 DataNode > - > > Key: HDFS-8078 > URL: https://issues.apache.org/jira/browse/HDFS-8078 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: Nate Edel >Assignee: Nate Edel > Labels: BB2015-05-TBR, ipv6 > Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, > HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch > > > 1st exception, on put: > 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: 2401:db00:1010:70ba:face:0:8:0:50010 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) > Appears to actually stem from code in DataNodeID which assumes it's safe to > append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for > IPv6. 
NetUtils.createSocketAddr( ) assembles a Java URI object, which > requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 > Currently using InetAddress.getByName() to validate IPv6 (guava > InetAddresses.forString has been flaky) but could also use our own parsing. > (From logging this, it seems like a low-enough frequency call that the extra > object creation shouldn't be problematic, and for me the slight risk of > passing in bad input that is not actually an IPv4 or IPv6 address and thus > calling an external DNS lookup is outweighed by getting the address > normalized and avoiding rewriting parsing.) > Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() > --- > 2nd exception (on datanode) > 15/04/13 13:18:07 ERROR datanode.DataNode: > dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown > operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: > /2401:db00:11:d010:face:0:2f:0:50010 > java.io.EOFException > at java.io.DataInputStream.readShort(DataInputStream.java:315) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) > Which also comes as client error "-get: 2401 is not an IP string literal." > This one has existing parsing logic which needs to shift to the last colon > rather than the first. Should also be a tiny bit faster by using lastIndexOf > rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
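For illustration, the two fixes the description calls for (bracketing an IPv6 literal before appending a port, and splitting on the last colon via lastIndexOf rather than the first) can be sketched as below. This is a hedged sketch, not the actual DataNodeID/NetUtils patch; the AddrPort helper and its method names are assumptions:

```java
// Sketch: IPv6-safe construction and parsing of "addr:port" strings.
// An IPv6 literal must be bracketed before a port is appended
// ("[2401:db00:1010:70ba:face:0:8:0]:50010"), and parsing must split
// on the LAST colon, since the address itself contains colons.
final class AddrPort {

    // Build an addr:port string, bracketing IPv6 literals.
    static String join(String ipAddr, int port) {
        if (ipAddr.indexOf(':') >= 0 && !ipAddr.startsWith("[")) {
            return "[" + ipAddr + "]:" + port;   // IPv6 literal
        }
        return ipAddr + ":" + port;              // IPv4 or hostname
    }

    // Extract the host part, splitting on the last colon so IPv6
    // literals survive, then stripping any URI-style brackets.
    static String hostOf(String addrPort) {
        String host = addrPort.substring(0, addrPort.lastIndexOf(':'));
        if (host.startsWith("[") && host.endsWith("]")) {
            host = host.substring(1, host.length() - 1);
        }
        return host;
    }

    // Extract the port part after the last colon.
    static int portOf(String addrPort) {
        return Integer.parseInt(
            addrPort.substring(addrPort.lastIndexOf(':') + 1));
    }
}
```

Using lastIndexOf also avoids the per-call array allocation of String.split, matching the "tiny bit faster" observation in the description.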
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Attachment: HDFS-8078.13.patch > HDFS client gets errors trying to to connect to IPv6 DataNode > - > > Key: HDFS-8078 > URL: https://issues.apache.org/jira/browse/HDFS-8078 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: Nate Edel >Assignee: Nate Edel > Labels: BB2015-05-TBR, ipv6 > Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, > HDFS-8078.12.patch, HDFS-8078.13.patch, HDFS-8078.9.patch > > > 1st exception, on put: > 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: 2401:db00:1010:70ba:face:0:8:0:50010 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) > Appears to actually stem from code in DataNodeID which assumes it's safe to > append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for > IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which > requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 > Currently using InetAddress.getByName() to validate IPv6 (guava > InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra > object creation shouldn't be problematic, and for me the slight risk of > passing in bad input that is not actually an IPv4 or IPv6 address and thus > calling an external DNS lookup is outweighed by getting the address > normalized and avoiding rewriting parsing.) > Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() > --- > 2nd exception (on datanode) > 15/04/13 13:18:07 ERROR datanode.DataNode: > dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown > operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: > /2401:db00:11:d010:face:0:2f:0:50010 > java.io.EOFException > at java.io.DataInputStream.readShort(DataInputStream.java:315) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) > Which also comes as client error "-get: 2401 is not an IP string literal." > This one has existing parsing logic which needs to shift to the last colon > rather than the first. Should also be a tiny bit faster by using lastIndexOf > rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Open (was: Patch Available) > HDFS client gets errors trying to to connect to IPv6 DataNode > - > > Key: HDFS-8078 > URL: https://issues.apache.org/jira/browse/HDFS-8078 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: Nate Edel >Assignee: Nate Edel > Labels: BB2015-05-TBR, ipv6 > Attachments: HDFS-8078.10.patch, HDFS-8078.11.patch, > HDFS-8078.12.patch, HDFS-8078.9.patch > > > 1st exception, on put: > 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: 2401:db00:1010:70ba:face:0:8:0:50010 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) > Appears to actually stem from code in DataNodeID which assumes it's safe to > append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for > IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which > requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 > Currently using InetAddress.getByName() to validate IPv6 (guava > InetAddresses.forString has been flaky) but could also use our own parsing. 
> (From logging this, it seems like a low-enough frequency call that the extra > object creation shouldn't be problematic, and for me the slight risk of > passing in bad input that is not actually an IPv4 or IPv6 address and thus > calling an external DNS lookup is outweighed by getting the address > normalized and avoiding rewriting parsing.) > Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() > --- > 2nd exception (on datanode) > 15/04/13 13:18:07 ERROR datanode.DataNode: > dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown > operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: > /2401:db00:11:d010:face:0:2f:0:50010 > java.io.EOFException > at java.io.DataInputStream.readShort(DataInputStream.java:315) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) > Which also comes as client error "-get: 2401 is not an IP string literal." > This one has existing parsing logic which needs to shift to the last colon > rather than the first. Should also be a tiny bit faster by using lastIndexOf > rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8870) Lease is leaked on write failure
[ https://issues.apache.org/jira/browse/HDFS-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-8870: -- Unless the patch is ready to go and the JIRA is a critical fix, we'll defer it to 2.6.2. Let me know if you have comments. Thanks! > Lease is leaked on write failure > > > Key: HDFS-8870 > URL: https://issues.apache.org/jira/browse/HDFS-8870 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Daryn Sharp > > Creating this ticket on behalf of [~daryn] > We've seen this in one of our clusters. When a long running process has a > write failure, the lease is leaked and gets renewed until the token is > expired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
[ https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang resolved HDFS-8886. --- Resolution: Not A Problem Please use the user list rather than JIRA for problems like this. In this case, I recommend you follow the instructions in BUILDING.txt, namely to run "mvn install -DskipTests" from the top level. > Not able to build with 'mvn compile -Pnative' > - > > Key: HDFS-8886 > URL: https://issues.apache.org/jira/browse/HDFS-8886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Puneeth P > > I am running into a problem where I am not able to compile the native parts > of the hadoop-hdfs project. The problem is that it is not finding the Makefile in > ${project.build.dir}/native. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8888) Support volumes in HDFS
[ https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-: - Description: There are multiple types of zones (e.g., snapshottable directories, encryption zones, directories with quotas) which are conceptually close to namespace volumes in traditional file systems. This jira proposes to introduce the concept of volume to simplify the implementation of snapshots and encryption zones. was: There are multiple types of zones (e.g., snapshot, encryption zone) which are conceptually close to namespace volumes in traditional filesystems. This jira proposes to introduce the concept of volume to simplify the implementation of snapshots and encryption zones. > Support volumes in HDFS > --- > > Key: HDFS- > URL: https://issues.apache.org/jira/browse/HDFS- > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai > > There are multiple types of zones (e.g., snapshottable directories, > encryption zones, directories with quotas) which are conceptually close to > namespace volumes in traditional file systems. > This jira proposes to introduce the concept of volume to simplify the > implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8887) Expose storage type and storage ID in BlockLocation
[ https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692594#comment-14692594 ] Lei (Eddy) Xu commented on HDFS-8887: - LGTM, Thanks [~andrew.wang]. +1, pending jenkins. > Expose storage type and storage ID in BlockLocation > --- > > Key: HDFS-8887 > URL: https://issues.apache.org/jira/browse/HDFS-8887 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: HDFS-8887.001.patch > > > Some applications schedule based on info like storage type or storage ID, > it'd be useful to expose this information in BlockLocation. It's already > included in LocatedBlock and sent over the wire. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8888) Support volumes in HDFS
[ https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-: - Summary: Support volumes in HDFS (was: Support the volume concepts in HDFS) > Support volumes in HDFS > --- > > Key: HDFS- > URL: https://issues.apache.org/jira/browse/HDFS- > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai > > There are multiple types of zones (e.g., snapshot, encryption zone) which are > conceptually close to namespace volumes in traditional filesystems. > This jira proposes to introduce the concept of volume to simplify the > implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8888) Support the volume concepts in HDFS
[ https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692590#comment-14692590 ] Jing Zhao commented on HDFS-: - +1 to have volume in HDFS. Also congrats for the jira number :) > Support the volume concepts in HDFS > --- > > Key: HDFS- > URL: https://issues.apache.org/jira/browse/HDFS- > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai > > There are multiple types of zones (e.g., snapshot, encryption zone) which are > conceptually close to namespace volumes in traditional filesystems. > This jira proposes to introduce the concept of volume to simplify the > implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8888) Support the volume concepts in HDFS
[ https://issues.apache.org/jira/browse/HDFS-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692587#comment-14692587 ] Haohui Mai commented on HDFS-8888: -- From an implementation standpoint, the concept of volumes provides a basic building block to simplify the implementation of the snapshots and encryption zones that are available today. The concept of volumes may also simplify administration and operations. A more detailed design will be available shortly. > Support the volume concepts in HDFS > --- > > Key: HDFS-8888 > URL: https://issues.apache.org/jira/browse/HDFS-8888 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai > > There are multiple types of zones (e.g., snapshot, encryption zone) which are > conceptually close to namespace volumes in traditional filesystems. > This jira proposes to introduce the concept of volume to simplify the > implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8880: Attachment: HDFS-8880.02.patch .02 patch: * Added test cases. * Replaced {{Timer}} with {{ScheduledThreadPoolExecutor}} in NameNode. > NameNode metrics logging > > > Key: HDFS-8880 > URL: https://issues.apache.org/jira/browse/HDFS-8880 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8880.01.patch, HDFS-8880.02.patch, > namenode-metrics.log > > > The NameNode can periodically log metrics to help debugging when the cluster > is not setup with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
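The .02 patch note above mentions replacing {{Timer}} with {{ScheduledThreadPoolExecutor}}. A minimal sketch of that approach follows; it is an illustration only, not the HDFS-8880 patch itself, and the class and method names here are hypothetical:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: run a metrics-logging task periodically on a daemon
// thread, using ScheduledThreadPoolExecutor instead of java.util.Timer.
public class MetricsLogTask {
    public static ScheduledExecutorService start(Runnable logMetrics, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r, "MetricsLogger");
            t.setDaemon(true); // do not keep the JVM alive just for logging
            return t;
        });
        // fixed delay (not fixed rate) so a slow log write cannot pile up runs
        scheduler.scheduleWithFixedDelay(logMetrics, periodMillis, periodMillis,
            TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

Unlike {{Timer}}, a {{ScheduledThreadPoolExecutor}} does not die permanently when one task run throws, which is one common reason for the swap.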
[jira] [Created] (HDFS-8888) Support the volume concepts in HDFS
Haohui Mai created HDFS-8888: Summary: Support the volume concepts in HDFS Key: HDFS-8888 URL: https://issues.apache.org/jira/browse/HDFS-8888 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai There are multiple types of zones (e.g., snapshot, encryption zone) which are conceptually close to namespace volumes in traditional filesystems. This jira proposes to introduce the concept of volume to simplify the implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692549#comment-14692549 ] Jing Zhao commented on HDFS-8833: - Thanks for the summary, Zhe. The new proposal looks reasonable to me overall. Some thoughts and questions: # Will we allow associating the EC policy with a non-empty directory? I guess we should disallow it, otherwise the semantics of the "create EC Directory" command can be very confusing. # Do we want to allow nested EC directories? Currently since we only support one policy, I do not see any benefits in having nested EC directories. Thus in the first stage we can disallow it. Also note that it's always easier to remove a restriction than to add a new one. # If we agree on the above two, the only change we're proposing here is to support rename across EC zone boundary. Since the EC policy bit is already on INodeFile, its implementation can be simple. I also had some offline discussion about this with [~sureshms], [~szetszwo], and [~wheat9]. Our main concern is still that allowing rename can make it hard for end users to understand the exact semantics and also makes management harder. > Erasure coding: store EC schema and cell size in INodeFile and eliminate > notion of EC zones > --- > > Key: HDFS-8833 > URL: https://issues.apache.org/jira/browse/HDFS-8833 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > > We have [discussed | > https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] > storing EC schema with files instead of EC zones and recently revisited the > discussion under HDFS-8059. > As a recap, the _zone_ concept has severe limitations including renaming and > nested configuration. 
Those limitations are valid in encryption for security > reasons and it doesn't make sense to carry them over in EC. > This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For > simplicity, we should first implement it as an xattr and consider memory > optimizations (such as moving it to file header) as a follow-on. We should > also disable changing EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8808: Attachment: HDFS-8808-01.patch Updating the patch with a unit test and the configuration option for transferring images for bootstrapping standby. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8870) Lease is leaked on write failure
[ https://issues.apache.org/jira/browse/HDFS-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692510#comment-14692510 ] Sangjin Lee commented on HDFS-8870: --- Should this be targeted to 2.6.2? We're trying to release 2.6.1 soon. Let me know. > Lease is leaked on write failure > > > Key: HDFS-8870 > URL: https://issues.apache.org/jira/browse/HDFS-8870 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Daryn Sharp > > Creating this ticket on behalf of [~daryn] > We've seen this in one of our clusters. When a long running process has a > write failure, the lease is leaked and gets renewed until the token is > expired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation
[ https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8887: -- Status: Patch Available (was: Open) > Expose storage type and storage ID in BlockLocation > --- > > Key: HDFS-8887 > URL: https://issues.apache.org/jira/browse/HDFS-8887 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: HDFS-8887.001.patch > > > Some applications schedule based on info like storage type or storage ID, > it'd be useful to expose this information in BlockLocation. It's already > included in LocatedBlock and sent over the wire. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8887) Expose storage type and storage ID in BlockLocation
[ https://issues.apache.org/jira/browse/HDFS-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8887: -- Attachment: HDFS-8887.001.patch Patch attached. If this goes in, I'll file another follow-on JIRA to remove the getFileBlockStorageLocations API from trunk since it is superseded by storage IDs. > Expose storage type and storage ID in BlockLocation > --- > > Key: HDFS-8887 > URL: https://issues.apache.org/jira/browse/HDFS-8887 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang > Attachments: HDFS-8887.001.patch > > > Some applications schedule based on info like storage type or storage ID, > it'd be useful to expose this information in BlockLocation. It's already > included in LocatedBlock and sent over the wire. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8887) Expose storage type and storage ID in BlockLocation
Andrew Wang created HDFS-8887: - Summary: Expose storage type and storage ID in BlockLocation Key: HDFS-8887 URL: https://issues.apache.org/jira/browse/HDFS-8887 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.1 Reporter: Andrew Wang Assignee: Andrew Wang Some applications schedule based on info like storage type or storage ID, it'd be useful to expose this information in BlockLocation. It's already included in LocatedBlock and sent over the wire. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8826: -- Status: Patch Available (was: Open) > Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h8826_20150811.patch > > > Balancer is inefficient in the following case: > || Datanode || Utilization || Rack || > | D1 | 95% | A | > | D2 | 30% | B | > | D3, D4, D5 | 0% | B | > The average utilization is 25% so that D2 is within 10% threshold. However, > Balancer currently will first move blocks from D2 to D3, D4 and D5 since they > are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8826: -- Attachment: h8826_20150811.patch h8826_20150811.patch: adds a new -source option. Will add some tests later. > Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h8826_20150811.patch > > > Balancer is inefficient in the following case: > || Datanode || Utilization || Rack || > | D1 | 95% | A | > | D2 | 30% | B | > | D3, D4, D5 | 0% | B | > The average utilization is 25% so that D2 is within 10% threshold. However, > Balancer currently will first move blocks from D2 to D3, D4 and D5 since they > are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692442#comment-14692442 ] Hadoop QA commented on HDFS-8828: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 20s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 20s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 28s | The applied patch generated 1 new checkstyle issues (total was 120, now 121). | | {color:green}+1{color} | whitespace | 0m 4s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 50s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 6m 29s | Tests passed in hadoop-distcp. 
| | | | 45m 36s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749947/HDFS-8828.006.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7c796fd | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11967/artifact/patchprocess/diffcheckstylehadoop-distcp.txt | | hadoop-distcp test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11967/artifact/patchprocess/testrun_hadoop-distcp.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11967/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11967/console | This message was automatically generated. > Utilize Snapshot diff report to build copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch > > > Some users reported huge time cost to build file copy list in distcp. (30 > hours for 1.6M files). We can leverage snapshot diff report to build file > copy list including files/dirs which are changes only between two snapshots > (or a snapshot and a normal dir). It speed up the process in two folds: 1. > less copy list building time. 2. less file copy MR jobs. > HDFS snapshot diff report provide information about file/directory creation, > deletion, rename and modification between two snapshots or a snapshot and a > normal directory. HDFS-7535 synchronize deletion and rename, then fallback to > the default distcp. 
So it still relies on default distcp to build a complete > list of files under the source dir. This patch only puts creation and > modification files into the copy list based on the snapshot diff report. We can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
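The copy-list idea described above (keep only creation and modification entries from the diff; deletes and renames are synchronized beforehand per HDFS-7535) can be sketched as follows. The types here are simplified stand-ins, not the actual DistCp or SnapshotDiffReport classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: build a distcp copy list from a snapshot diff report
// by keeping only CREATE and MODIFY entries. DELETE and RENAME are assumed
// to have been synchronized on the target already (as in HDFS-7535).
public class CopyListSketch {
    enum DiffType { CREATE, MODIFY, DELETE, RENAME }

    static class DiffEntry {
        final DiffType type;
        final String path;
        DiffEntry(DiffType type, String path) { this.type = type; this.path = path; }
    }

    static List<String> buildCopyList(List<DiffEntry> diff) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry e : diff) {
            // only new or changed files need to be copied
            if (e.type == DiffType.CREATE || e.type == DiffType.MODIFY) {
                copyList.add(e.path);
            }
        }
        return copyList;
    }
}
```

The win is that the copy list scales with the size of the diff rather than with the full file count under the source directory.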
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692302#comment-14692302 ] Zhe Zhang commented on HDFS-8833: - bq. Looks like Zhe planned to implement the XAttr-based solution, and I'm fine with this direction. Thanks Jing for confirming this. When looking at the code I realized that we should at least leverage the {{isStriped}} bit in the file header to represent the system default policy of RS(6,3). So below is a revised design: # ErasureCodingPolicy table: already done in HDFS-8854 # File header change #* Rename {{isStriped}} to {{erasureCodingPolicy}} in {{INodeFile}} header. {code} /** * Bit format: * [4-bit storagePolicyID][1-bit erasureCodingPolicy] * [11-bit replication][48-bit preferredBlockSize] */ {code} #* The ECPolicy is *always set* when creating a file; {{0}} represents contiguous layout. #* Since {{ErasureCodingPolicyManager}} / {{ErasureCodingSchemaManager}} only has 1 policy, we don't even need to set XAttr on files at this stage. #* [follow-on] When we support more EC policies on HDFS side, figure out the number of additional file header bits to use. #* [follow-on] Add {{inherit-on-create}} flag as Andrew suggested above # Directory XAttr change #* Directory's ECPolicy XAttr can be empty, indicating the ECPolicy is the same as its ancestor's. Otherwise its own XAttr determines the policy for newly created files under the directory. # Renaming #* A renamed file keeps the ECPolicy in its header. #* Therefore, a directory can have files with different ECPolicies. #* Conversion is not explicitly supported; if needed, a file can be converted via cp+rm. #* When renamed, a directory carries over its ECPolicy if it's set (XAttr non-empty). Otherwise its XAttr remains empty (and newly created files under the moved directory will use the policy from the new ancestors). Please let me know if it looks reasonable. Thanks. 
> Erasure coding: store EC schema and cell size in INodeFile and eliminate > notion of EC zones > --- > > Key: HDFS-8833 > URL: https://issues.apache.org/jira/browse/HDFS-8833 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > > We have [discussed | > https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] > storing EC schema with files instead of EC zones and recently revisited the > discussion under HDFS-8059. > As a recap, the _zone_ concept has severe limitations including renaming and > nested configuration. Those limitations are valid in encryption for security > reasons and it doesn't make sense to carry them over in EC. > This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For > simplicity, we should first implement it as an xattr and consider memory > optimizations (such as moving it to file header) as a follow-on. We should > also disable changing EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
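The 64-bit header layout quoted in the comment above can be illustrated with a small bit-packing sketch. The field widths follow the comment; the class and method names are illustrative, not the actual HDFS {{INodeFile}} header code:

```java
// Hypothetical sketch of the proposed 64-bit INodeFile header:
// [4-bit storagePolicyID][1-bit erasureCodingPolicy]
// [11-bit replication][48-bit preferredBlockSize]
public class FileHeader {
    static final int BLOCK_SIZE_BITS = 48;   // preferredBlockSize
    static final int REPLICATION_BITS = 11;  // replication factor
    static final int EC_BITS = 1;            // erasureCodingPolicy flag

    static long pack(int storagePolicy, int ecPolicyBit, int replication, long blockSize) {
        long h = storagePolicy & 0xFL;
        h = (h << EC_BITS) | (ecPolicyBit & 0x1L);
        h = (h << REPLICATION_BITS) | (replication & 0x7FFL);
        h = (h << BLOCK_SIZE_BITS) | (blockSize & 0xFFFFFFFFFFFFL);
        return h;
    }

    static int getStoragePolicy(long h) {
        return (int) (h >>> (EC_BITS + REPLICATION_BITS + BLOCK_SIZE_BITS)) & 0xF;
    }

    static int getEcPolicyBit(long h) {
        return (int) (h >>> (REPLICATION_BITS + BLOCK_SIZE_BITS)) & 0x1;
    }

    static int getReplication(long h) {
        return (int) (h >>> BLOCK_SIZE_BITS) & 0x7FF;
    }

    static long getPreferredBlockSize(long h) {
        return h & 0xFFFFFFFFFFFFL;
    }
}
```

With one flag bit, {{0}} denotes contiguous layout and {{1}} the single system default EC policy; supporting more policies would need additional header bits or the XAttr, as the comment notes.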
[jira] [Commented] (HDFS-8875) Optimize the wait time in Balancer for federation scenario
[ https://issues.apache.org/jira/browse/HDFS-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692284#comment-14692284 ] Tsz Wo Nicholas Sze commented on HDFS-8875: --- Balancer will exit if one of the NNs succeeds or throws an exception. See if you also want to fix it here. > Optimize the wait time in Balancer for federation scenario > -- > > Key: HDFS-8875 > URL: https://issues.apache.org/jira/browse/HDFS-8875 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Chris Trezzo > > Balancer has a wait time between two consecutive iterations. That is to give > some time for block movement to be fully committed (a return from replaceBlock > doesn't mean the NN's blockmap has been updated and the block has been > invalidated on the source node). > This wait time could be 23 seconds if {{dfs.heartbeat.interval}} is set to 10 > and {{dfs.namenode.replication.interval}} is set to 3. In the case of federation, > given we iterate through all namespaces in each iteration, this wait time > becomes unnecessary as while balancer is processing the next namespace, it > gives the previous namespace it just finished time to commit. > In addition, Balancer calls {{Collections.shuffle(connectors);}} It doesn't > seem necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14691417#comment-14691417 ] Tsz Wo Nicholas Sze commented on HDFS-8826: --- I suggest to add an option to specify the source node list. Then, balancer only selects blocks to move from those nodes. > Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > > Balancer is inefficient in the following case: > || Datanode || Utilization || Rack || > | D1 | 95% | A | > | D2 | 30% | B | > | D3, D4, D5 | 0% | B | > The average utilization is 25% so that D2 is within 10% threshold. However, > Balancer currently will first move blocks from D2 to D3, D4 and D5 since they > are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
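The arithmetic behind the example above can be checked with a short sketch (illustrative only, not the Balancer's actual policy code): the average of {95, 30, 0, 0, 0} is 25%, so with a 10% threshold D2 (30%) is inside the band, while D1 is over-utilized and D3–D5 are under-utilized:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative check of the utilization example above, not the real Balancer
// logic: a datanode is "balanced" when its utilization is within `threshold`
// percentage points of the cluster average.
public class BalancerMath {
    static double averageUtilization(double[] utilization) {
        double sum = 0;
        for (double u : utilization) {
            sum += u;
        }
        return sum / utilization.length;
    }

    // returns indices of datanodes outside the [avg - threshold, avg + threshold] band
    static List<Integer> overOrUnderUtilized(double[] utilization, double threshold) {
        double avg = averageUtilization(utilization);
        List<Integer> outside = new ArrayList<>();
        for (int i = 0; i < utilization.length; i++) {
            if (Math.abs(utilization[i] - avg) > threshold) {
                outside.add(i);
            }
        }
        return outside;
    }
}
```

Since D2 is already within the band, a -source option restricting block selection to the over-utilized D1 avoids the wasted intra-rack moves from D2.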
[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8823: - Attachment: HDFS-8823.002.patch > Move replication factor into individual blocks > -- > > Key: HDFS-8823 > URL: https://issues.apache.org/jira/browse/HDFS-8823 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, > HDFS-8823.002.patch > > > This jira proposes to record the replication factor in the {{BlockInfo}} > class. The changes have two advantages: > * Decoupling the namespace and the block management layer. It is a > prerequisite step to move block management off the heap or to a separate > process. > * Increased flexibility on replicating blocks. Currently the replication > factors of all blocks have to be the same. The replication factors of these > blocks are equal to the highest replication factor across all snapshots. The > changes will allow blocks in a file to have different replication factor, > potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8824) Do not use small blocks for balancing the cluster
[ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8824: -- Attachment: h8824_20150811b.patch h8824_20150811b.patch: reverts the NN change. Will do it in a separated JIRA. > Do not use small blocks for balancing the cluster > - > > Key: HDFS-8824 > URL: https://issues.apache.org/jira/browse/HDFS-8824 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h8824_20150727b.patch, h8824_20150811b.patch > > > Balancer gets datanode block lists from NN and then move the blocks in order > to balance the cluster. It should not use the blocks with small size since > moving the small blocks generates a lot of overhead and the small blocks do > not help balancing the cluster much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated HDFS-8828: --- Attachment: HDFS-8828.006.patch Added an exclude list while recursively traversing the created directories in the snapshot diff report. > Utilize Snapshot diff report to build copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch > > > Some users reported huge time cost to build file copy list in distcp. (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only files/dirs which changed between two snapshots > (or a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy list building time. 2. fewer file copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots or a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp. So it still relies on default distcp to build a complete > list of files under the source dir. This patch only puts creation and > modification files into the copy list based on the snapshot diff report. We can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682498#comment-14682498 ] Hadoop QA commented on HDFS-6244: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 19s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. | | {color:red}-1{color} | javadoc | 9m 49s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 13s | The applied patch generated 3 new checkstyle issues (total was 195, now 197). | | {color:red}-1{color} | checkstyle | 2m 53s | The applied patch generated 5 new checkstyle issues (total was 19, now 24). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 5m 56s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 18s | Tests failed in hadoop-common. | | {color:red}-1{color} | yarn tests | 50m 44s | Tests failed in hadoop-yarn-server-resourcemanager. | | {color:red}-1{color} | hdfs tests | 0m 22s | Tests failed in hadoop-hdfs. 
| | | | 122m 43s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.ha.TestZKFailoverController | | | hadoop.net.TestNetUtils | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | | | hadoop.yarn.server.resourcemanager.TestRMAdminService | | | hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStorePerf | | | hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore | | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart | | | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | | Failed build | hadoop-hdfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749892/HDFS-6244.v4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 1fc3c77 | | javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/diffcheckstylehadoop-common.txt https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/testReport/ | | Java | 1.7.0_55 | | uname | Linux 
asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11965/console | This message was automatically generated. > Make Trash Interval configurable for each of the namespaces > --- > > Key: HDFS-6244 > URL: https://issues.apache.org/jira/browse/HDFS-6244 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Labels: BB2015-05-TBR > Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, > HDFS-6244.v3.patch, HDFS-6244.v4.patch > > > Somehow we need to avoid the cluster filling up. > One solution is to have a different trash policy per namespace. However, if > we can simply make the property configurable per namespace, then the same > config can be rolled
[jira] [Commented] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
[ https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682469#comment-14682469 ] Puneeth P commented on HDFS-8886: - When I try to build it, it fails with {noformat} run (make) on project hadoop-hdfs: An Ant BuildException has occured: no targets found {noformat} > Not able to build with 'mvn compile -Pnative' > - > > Key: HDFS-8886 > URL: https://issues.apache.org/jira/browse/HDFS-8886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Puneeth P > > I am running into a problem where i am not able to compile the native parts > of hadoop-hdfs project. the problem is that it is not finding MakeFile in > ${project.build.dir}/native. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
[ https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Puneeth P updated HDFS-8886: Description: I am running into a problem where i am not able to compile the native parts of hadoop-hdfs project. the problem is that it is not finding MakeFile in ${project.build.dir}/native. was:I am running into a problem where i am not able to compile the native parts of hadoop-hdfs project. the problem is that it is not finding MakeFile in *${project.build.dir}/native*. > Not able to build with 'mvn compile -Pnative' > - > > Key: HDFS-8886 > URL: https://issues.apache.org/jira/browse/HDFS-8886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Puneeth P > > I am running into a problem where i am not able to compile the native parts > of hadoop-hdfs project. the problem is that it is not finding MakeFile in > ${project.build.dir}/native. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
[ https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Puneeth P updated HDFS-8886: Description: I am running into a problem where i am not able to compile the native parts of hadoop-hdfs project. the problem is that it is not finding MakeFile in *${project.build.dir}/native*. (was: I am running into a problem where i am not able to compile the native parts of hadoop-hdfs project. the problem is that it is not finding MakeFile in {{${project.build.dir}/native}}.) > Not able to build with 'mvn compile -Pnative' > - > > Key: HDFS-8886 > URL: https://issues.apache.org/jira/browse/HDFS-8886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Puneeth P > > I am running into a problem where i am not able to compile the native parts > of hadoop-hdfs project. the problem is that it is not finding MakeFile in > *${project.build.dir}/native*. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
[ https://issues.apache.org/jira/browse/HDFS-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Puneeth P updated HDFS-8886: Description: I am running into a problem where i am not able to compile the native parts of hadoop-hdfs project. the problem is that it is not finding MakeFile in {{${project.build.dir}/native}}. (was: I am running into a problem where i am not able to compile the native parts of hadoop-hdfs project. the problem is that it is not finding MakeFile in ${project.build.dir}/native.) > Not able to build with 'mvn compile -Pnative' > - > > Key: HDFS-8886 > URL: https://issues.apache.org/jira/browse/HDFS-8886 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Puneeth P > > I am running into a problem where i am not able to compile the native parts > of hadoop-hdfs project. the problem is that it is not finding MakeFile in > {{${project.build.dir}/native}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8886) Not able to build with 'mvn compile -Pnative'
Puneeth P created HDFS-8886: --- Summary: Not able to build with 'mvn compile -Pnative' Key: HDFS-8886 URL: https://issues.apache.org/jira/browse/HDFS-8886 Project: Hadoop HDFS Issue Type: Bug Reporter: Puneeth P I am running into a problem where i am not able to compile the native parts of hadoop-hdfs project. the problem is that it is not finding MakeFile in ${project.build.dir}/native. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682433#comment-14682433 ] Hadoop QA commented on HDFS-8808: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 16s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 14s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 0s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 28s | The applied patch generated 1 new checkstyle issues (total was 152, now 153). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 23s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 36s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 11s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 87m 49s | Tests failed in hadoop-hdfs. 
| | | | 134m 0s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestQuotaByStorageType | | | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics | | | hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy | | | hadoop.hdfs.server.namenode.TestFSImageWithSnapshot | | | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolerant | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749886/HDFS-8808-00.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 1fc3c77 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11964/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11964/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11964/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11964/console | This message was automatically generated. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. 
The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
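The impact described above follows from simple arithmetic: at a fixed bytes-per-second cap, the minimum transfer time grows linearly with fsimage size, so a limit that is sensible for background checkpoint uploads becomes painful for a one-off {{-bootstrapStandby}}. The sketch below is an illustration of that arithmetic, not the actual TransferFsImage throttling code:

```java
// Hypothetical sketch: minimum transfer time implied by a bandwidth cap
// like dfs.image.transfer.bandwidthPerSec (0 or less means unthrottled).
class ThrottleSketch {
    static long minSeconds(long bytes, long bandwidthPerSec) {
        if (bandwidthPerSec <= 0) {
            return 0;                                  // throttling disabled
        }
        // Ceiling division: partial seconds still cost a full second.
        return (bytes + bandwidthPerSec - 1) / bandwidthPerSec;
    }

    public static void main(String[] args) {
        long fsimage = 20L * 1024 * 1024 * 1024;       // a 20 GiB image
        long cap = 1024 * 1024;                        // 1 MiB/s limit
        // With the cap, bootstrapping would take at least ~20480s (~5.7h),
        // which is why the issue asks to exempt -bootstrapStandby.
        System.out.println(minSeconds(fsimage, cap));  // 20480
        System.out.println(minSeconds(fsimage, 0));    // 0
    }
}
```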
[jira] [Updated] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8880: Status: Patch Available (was: Open) > NameNode metrics logging > > > Key: HDFS-8880 > URL: https://issues.apache.org/jira/browse/HDFS-8880 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8880.01.patch, namenode-metrics.log > > > The NameNode can periodically log metrics to help debugging when the cluster > is not setup with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8880: Status: Open (was: Patch Available) > NameNode metrics logging > > > Key: HDFS-8880 > URL: https://issues.apache.org/jira/browse/HDFS-8880 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8880.01.patch, namenode-metrics.log > > > The NameNode can periodically log metrics to help debugging when the cluster > is not setup with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682419#comment-14682419 ] Jing Zhao commented on HDFS-8833: - bq. I agree that a pure XAttr-based solution is simpler, cleaner, and more scalable. We should probably implement that at this stage and pursue memory saving ideas as follow-on. Looks like Zhe planned to implement the XAttr-based solution, and I'm fine with this direction. > Erasure coding: store EC schema and cell size in INodeFile and eliminate > notion of EC zones > --- > > Key: HDFS-8833 > URL: https://issues.apache.org/jira/browse/HDFS-8833 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > > We have [discussed | > https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] > storing EC schema with files instead of EC zones and recently revisited the > discussion under HDFS-8059. > As a recap, the _zone_ concept has severe limitations including renaming and > nested configuration. Those limitations are valid in encryption for security > reasons and it doesn't make sense to carry them over in EC. > This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For > simplicity, we should first implement it as an xattr and consider memory > optimizations (such as moving it to file header) as a follow-on. We should > also disable changing EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
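The xattr-based direction discussed above attaches the EC policy to the file itself rather than to a zone, so renames carry the policy along instead of being forbidden. A minimal sketch of that idea follows; the xattr key, the policy string, and the in-memory map standing in for INode xattr storage are all illustrative assumptions, not the committed HDFS format:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: store the EC policy as a per-file extended attribute.
class EcXattrSketch {
    static final String EC_XATTR = "system.hdfs.erasurecoding.policy";

    // Stand-in for per-inode xattr storage keyed by path.
    static final Map<String, Map<String, byte[]>> xattrs = new HashMap<>();

    static void setEcPolicy(String path, String policy) {
        xattrs.computeIfAbsent(path, p -> new HashMap<>())
              .put(EC_XATTR, policy.getBytes(StandardCharsets.UTF_8));
    }

    static String getEcPolicy(String path) {
        Map<String, byte[]> attrs = xattrs.get(path);
        if (attrs == null || !attrs.containsKey(EC_XATTR)) {
            return null;                      // no policy set on this file
        }
        return new String(attrs.get(EC_XATTR), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        setEcPolicy("/data/warehouse/f1", "RS-6-3-64k");
        System.out.println(getEcPolicy("/data/warehouse/f1")); // RS-6-3-64k
        // No zone boundary: an unrelated path simply has no policy.
        System.out.println(getEcPolicy("/data/other/f2"));     // null
    }
}
```

Moving the attribute from a zone root to the file level is what removes the rename and nested-configuration limitations the recap mentions; the memory cost is what motivates the follow-on file-header optimization.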
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682412#comment-14682412 ] Andrew Wang commented on HDFS-8833: --- [~jingzhao] any additional comment on file header bits? Else I think Zhe wants to start working on the design as discussed above. > Erasure coding: store EC schema and cell size in INodeFile and eliminate > notion of EC zones > --- > > Key: HDFS-8833 > URL: https://issues.apache.org/jira/browse/HDFS-8833 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > > We have [discussed | > https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] > storing EC schema with files instead of EC zones and recently revisited the > discussion under HDFS-8059. > As a recap, the _zone_ concept has severe limitations including renaming and > nested configuration. Those limitations are valid in encryption for security > reasons and it doesn't make sense to carry them over in EC. > This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For > simplicity, we should first implement it as an xattr and consider memory > optimizations (such as moving it to file header) as a follow-on. We should > also disable changing EC policy on a non-empty file / dir in the first phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682404#comment-14682404 ] Chang Li commented on HDFS-6407: ok. +1(non binding) > new namenode UI, lost ability to sort columns in datanode tab > - > > Key: HDFS-6407 > URL: https://issues.apache.org/jira/browse/HDFS-6407 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Nathan Roberts >Assignee: Haohui Mai >Priority: Critical > Labels: BB2015-05-TBR > Attachments: 002-datanodes-sorted-capacityUsed.png, > 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, > HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, > HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, > HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, > HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting > 2.png, sorting table.png > > > old ui supported clicking on column header to sort on that column. The new ui > seems to have dropped this very useful feature. > There are a few tables in the Namenode UI to display datanodes information, > directory listings and snapshots. > When there are many items in the tables, it is useful to have ability to sort > on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682399#comment-14682399 ] Arpit Agarwal commented on HDFS-8277: - Hi [~surendrasingh], the setting must be persisted in the edit log, but changing the behavior would be incompatible for 2.x. We should consider revisiting this for 3.x. I am not in favor of the original proposal in the v1 patch. > Safemode enter fails when Standby NameNode is down > -- > > Key: HDFS-8277 > URL: https://issues.apache.org/jira/browse/HDFS-8277 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, HDFS, namenode >Affects Versions: 2.6.0 > Environment: HDP 2.2.0 >Reporter: Hari Sekhon >Assignee: Surendra Singh Lilhore >Priority: Minor > Attachments: HDFS-8277-safemode-edits.patch, HDFS-8277.patch, > HDFS-8277_1.patch, HDFS-8277_2.patch, HDFS-8277_3.patch, HDFS-8277_4.patch > > > HDFS fails to enter safemode when the Standby NameNode is down (eg. due to > AMBARI-10536). > {code}hdfs dfsadmin -safemode enter > safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: > java.net.ConnectException: Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused{code} > This appears to be a bug in that it's not trying both NameNodes like the > standard hdfs client code does, and is instead stopping after getting a > connection refused from nn1 which is down. I verified normal hadoop fs writes > and reads via cli did work at this time, using nn2. I happened to run this > command as the hdfs user on nn2 which was the surviving Active NameNode. > After I re-bootstrapped the Standby NN to fix it the command worked as > expected again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682398#comment-14682398 ] Haohui Mai commented on HDFS-6407: -- The discussion on dfs usage and the sorting of the column should be separated. Please file another jira for the feature request. > new namenode UI, lost ability to sort columns in datanode tab > - > > Key: HDFS-6407 > URL: https://issues.apache.org/jira/browse/HDFS-6407 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Nathan Roberts >Assignee: Haohui Mai >Priority: Critical > Labels: BB2015-05-TBR > Attachments: 002-datanodes-sorted-capacityUsed.png, > 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, > HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, > HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, > HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, > HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting > 2.png, sorting table.png > > > old ui supported clicking on column header to sort on that column. The new ui > seems to have dropped this very useful feature. > There are a few tables in the Namenode UI to display datanodes information, > directory listings and snapshots. > When there are many items in the tables, it is useful to have ability to sort > on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-6407: --- Attachment: HDFS-6407.011.patch added non dfs usage back to column on .11 patch. [~wheat9], [~nroberts] please help review the .11 patch. Thanks! > new namenode UI, lost ability to sort columns in datanode tab > - > > Key: HDFS-6407 > URL: https://issues.apache.org/jira/browse/HDFS-6407 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 >Reporter: Nathan Roberts >Assignee: Haohui Mai >Priority: Critical > Labels: BB2015-05-TBR > Attachments: 002-datanodes-sorted-capacityUsed.png, > 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, > HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, > HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, > HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, > HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting > 2.png, sorting table.png > > > old ui supported clicking on column header to sort on that column. The new ui > seems to have dropped this very useful feature. > There are a few tables in the Namenode UI to display datanodes information, > directory listings and snapshots. > When there are many items in the tables, it is useful to have ability to sort > on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682371#comment-14682371 ] Kihwal Lee commented on HDFS-8865: -- The checkstyle error is for the new config key, which I am not going to fix. The unit test timeout does not happen when I run it. Looks like it is failing in other pre-commit builds too, so it is not being caused by this patch. > Improve quota initialization performance > > > Key: HDFS-8865 > URL: https://issues.apache.org/jira/browse/HDFS-8865 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, > HDFS-8865.v2.patch > > > After replaying edits, the whole file system tree is recursively scanned in > order to initialize the quota. For big name space, this can take a very long > time. Since this is done during namenode failover, it also affects failover > latency. > By using the Fork-Join framework, I was able to greatly reduce the > initialization time. The following is the test result using the fsimage from > one of the big name nodes we have. > || threads || seconds|| > | 1 (existing) | 55| > | 1 (fork-join) | 68 | > | 4 | 16 | > | 8 | 8 | > | 12 | 6 | > | 16 | 5 | > | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
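The Fork-Join approach behind the timing table above can be sketched with a {{RecursiveTask}} that forks one subtask per child and joins the subtotals, parallelizing the recursive tree scan. The toy tree below stands in for the INode structures; all names are illustrative, not the actual patch code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical sketch of fork-join quota (usage) initialization.
class QuotaInitSketch {
    // Toy directory-tree node standing in for an INode.
    static class Node {
        long fileBytes;                       // bytes charged to this node
        List<Node> children = new ArrayList<>();
        Node(long bytes) { this.fileBytes = bytes; }
    }

    // Sums subtree usage in parallel, one forked task per child.
    static class SubtreeUsage extends RecursiveTask<Long> {
        private final Node node;
        SubtreeUsage(Node node) { this.node = node; }

        @Override protected Long compute() {
            List<SubtreeUsage> tasks = new ArrayList<>();
            for (Node child : node.children) {
                SubtreeUsage t = new SubtreeUsage(child);
                t.fork();                     // descend concurrently
                tasks.add(t);
            }
            long total = node.fileBytes;
            for (SubtreeUsage t : tasks) {
                total += t.join();            // combine child subtotals
            }
            return total;
        }
    }

    static long totalUsage(Node root, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            return pool.invoke(new SubtreeUsage(root));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Node root = new Node(0);
        Node dir = new Node(0);
        dir.children.add(new Node(100));
        dir.children.add(new Node(200));
        root.children.add(dir);
        root.children.add(new Node(50));
        System.out.println(totalUsage(root, 4)); // prints 350
    }
}
```

The single-thread fork-join row being slower than the existing code (68s vs 55s) is the expected task-scheduling overhead; the win comes from work-stealing across many threads, matching the table's near-linear speedup.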
[jira] [Commented] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682328#comment-14682328 ] Hadoop QA commented on HDFS-8863: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 5s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 30s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 25s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 17s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 14s | Tests failed in hadoop-hdfs. 
| | | | 221m 54s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDatanodeDeath | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749845/HDFS-8863.v2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / fa1d84a | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11963/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11963/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11963/console | This message was automatically generated. > The remaining space check in BlockPlacementPolicyDefault is flawed > - > > Key: HDFS-8863 > URL: https://issues.apache.org/jira/browse/HDFS-8863 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Labels: 2.6.1-candidate > Attachments: HDFS-8863.patch, HDFS-8863.v2.patch > > > The block placement policy calls > {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block > is going to fit. Since the method is adding up all remaining spaces, namenode > can allocate a new block on a full node. This causes pipeline construction > failure and {{abandonBlock}}. If the cluster is nearly full, the client might > hit this multiple times and the write can fail permanently.
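The flaw in the description can be illustrated by contrasting a check that sums remaining space across all storages with a per-storage check. Method names below are hypothetical, not the actual BlockPlacementPolicyDefault code:

```java
// Hypothetical sketch of the HDFS-8863 remaining-space flaw.
class RemainingSpaceSketch {
    // The flawed shape: summing remaining space across a node's storages
    // lets a node whose disks are each nearly full still "fit" a block.
    static boolean fitsBySum(long[] remainingPerStorage, long blockSize) {
        long total = 0;
        for (long r : remainingPerStorage) {
            total += r;
        }
        return total >= blockSize;
    }

    // A per-storage check: some single storage must hold the whole block,
    // so the pipeline cannot be built on a volume without room.
    static boolean fitsPerStorage(long[] remainingPerStorage, long blockSize) {
        for (long r : remainingPerStorage) {
            if (r >= blockSize) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] remaining = {60, 60};  // two storages with 60 units free each
        long block = 100;             // a 100-unit block
        System.out.println(fitsBySum(remaining, block));      // true (flawed)
        System.out.println(fitsPerStorage(remaining, block)); // false
    }
}
```

In the flawed case the namenode hands out the block, the writer fails to construct the pipeline, and {{abandonBlock}} follows — repeatedly, on a nearly full cluster.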
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682301#comment-14682301 ] Hadoop QA commented on HDFS-8865: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 9s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 41s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 56s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 21s | The applied patch generated 1 new checkstyle issues (total was 493, now 491). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 38s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 186m 4s | Tests failed in hadoop-hdfs. 
| | | | 230m 19s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749856/HDFS-8865.v2.checkstyle.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / fa1d84a | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11962/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11962/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11962/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11962/console | This message was automatically generated. > Improve quota initialization performance > > > Key: HDFS-8865 > URL: https://issues.apache.org/jira/browse/HDFS-8865 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, > HDFS-8865.v2.patch > > > After replaying edits, the whole file system tree is recursively scanned in > order to initialize the quota. For big name space, this can take a very long > time. Since this is done during namenode failover, it also affects failover > latency. > By using the Fork-Join framework, I was able to greatly reduce the > initialization time. The following is the test result using the fsimage from > one of the big name nodes we have. > || threads || seconds|| > | 1 (existing) | 55| > | 1 (fork-join) | 68 | > | 4 | 16 | > | 8 | 8 | > | 12 | 6 | > | 16 | 5 | > | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8052) Move WebHdfsFileSystem into hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682247#comment-14682247 ] Tsz Wo Nicholas Sze commented on HDFS-8052: --- Agree with Haohui that RetryUtils is not yet a public API although it could be useful for other projects. [~gsaha], the Hadoop APIs by default are for internal use only unless it is annotated as \@InterfaceAudience.Public. If there is a need, please file a JIRA so that we could change the annotation. > Move WebHdfsFileSystem into hadoop-hdfs-client > -- > > Key: HDFS-8052 > URL: https://issues.apache.org/jira/browse/HDFS-8052 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.8.0 > > Attachments: HDFS-8052.000.patch, HDFS-8052.001.patch, > HDFS-8052.002.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
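The audience convention described above can be sketched with stand-in annotations. The real ones live in org.apache.hadoop.classification ({{@InterfaceAudience.Public}}, {{@InterfaceAudience.Private}}); the nested annotations below are illustrative only:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical sketch of Hadoop-style audience annotations.
class AudienceSketch {
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface Public {}      // stable contract for downstream projects

    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface Private {}     // internal; may move or change without notice

    // A utility marked internal, like RetryUtils in this discussion.
    @Private
    static final class RetryUtilsLike {
        static int backoffMillis(int attempt) { return 100 << attempt; }
    }

    public static void main(String[] args) {
        // Downstream code compiling against a @Private class risks breakage
        // on upgrade — the situation SLIDER-923 ran into.
        System.out.println(
            RetryUtilsLike.class.isAnnotationPresent(Private.class));
    }
}
```

Filing a JIRA to promote such a class to {{@InterfaceAudience.Public}}, as suggested above, is how the contract gets widened deliberately rather than by accident.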
[jira] [Resolved] (HDFS-8052) Move WebHdfsFileSystem into hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-8052. -- Resolution: Fixed Closing this jira. {{RetryUtils}} is not annotated as a public API, so it might change as the project evolves. This is not an incompatible change, as it is an internal implementation detail. The fix in SLIDER-923 looks correct to me. Am I missing anything? > Move WebHdfsFileSystem into hadoop-hdfs-client > -- > > Key: HDFS-8052 > URL: https://issues.apache.org/jira/browse/HDFS-8052 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.8.0 > > Attachments: HDFS-8052.000.patch, HDFS-8052.001.patch, > HDFS-8052.002.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682209#comment-14682209 ] Hudson commented on HDFS-8805: -- FAILURE: Integrated in Hadoop-trunk-Commit #8284 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8284/]) HDFS-8805. Archival Storage: getStoragePolicy should not need superuser privilege. Contributed by Brahma Reddy Battula. (jing9: rev 1fc3c779a422bafdb86ad1a5b2349802dda1cb62) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirAppendOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java > Archival Storage: getStoragePolicy should not need superuser privilege > -- > > Key: HDFS-8805 > URL: https://issues.apache.org/jira/browse/HDFS-8805 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover, namenode >Reporter: Hui Zheng >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-8805-002.patch, HDFS-8805-003.patch, > HDFS-8805-004.patch, HDFS-8805.patch > > > The result of getStoragePolicy command is always 'unspecified' even we has > set a StoragePolicy on a directory.But the real placement of blocks is > correct. > The result of fsck is not correct either. > {code} > $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD > Set storage policy COLD on /tmp/cold > $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold > The storage policy of /tmp/cold is unspecified > $ hdfs fsck -storagepolicies /tmp/cold > Blocks NOT satisfying the specified storage policy: > Storage Policy Specified Storage Policy # of blocks > % of blocks > ARCHIVE:4(COLD) HOT 5 >55.5556% > ARCHIVE:3(COLD) HOT 4 >44.% > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8052) Move WebHdfsFileSystem into hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha reopened HDFS-8052: - Reopening because this is an incompatible change and breaks SLIDER-923 > Move WebHdfsFileSystem into hadoop-hdfs-client > -- > > Key: HDFS-8052 > URL: https://issues.apache.org/jira/browse/HDFS-8052 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.8.0 > > Attachments: HDFS-8052.000.patch, HDFS-8052.001.patch, > HDFS-8052.002.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8880) NameNode metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682180#comment-14682180 ] Jitendra Nath Pandey commented on HDFS-8880: +1 > NameNode metrics logging > > > Key: HDFS-8880 > URL: https://issues.apache.org/jira/browse/HDFS-8880 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-8880.01.patch, namenode-metrics.log > > > The NameNode can periodically log metrics to help debugging when the cluster > is not setup with another metrics monitoring scheme. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6244: -- Attachment: HDFS-6244.v4.patch > Make Trash Interval configurable for each of the namespaces > --- > > Key: HDFS-6244 > URL: https://issues.apache.org/jira/browse/HDFS-6244 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Labels: BB2015-05-TBR > Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, > HDFS-6244.v3.patch, HDFS-6244.v4.patch > > > Somehow we need to avoid the cluster filling up. > One solution is to have a different trash policy per namespace. However, if > we can simply make the property configurable per namespace, then the same > config can be rolled everywhere and we'd be done. This seems simple enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8805: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: (was: 2.6.0) 2.8.0 Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. Thanks for the contribution, [~brahmareddy]! Thanks for reporting the issue, [~huizane]! > Archival Storage: getStoragePolicy should not need superuser privilege > -- > > Key: HDFS-8805 > URL: https://issues.apache.org/jira/browse/HDFS-8805 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover, namenode >Reporter: Hui Zheng >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-8805-002.patch, HDFS-8805-003.patch, > HDFS-8805-004.patch, HDFS-8805.patch > > > The result of the getStoragePolicy command is always 'unspecified' even when we have > set a StoragePolicy on a directory. But the real placement of blocks is > correct. > The result of fsck is not correct either. > {code} > $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD > Set storage policy COLD on /tmp/cold > $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold > The storage policy of /tmp/cold is unspecified > $ hdfs fsck -storagepolicies /tmp/cold > Blocks NOT satisfying the specified storage policy: > Storage Policy Specified Storage Policy # of blocks > % of blocks > ARCHIVE:4(COLD) HOT 5 >55.5556% > ARCHIVE:3(COLD) HOT 4 >44.4444% > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8805) Archival Storage: getStoragePolicy should not need superuser privilege
[ https://issues.apache.org/jira/browse/HDFS-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682126#comment-14682126 ] Jing Zhao commented on HDFS-8805: - Thanks for updating the patch, [~brahmareddy]. +1 for the 004 patch. I will commit it shortly. > Archival Storage: getStoragePolicy should not need superuser privilege > -- > > Key: HDFS-8805 > URL: https://issues.apache.org/jira/browse/HDFS-8805 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover, namenode >Reporter: Hui Zheng >Assignee: Brahma Reddy Battula > Fix For: 2.6.0 > > Attachments: HDFS-8805-002.patch, HDFS-8805-003.patch, > HDFS-8805-004.patch, HDFS-8805.patch > > > The result of the getStoragePolicy command is always 'unspecified' even when we have > set a StoragePolicy on a directory. But the real placement of blocks is > correct. > The result of fsck is not correct either. > {code} > $ hdfs storagepolicies -setStoragePolicy -path /tmp/cold -policy COLD > Set storage policy COLD on /tmp/cold > $ hdfs storagepolicies -getStoragePolicy -path /tmp/cold > The storage policy of /tmp/cold is unspecified > $ hdfs fsck -storagepolicies /tmp/cold > Blocks NOT satisfying the specified storage policy: > Storage Policy Specified Storage Policy # of blocks > % of blocks > ARCHIVE:4(COLD) HOT 5 >55.5556% > ARCHIVE:3(COLD) HOT 4 >44.4444% > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8808: Attachment: HDFS-8808-00.patch > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8808: Affects Version/s: 2.7.1 Target Version/s: 2.7.2 Status: Patch Available (was: Open) Submitting initial patch to trigger Jenkins and collect feedback on the basic idea. In the next rev I will add a unit test and the additional config as [~ajithshetty] suggested. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682093#comment-14682093 ] Tsz Wo Nicholas Sze commented on HDFS-8859: --- - Is the only difference between LightWeightHashGSet and LightWeightGSet that LightWeightHashGSet is resizable? - It seems that some code in LightWeightHashGSet is copied from LightWeightGSet. Could you change LightWeightHashGSet to extend LightWeightGSet? > Improve DataNode (ReplicaMap) memory footprint to save about 45% > > > Key: HDFS-8859 > URL: https://issues.apache.org/jira/browse/HDFS-8859 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yi Liu >Assignee: Yi Liu >Priority: Critical > Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch > > > By using the following approach we can save about *45%* of the memory footprint for each > block replica in DataNode memory (this JIRA only talks about *ReplicaMap* in the > DataNode), the details are: > In ReplicaMap, > {code} > private final Map<String, Map<Long, ReplicaInfo>> map = > new HashMap<String, Map<Long, ReplicaInfo>>(); > {code} > Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas > in memory. The key is the block id of the block replica, which is already > included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap Entry > has an object overhead. We can implement a lightweight Set which is similar > to {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed > size for the entries array, usually a big value; an example is > {{BlocksMap}}. This avoids full GC since there is no need to resize), and we > should still be able to look up an element by its key. 
> Following is a comparison of the memory footprint if we implement a lightweight set > as described: > We can save: > {noformat} > SIZE (bytes) ITEM > 20 The Key: Long (12 bytes object overhead + 8 > bytes long) > 12 HashMap Entry object overhead > 4 reference to the key in Entry > 4 reference to the value in Entry > 4 hash in Entry > {noformat} > Total: -44 bytes > We need to add: > {noformat} > SIZE (bytes) ITEM > 4 a reference to the next element in ReplicaInfo > {noformat} > Total: +4 bytes > So in total we can save 40 bytes for each block replica. > And currently one finalized replica needs around 46 bytes (notice: we ignore > memory alignment here). > We can save 1 - (4 + 46) / (44 + 46) ≈ *45%* memory for each block replica > in DataNode. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
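The intrusive-set idea behind HDFS-8859 can be sketched as follows. This is a hypothetical, simplified illustration, not the actual LightWeightHashGSet: the element itself embeds its own `next` reference, so the set needs no per-entry wrapper object and no boxed `Long` key, which is exactly where the ~44 bytes per replica in the analysis above come from. The `Replica` and `ReplicaSet` names are made up for this sketch.

```java
// Sketch of an intrusive, resizable hash set in the spirit of LightWeightGSet.
// Elements carry their own chain pointer, so no HashMap.Entry objects and no
// boxed keys are allocated; only the +4-byte "next" reference is added.
public class IntrusiveSetSketch {
    // Stand-in for ReplicaInfo: carries its own key and chain pointer.
    static class Replica {
        final long blockId;
        Replica next;                     // the intrusive +4-byte reference
        Replica(long blockId) { this.blockId = blockId; }
    }

    static class ReplicaSet {
        private Replica[] buckets = new Replica[16];
        private int size;

        private int index(long key) {
            return (Long.hashCode(key) & 0x7fffffff) % buckets.length;
        }

        void put(Replica r) {
            if (size >= buckets.length * 2) resize();   // resizable, unlike LightWeightGSet
            int i = index(r.blockId);
            r.next = buckets[i];
            buckets[i] = r;
            size++;
        }

        Replica get(long blockId) {       // lookup by key, no boxing
            for (Replica r = buckets[index(blockId)]; r != null; r = r.next) {
                if (r.blockId == blockId) return r;
            }
            return null;
        }

        private void resize() {
            Replica[] old = buckets;
            buckets = new Replica[old.length * 2];
            for (Replica head : old) {
                Replica next;
                for (Replica r = head; r != null; r = next) {
                    next = r.next;        // rehash each element into the new array
                    int i = index(r.blockId);
                    r.next = buckets[i];
                    buckets[i] = r;
                }
            }
        }
    }

    public static void main(String[] args) {
        ReplicaSet set = new ReplicaSet();
        for (long id = 0; id < 100; id++) set.put(new Replica(id));
        System.out.println(set.get(42).blockId);   // 42
        System.out.println(set.get(1000) == null); // true
    }
}
```

The design choice mirrors the JIRA discussion: growing on demand trades LightWeightGSet's fixed, GC-friendly array for a smaller steady-state footprint.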
[jira] [Updated] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HDFS-5274: Resolution: Fixed Status: Resolved (was: Patch Available) > Add Tracing to HDFS > --- > > Key: HDFS-5274 > URL: https://issues.apache.org/jira/browse/HDFS-5274 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 2.1.1-beta >Reporter: Elliott Clark >Assignee: Elliott Clark > Labels: BB2015-05-TBR > Attachments: 3node_get_200mb.png, 3node_put_200mb.png, > 3node_put_200mb.png, HDFS-5274-0.patch, HDFS-5274-1.patch, > HDFS-5274-10.patch, HDFS-5274-11.txt, HDFS-5274-12.patch, HDFS-5274-13.patch, > HDFS-5274-14.patch, HDFS-5274-15.patch, HDFS-5274-16.patch, > HDFS-5274-17.patch, HDFS-5274-2.patch, HDFS-5274-3.patch, HDFS-5274-4.patch, > HDFS-5274-5.patch, HDFS-5274-6.patch, HDFS-5274-7.patch, HDFS-5274-8.patch, > HDFS-5274-8.patch, HDFS-5274-9.patch, Zipkin Trace a06e941b0172ec73.png, > Zipkin Trace d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png > > > Since Google's Dapper paper has shown the benefits of tracing for a large > distributed system, it seems like a good time to add tracing to HDFS. HBase > has added tracing using HTrace. I propose that the same can be done within > HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681996#comment-14681996 ] Hudson commented on HDFS-8818: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #273 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/273/]) HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java > Allow Balancer to run faster > > > Key: HDFS-8818 > URL: https://issues.apache.org/jira/browse/HDFS-8818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8818_20150723.patch, h8818_20150727.patch > > > The original design of Balancer is intentionally to make it run slowly so > that the balancing activities won't affect the normal cluster activities and > the running jobs. > There are new use case that cluster admin may choose to balance the cluster > when the cluster load is low, or in a maintain window. So that we should > have an option to allow Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681962#comment-14681962 ] Hudson commented on HDFS-8818: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2211 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2211/]) HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Allow Balancer to run faster > > > Key: HDFS-8818 > URL: https://issues.apache.org/jira/browse/HDFS-8818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8818_20150723.patch, h8818_20150727.patch > > > The original design of Balancer is intentionally to make it run slowly so > that the balancing activities won't affect the normal cluster activities and > the running jobs. > There are new use case that cluster admin may choose to balance the cluster > when the cluster load is low, or in a maintain window. So that we should > have an option to allow Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681915#comment-14681915 ] Hudson commented on HDFS-8818: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2230 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2230/]) HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java > Allow Balancer to run faster > > > Key: HDFS-8818 > URL: https://issues.apache.org/jira/browse/HDFS-8818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8818_20150723.patch, h8818_20150727.patch > > > The original design of Balancer is intentionally to make it run slowly so > that the balancing activities won't affect the normal cluster activities and > the running jobs. > There are new use case that cluster admin may choose to balance the cluster > when the cluster load is low, or in a maintain window. So that we should > have an option to allow Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681918#comment-14681918 ] Hudson commented on HDFS-8818: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #281 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/281/]) HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Allow Balancer to run faster > > > Key: HDFS-8818 > URL: https://issues.apache.org/jira/browse/HDFS-8818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8818_20150723.patch, h8818_20150727.patch > > > The original design of Balancer is intentionally to make it run slowly so > that the balancing activities won't affect the normal cluster activities and > the running jobs. > There are new use case that cluster admin may choose to balance the cluster > when the cluster load is low, or in a maintain window. So that we should > have an option to allow Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8865: - Attachment: HDFS-8865.v2.checkstyle.patch Missed the one checkstyle warning. > Improve quota initialization performance > > > Key: HDFS-8865 > URL: https://issues.apache.org/jira/browse/HDFS-8865 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, > HDFS-8865.v2.patch > > > After replaying edits, the whole file system tree is recursively scanned in > order to initialize the quota. For big name space, this can take a very long > time. Since this is done during namenode failover, it also affects failover > latency. > By using the Fork-Join framework, I was able to greatly reduce the > initialization time. The following is the test result using the fsimage from > one of the big name nodes we have. > || threads || seconds|| > | 1 (existing) | 55| > | 1 (fork-join) | 68 | > | 4 | 16 | > | 8 | 8 | > | 12 | 6 | > | 16 | 5 | > | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
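The parallel scan in HDFS-8865 uses the JDK Fork/Join framework. The sketch below is a hedged illustration of that idea, not the actual patch: the `Node` tree and `UsageTask` are hypothetical stand-ins for the INode tree and quota counts, showing how forking one subtask per child subtree parallelizes a recursive traversal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class QuotaInitSketch {
    // Hypothetical stand-in for an INode: directories have fileSize 0.
    static class Node {
        final long fileSize;
        final List<Node> children = new ArrayList<>();
        Node(long fileSize) { this.fileSize = fileSize; }
    }

    // Sums the space consumed under a node, forking a subtask per child so
    // sibling subtrees are scanned in parallel by the pool's worker threads.
    static class UsageTask extends RecursiveTask<Long> {
        private final Node node;
        UsageTask(Node node) { this.node = node; }

        @Override
        protected Long compute() {
            long total = node.fileSize;
            List<UsageTask> subtasks = new ArrayList<>();
            for (Node child : node.children) {
                UsageTask t = new UsageTask(child);
                t.fork();                 // schedule the subtree scan
                subtasks.add(t);
            }
            for (UsageTask t : subtasks) total += t.join();
            return total;
        }
    }

    public static void main(String[] args) {
        Node root = new Node(0);
        for (int i = 0; i < 4; i++) {
            Node dir = new Node(0);
            for (int j = 0; j < 10; j++) dir.children.add(new Node(100));
            root.children.add(dir);
        }
        // Pool size plays the role of the thread count in the benchmark table.
        System.out.println(new ForkJoinPool(8).invoke(new UsageTask(root)));
    }
}
```

The benchmark table in the issue shows the expected behavior of this pattern: near-linear speedup until the tree fan-out or I/O stops feeding the workers.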
[jira] [Updated] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8863: - Attachment: HDFS-8863.v2.patch Attaching new patch. > The remaining space check in BlockPlacementPolicyDefault is flawed > - > > Key: HDFS-8863 > URL: https://issues.apache.org/jira/browse/HDFS-8863 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Labels: 2.6.1-candidate > Attachments: HDFS-8863.patch, HDFS-8863.v2.patch > > > The block placement policy calls > {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block > is going to fit. Since the method adds up all remaining spaces, the namenode > can allocate a new block on a full node. This causes pipeline construction > failure and {{abandonBlock}}. If the cluster is nearly full, the client might > hit this multiple times and the write can fail permanently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681832#comment-14681832 ] Kihwal Lee commented on HDFS-8863: -- bq. it should just return current storage remaining space instead of get the maximum remaining space of all storages Datanodes only care about the storage type, so checking a particular storage won't do any good. It will just cause block placement to re-pick targets more often. bq. Another issue, getBlocksScheduled is for storage type, not for per storage. Tracking scheduled writes per storage is not going to solve the problem since datanodes are free to choose any storage as long as the type matches. Trying to achieve precise accounting will have diminishing returns as there are uncertainties around the actual storage being used, blocks being abandoned, control loop delays (heartbeats), etc. What if we let it check against the storage-type-level sum and also make sure there is at least one storage with enough space? I actually had a version of the patch that does just that. I will remove the unused method and post the patch. > The remaining space check in BlockPlacementPolicyDefault is flawed > - > > Key: HDFS-8863 > URL: https://issues.apache.org/jira/browse/HDFS-8863 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Labels: 2.6.1-candidate > Attachments: HDFS-8863.patch > > > The block placement policy calls > {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block > is going to fit. Since the method adds up all remaining spaces, the namenode > can allocate a new block on a full node. This causes pipeline construction > failure and {{abandonBlock}}. If the cluster is nearly full, the client might > hit this multiple times and the write can fail permanently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
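The combined check Kihwal proposes (a storage-type-level sum plus at least one individual storage that fits the block) could look roughly like the sketch below. All names here are hypothetical stand-ins, not the actual DatanodeDescriptor API.

```java
// Sketch of the HDFS-8863 idea: a node is a valid target for a block only if
// (a) the remaining space summed over storages of the requested type covers
// the block after subtracting already-scheduled writes, AND
// (b) at least one individual storage of that type can hold the block,
// so a node whose per-disk free space is fragmented below the block size is
// rejected even when the type-level sum looks large.
public class RemainingCheckSketch {
    static class Storage {
        final String type;      // e.g. "DISK", "ARCHIVE"
        final long remaining;   // bytes free on this storage
        Storage(String type, long remaining) { this.type = type; this.remaining = remaining; }
    }

    static boolean hasRoomFor(Storage[] storages, String type,
                              long blockSize, long scheduledBytes) {
        long typeRemaining = 0;
        boolean oneStorageFits = false;
        for (Storage s : storages) {
            if (!s.type.equals(type)) continue;
            typeRemaining += s.remaining;
            if (s.remaining >= blockSize) oneStorageFits = true;
        }
        return oneStorageFits && typeRemaining - scheduledBytes >= blockSize;
    }

    public static void main(String[] args) {
        Storage[] node = {
            new Storage("DISK", 50), new Storage("DISK", 60), new Storage("ARCHIVE", 500)
        };
        // Sum of DISK remaining is 110, but no single DISK storage fits 100 bytes.
        System.out.println(hasRoomFor(node, "DISK", 100, 0));    // false
        System.out.println(hasRoomFor(node, "ARCHIVE", 100, 0)); // true
    }
}
```

This keeps accounting at the storage-type level, consistent with the comment's point that datanodes are free to pick any storage of the matching type.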
[jira] [Commented] (HDFS-8884) Fail-fast check in BlockPlacementPolicyDefault#chooseTarget
[ https://issues.apache.org/jira/browse/HDFS-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681818#comment-14681818 ] Hadoop QA commented on HDFS-8884: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 10s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 41s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 21s | The applied patch generated 4 new checkstyle issues (total was 58, now 56). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 17s | Tests failed in hadoop-hdfs. 
| | | | 218m 58s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12749787/HDFS-8884.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / fa1d84a | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11961/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11961/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11961/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11961/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11961/console | This message was automatically generated. 
> Fail-fast check in BlockPlacementPolicyDefault#chooseTarget > --- > > Key: HDFS-8884 > URL: https://issues.apache.org/jira/browse/HDFS-8884 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yi Liu >Assignee: Yi Liu > Attachments: HDFS-8884.001.patch > > > In the current BlockPlacementPolicyDefault, when choosing a datanode storage to > place a block, we have the following logic: > {code} > final DatanodeStorageInfo[] storages = DFSUtil.shuffle( > chosenNode.getStorageInfos()); > int i = 0; > boolean search = true; > for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes > .entrySet().iterator(); search && iter.hasNext(); ) { > Map.Entry<StorageType, Integer> entry = iter.next(); > for (i = 0; i < storages.length; i++) { > StorageType type = entry.getKey(); > final int newExcludedNodes = addIfIsGoodTarget(storages[i], > {code} > We will iterate (actually two {{for}} loops, although their bounds are usually small) > over all storages of the candidate datanode even if the datanode itself is not good > (e.g. decommissioned, stale, too busy), since currently we do all the checks > in {{addIfIsGoodTarget}}. > We can fail fast: check the datanode-related conditions first; if the > datanode is not good, then there is no need to shuffle and iterate the storages. This > is more efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
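The fail-fast reordering can be illustrated with a simplified sketch (hypothetical types, not the actual BlockPlacementPolicyDefault code): node-level conditions are evaluated once, up front, so a bad node never triggers the per-storage shuffle and scan.

```java
// Sketch of the HDFS-8884 fail-fast idea. A stale/decommissioned/busy node is
// rejected before its storages are examined, instead of being re-discovered
// inside the inner per-storage loop.
public class FailFastSketch {
    static class Node {
        boolean decommissioned, stale, tooBusy;
        int storages = 12;              // number of storages on this node
    }

    static int storagesExamined;        // instrumentation for the example

    // Node-level conditions, checked once per candidate node.
    static boolean isGoodNode(Node n) {
        return !n.decommissioned && !n.stale && !n.tooBusy;
    }

    static boolean chooseOn(Node n) {
        if (!isGoodNode(n)) return false;   // fail fast: skip the storage scan
        for (int i = 0; i < n.storages; i++) {
            storagesExamined++;
            // per-storage checks (capacity, storage state, ...) would go here
        }
        return true;
    }

    public static void main(String[] args) {
        Node bad = new Node(); bad.stale = true;
        Node good = new Node();
        chooseOn(bad);                   // rejected without touching storages
        chooseOn(good);
        System.out.println(storagesExamined);   // 12: only the good node was scanned
    }
}
```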