[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2
[ https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741935#comment-14741935 ] Duo Zhang commented on HDFS-7966: - No, the testcase uses multiple connections... But yes, this is not a typical usage pattern in the real world. Let me try to deploy HBase on top of HDFS and run YCSB to collect some performance data. Thanks. > New Data Transfer Protocol via HTTP/2 > - > > Key: HDFS-7966 > URL: https://issues.apache.org/jira/browse/HDFS-7966 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Haohui Mai >Assignee: Qianqian Shi > Labels: gsoc, gsoc2015, mentor > Attachments: GSoC2015_Proposal.pdf, > TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, > TestHttp2ReadBlockInsideEventLoop.svg > > > The current Data Transfer Protocol (DTP) implements a rich set of features > that span multiple layers, including: > * Connection pooling and authentication (session layer) > * Encryption (presentation layer) > * Data writing pipeline (application layer) > All these features are HDFS-specific and defined by implementation. As a > result, it requires a non-trivial amount of work to implement HDFS clients and > servers. > This jira explores delegating the responsibilities of the session and > presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles > connection multiplexing, QoS, authentication and encryption, reducing the > scope of DTP to the application layer only. By leveraging an existing HTTP/2 > library, it should simplify the implementation of both HDFS clients and > servers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9043) Doc updation for commands in HDFS Federation
[ https://issues.apache.org/jira/browse/HDFS-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741929#comment-14741929 ] Vinayakumar B commented on HDFS-9043: - All changes look good to me. +1 for all. [~aw], do you also want to take a look? Will wait for a day before committing. -Thanks > Doc updation for commands in HDFS Federation > > > Key: HDFS-9043 > URL: https://issues.apache.org/jira/browse/HDFS-9043 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: J.Andreina >Assignee: J.Andreina >Priority: Minor > Attachments: HDFS-9043-1.patch, HDFS-9043-branch-2-1.patch, > HDFS-9043-branch-2.7.0-1.patch > > > 1. command is wrong > {noformat} > $HADOOP_PREFIX/bin/hdfs dfsadmin -refreshNameNode > : > {noformat} > Correct command is: hdfs dfsadmin -refreshNameNodes (note the trailing 's') > 2. command is wrong > {noformat} > $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script > $HADOOP_PREFIX/bin/hdfs start balancer > {noformat} > Correct command is: *start-balancer.sh -policy* > 3. Reference link to balancer for further details is wrong > {noformat} > Note that Balancer only balances the data and does not balance the namespace. > For the complete command usage, see balancer. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8780) Fetching live/dead datanode list with arg true for removeDecommissionNode,returns list with decom node.
[ https://issues.apache.org/jira/browse/HDFS-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741923#comment-14741923 ] Hadoop QA commented on HDFS-8780: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 57s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 3s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 23s | The applied patch generated 1 new checkstyle issues (total was 323, now 322). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 11s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 164m 8s | Tests failed in hadoop-hdfs. | | | | 209m 47s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter | | | hadoop.tools.TestJMXGet | | | hadoop.hdfs.web.TestWebHDFSOAuth2 | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755530/HDFS-8780.2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 9538af0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12409/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12409/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12409/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12409/console | This message was automatically generated. > Fetching live/dead datanode list with arg true for > removeDecommissionNode,returns list with decom node. > --- > > Key: HDFS-8780 > URL: https://issues.apache.org/jira/browse/HDFS-8780 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: J.Andreina >Assignee: J.Andreina >Priority: Critical > Attachments: HDFS-8780.1.patch, HDFS-8780.2.patch > > > Current implementation: > == > DatanodeManager#removeDecomNodeFromList() , Decommissioned node will be > removed from dead/live node list only if below conditions are met > I . If the Include list is not empty. > II. If include and exclude list does not have decommissioned node and node > state is decommissioned. 
> {code} > if (!hostFileManager.hasIncludes()) { > return; >} > if ((!hostFileManager.isIncluded(node)) && > (!hostFileManager.isExcluded(node)) > && node.isDecommissioned()) { > // Include list is not empty, an existing datanode does not appear > // in both include or exclude lists and it has been decommissioned. > // Remove it from the node list. > it.remove(); > } > {code} > As mentioned in javadoc a datanode cannot be in "already decommissioned > datanode state". > Following the steps mentioned in javadoc datanode state is "dead" and not > decommissioned. > *Can we avoid the unnecessary checks and have check for the node is in > decommissioned state then remove from node list. ?* > Please provide your feedback. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
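For illustration, a minimal sketch of the simplification proposed in the description above: dropping the include/exclude checks and removing a node from the live/dead list purely on its decommissioned state. This is an assumption about the proposal, not the attached patch; {{DatanodeDescriptor}} and the iterator shape follow the snippet quoted above.
{code}
// Sketch of the proposed check (assumption, not the committed fix); uses
// java.util.Iterator and java.util.List. A decommissioned node is dropped from
// the live/dead list regardless of whether the include/exclude host lists mention it.
private void removeDecomNodeFromList(List<DatanodeDescriptor> nodeList) {
  for (Iterator<DatanodeDescriptor> it = nodeList.iterator(); it.hasNext();) {
    DatanodeDescriptor node = it.next();
    if (node.isDecommissioned()) {
      it.remove();
    }
  }
}
{code}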
[jira] [Commented] (HDFS-9041) Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741883#comment-14741883 ] Hadoop QA commented on HDFS-9041: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 36s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 9s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 85m 26s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 28s | Tests passed in hadoop-hdfs-client. | | | | 125m 10s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement | | Timed out tests | org.apache.hadoop.hdfs.TestSafeMode | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755268/HDFS-9041.001.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / 9538af0 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12407/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12407/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12407/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12407/console | This message was automatically generated. > Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client > > > Key: HDFS-9041 > URL: https://issues.apache.org/jira/browse/HDFS-9041 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Mingliang Liu > Attachments: HDFS-9041.000.patch, HDFS-9041.001.patch > > > This is a follow up of HDFS-8052. It looks like HDFS-8052 breaks > {{TestLocalJobSubmission}} in MR. > HDFS-8052 moves the implementation of {{WebHdfsFileSystem}} and > {{SWebHdfsFileSystem}} to hdfs-client. > According the usage of ServiceLoader, the corresponding entries in > {{hadoop-hdfs-project/hadoop-hdfs/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}} > should be moved as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
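Background on the ServiceLoader mechanism referenced above: {{FileSystem}} implementations are discovered from provider files named {{META-INF/services/org.apache.hadoop.fs.FileSystem}} on the classpath, so the entries must ship in the same artifact as the implementation classes. A minimal sketch follows; the two entry names shown are assumptions based on the classes mentioned in the description.
{code}
// META-INF/services/org.apache.hadoop.fs.FileSystem lists one implementation per line, e.g.:
//   org.apache.hadoop.hdfs.web.WebHdfsFileSystem
//   org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
import java.util.ServiceLoader;
import org.apache.hadoop.fs.FileSystem;

public class ListFileSystemProviders {
  public static void main(String[] args) {
    // ServiceLoader only finds providers whose service file is in a jar on the
    // classpath, which is why the entries have to move together with the classes.
    for (FileSystem fs : ServiceLoader.load(FileSystem.class)) {
      System.out.println(fs.getClass().getName() + " handles scheme " + fs.getScheme());
    }
  }
}
{code}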
[jira] [Commented] (HDFS-8996) Consolidate validateLog and scanLog in FJM#EditLogFile
[ https://issues.apache.org/jira/browse/HDFS-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741881#comment-14741881 ] Hadoop QA commented on HDFS-8996: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 22m 31s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 7m 50s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 12s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 34s | The applied patch generated 1 new checkstyle issues (total was 159, now 156). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 19s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 192m 58s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 6m 21s | Tests passed in bkjournal. | | | | 250m 36s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory | | | hadoop.tools.TestJMXGet | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755506/HDFS-8996.01.patch | | Optional Tests | javac unit findbugs checkstyle javadoc | | git revision | trunk / 9538af0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12405/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12405/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12405/artifact/patchprocess/testrun_hadoop-hdfs.txt | | bkjournal test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12405/artifact/patchprocess/testrun_bkjournal.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12405/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12405/console | This message was automatically generated. 
> Consolidate validateLog and scanLog in FJM#EditLogFile > -- > > Key: HDFS-8996 > URL: https://issues.apache.org/jira/browse/HDFS-8996 > Project: Hadoop HDFS > Issue Type: Bug > Components: journal-node, namenode >Affects Versions: 2.0.0-alpha >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-8996.00.patch, HDFS-8996.01.patch > > > After HDFS-8965 is committed, {{scanEditLog}} will be identical to > {{validateEditLog}} in {{EditLogInputStream}} and {{FSEditlogLoader}}. This > is a place holder for us to remove the redundant {{scanEditLog}} code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741879#comment-14741879 ] Hadoop QA commented on HDFS-8873: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 56s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 1s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 1s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 34s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 7s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 26s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 11s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 191m 53s | Tests failed in hadoop-hdfs. | | | | 234m 46s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.tools.TestJMXGet | | | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory | | | hadoop.hdfs.shortcircuit.TestShortCircuitCache | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755501/HDFS-8873.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 9538af0 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12406/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12406/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12406/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12406/console | This message was automatically generated. > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791) for details. > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9065) Include commas on # of files, blocks, total filesystem objects in NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741872#comment-14741872 ] Hadoop QA commented on HDFS-9065: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 0m 24s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755531/HDFS-9065.002.patch | | Optional Tests | | | git revision | trunk / 9538af0 | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12410/console | This message was automatically generated. > Include commas on # of files, blocks, total filesystem objects in NN Web UI > --- > > Key: HDFS-9065 > URL: https://issues.apache.org/jira/browse/HDFS-9065 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Attachments: HDFS-9065.001.patch, HDFS-9065.002.patch > > > Include commas on the number of files, blocks, and total filesystem objects > in the NN Web UI (please see example below) to make the numbers easier to > read. > Current format: > 3236 files and directories, 1409 blocks = 4645 total filesystem object(s). > Proposed format: > 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9065) Include commas on # of files, blocks, total filesystem objects in NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-9065: --- Attachment: HDFS-9065.002.patch Good suggestion. Here's a new patch. > Include commas on # of files, blocks, total filesystem objects in NN Web UI > --- > > Key: HDFS-9065 > URL: https://issues.apache.org/jira/browse/HDFS-9065 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Attachments: HDFS-9065.001.patch, HDFS-9065.002.patch > > > Include commas on the number of files, blocks, and total filesystem objects > in the NN Web UI (please see example below) to make the numbers easier to > read. > Current format: > 3236 files and directories, 1409 blocks = 4645 total filesystem object(s). > Proposed format: > 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9066) expose truncate via webhdfs
Allen Wittenauer created HDFS-9066: -- Summary: expose truncate via webhdfs Key: HDFS-9066 URL: https://issues.apache.org/jira/browse/HDFS-9066 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Affects Versions: 3.0.0 Reporter: Allen Wittenauer Truncate should be exposed to WebHDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8780) Fetching live/dead datanode list with arg true for removeDecommissionNode,returns list with decom node.
[ https://issues.apache.org/jira/browse/HDFS-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] J.Andreina updated HDFS-8780: - Attachment: HDFS-8780.2.patch Updated the patch fixing checkstyle comments. Testcase failures are unrelated. Please review. > Fetching live/dead datanode list with arg true for > removeDecommissionNode,returns list with decom node. > --- > > Key: HDFS-8780 > URL: https://issues.apache.org/jira/browse/HDFS-8780 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: J.Andreina >Assignee: J.Andreina >Priority: Critical > Attachments: HDFS-8780.1.patch, HDFS-8780.2.patch > > > Current implementation: > == > DatanodeManager#removeDecomNodeFromList() , Decommissioned node will be > removed from dead/live node list only if below conditions are met > I . If the Include list is not empty. > II. If include and exclude list does not have decommissioned node and node > state is decommissioned. > {code} > if (!hostFileManager.hasIncludes()) { > return; >} > if ((!hostFileManager.isIncluded(node)) && > (!hostFileManager.isExcluded(node)) > && node.isDecommissioned()) { > // Include list is not empty, an existing datanode does not appear > // in both include or exclude lists and it has been decommissioned. > // Remove it from the node list. > it.remove(); > } > {code} > As mentioned in javadoc a datanode cannot be in "already decommissioned > datanode state". > Following the steps mentioned in javadoc datanode state is "dead" and not > decommissioned. > *Can we avoid the unnecessary checks and have check for the node is in > decommissioned state then remove from node list. ?* > Please provide your feedback. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741849#comment-14741849 ] Allen Wittenauer commented on HDFS-9055: bq. This is a false statement. The people who wrote the library disagree. So if you have solution, you should send that to Google. Frankly, I'll take their word over yours given they, you know, built it. bq. There is no need to use protobuf 3.x. The format is wire-compatible with protobuf 2.x, which means the current RPC will work without changes. Please go back and re-read exactly what I said. Slowly this time. bq. Have you tried it out? No because... bq. You're good as long as you're able to talk HTTP and protobuf. ... protobuf instantly kills it. If we wanted to use protobuf, we wouldn't be using REST to begin with! > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9040) Erasure coding: A BlockGroupDataStreamer to rule all internal blocks streamers
[ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9040: Attachment: HDFS-9040.001.wip.patch Uploading a WIP patch (based on Walter's great work in HDFS-8383) to demo the above idea. The patch is only halfway done, so I will keep working on it. > Erasure coding: A BlockGroupDataStreamer to rule all internal blocks streamers > -- > > Key: HDFS-9040 > URL: https://issues.apache.org/jira/browse/HDFS-9040 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-9040.00.patch, HDFS-9040.001.wip.patch > > > A {{BlockGroupDataStreamer}} to communicate with the NN to allocate/update blocks, > while {{StripedDataStreamer}}s only have to stream blocks to DNs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9065) Include commas on # of files, blocks, total filesystem objects in NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741835#comment-14741835 ] Haohui Mai commented on HDFS-9065: -- Thanks! It might make sense to use {{Number.toLocaleString}}: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/toLocaleString > Include commas on # of files, blocks, total filesystem objects in NN Web UI > --- > > Key: HDFS-9065 > URL: https://issues.apache.org/jira/browse/HDFS-9065 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Attachments: HDFS-9065.001.patch > > > Include commas on the number of files, blocks, and total filesystem objects > in the NN Web UI (please see example below) to make the numbers easier to > read. > Current format: > 3236 files and directories, 1409 blocks = 4645 total filesystem object(s). > Proposed format: > 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
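The {{Number.toLocaleString}} suggestion applies to the JavaScript templates that render the NN web UI. As a language-neutral illustration of the same grouped-number formatting (not the actual patch), a small Java sketch:
{code}
import java.text.NumberFormat;
import java.util.Locale;

public class GroupedNumbers {
  public static void main(String[] args) {
    long files = 3236, blocks = 1409;
    // Integer instances apply digit grouping by default (e.g. 3236 -> "3,236" for en-US).
    NumberFormat fmt = NumberFormat.getIntegerInstance(Locale.US);
    System.out.println(fmt.format(files) + " files and directories, "
        + fmt.format(blocks) + " blocks = "
        + fmt.format(files + blocks) + " total filesystem object(s).");
  }
}
{code}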
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741828#comment-14741828 ] Haohui Mai commented on HDFS-9055: -- bq. it does not work with curl or wget despite the above assurances. This is a false statement. You're good as long as you're able to talk HTTP and protobuf. There is no out-of-box support but it doesn't mean it's not feasible. bq. it is HIGHLY HIGHLY HIGHLY recommend to use protobuf 3.x in order to get the widest possible language coverage. There is no need to use protobuf 3.x. The format is wire-compatible with protobuf 2.x, which means the current RPC will work without changes. bq. So yeah, gRPC is a total non-starter without massive delays and reworking things like the entire RPC layer. This looks highly speculative from both design and implementation point of view. Have you tried it out? > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741774#comment-14741774 ] Allen Wittenauer commented on HDFS-9055: A bit more research reveals that the gRPC FAQ specifically states that it currently doesn't work in a browser. So yes, it does not work with curl or wget despite the above assurances. Added bonus: it is HIGHLY HIGHLY HIGHLY recommend to use protobuf 3.x in order to get the widest possible language coverage. So yeah, gRPC is a total non-starter without massive delays and reworking things like the entire RPC layer. Meanwhile... back to fixing WebHDFS. > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4981) chmod 777 the .snapshot directory does not error that modification on RO snapshot is disallowed
[ https://issues.apache.org/jira/browse/HDFS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741764#comment-14741764 ] Xiao Chen commented on HDFS-4981: - Hi Stephen, Thanks for reporting the issue. I'll be working on it later next week. > chmod 777 the .snapshot directory does not error that modification on RO > snapshot is disallowed > --- > > Key: HDFS-4981 > URL: https://issues.apache.org/jira/browse/HDFS-4981 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Affects Versions: 3.0.0, 2.0.4-alpha >Reporter: Stephen Chu >Assignee: Xiao Chen >Priority: Trivial > > Snapshots currently are RO, so it's expected that when someone tries to > modify the .snapshot directory s/he is denied. > However, if the user tries to chmod 777 the .snapshot directory, the > operation does not error. The user should be alerted that modifications are > not allowed, even if this operation didn't actually change anything. > Using other modes will trigger the error, though. > {code} > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chmod 777 > /user/schu/test_dir_1/.snapshot/ > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chmod 755 > /user/schu/test_dir_1/.snapshot/ > chmod: changing permissions of '/user/schu/test_dir_1/.snapshot': > Modification on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chmod 435 > /user/schu/test_dir_1/.snapshot/ > chmod: changing permissions of '/user/schu/test_dir_1/.snapshot': > Modification on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chown hdfs > /user/schu/test_dir_1/.snapshot/ > chown: changing ownership of '/user/schu/test_dir_1/.snapshot': Modification > on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chown schu > /user/schu/test_dir_1/.snapshot/ > chown: changing ownership of '/user/schu/test_dir_1/.snapshot': Modification > on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741762#comment-14741762 ] Xiao Chen commented on HDFS-5802: - Hi Harsh, Thanks for reporting the issue. I'll be working on it. > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-4981) chmod 777 the .snapshot directory does not error that modification on RO snapshot is disallowed
[ https://issues.apache.org/jira/browse/HDFS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen reassigned HDFS-4981: --- Assignee: Xiao Chen > chmod 777 the .snapshot directory does not error that modification on RO > snapshot is disallowed > --- > > Key: HDFS-4981 > URL: https://issues.apache.org/jira/browse/HDFS-4981 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Affects Versions: 3.0.0, 2.0.4-alpha >Reporter: Stephen Chu >Assignee: Xiao Chen >Priority: Trivial > > Snapshots currently are RO, so it's expected that when someone tries to > modify the .snapshot directory s/he is denied. > However, if the user tries to chmod 777 the .snapshot directory, the > operation does not error. The user should be alerted that modifications are > not allowed, even if this operation didn't actually change anything. > Using other modes will trigger the error, though. > {code} > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chmod 777 > /user/schu/test_dir_1/.snapshot/ > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chmod 755 > /user/schu/test_dir_1/.snapshot/ > chmod: changing permissions of '/user/schu/test_dir_1/.snapshot': > Modification on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chmod 435 > /user/schu/test_dir_1/.snapshot/ > chmod: changing permissions of '/user/schu/test_dir_1/.snapshot': > Modification on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chown hdfs > /user/schu/test_dir_1/.snapshot/ > chown: changing ownership of '/user/schu/test_dir_1/.snapshot': Modification > on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ sudo -u hdfs hdfs dfs -chown schu > /user/schu/test_dir_1/.snapshot/ > chown: changing ownership of '/user/schu/test_dir_1/.snapshot': Modification > on a read-only snapshot is disallowed > [schu@hdfs-snapshots-1 hdfs]$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen reassigned HDFS-5802: --- Assignee: Xiao Chen > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8874) Add DN metrics for balancer and other block movement scenarios
[ https://issues.apache.org/jira/browse/HDFS-8874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741711#comment-14741711 ] Ming Ma commented on HDFS-8874: --- Thanks [~ctrezzo]. * If the new {{receiveBlock}} returns false due to balanceThrottler limit, it seems {{addReplaceBlockOp}} still gets called. * The new functions {{sendBlock}} and {{receiveBlock}} don't cover the regular read and write scenarios. So maybe it is better to use different names like {{copyBlockInt}}, {{replaceBlockInt}}. > Add DN metrics for balancer and other block movement scenarios > -- > > Key: HDFS-8874 > URL: https://issues.apache.org/jira/browse/HDFS-8874 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Chris Trezzo > Attachments: HDFS-8874-trunk-v1.patch, HDFS-8874-trunk-v2.patch, > HDFS-8874-trunk-v3.patch > > > For balancer, mover and migrator (HDFS-8789), we want to know how close it is > to the DN's throttling thresholds. Although DN has existing metrics such as > {{BytesWritten}}, {{BytesRead}}, {{CopyBlockOpNumOps}} and > {{ReplaceBlockOpNumOps}}, there is no metrics to indicate the number of bytes > moved. > We can add {{ReplaceBlockBytesWritten}} and {{CopyBlockBytesRead}} to account > for the bytes moved in ReplaceBlock and CopyBlock operations. In addition, we > can also add throttling metrics for {{DataTransferThrottler}} and > {{BlockBalanceThrottler}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
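A rough sketch of what the counters proposed above could look like with the Hadoop metrics2 annotations. The class name, method names, and wiring are assumptions for illustration, not the attached patch:
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

// Hypothetical metrics source for block-movement accounting on the DataNode.
@Metrics(name = "BlockMoveMetrics", about = "DataNode block movement metrics", context = "dfs")
public class BlockMoveMetrics {
  @Metric("Bytes read while serving copyBlock (source side of a move)")
  MutableCounterLong copyBlockBytesRead;
  @Metric("Bytes written while serving replaceBlock (target side of a move)")
  MutableCounterLong replaceBlockBytesWritten;

  // Would be called from the copyBlock/replaceBlock paths after each transfer completes.
  public void incrCopyBlockBytesRead(long bytes) { copyBlockBytesRead.incr(bytes); }
  public void incrReplaceBlockBytesWritten(long bytes) { replaceBlockBytesWritten.incr(bytes); }
}
{code}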
[jira] [Commented] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741694#comment-14741694 ] Hadoop QA commented on HDFS-6955: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. | | {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 5s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 21s | The applied patch generated 3 new checkstyle issues (total was 197, now 198). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 19s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 149m 22s | Tests failed in hadoop-hdfs. | | | | 194m 35s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles | | | hadoop.hdfs.server.namenode.TestMetadataVersionOutput | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory | | | hadoop.hdfs.server.namenode.TestMetaSave | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter | | Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755469/HDFS-6955-08.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ea4bb27 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12402/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12402/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12402/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12402/console | This message was automatically generated. > DN should reserve disk space for a full block when creating tmp files > - > > Key: HDFS-6955 > URL: https://issues.apache.org/jira/browse/HDFS-6955 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: Arpit Agarwal >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch, > HDFS-6955-03.patch, HDFS-6955-04.patch, HDFS-6955-05.patch, > HDFS-6955-06.patch, HDFS-6955-07.patch, HDFS-6955-08.patch > > > HDFS-6898 is introducing disk space reservation for RBW files to avoid > running out of disk space midway through block creation. 
> This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9065) Include commas on # of files, blocks, total filesystem objects in NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741670#comment-14741670 ] Hadoop QA commented on HDFS-9065: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 0m 27s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755496/HDFS-9065.001.patch | | Optional Tests | | | git revision | trunk / 9538af0 | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12404/console | This message was automatically generated. > Include commas on # of files, blocks, total filesystem objects in NN Web UI > --- > > Key: HDFS-9065 > URL: https://issues.apache.org/jira/browse/HDFS-9065 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Attachments: HDFS-9065.001.patch > > > Include commas on the number of files, blocks, and total filesystem objects > in the NN Web UI (please see example below) to make the numbers easier to > read. > Current format: > 3236 files and directories, 1409 blocks = 4645 total filesystem object(s). > Proposed format: > 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8996) Consolidate validateLog and scanLog in FJM#EditLogFile
[ https://issues.apache.org/jira/browse/HDFS-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8996: Attachment: HDFS-8996.01.patch Thanks Colin. Updating the patch to eliminate {{validateLog}} and {{validateEditLog}} from all places. To do that we need to add a switch to {{scanLog}} controlling whether to verify layout version. Based on existing logic I guess JN shouldn't try to verify layout version. > Consolidate validateLog and scanLog in FJM#EditLogFile > -- > > Key: HDFS-8996 > URL: https://issues.apache.org/jira/browse/HDFS-8996 > Project: Hadoop HDFS > Issue Type: Bug > Components: journal-node, namenode >Affects Versions: 2.0.0-alpha >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-8996.00.patch, HDFS-8996.01.patch > > > After HDFS-8965 is committed, {{scanEditLog}} will be identical to > {{validateEditLog}} in {{EditLogInputStream}} and {{FSEditlogLoader}}. This > is a place holder for us to remove the redundant {{scanEditLog}} code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
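A purely hypothetical sketch of the switch described above: a flag on {{scanLog}} so JournalNode callers can skip layout-version verification while NameNode callers keep it. The method shape, parameter name, and delegated call are placeholders, not the attached patch:
{code}
// Hypothetical sketch only; names are placeholders, not the attached patch.
void scanLog(long maxTxIdToScan, boolean verifyLayoutVersion) throws IOException {
  // verifyLayoutVersion == false reproduces the old validateEditLog behavior for the JN path,
  // while true keeps the stricter check used on the NameNode side.
  this.results = EditLogFileInputStream.scanEditLog(file, maxTxIdToScan, verifyLayoutVersion);
}
{code}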
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741643#comment-14741643 ] Zhe Zhang commented on HDFS-7285: - I agree with Jing that write pipeline error handling can be moved to follow-on. We should also think about whether to merge EC to branch-2, and if so, what should be the target version (current target version is 3.0.0). > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Attachments: Compare-consolidated-20150824.diff, > Consolidated-20150707.patch, Consolidated-20150806.patch, > Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, > HDFS-7285-Consolidated-20150911.patch, HDFS-7285-initial-PoC.patch, > HDFS-7285-merge-consolidated-01.patch, > HDFS-7285-merge-consolidated-trunk-01.patch, > HDFS-7285-merge-consolidated.trunk.03.patch, > HDFS-7285-merge-consolidated.trunk.04.patch, > HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, > HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, > HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, > HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, > HDFSErasureCodingSystemTestPlan-20150824.pdf, > HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrifice > of data reliability, comparing to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, > with storage overhead only being 40%. This makes EC a quite attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contribute packages in HDFS but had been removed since Hadoop 2.0 > for maintain reason. The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are intended not to be appended anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, makes it self-contained and > independently maintained. This design lays the EC feature on the storage type > support and considers compatible with existing HDFS features like caching, > snapshot, encryption, high availability and etc. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and makes the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8898) Create API and command-line argument to get quota without need to get file and directory counts
[ https://issues.apache.org/jira/browse/HDFS-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741632#comment-14741632 ] Joep Rottinghuis commented on HDFS-8898: So it sounds like we're discussing two things here: 1) Getting the quota itself for a directory that a user has access to. There seem to be few security concerns with this. 2) Getting the quota, and the "ContentSummary" / count / usage for a directory that a user has access to, even if they might not have access to all the sub-directories. This is where [~jlowe] pointed out that there could be a potential security implication. Even with yielding the NN lock, it seems the NN can still lock for ~1 sec per 10M files in a sub-directory to check the entire sub-directory tree for permissions. To address the potential security implications for 2) we could either make this a cluster-wide (final) config value, or we could do something with an extended attribute on the directory itself to allow or disallow a particular directory to be traversed. 1) would give a huge performance boost for the cases when people just want to know what the quota is. 2) would give a huge performance boost for the cases when people want to know a quota plus what's left for large directories relatively high in the directory structure (let alone / on a huge namespace of many tens of millions of files). > Create API and command-line argument to get quota without need to get file > and directory counts > --- > > Key: HDFS-8898 > URL: https://issues.apache.org/jira/browse/HDFS-8898 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Reporter: Joep Rottinghuis > > On large directory structures it takes significant time to iterate through > the file and directory counts recursively to get a complete ContentSummary. > When you just want to check the quota on a higher-level directory it > would be good to have an option to skip the file and directory counts. > Moreover, currently one can only check the quota if one has access to all > the directories underneath. For example, if I have a large home directory > under /user/joep and I host some files for another user in a sub-directory, > the moment they create an unreadable sub-directory under my home I can no > longer check what my quota is. Understood that I cannot check the current > file counts unless I can iterate through all the usage, but for > administrative purposes it is nice to be able to get the current quota > setting on a directory without the need to iterate through and run into > permission issues on sub-directories. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
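A hypothetical client-side sketch of option 1) above: a quota-only result that never triggers the recursive count, so unreadable sub-directories are never touched. The type and method names are placeholders for illustration and are not an existing FileSystem API:
{code}
// Hypothetical value type for a "quota only" call (placeholder names).
public final class QuotaOnly {
  public final long nsQuota; // namespace quota (file + directory count limit), -1 if unset
  public final long dsQuota; // diskspace quota in bytes, -1 if unset

  public QuotaOnly(long nsQuota, long dsQuota) {
    this.nsQuota = nsQuota;
    this.dsQuota = dsQuota;
  }
}

// Intended usage (hypothetical): only the directory's own quota attributes are read,
// so the NN never traverses (possibly unreadable) sub-directories:
//   QuotaOnly q = dfs.getQuotaOnly(new Path("/user/joep"));
{code}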
[jira] [Commented] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741622#comment-14741622 ] Colin Patrick McCabe commented on HDFS-8873: Good work, [~templedf]. {code} public static final String DFS_DATANODE_DIRECTORYSCAN_THROTTLE_KEY = "dfs.datanode.directoryscan.throttle"; {code} Should be "throttle.type"? {code} 649 650 dfs.datanode.directoryscan.throttle.limit 651 0 652 The limit setting for the report compiler threads throttle. The 653 meaning of this setting is determined by the 654 dfs.datanode.directoryscan.throttle setting. In all cases, setting this limit 655 to 0 disables compiler thread throttling. 0 is the default setting. 656 657 {code} I would prefer to see per-type configuration keys that are more descriptive. For example, {{dfs.datanode.directoryscan.timeslice.throttle.ms.per.sec}}. If we invent more throttle types later, we can always add more configuration key names. testTimesliceThrottle: please copy the Configuration object and change the copy rather than mutating {{TestDirectoryScanner#CONF}}. {code} 604 // Waiting should be about 4x running. 605 ratio = (float)scanner.timeWaiting.get() / scanner.timeRunning.get(); 606 607 LOG.info("RATIO: " + ratio); 608 assertTrue("Throttle is too restrictive", ratio <= 4.5); 609 assertTrue("Throttle is too permissive", ratio >= 3.0); {code} I'm a bit concerned that other tests running on the same machine, or GCs could cause delays here that would make the test fail. Perhaps we could do this in a loop and keep retrying until we found that the ratio was correct? {code} 84@VisibleForTesting 85final AtomicLong timeRunning = new AtomicLong(0L); {code} Should be "timeRunningMs" to reflect the fact that this interval is in milliseconds. Similar with "timeWaiting" {code} 115 public static ThrottleType parse(String type) { 116 if (type.trim().equalsIgnoreCase(TIMESLICE.toString())) { 117 return TIMESLICE; 118 } else { 119 return NONE; 120 } 121 } {code} We should log an ERROR message if we can't understand the throttle type. Silently defaulting to doing nothing is not behavior most users will appreciate. DirectoryScanner.java: there are some unnecessary whitespace changes. TimesliceThrottler: maybe TimeSliceThrottlerTask is a more appropriate name here? I like to think of executors as scheduling tasks. Arguably the throttler state is all contained outside the class so it's not really "the throttler." {code} 1121 } catch (Throwable t) { 1122LOG.error("Throttle thread died unexpectedly", t); 1123 1124if (t instanceof Error) { 1125 throw t; 1126} 1127 } {code} What's the purpose of rethrowing exceptions here? {code} private static class Throttle extends Semaphore { {code} While this works, it feels more natural to use a boolean + condition variable here. {code} try { lock.lock(); while (blockThreads) { cond.wait(); } } finally { lock.unlock(); } {code} {code} 74private final ThrottleType throttleType; 75private final int throttleLimit; 76private ScheduledExecutorService throttleThread; 77private Semaphore runningThreads = new Semaphore(0); 78private volatile Throttle throttle; {code} It feels like this state should be encapsulated inside the Throttler itself. {code} 860 // Give our parent a chance to block us for throttling 861 if (throttle != null) { 862 throttle.start(); 863 } {code} Can we just say that Throttler is always non-null, but sometimes we have a {{NoOpThrottler}}? I don't like all this checking for null business. 
You could get rid of the type enum and all those explicit type checks, and just have a factory function inside an interface that creates the appropriate kind of Throttler object from the Configuration (no-op, timeslice, etc). > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791) for details. > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes a
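For reference, a self-contained sketch of the boolean + condition-variable gate suggested above. With java.util.concurrent.locks a {{Condition}} is waited on with {{await()}} (not {{wait()}}), and the lock is normally acquired before the try block; the class and method names here are illustrative only.
{code}
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative gate for pausing report-compiler threads; not the patch itself.
class ThrottleGate {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition unblocked = lock.newCondition();
  private boolean blockThreads = false;

  void setBlocked(boolean blocked) {
    lock.lock();
    try {
      blockThreads = blocked;
      if (!blocked) {
        unblocked.signalAll(); // wake every waiting compiler thread
      }
    } finally {
      lock.unlock();
    }
  }

  void awaitUnblocked() throws InterruptedException {
    lock.lock();
    try {
      while (blockThreads) {
        unblocked.await(); // Condition uses await(), not Object.wait()
      }
    } finally {
      lock.unlock();
    }
  }
}
{code}
A no-op variant of such a gate (or a factory that returns one from the Configuration) would also cover the "always non-null throttler" suggestion and remove the null checks.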
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741620#comment-14741620 ] Jing Zhao commented on HDFS-7285: - Thanks Zhe! Also IMO the write pipeline error handling may not be a blocker for merging the feature branch to trunk (but may be for branch-2). > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Attachments: Compare-consolidated-20150824.diff, > Consolidated-20150707.patch, Consolidated-20150806.patch, > Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, > HDFS-7285-Consolidated-20150911.patch, HDFS-7285-initial-PoC.patch, > HDFS-7285-merge-consolidated-01.patch, > HDFS-7285-merge-consolidated-trunk-01.patch, > HDFS-7285-merge-consolidated.trunk.03.patch, > HDFS-7285-merge-consolidated.trunk.04.patch, > HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, > HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, > HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, > HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, > HDFSErasureCodingSystemTestPlan-20150824.pdf, > HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrifice > of data reliability, comparing to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, > with storage overhead only being 40%. This makes EC a quite attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contribute packages in HDFS but had been removed since Hadoop 2.0 > for maintain reason. The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are intended not to be appended anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, makes it self-contained and > independently maintained. This design lays the EC feature on the storage type > support and considers compatible with existing HDFS features like caching, > snapshot, encryption, high availability and etc. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and makes the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8873: --- Attachment: HDFS-8873.001.patch Resolved some of the checkstyle issues. On the line-too-long comments: they're in the DFSConfig file, where every line is too long. I would assume I should follow the established pattern rather than introduce a new style just for checkstyle. On the comments about making the fields private and adding accessors: since the fields are all final, I don't see the need. Does anyone disagree? The whitespace issue wasn't something I introduced, so I don't know why it's complaining. > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791) for details. > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8873: --- Attachment: (was: HDFS-8873.001.patch) > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791) for details. > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741591#comment-14741591 ] Hadoop QA commented on HDFS-8873: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 46s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 25s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 22s | The applied patch generated 20 new checkstyle issues (total was 442, now 441). | | {color:red}-1{color} | whitespace | 0m 7s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 27s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 18s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 73m 1s | Tests failed in hadoop-hdfs. | | | | 119m 6s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 | | Timed out tests | org.apache.hadoop.hdfs.TestDFSFinalize | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755474/HDFS-8873.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ea4bb27 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12403/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12403/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12403/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12403/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12403/console | This message was automatically generated. > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791) for details. > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741571#comment-14741571 ] Hudson commented on HDFS-9027: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #359 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/359/]) HDFS-9027. Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method. (Contributed by Mingliang Liu) (arp: rev 15a557fcfec5eceedde9f1597385d5d3b01b2cd7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9065) Include commas on # of files, blocks, total filesystem objects in NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-9065: --- Status: Patch Available (was: Open) > Include commas on # of files, blocks, total filesystem objects in NN Web UI > --- > > Key: HDFS-9065 > URL: https://issues.apache.org/jira/browse/HDFS-9065 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Attachments: HDFS-9065.001.patch > > > Include commas on the number of files, blocks, and total filesystem objects > in the NN Web UI (please see example below) to make the numbers easier to > read. > Current format: > 3236 files and directories, 1409 blocks = 4645 total filesystem object(s). > Proposed format: > 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9065) Include commas on # of files, blocks, total filesystem objects in NN Web UI
[ https://issues.apache.org/jira/browse/HDFS-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-9065: --- Attachment: HDFS-9065.001.patch > Include commas on # of files, blocks, total filesystem objects in NN Web UI > --- > > Key: HDFS-9065 > URL: https://issues.apache.org/jira/browse/HDFS-9065 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Attachments: HDFS-9065.001.patch > > > Include commas on the number of files, blocks, and total filesystem objects > in the NN Web UI (please see example below) to make the numbers easier to > read. > Current format: > 3236 files and directories, 1409 blocks = 4645 total filesystem object(s). > Proposed format: > 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7285: Attachment: HDFS-7285-Consolidated-20150911.patch With HDFS-8833 committed, all remaining subtasks are about write pipeline error handling (with very little change to existing trunk code) and minor improvements. I suggest we start reviewing the feature branch in the context of merging to trunk. I'm attaching the current consolidated patch ({{HDFS-7285-Consolidated-20150911.patch}}). Feedback is very welcome! > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Attachments: Compare-consolidated-20150824.diff, > Consolidated-20150707.patch, Consolidated-20150806.patch, > Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, > HDFS-7285-Consolidated-20150911.patch, HDFS-7285-initial-PoC.patch, > HDFS-7285-merge-consolidated-01.patch, > HDFS-7285-merge-consolidated-trunk-01.patch, > HDFS-7285-merge-consolidated.trunk.03.patch, > HDFS-7285-merge-consolidated.trunk.04.patch, > HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, > HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, > HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, > HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, > HDFSErasureCodingSystemTestPlan-20150824.pdf, > HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrifice > of data reliability, comparing to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, > with storage overhead only being 40%. This makes EC a quite attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contribute packages in HDFS but had been removed since Hadoop 2.0 > for maintain reason. The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are intended not to be appended anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, makes it self-contained and > independently maintained. This design lays the EC feature on the storage type > support and considers compatible with existing HDFS features like caching, > snapshot, encryption, high availability and etc. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and makes the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9065) Include commas on # of files, blocks, total filesystem objects in NN Web UI
Daniel Templeton created HDFS-9065: -- Summary: Include commas on # of files, blocks, total filesystem objects in NN Web UI Key: HDFS-9065 URL: https://issues.apache.org/jira/browse/HDFS-9065 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Daniel Templeton Assignee: Daniel Templeton Priority: Minor Include commas on the number of files, blocks, and total filesystem objects in the NN Web UI (please see example below) to make the numbers easier to read. Current format: 3236 files and directories, 1409 blocks = 4645 total filesystem object(s). Proposed format: 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
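The proposed change is purely about digit grouping. As a rough illustration of the desired output (the NN web UI itself renders these strings in JavaScript, so this Java snippet only demonstrates the grouping behavior, not the actual patch):

{code:java}
import java.text.NumberFormat;
import java.util.Locale;

public class GroupedNumbers {
  public static void main(String[] args) {
    NumberFormat nf = NumberFormat.getIntegerInstance(Locale.US);
    // Current format: 3236 files and directories, 1409 blocks = 4645 total filesystem object(s).
    System.out.println(nf.format(3236) + " files and directories, "
        + nf.format(1409) + " blocks = "
        + nf.format(3236 + 1409) + " total filesystem object(s).");
    // Prints: 3,236 files and directories, 1,409 blocks = 4,645 total filesystem object(s).
  }
}
{code}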
[jira] [Commented] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741552#comment-14741552 ] Hudson commented on HDFS-9027: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2321 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2321/]) HDFS-9027. Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method. (Contributed by Mingliang Liu) (arp: rev 15a557fcfec5eceedde9f1597385d5d3b01b2cd7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741539#comment-14741539 ] Hudson commented on HDFS-9027: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2298 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2298/]) HDFS-9027. Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method. (Contributed by Mingliang Liu) (arp: rev 15a557fcfec5eceedde9f1597385d5d3b01b2cd7) * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8099) Change "DFSInputStream has been closed already" message to debug log level
[ https://issues.apache.org/jira/browse/HDFS-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8099: - Fix Version/s: (was: 2.8.0) 2.7.2 > Change "DFSInputStream has been closed already" message to debug log level > -- > > Key: HDFS-8099 > URL: https://issues.apache.org/jira/browse/HDFS-8099 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Charles Lamb >Assignee: Charles Lamb >Priority: Minor > Fix For: 2.7.2 > > Attachments: HDFS-8099.000.patch, HDFS-8099.001.patch > > > The hadoop fs -get command always shows this warning: > {noformat} > $ hadoop fs -get /data/schemas/sfdc/BusinessHours-2014-12-09.avsc > 15/04/06 06:22:19 WARN hdfs.DFSClient: DFSInputStream has been closed already > {noformat} > This was introduced by HDFS-7494. The easiest thing is to just remove the > warning from the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8099) Change "DFSInputStream has been closed already" message to debug log level
[ https://issues.apache.org/jira/browse/HDFS-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741528#comment-14741528 ] Kihwal Lee commented on HDFS-8099: -- The bug was introduced in 2.7.0. Cherry-picking this to 2.7.2. > Change "DFSInputStream has been closed already" message to debug log level > -- > > Key: HDFS-8099 > URL: https://issues.apache.org/jira/browse/HDFS-8099 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Charles Lamb >Assignee: Charles Lamb >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8099.000.patch, HDFS-8099.001.patch > > > The hadoop fs -get command always shows this warning: > {noformat} > $ hadoop fs -get /data/schemas/sfdc/BusinessHours-2014-12-09.avsc > 15/04/06 06:22:19 WARN hdfs.DFSClient: DFSInputStream has been closed already > {noformat} > This was introduced by HDFS-7494. The easiest thing is to just remove the > warning from the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
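For reference, a minimal sketch of what demoting the message to debug level could look like; this is illustrative only and not necessarily the committed change:

{code:java}
// Sketch: log at DEBUG instead of WARN so "hadoop fs -get" stays quiet by default.
if (DFSClient.LOG.isDebugEnabled()) {
  DFSClient.LOG.debug("DFSInputStream has been closed already");
}
{code}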
[jira] [Resolved] (HDFS-8952) InputStream.PositionRead() should be aware of available DNs
[ https://issues.apache.org/jira/browse/HDFS-8952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-8952. -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-8707 Committed to the HDFS-8707 branch. Thanks James and Bob for the reviews. > InputStream.PositionRead() should be aware of available DNs > --- > > Key: HDFS-8952 > URL: https://issues.apache.org/jira/browse/HDFS-8952 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: HDFS-8707 > > Attachments: HDFS-8952.000.patch > > > This jira provides basic functionality to allow libraries to recover from DN > failures. > {{InputStream.PositionRead()}} should (1) report the DN that when it serves > reads, and (2) allows the users exclude dead nodes during reads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9025) Fix compilation issues on arch linux
[ https://issues.apache.org/jira/browse/HDFS-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9025: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-8707 Target Version/s: HDFS-8707 Status: Resolved (was: Patch Available) Committed to the HDFS-8707 branch. Thanks Owen for the contribution. Thanks James for the review. > Fix compilation issues on arch linux > > > Key: HDFS-9025 > URL: https://issues.apache.org/jira/browse/HDFS-9025 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: HDFS-8707 > > Attachments: HDFS-9025.HDFS-8707.001.patch, > HDFS-9025.HDFS-8707.002.patch, HDFS-9025.HDFS-8707.003.patch, HDFS-9025.patch > > > There are several compilation issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9062) Add a parameter to MiniDFSCluster to turn off security checks on the domain socked path
[ https://issues.apache.org/jira/browse/HDFS-9062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741481#comment-14741481 ] James Clampffer commented on HDFS-9062: --- That's correct. Glad it sounds like a reasonable change. > Add a parameter to MiniDFSCluster to turn off security checks on the domain > socked path > --- > > Key: HDFS-9062 > URL: https://issues.apache.org/jira/browse/HDFS-9062 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Blocker > > I'd like to add a command line parameter that allows the permission checks on > dfs.domain.socket.path to be turned off. > Right now a blocker, or at least major inconvenience, for short circuit > reader development is getting the domain socket path set up with the correct > permissions. I'm working on shared test machines where messing with things > in /var/lib is discouraged. > This should also make it easier to write tests for short circuit reads once > completed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9008) Balancer#Parameters class could use a builder pattern
[ https://issues.apache.org/jira/browse/HDFS-9008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741460#comment-14741460 ] Ming Ma commented on HDFS-9008: --- +1 for the latest patch. [~szetszwo] do you have additional comments? > Balancer#Parameters class could use a builder pattern > - > > Key: HDFS-9008 > URL: https://issues.apache.org/jira/browse/HDFS-9008 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Minor > Attachments: HDFS-9008-trunk-v1.patch, HDFS-9008-trunk-v2.patch, > HDFS-9008-trunk-v3.patch, HDFS-9008-trunk-v4.patch, HDFS-9008-trunk-v5.patch > > > The Balancer#Parameters class is violating a few checkstyle rules. > # Instance variables are not privately scoped and do not have accessor > methods. > # The Balancer#Parameter constructor has too many arguments (according to > checkstyle). > Changing this class to use the builder pattern could fix both of these style > issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
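A compact sketch of the builder pattern being discussed; the fields and defaults below are illustrative, not the actual Balancer#Parameters members:

{code:java}
public class Parameters {
  private final double threshold;
  private final int maxIdleIteration;

  private Parameters(Builder b) {
    this.threshold = b.threshold;
    this.maxIdleIteration = b.maxIdleIteration;
  }

  public double getThreshold() { return threshold; }
  public int getMaxIdleIteration() { return maxIdleIteration; }

  public static class Builder {
    private double threshold = 10.0;     // defaults keep call sites short
    private int maxIdleIteration = 5;

    public Builder setThreshold(double t) { threshold = t; return this; }
    public Builder setMaxIdleIteration(int n) { maxIdleIteration = n; return this; }

    public Parameters build() { return new Parameters(this); }
  }
}
// Usage: Parameters p = new Parameters.Builder().setThreshold(5.0).build();
{code}

This keeps the fields final and private, adds accessors, and avoids the long constructor argument list that checkstyle flags.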
[jira] [Comment Edited] (HDFS-8829) DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning
[ https://issues.apache.org/jira/browse/HDFS-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741447#comment-14741447 ] Colin Patrick McCabe edited comment on HDFS-8829 at 9/11/15 7:47 PM: - Thanks for working on this, [~He Tianyi]. +1 pending jenkins. It looks like you may need to rebase the patch (last build failed) was (Author: cmccabe): Thanks for working on this, [~He Tianyi]. +1. > DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning > --- > > Key: HDFS-8829 > URL: https://issues.apache.org/jira/browse/HDFS-8829 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.3.0, 2.6.0 >Reporter: He Tianyi >Assignee: He Tianyi > Attachments: HDFS-8829.0001.patch, HDFS-8829.0002.patch, > HDFS-8829.0003.patch, HDFS-8829.0004.patch, HDFS-8829.0005.patch > > > {code:java} > private void initDataXceiver(Configuration conf) throws IOException { > // find free port or use privileged port provided > TcpPeerServer tcpPeerServer; > if (secureResources != null) { > tcpPeerServer = new TcpPeerServer(secureResources); > } else { > tcpPeerServer = new TcpPeerServer(dnConf.socketWriteTimeout, > DataNode.getStreamingAddr(conf)); > } > > tcpPeerServer.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); > {code} > The last line sets SO_RCVBUF explicitly, thus disabling tcp auto-tuning on > some system. > Shall we make this behavior configurable? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8829) DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning
[ https://issues.apache.org/jira/browse/HDFS-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741447#comment-14741447 ] Colin Patrick McCabe commented on HDFS-8829: Thanks for working on this, [~He Tianyi]. +1. > DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning > --- > > Key: HDFS-8829 > URL: https://issues.apache.org/jira/browse/HDFS-8829 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.3.0, 2.6.0 >Reporter: He Tianyi >Assignee: He Tianyi > Attachments: HDFS-8829.0001.patch, HDFS-8829.0002.patch, > HDFS-8829.0003.patch, HDFS-8829.0004.patch, HDFS-8829.0005.patch > > > {code:java} > private void initDataXceiver(Configuration conf) throws IOException { > // find free port or use privileged port provided > TcpPeerServer tcpPeerServer; > if (secureResources != null) { > tcpPeerServer = new TcpPeerServer(secureResources); > } else { > tcpPeerServer = new TcpPeerServer(dnConf.socketWriteTimeout, > DataNode.getStreamingAddr(conf)); > } > > tcpPeerServer.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); > {code} > The last line sets SO_RCVBUF explicitly, thus disabling tcp auto-tuning on > some system. > Shall we make this behavior configurable? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
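One way to make the behavior configurable, continuing the snippet in the description; the config key below is hypothetical, and only a positive value would trigger an explicit {{setReceiveBufferSize}} call:

{code:java}
// Sketch: only force a receive buffer size when one is explicitly configured;
// a value <= 0 skips the call and leaves the kernel's TCP auto-tuning in effect.
int recvBufferSize = conf.getInt(
    "dfs.datanode.transfer.socket.recv.buffer.size",   // hypothetical key
    HdfsConstants.DEFAULT_DATA_SOCKET_SIZE);
if (recvBufferSize > 0) {
  tcpPeerServer.setReceiveBufferSize(recvBufferSize);
}
{code}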
[jira] [Commented] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741443#comment-14741443 ] Hudson commented on HDFS-9027: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #373 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/373/]) HDFS-9027. Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method. (Contributed by Mingliang Liu) (arp: rev 15a557fcfec5eceedde9f1597385d5d3b01b2cd7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741436#comment-14741436 ] Hudson commented on HDFS-9027: -- FAILURE: Integrated in Hadoop-Yarn-trunk # (See [https://builds.apache.org/job/Hadoop-Yarn-trunk//]) HDFS-9027. Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method. (Contributed by Mingliang Liu) (arp: rev 15a557fcfec5eceedde9f1597385d5d3b01b2cd7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7767) Use the noredirect flag in WebHDFS to allow web browsers to upload files via the NN UI
[ https://issues.apache.org/jira/browse/HDFS-7767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-7767: --- Attachment: HDFS-7767.01.patch Here's a patch which DOESN'T use the noredirect flag. I'm uploading it only to illustrate the problem. It works only with Chrome; it doesn't work with Firefox. Also, it doesn't let me upload large files (because the NN rightly limits the size of incoming requests). > Use the noredirect flag in WebHDFS to allow web browsers to upload files via > the NN UI > -- > > Key: HDFS-7767 > URL: https://issues.apache.org/jira/browse/HDFS-7767 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash > Attachments: HDFS-7767.01.patch > > > This subtask would use the functionality provided in HDFS-7766 to allow files > to be uploaded to HDFS via a Web-browser. (These include the changes to the > HTML5 and javascript code) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741427#comment-14741427 ] Allen Wittenauer commented on HDFS-9055: Also: https://github.com/grpc/grpc-common/issues/158 does not make me confident that bq. GRPC is over HTTP / HTTPS. Both curl and wget talks HTTP / HTTPS. You do that for the same functionalities. is a true statement. > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9062) Add a parameter to MiniDFSCluster to turn off security checks on the domain socked path
[ https://issues.apache.org/jira/browse/HDFS-9062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741424#comment-14741424 ] Chris Nauroth commented on HDFS-9062: - Thanks for the additional details, James. I hadn't made the connection that you were looking to do an external setup, with the mini-cluster in a separate process from the test executable. If I understand correctly then, the proposal is to add a new argument to {{MiniDFSClusterManager}}, which if present triggers a call to {{DomainSocket#disableBindPathValidation}}. Is that right? If so, then it sounds like a reasonable change. > Add a parameter to MiniDFSCluster to turn off security checks on the domain > socked path > --- > > Key: HDFS-9062 > URL: https://issues.apache.org/jira/browse/HDFS-9062 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Blocker > > I'd like to add a command line parameter that allows the permission checks on > dfs.domain.socket.path to be turned off. > Right now a blocker, or at least major inconvenience, for short circuit > reader development is getting the domain socket path set up with the correct > permissions. I'm working on shared test machines where messing with things > in /var/lib is discouraged. > This should also make it easier to write tests for short circuit reads once > completed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
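A rough sketch of that flag handling inside {{MiniDFSClusterManager}}'s option parsing; the option name is made up for illustration, while {{DomainSocket#disableBindPathValidation}} is the existing test-only hook mentioned above:

{code:java}
import org.apache.hadoop.net.unix.DomainSocket;

// Sketch only; "disableDomainSocketPathValidation" is a hypothetical flag name.
if (cli.hasOption("disableDomainSocketPathValidation")) {
  // Skip the ownership/permission checks on dfs.domain.socket.path so the
  // mini cluster can run from a non-root account on a shared test machine.
  DomainSocket.disableBindPathValidation();
}
{code}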
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741421#comment-14741421 ] Allen Wittenauer commented on HDFS-9055: bq. Can you show some numbers on what are the latencies on data-center networks? Our customers are transferring data from outside our data center, so the point is moot. bq. If the NN takes the read / write lock to perform the recursive requests, shouldn't it be considered as DDoS as well? No, because there's no reason why the NN should lock here. Nothing prevents it from spawning off a DFSClient thread to act just like an RPC client would. > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8873: --- Status: Patch Available (was: Open) > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791) for details. > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8873: --- Attachment: HDFS-8873.001.patch Here's a patch to add a time-based throttle. The changes are a little large because I had to first make the scanner throttlable. I also took the liberty of fleshing out the javadocs. The throttling approach is explained in the comments and javadocs. > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791) for details. > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
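As a rough illustration of what "making the scanner throttlable" means (names below are illustrative, not taken from the patch): the per-volume scan calls a throttle hook between directory listings so a time-based limiter can insert sleeps:

{code:java}
// Sketch of the call site: the throttle may sleep between units of scan work
// so the disk is only kept busy for a configured fraction of each second.
for (File subdir : blockPoolSubdirs) {       // "blockPoolSubdirs" is hypothetical
  throttler.throttle();                      // may sleep to respect the duty cycle
  File[] entries = subdir.listFiles();       // the seek-heavy operation being limited
  if (entries != null) {
    compileReport(entries);                  // hypothetical accounting step
  }
}
{code}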
[jira] [Commented] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741391#comment-14741391 ] Hudson commented on HDFS-9027: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #379 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/379/]) HDFS-9027. Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method. (Contributed by Mingliang Liu) (arp: rev 15a557fcfec5eceedde9f1597385d5d3b01b2cd7) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9040) Erasure coding: A BlockGroupDataStreamer to rule all internal blocks streamers
[ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741388#comment-14741388 ] Jing Zhao commented on HDFS-9040: - Thanks for the patch, Walter! I think this looks much clearer compared with the current implementation. Some thoughts and comments: # In general I think it's the correct direction to push all the coordination logic into one place, and let all the other streamers simply transfer data. # Currently the new block allocation step and failure handling steps can still be interleaved. To me this makes it too hard to guarantee correctness. For example, we need to handle a scenario where some data streamer has not fetched the new block yet when the coordinator starts handling a failure. The current patch tries to handle this by checking the corresponding following block queue. But since a data streamer can be in a state where it has fetched the new block but has not assigned new values to its nodes/storageTypes, we may still have some race condition. Thus I agree with Nicholas's comment [here|https://issues.apache.org/jira/browse/HDFS-8383?focusedCommentId=14737962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14737962], i.e., we need to add some "barriers" to sync all the data streamers so as to simplify the problem. # More specifically, my current proposal for failure handling looks like this. The coordinator side: #* Check for failures periodically. If we use DFSStripedOutputStream as the coordinator, we can easily do this in {{writeChunk}}, e.g., check for failures whenever we've received one stripe of data. #* If there is a new failure, first wait till all the healthy streamers fetch the new block and are in the DATA_STREAMING stage. #* Mark all the healthy streamers as external error. #* Call updateBlockForPipeline and get the new GS. #* Wait till all the healthy streamers fetch the new block from the queue and create new block streams. #* If a new failure happens while creating new block streams, notify all the remaining streamers of the failure and keep them in the external error state. Repeat the above steps. #* Otherwise reset all the external error states and make the updatePipeline RPC call. Then notify all the streamers that this failure handling session has succeeded. # The DataStreamer side: #* When finding itself in the external error state, wait and take the new block from the blocking queue. #* Create a new datanode connection using the new block. #* Notify the coordinator of the result of the new datanode connection creation. #* If the connection creation succeeded, wait for the coordinator's overall result. #* If all the involved streamers succeed, update its block based on the new GS. #* Otherwise repeat the steps. #* And instead of overriding updateBlockForPipeline and updatePipeline, it may be easier to implement the above logic by overriding {{setupPipelineForAppendOrRecovery}}. # Obviously the above proposal may still have some holes. But the direction here is to make sure there is no overlap between different error handling efforts and the new block allocation. Please see if this makes sense to you. # Also I think it is easier to implement the above logic in StripedOutputStream: 1) it's easier to determine when to start block allocation and failure checks, and 2) it's easier to handle exceptions during the NN RPCs since we do not need to pass the exception from a separate coordinator thread. 
But we can discuss this further, and please let me know if I missed something. Currently I have an in-progress patch implementing the above proposal. I will try to get it into better shape and post it as a demo soon. > Erasure coding: A BlockGroupDataStreamer to rule all internal blocks streamers > -- > > Key: HDFS-9040 > URL: https://issues.apache.org/jira/browse/HDFS-9040 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-9040.00.patch > > > A {{BlockGroupDataStreamer}} to communicate with NN to allocate/update block, > and {{StripedDataStreamer}} s only have to stream blocks to DNs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
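To make the proposed coordination flow easier to follow, here is a pseudocode-level Java outline of the coordinator-side steps listed above; every identifier is illustrative and nothing here is taken from an actual patch:

{code:java}
// Coordinator side (e.g. inside DFSStripedOutputStream#writeChunk, once per stripe).
void checkAndHandleStreamerFailures() throws IOException, InterruptedException {
  if (!newFailureDetected()) {
    return;                                    // nothing to do for this stripe
  }
  // 1. Let every healthy streamer finish fetching its current block and
  //    reach DATA_STREAMING before the pipeline is touched.
  waitForHealthyStreamersInDataStreaming();
  // 2. Freeze them with an "external error" so they stop writing.
  markHealthyStreamersExternalError();
  while (true) {
    // 3. Bump the generation stamp for the block group via the NameNode.
    long newGS = updateBlockForPipeline();
    // 4. Each healthy streamer rebuilds its datanode connection with the new
    //    GS; wait until all of them report back.
    distributeNewGenerationStamp(newGS);
    if (waitForAllStreamersToRebuildConnections()) {
      // 5. Commit the new pipeline to the NameNode and release everyone.
      updatePipeline(newGS);
      clearExternalErrorAndNotifySuccess();
      return;
    }
    // A streamer failed while reconnecting: keep the external error set,
    // record the additional failure, and retry with another generation stamp.
    recordAdditionalFailures();
  }
}
{code}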
[jira] [Commented] (HDFS-8953) DataNode Metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741380#comment-14741380 ] Kanaka Kumar Avvaru commented on HDFS-8953: --- The test failures don't seem to be related, and the checkstyle issues reported can be ignored, as stated in earlier comments. > DataNode Metrics logging > > > Key: HDFS-8953 > URL: https://issues.apache.org/jira/browse/HDFS-8953 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kanaka Kumar Avvaru >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-8953-01.patch, HDFS-8953-02.patch, > HDFS-8953-03.patch, HDFS-8953-04.patch, HDFS-8953-05.patch > > > HDFS-8880 added metrics logging at NameNode. Similarly, this JIRA is to add > a separate logger for metrics at DN -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741374#comment-14741374 ] Kanaka Kumar Avvaru commented on HDFS-6955: --- Patch v8 addresses the {{TestSpaceReservation}} test failure in the last build and Arpit's comment about adding spaces around operators ({{replicaInfo!=null}}). > DN should reserve disk space for a full block when creating tmp files > - > > Key: HDFS-6955 > URL: https://issues.apache.org/jira/browse/HDFS-6955 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: Arpit Agarwal >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch, > HDFS-6955-03.patch, HDFS-6955-04.patch, HDFS-6955-05.patch, > HDFS-6955-06.patch, HDFS-6955-07.patch, HDFS-6955-08.patch > > > HDFS-6898 is introducing disk space reservation for RBW files to avoid > running out of disk space midway through block creation. > This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kanaka Kumar Avvaru updated HDFS-6955: -- Attachment: HDFS-6955-08.patch > DN should reserve disk space for a full block when creating tmp files > - > > Key: HDFS-6955 > URL: https://issues.apache.org/jira/browse/HDFS-6955 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: Arpit Agarwal >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch, > HDFS-6955-03.patch, HDFS-6955-04.patch, HDFS-6955-05.patch, > HDFS-6955-06.patch, HDFS-6955-07.patch, HDFS-6955-08.patch > > > HDFS-6898 is introducing disk space reservation for RBW files to avoid > running out of disk space midway through block creation. > This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741365#comment-14741365 ] Kanaka Kumar Avvaru commented on HDFS-6955: --- Thanks for the review, [~arpitagarwal]. 1. OK, fine. 2. New tests are added for full success and volume errors; please refer to {{TestSpaceReservation#testTmpSpaceReserve()}}. 3. It is used in tests to verify the space reserved by the most recent call, so we can assert on the actual block length instead of the default block size. 4. Yes, based on the code flow, block finalization also happens for re-replicated blocks. > DN should reserve disk space for a full block when creating tmp files > - > > Key: HDFS-6955 > URL: https://issues.apache.org/jira/browse/HDFS-6955 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: Arpit Agarwal >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch, > HDFS-6955-03.patch, HDFS-6955-04.patch, HDFS-6955-05.patch, > HDFS-6955-06.patch, HDFS-6955-07.patch > > > HDFS-6898 is introducing disk space reservation for RBW files to avoid > running out of disk space midway through block creation. > This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
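For readers following the reservation discussion, a hypothetical sketch of the reserve-on-create / release-on-finalize idea for tmp files, analogous to the RBW reservation from HDFS-6898; the method names below are illustrative, not the actual FsVolumeImpl API:

{code:java}
// Reserve a full block when the tmp file is created, then release whatever
// was not used once the replica is finalized, so available-space accounting
// stays accurate for re-replication and balancing writes.
long bytesToReserve = blockSize;
volume.reserveSpaceForReplica(bytesToReserve);       // illustrative name
try {
  long bytesWritten = writeTmpReplica();             // hypothetical write step
  volume.releaseReservedSpace(bytesToReserve - bytesWritten);  // illustrative name
} catch (IOException e) {
  volume.releaseReservedSpace(bytesToReserve);       // nothing kept on failure
  throw e;
}
{code}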
[jira] [Commented] (HDFS-7986) Allow files / directories to be deleted from the NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741350#comment-14741350 ] Hadoop QA commented on HDFS-7986: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 15s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 0m 18s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755460/HDFS-7986.01.patch | | Optional Tests | | | git revision | trunk / 15a557f | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12401/console | This message was automatically generated. > Allow files / directories to be deleted from the NameNode UI > > > Key: HDFS-7986 > URL: https://issues.apache.org/jira/browse/HDFS-7986 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-7986.01.patch > > > Users should be able to delete files or directories using the Namenode UI. > I'm thinking there ought to be a confirmation dialog. For directories > recursive should be set to true. Initially there should be no option to > skipTrash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741337#comment-14741337 ] Haohui Mai commented on HDFS-9055: -- bq. Adding a third protocol which nothing really supports yet doesn't fix REST. The ability to use curl and wget is a feature, not a bug. 1. GRPC is over HTTP / HTTPS. Both curl and wget talk HTTP / HTTPS, so you can get the same functionality. 2. Experience from WebHDFS development shows that the burden of maintaining REST APIs is quite high. This suggests, to some extent, that the approach of adding REST calls one by one is broken. My point is that if the goal of webhdfs v2 is fixing these problems, exposing these calls automatically is a high priority from a development point of view. bq. I'm hesitant to make the client do this work in the WebHDFS case because it's likely going to be extremely expensive network-wise, especially over high latency networks. Worse, I can easily see someone want to get the speed back by multi-threading the connections and effectively DDoSing the NN. Can you show some numbers on the latencies of data-center networks? If the NN takes the read / write lock to perform the recursive requests, shouldn't it be considered a DDoS as well? > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9038) Reserved space is erroneously counted towards non-DFS used.
[ https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741329#comment-14741329 ] Vinayakumar B commented on HDFS-9038: - Thanks, Brahma, for the options. #1 would be better IMO. > Reserved space is erroneously counted towards non-DFS used. > --- > > Key: HDFS-9038 > URL: https://issues.apache.org/jira/browse/HDFS-9038 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.1 >Reporter: Chris Nauroth >Assignee: Brahma Reddy Battula > > HDFS-5215 changed the DataNode volume available space calculation to consider > the reserved space held by the {{dfs.datanode.du.reserved}} configuration > property. As a side effect, reserved space is now counted towards non-DFS > used. I don't believe it was intentional to change the definition of non-DFS > used. This issue proposes restoring the prior behavior: do not count > reserved space towards non-DFS used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9062) Add a parameter to MiniDFSCluster to turn off security checks on the domain socked path
[ https://issues.apache.org/jira/browse/HDFS-9062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741286#comment-14741286 ] James Clampffer commented on HDFS-9062: --- Hi Chris, Thanks for the suggestions and usage examples. I'd really like to avoid any hard dependencies between libhdfs++ development and the cluster the tests run against. That way the tests don't need to care if I'm starting up a minidfscluster or if I'm pointing them at a rack of machines with a proper HDFS installation. The target workflow I have in mind is: 1) Generate a set of native executables that embed libhdfs++ to be used for testing and profiling. 2) Start up a miniDFS cluster in an arbitrary location with path validation off. 3) Execute the binaries under valgrind and profiling tools and just point them at the minidfs cluster. Once it comes time to do large scale performance testing step 2 can be substituted with a normal HDFS installation. I was thinking of turning off path validation by using an extra command line flag to call DomainSocket#disableBindPathValidation. Admittedly I'm still very new to the Hadoop project's way of running tests so I'm going to look into your suggestions and see if one of those existing methods would apply here rather than add more knobs. > Add a parameter to MiniDFSCluster to turn off security checks on the domain > socked path > --- > > Key: HDFS-9062 > URL: https://issues.apache.org/jira/browse/HDFS-9062 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Blocker > > I'd like to add a command line parameter that allows the permission checks on > dfs.domain.socket.path to be turned off. > Right now a blocker, or at least major inconvenience, for short circuit > reader development is getting the domain socket path set up with the correct > permissions. I'm working on shared test machines where messing with things > in /var/lib is discouraged. > This should also make it easier to write tests for short circuit reads once > completed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9022: Status: Open (was: Patch Available) > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9022: Status: Patch Available (was: Open) > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
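For readers who have not looked at these helpers, the sketch below shows roughly what the address handling amounts to. It is a simplified illustration built on {{NetUtils}} with an assumed default RPC port, not the code actually being moved to {{DFSUtilClient}}.
{code}
import java.net.InetSocketAddress;
import java.net.URI;

import org.apache.hadoop.net.NetUtils;

/** Simplified illustration of the NameNode address helpers (not the real code). */
public final class NameNodeAddressSketch {
  /** Assumed default NameNode RPC port, for this sketch only. */
  private static final int DEFAULT_PORT = 8020;

  private NameNodeAddressSketch() {}

  /** Parse "host" or "host:port" into an InetSocketAddress. */
  public static InetSocketAddress getAddress(String address) {
    return NetUtils.createSocketAddr(address, DEFAULT_PORT);
  }

  /** Extract the NameNode address from the authority of an hdfs:// URI. */
  public static InetSocketAddress getAddress(URI filesystemURI) {
    return getAddress(filesystemURI.getAuthority());
  }

  /** Rebuild an hdfs:// URI from a NameNode address. */
  public static URI getUri(InetSocketAddress namenode) {
    String port = namenode.getPort() == DEFAULT_PORT ? "" : ":" + namenode.getPort();
    return URI.create("hdfs://" + namenode.getHostName() + port);
  }
}
{code}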
[jira] [Updated] (HDFS-9010) Replace NameNode.DEFAULT_PORT with HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT config key
[ https://issues.apache.org/jira/browse/HDFS-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9010: Status: Patch Available (was: Open) > Replace NameNode.DEFAULT_PORT with > HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT config key > > > Key: HDFS-9010 > URL: https://issues.apache.org/jira/browse/HDFS-9010 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9010.000.patch, HDFS-9010.001.patch, > HDFS-9010.002.patch, HDFS-9010.003.patch, HDFS-9010.004.patch > > > The {{NameNode.DEFAULT_PORT}} static attribute is stale as we use > {{HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT}} config value. > This jira tracks the effort of replacing the {{NameNode.DEFAULT_PORT}} with > {{HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT}}. Meanwhile, we mark > the {{NameNode.DEFAULT_PORT}} as _@Deprecated_ before removing it totally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9010) Replace NameNode.DEFAULT_PORT with HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT config key
[ https://issues.apache.org/jira/browse/HDFS-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9010: Status: Open (was: Patch Available) > Replace NameNode.DEFAULT_PORT with > HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT config key > > > Key: HDFS-9010 > URL: https://issues.apache.org/jira/browse/HDFS-9010 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9010.000.patch, HDFS-9010.001.patch, > HDFS-9010.002.patch, HDFS-9010.003.patch, HDFS-9010.004.patch > > > The {{NameNode.DEFAULT_PORT}} static attribute is stale as we use > {{HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT}} config value. > This jira tracks the effort of replacing the {{NameNode.DEFAULT_PORT}} with > {{HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT}}. Meanwhile, we mark > the {{NameNode.DEFAULT_PORT}} as _@Deprecated_ before removing it totally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
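The deprecation step described above presumably takes a shape like the following sketch (illustrative, not the actual patch): callers keep compiling against {{NameNode.DEFAULT_PORT}} while being steered to the client-side constant.
{code}
import org.apache.hadoop.hdfs.client.HdfsClientConfigKeys;

/** Sketch of the deprecation pattern only; not the real NameNode class. */
public class NameNodeDefaultPortSketch {
  /**
   * @deprecated Use HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT instead.
   */
  @Deprecated
  public static final int DEFAULT_PORT =
      HdfsClientConfigKeys.DFS_NAMENODE_RPC_PORT_DEFAULT;
}
{code}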
[jira] [Assigned] (HDFS-7986) Allow files / directories to be deleted from the NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reassigned HDFS-7986: -- Assignee: Ravi Prakash > Allow files / directories to be deleted from the NameNode UI > > > Key: HDFS-7986 > URL: https://issues.apache.org/jira/browse/HDFS-7986 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-7986.01.patch > > > Users should be able to delete files or directories using the Namenode UI. > I'm thinking there ought to be a confirmation dialog. For directories > recursive should be set to true. Initially there should be no option to > skipTrash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7986) Allow files / directories to be deleted from the NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-7986: --- Attachment: HDFS-7986.01.patch Here's a patch which allows deleting of files and directories via the UI > Allow files / directories to be deleted from the NameNode UI > > > Key: HDFS-7986 > URL: https://issues.apache.org/jira/browse/HDFS-7986 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash > Attachments: HDFS-7986.01.patch > > > Users should be able to delete files or directories using the Namenode UI. > I'm thinking there ought to be a confirmation dialog. For directories > recursive should be set to true. Initially there should be no option to > skipTrash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7986) Allow files / directories to be deleted from the NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-7986: --- Status: Patch Available (was: Open) > Allow files / directories to be deleted from the NameNode UI > > > Key: HDFS-7986 > URL: https://issues.apache.org/jira/browse/HDFS-7986 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash > Attachments: HDFS-7986.01.patch > > > Users should be able to delete files or directories using the Namenode UI. > I'm thinking there ought to be a confirmation dialog. For directories > recursive should be set to true. Initially there should be no option to > skipTrash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7779) Improve the HDFS Web UI browser to allow chowning / chgrp and setting replication
[ https://issues.apache.org/jira/browse/HDFS-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741273#comment-14741273 ] Hadoop QA commented on HDFS-7779: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 0m 24s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755454/HDFS-7779.02.patch | | Optional Tests | | | git revision | trunk / 15a557f | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12400/console | This message was automatically generated. > Improve the HDFS Web UI browser to allow chowning / chgrp and setting > replication > - > > Key: HDFS-7779 > URL: https://issues.apache.org/jira/browse/HDFS-7779 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: Chmod.png, Chown.png, HDFS-7779.01.patch, > HDFS-7779.02.patch > > > This JIRA converts the owner, group and replication fields into > contenteditable fields which can be modified by the user from the browser > itself. It too uses the WebHDFS to affect these changes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9064) NN old UI (block_info_xml) not available in 2.7.x
[ https://issues.apache.org/jira/browse/HDFS-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741271#comment-14741271 ] Rushabh S Shah commented on HDFS-9064: -- only in the REST call. > NN old UI (block_info_xml) not available in 2.7.x > - > > Key: HDFS-9064 > URL: https://issues.apache.org/jira/browse/HDFS-9064 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.0 >Reporter: Rushabh S Shah >Priority: Critical > > In 2.6.x hadoop deploys, given a blockId it was very easy to find out the > file name and the locations of replicas (also whether they are corrupt or > not). > This was the REST call: > {noformat} > http://:/block_info_xml.jsp?blockId=xxx > {noformat} > But this was removed by HDFS-6252 in 2.7 builds. > Creating this jira to restore that functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9008) Balancer#Parameters class could use a builder pattern
[ https://issues.apache.org/jira/browse/HDFS-9008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741262#comment-14741262 ] Chris Trezzo commented on HDFS-9008: Test failures are unrelated. The patch should be good to go. > Balancer#Parameters class could use a builder pattern > - > > Key: HDFS-9008 > URL: https://issues.apache.org/jira/browse/HDFS-9008 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Minor > Attachments: HDFS-9008-trunk-v1.patch, HDFS-9008-trunk-v2.patch, > HDFS-9008-trunk-v3.patch, HDFS-9008-trunk-v4.patch, HDFS-9008-trunk-v5.patch > > > The Balancer#Parameters class is violating a few checkstyle rules. > # Instance variables are not privately scoped and do not have accessor > methods. > # The Balancer#Parameter constructor has too many arguments (according to > checkstyle). > Changing this class to use the builder pattern could fix both of these style > issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
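As background on the proposal, a builder along the following lines would address both checkstyle complaints. The sketch is simplified; the real {{Balancer.Parameters}} carries more fields (for example included/excluded node sets), so this is an illustration rather than the patch itself.
{code}
/** Simplified sketch of a parameters class with a builder (not the actual patch). */
public final class BalancerParametersSketch {
  // Privately scoped fields with accessors, addressing the first checkstyle issue.
  private final double threshold;
  private final int maxIdleIterations;

  private BalancerParametersSketch(Builder b) {
    this.threshold = b.threshold;
    this.maxIdleIterations = b.maxIdleIterations;
  }

  public double getThreshold() { return threshold; }
  public int getMaxIdleIterations() { return maxIdleIterations; }

  /** The builder keeps constructor argument lists short, addressing the second issue. */
  public static final class Builder {
    private double threshold = 10.0;
    private int maxIdleIterations = 5;

    public Builder setThreshold(double t) { threshold = t; return this; }
    public Builder setMaxIdleIterations(int n) { maxIdleIterations = n; return this; }
    public BalancerParametersSketch build() { return new BalancerParametersSketch(this); }
  }
}
{code}
Call sites would then read along the lines of {{new BalancerParametersSketch.Builder().setThreshold(5.0).build()}}.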
[jira] [Commented] (HDFS-9064) NN old UI (block_info_xml) not available in 2.7.x
[ https://issues.apache.org/jira/browse/HDFS-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741258#comment-14741258 ] Ravi Prakash commented on HDFS-9064: Thanks for reporting the issue Rushabh! Is this related to HDFS-7588 ? i.e. was there a web page exposing this? Or are you interested only in the REST call? > NN old UI (block_info_xml) not available in 2.7.x > - > > Key: HDFS-9064 > URL: https://issues.apache.org/jira/browse/HDFS-9064 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.0 >Reporter: Rushabh S Shah >Priority: Critical > > In 2.6.x hadoop deploys, given a blockId it was very easy to find out the > file name and the locations of replicas (also whether they are corrupt or > not). > This was the REST call: > {noformat} > http://:/block_info_xml.jsp?blockId=xxx > {noformat} > But this was removed by HDFS-6252 in 2.7 builds. > Creating this jira to restore that functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9064) NN old UI (block_info_xml) not available in 2.7.x
[ https://issues.apache.org/jira/browse/HDFS-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-9064: - Description: In 2.6.x hadoop deploys, given a blockId it was very easy to find out the file name and the locations of replicas (also whether they are corrupt or not). This was the REST call: {noformat} http://:/block_info_xml.jsp?blockId=xxx {noformat} But this was removed by HDFS-6252 in 2.7 builds. Creating this jira to restore that functionality. was: In 2.6.x hadoop deploys, given a blockId it was very easy to find out the file name and the locations of replicas (also whether they are corrupt or not). This was the REST call: {noformat} http://:/block_info_xml.jsp?blockId=xxx {noformat} But this was removed by HDFS-6252. Creating this jira to restore that functionality. > NN old UI (block_info_xml) not available in 2.7.x > - > > Key: HDFS-9064 > URL: https://issues.apache.org/jira/browse/HDFS-9064 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.0 >Reporter: Rushabh S Shah >Priority: Critical > > In 2.6.x hadoop deploys, given a blockId it was very easy to find out the > file name and the locations of replicas (also whether they are corrupt or > not). > This was the REST call: > {noformat} > http://:/block_info_xml.jsp?blockId=xxx > {noformat} > But this was removed by HDFS-6252 in 2.7 builds. > Creating this jira to restore that functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741256#comment-14741256 ] Arpit Agarwal commented on HDFS-6955: - Thanks for updating the patch. A few comments. # Nitpick: Please add spaces around operators: {{replicaInfo!=null}}. # It doesn't look like there is a new test case for temporary reservation. We should have test cases for both full and interrupted transfers. # {{FsVolumeImpl#recentReserved}} looks unused. Let's remove it. # I don't recall the state machine; do temporary blocks get finalized? > DN should reserve disk space for a full block when creating tmp files > - > > Key: HDFS-6955 > URL: https://issues.apache.org/jira/browse/HDFS-6955 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: Arpit Agarwal >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch, > HDFS-6955-03.patch, HDFS-6955-04.patch, HDFS-6955-05.patch, > HDFS-6955-06.patch, HDFS-6955-07.patch > > > HDFS-6898 is introducing disk space reservation for RBW files to avoid > running out of disk space midway through block creation. > This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9064) NN old UI (block_info_xml) not available in 2.7.x
Rushabh S Shah created HDFS-9064: Summary: NN old UI (block_info_xml) not available in 2.7.x Key: HDFS-9064 URL: https://issues.apache.org/jira/browse/HDFS-9064 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.7.0 Reporter: Rushabh S Shah Priority: Critical In 2.6.x hadoop deploys, given a blockId it was very easy to find out the file name and the locations of replicas (also whether they are corrupt or not). This was the REST call: {noformat} http://:/block_info_xml.jsp?blockId=xxx {noformat} But this was removed by HDFS-6252. Creating this jira to restore that functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741242#comment-14741242 ] Hudson commented on HDFS-9027: -- FAILURE: Integrated in Hadoop-trunk-Commit #8435 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8435/]) HDFS-9027. Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method. (Contributed by Mingliang Liu) (arp: rev 15a557fcfec5eceedde9f1597385d5d3b01b2cd7) * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockStoragePolicySuite.java > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9056) add set/remove quota capability to webhdfs
[ https://issues.apache.org/jira/browse/HDFS-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741228#comment-14741228 ] Surendra Singh Lilhore commented on HDFS-9056: -- Duplicate of HDFS-8631. > add set/remove quota capability to webhdfs > -- > > Key: HDFS-9056 > URL: https://issues.apache.org/jira/browse/HDFS-9056 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > It would be nice to be able to set and remove quotas via WebHDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7779) Improve the HDFS Web UI browser to allow chowning / chgrp and setting replication
[ https://issues.apache.org/jira/browse/HDFS-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-7779: --- Attachment: HDFS-7779.02.patch > Improve the HDFS Web UI browser to allow chowning / chgrp and setting > replication > - > > Key: HDFS-7779 > URL: https://issues.apache.org/jira/browse/HDFS-7779 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash > Attachments: Chmod.png, Chown.png, HDFS-7779.01.patch, > HDFS-7779.02.patch > > > This JIRA converts the owner, group and replication fields into > contenteditable fields which can be modified by the user from the browser > itself. It too uses the WebHDFS to affect these changes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7779) Improve the HDFS Web UI browser to allow chowning / chgrp and setting replication
[ https://issues.apache.org/jira/browse/HDFS-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-7779: --- Assignee: Ravi Prakash Status: Patch Available (was: Open) > Improve the HDFS Web UI browser to allow chowning / chgrp and setting > replication > - > > Key: HDFS-7779 > URL: https://issues.apache.org/jira/browse/HDFS-7779 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: Chmod.png, Chown.png, HDFS-7779.01.patch, > HDFS-7779.02.patch > > > This JIRA converts the owner, group and replication fields into > contenteditable fields which can be modified by the user from the browser > itself. It too uses the WebHDFS to affect these changes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9060) expose snapshotdiff via WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741215#comment-14741215 ] Kanaka Kumar Avvaru commented on HDFS-9060: --- I would like to work on this issue. [~aw], feel free to assign it back to yourself if you already have a work plan for this jira. > expose snapshotdiff via WebHDFS > --- > > Key: HDFS-9060 > URL: https://issues.apache.org/jira/browse/HDFS-9060 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Allen Wittenauer >Assignee: Kanaka Kumar Avvaru > > snapshotDiff should be exposed via webhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741210#comment-14741210 ] Allen Wittenauer commented on HDFS-9055: bq. I would like work on this, please feel free to reassign if you already started working on this.. I have not started on it. I wanted to start documenting the holes and problems I'm seeing while working on some WebHDFS client-side stuff. --- bq. Is the idea that v1 and v2 would run concurrently, with the only difference being that legacy clients could go to v1 for the old non-compliant URI handling, and newer clients could go to v2? Yes. We'd effectively be supporting two versions of the protocol. bq. Would v1 and v2 offer the same set of APIs otherwise? I think adding admin-level commands to v1 might be a bad idea considering most v1 implementations will likely need some retooling to support them. --- bq. Can you elaborate where and how WebHDFS v1 is broken? We're hitting HDFS-7822 enough that I consider WebHDFS to be extremely flawed. We're starting to teach users to har stuff before they distcp/put/whatever through corporate networks to work around this issue. bq. I believe a cleaner approach is to expose the RPC in a Web-friendly protocol like GRPC instead of doing every single call by hand. Adding a third protocol which nothing really supports yet doesn't fix REST. The ability to use curl and wget is a feature, not a bug. bq. For the second type of jiras, particularly the find and lsr, they obviously require processing directories recursively. It should not be done at the NN side to avoid blocking other requests. We did that at the client side today through DFSClient, IMO WebHDFS should follow the same approach. I'm hesitant to make the client do this work in the WebHDFS case because it's likely going to be extremely expensive network-wise, especially over high latency networks. Worse, I can easily see someone want to get the speed back by multi-threading the connections and effectively DDoSing the NN. > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9060) expose snapshotdiff via WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kanaka Kumar Avvaru reassigned HDFS-9060: - Assignee: Kanaka Kumar Avvaru > expose snapshotdiff via WebHDFS > --- > > Key: HDFS-9060 > URL: https://issues.apache.org/jira/browse/HDFS-9060 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Allen Wittenauer >Assignee: Kanaka Kumar Avvaru > > snapshotDiff should be exposed via webhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9063) Correctly handle snapshot path for getContentSummary
Jing Zhao created HDFS-9063: --- Summary: Correctly handle snapshot path for getContentSummary Key: HDFS-9063 URL: https://issues.apache.org/jira/browse/HDFS-9063 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao The current getContentSummary implementation does not take the snapshot path into account, so if we have the following ops: 1. create dirs /foo/bar 2. take snapshot s1 on /foo 3. create a 1-byte file /foo/bar/baz then "du /foo" and "du /foo/.snapshot/s1" can report the same results for "bar", which is incorrect since the 1-byte file is not included in snapshot s1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
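A hedged sketch of how the steps above could be reproduced programmatically against a {{MiniDFSCluster}} (an illustration only, not a test from any patch; the paths and the 1-byte file follow the description):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class SnapshotContentSummaryRepro {
  public static void main(String[] args) throws Exception {
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(new Configuration()).numDataNodes(1).build();
    try {
      DistributedFileSystem dfs = cluster.getFileSystem();
      Path foo = new Path("/foo");
      Path bar = new Path("/foo/bar");
      dfs.mkdirs(bar);                                  // 1. create dirs /foo/bar
      dfs.allowSnapshot(foo);
      dfs.createSnapshot(foo, "s1");                    // 2. take snapshot s1 on /foo
      DFSTestUtil.createFile(dfs, new Path(bar, "baz"), // 3. create a 1-byte file
          1L, (short) 1, 0L);
      ContentSummary live = dfs.getContentSummary(bar);
      ContentSummary inSnapshot =
          dfs.getContentSummary(new Path("/foo/.snapshot/s1/bar"));
      // The snapshot copy of "bar" should not count the 1-byte file.
      System.out.println("live length=" + live.getLength()
          + ", snapshot length=" + inSnapshot.getLength());
    } finally {
      cluster.shutdown();
    }
  }
}
{code}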
[jira] [Updated] (HDFS-9027) Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method
[ https://issues.apache.org/jira/browse/HDFS-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-9027: Resolution: Fixed Fix Version/s: 2.8.0 Target Version/s: (was: 2.8.0) Status: Resolved (was: Patch Available) Committed for 2.8.0. Thanks for contributing this improvement [~liuml07]. > Refactor o.a.h.hdfs.DataStreamer#isLazyPersist() method > --- > > Key: HDFS-9027 > URL: https://issues.apache.org/jira/browse/HDFS-9027 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9027.000.patch, HDFS-9027.001.patch > > > In method {{isLazyPersist()}}, the {{org.apache.hadoop.hdfs.DataStreamer}} > class checks whether the HDFS file is lazy persist. It does two things: > 1. Create a class-wide _static_ {{BlockStoragePolicySuite}} object, which > builds an array of {{BlockStoragePolicy}} internally > 2. Get a block storage policy object from the {{blockStoragePolicySuite}} by > policy name {{HdfsConstants.MEMORY_STORAGE_POLICY_NAME}} > This has two side effects: > 1. Takes time to iterate the pre-built block storage policy array in order to > find the _same_ policy every time whose id matters only (as we need to > compare the file status policy id with lazy persist policy id) > 2. {{DataStreamer}} class imports {{BlockStoragePolicySuite}}. The former > should be moved to {{hadoop-hdfs-client}} module, while the latter can stay > in {{hadoop-hdfs}} module. > Actually, we have the block storage policy IDs, which can be used to compare > with HDFS file status' policy id, as following: > {code} > static boolean isLazyPersist(HdfsFileStatus stat) { > return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; > } > {code} > This way, we only need to move the block storage policies' IDs from > {{HdfsServerConstant}} ({{hadoop-hdfs}} module) to {{HdfsConstants}} > ({{hadoop-hdfs-client}} module). > Another reason we should move those block storage policy IDs is that the > block storage policy names were moved to {{HdfsConstants}} already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6955) DN should reserve disk space for a full block when creating tmp files
[ https://issues.apache.org/jira/browse/HDFS-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741159#comment-14741159 ] Hadoop QA commented on HDFS-6955: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 20m 1s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. | | {color:green}+1{color} | javac | 9m 45s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 51s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 27s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 45s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 57s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 44s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 7s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 48s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 27s | Tests failed in hadoop-hdfs. | | | | 213m 4s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks | | | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | | | hadoop.hdfs.web.TestWebHDFSOAuth2 | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation | | Timed out tests | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy | | | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestScrLazyPersistFiles | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755401/HDFS-6955-07.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 486d5cb | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12398/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12398/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12398/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12398/console | This message was automatically generated. 
> DN should reserve disk space for a full block when creating tmp files > - > > Key: HDFS-6955 > URL: https://issues.apache.org/jira/browse/HDFS-6955 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: Arpit Agarwal >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-6955-01.patch, HDFS-6955-02.patch, > HDFS-6955-03.patch, HDFS-6955-04.patch, HDFS-6955-05.patch, > HDFS-6955-06.patch, HDFS-6955-07.patch > > > HDFS-6898 is introducing disk space reservation for RBW files to avoid > running out of disk space midway through block creation. > This Jira is to introduce similar reservation for tmp files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2
[ https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741145#comment-14741145 ] Haohui Mai commented on HDFS-7966: -- I don't understand the performance numbers. What does 4% mean in the data? Do you repeat the experiment multiple times to get 95% confidence intervals? Can you please explain them a little bit more? A chart would definitely help. It looks to me like the test case has only a single connection performing reads on a single block? In production there will be tens of thousands of concurrent reads. Does the thread pool help? What does the current implementation look like in this scenario? > New Data Transfer Protocol via HTTP/2 > - > > Key: HDFS-7966 > URL: https://issues.apache.org/jira/browse/HDFS-7966 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Haohui Mai >Assignee: Qianqian Shi > Labels: gsoc, gsoc2015, mentor > Attachments: GSoC2015_Proposal.pdf, > TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, > TestHttp2ReadBlockInsideEventLoop.svg > > > The current Data Transfer Protocol (DTP) implements a rich set of features > that span across multiple layers, including: > * Connection pooling and authentication (session layer) > * Encryption (presentation layer) > * Data writing pipeline (application layer) > All these features are HDFS-specific and defined by implementation. As a > result it requires non-trivial amount of work to implement HDFS clients and > servers. > This jira explores to delegate the responsibilities of the session and > presentation layers to the HTTP/2 protocol. Particularly, HTTP/2 handles > connection multiplexing, QoS, authentication and encryption, reducing the > scope of DTP to the application layer only. By leveraging the existing HTTP/2 > library, it should simplify the implementation of both HDFS clients and > servers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741133#comment-14741133 ] Haohui Mai commented on HDFS-9055: -- Can you elaborate on where and how WebHDFS v1 is broken? Do you have a design that can be posted? I can see two types of jiras from the related jiras listed above: * WebHDFS does not expose some NN functionalities. (quota, snapshots, etc.) * New functionalities via WebHDFS that might require help on the NN side (e.g., find, recursive list) For the first type of jiras, they are basically NN RPC calls. Currently we add a new WebHDFS call by hand whenever we need WebHDFS to support the corresponding NN RPC call. This is not scalable from a maintenance point of view. I believe a cleaner approach is to expose the RPC in a Web-friendly protocol like GRPC instead of doing every single call by hand. For the second type of jiras, particularly find and lsr, they obviously require processing directories recursively. This should not be done on the NN side, to avoid blocking other requests. We do that on the client side today through DFSClient; IMO WebHDFS should follow the same approach. > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9062) Add a parameter to MiniDFSCluster to turn off security checks on the domain socked path
[ https://issues.apache.org/jira/browse/HDFS-9062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741118#comment-14741118 ] Chris Nauroth commented on HDFS-9062: - Hi [~James Clampffer]. I'm curious if you've tried to get a setup working by controlling the path in configuration with {{dfs.domain.socket.path}} pointed to a more appropriate location for your environment, such as /tmp. On the Java side of things, an example of this is {{TestShortCircuitLocalRead}}, which uses helper class {{TemporarySocketDirectory}} to grab a temp location and then set that location in {{Configuration}} as {{dfs.domain.socket.path}} before booting the mini-cluster. libhdfs takes a similar approach too. An example is in test_libhdfs_zerocopy.c. There is also the static method {{DomainSocket#disableBindPathValidation}}, which is a method intended only for tests to use for skipping the security checks. Do any of these existing mechanisms help solve the current problem? > Add a parameter to MiniDFSCluster to turn off security checks on the domain > socked path > --- > > Key: HDFS-9062 > URL: https://issues.apache.org/jira/browse/HDFS-9062 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Blocker > > I'd like to add a command line parameter that allows the permission checks on > dfs.domain.socket.path to be turned off. > Right now a blocker, or at least major inconvenience, for short circuit > reader development is getting the domain socket path set up with the correct > permissions. I'm working on shared test machines where messing with things > in /var/lib is discouraged. > This should also make it easier to write tests for short circuit reads once > completed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
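For reference, the test-side setup described above looks roughly like the following sketch, modeled loosely on {{TestShortCircuitLocalRead}}; the socket file name and the short-circuit flag shown here are illustrative.
{code}
import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.net.unix.DomainSocket;
import org.apache.hadoop.net.unix.TemporarySocketDirectory;

public class DomainSocketTestSetup {
  /** Boot a MiniDFSCluster with the domain socket under a temp directory. */
  public static MiniDFSCluster startCluster() throws Exception {
    TemporarySocketDirectory sockDir = new TemporarySocketDirectory();
    Configuration conf = new Configuration();
    // Point dfs.domain.socket.path away from /var/lib; _PORT is expanded by the DN.
    conf.set(DFSConfigKeys.DFS_DOMAIN_SOCKET_PATH_KEY,
        new File(sockDir.getDir(), "dn._PORT.sock").getAbsolutePath());
    conf.setBoolean(DFSConfigKeys.DFS_CLIENT_READ_SHORTCIRCUIT_KEY, true);
    // Test-only hook that skips the permission checks on the socket path.
    DomainSocket.disableBindPathValidation();
    return new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
  }
}
{code}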
[jira] [Commented] (HDFS-8953) DataNode Metrics logging
[ https://issues.apache.org/jira/browse/HDFS-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741110#comment-14741110 ] Hadoop QA commented on HDFS-8953: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 22m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 9m 1s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 49s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 28s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 55s | The applied patch generated 2 new checkstyle issues (total was 719, now 716). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 38s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 5m 5s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 26m 58s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 63m 40s | Tests failed in hadoop-hdfs. | | | | 145m 14s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.fs.TestLocalFsFCStatistics | | | hadoop.security.token.delegation.web.TestWebDelegationToken | | | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.server.blockmanagement.TestNodeCount | | Timed out tests | org.apache.hadoop.hdfs.TestDistributedFileSystem | | | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12755412/HDFS-8953-05.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 486d5cb | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12399/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12399/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12399/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12399/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12399/console | This message was automatically generated. > DataNode Metrics logging > > > Key: HDFS-8953 > URL: https://issues.apache.org/jira/browse/HDFS-8953 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kanaka Kumar Avvaru >Assignee: Kanaka Kumar Avvaru > Attachments: HDFS-8953-01.patch, HDFS-8953-02.patch, > HDFS-8953-03.patch, HDFS-8953-04.patch, HDFS-8953-05.patch > > > HDFS-8880 added metrics logging at NameNode. Similarly, this JIRA is to add > a separate logger for metrics at DN -- This message was sent by Atlassian JIRA (v6.3.4#6332)
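For context on what a dedicated DataNode metrics logger involves, below is a rough, hedged sketch of the general idea (periodically dumping the DataNode's JMX metrics to a separate logger). The class, logger name, and scheduling details are illustrative assumptions, not taken from the attached patches.
{code}
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class DataNodeMetricsLoggerSketch {
  // Hypothetical logger name; it would get its own appender in log4j.properties.
  private static final Log METRICS_LOG = LogFactory.getLog("DataNodeMetricsLog");

  /** Periodically dump all DataNode MBean attributes to the metrics logger. */
  public static void start(long periodSeconds) {
    final MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        Set<ObjectName> beans =
            server.queryNames(new ObjectName("Hadoop:service=DataNode,*"), null);
        for (ObjectName bean : beans) {
          for (MBeanAttributeInfo attr : server.getMBeanInfo(bean).getAttributes()) {
            METRICS_LOG.info(bean.getCanonicalName() + ":" + attr.getName() + "="
                + server.getAttribute(bean, attr.getName()));
          }
        }
      } catch (Exception e) {
        METRICS_LOG.warn("Failed to log DataNode metrics", e);
      }
    }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
  }
}
{code}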
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741088#comment-14741088 ] Chris Nauroth commented on HDFS-9055: - Hi [~aw]. AFAIK, the only proposed backwards-incompatible change that would warrant bumping the REST version is the URI handling: HDFS-7822. Is the idea that v1 and v2 would run concurrently, with the only difference being that legacy clients could go to v1 for the old non-compliant URI handling, and newer clients could go to v2? Would v1 and v2 offer the same set of APIs otherwise? Are there other backwards-incompatible proposals that I've missed? > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5795) RemoteBlockReader2#checkSuccess() shoud print error status
[ https://issues.apache.org/jira/browse/HDFS-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741076#comment-14741076 ] Xiao Chen commented on HDFS-5795: - Hi Brandon, Thank you for reporting this issue. I'll start working on it soon. > RemoteBlockReader2#checkSuccess() shoud print error status > --- > > Key: HDFS-5795 > URL: https://issues.apache.org/jira/browse/HDFS-5795 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Brandon Li >Assignee: Xiao Chen >Priority: Trivial > > RemoteBlockReader2#checkSuccess() doesn't print error status, which makes > debugging harder when the client can't read from a DataNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741038#comment-14741038 ] Brahma Reddy Battula commented on HDFS-9055: I would like to work on this; please feel free to reassign if you have already started working on it. HDFS-8629 was also raised for the same purpose, but only storage policy and quota were covered there. I think we can sync up with that jira as well. HDFS-8631 covers quota, so can we close HDFS-9056? > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9059) Expose lssnapshottabledir via WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned HDFS-9059: -- Assignee: Brahma Reddy Battula > Expose lssnapshottabledir via WebHDFS > - > > Key: HDFS-9059 > URL: https://issues.apache.org/jira/browse/HDFS-9059 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > > lssnapshottabledir should be exposed via WebHDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9061) hdfs groups should be exposed via WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned HDFS-9061: -- Assignee: Brahma Reddy Battula > hdfs groups should be exposed via WebHDFS > - > > Key: HDFS-9061 > URL: https://issues.apache.org/jira/browse/HDFS-9061 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > > It would be extremely useful from a REST perspective to expose which groups > the NN says the user belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9062) Add a parameter to MiniDFSCluster to turn off security checks on the domain socked path
James Clampffer created HDFS-9062: - Summary: Add a parameter to MiniDFSCluster to turn off security checks on the domain socked path Key: HDFS-9062 URL: https://issues.apache.org/jira/browse/HDFS-9062 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: James Clampffer Assignee: James Clampffer Priority: Blocker I'd like to add a command line parameter that allows the permission checks on dfs.domain.socket.path to be turned off. Right now a blocker, or at least major inconvenience, for short circuit reader development is getting the domain socket path set up with the correct permissions. I'm working on shared test machines where messing with things in /var/lib is discouraged. This should also make it easier to write tests for short circuit reads once completed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9057) allow/disallow snapshots via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned HDFS-9057: -- Assignee: Brahma Reddy Battula > allow/disallow snapshots via webhdfs > > > Key: HDFS-9057 > URL: https://issues.apache.org/jira/browse/HDFS-9057 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > > We should be able to allow and disallow directories for snapshotting via > WebHDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9058) enable find via WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-9058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned HDFS-9058: -- Assignee: Brahma Reddy Battula > enable find via WebHDFS > --- > > Key: HDFS-9058 > URL: https://issues.apache.org/jira/browse/HDFS-9058 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > > It'd be useful to implement find over webhdfs rather than forcing the client > to grab a lot of data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9055) WebHDFS REST v2
[ https://issues.apache.org/jira/browse/HDFS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-9055: --- Description: There's starting to be enough changes to fix and add missing functionality to webhdfs that we should probably update to REST v2. This also gives us an opportunity to deal with some incompatible issues. (was: There's starting to be enough changes to fix webhdfs that we should probably update to REST v2.) > WebHDFS REST v2 > --- > > Key: HDFS-9055 > URL: https://issues.apache.org/jira/browse/HDFS-9055 > Project: Hadoop HDFS > Issue Type: New Feature > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer > > There's starting to be enough changes to fix and add missing functionality to > webhdfs that we should probably update to REST v2. This also gives us an > opportunity to deal with some incompatible issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)