[jira] [Comment Edited] (HDFS-8161) Both Namenodes are in standby State
[ https://issues.apache.org/jira/browse/HDFS-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243980#comment-15243980 ] Harsh J edited comment on HDFS-8161 at 4/16/16 3:26 AM: [~brahmareddy] - was this encountered on virtual machine hosts, or physical ones? Asking because https://tech.vijayp.ca/linux-kernel-bug-delivers-corrupt-tcp-ip-data-to-mesos-kubernetes-docker-containers-4986f88f7a19#.v3hx212ne (H/T [~daisuke.kobayashi]) was (Author: qwertymaniac): [~brahmareddy] - was this encountered on virtual machine hosts, or physical ones? Asking because https://tech.vijayp.ca/linux-kernel-bug-delivers-corrupt-tcp-ip-data-to-mesos-kubernetes-docker-containers-4986f88f7a19#.v3hx212ne > Both Namenodes are in standby State > --- > > Key: HDFS-8161 > URL: https://issues.apache.org/jira/browse/HDFS-8161 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Affects Versions: 2.6.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: ACTIVEBreadcumb and StandbyElector.txt > > > Suspected scenario: > > Start a cluster with three nodes. > Reboot the machine where ZKFC is not running. (Here the active node's ZKFC should open a > session with this ZK.) > Now the active NN's ZKFC session expires and it tries to re-establish the connection with > another ZK. By that time, the standby NN's ZKFC will try to fence the old active, > create the active breadcrumb, and make the SNN active. > But immediately it is fenced back to standby state. (Here is the doubt.) > Hence both will be in standby state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8161) Both Namenodes are in standby State
[ https://issues.apache.org/jira/browse/HDFS-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243980#comment-15243980 ] Harsh J commented on HDFS-8161: --- [~brahmareddy] - was this encountered on virtual machine hosts, or physical ones? Asking because https://tech.vijayp.ca/linux-kernel-bug-delivers-corrupt-tcp-ip-data-to-mesos-kubernetes-docker-containers-4986f88f7a19#.v3hx212ne > Both Namenodes are in standby State > --- > > Key: HDFS-8161 > URL: https://issues.apache.org/jira/browse/HDFS-8161 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Affects Versions: 2.6.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: ACTIVEBreadcumb and StandbyElector.txt > > > Suspected scenario: > > Start a cluster with three nodes. > Reboot the machine where ZKFC is not running. (Here the active node's ZKFC should open a > session with this ZK.) > Now the active NN's ZKFC session expires and it tries to re-establish the connection with > another ZK. By that time, the standby NN's ZKFC will try to fence the old active, > create the active breadcrumb, and make the SNN active. > But immediately it is fenced back to standby state. (Here is the doubt.) > Hence both will be in standby state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243977#comment-15243977 ] Brahma Reddy Battula commented on HDFS-10284: - [~liuml07] thanks for reporting this. Yes, we should separate out...will look into this > o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode > fails intermittently > - > > Key: HDFS-10284 > URL: https://issues.apache.org/jira/browse/HDFS-10284 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.9.0 >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Attachments: HDFS-10284.000.patch, HDFS-10284.001.patch > > > *Stacktrace* > {code} > org.mockito.exceptions.misusing.UnfinishedStubbingException: > Unfinished stubbing detected here: > -> at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169) > E.g. thenReturn() may be missing. > Examples of correct stubbing: > when(mock.isOk()).thenReturn(true); > when(mock.isOk()).thenThrow(exception); > doThrow(exception).when(mock).someVoidMethod(); > Hints: > 1. missing thenReturn() > 2. although stubbed methods may return mocks, you cannot inline mock > creation (mock()) call inside a thenReturn method (see issue 53) > at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169) > {code} > Sample failing pre-commit UT: > https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243970#comment-15243970 ] Mingliang Liu commented on HDFS-10175: -- Thanks [~cmccabe] very much for the discussion. # The v5 patch makes the detailed statistics optional. As the statistics object is shared among file system instances, it is hard to enable/disable it with a per-file-system config key. The patch adds a new API to {{Statistics}} to enable/disable this feature, trading reduced cost against detailed per-op counters. The extra overhead is avoided if the enum map is not constructed. # I filed [HADOOP-13031] to track the discussion and effort of refactoring the code that maintains rack-aware counters. Specifically, I also think it is not good to expose the internal composite data structure of distance-aware bytes read. Use cases that iterate over all distances will call {{getBytesReadByDistance(int distance)}} multiple times, and each call internally triggers aggregation across all threads' statistics data. To avoid this, they can use {{getData()}} to fetch all the statistics data at once. I reviewed the current patch of [MAPREDUCE-6660], which employs bytes-read-by-distance, and found that it uses {{getData()}} as expected. # Given the current FileSystem design, we see no better place than FileSystem$Statistics for either the distance-aware read counters (HDFS-specific) or the per-operation counters (many of which are also HDFS-specific). For now, when detailed statistics are missing for an operation (e.g. {{S3AFileSystem#append()}}), we treat the count as zero. If some operations' statistics differ, the concrete file systems can update the statistics accordingly (e.g. {{S3AFileSystem#rename}}), since the counters are populated in the concrete file system operations. 
Another point: the existing {{readOps/writeOps}} counters face a similar scenario (and similar challenges); we will file follow-up JIRAs when we have specific cases to handle. # I created a new JIRA, [HADOOP-13032], to track moving the {{Statistics}} class out of {{FileSystem}} for shorter source code and a simpler class structure, though that is an incompatible change. > add per-operation stats to FileSystem.Statistics > > > Key: HDFS-10175 > URL: https://issues.apache.org/jira/browse/HDFS-10175 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Ram Venkatesh >Assignee: Mingliang Liu > Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, > HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, > HDFS-10175.005.patch, TestStatisticsOverhead.java > > > Currently FileSystem.Statistics exposes the following statistics: > BytesRead > BytesWritten > ReadOps > LargeReadOps > WriteOps > These are in-turn exposed as job counters by MapReduce and other frameworks. > There is logic within DfsClient to map operations to these counters that can > be confusing, for instance, mkdirs counts as a writeOp. > Proposed enhancement: > Add a statistic for each DfsClient operation including create, append, > createSymlink, delete, exists, mkdirs, rename and expose them as new > properties on the Statistics object. The operation-specific counters can be > used for analyzing the load imposed by a particular job on HDFS. > For example, we can use them to identify jobs that end up creating a large > number of files. > Once this information is available in the Statistics object, the app > frameworks like MapReduce can expose them as additional counters to be > aggregated and recorded as part of job summary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
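The distinction drawn above, between repeated {{getBytesReadByDistance(int)}} calls and a single {{getData()}} snapshot, can be sketched in isolation. The following is a minimal standalone model, not the Hadoop {{FileSystem.Statistics}} implementation; all class, field, and method names here are illustrative assumptions, and only the two method names come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model: per-thread counters, and two read paths. Each
// getBytesReadByDistance(d) call re-aggregates every thread's data,
// while getData() aggregates once and serves all distances.
public class StatisticsSketch {
    static class ThreadData {
        final long[] bytesReadByDistance = new long[5];
    }

    private final List<ThreadData> allThreads = new ArrayList<>();
    int aggregationPasses = 0; // instrumentation for the comparison below

    ThreadData register() {
        ThreadData t = new ThreadData();
        allThreads.add(t);
        return t;
    }

    // Aggregates across ALL threads on every call.
    long getBytesReadByDistance(int distance) {
        aggregationPasses++;
        long sum = 0;
        for (ThreadData t : allThreads) {
            sum += t.bytesReadByDistance[distance];
        }
        return sum;
    }

    // One aggregation pass; returns a snapshot covering all distances.
    long[] getData() {
        aggregationPasses++;
        long[] snapshot = new long[5];
        for (ThreadData t : allThreads) {
            for (int d = 0; d < snapshot.length; d++) {
                snapshot[d] += t.bytesReadByDistance[d];
            }
        }
        return snapshot;
    }

    public static void main(String[] args) {
        StatisticsSketch stats = new StatisticsSketch();
        ThreadData t1 = stats.register();
        ThreadData t2 = stats.register();
        t1.bytesReadByDistance[0] = 100; // local reads in one thread
        t2.bytesReadByDistance[2] = 300; // rack-distance reads in another

        // Per-distance calls: one aggregation pass per distance.
        long total = 0;
        for (int d = 0; d < 5; d++) {
            total += stats.getBytesReadByDistance(d);
        }
        int perDistancePasses = stats.aggregationPasses;

        // Snapshot: a single pass serves every distance.
        stats.aggregationPasses = 0;
        long[] snapshot = stats.getData();
        long snapshotTotal = 0;
        for (long b : snapshot) snapshotTotal += b;

        System.out.println(perDistancePasses);       // 5
        System.out.println(stats.aggregationPasses); // 1
        System.out.println(total == snapshotTotal);  // true
    }
}
```

Under this model, a MAPREDUCE-6660-style consumer that needs all distances pays one aggregation pass with {{getData()}} instead of one per distance.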
[jira] [Updated] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10175: - Attachment: HDFS-10175.005.patch > add per-operation stats to FileSystem.Statistics > > > Key: HDFS-10175 > URL: https://issues.apache.org/jira/browse/HDFS-10175 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Ram Venkatesh >Assignee: Mingliang Liu > Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, > HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, > HDFS-10175.005.patch, TestStatisticsOverhead.java > > > Currently FileSystem.Statistics exposes the following statistics: > BytesRead > BytesWritten > ReadOps > LargeReadOps > WriteOps > These are in-turn exposed as job counters by MapReduce and other frameworks. > There is logic within DfsClient to map operations to these counters that can > be confusing, for instance, mkdirs counts as a writeOp. > Proposed enhancement: > Add a statistic for each DfsClient operation including create, append, > createSymlink, delete, exists, mkdirs, rename and expose them as new > properties on the Statistics object. The operation-specific counters can be > used for analyzing the load imposed by a particular job on HDFS. > For example, we can use them to identify jobs that end up creating a large > number of files. > Once this information is available in the Statistics object, the app > frameworks like MapReduce can expose them as additional counters to be > aggregated and recorded as part of job summary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots
[ https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243927#comment-15243927 ] Hadoop QA commented on HDFS-8986: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 57s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 52s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 3m 15s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 49s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 45s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 45s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 6s {color} | {color:red} root: patch generated 9 new + 177 unchanged - 0 fixed = 186 total (was 177) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 53s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 4m 55s {color} | {color:red} hadoop-common-project_hadoop-common-jdk1.8.0_77 with JDK v1.8.0_77 generated 12 new + 1 unchanged - 0 fixed = 13 total (was 1) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 8m 39s {color} | {color:red} hadoop-common-project_hadoop-common-jdk1.7.0_95 with JDK v1.7.0_95 generated 12 new + 13 unchanged - 0 fixed = 25 total (was 13) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 23s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 24s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} | | {color:green}+1{color} | {color:green} unit {color} |
[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart
[ https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243887#comment-15243887 ] Hadoop QA commented on HDFS-10207: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 45s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 37s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 2m 47s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 41s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 35s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 35s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 8s {color} | {color:red} root: patch generated 7 new + 417 unchanged - 0 fixed = 424 total (was 417) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 48s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 46s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 41s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 55s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 28s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 143m 33s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK
[jira] [Commented] (HDFS-9543) DiskBalancer : Add Data mover
[ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243878#comment-15243878 ] Lei (Eddy) Xu commented on HDFS-9543: - Hi [~anu], thanks for the patches. A few comments: Could you put the {{getNextBlock()}} logic into a separate Iterator and make it {{Closeable}}, covering {{getBlockToCopy(), openPoolIters(), getNextBlock(), closePoolIters()}}? There are a few drawbacks to separating them into different functions: 1) The state (i.e. {{poolIndex}}) is stored outside these functions, so the caller needs to maintain it. 2) {{poolIndex}} is never initialized and cannot be reset. {code} } catch (IOException e) { item.incErrorCount(); } {code} Please always log the IOEs. I also think it is better to throw the {{IOE}} here, as well as in many other places. {code} private void openPoolIters(); {code} Can it be a {{private static List openPoolIters()}}? {code} // Check for the max error count constraint. if (item.getErrorCount() > getMaxError(item)) { LOG.error("Exceeded the max error count. source {}, dest: {} " + "error count: {}", source.getBasePath(), dest.getBasePath(), item.getErrorCount()); this.setExitFlag(); continue; } {code} In a few such places, should we actually {{break}} out of the while loop? Wouldn't {{continue}} here just generate a lot of logs and burn CPU cycles? Why do you need to change {{float}} to {{double}}? In this case, wouldn't {{float}} be good enough? I think 5% errors are OK for these tasks. Thanks very much! > DiskBalancer : Add Data mover > -- > > Key: HDFS-9543 > URL: https://issues.apache.org/jira/browse/HDFS-9543 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9543-HDFS-1312.001.patch > > > This patch adds the actual mover logic to the datanode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
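The refactoring suggested in the review, folding the pool-iteration state into one {{Closeable}} iterator so {{poolIndex}} is owned, initialized, and resettable inside it, could look roughly like this. This is a minimal sketch with illustrative names (and plain {{String}} block IDs), not the actual DiskBalancer code:

```java
import java.io.Closeable;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch: one object encapsulating what getBlockToCopy/openPoolIters/
// getNextBlock/closePoolIters spread across the caller. The round-robin
// poolIndex is private state, initialized here and reset per instance.
public class BlockIterator implements Iterator<String>, Closeable {
    private final List<Iterator<String>> poolIters;
    private int poolIndex = 0; // owned by the iterator, not the caller

    BlockIterator(List<Iterator<String>> poolIters) {
        this.poolIters = poolIters;
    }

    @Override
    public boolean hasNext() {
        for (Iterator<String> it : poolIters) {
            if (it.hasNext()) return true;
        }
        return false;
    }

    @Override
    public String next() {
        // Round-robin over the pools, skipping exhausted ones.
        for (int i = 0; i < poolIters.size(); i++) {
            Iterator<String> it = poolIters.get(poolIndex);
            poolIndex = (poolIndex + 1) % poolIters.size();
            if (it.hasNext()) return it.next();
        }
        throw new NoSuchElementException("all pools exhausted");
    }

    @Override
    public void close() {
        poolIters.clear(); // release per-pool resources in one place
    }
}
```

A caller would then use try-with-resources (`try (BlockIterator it = ...)`) and simply loop on {{hasNext()}}/{{next()}}, which also gives a natural place to {{break}} once the error-count limit is exceeded instead of spinning with {{continue}}.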
[jira] [Commented] (HDFS-10289) Balancer configures DNs directly
[ https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243816#comment-15243816 ] Ravi Prakash commented on HDFS-10289: - Fair enough! Thanks for the responses Kihwal and Ming! I do recognize that this particular JIRA is a little bit orthogonal, so I welcome all the improvements intended. > Balancer configures DNs directly > > > Key: HDFS-10289 > URL: https://issues.apache.org/jira/browse/HDFS-10289 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Critical > > Balancer directly configures the 2 balance-related properties > (bandwidthPerSec and concurrentMoves) on the DNs involved. > Details: > * Before each balancing iteration, set the properties on all DNs involved in > the current iteration. > * The DN property changes will not survive restart. > * Balancer gets the property values from command line or its config file. > * Need new DN APIs to query and set the 2 properties. > * No need to edit the config file on each DN or run {{hdfs dfsadmin > -setBalancerBandwidth}} to configure every DN in the cluster. > Pros: > * Improve ease of use because all configurations are done at one place, the > balancer. We saw many customers often forgot to set concurrentMoves properly > since it is required on both DN and Balancer. > * Support new DNs added between iterations > * Handle DN restarts between iterations > * May be able to dynamically adjust the thresholds in different iterations. > Don't know how useful though. > Cons: > * New DN property API > * A malicious/misconfigured balancer may overwhelm DNs. {{hdfs dfsadmin > -setBalancerBandwidth}} has the same issue. Also Balancer can only be run by > admin. > Questions: > * Can we create {{BalancerConcurrentMovesCommand}} similar to > {{BalancerBandwidthCommand}}? Can Balancer use them directly without going > through NN? 
> One proposal to implement HDFS-7466 calls for an API to query DN properties. > DN Conf Servlet returns all config properties. It does not return individual > property and it does not return the value set by {{hdfs dfsadmin > -setBalancerBandwidth}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
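On the open question of a {{BalancerConcurrentMovesCommand}} modeled on the existing {{BalancerBandwidthCommand}} (a command object carrying a single value for the DN to apply), a minimal sketch of what such a command might carry is below. This is purely hypothetical: the class name comes from the question above, but the shape and validation are assumptions, not an existing Hadoop API.

```java
// Hypothetical sketch only: a DN-bound command carrying the concurrentMoves
// setting, by analogy with BalancerBandwidthCommand. Not an actual Hadoop
// class; field and method names are illustrative.
public class BalancerConcurrentMovesCommand {
    private final int concurrentMoves; // max concurrent block moves per DN

    public BalancerConcurrentMovesCommand(int concurrentMoves) {
        if (concurrentMoves <= 0) {
            throw new IllegalArgumentException("concurrentMoves must be > 0");
        }
        this.concurrentMoves = concurrentMoves;
    }

    public int getConcurrentMoves() {
        return concurrentMoves;
    }
}
```

As with bandwidth, such a command would be a transient setting: applied on receipt, not persisted, so it would not survive a DN restart, matching the "Details" list above.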
[jira] [Commented] (HDFS-10299) libhdfs++: File length doesn't always going the last block if it's being written to
[ https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243771#comment-15243771 ] Hadoop QA commented on HDFS-10299: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 33s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 44s {color} | {color:green} HDFS-8707 passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 50s {color} | {color:green} HDFS-8707 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 47s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 47s {color} | {color:green} the patch passed {color} | | 
{color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 1s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 38s {color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.8.0_77. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 36s {color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 42s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799019/HDFS-10299.HDFS-8707.000.patch | | JIRA Issue | HDFS-10299 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux b8c951f9f489 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / 0828600 | | Default Java | 1.7.0_95 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 | | JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15174/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15174/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > libhdfs++: File length doesn't always going the last block if it's being > written to > --- > > Key: HDFS-10299 > URL:
[jira] [Commented] (HDFS-10289) Balancer configures DNs directly
[ https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243740#comment-15243740 ] Ming Ma commented on HDFS-10289: There was some discussion about moving balancer into namenode in https://issues.apache.org/jira/browse/HDFS-1431. Maybe we can address the issue [~kihwal] brought up, say to have some lightweight balancer CLI send NN balancer commands via RPC. Even with that there will be complexity to move balancer into namenode. Thus we would like to try out the "block movement scheduling inside NN" idea as part of the migrator HDFS-8789 [~ctrezzo] is working on. > Balancer configures DNs directly > > > Key: HDFS-10289 > URL: https://issues.apache.org/jira/browse/HDFS-10289 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Critical > > Balancer directly configures the 2 balance-related properties > (bandwidthPerSec and concurrentMoves) on the DNs involved. > Details: > * Before each balancing iteration, set the properties on all DNs involved in > the current iteration. > * The DN property changes will not survive restart. > * Balancer gets the property values from command line or its config file. > * Need new DN APIs to query and set the 2 properties. > * No need to edit the config file on each DN or run {{hdfs dfsadmin > -setBalancerBandwidth}} to configure every DN in the cluster. > Pros: > * Improve ease of use because all configurations are done at one place, the > balancer. We saw many customers often forgot to set concurrentMoves properly > since it is required on both DN and Balancer. > * Support new DNs added between iterations > * Handle DN restarts between iterations > * May be able to dynamically adjust the thresholds in different iterations. > Don't know how useful though. > Cons: > * New DN property API > * A malicious/misconfigured balancer may overwhelm DNs. 
{{hdfs dfsadmin > -setBalancerBandwidth}} has the same issue. Also Balancer can only be run by > admin. > Questions: > * Can we create {{BalancerConcurrentMovesCommand}} similar to > {{BalancerBandwidthCommand}}? Can Balancer use them directly without going > through NN? > One proposal to implement HDFS-7466 calls for an API to query DN properties. > DN Conf Servlet returns all config properties. It does not return individual > property and it does not return the value set by {{hdfs dfsadmin > -setBalancerBandwidth}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10208) Addendum for HDFS-9579: to handle the case when client machine can't resolve network path
[ https://issues.apache.org/jira/browse/HDFS-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243738#comment-15243738 ] Ming Ma commented on HDFS-10208: The TestReadStripedFileWithDecoding and TestWriteReadStripedFile failures aren't related to this patch. All other tests passed locally. > Addendum for HDFS-9579: to handle the case when client machine can't resolve > network path > - > > Key: HDFS-10208 > URL: https://issues.apache.org/jira/browse/HDFS-10208 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-10208-2.patch, HDFS-10208-3.patch, > HDFS-10208-4.patch, HDFS-10208-5.patch, HDFS-10208.patch > > > If DFSClient runs on a machine that can't resolve its network path, > {{DNSToSwitchMapping}} will return {{DEFAULT_RACK}}. In addition, if somehow > {{dnsToSwitchMapping.resolve}} returns null, that will cause an exception when > it tries to create {{clientNode}}. In either case, there is no need to create > {{clientNode}} and we should treat its network distance to any datanode as > Integer.MAX_VALUE. > {noformat} > clientNode = new NodeBase(clientHostName, > dnsToSwitchMapping.resolve(nodes).get(0)); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
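A minimal sketch of the guard the description calls for — the class and method names below are stand-ins, not the actual HDFS-10208 patch:

```java
import java.util.List;

// Sketch of the guard described above (illustrative, not the actual
// HDFS-10208 patch): when topology resolution returns null, an empty list,
// or only the default rack, no client node is created and the network
// distance to any datanode is reported as Integer.MAX_VALUE.
public class ClientNodeResolution {
    static final String DEFAULT_RACK = "/default-rack";

    /** Returns the resolved network location, or null when it is unusable. */
    static String usableLocation(List<String> resolved) {
        if (resolved == null || resolved.isEmpty()) {
            return null;      // dnsToSwitchMapping.resolve() failed outright
        }
        String loc = resolved.get(0);
        // Client machine could not resolve its own network path.
        return (loc == null || DEFAULT_RACK.equals(loc)) ? null : loc;
    }

    /** Distance used for block sorting when no client node exists. */
    static int distanceWithoutClientNode() {
        return Integer.MAX_VALUE;
    }
}
```

With this shape, the {{new NodeBase(...)}} call in the {noformat} block above only runs when {{usableLocation}} returns non-null, so neither the null result nor the DEFAULT_RACK result can trigger the exception.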
[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart
[ https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243716#comment-15243716 ] Xiaobing Zhou commented on HDFS-10207: -- [~xyao] thanks for review. I posted patch v004 that addressed your comments. > Support enable Hadoop IPC backoff without namenode restart > -- > > Key: HDFS-10207 > URL: https://issues.apache.org/jira/browse/HDFS-10207 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Xiaobing Zhou > Attachments: HDFS-10207-HDFS-9000.000.patch, > HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, > HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch > > > It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a > namenode restart to protect namenode from being overloaded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart
[ https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10207: - Attachment: HDFS-10207-HDFS-9000.004.patch > Support enable Hadoop IPC backoff without namenode restart > -- > > Key: HDFS-10207 > URL: https://issues.apache.org/jira/browse/HDFS-10207 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Xiaobing Zhou > Attachments: HDFS-10207-HDFS-9000.000.patch, > HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, > HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch > > > It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a > namenode restart to protect namenode from being overloaded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
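The runtime toggle the issue asks for amounts to flipping a flag the RPC server reads on every call. A sketch, assuming a dfsadmin-style reconfiguration entry point — the property name comes from the issue, but the plumbing below is illustrative only, not the attached patch:

```java
// Hypothetical sketch of enabling IPC backoff without a namenode restart.
// The ipc.#port#.backoff.enable property name is from the issue; the
// reconfiguration hook shown here is illustrative, not the real patch.
public class BackoffToggle {
    // volatile so the RPC handler threads see the change immediately
    private volatile boolean backoffEnabled;

    /** Invoked by a dfsadmin-style reconfiguration request, not a restart. */
    public void reconfigure(String property, String newValue) {
        if (property.startsWith("ipc.") && property.endsWith(".backoff.enable")) {
            backoffEnabled = Boolean.parseBoolean(newValue);
        } else {
            throw new IllegalArgumentException("not reconfigurable: " + property);
        }
    }

    public boolean isBackoffEnabled() { return backoffEnabled; }
}
```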
[jira] [Updated] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots
[ https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-8986: Target Version/s: 2.8.0 > Add option to -du to calculate directory space usage excluding snapshots > > > Key: HDFS-8986 > URL: https://issues.apache.org/jira/browse/HDFS-8986 > Project: Hadoop HDFS > Issue Type: Improvement > Components: snapshots >Reporter: Gautam Gopalakrishnan >Assignee: Xiao Chen > Attachments: HDFS-8986.01.patch > > > When running {{hadoop fs -du}} on a snapshotted directory (or one of its > children), the report includes space consumed by blocks that are only present > in the snapshots. This is confusing for end users. > {noformat} > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 799.7 M 2.3 G /tmp/parent > 799.7 M 2.3 G /tmp/parent/sub1 > $ hdfs dfs -createSnapshot /tmp/parent snap1 > Created snapshot /tmp/parent/.snapshot/snap1 > $ hadoop fs -rm -skipTrash /tmp/parent/sub1/* > ... > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 799.7 M 2.3 G /tmp/parent > 799.7 M 2.3 G /tmp/parent/sub1 > $ hdfs dfs -deleteSnapshot /tmp/parent snap1 > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 0 0 /tmp/parent > 0 0 /tmp/parent/sub1 > {noformat} > It would be helpful if we had a flag, say -X, to exclude any snapshot related > disk usage in the output -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots
[ https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-8986: Status: Patch Available (was: Open) > Add option to -du to calculate directory space usage excluding snapshots > > > Key: HDFS-8986 > URL: https://issues.apache.org/jira/browse/HDFS-8986 > Project: Hadoop HDFS > Issue Type: Improvement > Components: snapshots >Reporter: Gautam Gopalakrishnan >Assignee: Xiao Chen > Attachments: HDFS-8986.01.patch > > > When running {{hadoop fs -du}} on a snapshotted directory (or one of its > children), the report includes space consumed by blocks that are only present > in the snapshots. This is confusing for end users. > {noformat} > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 799.7 M 2.3 G /tmp/parent > 799.7 M 2.3 G /tmp/parent/sub1 > $ hdfs dfs -createSnapshot /tmp/parent snap1 > Created snapshot /tmp/parent/.snapshot/snap1 > $ hadoop fs -rm -skipTrash /tmp/parent/sub1/* > ... > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 799.7 M 2.3 G /tmp/parent > 799.7 M 2.3 G /tmp/parent/sub1 > $ hdfs dfs -deleteSnapshot /tmp/parent snap1 > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 0 0 /tmp/parent > 0 0 /tmp/parent/sub1 > {noformat} > It would be helpful if we had a flag, say -X, to exclude any snapshot related > disk usage in the output -- This message was sent by Atlassian JIRA (v6.3.4#6332)
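The accounting behind the requested flag is simple once the snapshot-only portion of the usage is known; a sketch under that assumption (both inputs are illustrative, and computing the snapshot-only portion is the real work of the patch):

```java
// Sketch of the accounting the proposed -X flag implies: report only space
// reachable from the live namespace, excluding blocks kept alive solely by
// snapshots. liveUsage is an illustrative name, not a Hadoop API.
public class DuExcludingSnapshots {
    static long liveUsage(long totalConsumed, long snapshotOnlyConsumed) {
        // totalConsumed is what -du reports today; in the example above it
        // stays at ~2.3 GB after the delete because snap1 still references
        // the blocks, while snapshotOnlyConsumed would then equal all of it.
        return Math.max(0, totalConsumed - snapshotOnlyConsumed);
    }
}
```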
[jira] [Commented] (HDFS-10297) Increase default balance bandwidth and concurrent moves
[ https://issues.apache.org/jira/browse/HDFS-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243691#comment-15243691 ] Hadoop QA commented on HDFS-10297: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc 
{color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 393 unchanged - 2 fixed = 395 total (was 395) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 42s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 18s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 135m 38s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.server.namenode.TestEditLog | | | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead | \\ \\ || Subsystem || Report/Notes || | Docker |
[jira] [Commented] (HDFS-9016) Display upgrade domain information in fsck
[ https://issues.apache.org/jira/browse/HDFS-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243689#comment-15243689 ] Ming Ma commented on HDFS-9016: --- The TestReadStripedFileWithDecoding and TestWriteReadStripedFile failures aren't related to this patch. All other tests passed locally. > Display upgrade domain information in fsck > -- > > Key: HDFS-9016 > URL: https://issues.apache.org/jira/browse/HDFS-9016 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-9016.patch > > > This will make it easy for people to use fsck to check block placement when > upgrade domain is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10299) libhdfs++: File length doesn't always include the last block if it's being written to
[ https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaowei Zhu updated HDFS-10299: --- Attachment: HDFS-10299.HDFS-8707.000.patch > libhdfs++: File length doesn't always include the last block if it's being > written to > --- > > Key: HDFS-10299 > URL: https://issues.apache.org/jira/browse/HDFS-10299 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: Xiaowei Zhu > Attachments: HDFS-10299.HDFS-8707.000.patch > > > It looks like we aren't factoring the last block of files that are still being > written to, or haven't been closed yet, into the length of the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10299) libhdfs++: File length doesn't always include the last block if it's being written to
[ https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaowei Zhu updated HDFS-10299: --- Status: Patch Available (was: In Progress) > libhdfs++: File length doesn't always include the last block if it's being > written to > --- > > Key: HDFS-10299 > URL: https://issues.apache.org/jira/browse/HDFS-10299 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: Xiaowei Zhu > Attachments: HDFS-10299.HDFS-8707.000.patch > > > It looks like we aren't factoring the last block of files that are still being > written to, or haven't been closed yet, into the length of the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9670) DistCp throws NPE when source is root
[ https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243675#comment-15243675 ] Hadoop QA commented on HDFS-9670: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 8s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s {color} | {color:green} hadoop-tools/hadoop-distcp: patch generated 0 new + 18 unchanged - 1 fixed = 18 total (was 19) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 32s {color} | {color:green} hadoop-distcp in the patch passed with JDK v1.8.0_77. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 25s {color} | {color:green} hadoop-distcp in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 29m 33s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799010/HDFS-9670.002.patch | | JIRA Issue | HDFS-9670 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux db40902e1ce5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 69f3d42 | | Default Java |
[jira] [Updated] (HDFS-9543) DiskBalancer : Add Data mover
[ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-9543: --- Description: This patch adds the actual mover logic to the datanode. (was: This patch adds the RPCs and mover logic that allows data to be moved from one storage partition to another) > DiskBalancer : Add Data mover > -- > > Key: HDFS-9543 > URL: https://issues.apache.org/jira/browse/HDFS-9543 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9543-HDFS-1312.001.patch > > > This patch adds the actual mover logic to the datanode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-10299) libhdfs++: File length doesn't always include the last block if it's being written to
[ https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-10299 started by Xiaowei Zhu. -- > libhdfs++: File length doesn't always include the last block if it's being > written to > --- > > Key: HDFS-10299 > URL: https://issues.apache.org/jira/browse/HDFS-10299 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: Xiaowei Zhu > > It looks like we aren't factoring the last block of files that are still being > written to, or haven't been closed yet, into the length of the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9543) DiskBalancer : Add Data mover
[ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243672#comment-15243672 ] Anu Engineer commented on HDFS-9543: Test failures are not related to this patch. > DiskBalancer : Add Data mover > -- > > Key: HDFS-9543 > URL: https://issues.apache.org/jira/browse/HDFS-9543 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9543-HDFS-1312.001.patch > > > This patch adds the RPCs and mover logic that allows data to be moved from > one storage partition to another -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9543) DiskBalancer : Add Data mover
[ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243664#comment-15243664 ] Hadoop QA commented on HDFS-9543: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 50s {color} | {color:green} HDFS-1312 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s {color} | {color:green} HDFS-1312 passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s {color} | {color:green} HDFS-1312 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s {color} | {color:green} HDFS-1312 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s {color} | {color:green} HDFS-1312 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s {color} | {color:green} HDFS-1312 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 45s {color} | {color:green} HDFS-1312 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s {color} | {color:green} HDFS-1312 passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 43s {color} | {color:green} HDFS-1312 passed with JDK v1.7.0_95 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 8s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 new + 180 unchanged - 1 fixed = 180 total (was 181) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 13s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 23s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 32s {color} | {color:red} Patch generated 2 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 119m 6s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.datanode.TestDataNodeUUID | | | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead | | | hadoop.hdfs.security.TestDelegationTokenForProxyUser | | | hadoop.hdfs.TestFileAppend | | | hadoop.hdfs.tools.TestDFSAdmin | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | JDK v1.8.0_77
[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243650#comment-15243650 ] John Zhuge commented on HDFS-10300: --- It is the old style of JUnit 3. Switch to JUnit 4 annotation style. > TestDistCpSystem should share MiniDFSCluster > > > Key: HDFS-10300 > URL: https://issues.apache.org/jira/browse/HDFS-10300 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > > The test cases in this class should share MiniDFSCluster if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243647#comment-15243647 ] John Zhuge commented on HDFS-10300: --- Do not understand why {{TestDistCpSystem}} extends {{TestCase}}. > TestDistCpSystem should share MiniDFSCluster > > > Key: HDFS-10300 > URL: https://issues.apache.org/jira/browse/HDFS-10300 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > > The test cases in this class should share MiniDFSCluster if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
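The two comments above can be sketched together: drop the JUnit 3 {{TestCase}} superclass and share one expensive cluster across cases via class-level lifecycle hooks. In the sketch below, {{ExpensiveCluster}} stands in for {{MiniDFSCluster}} so the code is self-contained; in real JUnit 4 the hooks would carry {{@BeforeClass}}, {{@AfterClass}}, and {{@Test}} annotations instead of being called from main.

```java
// Self-contained sketch of sharing one expensive fixture across test cases.
// ExpensiveCluster stands in for MiniDFSCluster; in JUnit 4 the hooks below
// would be annotated @BeforeClass, @AfterClass, and @Test instead of the
// class extending junit.framework.TestCase.
public class SharedClusterSketch {
    static class ExpensiveCluster {
        static int startCount = 0;          // counts cluster startups
        ExpensiveCluster() { startCount++; }
        void shutdown() {}
    }

    static ExpensiveCluster cluster;

    static void beforeClass() { cluster = new ExpensiveCluster(); }   // @BeforeClass
    static void afterClass()  { cluster.shutdown(); cluster = null; } // @AfterClass

    static void testCopy()   { if (cluster == null) throw new AssertionError(); } // @Test
    static void testUpdate() { if (cluster == null) throw new AssertionError(); } // @Test

    public static void main(String[] args) {
        beforeClass();
        testCopy();
        testUpdate();
        afterClass();
        // Every case reused the same cluster: exactly one startup.
        System.out.println("cluster startups: " + ExpensiveCluster.startCount);
    }
}
```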
[jira] [Commented] (HDFS-9670) DistCp throws NPE when source is root
[ https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243556#comment-15243556 ] John Zhuge commented on HDFS-9670: -- Created HDFS-10300 "TestDistCpSystem should share MiniDFSCluster". > DistCp throws NPE when source is root > - > > Key: HDFS-9670 > URL: https://issues.apache.org/jira/browse/HDFS-9670 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 2.6.0 >Reporter: Yongjun Zhang >Assignee: John Zhuge > Labels: supportability > Fix For: 2.8.0 > > Attachments: HDFS-9670.001.patch, HDFS-9670.002.patch > > > Symptom: > {quote} > [root@vb0724 ~]# hadoop distcp hdfs://X:8020/ hdfs://Y:8020/ > 16/01/20 11:33:33 INFO tools.DistCp: Input Options: > DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, > ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', > copyStrategy='uniformsize', sourceFileListing=null, > sourcePaths=[hdfs://X:8020/], targetPath=hdfs://Y:8020/, > targetPathExists=true, preserveRawXattrs=false, filtersFile='null'} > 16/01/20 11:33:33 INFO client.RMProxy: Connecting to ResourceManager at Z:8032 > 16/01/20 11:33:33 ERROR tools.DistCp: Exception encountered > java.lang.NullPointerException > at > org.apache.hadoop.tools.util.DistCpUtils.getRelativePath(DistCpUtils.java:144) > at > org.apache.hadoop.tools.SimpleCopyListing.writeToFileListing(SimpleCopyListing.java:598) > at > org.apache.hadoop.tools.SimpleCopyListing.writeToFileListingRoot(SimpleCopyListing.java:583) > at > org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:313) > at > org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:174) > at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86) > at > org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90) > at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86) > at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:365) > at org.apache.hadoop.tools.DistCp.execute(DistCp.java:171) > at org.apache.hadoop.tools.DistCp.run(DistCp.java:122) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.tools.DistCp.main(DistCp.java:429) > {quote} > Relevant code: > {code} > private Path computeSourceRootPath(FileStatus sourceStatus, > DistCpOptions options) throws > IOException { > Path target = options.getTargetPath(); > FileSystem targetFS = target.getFileSystem(getConf()); > final boolean targetPathExists = options.getTargetPathExists(); > boolean solitaryFile = options.getSourcePaths().size() == 1 > && > !sourceStatus.isDirectory(); > if (solitaryFile) { > if (targetFS.isFile(target) || !targetPathExists) { > return sourceStatus.getPath(); > } else { > return sourceStatus.getPath().getParent(); > } > } else { > boolean specialHandling = (options.getSourcePaths().size() == 1 && > !targetPathExists) || > options.shouldSyncFolder() || options.shouldOverwrite(); > return specialHandling && sourceStatus.isDirectory() ? > sourceStatus.getPath() : > sourceStatus.getPath().getParent(); > } > } > {code} > We can see that it could return NULL at the end when doing > {{sourceStatus.getPath().getParent()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
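The null return is easy to see in isolation: the parent of the root path is null, which is exactly what {{sourceStatus.getPath().getParent()}} produces when the source is {{/}}. A tiny stand-in for {{org.apache.hadoop.fs.Path#getParent}} keeps the sketch self-contained:

```java
// Minimal illustration of the NPE's cause: getParent() returns null for the
// root directory, so computeSourceRootPath() can hand a null path to
// getRelativePath(). parent() below is a stand-in for the Hadoop method,
// not the real org.apache.hadoop.fs.Path implementation.
public class RootParentSketch {
    static String parent(String path) {
        if (path.equals("/")) {
            return null;                  // root has no parent -> later NPE
        }
        int lastSlash = path.lastIndexOf('/');
        return lastSlash == 0 ? "/" : path.substring(0, lastSlash);
    }
}
```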
[jira] [Created] (HDFS-10300) TestDistCpSystem should share MiniDFSCluster
John Zhuge created HDFS-10300: - Summary: TestDistCpSystem should share MiniDFSCluster Key: HDFS-10300 URL: https://issues.apache.org/jira/browse/HDFS-10300 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.6.0 Reporter: John Zhuge Assignee: John Zhuge Priority: Trivial The test cases in this class should share MiniDFSCluster if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9670) DistCp throws NPE when source is root
[ https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-9670: - Attachment: HDFS-9670.002.patch Patch 002: * Use 1 mini cluster in unit test because the issue is reproducible with both source and target on the same cluster. > DistCp throws NPE when source is root > - > > Key: HDFS-9670 > URL: https://issues.apache.org/jira/browse/HDFS-9670 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 2.6.0 >Reporter: Yongjun Zhang >Assignee: John Zhuge > Labels: supportability > Fix For: 2.8.0 > > Attachments: HDFS-9670.001.patch, HDFS-9670.002.patch > > > Symptom: > {quote} > [root@vb0724 ~]# hadoop distcp hdfs://X:8020/ hdfs://Y:8020/ > 16/01/20 11:33:33 INFO tools.DistCp: Input Options: > DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, > ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', > copyStrategy='uniformsize', sourceFileListing=null, > sourcePaths=[hdfs://X:8020/], targetPath=hdfs://Y:8020/, > targetPathExists=true, preserveRawXattrs=false, filtersFile='null'} > 16/01/20 11:33:33 INFO client.RMProxy: Connecting to ResourceManager at Z:8032 > 16/01/20 11:33:33 ERROR tools.DistCp: Exception encountered > java.lang.NullPointerException > at > org.apache.hadoop.tools.util.DistCpUtils.getRelativePath(DistCpUtils.java:144) > at > org.apache.hadoop.tools.SimpleCopyListing.writeToFileListing(SimpleCopyListing.java:598) > at > org.apache.hadoop.tools.SimpleCopyListing.writeToFileListingRoot(SimpleCopyListing.java:583) > at > org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:313) > at > org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:174) > at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86) > at > org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90) > at 
org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86) > at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:365) > at org.apache.hadoop.tools.DistCp.execute(DistCp.java:171) > at org.apache.hadoop.tools.DistCp.run(DistCp.java:122) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.tools.DistCp.main(DistCp.java:429) > {quote} > Relevant code: > {code} > private Path computeSourceRootPath(FileStatus sourceStatus, > DistCpOptions options) throws > IOException { > Path target = options.getTargetPath(); > FileSystem targetFS = target.getFileSystem(getConf()); > final boolean targetPathExists = options.getTargetPathExists(); > boolean solitaryFile = options.getSourcePaths().size() == 1 > && > !sourceStatus.isDirectory(); > if (solitaryFile) { > if (targetFS.isFile(target) || !targetPathExists) { > return sourceStatus.getPath(); > } else { > return sourceStatus.getPath().getParent(); > } > } else { > boolean specialHandling = (options.getSourcePaths().size() == 1 && > !targetPathExists) || > options.shouldSyncFolder() || options.shouldOverwrite(); > return specialHandling && sourceStatus.isDirectory() ? > sourceStatus.getPath() : > sourceStatus.getPath().getParent(); > } > } > {code} > We can see that it could return NULL at the end when doing > {{sourceStatus.getPath().getParent()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
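The NPE boils down to {{Path.getParent()}} returning null once the source path is already the filesystem root, so {{computeSourceRootPath}} can hand a null path to {{DistCpUtils.getRelativePath}}. A minimal self-contained sketch of that semantics (a hypothetical stand-in mimicking, not using, {{org.apache.hadoop.fs.Path}}):

```java
// Hypothetical stand-in for org.apache.hadoop.fs.Path#getParent (names and
// logic are illustrative, not Hadoop's actual implementation): for the root
// path it returns null, which is the value computeSourceRootPath can end up
// passing to DistCpUtils.getRelativePath when the copy source is "/".
public class RootParentSketch {
    static String getParent(String path) {
        if (path.equals("/")) {
            return null; // root has no parent -> callers must handle null
        }
        int lastSlash = path.lastIndexOf('/');
        return lastSlash == 0 ? "/" : path.substring(0, lastSlash);
    }

    public static void main(String[] args) {
        System.out.println(getParent("/user/data")); // /user
        System.out.println(getParent("/"));          // null: the NPE trigger
    }
}
```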
[jira] [Commented] (HDFS-10291) TestShortCircuitLocalRead failing
[ https://issues.apache.org/jira/browse/HDFS-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243553#comment-15243553 ] Mingliang Liu commented on HDFS-10291: -- Thanks for the detailed investigation, [~ste...@apache.org]. I prefer the first choice (fixing the test only). IMHO, throwing an exception is better-defined behavior than having HDFS itself shrink the read length, as this is not really a necessary feature. > TestShortCircuitLocalRead failing > - > > Key: HDFS-10291 > URL: https://issues.apache.org/jira/browse/HDFS-10291 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > > {{TestShortCircuitLocalRead}} failing as length of read is considered off end > of buffer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
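The two behaviors being weighed can be sketched as follows (hypothetical helper, not HDFS source): the {{java.io.InputStream}} contract for {{read(b, off, len)}} is to throw {{IndexOutOfBoundsException}} when the requested length runs past the end of the buffer, rather than silently shrinking it.

```java
// Hypothetical bounds check, not HDFS code: reject a read whose requested
// length runs off the end of the caller's buffer instead of shrinking it,
// matching the java.io.InputStream#read(byte[], int, int) contract.
public class ReadBoundsSketch {
    static int checkedReadLength(byte[] buf, int off, int len) {
        if (off < 0 || len < 0 || len > buf.length - off) {
            throw new IndexOutOfBoundsException(
                "read at off=" + off + " len=" + len
                + " exceeds buffer length " + buf.length);
        }
        return len; // a real read would fill buf[off..off+len)
    }

    public static void main(String[] args) {
        byte[] buf = new byte[10];
        System.out.println(checkedReadLength(buf, 0, 10)); // 10: fits exactly
        try {
            checkedReadLength(buf, 5, 10); // off+len = 15 > 10: rejected
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```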
[jira] [Commented] (HDFS-9670) DistCp throws NPE when source is root
[ https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243547#comment-15243547 ] John Zhuge commented on HDFS-9670: -- Thanks, will fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10297) Increase default balance bandwidth and concurrent moves
[ https://issues.apache.org/jira/browse/HDFS-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243522#comment-15243522 ] John Zhuge commented on HDFS-10297: --- Thanks [~jingzhao]. > Increase default balance bandwidth and concurrent moves > --- > > Key: HDFS-10297 > URL: https://issues.apache.org/jira/browse/HDFS-10297 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Attachments: HDFS-10297.001.patch > > > Adjust the default values to better support the current level of customer > host and network configurations. > Increase the default for property {{dfs.datanode.balance.bandwidthPerSec}} > from 1 to 10 MB. Apply to DN. 10 MB/s is about 10% of the GbE network. > Increase the default for property > {{dfs.datanode.balance.max.concurrent.moves}} from 5 to 50. Apply to DN and > Balancer. The default number of DN receiver threads is 4096. The default > number of balancer mover threads is 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
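For operators who do not want to wait for the new defaults, the same values can be set explicitly in hdfs-site.xml on the DataNodes (and on the Balancer host for the concurrent-moves property). A sketch, assuming the bandwidth value is given in bytes per second as the property name implies:

```xml
<!-- hdfs-site.xml: the values proposed above, set explicitly -->
<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <!-- 10485760 bytes/s = 10 MB/s, roughly 10% of a GbE link -->
  <value>10485760</value>
</property>
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <value>50</value>
</property>
```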
[jira] [Commented] (HDFS-10289) Balancer configures DNs directly
[ https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243518#comment-15243518 ] Kihwal Lee commented on HDFS-10289: --- bq. Does anyone remember why the Balancer was a separate process from the Namenode, rather than just a thread in it? Sometimes you want to stop it and restart it later. It is not impossible to do it inside NN, but is easier if it's separate. > Balancer configures DNs directly > > > Key: HDFS-10289 > URL: https://issues.apache.org/jira/browse/HDFS-10289 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Critical > > Balancer directly configures the 2 balance-related properties > (bandwidthPerSec and concurrentMoves) on the DNs involved. > Details: > * Before each balancing iteration, set the properties on all DNs involved in > the current iteration. > * The DN property changes will not survive restart. > * Balancer gets the property values from command line or its config file. > * Need new DN APIs to query and set the 2 properties. > * No need to edit the config file on each DN or run {{hdfs dfsadmin > -setBalancerBandwidth}} to configure every DN in the cluster. > Pros: > * Improve ease of use because all configurations are done at one place, the > balancer. We saw many customers often forgot to set concurrentMoves properly > since it is required on both DN and Balancer. > * Support new DNs added between iterations > * Handle DN restarts between iterations > * May be able to dynamically adjust the thresholds in different iterations. > Don't know how useful though. > Cons: > * New DN property API > * A malicious/misconfigured balancer may overwhelm DNs. {{hdfs dfsadmin > -setBalancerBandwidth}} has the same issue. Also Balancer can only be run by > admin. > Questions: > * Can we create {{BalancerConcurrentMovesCommand}} similar to > {{BalancerBandwidthCommand}}? 
Can Balancer use them directly without going > through NN? > One proposal to implement HDFS-7466 calls for an API to query DN properties. > DN Conf Servlet returns all config properties. It does not return individual > property and it does not return the value set by {{hdfs dfsadmin > -setBalancerBandwidth}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10256) Use GenericTestUtils.getTestDir method in tests for temporary directories
[ https://issues.apache.org/jira/browse/HDFS-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243502#comment-15243502 ] Kihwal Lee commented on HDFS-10256: --- We have seen cases where a mini dfs cluster startup fails due to not being able to delete the {{data_dir}} in {{initMiniDFSCluster()}}. Depending on when the build machine gets busy, it hits random test cases. If we make it sleep a few seconds and try again, it works most of the time. The surefire doc says, {quote} After the test-set has completed, the process executes java.lang.System.exit(0) which starts shutdown hooks. At this point the process may run next 30 seconds until all non daemon Threads die. After the period of time has elapsed, the process kills itself by java.lang.Runtime.halt(0). {quote} {{MiniDFSCluster#shutdown()}} registers {{base_dir}} to be deleted on shutdown. If this gets slow, the next test JVM will start to run before the shutdown hook completes. But forcing every test to call {{shutdown(true)}} can slow things down. Instead, each instance should get a random {{base_dir}}, so that the deletion through the shutdown hook and the subsequent new test setup can overlap. [~steve_l] mentioned this in HADOOP-12984. bq. many buildups of test dirs now use something random, rather than a hard-coded path like "dfs". This includes minidfs cluster...which should improve parallelism on test runs. Can we actually make sure each MiniDFSCluster gets a unique base directory? > Use GenericTestUtils.getTestDir method in tests for temporary directories > - > > Key: HDFS-10256 > URL: https://issues.apache.org/jira/browse/HDFS-10256 > Project: Hadoop HDFS > Issue Type: Improvement > Components: build, test >Reporter: Vinayakumar B >Assignee: Vinayakumar B > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
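The unique-base-dir idea above can be sketched like this (illustrative code only, not the actual MiniDFSCluster or GenericTestUtils change; the {{test.build.data}} property name and fallback path are assumptions): each cluster instance gets its own randomized directory, so a slow shutdown-hook deletion from one JVM cannot collide with the next test's setup.

```java
import java.io.File;
import java.util.UUID;

// Illustrative sketch: every MiniDFSCluster-style instance gets a unique
// randomized base_dir, so deleting a previous run's directory via shutdown
// hook can overlap safely with the next run's setup.
public class UniqueBaseDirSketch {
    static File uniqueBaseDir(String prefix) {
        String root = System.getProperty("test.build.data", "target/test/data");
        return new File(root, prefix + "-" + UUID.randomUUID());
    }

    public static void main(String[] args) {
        File first = uniqueBaseDir("dfs");
        File second = uniqueBaseDir("dfs");
        // Two instances never share a directory, so cleanup cannot race setup.
        System.out.println(first.equals(second)); // false
    }
}
```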
[jira] [Commented] (HDFS-9670) DistCp throws NPE when source is root
[ https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243482#comment-15243482 ] Yongjun Zhang commented on HDFS-9670: - Hi [~jzhuge], For the new test you added, it seems creating one cluster would be sufficient. Would you please look into it? Then we can consider a future jira for consolidating the set of tests. Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9670) DistCp throws NPE when source is root
[ https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243416#comment-15243416 ] John Zhuge commented on HDFS-9670: -- Very good point! * File a separate jira to consolidate mini cluster in this test class? * Or bundle the change in this patch? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10283) o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243399#comment-15243399 ] Hudson commented on HDFS-10283: --- FAILURE: Integrated in Hadoop-trunk-Commit #9619 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9619/]) HDFS-10283. (jing9: rev 89a838769ff5b6c64565e6949b14d7fed05daf54) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java > o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending > fails intermittently > -- > > Key: HDFS-10283 > URL: https://issues.apache.org/jira/browse/HDFS-10283 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.8.0 >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.9.0 > > Attachments: HDFS-10283.000.patch > > > The test fails with the following exception: > {code} > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. 
> at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
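Not the fix committed here (the patch adjusts the test), but for reference, the knob the exception message points at is a client-side setting. On very small clusters (e.g. two or three DataNodes) where no replacement node can exist, a common workaround is to disable replacement; a sketch:

```xml
<!-- hdfs-site.xml (client side): relax the datanode replacement policy on
     tiny clusters where no spare node is available to replace a bad one -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
```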
[jira] [Commented] (HDFS-9539) libhdfs++: enable default configuration files
[ https://issues.apache.org/jira/browse/HDFS-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243402#comment-15243402 ] James Clampffer commented on HDFS-9539: --- So the idea is just grab a default config file, turn it into a giant C string literal, and link that in? That sounds good to me. > libhdfs++: enable default configuration files > - > > Key: HDFS-9539 > URL: https://issues.apache.org/jira/browse/HDFS-9539 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Bob Hansen > > In the Java implementation of config files, the Hadoop jars included a > default core-default and hdfs-default.xml file that provided default values > for the run-time configurations. libhdfs++ should honor that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10293) StripedFileTestUtil#readAll flaky
[ https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243401#comment-15243401 ] Hudson commented on HDFS-10293: --- FAILURE: Integrated in Hadoop-trunk-Commit #9619 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9619/]) HDFS-10293. StripedFileTestUtil#readAll flaky. Contributed by Mingliang (jing9: rev 55e19b7f0c1243090dff2d08ed785cefd420b009) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/StripedFileTestUtil.java > StripedFileTestUtil#readAll flaky > - > > Key: HDFS-10293 > URL: https://issues.apache.org/jira/browse/HDFS-10293 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 3.0.0 > > Attachments: HDFS-10293.000.patch > > > The flaky test helper method causes several unit tests to fail intermittently. > For example, the > {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}} > timed out in a recent run (see > [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]), > which can be easily reproduced locally. > Debugging the code, chances are that the helper method is stuck in an > infinite loop. We need a fix to make the test robust. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-10299) libhdfs++: File length doesn't always include the last block if it's being written to
James Clampffer created HDFS-10299: -- Summary: libhdfs++: File length doesn't always include the last block if it's being written to Key: HDFS-10299 URL: https://issues.apache.org/jira/browse/HDFS-10299 Project: Hadoop HDFS Issue Type: Sub-task Reporter: James Clampffer Assignee: Xiaowei Zhu It looks like we aren't factoring the last block of files that are still being written to, or haven't been closed yet, into the length of the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9670) DistCp throws NPE when source is root
[ https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243389#comment-15243389 ] Yongjun Zhang commented on HDFS-9670: - Hi [~jzhuge], Thanks for working on this issue. The solution looks good to me. One comment about the test code: starting a mini cluster is expensive, so ideally we would use the same cluster for the whole set of tests. In this case, can we at least create a single cluster and run distcp within that same cluster? Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9543) DiskBalancer : Add Data mover
[ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-9543: --- Attachment: HDFS-9543-HDFS-1312.001.patch > DiskBalancer : Add Data mover > -- > > Key: HDFS-9543 > URL: https://issues.apache.org/jira/browse/HDFS-9543 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9543-HDFS-1312.001.patch > > > This patch adds the RPCs and mover logic that allows data to be moved from > one storage partition to another -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9543) DiskBalancer : Add Data mover
[ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-9543: --- Status: Patch Available (was: Open) > DiskBalancer : Add Data mover > -- > > Key: HDFS-9543 > URL: https://issues.apache.org/jira/browse/HDFS-9543 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9543-HDFS-1312.001.patch > > > This patch adds the RPCs and mover logic that allows data to be moved from > one storage partition to another -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10289) Balancer configures DNs directly
[ https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243385#comment-15243385 ] Ravi Prakash commented on HDFS-10289: - Thanks for trying to improve the Balancer John! Does anyone remember why the Balancer was a separate process from the Namenode, rather than just a thread in it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10297) Increase default balance bandwidth and concurrent moves
[ https://issues.apache.org/jira/browse/HDFS-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243340#comment-15243340 ] Jing Zhao commented on HDFS-10297: -- [~jzhuge], thanks for reporting the issue and providing a patch. I noticed you set the "Fix Version" to 2.8.0. We usually update "Fix Version" only after the patch has been committed, so I have removed that field for you. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243338#comment-15243338 ] Hadoop QA commented on HDFS-10220: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 601 unchanged - 4 fixed = 602 total (was 605) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 107m 30s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 50s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 236m 9s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID | | | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate | | | hadoop.hdfs.server.namenode.TestNestedEncryptionZones | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion | | |
[jira] [Updated] (HDFS-10297) Increase default balance bandwidth and concurrent moves
[ https://issues.apache.org/jira/browse/HDFS-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-10297: - Fix Version/s: (was: 2.8.0) > Increase default balance bandwidth and concurrent moves > --- > > Key: HDFS-10297 > URL: https://issues.apache.org/jira/browse/HDFS-10297 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Attachments: HDFS-10297.001.patch > > > Adjust the default values to better support the current level of customer > host and network configurations. > Increase the default for property {{dfs.datanode.balance.bandwidthPerSec}} > from 1 to 10 MB. Apply to DN. 10 MB/s is about 10% of the GbE network. > Increase the default for property > {{dfs.datanode.balance.max.concurrent.moves}} from 5 to 50. Apply to DN and > Balancer. The default number of DN receiver threads is 4096. The default > number of balancer mover threads is 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10297) Increase default balance bandwidth and concurrent moves
[ https://issues.apache.org/jira/browse/HDFS-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10297: -- Fix Version/s: 2.8.0 Status: Patch Available (was: Open) > Increase default balance bandwidth and concurrent moves > --- > > Key: HDFS-10297 > URL: https://issues.apache.org/jira/browse/HDFS-10297 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > Attachments: HDFS-10297.001.patch > > > Adjust the default values to better support the current level of customer > host and network configurations. > Increase the default for property {{dfs.datanode.balance.bandwidthPerSec}} > from 1 to 10 MB. Apply to DN. 10 MB/s is about 10% of the GbE network. > Increase the default for property > {{dfs.datanode.balance.max.concurrent.moves}} from 5 to 50. Apply to DN and > Balancer. The default number of DN receiver threads is 4096. The default > number of balancer mover threads is 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10297) Increase default balance bandwidth and concurrent moves
[ https://issues.apache.org/jira/browse/HDFS-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10297: -- Attachment: HDFS-10297.001.patch Patch 001: * Change values in {{hdfs-default.xml}} * Change values of {{DFS_DATANODE_BALANCE_BANDWIDTHPERSEC_DEFAULT}} and {{DFS_DATANODE_BALANCE_MAX_NUM_CONCURRENT_MOVES_DEFAULT}} Test output: {noformat} $ ( cd hadoop-hdfs-project/hadoop-hdfs ; mvn test -Dtest=TestBalancer#testBalancer2 ) $ grep 'concurrent\.moves\|Number threads for balancing' hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports/*output* 2016-04-15 10:46:58,563 [Thread-0] INFO datanode.DataNode (DataXceiverServer.java:(79)) - Number threads for balancing is 50 2016-04-15 10:46:58,781 [Thread-0] INFO datanode.DataNode (DataXceiverServer.java:(79)) - Number threads for balancing is 50 2016-04-15 10:46:59,942 [Thread-0] INFO datanode.DataNode (DataXceiverServer.java:(79)) - Number threads for balancing is 50 2016-04-15 10:47:00,153 [Thread-0] INFO balancer.Balancer (Balancer.java:getInt(240)) - dfs.datanode.balance.max.concurrent.moves = 50 (default=50) 2016-04-15 10:47:04,205 [Thread-0] INFO balancer.Balancer (Balancer.java:getInt(240)) - dfs.datanode.balance.max.concurrent.moves = 50 (default=50) [jzhuge@jzhuge-MBP hadoop](trunk *)$ grep 'bandwidthPer\|Balancing bandwidth' hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports/*output* 2016-04-15 10:46:58,563 [Thread-0] INFO datanode.DataNode (DataXceiverServer.java:(78)) - Balancing bandwidth is 10485760 bytes/s 2016-04-15 10:46:58,781 [Thread-0] INFO datanode.DataNode (DataXceiverServer.java:(78)) - Balancing bandwidth is 10485760 bytes/s 2016-04-15 10:46:59,942 [Thread-0] INFO datanode.DataNode (DataXceiverServer.java:(78)) - Balancing bandwidth is 10485760 bytes/s {noformat} > Increase default balance bandwidth and concurrent moves > --- > > Key: HDFS-10297 > URL: https://issues.apache.org/jira/browse/HDFS-10297 > Project: Hadoop HDFS > Issue Type: 
Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Attachments: HDFS-10297.001.patch > > > Adjust the default values to better support the current level of customer > host and network configurations. > Increase the default for property {{dfs.datanode.balance.bandwidthPerSec}} > from 1 to 10 MB. Apply to DN. 10 MB/s is about 10% of the GbE network. > Increase the default for property > {{dfs.datanode.balance.max.concurrent.moves}} from 5 to 50. Apply to DN and > Balancer. The default number of DN receiver threads is 4096. The default > number of balancer mover threads is 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10293) StripedFileTestUtil#readAll flaky
[ https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243326#comment-15243326 ] Mingliang Liu commented on HDFS-10293: -- Thanks for your review and commit, [~jing9]. > StripedFileTestUtil#readAll flaky > - > > Key: HDFS-10293 > URL: https://issues.apache.org/jira/browse/HDFS-10293 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 3.0.0 > > Attachments: HDFS-10293.000.patch > > > The flaky test helper method causes several UTs to fail intermittently. > For example, the > {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}} > timed out in a recent run (see > [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]), > which can be easily reproduced locally. > Debugging the code, chances are that the helper method is stuck in an > infinite loop. We need a fix to make the test robust. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10293) StripedFileTestUtil#readAll flaky
[ https://issues.apache.org/jira/browse/HDFS-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-10293: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) I've committed this to trunk. Thanks for the fix, [~liuml07]! > StripedFileTestUtil#readAll flaky > - > > Key: HDFS-10293 > URL: https://issues.apache.org/jira/browse/HDFS-10293 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 3.0.0 > > Attachments: HDFS-10293.000.patch > > > The flaky test helper method causes several UTs to fail intermittently. > For example, the > {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}} > timed out in a recent run (see > [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]), > which can be easily reproduced locally. > Debugging the code, chances are that the helper method is stuck in an > infinite loop. We need a fix to make the test robust. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10283) o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243319#comment-15243319 ] Mingliang Liu commented on HDFS-10283: -- Thank you [~jingzhao] for your review and commit. > o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending > fails intermittently > -- > > Key: HDFS-10283 > URL: https://issues.apache.org/jira/browse/HDFS-10283 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.8.0 >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.9.0 > > Attachments: HDFS-10283.000.patch > > > The test fails with exception as following: > {code} > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10283) o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-10283: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. Thanks for the contribution, [~liuml07]! > o.a.h.hdfs.server.namenode.TestFSImageWithSnapshot#testSaveLoadImageWithAppending > fails intermittently > -- > > Key: HDFS-10283 > URL: https://issues.apache.org/jira/browse/HDFS-10283 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.8.0 >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.9.0 > > Attachments: HDFS-10283.000.patch > > > The test fails with exception as following: > {code} > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:47227,DS-dd109c14-79e5-4380-ac5e-4434cd7e25b5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:56949,DS-6c0be75e-a78c-41b9-bfd0-7ee0cdefaa0e,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. 
> at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9732) Remove DelegationTokenIdentifier.toString() —for better logging output
[ https://issues.apache.org/jira/browse/HDFS-9732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243216#comment-15243216 ] Steve Loughran commented on HDFS-9732: -- Getting close, just some details in {{DelegationTokenFetcher}} # good to see you are using StringBuilder; can you split the append() sequence into one per entry? That's in {{toStringStable}} and {{printTokensToString}}. If you use IntelliJ IDEA, it'll do that automatically if you ask it nicely. # Line 78: how about we make the text just "print verbose output"? In {{TestDelegationTokenFetcher}} # Line 141: the assert statement should build up a string to print on failure. Imagine: everything you'd need to understand the problem from a Jenkins test failure. # Lines 142-143: use SLF4J logging APIs, not System.out. > Remove DelegationTokenIdentifier.toString() —for better logging output > -- > > Key: HDFS-9732 > URL: https://issues.apache.org/jira/browse/HDFS-9732 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.2 >Reporter: Steve Loughran >Assignee: Yongjun Zhang > Attachments: HADOOP-12752-001.patch, HDFS-9732-000.patch, > HDFS-9732.001.patch, HDFS-9732.002.patch, HDFS-9732.003.patch > > Original Estimate: 0.5h > Remaining Estimate: 0.5h > > HDFS {{DelegationTokenIdentifier.toString()}} adds some diagnostics info, > owner, sequence number. But its superclass, > {{AbstractDelegationTokenIdentifier}} contains a lot more information, > including token issue and expiry times. > Because {{DelegationTokenIdentifier.toString()}} doesn't include this data, > information that is potentially useful for Kerberos diagnostics is lost. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
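The "one append() per entry" style requested in the review can be sketched like this. The field set below is illustrative only; the real {{DelegationTokenIdentifier}} carries its own fields (owner, renewer, issue and expiry times, sequence number), and this is not the actual patch code.

```java
// Sketch of a toStringStable() written with one append() chain per entry,
// as requested in the review above. Field names are illustrative only.
public class TokenStringSketch {
    public static String toStringStable(String owner, String renewer,
                                        long issueDate, long maxDate, int sequenceNumber) {
        StringBuilder sb = new StringBuilder();
        sb.append("owner=").append(owner);
        sb.append(", renewer=").append(renewer);
        sb.append(", issueDate=").append(issueDate);
        sb.append(", maxDate=").append(maxDate);
        sb.append(", sequenceNumber=").append(sequenceNumber);
        return sb.toString();
    }
}
```

Splitting one long append() chain into one statement per entry keeps each field's diff footprint to a single line, which is what makes later additions (like the superclass's expiry times) easy to review.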
[jira] [Updated] (HDFS-10291) TestShortCircuitLocalRead failing
[ https://issues.apache.org/jira/browse/HDFS-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-10291: -- Description: {{TestShortCircuitLocalRead}} failing as length of read is considered off end of buffer. (was: {{TestShortCircuitLocalRead}} failing as length of read is considered off end of buffer. There's an off-by-one error somewhere in the test or the new validation code) > TestShortCircuitLocalRead failing > - > > Key: HDFS-10291 > URL: https://issues.apache.org/jira/browse/HDFS-10291 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > > {{TestShortCircuitLocalRead}} failing as length of read is considered off end > of buffer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-334) dfsadmin -metasave should also log corrupt replicas info
[ https://issues.apache.org/jira/browse/HDFS-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denis Bolshakov reassigned HDFS-334: Assignee: Denis Bolshakov > dfsadmin -metasave should also log corrupt replicas info > > > Key: HDFS-334 > URL: https://issues.apache.org/jira/browse/HDFS-334 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Lohit Vijayarenu >Assignee: Denis Bolshakov >Priority: Minor > Labels: newbie > > _hadoop dfsadmin -metasave _ should also dump information about > corrupt replicas map. This could help in telling if pending replication was > due to corrupt replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-334) dfsadmin -metasave should also log corrupt replicas info
[ https://issues.apache.org/jira/browse/HDFS-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243145#comment-15243145 ] Andras Bokor commented on HDFS-334: --- Not at all. Go ahead. > dfsadmin -metasave should also log corrupt replicas info > > > Key: HDFS-334 > URL: https://issues.apache.org/jira/browse/HDFS-334 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Lohit Vijayarenu >Priority: Minor > Labels: newbie > > _hadoop dfsadmin -metasave _ should also dump information about > corrupt replicas map. This could help in telling if pending replication was > due to corrupt replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-334) dfsadmin -metasave should also log corrupt replicas info
[ https://issues.apache.org/jira/browse/HDFS-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243137#comment-15243137 ] Denis Bolshakov commented on HDFS-334: -- [~boky01] Do you mind if I take care of this issue? If so, I will assign it to myself. > dfsadmin -metasave should also log corrupt replicas info > > > Key: HDFS-334 > URL: https://issues.apache.org/jira/browse/HDFS-334 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Lohit Vijayarenu >Priority: Minor > Labels: newbie > > _hadoop dfsadmin -metasave _ should also dump information about > corrupt replicas map. This could help in telling if pending replication was > due to corrupt replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9940) Balancer should not use property dfs.datanode.balance.max.concurrent.moves
[ https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-9940: - Summary: Balancer should not use property dfs.datanode.balance.max.concurrent.moves (was: Balancer should not use property name dfs.datanode.balance.max.concurrent.moves) > Balancer should not use property dfs.datanode.balance.max.concurrent.moves > -- > > Key: HDFS-9940 > URL: https://issues.apache.org/jira/browse/HDFS-9940 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > > It is very confusing for both Balancer and Datanode to use the same property > {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the > Balancer because the property has "datanode" in the name string. Many > customers forget to set the property for the Balancer. > Change the Balancer to use a new property > {{dfs.balancer.max.concurrent.moves}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Fraison updated HDFS-10220: --- Status: Patch Available (was: Open) > Namenode failover due to too long locking in LeaseManager.Monitor > > > Key: HDFS-10220 > URL: https://issues.apache.org/jira/browse/HDFS-10220 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Nicolas Fraison >Assignee: Nicolas Fraison >Priority: Minor > Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, > HADOOP-10220.003.patch, HADOOP-10220.004.patch, threaddump_zkfc.txt > > > I faced a namenode failover due to an unresponsive namenode detected by the > zkfc, with lots of WARN messages (5 million) like this one: > _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All > existing blocks are COMPLETE, lease removed, file closed._ > In the threaddump taken by the zkfc there are lots of threads blocked due to a > lock. > Looking at the code, there is a lock taken by the LeaseManager.Monitor when > some lease must be released. Due to the very large number of leases to be > released, the namenode took too long to release them, blocking all > other tasks and making the zkfc think that the namenode was not > available/stuck. > The idea of this patch is to limit the number of leases released each time we > check for leases, so the lock won't be held for too long a period. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Fraison updated HDFS-10220: --- Status: Open (was: Patch Available) > Namenode failover due to too long locking in LeaseManager.Monitor > > > Key: HDFS-10220 > URL: https://issues.apache.org/jira/browse/HDFS-10220 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Nicolas Fraison >Assignee: Nicolas Fraison >Priority: Minor > Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, > HADOOP-10220.003.patch, HADOOP-10220.004.patch, threaddump_zkfc.txt > > > I faced a namenode failover due to an unresponsive namenode detected by the > zkfc, with lots of WARN messages (5 million) like this one: > _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All > existing blocks are COMPLETE, lease removed, file closed._ > In the threaddump taken by the zkfc there are lots of threads blocked due to a > lock. > Looking at the code, there is a lock taken by the LeaseManager.Monitor when > some lease must be released. Due to the very large number of leases to be > released, the namenode took too long to release them, blocking all > other tasks and making the zkfc think that the namenode was not > available/stuck. > The idea of this patch is to limit the number of leases released each time we > check for leases, so the lock won't be held for too long a period. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Fraison updated HDFS-10220: --- Attachment: HADOOP-10220.004.patch > Namenode failover due to too long locking in LeaseManager.Monitor > > > Key: HDFS-10220 > URL: https://issues.apache.org/jira/browse/HDFS-10220 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Nicolas Fraison >Assignee: Nicolas Fraison >Priority: Minor > Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, > HADOOP-10220.003.patch, HADOOP-10220.004.patch, threaddump_zkfc.txt > > > I faced a namenode failover due to an unresponsive namenode detected by the > zkfc, with lots of WARN messages (5 million) like this one: > _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All > existing blocks are COMPLETE, lease removed, file closed._ > In the threaddump taken by the zkfc there are lots of threads blocked due to a > lock. > Looking at the code, there is a lock taken by the LeaseManager.Monitor when > some lease must be released. Due to the very large number of leases to be > released, the namenode took too long to release them, blocking all > other tasks and making the zkfc think that the namenode was not > available/stuck. > The idea of this patch is to limit the number of leases released each time we > check for leases, so the lock won't be held for too long a period. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
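The batching idea described in the issue can be sketched in plain Java: each monitor pass takes the lock, releases at most a bounded number of expired leases, then drops the lock so other namesystem operations can proceed. All names below are illustrative; this is not the actual LeaseManager code or the attached patch.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of bounding lock hold time: release at most
// maxLeaseReleasesPerCheck leases per lock acquisition, instead of
// draining every expired lease while holding the lock.
public class BatchedLeaseReleaseSketch {
    private final Object namesystemLock = new Object();
    private final Queue<String> expiredLeases = new ArrayDeque<>();
    private final int maxLeaseReleasesPerCheck;

    public BatchedLeaseReleaseSketch(int maxLeaseReleasesPerCheck) {
        this.maxLeaseReleasesPerCheck = maxLeaseReleasesPerCheck;
    }

    public void addExpired(String leaseHolder) {
        synchronized (namesystemLock) { expiredLeases.add(leaseHolder); }
    }

    // One monitor pass; returns how many leases it released. Remaining
    // leases wait for the next pass, so the lock is never held for long.
    public int checkLeases() {
        synchronized (namesystemLock) {
            int released = 0;
            while (!expiredLeases.isEmpty() && released < maxLeaseReleasesPerCheck) {
                expiredLeases.poll();  // stand-in for internalReleaseLease()
                released++;
            }
            return released;
        }
    }
}
```

The trade-off is that a large backlog (the 5 million WARN case above) drains over several passes rather than one, in exchange for the zkfc's health checks never being starved of the lock.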
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242516#comment-15242516 ] Rakesh R commented on HDFS-7859: bq. I thought many considerations originally targeted for the issue have already been implemented elsewhere, therefore the only thing left is custom codec and schema support. I don't think there is a strong requirement for this feature but we can implement it perhaps in phase II I guess. Thanks for making it clear, [~drankye]. > Erasure Coding: Persist erasure coding policies in NameNode > --- > > Key: HDFS-7859 > URL: https://issues.apache.org/jira/browse/HDFS-7859 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Xinwei Qin > Labels: BB2015-05-TBR > Attachments: HDFS-7859-HDFS-7285.002.patch, > HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, > HDFS-7859.001.patch, HDFS-7859.002.patch > > > In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we > persist EC schemas in NameNode centrally and reliably, so that EC zones can > reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242515#comment-15242515 ] Rakesh R commented on HDFS-7859: bq. For the builtin schema and policies, IIRC, there was a consideration that we still need to persist the schema and policy to indicate the software upgrades (so the builtin ones may be changed). Yes, changing a built-in schema is an interesting case. If we ever need to change the default one, then persisting would be required. I think we can proceed to persist the EC policy details in the fsimage and editlog. I'm just adding a thought to understand more: perhaps we could explore whether the layout version can be utilized to handle this kind of situation. The patch needs to be rebased on the latest code. Would you mind rebasing it, [~xinwei]? > Erasure Coding: Persist erasure coding policies in NameNode > --- > > Key: HDFS-7859 > URL: https://issues.apache.org/jira/browse/HDFS-7859 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Xinwei Qin > Labels: BB2015-05-TBR > Attachments: HDFS-7859-HDFS-7285.002.patch, > HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, > HDFS-7859.001.patch, HDFS-7859.002.patch > > > In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we > persist EC schemas in NameNode centrally and reliably, so that EC zones can > reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-10298) Document the usage of distcp -diff option
Akira AJISAKA created HDFS-10298: Summary: Document the usage of distcp -diff option Key: HDFS-10298 URL: https://issues.apache.org/jira/browse/HDFS-10298 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, documentation Affects Versions: 2.8.0 Reporter: Akira AJISAKA The distcp -diff option is currently documented as "Use snapshot diff report to identify the difference between source and target.", but its usage is not documented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)