[jira] [Updated] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDFS-12415:
-------------------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Fix Version/s: HDFS-7240
    Status: Resolved (was: Patch Available)

Thanks [~vagarychen] and [~cheersyang] for the reviews. I have committed this to the HDFS-7240 branch.

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> ----------------------------------------------------------------------------
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: HDFS-7240
> Reporter: Weiwei Yang
> Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12415-HDFS-7240.001.patch, HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some Jenkins jobs,
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12556) [SPS] : Block movement analysis should be done in read lock.
[ https://issues.apache.org/jira/browse/HDFS-12556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Surendra Singh Lilhore updated HDFS-12556:
------------------------------------------
    Attachment: HDFS-12556-HDFS-10285-03.patch

Rebased patch, please review.

> [SPS] : Block movement analysis should be done in read lock.
> ------------------------------------------------------------
>
> Key: HDFS-12556
> URL: https://issues.apache.org/jira/browse/HDFS-12556
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Reporter: Surendra Singh Lilhore
> Assignee: Surendra Singh Lilhore
> Attachments: HDFS-12556-HDFS-10285-01.patch, HDFS-12556-HDFS-10285-02.patch, HDFS-12556-HDFS-10285-03.patch
>
>
> {noformat}
> 2017-09-27 15:58:32,852 [StoragePolicySatisfier] ERROR namenode.StoragePolicySatisfier (StoragePolicySatisfier.java:handleException(308)) - StoragePolicySatisfier thread received runtime exception. Stopping Storage policy satisfier work
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getStorages(BlockManager.java:4130)
>   at org.apache.hadoop.hdfs.server.namenode.StoragePolicySatisfier.analyseBlocksStorageMovementsAndAssignToDN(StoragePolicySatisfier.java:362)
>   at org.apache.hadoop.hdfs.server.namenode.StoragePolicySatisfier.run(StoragePolicySatisfier.java:236)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
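The issue title asks for the analysis to run under a read lock, which is the usual remedy when a reader sees a collection mid-update. The following is only a minimal sketch of that pattern using `java.util.concurrent.locks.ReentrantReadWriteLock`; the class `ReadLockedAnalysis` and its methods are hypothetical and are not the namesystem lock API or the contents of the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration: not HDFS code.
class ReadLockedAnalysis {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> storages = new ArrayList<>();

    /** Writers mutate the storage list only while holding the write lock. */
    void addStorage(String storageId) {
        lock.writeLock().lock();
        try {
            storages.add(storageId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Analysis copies a consistent snapshot while holding the read lock. */
    List<String> snapshotStorages() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(storages);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Many readers can hold the read lock at once, so the analysis thread does not serialize against other readers; it only excludes concurrent writers for the duration of the snapshot.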
[jira] [Commented] (HDFS-12613) Native EC coder should implement release() as idempotent function.
[ https://issues.apache.org/jira/browse/HDFS-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203099#comment-16203099 ]

Kai Zheng commented on HDFS-12613:
----------------------------------

Thanks [~eddyxu] for the update! +1 pending Jenkins and the minor checkstyle fix.

> Native EC coder should implement release() as idempotent function.
> ------------------------------------------------------------------
>
> Key: HDFS-12613
> URL: https://issues.apache.org/jira/browse/HDFS-12613
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: erasure-coding
> Affects Versions: 3.0.0-beta1
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Attachments: HDFS-12613.00.patch, HDFS-12613.01.patch, HDFS-12613.02.patch, HDFS-12613.03.patch
>
>
> Recently, we found that the native EC coder crashes the JVM because {{NativeRSDecoder#release()}} is called multiple times (HDFS-12612 and HDFS-12606).
> We should strengthen the native code implementation to make {{release()}} idempotent as well.
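As a rough illustration of what "idempotent `release()`" means, the sketch below guards the cleanup with an `AtomicBoolean` so that only the first call frees anything. The class `IdempotentCoder` and its counter are hypothetical stand-ins, not the `NativeRSDecoder` code or the attached patch; the issue itself is about the native side, where the analogous guard would be checking and clearing the native handle.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration: not the Hadoop native coder.
class IdempotentCoder {
    private final AtomicBoolean released = new AtomicBoolean(false);
    private int nativeFreeCalls = 0; // stands in for the real native free

    /** Safe to call any number of times; only the first call frees resources. */
    void release() {
        if (released.compareAndSet(false, true)) {
            freeNativeResources();
        }
    }

    private void freeNativeResources() {
        nativeFreeCalls++;
    }

    int nativeFreeCalls() {
        return nativeFreeCalls;
    }
}
```

Using `compareAndSet` rather than a plain boolean also keeps the guard correct if two threads race to release the same coder.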
[jira] [Commented] (HDFS-12504) Ozone: Improve SQLCLI performance
[ https://issues.apache.org/jira/browse/HDFS-12504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203089#comment-16203089 ]

Weiwei Yang commented on HDFS-12504:
------------------------------------

bq. if we can have some simple benchmark results to see the performance improvement,

I agree with this idea. I suggest adding some logging to record the time spent on critical paths, e.g., inserting a single record into the target DB and inserting a batch of records into the target DB, so that we can estimate the performance improvement this patch gives. [~yuanbo], does that make sense to you?

> Ozone: Improve SQLCLI performance
> ---------------------------------
>
> Key: HDFS-12504
> URL: https://issues.apache.org/jira/browse/HDFS-12504
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Assignee: Yuanbo Liu
> Labels: performance
> Attachments: HDFS-12504-HDFS-7240.001.patch
>
>
> In my test, my {{ksm.db}} has *3017660* entries with a total size of *128 MB*. The SQLCLI tool runs for over *2 hours* and still does not finish exporting the DB. This is because it iterates over each entry and inserts it into another SQLite DB file one at a time, which is not efficient. We need to make this run more efficiently on large DB files.
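The two ideas in this thread, batching inserts and timing the critical path, can be sketched together. The class below is a hypothetical, database-free illustration (`BatchingWriter`, its sink callback, and the batch size are all assumptions, not the SQLCLI code or the attached patch): records are buffered and handed to a sink in groups, and the time spent inside the sink is accumulated with `System.nanoTime()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration: the sink stands in for a batched DB insert.
class BatchingWriter<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();
    private long flushCount = 0;
    private long flushNanos = 0;

    BatchingWriter(int batchSize, Consumer<List<T>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers a record and flushes once the batch is full. */
    void add(T record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends buffered records to the sink and records the elapsed time. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        long start = System.nanoTime();
        sink.accept(new ArrayList<>(buffer));
        flushNanos += System.nanoTime() - start;
        flushCount++;
        buffer.clear();
    }

    long flushCount() {
        return flushCount;
    }

    long flushNanos() {
        return flushNanos;
    }
}
```

With JDBC, the sink would typically call `PreparedStatement.addBatch()` for each record and then `executeBatch()` once per group inside a single transaction. Logging `flushNanos / flushCount` before and after a change gives the kind of simple benchmark number the comment asks for.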
[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML
[ https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203065#comment-16203065 ]

Hadoop QA commented on HDFS-11467:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 47 unchanged - 2 fixed = 47 total (was 49) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 50s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 49s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-11467 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891964/HDFS-11467.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 96aed1c2fd5d 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21678/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21678/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job
[jira] [Commented] (HDFS-12637) Extend TestDistributedFileSystemWithECFile with a random EC policy
[ https://issues.apache.org/jira/browse/HDFS-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203063#comment-16203063 ]

Xiao Chen commented on HDFS-12637:
----------------------------------

Thanks [~tasanuma0829] for filing the jira and providing a patch. I'm not familiar with EC details, but what is the difference between testing with a random non-default EC policy and parameterizing the base test ({{TestDistributedFileSystemWithECFile}}) over all policies? IIUC parameterizing will deterministically cover all policies, right?

> Extend TestDistributedFileSystemWithECFile with a random EC policy
> ------------------------------------------------------------------
>
> Key: HDFS-12637
> URL: https://issues.apache.org/jira/browse/HDFS-12637
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding, test
> Reporter: Takanobu Asanuma
> Assignee: Takanobu Asanuma
> Attachments: HDFS-12637.1.patch
>
>
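To make the trade-off in the question concrete, here is a small, hypothetical sketch of the two selection strategies. The class `PolicySelection` is not Hadoop test code, and the policy names are used only as illustrative strings: `nonDefault()` returns every candidate, which is what a parameterized test would iterate over, while `randomNonDefault()` picks one candidate per run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration: policy names are plain strings here.
class PolicySelection {
    static final String DEFAULT_POLICY = "RS-6-3-1024k";
    static final List<String> ALL_POLICIES = Arrays.asList(
        "RS-6-3-1024k", "RS-3-2-1024k", "RS-10-4-1024k",
        "RS-LEGACY-6-3-1024k", "XOR-2-1-1024k");

    /** Parameterized style: every non-default policy is covered on every run. */
    static List<String> nonDefault() {
        List<String> result = new ArrayList<>();
        for (String policy : ALL_POLICIES) {
            if (!policy.equals(DEFAULT_POLICY)) {
                result.add(policy);
            }
        }
        return result;
    }

    /** Random style: one non-default policy per run; coverage builds up over many runs. */
    static String randomNonDefault(Random rng) {
        List<String> candidates = nonDefault();
        return candidates.get(rng.nextInt(candidates.size()));
    }
}
```

The parameterized form is deterministic but multiplies test runtime by the number of policies; the random form keeps each run short at the cost of a failure that may not reproduce unless the chosen policy (or seed) is logged.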
[jira] [Commented] (HDFS-12642) Log block and datanode details in BlockRecoveryWorker
[ https://issues.apache.org/jira/browse/HDFS-12642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203062#comment-16203062 ]

Hadoop QA commented on HDFS-12642:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 43s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 2s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m 49s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-12642 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891634/HDFS-12642.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux fd594502c92b 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21677/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21677/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21677/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT http:
[jira] [Comment Edited] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203055#comment-16203055 ]

Weiwei Yang edited comment on HDFS-12415 at 10/13/17 5:03 AM:
--------------------------------------------------------------

I am also +1 to [~msingh]'s patch, let's get this committed and see if this resolves the issue completely. Thanks [~msingh] and [~vagarychen] for your attention.

[~msingh], feel free to commit the latest patch since you got 2 +1s :).

was (Author: cheersyang):
I am also +1 to [~msingh]'s patch, lets get this committed and see if this resolves the issue completely. Thanks [~msingh] and [~vagarychen] for your attention.

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> ----------------------------------------------------------------------------
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: HDFS-7240
> Reporter: Weiwei Yang
> Assignee: Mukul Kumar Singh
> Attachments: HDFS-12415-HDFS-7240.001.patch, HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some Jenkins jobs,
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]
[jira] [Assigned] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang reassigned HDFS-12415:
----------------------------------
    Assignee: Mukul Kumar Singh (was: Weiwei Yang)

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> ----------------------------------------------------------------------------
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: HDFS-7240
> Reporter: Weiwei Yang
> Assignee: Mukul Kumar Singh
> Attachments: HDFS-12415-HDFS-7240.001.patch, HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some Jenkins jobs,
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]
[jira] [Commented] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203055#comment-16203055 ]

Weiwei Yang commented on HDFS-12415:
------------------------------------

I am also +1 to [~msingh]'s patch, let's get this committed and see if this resolves the issue completely. Thanks [~msingh] and [~vagarychen] for your attention.

> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
> ----------------------------------------------------------------------------
>
> Key: HDFS-12415
> URL: https://issues.apache.org/jira/browse/HDFS-12415
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: HDFS-7240
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Attachments: HDFS-12415-HDFS-7240.001.patch, HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch
>
>
> TestXceiverClientManager seems to be occasionally failing in some Jenkins jobs,
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>   at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
> {noformat}
> see more from [this report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/]
[jira] [Commented] (HDFS-12613) Native EC coder should implement release() as idempotent function.
[ https://issues.apache.org/jira/browse/HDFS-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203032#comment-16203032 ]

Hadoop QA commented on HDFS-12613:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 34m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 7m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 26m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 11m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 10s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 28m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 59s{color} | {color:orange} root: The patch generated 1 new + 82 unchanged - 0 fixed = 83 total (was 82) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 7m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 13m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 18s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 7s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 13s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}295m 33s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestFixKerberosTicketOrder |
| | hadoop.ha.TestZKFailoverController |
| | hadoop.security.token.delegation.TestZKDelegationTokenSecretManager |
| | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
| | hadoop.tracing.TestTracing |
| | hadoop.hdfs.server.namenode.ha.TestDNFencing |
| |
[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203022#comment-16203022 ]

Jiandan Yang commented on HDFS-12638:
-------------------------------------

[~daryn] The NN audit log is missing some entries, and we did not find a truncate operation on this file, but judging from the DN logs I think the file was truncated. In the DN log, the block was first finalized and then recovered.

> NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets
> -----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 2.8.2
> Reporter: Jiandan Yang
>
> The active NameNode exits due to an NPE. I can confirm that the BlockCollection passed in when creating ReplicationWork is null, but I do not know why it is null. Looking at the history, I found that [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check for whether BlockCollection is null.
> NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
>   at org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
>   at java.lang.Thread.run(Thread.java:834)
> {code}
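The report points at a removed null check as the likely trigger. The sketch below shows, with entirely hypothetical types (`ReplicationScheduler` and `Work` are not the BlockManager classes), what such a guard looks like: work items whose owning collection is missing are skipped rather than dereferenced. Whether skipping is the right behavior for HDFS is an assumption here; the thread does not settle on a fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: not the BlockManager/ReplicationWork code.
class ReplicationScheduler {
    /** A unit of replication work; owner is null if the file no longer exists. */
    static final class Work {
        final String blockId;
        final String owner;

        Work(String blockId, String owner) {
            this.blockId = blockId;
            this.owner = owner;
        }
    }

    /** Returns only the work items that still have an owner to choose targets for. */
    static List<Work> schedulable(List<Work> candidates) {
        List<Work> result = new ArrayList<>();
        for (Work work : candidates) {
            if (work.owner == null) {
                // Skip instead of dereferencing: the monitor thread keeps running.
                continue;
            }
            result.add(work);
        }
        return result;
    }
}
```

The point of the guard is availability: one orphaned block should cost a skipped work item, not the monitor thread and, with it, the NameNode process.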
[jira] [Commented] (HDFS-12621) Inconsistency/confusion around ViewFileSystem.getDelagation
[ https://issues.apache.org/jira/browse/HDFS-12621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202997#comment-16202997 ]

Mohammad Kamrul Islam commented on HDFS-12621:
----------------------------------------------

Thanks [~sureshms] for re-adding me.
[~xkrogen]: thanks for your comments.

Follow-up comments: _addDelegationTokens_ for ViewFileSystem works fine and collects the appropriate tokens from the child filesystem(s). The confusion is that *getDelegationToken*() works for most file systems but not for ViewFileSystem.

Which option do you think is a good idea? I think option #1 could be less risky, while at least giving the caller a message to call _addDelegationTokens_ instead.

> Inconsistency/confusion around ViewFileSystem.getDelagation
> -----------------------------------------------------------
>
> Key: HDFS-12621
> URL: https://issues.apache.org/jira/browse/HDFS-12621
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.7.3
> Reporter: Mohammad Kamrul Islam
> Assignee: Mohammad Kamrul Islam
>
> *Symptom*:
> When a user invokes ViewFileSystem.getDelegationToken(String renewer), she gets "null". However, for any other file system, it returns a valid delegation token. For a normal user, this is very confusing, and it takes substantial time to debug and find an alternative.
> *Root Cause:*
> ViewFileSystem inherits the basic implementation from FileSystem.getDelegationToken(), which returns "_null_". The comments in the source code say not to use it and to use addDelegationTokens() instead. However, it works fine for DistributedFileSystem.
> In short, the same client call works for hdfs:// but not for viewfs://, and there is no way for an end user to identify the root cause. This also creates a lot of confusion for any service that is supposed to work with both viewfs and hdfs.
> *Possible Solution*:
> _Option 1:_ Add a LOG.warn() with the reason and the alternative before returning "null" in the base class.
> _Option 2:_ As done for other file systems, ViewFileSystem can override the method with an implementation that returns the token related to fs.defaultFS. In this case, the defaultFS is something like "viewfs://..". We need to find the actual namenode and use that to retrieve the delegation token.
> _Option 3:_ Open for suggestions?
> *Last note:* My hunch is that very few users are using viewfs:// with Kerberos, which is why this was not exposed earlier.
> I'm working on a good solution. Please add your suggestions.
>
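Option 1 above can be pictured with a small stand-alone sketch. Everything here is hypothetical (`BaseFs`, `TokenIssuingFs`, and `MountTableFs` are stand-ins, not the Hadoop `FileSystem` hierarchy, and tokens are plain strings): the base implementation still returns `null`, but it logs why and names the alternative, so a caller on a mount-table file system sees a hint instead of a silent `null`.

```java
import java.util.logging.Logger;

// Hypothetical illustration: not the Hadoop FileSystem classes.
class BaseFs {
    private static final Logger LOG = Logger.getLogger(BaseFs.class.getName());
    int warnings = 0; // exposed only so the behavior can be observed

    /** Default behavior: no single token available, so warn and return null. */
    String getDelegationToken(String renewer) {
        warnings++;
        LOG.warning(getClass().getSimpleName()
            + " does not issue a single delegation token for renewer '" + renewer
            + "'; collect tokens with addDelegationTokens() instead.");
        return null;
    }
}

/** Stand-in for a file system backed by one service that can issue a token. */
class TokenIssuingFs extends BaseFs {
    @Override
    String getDelegationToken(String renewer) {
        return "token-for-" + renewer;
    }
}

/** Stand-in for a mount-table file system that inherits the default. */
class MountTableFs extends BaseFs {
}
```

This keeps the return value unchanged for existing callers, which is why it is the lower-risk choice compared with Option 2, where the override would have to decide which underlying namenode's token to return.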
[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML
[ https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huafeng Wang updated HDFS-11467: Attachment: HDFS-11467.001.patch > Support ErasureCoding section in OIV XML/ReverseXML > --- > > Key: HDFS-11467 > URL: https://issues.apache.org/jira/browse/HDFS-11467 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: tools >Affects Versions: 3.0.0-alpha4 >Reporter: Wei-Chiu Chuang >Assignee: Huafeng Wang > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-11467.001.patch > > > As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, > we would like to also support exporting this section into an XML back and > forth using the OIV tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML
[ https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huafeng Wang updated HDFS-11467: Status: Patch Available (was: Open) > Support ErasureCoding section in OIV XML/ReverseXML > --- > > Key: HDFS-11467 > URL: https://issues.apache.org/jira/browse/HDFS-11467 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: tools >Affects Versions: 3.0.0-alpha4 >Reporter: Wei-Chiu Chuang >Assignee: Huafeng Wang > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-11467.001.patch > > > As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, > we would like to also support exporting this section into an XML back and > forth using the OIV tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML
[ https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202993#comment-16202993 ] Huafeng Wang commented on HDFS-11467: - As discussed with Wei offline, I'll take this one. > Support ErasureCoding section in OIV XML/ReverseXML > --- > > Key: HDFS-11467 > URL: https://issues.apache.org/jira/browse/HDFS-11467 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: tools >Affects Versions: 3.0.0-alpha4 >Reporter: Wei-Chiu Chuang >Assignee: Wei Zhou > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, > we would like to also support exporting this section into an XML back and > forth using the OIV tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML
[ https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huafeng Wang reassigned HDFS-11467: --- Assignee: Huafeng Wang (was: Wei Zhou) > Support ErasureCoding section in OIV XML/ReverseXML > --- > > Key: HDFS-11467 > URL: https://issues.apache.org/jira/browse/HDFS-11467 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: tools >Affects Versions: 3.0.0-alpha4 >Reporter: Wei-Chiu Chuang >Assignee: Huafeng Wang > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, > we would like to also support exporting this section into an XML back and > forth using the OIV tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12642) Log block and datanode details in BlockRecoveryWorker
[ https://issues.apache.org/jira/browse/HDFS-12642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-12642: - Status: Patch Available (was: Open) > Log block and datanode details in BlockRecoveryWorker > - > > Key: HDFS-12642 > URL: https://issues.apache.org/jira/browse/HDFS-12642 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiao Chen >Assignee: Xiao Chen > Attachments: HDFS-12642.01.patch > > > In a recent investigation, we have seen a weird block recovery issue, on which > it is difficult to reach a conclusion because of insufficient logs. > For the most critical part of the events, we see block recovery failed to > {{commitBlockSynchronization}} on the NN, due to the block not being closed. This > leaves the file open forever (for 1+ months). > The reason the block was not closed on the NN was that it is configured with > {{dfs.namenode.replication.min}} = 2, and only 1 replica had the latest > genstamp. > We were not able to tell why only 1 replica is on the latest genstamp. > From the primary node of the recovery (ps2204), {{initReplicaRecoveryImpl}} > was called on each of the 7 DNs the block was ever placed on. All DNs but > ps2204 and ps3765 failed because of the genstamp comparison - that's expected. > ps2204 and ps3765 have gone past the comparison (since there are no exceptions in > their logs), but {{updateReplicaUnderRecovery}} only appeared to be called on > ps3765. > This jira is to propose we log more details when {{BlockRecoveryWorker}} is > about to call {{updateReplicaUnderRecovery}} on the DataNodes, so this could > be figured out in the future. 
> {noformat}
> $ grep "updateReplica:" ps2204.dn.log
> $ grep "updateReplica:" ps3765.dn.log
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.933Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"updateReplicaUnderRecovery","level":"INFO","line_number":"2512","thread_name":"IPC Server handler 6 on 50020","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"updateReplica: BP-550436645-17.142.147.13-1438988035284:blk_2172795728_1106150312, recoveryId=1107074793, length=65024, replica=ReplicaUnderRecovery, blk_2172795728_1106150312, RUR
> $ grep "initReplicaRecovery:" ps2204.dn.log
> hadoop-hdfs-datanode-ps2204.log.1:{"@timestamp":"2017-09-13T00:56:20.691Z","source_host":"ps2204.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2441","thread_name":"org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@5ae3cb26","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery: blk_2172795728_1106150312, recoveryId=1107074793, replica=ReplicaWaitingToBeRecovered, blk_2172795728_1106150312, RWR
> hadoop-hdfs-datanode-ps2204.log.1:{"@timestamp":"2017-09-13T00:56:20.691Z","source_host":"ps2204.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2497","thread_name":"org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@5ae3cb26","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery: changing replica state for blk_2172795728_1106150312 from RWR to RUR","class":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","mdc":{}}
> $ grep "initReplicaRecovery:" ps3765.dn.log
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.457Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2441","thread_name":"IPC Server handler 5 on 50020","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery: blk_2172795728_1106150312, recoveryId=1107074793, replica=ReplicaBeingWritten, blk_2172795728_1106150312, RBW
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.457Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2441","thread_name":"IPC Server handler 5 on 50020","@version":1,"logger_name":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl","message":"initReplicaRecovery: blk_2172795728_1106150312, recoveryId=1107074793, replica=ReplicaBeingWritten, blk_2172795728_1106150312, RBW
> hadoop-hdfs-datanode-ps3765.log.2:{"@timestamp":"2017-09-13T00:56:20.457Z","source_host":"ps3765.example.com","file":"FsDatasetImpl.java","method":"initReplicaRecoveryImpl","level":"INFO","line_number":"2497","thread_name":"IPC Serve
[jira] [Commented] (HDFS-12650) Use slf4j instead of log4j in LeaseManager
[ https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202981#comment-16202981 ] Hadoop QA commented on HDFS-12650: --
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 17s | trunk passed |
| +1 | compile | 1m 16s | trunk passed |
| +1 | checkstyle | 0m 48s | trunk passed |
| +1 | mvnsite | 1m 21s | trunk passed |
| +1 | shadedclient | 11m 45s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 18s | trunk passed |
| +1 | javadoc | 0m 56s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 12s | the patch passed |
| +1 | compile | 1m 13s | the patch passed |
| -1 | javac | 1m 13s | hadoop-hdfs-project_hadoop-hdfs generated 1 new + 385 unchanged - 9 fixed = 386 total (was 394) |
| -0 | checkstyle | 0m 44s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 102 unchanged - 2 fixed = 105 total (was 104) |
| +1 | mvnsite | 1m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 46s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 18s | the patch passed |
| +1 | javadoc | 0m 53s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 97m 43s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 32s | The patch does not generate ASF License warnings. |
| | | 149m 46s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-12650 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891850/HDFS-12650.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux dff4c7f281f4 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/21676/artifact/patchprocess/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/21676/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21676/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://build
[jira] [Resolved] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware
[ https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gruust resolved HDFS-12649. --- Resolution: Invalid > handling of corrupt blocks not suitable for commodity hardware > -- > > Key: HDFS-12649 > URL: https://issues.apache.org/jira/browse/HDFS-12649 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.1 >Reporter: Gruust >Priority: Minor > > Hadoop's documentation tells me it's suitable for commodity hardware in the > sense that hardware failures are expected to happen frequently. However, > there is currently no automatic handling of corrupted blocks, which seems a > bit contradictory to me. > See: > https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files > This is even problematic for data integrity, as the redundancy is not restored to > the desired level without manual intervention, and therefore not in a timely > manner. If there is a corrupted block, I would at least expect that the > namenode forces the creation of an additional good replica to keep up the > redundancy level, i.e. the redundancy level should never include corrupted > data... which it currently does: > "UnderReplicatedBlocks" : 0, > "CorruptBlocks" : 2, > (namenode /jmx http dump) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202878#comment-16202878 ] Hadoop QA commented on HDFS-12626: --
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-7240 Compile Tests ||
| 0 | mvndep | 0m 8s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 42s | HDFS-7240 passed |
| +1 | compile | 1m 47s | HDFS-7240 passed |
| +1 | checkstyle | 0m 48s | HDFS-7240 passed |
| +1 | mvnsite | 1m 48s | HDFS-7240 passed |
| +1 | shadedclient | 13m 24s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 52s | HDFS-7240 passed |
| +1 | javadoc | 2m 37s | HDFS-7240 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 8s | the patch passed |
| +1 | compile | 2m 11s | the patch passed |
| +1 | javac | 2m 11s | the patch passed |
| +1 | checkstyle | 0m 50s | the patch passed |
| +1 | mvnsite | 2m 5s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 21s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 5m 27s | the patch passed |
| +1 | javadoc | 2m 21s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 1s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 104m 6s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 174m 44s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.ozone.scm.TestXceiverClientManager |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12626 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891822/HDFS-12626-HDFS-7240.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 74707213b235 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ma
[jira] [Updated] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements
[ https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-12570: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-10285 Status: Resolved (was: Patch Available) I have just pushed it to the branch. > [SPS]: Refactor Co-ordinator datanode logic to track the block storage > movements > > > Key: HDFS-12570 > URL: https://issues.apache.org/jira/browse/HDFS-12570 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: HDFS-10285 > > Attachments: HDFS-12570-HDFS-10285-00.patch, > HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, > HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch > > > This task is to refactor the C-DN block storage movements. Basically, the > idea is to move the scheduling and tracking logic to the Namenode rather than keeping it at > the special C-DN. Please refer to the discussion with [~andrew.wang] to > understand the [background and the necessity of > refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements
[ https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202847#comment-16202847 ] Uma Maheswara Rao G commented on HDFS-12570: +1 on the latest patch. Thanks Rakesh for updating the patch. Thanks [~surendrasingh] for the reviews. > [SPS]: Refactor Co-ordinator datanode logic to track the block storage > movements > > > Key: HDFS-12570 > URL: https://issues.apache.org/jira/browse/HDFS-12570 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-12570-HDFS-10285-00.patch, > HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, > HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch > > > This task is to refactor the C-DN block storage movements. Basically, the > idea is to move the scheduling and tracking logic to the Namenode rather than keeping it at > the special C-DN. Please refer to the discussion with [~andrew.wang] to > understand the [background and the necessity of > refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12653) Implement toArray() and toSubArray() for ReadOnlyList
Manoj Govindassamy created HDFS-12653: - Summary: Implement toArray() and toSubArray() for ReadOnlyList Key: HDFS-12653 URL: https://issues.apache.org/jira/browse/HDFS-12653 Project: Hadoop HDFS Issue Type: Improvement Reporter: Manoj Govindassamy Assignee: Manoj Govindassamy {{ReadOnlyList}} today gives an unmodifiable view of the backing List. This list supports the following Util methods for easy construction of read-only views of any given list. {noformat} public static ReadOnlyList asReadOnlyList(final List list) public static List asList(final ReadOnlyList list) {noformat} {{asList}} above additionally overrides {{Object[] toArray()}} of the {{java.util.List}} interface. Unlike the {{java.util.List}}, the above one returns an array of Objects referring to the backing list and avoids any copying of objects. Given that we have many usages of read-only lists, 1. Let's have a light-weight / shared-view {{toArray()}} implementation for {{ReadOnlyList}} as well. 2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12653) Implement toArray() and subArray() for ReadOnlyList
[ https://issues.apache.org/jira/browse/HDFS-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-12653: -- Summary: Implement toArray() and subArray() for ReadOnlyList (was: Implement toArray() and toSubArray() for ReadOnlyList) > Implement toArray() and subArray() for ReadOnlyList > --- > > Key: HDFS-12653 > URL: https://issues.apache.org/jira/browse/HDFS-12653 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy > > {{ReadOnlyList}} today gives an unmodifiable view of the backing List. This > list supports the following Util methods for easy construction of read-only views > of any given list. > {noformat} > public static ReadOnlyList asReadOnlyList(final List list) > public static List asList(final ReadOnlyList list) > {noformat} > {{asList}} above additionally overrides {{Object[] toArray()}} of the > {{java.util.List}} interface. Unlike the {{java.util.List}}, the above one > returns an array of Objects referring to the backing list and avoids any > copying of objects. Given that we have many usages of read-only lists, > 1. Let's have a light-weight / shared-view {{toArray()}} implementation for > {{ReadOnlyList}} as well. > 2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, > let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
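The two methods proposed in HDFS-12653 might look roughly like the sketch below. This is not the HDFS interface: the ReadOnlyList shown here is a simplified, hypothetical version, and "shared view" is read as "a new array whose slots reference the same element objects as the backing list", which is an assumption about the intent of the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReadOnlyListSketch {

    /** Hypothetical, simplified read-only view (not the HDFS interface itself). */
    interface ReadOnlyList<E> {
        int size();

        E get(int index);

        /** Array over all elements; the elements themselves are not copied. */
        Object[] toArray();

        /** Array over [fromIndex, toIndex), mirroring List#subList bounds. */
        Object[] subArray(int fromIndex, int toIndex);
    }

    /** Wraps a list without exposing any mutating method. */
    static <E> ReadOnlyList<E> asReadOnlyList(final List<E> list) {
        return new ReadOnlyList<E>() {
            @Override
            public int size() {
                return list.size();
            }

            @Override
            public E get(int index) {
                return list.get(index);
            }

            @Override
            public Object[] toArray() {
                return list.toArray();
            }

            @Override
            public Object[] subArray(int fromIndex, int toIndex) {
                // subList is itself a view, so only the requested range is walked.
                return list.subList(fromIndex, toIndex).toArray();
            }
        };
    }

    public static void main(String[] args) {
        List<StringBuilder> backing = new ArrayList<>(Arrays.asList(
            new StringBuilder("a"), new StringBuilder("b"), new StringBuilder("c")));
        ReadOnlyList<StringBuilder> view = asReadOnlyList(backing);

        Object[] all = view.toArray();
        Object[] tail = view.subArray(1, 3);
        System.out.println(all[0] == backing.get(0)); // true: same object, not a copy
        System.out.println(tail.length);              // 2
    }
}
```

Delegating to subList keeps the index checks consistent with java.util.List, so an out-of-range subArray call fails the same way a subList call would.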
[jira] [Assigned] (HDFS-10685) libhdfs++: return explicit error when non-secured client connects to secured server
[ https://issues.apache.org/jira/browse/HDFS-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer reassigned HDFS-10685: --- Assignee: Kai Jiang > libhdfs++: return explicit error when non-secured client connects to secured > server > --- > > Key: HDFS-10685 > URL: https://issues.apache.org/jira/browse/HDFS-10685 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Kai Jiang > Attachments: HDFS-10685.HDFS-8707.000.patch, > HDFS-10685.HDFS-8707.001.patch > > > When a non-secured client tries to connect to a secured server, the first > indication is an error from RpcConnection::HandleRpcResponse complaining about > "RPC response with Unknown call id -33". > We should insert code in HandleRpcResponse to detect if the unknown call id > == RpcEngine::kCallIdSasl and return an informative error that you have an > unsecured client connecting to a secured server. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
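The check proposed in HDFS-10685 amounts to special-casing one reserved call id before falling back to the generic error. The sketch below illustrates that idea only: libhdfs++ is C++, whereas this is Java for consistency with the rest of this archive, and the class and method names are hypothetical. The value -33 is taken from the error message quoted in the issue.

```java
class RpcResponseCheckSketch {

    /** Call id the issue associates with SASL negotiation (kCallIdSasl in libhdfs++). */
    static final int CALL_ID_SASL = -33;

    /**
     * Builds the error text for a response whose call id matches no pending request.
     * Hypothetical helper; the real handler would return a status object instead.
     */
    static String describeUnknownCallId(int callId) {
        if (callId == CALL_ID_SASL) {
            return "Server requested SASL negotiation: "
                + "an unsecured client is connecting to a secured server";
        }
        return "RPC response with Unknown call id " + callId;
    }

    public static void main(String[] args) {
        System.out.println(describeUnknownCallId(-33));
        System.out.println(describeUnknownCallId(7));
    }
}
```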
[jira] [Commented] (HDFS-12578) TestDeadDatanode#testNonDFSUsedONDeadNodeReReg failing in branch-2.7
[ https://issues.apache.org/jira/browse/HDFS-12578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202827#comment-16202827 ] Xiao Chen commented on HDFS-12578: --
Thank you for the investigation [~ajayydv], and sorry for my delayed response.
I think the reason there is a 1+ second delay in branch-2.7 is in {{BlockManagerTestUtil.checkHeartbeat}}:
{code:title=HDFS-11224's change, which is in 2.8+}
 public static void checkHeartbeat(BlockManager bm) {
-  bm.getDatanodeManager().getHeartbeatManager().heartbeatCheck();
+  HeartbeatManager hbm = bm.getDatanodeManager().getHeartbeatManager();
+  hbm.restartHeartbeatStopWatch();
+  hbm.heartbeatCheck();
 }
{code}
The jira HDFS-11224 was mainly to fix a bug in feature HDFS-9239, which is only in branch-2.8+.
So here is what I propose:
- for branch-2.7, in addition to what you have found, we should also restart the stopwatch.
- for branch-2.8+, let's do what you did in the patch, to give a wider range than 1 millisecond. (Please upload a patch based on trunk, so pre-commit can be triggered.)
Does this make sense?
> TestDeadDatanode#testNonDFSUsedONDeadNodeReReg failing in branch-2.7 > > > Key: HDFS-12578 > URL: https://issues.apache.org/jira/browse/HDFS-12578 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Xiao Chen >Assignee: Ajay Kumar >Priority: Blocker > Attachments: HDFS-12578-branch-2.7.001.patch > > > It appears {{TestDeadDatanode#testNonDFSUsedONDeadNodeReReg}} is consistently > failing in branch-2.7. We should investigate and fix it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12628) libhdfs crashes on thread exit for JNI+libhdfs applications
[ https://issues.apache.org/jira/browse/HDFS-12628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202824#comment-16202824 ] Allen Wittenauer commented on HDFS-12628: - Honestly, it's probably time HDFS-8707 got merged into a release and then we can kill off libhdfs. cc: [~James C], [~bobhansen], [~anatoli.shein], ... > libhdfs crashes on thread exit for JNI+libhdfs applications > --- > > Key: HDFS-12628 > URL: https://issues.apache.org/jira/browse/HDFS-12628 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.0.0-alpha3 >Reporter: Joe McDonnell >Priority: Critical > Attachments: jni-util-test2.cc > > > Impala uses libhdfs to access HDFS while also using JNI to run other Java > code. Impala currently relies on HDFS's getJNIEnv to get a JNIEnv to interact > with the process JVM (which is created by HDFS code). It uses this JNIEnv > even for code that is not related to HDFS. > In recent versions of HDFS, getJNIEnv is no longer visible in libhdfs due to > HDFS-7879. In HDFS-8474, the proposed solution was for Impala to write its > own equivalent (tracked by IMPALA-2029). After implementing an equivalent of > getJNIEnv (heavily based on HDFS code, but with distinct names), we are > seeing crashes in hdfsThreadDestructor() in threads that use both HDFS and > JNI codepaths. The crash shows up under concurrency and does not reproduce in > serial execution. > I have distilled it down to a simple testcase that reproduces the issue. It > creates a JVM in the main thread (which Impala does at startup), then spawns > multiple threads that do basic HDFS and JNI work. I have removed all but the > essential steps. > This blocks running Impala on any hadoop version past 2.7 (when HDFS-7879 was > merged). Note that exposing getJNIEnv should unblock Impala development if a > fix is not forthcoming. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager
[ https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-12650: -- Status: Patch Available (was: Open) > Use slf4j instead of log4j in LeaseManager > -- > > Key: HDFS-12650 > URL: https://issues.apache.org/jira/browse/HDFS-12650 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ajay Kumar >Assignee: Ajay Kumar > Fix For: 3.1.0 > > Attachments: HDFS-12650.01.patch > > > LeaseManager is still using log4j dependencies. We should move those to > slf4j. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager
[ https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-12650: -- Attachment: HDFS-12650.01.patch > Use slf4j instead of log4j in LeaseManager > -- > > Key: HDFS-12650 > URL: https://issues.apache.org/jira/browse/HDFS-12650 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ajay Kumar >Assignee: Ajay Kumar > Fix For: 3.1.0 > > Attachments: HDFS-12650.01.patch > > > LeaseManager is still using log4j dependencies. We should move those to > slf4j. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager
[ https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-12650: -- Attachment: (was: HDFS-12650.01.patch) > Use slf4j instead of log4j in LeaseManager > -- > > Key: HDFS-12650 > URL: https://issues.apache.org/jira/browse/HDFS-12650 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ajay Kumar >Assignee: Ajay Kumar > Fix For: 3.1.0 > > Attachments: HDFS-12650.01.patch > > > LeaseManager is still using log4j dependencies. We should move those to > slf4j. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager
[ https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-12650: -- Attachment: HDFS-12650.01.patch > Use slf4j instead of log4j in LeaseManager > -- > > Key: HDFS-12650 > URL: https://issues.apache.org/jira/browse/HDFS-12650 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ajay Kumar >Assignee: Ajay Kumar > Fix For: 3.1.0 > > Attachments: HDFS-12650.01.patch > > > LeaseManager is still using log4j dependencies. We should move those to > slf4j. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured
[ https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202794#comment-16202794 ]

Manoj Govindassamy commented on HDFS-12614:
---
Filed HDFS-12652 to track the {{INodeAttributeProvider#getAttributes()}} performance improvement task detailed by [~daryn] in the previous comments. I am assuming that the request is not for changing the INodeAttributesProvider#getAttributes() interface.

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0-beta1
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, HDFS-12614.03.patch, HDFS-12614.test.01.patch
>
> When INodeAttributesProvider is configured, resolving a path (like "/") and checking for permission causes the following code, when working on {{pathByNameArr}}, to throw a NullPointerException.
> {noformat}
> private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>     INode inode, int snapshotId) {
>   INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
>   if (getAttributesProvider() != null) {
>     String[] elements = new String[pathIdx + 1];
>     for (int i = 0; i < elements.length; i++) {
>       elements[i] = DFSUtil.bytes2String(pathByNameArr[i]); <===
>     }
>     inodeAttrs = getAttributesProvider().getAttributes(elements, inodeAttrs);
>   }
>   return inodeAttrs;
> }
> {noformat}
> It looks like for paths like "/", where the components split on the delimiter "/" can be null, the pathByNameArr array can have null elements, and converting them throws the NPE.
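For illustration only, here is a minimal sketch of a null-tolerant version of the conversion loop described in HDFS-12614. It is not the committed patch: the class name {{NullSafePathElements}} is hypothetical, {{DFSUtil.bytes2String}} is replaced by a plain UTF-8 decode so the snippet is self-contained, and mapping a null component to the empty string is an assumption.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of Hadoop. */
final class NullSafePathElements {

  /**
   * Converts the first (pathIdx + 1) path components to Strings.
   * A null component is mapped to "" (an assumption) instead of being
   * passed to the decoder, which is where the reported NPE occurs.
   */
  static String[] toElements(byte[][] pathByNameArr, int pathIdx) {
    String[] elements = new String[pathIdx + 1];
    for (int i = 0; i < elements.length; i++) {
      byte[] component = pathByNameArr[i];
      elements[i] = (component == null)
          ? ""
          : new String(component, StandardCharsets.UTF_8);
    }
    return elements;
  }
}
```

Whether a null component should become an empty string or be skipped is a decision for the attribute provider contract; the sketch only shows the guard.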
[jira] [Created] (HDFS-12652) INodeAttributesProvider#getAttributes(): Avoid multiple conversions of path components byte[][] to String[] when requesting INode attributes
Manoj Govindassamy created HDFS-12652:
-
Summary: INodeAttributesProvider#getAttributes(): Avoid multiple conversions of path components byte[][] to String[] when requesting INode attributes
Key: HDFS-12652
URL: https://issues.apache.org/jira/browse/HDFS-12652
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Affects Versions: 3.0.0-beta1
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy

{{INodeAttributesProvider#getAttributes}} needs the path components to be passed in as an array of Strings, whereas the INode and related layers maintain path components as an array of byte[]. These layers therefore have to convert each byte[] component of the path back into a String, and they do so multiple times when requesting INode attributes from the provider.

That is, the path "/a/b/c" requires calling the attribute provider with: (1) "", (2) "", "a", (3) "", "a", "b", (4) "", "a", "b", "c". Every single one of those strings is freshly (re)converted from a byte[]. If a file listing is done on a huge directory containing hundreds of millions of files, these redundant conversions of byte[][] to String[] create a lot of tiny garbage objects, occupying memory and affecting performance. It would be better to avoid creating redundant copies of the path component strings.
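One possible way to avoid the repeated conversions, sketched under assumptions: convert the byte[][] once, then build each prefix array by copying references to the already-converted Strings. The class {{PathComponentCache}} and its methods are hypothetical names, not Hadoop APIs, and the ticket does not say this is the approach that will be taken.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper; not part of Hadoop. */
final class PathComponentCache {

  /** Converts every byte[] component to a String exactly once. */
  static String[] toStrings(byte[][] components) {
    String[] converted = new String[components.length];
    for (int i = 0; i < components.length; i++) {
      converted[i] = new String(components[i], StandardCharsets.UTF_8);
    }
    return converted;
  }

  /**
   * Returns the first (idx + 1) components. Only the array is copied;
   * the String objects themselves are shared with the full array.
   */
  static String[] prefix(String[] converted, int idx) {
    return Arrays.copyOf(converted, idx + 1);
  }
}
```

For "/a/b/c" this still allocates four small arrays for the four provider calls, but only four Strings in total instead of ten.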
[jira] [Commented] (HDFS-12411) Ozone: Add container usage information to DN container report
[ https://issues.apache.org/jira/browse/HDFS-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202783#comment-16202783 ] Xiaoyu Yao commented on HDFS-12411: --- Open HDFS-12651 for [~anu]'s comment #4. > Ozone: Add container usage information to DN container report > - > > Key: HDFS-12411 > URL: https://issues.apache.org/jira/browse/HDFS-12411 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone, scm >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Labels: ozoneMerge > Attachments: HDFS-12411-HDFS-7240.001.patch, > HDFS-12411-HDFS-7240.002.patch, HDFS-12411-HDFS-7240.003.patch, > HDFS-12411-HDFS-7240.004.patch, HDFS-12411-HDFS-7240.005.patch, > HDFS-12411-HDFS-7240.006.patch, HDFS-12411-HDFS-7240.007.patch, > HDFS-12411-HDFS-7240.008.patch > > > Current DN ReportState for container only has a counter, we will need to > include individual container usage information so that SCM can > * close container when they are full > * assign container for block service with different policies. > * etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12651) Ozone: SCM: avoid synchronously loading all the keys from containers upon SCM datanode start
Xiaoyu Yao created HDFS-12651:
-
Summary: Ozone: SCM: avoid synchronously loading all the keys from containers upon SCM datanode start
Key: HDFS-12651
URL: https://issues.apache.org/jira/browse/HDFS-12651
Project: Hadoop HDFS
Issue Type: Sub-task
Affects Versions: HDFS-7240
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao

This is based on code review feedback from HDFS-12411, to avoid a slow SCM datanode restart when there is a large number of keys and containers. E.g., 5 GB per container / 4 KB per key = 1.25 million keys per container. The proposed solution is to load the container/key size info asynchronously and update the containerStatus once done.
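A rough sketch of the asynchronous idea in HDFS-12651, with invented names: {{ContainerStatusSketch}} stands in for the real containerStatus, and the key sizes are passed in as a plain list rather than read from a container's metadata store. Startup returns immediately, and the usage figure is filled in when the background sum completes.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a per-container status record; not the Ozone class. */
final class ContainerStatusSketch {
  // -1 means "usage not loaded yet".
  private final AtomicLong bytesUsed = new AtomicLong(-1);

  boolean isLoaded() {
    return bytesUsed.get() >= 0;
  }

  long getBytesUsed() {
    return bytesUsed.get();
  }

  /** Sums the key sizes on a background thread and publishes the result when done. */
  CompletableFuture<Void> loadAsync(List<Long> keySizes) {
    return CompletableFuture
        .supplyAsync(() -> keySizes.stream().mapToLong(Long::longValue).sum())
        .thenAccept(bytesUsed::set);
  }
}
```

Callers that need the number can check {{isLoaded()}} first, or join on the returned future; how the real datanode should report a container whose usage is still unknown is left open here.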
[jira] [Commented] (HDFS-12639) BPOfferService lock may stall all service actors
[ https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202766#comment-16202766 ]

Hanisha Koneru commented on HDFS-12639:
---
Hi [~daryn],
This is my understanding of the problem. Please correct me if I am wrong.
BPServiceActor obtains the writeLock before processing each command and releases it after. During the processing of a single command, the other actor would not be able to register or process heartbeats.
If we remove the write lock held during command processing, then command processing would no longer be asynchronous. I am not sure if this opens up the possibility of creating anomalies in the datanode.
bq. The worst case scenario for processing commands while holding the lock is re-registration. The actor will loop, catching and logging exceptions, leaving the other actor blocked for a non-deterministic (possibly infinite) amount of time.
The re-registration process itself does not acquire the write lock to register the Datanode, right? (It needs the write lock to check that the new registration info is consistent with the storage.) Can you please elaborate on how the actor would go into a loop, catching and logging exceptions?

> BPOfferService lock may stall all service actors
> 
>
> Key: HDFS-12639
> URL: https://issues.apache.org/jira/browse/HDFS-12639
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Assignee: Hanisha Koneru
>
> {{BPOfferService}} manages {{BPServiceActor}} instances for the active and standby. It uses a RW lock to primarily protect registration information while determining the active/standby from heartbeats.
> Unfortunately the write lock is held during command processing. If an actor is experiencing high latency processing commands, the other actor will neither be able to register (blocked in createRegistration, setNamespaceInfo, verifyAndSetNamespaceInfo) nor process heartbeats (blocked in updateActorStatesFromHeartbeat).
> The worst case scenario for processing commands while holding the lock is re-registration. The actor will loop, catching and logging exceptions, leaving the other actor blocked for a non-deterministic (possibly infinite) amount of time.
> The lock must not be held during command processing.
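As a sketch of what "the lock must not be held during command processing" could look like, the snippet below checks the active actor under a read lock, releases the lock, and only then runs the command. {{OfferServiceSketch}} and its fields are invented for illustration and do not mirror the real {{BPOfferService}}; whether dropping the lock is safe for the datanode is exactly the open question raised in the comment above.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration; not the real BPOfferService. */
final class OfferServiceSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private String activeActor = "actor-1"; // guarded by lock

  /** Decides under the lock, then runs the (possibly slow) command outside it. */
  boolean processCommand(String actor, Runnable command) {
    boolean fromActive;
    lock.readLock().lock();
    try {
      fromActive = actor.equals(activeActor);
    } finally {
      lock.readLock().unlock();
    }
    if (!fromActive) {
      return false; // commands from a non-active actor are ignored in this sketch
    }
    command.run(); // no lock held here, so other actors are not blocked
    return true;
  }

  /** Registration-style update that needs the write lock. */
  void setActive(String actor) {
    lock.writeLock().lock();
    try {
      activeActor = actor;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Because the lock is released before {{command.run()}}, a slow or looping command can no longer keep another actor out of {{setActive}}; the cost is that the active/standby state may change while the command is still running.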
[jira] [Updated] (HDFS-12613) Native EC coder should implement release() as idempotent function.
[ https://issues.apache.org/jira/browse/HDFS-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-12613:
-
Attachment: HDFS-12613.03.patch

Thanks for the suggestion, [~Sammi]. I added checks to the Java code as well; this patch also propagates {{IOException}} for {{decode()}} / {{encode()}}. Would you mind giving it another review, [~Sammi] and [~drankye]?

> Native EC coder should implement release() as idempotent function.
> --
>
> Key: HDFS-12613
> URL: https://issues.apache.org/jira/browse/HDFS-12613
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: erasure-coding
> Affects Versions: 3.0.0-beta1
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Attachments: HDFS-12613.00.patch, HDFS-12613.01.patch, HDFS-12613.02.patch, HDFS-12613.03.patch
>
> Recently, we found that the native EC coder crashes the JVM because {{NativeRSDecoder#release()}} is called multiple times (HDFS-12612 and HDFS-12606).
> We should strengthen the implementation of the native code to make {{release()}} idempotent as well.
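A minimal Java-side sketch of an idempotent {{release()}} plus an {{IOException}} on use after release, in the spirit of the comment above. The class {{IdempotentCoderSketch}} is hypothetical; the native free is simulated by a counter, and the actual checks in the patch may differ.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration; not the Hadoop native coder. */
final class IdempotentCoderSketch {
  private final AtomicBoolean released = new AtomicBoolean(false);
  private int nativeFrees = 0; // counts how often the simulated native free ran

  /** Safe to call any number of times; the native resource is freed once. */
  void release() {
    if (released.compareAndSet(false, true)) {
      nativeFrees++; // stand-in for the JNI call that frees the coder
    }
  }

  /** Fails with a checked exception instead of touching freed native state. */
  void decode() throws IOException {
    if (released.get()) {
      throw new IOException("coder has already been released");
    }
    // real decoding would happen here
  }

  int nativeFrees() {
    return nativeFrees;
  }
}
```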
[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-12626: -- Attachment: HDFS-12626-HDFS-7240.004.patch Somehow Jenkins ran on v002 patch again, resubmit v003 patch as v004 to trigger another run. > Ozone : delete open key entries that will no longer be closed > - > > Key: HDFS-12626 > URL: https://issues.apache.org/jira/browse/HDFS-12626 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-12626-HDFS-7240.001.patch, > HDFS-12626-HDFS-7240.002.patch, HDFS-12626-HDFS-7240.003.patch, > HDFS-12626-HDFS-7240.004.patch > > > HDFS-12543 introduced the notion of "open key" where when a key is opened, an > open key entry gets persisted, only after client calls a close will this > entry be made visible. One issue is that if the client does not call close > (e.g. failed), then that open key entry will never be deleted from meta data. > This JIRA tracks this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
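The description above does not say how stale open key entries are detected, so the following is only one conceivable shape, assuming an age threshold: entries opened longer ago than {{maxAgeMs}} and never closed are removed. {{OpenKeyCleanupSketch}} and all of its methods are invented for illustration and are unrelated to the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration; not the Ozone key manager. */
final class OpenKeyCleanupSketch {
  // open key name -> time (ms) at which it was opened
  private final Map<String, Long> openKeys = new ConcurrentHashMap<>();

  void open(String key, long nowMs) {
    openKeys.put(key, nowMs);
  }

  /** A normal close removes the open key entry. */
  void close(String key) {
    openKeys.remove(key);
  }

  /** Removes and returns the open keys older than maxAgeMs. */
  List<String> expire(long nowMs, long maxAgeMs) {
    List<String> removed = new ArrayList<>();
    for (Map.Entry<String, Long> e : openKeys.entrySet()) {
      boolean tooOld = nowMs - e.getValue() > maxAgeMs;
      // remove(key, value) only succeeds if the entry was not changed meanwhile
      if (tooOld && openKeys.remove(e.getKey(), e.getValue())) {
        removed.add(e.getKey());
      }
    }
    return removed;
  }

  int size() {
    return openKeys.size();
  }
}
```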
[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202730#comment-16202730 ]

Hadoop QA commented on HDFS-12626:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 3m 35s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-7240 Compile Tests ||
| 0 | mvndep | 0m 54s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 38s | HDFS-7240 passed |
| +1 | compile | 1m 48s | HDFS-7240 passed |
| +1 | checkstyle | 0m 46s | HDFS-7240 passed |
| +1 | mvnsite | 1m 51s | HDFS-7240 passed |
| +1 | shadedclient | 12m 57s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 18s | HDFS-7240 passed |
| +1 | javadoc | 2m 2s | HDFS-7240 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 47s | the patch passed |
| +1 | compile | 1m 53s | the patch passed |
| +1 | javac | 1m 53s | the patch passed |
| +1 | checkstyle | 0m 50s | the patch passed |
| +1 | mvnsite | 1m 49s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 30s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 28s | the patch passed |
| +1 | javadoc | 2m 6s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 38s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 141m 16s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 29s | The patch generated 1 ASF License warnings. |
| | | 211m 40s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
| | hadoop.cblock.TestCBlockCLI |
| | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
| | hadoop.cblock.TestCBlockReadWrite |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.ozone.scm.TestAllocateContainer |
| | hadoop.ozone.container.common.impl.TestContainerPersistence |
| | hadoop.hdfs.server.federation.router.TestRouterRpc |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |
| Timed out junit tests | org.apache.hadoop.cblock.TestLocalBlockCache |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12626 |
[jira] [Updated] (HDFS-12650) Use slf4j instead of log4j in LeaseManager
[ https://issues.apache.org/jira/browse/HDFS-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-12650: -- Description: LeaseManager is still using log4j dependencies. We should move those to slf4j. (was: FileNamesystem is still using log4j dependencies. We should move those to slf4j, as most of the methods using log4j are deprecated.) > Use slf4j instead of log4j in LeaseManager > -- > > Key: HDFS-12650 > URL: https://issues.apache.org/jira/browse/HDFS-12650 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ajay Kumar >Assignee: Ajay Kumar > Fix For: 3.1.0 > > > LeaseManager is still using log4j dependencies. We should move those to > slf4j. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12650) Use slf4j instead of log4j in LeaseManager
Ajay Kumar created HDFS-12650: - Summary: Use slf4j instead of log4j in LeaseManager Key: HDFS-12650 URL: https://issues.apache.org/jira/browse/HDFS-12650 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ajay Kumar Assignee: Ajay Kumar Fix For: 3.1.0 FileNamesystem is still using log4j dependencies. We should move those to slf4j, as most of the methods using log4j are deprecated. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured
[ https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202696#comment-16202696 ]

Hadoop QA commented on HDFS-12614:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 4m 42s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 6s | trunk passed |
| +1 | compile | 0m 52s | trunk passed |
| +1 | checkstyle | 0m 40s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | shadedclient | 10m 32s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 4s | trunk passed |
| +1 | javadoc | 0m 54s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 3s | the patch passed |
| +1 | compile | 0m 59s | the patch passed |
| +1 | javac | 0m 59s | the patch passed |
| +1 | checkstyle | 0m 39s | the patch passed |
| +1 | mvnsite | 1m 4s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 26s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 8s | the patch passed |
| +1 | javadoc | 0m 48s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 115m 46s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 167m 48s | |

|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
| | org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs |
| | org.apache.hadoop.fs.TestSymlinkHdfsFileContext |
| | org.apache.hadoop.fs.TestEnhancedByteBufferAccess |
| | org.apache.hadoop.fs.TestSymlinkHdfsFileSystem |
| | org.apache.hadoop.fs.permission.TestStickyBit |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-12614 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891787/HDFS-12614.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2d953a9b0100 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e46d5bb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21672/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21672/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/h
[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-12626: -- Attachment: HDFS-12626-HDFS-7240.003.patch fix asf license header missing in v003 patch. failed tests are unrelated > Ozone : delete open key entries that will no longer be closed > - > > Key: HDFS-12626 > URL: https://issues.apache.org/jira/browse/HDFS-12626 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-12626-HDFS-7240.001.patch, > HDFS-12626-HDFS-7240.002.patch, HDFS-12626-HDFS-7240.003.patch > > > HDFS-12543 introduced the notion of "open key" where when a key is opened, an > open key entry gets persisted, only after client calls a close will this > entry be made visible. One issue is that if the client does not call close > (e.g. failed), then that open key entry will never be deleted from meta data. > This JIRA tracks this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202654#comment-16202654 ]

Hadoop QA commented on HDFS-12626:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-7240 Compile Tests ||
| 0 | mvndep | 0m 7s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 29s | HDFS-7240 passed |
| +1 | compile | 1m 50s | HDFS-7240 passed |
| +1 | checkstyle | 0m 47s | HDFS-7240 passed |
| +1 | mvnsite | 1m 47s | HDFS-7240 passed |
| +1 | shadedclient | 12m 36s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 55s | HDFS-7240 passed |
| +1 | javadoc | 1m 54s | HDFS-7240 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 36s | the patch passed |
| +1 | compile | 1m 42s | the patch passed |
| +1 | javac | 1m 42s | the patch passed |
| +1 | checkstyle | 0m 44s | the patch passed |
| +1 | mvnsite | 1m 38s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 9s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 14s | the patch passed |
| +1 | javadoc | 1m 53s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 34s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 96m 53s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 21s | The patch generated 1 ASF License warnings. |
| | | 159m 19s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12626 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891777/HDFS-12626-HDFS-7240.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux f5de9eae4ed0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/perso
[jira] [Assigned] (HDFS-12648) DN should provide feedback to NN for throttling commands
[ https://issues.apache.org/jira/browse/HDFS-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru reassigned HDFS-12648: - Assignee: Hanisha Koneru > DN should provide feedback to NN for throttling commands > > > Key: HDFS-12648 > URL: https://issues.apache.org/jira/browse/HDFS-12648 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.8.0 >Reporter: Daryn Sharp >Assignee: Hanisha Koneru > > The NN should avoid sending commands to a DN with a high number of > outstanding commands. The heartbeat could provide this feedback via perhaps > a simple count of the commands or rate of processing. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
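To make the "simple count of the commands" idea concrete, here is a small sketch of how the NN side might cap what it sends once the DN reports its outstanding command count in a heartbeat. {{CommandThrottleSketch}}, its parameters, and the fixed {{maxOutstanding}} limit are all assumptions for illustration; the ticket does not specify a policy.

```java
/** Hypothetical illustration; not NameNode code. */
final class CommandThrottleSketch {

  /**
   * Returns how many of the queued commands to send in this heartbeat
   * response, given the number the DN says it is still working on.
   */
  static int commandsToSend(int queued, int outstandingOnDn, int maxOutstanding) {
    int budget = Math.max(0, maxOutstanding - outstandingOnDn);
    return Math.min(queued, budget);
  }
}
```

With a limit of 8, a DN reporting 3 outstanding commands would receive at most 5 more, and a DN already at or above the limit would receive none until it catches up.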
[jira] [Commented] (HDFS-12585) Add description for config in Ozone config UI
[ https://issues.apache.org/jira/browse/HDFS-12585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202608#comment-16202608 ] Chen Liang commented on HDFS-12585: --- Thanks [~ajayydv] for the clarification. Then I think maybe it's better to either rename loadDescriptionFromXml to something such as descriptionLoaded or remove this flag and just check if {{propertyMap}} is null. > Add description for config in Ozone config UI > - > > Key: HDFS-12585 > URL: https://issues.apache.org/jira/browse/HDFS-12585 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7240 >Reporter: Ajay Kumar >Assignee: Ajay Kumar > Fix For: HDFS-7240 > > Attachments: HDFS-12585-HDFS-7240.01.patch, > HDFS-12585-HDFS-7240.02.patch, HDFS-12585-HDFS-7240.03.patch > > > Add description for each config in Ozone config UI
[jira] [Commented] (HDFS-12249) dfsadmin -metaSave to output maintenance mode blocks
[ https://issues.apache.org/jira/browse/HDFS-12249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202604#comment-16202604 ] Wellington Chevreuil commented on HDFS-12249: - I believe the test failures are not related. I have those tests passing locally. > dfsadmin -metaSave to output maintenance mode blocks > > > Key: HDFS-12249 > URL: https://issues.apache.org/jira/browse/HDFS-12249 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Wei-Chiu Chuang >Assignee: Wellington Chevreuil >Priority: Minor > Attachments: HDFS-12249.001.patch > > > Found while reviewing for HDFS-12182. > {quote} > After the patch, the output of metaSave is: > Live Datanodes: 0 > Dead Datanodes: 0 > Metasave: Blocks waiting for reconstruction: 0 > Metasave: Blocks currently missing: 1 > file16387: blk_0_1 MISSING (replicas: l: 0 d: 0 c: 2 e: 0) > 1.1.1.1:9866(corrupt) (block deletions maybe out of date) : > 2.2.2.2:9866(corrupt) (block deletions maybe out of date) : > Mis-replicated blocks that have been postponed: > Metasave: Blocks being reconstructed: 0 > Metasave: Blocks 0 waiting deletion from 0 datanodes. > Corrupt Blocks: > Block=0 Node=1.1.1.1:9866 StorageID=s1StorageState=NORMAL > TotalReplicas=2 Reason=GENSTAMP_MISMATCH > Block=0 Node=2.2.2.2:9866 StorageID=s2StorageState=NORMAL > TotalReplicas=2 Reason=GENSTAMP_MISMATCH > Metasave: Number of datanodes: 0 > {quote} > {quote} > Looking at the output > The output is not user friendly. The meaning of "(replicas: l: 0 d: 0 c: 2 > e: 0)" is not obvious without looking at the code. > Also, it should print maintenance mode replicas. > {quote}
[jira] [Commented] (HDFS-11821) BlockManager.getMissingReplOneBlocksCount() does not report correct value if corrupt file with replication factor of 1 gets deleted
[ https://issues.apache.org/jira/browse/HDFS-11821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202602#comment-16202602 ] Wellington Chevreuil commented on HDFS-11821: - Any insights from anyone? > BlockManager.getMissingReplOneBlocksCount() does not report correct value if > corrupt file with replication factor of 1 gets deleted > --- > > Key: HDFS-11821 > URL: https://issues.apache.org/jira/browse/HDFS-11821 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.6.0, 3.0.0-alpha2 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Minor > Attachments: HDFS-11821-1.patch, HDFS-11821-2.patch > > > *BlockManager* keeps a separate metric for number of missing blocks with > replication factor of 1. This is returned by > *BlockManager.getMissingReplOneBlocksCount()* method currently, and that's > what is displayed on below attribute for *dfsadmin -report* (in below > example, there's one corrupt block that relates to a file with replication > factor of 1): > {noformat} > ... > Missing blocks (with replication factor 1): 1 > ... > {noformat} > However, if the related file gets deleted, (for instance, using hdfs fsck > -delete option), this metric never gets updated, and *dfsadmin -report* will > keep reporting a missing block, even though the file does not exist anymore. > The only workaround available is to restart the NN, so that this metric will > be cleared. > This can be easily reproduced by forcing a replication factor 1 file > corruption such as follows: > 1) Put a file into hdfs with replication factor 1: > {noformat} > $ hdfs dfs -Ddfs.replication=1 -put test_corrupt / > $ hdfs dfs -ls / > -rw-r--r-- 1 hdfs supergroup 19 2017-05-10 09:21 /test_corrupt > {noformat} > 2) Find related block for the file and delete it from DN: > {noformat} > $ hdfs fsck /test_corrupt -files -blocks -locations > ... > /test_corrupt 19 bytes, 1 block(s): OK > 0. 
BP-782213640-172.31.113.82-1494420317936:blk_1073742742_1918 len=19 > Live_repl=1 > [DatanodeInfoWithStorage[172.31.112.178:20002,DS-a0dc0b30-a323-4087-8c36-26ffdfe44f46,DISK]] > Status: HEALTHY > ... > $ find /dfs/dn/ -name blk_1073742742* > /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742 > /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742_1918.meta > $ rm -rf > /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742 > $ rm -rf > /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742_1918.meta > {noformat} > 3) Running fsck will report the corruption as expected: > {noformat} > $ hdfs fsck /test_corrupt -files -blocks -locations > ... > /test_corrupt 19 bytes, 1 block(s): > /test_corrupt: CORRUPT blockpool BP-782213640-172.31.113.82-1494420317936 > block blk_1073742742 > MISSING 1 blocks of total size 19 B > ... > Total blocks (validated): 1 (avg. block size 19 B) > > UNDER MIN REPL'D BLOCKS:1 (100.0 %) > dfs.namenode.replication.min: 1 > CORRUPT FILES: 1 > MISSING BLOCKS: 1 > MISSING SIZE: 19 B > CORRUPT BLOCKS: 1 > ... > {noformat} > 4) Same for *dfsadmin -report* > {noformat} > $ hdfs dfsadmin -report > ... > Under replicated blocks: 1 > Blocks with corrupt replicas: 0 > Missing blocks: 1 > Missing blocks (with replication factor 1): 1 > ... > {noformat} > 5) Running *fsck -delete* option does cause fsck to report correct > information about corrupt block, but dfsadmin still shows the corrupt block: > {noformat} > $ hdfs fsck /test_corrupt -delete > ... > $ hdfs fsck / > ... > The filesystem under path '/' is HEALTHY > ... > $ hdfs dfsadmin -report > ... > Under replicated blocks: 0 > Blocks with corrupt replicas: 0 > Missing blocks: 0 > Missing blocks (with replication factor 1): 1 > ... 
> {noformat} > The problem seems to be on *BlockManager.removeBlock()* method, which in turn > uses util class *LowRedundancyBlocks* that classifies blocks according to the > current replication level, including blocks currently marked as corrupt. > The related metric showed on *dfsadmin -report* for corrupt blocks with > replication factor 1 is tracked on this *LowRedundancyBlocks*. Whenever a > block is marked as corrupt and it has replication factor of 1, the related > metric is updated. When removing the block, though, > *BlockManager.removeBlock()* is calling *LowRedundancyBlocks.remove(BlockInfo > block, int priLevel)*, which does not check if the given bl
[jira] [Updated] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware
[ https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gruust updated HDFS-12649: -- Description: Hadoop's documentation tells me it's suitable for commodity hardware in the sense that hardware failures are expected to happen frequently. However, there is currently no automatic handling of corrupted blocks, which seems a bit contradictory to me. See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files This is even problematic for data integrity as the redundancy is not kept at the desired level without manual intervention and therefore in a timely manner. If there is a corrupted block, I would at least expect that the namenode forces the creation of an additional good replica to keep up the redundancy level, ie. the redundancy level should never include corrupted data... which it currently does: "UnderReplicatedBlocks" : 0, "CorruptBlocks" : 2, (namenode /jmx http dump) was: Hadoop's documentation tells me it's suitable for commodity hardware in the sense that hardware failures are expected to happen frequently. However, there is currently no automatic handling of corrupted blocks, which seems a bit contradictory to me. See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files This is even problematic for data integrity as the redundancy is not kept at the desired level without manual intervention and therefore in a timely manner. If there is a corrupted block, I would at least expect that the namenode forces the creation of an additional good replica to keep up the redundancy level. 
> handling of corrupt blocks not suitable for commodity hardware > -- > > Key: HDFS-12649 > URL: https://issues.apache.org/jira/browse/HDFS-12649 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.1 >Reporter: Gruust >Priority: Minor > > Hadoop's documentation tells me it's suitable for commodity hardware in the > sense that hardware failures are expected to happen frequently. However, > there is currently no automatic handling of corrupted blocks, which seems a > bit contradictory to me. > See: > https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files > This is even problematic for data integrity as the redundancy is not kept at > the desired level without manual intervention and therefore in a timely > manner. If there is a corrupted block, I would at least expect that the > namenode forces the creation of an additional good replica to keep up the > redundancy level, ie. the redundancy level should never include corrupted > data... which it currently does: > "UnderReplicatedBlocks" : 0, > "CorruptBlocks" : 2, > (namenode /jmx http dump) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware
[ https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gruust updated HDFS-12649: -- Description: Hadoop's documentation tells me it's suitable for commodity hardware in the sense that hardware failures are expected to happen frequently. However, there is currently no automatic handling of corrupted blocks, which seems a bit contradictory to me. See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files This is even problematic for data integrity as the redundancy is not kept at the desired level without manual intervention and therefore in a timely manner. If there is a corrupted block, I would at least expect that the namenode forces the creation of an additional good replica to keep up the redundancy level. was: Hadoop's documentation tells me it's suitable for commodity hardware in the sense that hardware failures are expected to happen frequently. However, there is currently no automatic handling of corrupted blocks, which seems a bit contradictory to me. See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files This is even problematic for data integrity as the redundancy is not kept at the desired level without manual intervention. If there is a corrupted block, I would at least expect that the namenode forces the creation of an additional good replica to keep up the redundancy level. > handling of corrupt blocks not suitable for commodity hardware > -- > > Key: HDFS-12649 > URL: https://issues.apache.org/jira/browse/HDFS-12649 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.1 >Reporter: Gruust >Priority: Minor > > Hadoop's documentation tells me it's suitable for commodity hardware in the > sense that hardware failures are expected to happen frequently. However, > there is currently no automatic handling of corrupted blocks, which seems a > bit contradictory to me. 
> See: > https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files > This is even problematic for data integrity as the redundancy is not kept at > the desired level without manual intervention and therefore in a timely > manner. If there is a corrupted block, I would at least expect that the > namenode forces the creation of an additional good replica to keep up the > redundancy level.
[jira] [Created] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware
Gruust created HDFS-12649: - Summary: handling of corrupt blocks not suitable for commodity hardware Key: HDFS-12649 URL: https://issues.apache.org/jira/browse/HDFS-12649 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.8.1 Reporter: Gruust Priority: Minor Hadoop's documentation tells me it's suitable for commodity hardware in the sense that hardware failures are expected to happen frequently. However, there is currently no automatic handling of corrupted blocks, which seems a bit contradictory to me. See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files This is even problematic for data integrity as the redundancy is not kept at the desired level without manual intervention. If there is a corrupted block, I would at least expect that the namenode forces the creation of an additional good replica to keep up the redundancy level.
[jira] [Updated] (HDFS-12632) Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem
[ https://issues.apache.org/jira/browse/HDFS-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-12632: -- Resolution: Fixed Status: Resolved (was: Patch Available) The failed tests are unrelated. I've committed this to the feature branch, thanks [~msingh] for the contribution and [~anu] for the review! > Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem > - > > Key: HDFS-12632 > URL: https://issues.apache.org/jira/browse/HDFS-12632 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Labels: ozoneMerge > Fix For: HDFS-7240 > > Attachments: HDFS-12632-HDFS-7240.001.patch > > > HDFS-11704 adds OzoneFileSystem (aka o3) to Ozone. This jira will be used to > add ContractTest for the filesystem to Ozone.
[jira] [Commented] (HDFS-10743) MiniDFSCluster test runtimes can be drastically reduce
[ https://issues.apache.org/jira/browse/HDFS-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202482#comment-16202482 ] Daryn Sharp commented on HDFS-10743: Triggering a block report immediately after the heartbeat isn't addressing the main issue of delayed reconnects after a cluster restart. Eliminating that delay will save a lot of time. The DNs are stuck waiting for the next heartbeat or stuck in {{sleepAfterException}}. The mini cluster has some "triggerBlah" methods, but they are synchronous and wait for the operation to complete, which we can't do because sometimes DNs are expected to fail to connect. An async wakeup can be done with {{DataNode#scheduleAllBlockReport(0)}} – if it also then triggered a heartbeat. Maybe add a flag to that method for sending a heartbeat. Something needs to be done to wake the thread from {{sleepAfterException}} because tests will likely encounter that delay during restarts. > MiniDFSCluster test runtimes can be drastically reduce > -- > > Key: HDFS-10743 > URL: https://issues.apache.org/jira/browse/HDFS-10743 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 2.0.0-alpha >Reporter: Daryn Sharp >Assignee: Kuhu Shukla > Attachments: HDFS-10743.001.patch, HDFS-10743.002.patch, > HDFS-10743.003.patch > > > {{MiniDFSCluster}} tests have excessive runtimes. The main problem appears > to be the heartbeat interval. The NN may have to wait up to 3s (default > value) for all DNs to heartbeat, triggering registration, so NN can go > active. Tests that repeatedly restart the NN are severely affected. > Example for varying heartbeat intervals for {{TestFSImageWithAcl}}: > * 3s = ~70s -- (disgusting, why I investigated) > * 1s = ~27s > * 500ms = ~17s -- (had to hack DNConf for millisecond precision) > That's a 4x improvement in runtime. > 17s is still excessively long for what the test does. 
Further areas to > explore when running tests: > * Reduce the numerous sleep intervals in DN's {{BPServiceActor}}. > * Ensure heartbeats and initial BR are sent immediately upon (re)registration.
[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured
[ https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-12614: -- Attachment: HDFS-12614.03.patch Attached v03 patch with more comments. [~yzhangal], [~daryn], can you please take a look at the latest patch revision? Thanks. > FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider > configured > -- > > Key: HDFS-12614 > URL: https://issues.apache.org/jira/browse/HDFS-12614 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0-beta1 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy > Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, > HDFS-12614.03.patch, HDFS-12614.test.01.patch > > > When INodeAttributesProvider is configured, and when resolving path (like > "/") and checking for permission, the following code when working on > {{pathByNameArr}} throws NullPointerException. > {noformat} > private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx, > INode inode, int snapshotId) { > INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId); > if (getAttributesProvider() != null) { > String[] elements = new String[pathIdx + 1]; > for (int i = 0; i < elements.length; i++) { > elements[i] = DFSUtil.bytes2String(pathByNameArr[i]); <=== > } > inodeAttrs = getAttributesProvider().getAttributes(elements, > inodeAttrs); > } > return inodeAttrs; > } > {noformat} > Looks like for paths like "/" where the split components based on delimiter > "/" can be null, the pathByNameArr array can have null elements and can throw > NPE. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
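The failure mode described above, and one possible guard, can be sketched in plain Java without the HDFS classes. Here `bytesToString` is a stand-in for `DFSUtil.bytes2String`, and mapping a null component to the empty string is an assumption made for illustration; it is not necessarily what the attached patches do.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of null-safe path component conversion. */
final class PathComponentsSketch {
  /** Stand-in for DFSUtil.bytes2String; like the reported code path, it fails on null input. */
  static String bytesToString(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  /** Converts the first pathIdx + 1 components, treating a null component as "". */
  static String[] toElements(byte[][] pathByNameArr, int pathIdx) {
    String[] elements = new String[pathIdx + 1];
    for (int i = 0; i < elements.length; i++) {
      byte[] component = pathByNameArr[i];
      // Guard: a path such as "/" may yield a null component after splitting.
      elements[i] = (component == null) ? "" : bytesToString(component);
    }
    return elements;
  }

  public static void main(String[] args) {
    byte[][] root = { null };
    String[] elements = toElements(root, 0);
    System.out.println(elements.length + " element(s), first is empty: " + elements[0].isEmpty());
  }
}
```

Whether the empty string is the right representation for the root component depends on what the configured provider expects, which the description does not say.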
[jira] [Commented] (HDFS-12632) Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem
[ https://issues.apache.org/jira/browse/HDFS-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202460#comment-16202460 ] Chen Liang commented on HDFS-12632: --- Thanks [~msingh] for adding the tests. +1 > Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem > - > > Key: HDFS-12632 > URL: https://issues.apache.org/jira/browse/HDFS-12632 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Labels: ozoneMerge > Fix For: HDFS-7240 > > Attachments: HDFS-12632-HDFS-7240.001.patch > > > HDFS-11704 adds OzoneFileSystem (aka o3) to Ozone. This jira will be used to > add ContractTest for the filesystem to Ozone.
[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-12626: -- Attachment: HDFS-12626-HDFS-7240.002.patch 002 patch to rebase. > Ozone : delete open key entries that will no longer be closed > - > > Key: HDFS-12626 > URL: https://issues.apache.org/jira/browse/HDFS-12626 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-12626-HDFS-7240.001.patch, > HDFS-12626-HDFS-7240.002.patch > > > HDFS-12543 introduced the notion of "open key": when a key is opened, an > open key entry gets persisted, and only after the client calls close will this > entry be made visible. One issue is that if the client does not call close > (e.g. because it failed), then that open key entry will never be deleted from the metadata. > This JIRA tracks this issue.
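One way such stale entries could be cleaned up is by age: anything that has stayed open longer than some threshold is treated as abandoned and removed. The sketch below assumes that approach; the class, its methods, and the age-based policy are hypothetical and are not taken from the attached patches or from the Ozone code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: expire open key entries that were never closed. */
final class OpenKeyCleanupSketch {
  /** Open key name mapped to the time (in ms) at which it was opened. */
  private final Map<String, Long> openKeys = new HashMap<>();

  void open(String key, long nowMs) {
    openKeys.put(key, nowMs);
  }

  /** A normal close removes the open entry; the key would become visible elsewhere. */
  void close(String key) {
    openKeys.remove(key);
  }

  /** Removes and returns every entry that has been open for longer than maxAgeMs. */
  List<String> expire(long nowMs, long maxAgeMs) {
    List<String> removed = new ArrayList<>();
    Iterator<Map.Entry<String, Long>> it = openKeys.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      if (nowMs - entry.getValue() > maxAgeMs) {
        removed.add(entry.getKey());
        it.remove(); // safe removal while iterating
      }
    }
    return removed;
  }

  int openCount() {
    return openKeys.size();
  }

  public static void main(String[] args) {
    OpenKeyCleanupSketch sketch = new OpenKeyCleanupSketch();
    sketch.open("abandoned", 0L);
    sketch.open("recent", 900L);
    System.out.println(sketch.expire(1000L, 500L)); // only the old entry is removed
  }
}
```

A real cleanup would also need to decide what happens to data already written for an expired key, which the description leaves open.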
[jira] [Commented] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202361#comment-16202361 ] Hadoop QA commented on HDFS-12626: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-12626 does not apply to HDFS-7240. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-12626 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891548/HDFS-12626-HDFS-7240.001.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21669/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Ozone : delete open key entries that will no longer be closed > - > > Key: HDFS-12626 > URL: https://issues.apache.org/jira/browse/HDFS-12626 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-12626-HDFS-7240.001.patch > > > HDFS-12543 introduced the notion of "open key": when a key is opened, an > open key entry gets persisted, and only after the client calls close will this > entry be made visible. One issue is that if the client does not call close > (e.g. because it failed), then that open key entry will never be deleted from the metadata. > This JIRA tracks this issue.
[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202357#comment-16202357 ] Daryn Sharp commented on HDFS-12638: The issues actually appear unrelated. In our case, there was only 1 replica, that node died, and the block was deleted. The block got stuck in the replication monitor's missing block queue. Kihwal is filing a jira with additional details. > NameNode exits due to ReplicationMonitor thread received Runtime exception in > ReplicationWork#chooseTargets > --- > > Key: HDFS-12638 > URL: https://issues.apache.org/jira/browse/HDFS-12638 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang > > The active NameNode exits due to an NPE. I can confirm that the BlockCollection passed > in when creating ReplicationWork is null, but I do not know why > BlockCollection is null. Looking at the history, I found that > [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check for > whether BlockCollection is null. > The NN logs are as follows: > {code:java} > 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: > ReplicationMonitor thread received Runtime exception. 
> java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744) > at java.lang.Thread.run(Thread.java:834) > {code}
[jira] [Updated] (HDFS-12626) Ozone : delete open key entries that will no longer be closed
[ https://issues.apache.org/jira/browse/HDFS-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-12626: -- Status: Patch Available (was: Open) > Ozone : delete open key entries that will no longer be closed > - > > Key: HDFS-12626 > URL: https://issues.apache.org/jira/browse/HDFS-12626 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-12626-HDFS-7240.001.patch > > > HDFS-12543 introduced the notion of "open key": when a key is opened, an > open key entry gets persisted, and only after the client calls close will this > entry be made visible. One issue is that if the client does not call close > (e.g. because it failed), then that open key entry will never be deleted from the metadata. > This JIRA tracks this issue.
[jira] [Commented] (HDFS-12415) Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202341#comment-16202341 ] Chen Liang commented on HDFS-12415: --- I looked into this a little bit too. What seems to be happening is that {{SCMCommonPolicy#chooseDatanodes}} calls {{nodeManager.getNodes(OzoneProtos.NodeState.HEALTHY);}}, but the returned list contains a {{null}} datanode id entry. So the {{hasEnoughSpace(d, sizeRequired)}} call on the null d will fail with an NPE. The list with a null entry is returned by {{SCMNodeManager#getNodes}}, where it seems there is some datanode id in {{healthyNodes}} that is not present in the {{nodes}} map. I don't see how a datanode id could be present in {{healthyNodes}} but not in {{nodes}}, because the first thing register does is always to add that datanode to {{nodes}}, before {{healthyNodes}}. I can only think of the issue being just like [~msingh] mentioned, that it is probably due to some unexpected race condition behaviour when two register calls happen and change the HashMap {{nodes}} at the same time. So I would +1 on Mukul's change. Additionally, I ran {{TestXceiverClientManager}} dozens of times with the v005 patch applied. The test did not fail. 
> Ozone: TestXceiverClientManager and TestAllocateContainer occasionally fails > > > Key: HDFS-12415 > URL: https://issues.apache.org/jira/browse/HDFS-12415 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7240 >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: HDFS-12415-HDFS-7240.001.patch, > HDFS-12415-HDFS-7240.002.patch, HDFS-12415-HDFS-7240.003.patch, > HDFS-12415-HDFS-7240.004.patch, HDFS-12415-HDFS-7240.005.patch > > > TestXceiverClientManager seems to be occasionally failing in some jenkins > jobs, > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828) > at > org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147) > at > org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125) > {noformat} > see more from [this > report|https://builds.apache.org/job/PreCommit-HDFS-Build/21065/testReport/] -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
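The suspected race in the comment above comes from concurrent registrations mutating a plain HashMap. As a minimal sketch of one way to avoid that, the registry below uses concurrent collections and also skips any id that has no entry in the map. All names here are hypothetical and this is not the SCMNodeManager code or the content of the v005 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch: registration that is safe under concurrent register calls. */
final class NodeRegistrySketch {
  private final Map<String, String> nodes = new ConcurrentHashMap<>(); // id -> address
  private final Queue<String> healthyNodes = new ConcurrentLinkedQueue<>();

  /** Adds the node to the map first, then marks it healthy, exactly once per id. */
  void register(String id, String address) {
    if (nodes.putIfAbsent(id, address) == null) {
      healthyNodes.add(id);
    }
  }

  /** Returns addresses of healthy nodes, never including a null entry. */
  List<String> getHealthyNodes() {
    List<String> result = new ArrayList<>();
    for (String id : healthyNodes) {
      String address = nodes.get(id);
      if (address != null) {
        result.add(address);
      }
    }
    return result;
  }

  /** Registers threads * perThread distinct nodes from several threads at once. */
  static NodeRegistrySketch registerConcurrently(int threads, int perThread) {
    NodeRegistrySketch registry = new NodeRegistrySketch();
    List<Thread> workers = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      final int base = t * perThread;
      Thread worker = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          registry.register("dn-" + (base + i), "host-" + (base + i));
        }
      });
      workers.add(worker);
      worker.start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return registry;
  }

  public static void main(String[] args) {
    System.out.println(registerConcurrently(4, 250).getHealthyNodes().size());
  }
}
```

Synchronizing the register method would be another option; the sketch only shows the concurrent-collection variant.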
[jira] [Commented] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
[ https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202333#comment-16202333 ]

Hadoop QA commented on HDFS-5926:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 23s | trunk passed |
| +1 | compile | 1m 8s | trunk passed |
| +1 | mvnsite | 1m 15s | trunk passed |
| +1 | shadedclient | 29m 17s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 1s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 10s | the patch passed |
| +1 | compile | 1m 14s | the patch passed |
| +1 | javac | 1m 14s | the patch passed |
| +1 | mvnsite | 1m 33s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 12s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 5s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 131m 32s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 32s | The patch does not generate ASF License warnings. |
| | | 181m 27s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.namenode.ha.TestHAAppend |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:3d04c00 |
| JIRA Issue | HDFS-5926 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891730/HDFS-5926-1.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml |
| uname | Linux 461d2b393e7a 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 075358e |
| Default Java | 1.8.0_144 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21668/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21668/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21668/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity > - > >
[jira] [Commented] (HDFS-12639) BPOfferService lock may stall all service actors
[ https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202321#comment-16202321 ] Hanisha Koneru commented on HDFS-12639: --- Sure.. Thanks Daryn. > BPOfferService lock may stall all service actors > > > Key: HDFS-12639 > URL: https://issues.apache.org/jira/browse/HDFS-12639 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.8.0 >Reporter: Daryn Sharp >Assignee: Hanisha Koneru > > {{BPOfferService}} manages {{BPServiceActor}} instances for the active and > standby. It uses a RW lock to primarily protect registration information > while determining the active/standby from heartbeats. > Unfortunately the write lock is held during command processing. If an actor > is experiencing high latency processing commands, the other actor will > neither be able to register (blocked in createRegistration, setNamespaceInfo, > verifyAndSetNamespaceInfo) nor process heartbeats (blocked in > updateActorStatesFromHeartbeat). > The worst case scenario for processing commands while holding the lock is > re-registration. The actor will loop, catching and logging exceptions, > leaving the other actor blocked for an non-deterministic (possibly infinite) > amount of time. > The lock must not be held during command processing. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
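The locking concern in the issue description above can be illustrated with a minimal sketch. It assumes a much simplified offer service: the class name OfferServiceSketch, the activeActor field and both method names are hypothetical and do not correspond to the real BPOfferService code. The first method runs the command handler while the write lock is held, which is the pattern the issue describes; the second only reads the guarded state under the read lock and runs the handler after releasing it.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical stand-in for an offer service; not the real Hadoop class.
final class OfferServiceSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // State guarded by the lock: which actor currently talks to the active NN.
  private String activeActor = "actor-1";

  // Lets a handler observe whether it is running under the write lock.
  boolean writeLockHeldByCaller() {
    return lock.isWriteLockedByCurrentThread();
  }

  // Pattern described in the issue: the handler runs while the write lock is
  // held, so a slow command blocks every other user of the lock.
  void processHoldingLock(String actor, String command, Consumer<String> handler) {
    lock.writeLock().lock();
    try {
      if (actor.equals(activeActor)) {
        handler.accept(command);
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Alternative: decide under the read lock, release it, then run the handler.
  void processOutsideLock(String actor, String command, Consumer<String> handler) {
    boolean fromActive;
    lock.readLock().lock();
    try {
      fromActive = actor.equals(activeActor);
    } finally {
      lock.readLock().unlock();
    }
    if (fromActive) {
      handler.accept(command);
    }
  }
}
```

With the second variant a slow handler delays only the actor that runs it. Whether the active/standby decision may safely be taken from a snapshot like this is a design question for the actual patch, not something this sketch settles.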
[jira] [Assigned] (HDFS-12639) BPOfferService lock may stall all service actors
[ https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru reassigned HDFS-12639: - Assignee: Hanisha Koneru > BPOfferService lock may stall all service actors > > > Key: HDFS-12639 > URL: https://issues.apache.org/jira/browse/HDFS-12639 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.8.0 >Reporter: Daryn Sharp >Assignee: Hanisha Koneru > > {{BPOfferService}} manages {{BPServiceActor}} instances for the active and > standby. It uses a RW lock to primarily protect registration information > while determining the active/standby from heartbeats. > Unfortunately the write lock is held during command processing. If an actor > is experiencing high latency processing commands, the other actor will > neither be able to register (blocked in createRegistration, setNamespaceInfo, > verifyAndSetNamespaceInfo) nor process heartbeats (blocked in > updateActorStatesFromHeartbeat). > The worst case scenario for processing commands while holding the lock is > re-registration. The actor will loop, catching and logging exceptions, > leaving the other actor blocked for an non-deterministic (possibly infinite) > amount of time. > The lock must not be held during command processing. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12632) Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem
[ https://issues.apache.org/jira/browse/HDFS-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202302#comment-16202302 ] Anu Engineer commented on HDFS-12632: - +1, LGTM. > Ozone: OzoneFileSystem: Add contract tests to OzoneFileSystem > - > > Key: HDFS-12632 > URL: https://issues.apache.org/jira/browse/HDFS-12632 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Labels: ozoneMerge > Fix For: HDFS-7240 > > Attachments: HDFS-12632-HDFS-7240.001.patch > > > HDFS-11704 adds OzoneFileSytem aka (o3) to ozone. This jira will be used to > add ContractTest for the filesystem to Ozone. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12599) Remove Mockito dependency from DataNodeTestUtils
[ https://issues.apache.org/jira/browse/HDFS-12599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-12599: - Fix Version/s: (was: 3.1.0) > Remove Mockito dependency from DataNodeTestUtils > > > Key: HDFS-12599 > URL: https://issues.apache.org/jira/browse/HDFS-12599 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0-beta1 >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-12599.v1.patch, HDFS-12599.v1.patch, > HDFS-12599.v1.patch > > > HDFS-11164 introduced {{DataNodeTestUtils.mockDatanodeBlkPinning}} which > brought dependency on mockito back into DataNodeTestUtils > Downstream, this resulted in: > {code} > java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:769) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12212) Options.Rename.To_TRASH is considered even when Options.Rename.NONE is specified
[ https://issues.apache.org/jira/browse/HDFS-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202294#comment-16202294 ] Wei-Chiu Chuang commented on HDFS-12212: Hi [~vinayrpet] thanks for the patch. The fix itself looks good to me. Would you also like to contribute a test? > Options.Rename.To_TRASH is considered even when Options.Rename.NONE is > specified > > > Key: HDFS-12212 > URL: https://issues.apache.org/jira/browse/HDFS-12212 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2 >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-12212-01.patch > > > HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate the > movement to trash and other renames for permission checks. > When Options.Rename.NONE is passed also TO_TRASH is considered for rename and > wrong permissions are checked for rename. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12212) Options.Rename.To_TRASH is considered even when Options.Rename.NONE is specified
[ https://issues.apache.org/jira/browse/HDFS-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202257#comment-16202257 ] Wei-Chiu Chuang commented on HDFS-12212: Updated Affects versions based on HDFS-8312's affects versions. > Options.Rename.To_TRASH is considered even when Options.Rename.NONE is > specified > > > Key: HDFS-12212 > URL: https://issues.apache.org/jira/browse/HDFS-12212 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2 >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-12212-01.patch > > > HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate the > movement to trash and other renames for permission checks. > When Options.Rename.NONE is passed also TO_TRASH is considered for rename and > wrong permissions are checked for rename. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12212) Options.Rename.To_TRASH is considered even when Options.Rename.NONE is specified
[ https://issues.apache.org/jira/browse/HDFS-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-12212: --- Affects Version/s: 2.8.2 2.9.0 2.7.4 3.0.0-alpha1 > Options.Rename.To_TRASH is considered even when Options.Rename.NONE is > specified > > > Key: HDFS-12212 > URL: https://issues.apache.org/jira/browse/HDFS-12212 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2 >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-12212-01.patch > > > HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate the > movement to trash and other renames for permission checks. > When Options.Rename.NONE is passed also TO_TRASH is considered for rename and > wrong permissions are checked for rename. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey
[ https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDFS-12490: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Ozone: OzoneClient: Add creation/modification time information in > OzoneVolume/OzoneBucket/OzoneKey > -- > > Key: HDFS-12490 > URL: https://issues.apache.org/jira/browse/HDFS-12490 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Fix For: HDFS-7240 > > Attachments: HDFS-12490-HDFS-7240.001.patch, > HDFS-12490-HDFS-7240.002.patch, HDFS-12490-HDFS-7240.003.patch > > > OzoneBucket should have information about the bucket creation time. > OzoneFileSystem needs creation time to display the file status information > for the root of the filesystem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey
[ https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202211#comment-16202211 ] Nanda kumar commented on HDFS-12490: I have committed this to feature branch. Thanks for the contribution [~msingh] and [~anu] for the review. > Ozone: OzoneClient: Add creation/modification time information in > OzoneVolume/OzoneBucket/OzoneKey > -- > > Key: HDFS-12490 > URL: https://issues.apache.org/jira/browse/HDFS-12490 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Fix For: HDFS-7240 > > Attachments: HDFS-12490-HDFS-7240.001.patch, > HDFS-12490-HDFS-7240.002.patch, HDFS-12490-HDFS-7240.003.patch > > > OzoneBucket should have information about the bucket creation time. > OzoneFileSystem needs creation time to display the file status information > for the root of the filesystem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey
[ https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202203#comment-16202203 ] Nanda kumar commented on HDFS-12490: Since we don't want to expose setter methods for the parameters passed to constructor, the checkstyle issue is left for now. Test failures are not related, will commit it shortly. > Ozone: OzoneClient: Add creation/modification time information in > OzoneVolume/OzoneBucket/OzoneKey > -- > > Key: HDFS-12490 > URL: https://issues.apache.org/jira/browse/HDFS-12490 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh > Fix For: HDFS-7240 > > Attachments: HDFS-12490-HDFS-7240.001.patch, > HDFS-12490-HDFS-7240.002.patch, HDFS-12490-HDFS-7240.003.patch > > > OzoneBucket should have information about the bucket creation time. > OzoneFileSystem needs creation time to display the file status information > for the root of the filesystem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12620) Backporting HDFS-10467 to branch-2
[ https://issues.apache.org/jira/browse/HDFS-12620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202148#comment-16202148 ] Íñigo Goiri commented on HDFS-12620: Even with the most standard naming convention, Jenkins is not giving me the report so not sure what's wrong with javac. > Backporting HDFS-10467 to branch-2 > -- > > Key: HDFS-12620 > URL: https://issues.apache.org/jira/browse/HDFS-12620 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Attachments: HDFS-10467-branch-2.001.patch, > HDFS-10467-branch-2.002.patch, HDFS-10467-branch-2.003.patch, > HDFS-10467-branch-2.patch, HDFS-12620-branch-2.000.patch, > HDFS-12620-branch-2.004.patch > > > When backporting HDFS-10467, there are a few things that changed: > * {{bin\hdfs}} > * {{ClientProtocol}} > * Java 7 not supporting referencing functions > * {{org.eclipse.jetty.util.ajax.JSON}} in branch-2 is > {{org.mortbay.util.ajax.JSON}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12585) Add description for config in Ozone config UI
[ https://issues.apache.org/jira/browse/HDFS-12585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202128#comment-16202128 ] Ajay Kumar commented on HDFS-12585: --- [~vagarychen], thanks for the review. Yes, {{propertyMap}} is only initialized in {{loadDescriptions()}}. Since the default value of {{loadDescriptionFromXml}} is true, {{loadDescriptions()}} will be called initially, and thereafter we will use the initialized map. The main intention behind {{loadDescriptionFromXml}} is to load the property descriptions from the XML files, which are not going to change frequently, so we don't want to unmarshal the XML files every time the servlet is called. > Add description for config in Ozone config UI > - > > Key: HDFS-12585 > URL: https://issues.apache.org/jira/browse/HDFS-12585 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7240 >Reporter: Ajay Kumar >Assignee: Ajay Kumar > Fix For: HDFS-7240 > > Attachments: HDFS-12585-HDFS-7240.01.patch, > HDFS-12585-HDFS-7240.02.patch, HDFS-12585-HDFS-7240.03.patch > > > Add description for each config in Ozone config UI -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
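The load-once behaviour described in the comment above could look roughly like the following sketch. Only the names propertyMap, loadDescriptions() and loadDescriptionFromXml come from the comment; the class DescriptionCacheSketch, the Supplier standing in for the XML unmarshalling, and the load counter are assumptions made for illustration and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical cache; the real servlet code in the patch may differ.
final class DescriptionCacheSketch {
  // Stand-in for unmarshalling the XML files (assumed to be expensive).
  private final Supplier<Map<String, String>> xmlLoader;
  // True until the descriptions have been loaded once.
  private boolean loadDescriptionFromXml = true;
  private Map<String, String> propertyMap = new HashMap<>();
  private int loadCount = 0;

  DescriptionCacheSketch(Supplier<Map<String, String>> xmlLoader) {
    this.xmlLoader = xmlLoader;
  }

  // Called on every request; the XML is read only on the first call.
  synchronized String describe(String key) {
    if (loadDescriptionFromXml) {
      loadDescriptions();
      loadDescriptionFromXml = false;
    }
    return propertyMap.get(key);
  }

  private void loadDescriptions() {
    propertyMap = new HashMap<>(xmlLoader.get());
    loadCount++;
  }

  synchronized int getLoadCount() {
    return loadCount;
  }
}
```

The trade-off is the usual one for a cache of this kind: a change to the XML files is not picked up until the flag is set back to true or the process restarts.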
[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202092#comment-16202092 ] Daryn Sharp commented on HDFS-12638: It does differ from our case but likely has the same root cause. The block was in your blocks map, but not in ours. The block existed on your DN, but not ours. The commonality is the block referenced a non-existent inode/collection. > NameNode exits due to ReplicationMonitor thread received Runtime exception in > ReplicationWork#chooseTargets > --- > > Key: HDFS-12638 > URL: https://issues.apache.org/jira/browse/HDFS-12638 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang > > Active NamNode exit due to NPE, I can confirm that the BlockCollection passed > in when creating ReplicationWork is null, but I do not know why > BlockCollection is null, By view history I found > [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] remove judging > whether BlockCollection is null. > NN logs are as following: > {code:java} > 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: > ReplicationMonitor thread received Runtime exception. 
> java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744) > at java.lang.Thread.run(Thread.java:834) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202079#comment-16202079 ] Daryn Sharp commented on HDFS-12638: This is bad. The fsck NPE means the block _is_ in the blocks map but the inode/collection it references _is not_ in the blocks map. Last I knew, a size of Long.MAX_VALUE means the block is scheduled for a "fire and forget" invalidation because the file was deleted. bq. and there are logs of truncate cmd in auditlog To double check, did you mean "are" or "are not"? > NameNode exits due to ReplicationMonitor thread received Runtime exception in > ReplicationWork#chooseTargets > --- > > Key: HDFS-12638 > URL: https://issues.apache.org/jira/browse/HDFS-12638 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang > > Active NamNode exit due to NPE, I can confirm that the BlockCollection passed > in when creating ReplicationWork is null, but I do not know why > BlockCollection is null, By view history I found > [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] remove judging > whether BlockCollection is null. > NN logs are as following: > {code:java} > 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: > ReplicationMonitor thread received Runtime exception. 
> java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744) > at java.lang.Thread.run(Thread.java:834) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12639) BPOfferService lock may stall all service actors
[ https://issues.apache.org/jira/browse/HDFS-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202067#comment-16202067 ] Daryn Sharp commented on HDFS-12639: Sure, go ahead and assign to yourself. I don't assign to myself unless I have free cycles. Even though my responses are often delayed, please let me review your patch. > BPOfferService lock may stall all service actors > > > Key: HDFS-12639 > URL: https://issues.apache.org/jira/browse/HDFS-12639 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.8.0 >Reporter: Daryn Sharp > > {{BPOfferService}} manages {{BPServiceActor}} instances for the active and > standby. It uses a RW lock to primarily protect registration information > while determining the active/standby from heartbeats. > Unfortunately the write lock is held during command processing. If an actor > is experiencing high latency processing commands, the other actor will > neither be able to register (blocked in createRegistration, setNamespaceInfo, > verifyAndSetNamespaceInfo) nor process heartbeats (blocked in > updateActorStatesFromHeartbeat). > The worst case scenario for processing commands while holding the lock is > re-registration. The actor will loop, catching and logging exceptions, > leaving the other actor blocked for an non-deterministic (possibly infinite) > amount of time. > The lock must not be held during command processing. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12645) FSDatasetImpl lock will stall BP service actors and may cause missing blocks
[ https://issues.apache.org/jira/browse/HDFS-12645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202063#comment-16202063 ] Daryn Sharp commented on HDFS-12645: I understand the lifeline protocol was designed to avoid the node being declared dead, but it's just hiding the consequences of a poor locking design. Preventing a dead node via a lifeline is of dubious value when the node is effectively dead due to blocked IO in the dataset lock. The node can't process replications which may lead to data loss when another node could have serviced the replication request. Ex. The lifeline will keep a node "alive" even though it's having severe hw issues and ultimately crashed. > FSDatasetImpl lock will stall BP service actors and may cause missing blocks > > > Key: HDFS-12645 > URL: https://issues.apache.org/jira/browse/HDFS-12645 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.8.0 >Reporter: Daryn Sharp > > The DN is extremely susceptible to a slow volume due bad locking practices. > DN operations require a fs dataset lock. IO in the dataset lock should not > be permissible as it leads to severe performance degradation and possibly > (temporarily) missing blocks. > A slow disk will cause pipelines to experience significant latency and > timeouts, increasing lock/io contention while cleaning up, leading to more > timeouts, etc. Meanwhile, the actor service thread is interleaving multiple > lock acquire/releases with xceivers. If many commands are issued, the node > may be incorrectly declared as dead. > HDFS-12639 documents that both actors synchronize on the offer service lock > while processing commands. A backlogged active actor will block the standby > actor and cause it to go dead too. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-12647) DN commands processing should be async
[ https://issues.apache.org/jira/browse/HDFS-12647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandakumar reassigned HDFS-12647: - Assignee: Nandakumar > DN commands processing should be async > -- > > Key: HDFS-12647 > URL: https://issues.apache.org/jira/browse/HDFS-12647 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.8.0 >Reporter: Daryn Sharp >Assignee: Nandakumar > > Due to dataset lock contention, service actors may encounter significant > latency while processing DN commands. Even the queuing of async deletions > require multiple lock acquisitions. A slow disk will cause a backlog of > xceivers instantiating block sender/receivers which starves the actor and > leads to the NN falsely declaring the node dead. > Async processing of all commands will free the actor to perform its primary > purpose of heartbeating and block reporting. Note that FBRs will be > dependent on queued block invalidations not being included in the report. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
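One possible shape for the asynchronous command processing proposed above is a hand-off to a separate executor, so the actor thread only enqueues work and returns to heartbeating. The sketch below is an assumption about how that might look, not the design chosen for this sub-task; AsyncCommandSketch, offer, drain and processedCount are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical hand-off of DN commands to a worker thread.
final class AsyncCommandSketch {
  // A single worker keeps commands in the order they were received.
  private final ExecutorService commandExecutor = Executors.newSingleThreadExecutor();
  private final AtomicInteger processed = new AtomicInteger();

  // Called from the actor thread: only enqueues, never runs the command itself.
  void offer(Runnable command) {
    commandExecutor.execute(() -> {
      try {
        command.run();
      } finally {
        processed.incrementAndGet();
      }
    });
  }

  // Stops accepting commands and waits for the queue to empty.
  boolean drain(long timeoutMillis) {
    commandExecutor.shutdown();
    try {
      return commandExecutor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  int processedCount() {
    return processed.get();
  }
}
```

The queue here is unbounded, which is why the description's note about full block reports and still-queued invalidations matters: anything waiting in the queue has not yet taken effect on the datanode.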
[jira] [Created] (HDFS-12648) DN should provide feedback to NN for throttling commands
Daryn Sharp created HDFS-12648: -- Summary: DN should provide feedback to NN for throttling commands Key: HDFS-12648 URL: https://issues.apache.org/jira/browse/HDFS-12648 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.8.0 Reporter: Daryn Sharp The NN should avoid sending commands to a DN with a high number of outstanding commands. The heartbeat could provide this feedback via perhaps a simple count of the commands or rate of processing. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
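As a rough illustration of the feedback idea above, the NN side could cap the number of commands it sends based on the outstanding count a DN reported in its last heartbeat. Everything in this sketch is hypothetical: the issue proposes no concrete field, threshold or class, and CommandThrottleSketch, maxOutstanding and commandsToSend are invented for the example.

```java
// Hypothetical NN-side throttle driven by a count reported in the heartbeat.
final class CommandThrottleSketch {
  // Assumed cap on commands a DN should have outstanding at any time.
  private final int maxOutstanding;

  CommandThrottleSketch(int maxOutstanding) {
    this.maxOutstanding = maxOutstanding;
  }

  // outstandingReported: unprocessed commands the DN reported in its heartbeat.
  // pending: commands the NN would like to send to that DN now.
  int commandsToSend(int outstandingReported, int pending) {
    int budget = Math.max(0, maxOutstanding - outstandingReported);
    return Math.min(budget, pending);
  }
}
```

A rate-based signal, the other option the issue mentions, would need a time window and is not covered by this sketch.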
[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
[ https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HDFS-5926: - Status: Patch Available (was: Open) > documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity > - > > Key: HDFS-5926 > URL: https://issues.apache.org/jira/browse/HDFS-5926 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 0.20.2 >Reporter: Alexander Fahlke >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-5926-1.patch > > > I'm using hadoop-0.20.2 on Debian Squeeze and ran into the same confusion as > many others with the parameter for dfs.datanode.du.reserved. One day some > data nodes got out of disk errors although there was space left on the disks. > The following values are rounded to make the problem more clear: > - the disk for the DFS data has 1000GB and only one Partition (ext3) for DFS > data > - you plan to set the dfs.datanode.du.reserved to 20GB > - the reserved reserved-blocks-percentage by tune2fs is 5% (the default) > That gives all users, except root, 5% less capacity that they can use. > Although the System reports the total of 1000GB as usable for all users via > df. The hadoop-deamons are not running as root. > If i read it right, than hadoop get's the free capacity via df. > > Starting in > {{/src/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java}} on line > 350: {{return usage.getCapacity()-reserved;}} > going to {{/src/core/org/apache/hadoop/fs/DF.java}} which says: > {{"Filesystem disk space usage statistics. Uses the unix 'df' program"}} > When you have 5% reserved by tune2fs (in our case 50GB) and you give > dfs.datanode.du.reserved only 20GB, than you can possibly ran into out of > disk errors that hadoop can't handle. > In this case you must add the planned 20GB du reserved to the reserved > capacity by tune2fs. This results in (at least) 70GB for > dfs.datanode.du.reserved in my case. 
> Two ideas: > # The documentation must be clear at this point to avoid this problem. > # Hadoop could check for reserved space by tune2fs (or other tools) and add > this value to the dfs.datanode.du.reserved parameter. > This ticket is a follow up from the Mailinglist: > https://mail-archives.apache.org/mod_mbox/hadoop-common-user/201312.mbox/%3CCAHodO=Kbv=13T=2otz+s8nsodbs1icnzqyxt_0wdfxy5gks...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
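The arithmetic in the description above (1000GB disk, 5% reserved by tune2fs, 20GB planned for dfs.datanode.du.reserved) can be written out as a small sketch. The class and method names are hypothetical and the values are the rounded ones from the report, in GB; this is only the calculation the reporter describes, not code from Hadoop.

```java
// Hypothetical helper reproducing the rounded numbers from the report.
final class ReservedSpaceSketch {
  // Space the filesystem keeps back for root (tune2fs reserved-blocks-percentage).
  static long filesystemReservedGb(long capacityGb, int reservedBlocksPercent) {
    return capacityGb * reservedBlocksPercent / 100;
  }

  // Value to configure so that the planned headroom really stays free for
  // non-root users: filesystem reservation plus the planned amount.
  static long requiredDuReservedGb(long capacityGb, int reservedBlocksPercent,
      long plannedGb) {
    return filesystemReservedGb(capacityGb, reservedBlocksPercent) + plannedGb;
  }
}
```

For the reported setup, filesystemReservedGb(1000, 5) gives 50 and requiredDuReservedGb(1000, 5, 20) gives 70, matching the "(at least) 70GB" in the description.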
[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
[ https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HDFS-5926: - Attachment: HDFS-5926-1.patch > documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity > - > > Key: HDFS-5926 > URL: https://issues.apache.org/jira/browse/HDFS-5926 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 0.20.2 >Reporter: Alexander Fahlke >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-5926-1.patch > > > I'm using hadoop-0.20.2 on Debian Squeeze and ran into the same confusion as > many others with the parameter dfs.datanode.du.reserved. One day some > data nodes got out-of-disk errors although there was space left on the disks. > The following values are rounded to make the problem clearer: > - the disk for the DFS data has 1000GB and only one partition (ext3) for DFS > data > - you plan to set dfs.datanode.du.reserved to 20GB > - the reserved-blocks-percentage set by tune2fs is 5% (the default) > That gives all users except root 5% less capacity that they can use, > although the system reports the total of 1000GB as usable for all users via > df. The Hadoop daemons are not running as root. > If I read it right, then Hadoop gets the free capacity via df. > > Starting in > {{/src/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java}} on line > 350: {{return usage.getCapacity()-reserved;}} > going to {{/src/core/org/apache/hadoop/fs/DF.java}} which says: > {{"Filesystem disk space usage statistics. Uses the unix 'df' program"}} > When you have 5% reserved by tune2fs (in our case 50GB) and you give > dfs.datanode.du.reserved only 20GB, then you can run into out-of-disk > errors that Hadoop can't handle. > In this case you must add the planned 20GB du reserved to the capacity > reserved by tune2fs. This results in (at least) 70GB for > dfs.datanode.du.reserved in my case.
> Two ideas: > # The documentation must be clear on this point to avoid this problem. > # Hadoop could check for space reserved by tune2fs (or other tools) and add > this value to the dfs.datanode.du.reserved parameter. > This ticket is a follow-up from the mailing list: > https://mail-archives.apache.org/mod_mbox/hadoop-common-user/201312.mbox/%3CCAHodO=Kbv=13T=2otz+s8nsodbs1icnzqyxt_0wdfxy5gks...@mail.gmail.com%3E
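The arithmetic in the ticket above can be written out as a small worked example. This is only a sketch of the reporter's reasoning for the 0.20.2 behaviour he describes, not Hadoop code: the class and method names are hypothetical, and the sizes are treated as binary gigabytes (GiB), which is an assumption since the ticket only says "GB".

```java
/** Hypothetical helper; illustrates the ticket's arithmetic only. */
public final class DuReservedEstimate {
    private static final long GIB = 1024L * 1024L * 1024L;

    /** Bytes the filesystem holds back for root (the tune2fs reserved-blocks percentage). */
    public static long fsReservedBytes(long capacityBytes, int reservedPercent) {
        return capacityBytes / 100 * reservedPercent;
    }

    /** Filesystem reserve plus the headroom the operator actually wants to keep free. */
    public static long suggestedDuReserved(long capacityBytes, int reservedPercent,
                                           long plannedReserveBytes) {
        return fsReservedBytes(capacityBytes, reservedPercent) + plannedReserveBytes;
    }

    public static void main(String[] args) {
        long capacity = 1000L * GIB; // disk size from the ticket
        long planned = 20L * GIB;    // intended dfs.datanode.du.reserved
        long suggested = suggestedDuReserved(capacity, 5, planned);
        // 50 GiB (5% of 1000 GiB) + 20 GiB = 70 GiB
        System.out.println(suggested / GIB + " GiB = " + suggested + " bytes");
    }
}
```

The byte value is what would go into dfs.datanode.du.reserved under these assumptions; whether the addition is needed depends on how a given Hadoop version computes usable capacity.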
[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
[ https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HDFS-5926: - Attachment: (was: HDFS-5926-1.patch) > documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity > - > > Key: HDFS-5926 > URL: https://issues.apache.org/jira/browse/HDFS-5926 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 0.20.2 >Reporter: Alexander Fahlke >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-5926-1.patch > > > I'm using hadoop-0.20.2 on Debian Squeeze and ran into the same confusion as > many others with the parameter dfs.datanode.du.reserved. One day some > data nodes got out-of-disk errors although there was space left on the disks. > The following values are rounded to make the problem clearer: > - the disk for the DFS data has 1000GB and only one partition (ext3) for DFS > data > - you plan to set dfs.datanode.du.reserved to 20GB > - the reserved-blocks-percentage set by tune2fs is 5% (the default) > That gives all users except root 5% less capacity that they can use, > although the system reports the total of 1000GB as usable for all users via > df. The Hadoop daemons are not running as root. > If I read it right, then Hadoop gets the free capacity via df. > > Starting in > {{/src/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java}} on line > 350: {{return usage.getCapacity()-reserved;}} > going to {{/src/core/org/apache/hadoop/fs/DF.java}} which says: > {{"Filesystem disk space usage statistics. Uses the unix 'df' program"}} > When you have 5% reserved by tune2fs (in our case 50GB) and you give > dfs.datanode.du.reserved only 20GB, then you can run into out-of-disk > errors that Hadoop can't handle. > In this case you must add the planned 20GB du reserved to the capacity > reserved by tune2fs. This results in (at least) 70GB for > dfs.datanode.du.reserved in my case.
> Two ideas: > # The documentation must be clear on this point to avoid this problem. > # Hadoop could check for space reserved by tune2fs (or other tools) and add > this value to the dfs.datanode.du.reserved parameter. > This ticket is a follow-up from the mailing list: > https://mail-archives.apache.org/mod_mbox/hadoop-common-user/201312.mbox/%3CCAHodO=Kbv=13T=2otz+s8nsodbs1icnzqyxt_0wdfxy5gks...@mail.gmail.com%3E
[jira] [Created] (HDFS-12647) DN commands processing should be async
Daryn Sharp created HDFS-12647: -- Summary: DN commands processing should be async Key: HDFS-12647 URL: https://issues.apache.org/jira/browse/HDFS-12647 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.8.0 Reporter: Daryn Sharp Due to dataset lock contention, service actors may encounter significant latency while processing DN commands. Even the queuing of async deletions requires multiple lock acquisitions. A slow disk will cause a backlog of xceivers instantiating block senders/receivers, which starves the actor and leads to the NN falsely declaring the node dead. Async processing of all commands will free the actor to perform its primary purpose of heartbeating and block reporting. Note that FBRs will be dependent on queued block invalidations not being included in the report.
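The general idea of handing commands to a background worker so the heartbeating thread never blocks on them can be sketched as follows. This is a minimal illustration using only java.util.concurrent, not the DataNode's actual design; the class name and methods are hypothetical, and commands are modelled as plain Runnables.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: commands are queued and run in order on a separate thread. */
public final class AsyncCommandQueue {
    // A single worker preserves the order in which commands were received.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicInteger processed = new AtomicInteger();

    /** Called from the heartbeat thread; returns immediately without running the command. */
    public void enqueue(Runnable command) {
        worker.execute(() -> {
            try {
                command.run(); // may block on locks or slow disks without stalling heartbeats
            } finally {
                processed.incrementAndGet();
            }
        });
    }

    /** Number of commands that have finished, successfully or not. */
    public int processedCount() {
        return processed.get();
    }

    /** Stops accepting commands and waits for the queue to drain; false on timeout or interrupt. */
    public boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

An unbounded queue like this one trades heartbeat latency for memory; a real design would also have to deal with the ticket's caveat that full block reports must account for invalidations still sitting in the queue.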
[jira] [Created] (HDFS-12646) Avoid IO while holding the FsDataset lock
Daryn Sharp created HDFS-12646: -- Summary: Avoid IO while holding the FsDataset lock Key: HDFS-12646 URL: https://issues.apache.org/jira/browse/HDFS-12646 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.8.0 Reporter: Daryn Sharp IO operations should not be allowed while holding the dataset lock. Notable offenders include, but are not limited to, the instantiation of a block sender/receiver, constructing the path to a block, and unfinalizing a block.
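The pattern the summary asks for, doing only in-memory work under the lock and touching the disk afterwards, might look like the sketch below. It is a generic illustration, not FsDatasetImpl code: the class, the lock object, and the block-to-path map are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: look up metadata under the lock, perform IO outside it. */
public final class BlockIndex {
    private final Object datasetLock = new Object();
    private final Map<Long, Path> blockFiles = new HashMap<>();

    public void add(long blockId, Path file) {
        synchronized (datasetLock) {
            blockFiles.put(blockId, file); // in-memory update only
        }
    }

    /** Returns the on-disk length of a block, or -1 if the block is unknown. */
    public long blockLength(long blockId) {
        Path file;
        synchronized (datasetLock) {
            file = blockFiles.get(blockId); // no disk access while the lock is held
        }
        if (file == null) {
            return -1L;
        }
        try {
            return Files.size(file); // a slow disk stalls only this caller, not lock waiters
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost of this pattern is a window between the lookup and the IO in which the file may be removed, so callers have to tolerate a failure there.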
[jira] [Commented] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements
[ https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202006#comment-16202006 ] Hadoop QA commented on HDFS-12570: --
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 15 new or modified test files. |
|| || || || HDFS-10285 Compile Tests ||
| +1 | mvninstall | 15m 19s | HDFS-10285 passed |
| +1 | compile | 1m 0s | HDFS-10285 passed |
| +1 | checkstyle | 0m 59s | HDFS-10285 passed |
| +1 | mvnsite | 1m 7s | HDFS-10285 passed |
| -1 | shadedclient | 12m 4s | branch has errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 46s | HDFS-10285 passed |
| +1 | javadoc | 0m 43s | HDFS-10285 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 53s | the patch passed |
| +1 | compile | 0m 47s | the patch passed |
| +1 | cc | 0m 47s | the patch passed |
| +1 | javac | 0m 47s | the patch passed |
| -0 | checkstyle | 0m 51s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 1142 unchanged - 2 fixed = 1152 total (was 1144) |
| +1 | mvnsite | 0m 55s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| -1 | shadedclient | 10m 47s | patch has errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 49s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 119m 19s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 169m 3s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
| | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
| | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
| | hadoop.hdfs.TestLeaseRecoveryStriped |
| | hadoop.hdfs.TestFileAppend |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
| | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithStripedFile |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
| | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-12570 |
| JIRA Patch URL | https://issues.ap
[jira] [Created] (HDFS-12645) FSDatasetImpl lock will stall BP service actors and may cause missing blocks
Daryn Sharp created HDFS-12645: -- Summary: FSDatasetImpl lock will stall BP service actors and may cause missing blocks Key: HDFS-12645 URL: https://issues.apache.org/jira/browse/HDFS-12645 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.8.0 Reporter: Daryn Sharp The DN is extremely susceptible to a slow volume due to bad locking practices. DN operations require a fs dataset lock. IO in the dataset lock should not be permissible as it leads to severe performance degradation and possibly (temporarily) missing blocks. A slow disk will cause pipelines to experience significant latency and timeouts, increasing lock/io contention while cleaning up, leading to more timeouts, etc. Meanwhile, the actor service thread is interleaving multiple lock acquire/releases with xceivers. If many commands are issued, the node may be incorrectly declared as dead. HDFS-12639 documents that both actors synchronize on the offer service lock while processing commands. A backlogged active actor will block the standby actor and cause it to go dead too.
[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201997#comment-16201997 ] Rushabh S Shah commented on HDFS-11754: --- [~subru]: I feel the patch is very close to done. [~erofeev]: do you have some bandwidth to address my review comments? If not, we can push it to the next release. > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Fix For: 2.9.0 > > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60-minute period is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we could make > this configurable and keep the default at 60 minutes.
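What "configurable" means for a cached value like this can be illustrated with a small generic cache whose validity period is a constructor argument instead of a hard-coded 60 minutes. This is a sketch under assumptions, not the DFSClient code or the attached patch: the class name is hypothetical, and the clock is injected only to make the behaviour easy to check.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch: caches one value and reloads it once the validity period has passed. */
public final class ExpiringValue<T> {
    private final long validityMillis; // would come from configuration rather than a constant
    private final LongSupplier clock;  // e.g. System::currentTimeMillis
    private final Supplier<T> loader;  // e.g. an RPC fetching the server defaults
    private T value;
    private long loadedAt;
    private boolean loaded;

    public ExpiringValue(long validityMillis, LongSupplier clock, Supplier<T> loader) {
        this.validityMillis = validityMillis;
        this.clock = clock;
        this.loader = loader;
    }

    /** Returns the cached value, reloading it first if it was never loaded or has expired. */
    public synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= validityMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

With a 60-minute default, `new ExpiringValue<>(TimeUnit.MINUTES.toMillis(60), System::currentTimeMillis, loader)` would reproduce the current behaviour, and a shorter period could be supplied from configuration.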
[jira] [Updated] (HDFS-12519) Ozone: Lease Manager framework
[ https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandakumar updated HDFS-12519: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Ozone: Lease Manager framework > -- > > Key: HDFS-12519 > URL: https://issues.apache.org/jira/browse/HDFS-12519 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Anu Engineer >Assignee: Nandakumar > Labels: ozoneMerge > Attachments: HDFS-12519-HDFS-7240.000.patch, > HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, > HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch > > > Many objects, including containers and pipelines, can time out during the creation > process. We need a way to track these timeouts. This Lease Manager allows SCM > to hold a lease on these objects and helps SCM time out while waiting for the creation > of these objects.
[jira] [Updated] (HDFS-5926) documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity
[ https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HDFS-5926: - Attachment: HDFS-5926-1.patch > documation should clarify dfs.datanode.du.reserved wrt reserved disk capacity > - > > Key: HDFS-5926 > URL: https://issues.apache.org/jira/browse/HDFS-5926 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 0.20.2 >Reporter: Alexander Fahlke >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-5926-1.patch > > > I'm using hadoop-0.20.2 on Debian Squeeze and ran into the same confusion as > many others with the parameter dfs.datanode.du.reserved. One day some > data nodes got out-of-disk errors although there was space left on the disks. > The following values are rounded to make the problem clearer: > - the disk for the DFS data has 1000GB and only one partition (ext3) for DFS > data > - you plan to set dfs.datanode.du.reserved to 20GB > - the reserved-blocks-percentage set by tune2fs is 5% (the default) > That gives all users except root 5% less capacity that they can use, > although the system reports the total of 1000GB as usable for all users via > df. The Hadoop daemons are not running as root. > If I read it right, then Hadoop gets the free capacity via df. > > Starting in > {{/src/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java}} on line > 350: {{return usage.getCapacity()-reserved;}} > going to {{/src/core/org/apache/hadoop/fs/DF.java}} which says: > {{"Filesystem disk space usage statistics. Uses the unix 'df' program"}} > When you have 5% reserved by tune2fs (in our case 50GB) and you give > dfs.datanode.du.reserved only 20GB, then you can run into out-of-disk > errors that Hadoop can't handle. > In this case you must add the planned 20GB du reserved to the capacity > reserved by tune2fs. This results in (at least) 70GB for > dfs.datanode.du.reserved in my case.
> Two ideas: > # The documentation must be clear on this point to avoid this problem. > # Hadoop could check for space reserved by tune2fs (or other tools) and add > this value to the dfs.datanode.du.reserved parameter. > This ticket is a follow-up from the mailing list: > https://mail-archives.apache.org/mod_mbox/hadoop-common-user/201312.mbox/%3CCAHodO=Kbv=13T=2otz+s8nsodbs1icnzqyxt_0wdfxy5gks...@mail.gmail.com%3E
[jira] [Commented] (HDFS-12519) Ozone: Lease Manager framework
[ https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201953#comment-16201953 ] Nandakumar commented on HDFS-12519: --- Committed this to HDFS-7240. Thanks [~linyiqun], [~vagarychen] & [~anu] for the review. > Ozone: Lease Manager framework > -- > > Key: HDFS-12519 > URL: https://issues.apache.org/jira/browse/HDFS-12519 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Anu Engineer >Assignee: Nandakumar > Labels: ozoneMerge > Attachments: HDFS-12519-HDFS-7240.000.patch, > HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, > HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch > > > Many objects, including containers and pipelines, can time out during the creation > process. We need a way to track these timeouts. This Lease Manager allows SCM > to hold a lease on these objects and helps SCM time out while waiting for the creation > of these objects.
[jira] [Updated] (HDFS-12519) Ozone: Lease Manager framework
[ https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandakumar updated HDFS-12519: -- Summary: Ozone: Lease Manager framework (was: Ozone: Add a Lease Manager to SCM) > Ozone: Lease Manager framework > -- > > Key: HDFS-12519 > URL: https://issues.apache.org/jira/browse/HDFS-12519 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Anu Engineer >Assignee: Nandakumar > Labels: ozoneMerge > Attachments: HDFS-12519-HDFS-7240.000.patch, > HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, > HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch > > > Many objects, including containers and pipelines, can time out during the creation > process. We need a way to track these timeouts. This Lease Manager allows SCM > to hold a lease on these objects and helps SCM time out while waiting for the creation > of these objects.
[jira] [Commented] (HDFS-12519) Ozone: Add a Lease Manager to SCM
[ https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201930#comment-16201930 ] Nandakumar commented on HDFS-12519: --- Test failures are not related, will commit it shortly. > Ozone: Add a Lease Manager to SCM > - > > Key: HDFS-12519 > URL: https://issues.apache.org/jira/browse/HDFS-12519 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Anu Engineer >Assignee: Nandakumar > Labels: ozoneMerge > Attachments: HDFS-12519-HDFS-7240.000.patch, > HDFS-12519-HDFS-7240.001.patch, HDFS-12519-HDFS-7240.002.patch, > HDFS-12519-HDFS-7240.003.patch, HDFS-12519-HDFS-7240.003.patch > > > Many objects, including containers and pipelines, can time out during the creation > process. We need a way to track these timeouts. This Lease Manager allows SCM > to hold a lease on these objects and helps SCM time out while waiting for the creation > of these objects.
[jira] [Commented] (HDFS-12490) Ozone: OzoneClient: Add creation/modification time information in OzoneVolume/OzoneBucket/OzoneKey
[ https://issues.apache.org/jira/browse/HDFS-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201903#comment-16201903 ] Hadoop QA commented on HDFS-12490: --
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || HDFS-7240 Compile Tests ||
| 0 | mvndep | 0m 40s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 24s | HDFS-7240 passed |
| +1 | compile | 2m 17s | HDFS-7240 passed |
| +1 | checkstyle | 0m 56s | HDFS-7240 passed |
| +1 | mvnsite | 2m 19s | HDFS-7240 passed |
| +1 | shadedclient | 14m 46s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 48s | HDFS-7240 passed |
| +1 | javadoc | 2m 19s | HDFS-7240 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 9s | the patch passed |
| +1 | compile | 2m 13s | the patch passed |
| +1 | javac | 2m 13s | the patch passed |
| -0 | checkstyle | 0m 54s | hadoop-hdfs-project: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | mvnsite | 2m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 33s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 5m 9s | the patch passed |
| +1 | javadoc | 2m 19s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 56s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 134m 18s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 210m 44s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
| | hadoop.hdfs.TestDatanodeReport |
| | hadoop.ozone.web.client.TestKeys |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
| | hadoop.hdfs.server.namenode.TestFileTruncate |
| | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |

|| Subsystem || Repo
[jira] [Commented] (HDFS-12519) Ozone: Add a Lease Manager to SCM
[ https://issues.apache.org/jira/browse/HDFS-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201848#comment-16201848 ] Hadoop QA commented on HDFS-12519: --
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || HDFS-7240 Compile Tests ||
| +1 | mvninstall | 15m 34s | HDFS-7240 passed |
| +1 | compile | 0m 54s | HDFS-7240 passed |
| +1 | checkstyle | 0m 41s | HDFS-7240 passed |
| +1 | mvnsite | 1m 5s | HDFS-7240 passed |
| +1 | shadedclient | 12m 1s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 7s | HDFS-7240 passed |
| +1 | javadoc | 0m 58s | HDFS-7240 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 0m 52s | the patch passed |
| +1 | javac | 0m 52s | the patch passed |
| +1 | checkstyle | 0m 37s | the patch passed |
| +1 | mvnsite | 0m 57s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 21s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 11s | the patch passed |
| +1 | javadoc | 0m 58s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 97m 36s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 148m 56s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.hdfs.TestSafeMode |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12519 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12891674/HDFS-12519-HDFS-7240.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 06569d1c7fbc 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / 034f01a |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21666/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21666/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21666/console |
| Powered by | Apache Yetus 0.6
[jira] [Commented] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements
[ https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201817#comment-16201817 ] Rakesh R commented on HDFS-12570: - Attached another patch fixing the {{TestStoragePolicySatisfierWithStripedFile}} and {{TestHdfsConfigFields}} test failures. I see that the {{TestPersistentStoragePolicySatisfier}} failures will be fixed as part of HDFS-12556. > [SPS]: Refactor Co-ordinator datanode logic to track the block storage > movements > > > Key: HDFS-12570 > URL: https://issues.apache.org/jira/browse/HDFS-12570 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-12570-HDFS-10285-00.patch, > HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, > HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch > > > This task is to refactor the C-DN block storage movements. Basically, the > idea is to move the scheduling and tracking logic to the Namenode rather than keeping it at > the special C-DN. Please refer to the discussion with [~andrew.wang] to > understand the [background and the necessity of > refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060].
[jira] [Updated] (HDFS-12570) [SPS]: Refactor Co-ordinator datanode logic to track the block storage movements
[ https://issues.apache.org/jira/browse/HDFS-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-12570: Attachment: HDFS-12570-HDFS-10285-04.patch > [SPS]: Refactor Co-ordinator datanode logic to track the block storage > movements > > > Key: HDFS-12570 > URL: https://issues.apache.org/jira/browse/HDFS-12570 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-12570-HDFS-10285-00.patch, > HDFS-12570-HDFS-10285-01.patch, HDFS-12570-HDFS-10285-02.patch, > HDFS-12570-HDFS-10285-03.patch, HDFS-12570-HDFS-10285-04.patch > > > This task is to refactor the C-DN block storage movements. Basically, the > idea is to move the scheduling and tracking logic to the Namenode rather than > keeping it at the special C-DN. Please refer to the discussion with [~andrew.wang] to > understand the [background and the necessity of > refactoring|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060].
[jira] [Created] (HDFS-12644) Offer a non-privileged listEncryptionZone operation
Wei-Chiu Chuang created HDFS-12644: -- Summary: Offer a non-privileged listEncryptionZone operation Key: HDFS-12644 URL: https://issues.apache.org/jira/browse/HDFS-12644 Project: Hadoop HDFS Issue Type: Improvement Components: encryption, namenode Affects Versions: 3.0.0-alpha1, 2.8.0 Reporter: Wei-Chiu Chuang Assignee: Wei-Chiu Chuang As discussed in HDFS-12484, we can consider adding a non-privileged listEncryptionZone for better user experience.
[jira] [Commented] (HDFS-12484) Undefined -expunge behavior after 2.8
[ https://issues.apache.org/jira/browse/HDFS-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201776#comment-16201776 ] Wei-Chiu Chuang commented on HDFS-12484: Thanks [~xyao] and sorry for the late update. I was on vacation. I am a little hesitant to take the latter route, since I am not so sure about the purpose of listEncryptionZone; specifically, why is it a privileged operation? Let's file a new jira to initiate the discussion. > Undefined -expunge behavior after 2.8 > - > > Key: HDFS-12484 > URL: https://issues.apache.org/jira/browse/HDFS-12484 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-12484.001.patch, HDFS-12484.002.patch > > > (Rewrote the description to reflect the actual behavior) > Hadoop 2.8 added a feature to support trash inside encryption zones, which is > a great feature to have. > However, when it comes to -expunge, the behavior is not well defined. A > superuser invoking -expunge removes files under all encryption zone trash > directories belonging to the user. On the other hand, because > listEncryptionZones requires superuser permission, a non-privileged user > invoking -expunge can remove files under the home directory, but not under encryption > zones. > Moreover, the command prints a scary warning message that looks annoying. > {noformat} > 2017-09-21 01:22:44,744 [main] WARN hdfs.DFSClient > (DistributedFileSystem.java:getTrashRoots(2795)) - Cannot get all encrypted > trash roots > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): > Access denied for user user.
Superuser privilege is required > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:130) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:4556) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.listEncryptionZones(FSNamesystem.java:7048) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.listEncryptionZones(NameNodeRpcServer.java:2053) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.listEncryptionZones(ClientNamenodeProtocolServerSideTranslatorPB.java:1477) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1490) > at org.apache.hadoop.ipc.Client.call(Client.java:1436) > at org.apache.hadoop.ipc.Client.call(Client.java:1346) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) > at com.sun.proxy.$Proxy25.listEncryptionZones(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listEncryptionZones(ClientNamenodeProtocolTranslatorPB.java:1510) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > at com.sun.proxy.$Proxy29.listEncryptionZones(Unknown Source) > at > org.apache.hadoop.hdfs.protocol.EncryptionZoneIterator.makeRequest(Encrypti
[jira] [Resolved] (HDFS-11797) BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException when corrupt replicas are inconsistent
[ https://issues.apache.org/jira/browse/HDFS-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDFS-11797. Resolution: Duplicate I'm going to close it as a dup of HDFS-11445. Feel free to reopen if this is not the case. Thanks [~kshukla]! > BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException > when corrupt replicas are inconsistent > -- > > Key: HDFS-11797 > URL: https://issues.apache.org/jira/browse/HDFS-11797 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Critical > Attachments: HDFS-11797.001.patch > > > The calculated value of {{numMachines}} can be too low (causing > ArrayIndexOutOfBoundsException) or too high (causing NPE (HDFS-9958)) if the data > structures report an inconsistent number of corrupt replicas. This was earlier > found to be related to failed storages. This JIRA tracks a change that works for > all possible cases of inconsistencies.