[jira] [Commented] (HDFS-13782) ObserverReadProxyProvider should work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592734#comment-16592734 ] Konstantin Shvachko commented on HDFS-13782: I just committed this.
> ObserverReadProxyProvider should work with IPFailoverProxyProvider
> ---
>
> Key: HDFS-13782
> URL: https://issues.apache.org/jira/browse/HDFS-13782
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: test
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Priority: Major
> Fix For: HDFS-12943
>
> Attachments: HDFS-13782-HDFS-12943.001.patch, HDFS-13782-HDFS-12943.002.patch
>
>
> Currently {{ObserverReadProxyProvider}} is based on {{ConfiguredFailoverProxyProvider}}. We should also be able to perform SBN reads in the case of {{IPFailoverProxyProvider}}.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13782) ObserverReadProxyProvider should work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-13782: --- Release Note: (was: I just committed this.)
[jira] [Resolved] (HDFS-13782) ObserverReadProxyProvider should work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko resolved HDFS-13782. Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-12943 Release Note: I just committed this.
[jira] [Comment Edited] (HDFS-13836) RBF: To handle the exception when the mounttable znode have null value.
[ https://issues.apache.org/jira/browse/HDFS-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592616#comment-16592616 ] yanghuafeng edited comment on HDFS-13836 at 8/25/18 3:10 PM: - I have found that it may be better to handle the null value in ZkCuratorManager.getString(). But getString() can throw an exception, including the NPE. In StateStoreZooKeeperImpl.get() we catch the Exception, but we only log the error and do not delete the corrupted znode. Alternatively, we could check for the NPE in the catch clause and delete the znode there; in the current patch we just check for null in advance. Compared to handling the NPE in ZkCuratorManager, I am not sure which is better.
{code:java}
try {
  String path = getNodePath(znode, child);
  Stat stat = new Stat();
  String data = zkManager.getStringData(path, stat);
  ..
} catch (Exception e) {
  LOG.error("Cannot get data for {}: {}", child, e.getMessage());
}
{code}
> RBF: To handle the exception when the mounttable znode have null value.
> ---
>
> Key: HDFS-13836
> URL: https://issues.apache.org/jira/browse/HDFS-13836
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: federation, hdfs
> Affects Versions: 3.1.0
> Reporter: yanghuafeng
> Assignee: yanghuafeng
> Priority: Major
> Fix For: 2.9.0, 3.0.0, 3.1.0, 3.2.0
>
> Attachments: HDFS-13836.001.patch, HDFS-13836.002.patch, HDFS-13836.003.patch, HDFS-13836.004.patch
>
>
> While a mount table entry was being added, the router server was terminated. Error messages like the following show up in the log:
> 2018-08-20 14:18:32,404 ERROR org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl: Cannot get data for 0SLASH0testzk: null.
> The reason is that the router server had created the znode but was terminated before setting its data. The method zkManager.getStringData(path, stat) in StateStoreZooKeeperImpl will throw an NPE if the znode has a null value, causing subsequent attempts to add the same mount table entry or delete the existing znode to fail.
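A minimal, self-contained sketch of the "check for null in advance and purge the corrupted znode" option discussed above, using a plain Map as a hypothetical stand-in for the ZooKeeper client (getNodePath, zkManager, and the State Store classes are not reproduced here; all names below are illustrative, not the real API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NullZnodeSketch {
    /**
     * Return the data for child, or empty if the znode exists with a null
     * value. A null value means the writer created the znode but died before
     * setting its data, so the corrupted entry is deleted up front to let a
     * later add of the same mount table entry succeed.
     */
    static Optional<String> getOrPurge(Map<String, String> store, String child) {
        if (!store.containsKey(child)) {
            return Optional.empty();           // znode does not exist
        }
        String data = store.get(child);
        if (data == null) {                    // corrupted: created but never written
            store.remove(child);               // purge so re-adding can succeed
            return Optional.empty();
        }
        return Optional.of(data);
    }

    public static void main(String[] args) {
        Map<String, String> zk = new HashMap<>();
        zk.put("0SLASH0testzk", null);         // simulates the half-created znode
        System.out.println(getOrPurge(zk, "0SLASH0testzk").isPresent()); // false
        System.out.println(zk.containsKey("0SLASH0testzk"));            // false: purged
    }
}
```

The alternative in the comment, catching the NPE inside the catch clause and deleting the znode there, would reach the same end state; the up-front null check just avoids relying on an exception for control flow.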
[jira] [Commented] (HDFS-13844) Refactor the fmt_bytes function in the dfs-dust.js.
[ https://issues.apache.org/jira/browse/HDFS-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592600#comment-16592600 ] yanghuafeng commented on HDFS-13844: It is just an example, because it is difficult to reach a capacity at the last unit (ZB) in our environment. We can shrink the unit list and simulate a situation that overflows the last unit to demonstrate the problem.
> Refactor the fmt_bytes function in the dfs-dust.js.
> ---
>
> Key: HDFS-13844
> URL: https://issues.apache.org/jira/browse/HDFS-13844
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, ui
> Affects Versions: 1.2.0, 2.2.0, 2.7.2, 3.0.0, 3.1.0
> Reporter: yanghuafeng
> Assignee: yanghuafeng
> Priority: Minor
> Attachments: HDFS-13844.001.patch, overflow_undefined_unit.jpg, overflow_unit.jpg, undefined_unit.jpg
>
>
> The NameNode web UI cannot display the capacity with the correct units. I have found that the function fmt_bytes in dfs-dust.js is missing the EB unit, which leads to an undefined unit in the UI.
> And although the ZB unit is very large, we should take unit overflow into consideration. Supposing the last unit were GB, a total capacity of 8 TB should be displayed as 8192 GB rather than "8 undefined".
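The clamping behavior described above (stop dividing once the last unit in the table is reached, so the index can never run past the end of the unit list) can be sketched as follows. This is a Java stand-in for the JavaScript fmt_bytes, not the actual dfs-dust.js code; the truncated unit table is only for demonstrating the overflow case:

```java
import java.util.Locale;

public class FmtBytesSketch {
    /** Format v bytes against the given unit table, clamping at the last unit. */
    static String fmtBytes(double v, String[] units) {
        int prev = 0;
        // only move to the next unit while one actually exists, so an
        // out-of-range index (rendered as "undefined" in JavaScript) is impossible
        while (v >= 1024 && prev < units.length - 1) {
            v /= 1024;
            prev++;
        }
        return String.format(Locale.ROOT, "%.2f %s", v, units[prev]);
    }

    public static void main(String[] args) {
        String[] truncated = {"B", "KB", "MB", "GB"};      // last unit is GB
        double eightTb = 8.0 * 1024 * 1024 * 1024 * 1024;  // 8 TB in bytes
        // 8 TB against a GB-terminated table clamps to 8192 GB, not "8 undefined"
        System.out.println(fmtBytes(eightTb, truncated));  // 8192.00 GB
    }
}
```

With the full table ending in ZB (and the missing EB entry restored), the same loop both fixes the undefined-unit bug and makes overflow past the last unit impossible.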
[jira] [Commented] (HDFS-13854) RBF: The ProcessingAvgTime and ProxyAvgTime should display by JMX with ms unit.
[ https://issues.apache.org/jira/browse/HDFS-13854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592596#comment-16592596 ] yanghuafeng commented on HDFS-13854: It is not wrong; this is just an improvement to unify the time unit. In the previous code, nanoseconds are not a good unit to display in JMX. Since we already provide toMS() to convert the unit and nanosecond precision is not really essential, would it not be best to unify the proxy time unit to milliseconds when storing it?
> RBF: The ProcessingAvgTime and ProxyAvgTime should display by JMX with ms unit.
> ---
>
> Key: HDFS-13854
> URL: https://issues.apache.org/jira/browse/HDFS-13854
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: federation, hdfs
> Affects Versions: 2.9.0, 3.0.0, 3.1.0
> Reporter: yanghuafeng
> Assignee: yanghuafeng
> Priority: Major
> Attachments: HDFS-13854.001.patch, HDFS-13854.002.patch, ganglia_jmx_compare1.jpg, ganglia_jmx_compare2.jpg
>
>
> In FederationRPCMetrics, the proxy time and processing time should be exposed to JMX or Ganglia in ms units. Although the method toMS() exists, we cannot get the correct proxy time and processing time via JMX and Ganglia.
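Converting at record time rather than at display time, as proposed above, can be sketched as below. The class and method names are hypothetical stand-ins, not the real FederationRPCMetrics API; the point is only that the counter stores milliseconds, so every consumer (JMX, Ganglia) sees the same unit:

```java
import java.util.concurrent.TimeUnit;

public class ProxyTimeSketch {
    private long totalProxyMs = 0;
    private long ops = 0;

    /** Record one proxied call, converting the measured nanoseconds to ms on the way in. */
    void addProxyTime(long nanos) {
        totalProxyMs += TimeUnit.NANOSECONDS.toMillis(nanos);
        ops++;
    }

    /** Average proxy time in ms, already in the unit JMX should expose. */
    long getProxyAvgMs() {
        return ops == 0 ? 0 : totalProxyMs / ops;
    }

    public static void main(String[] args) {
        ProxyTimeSketch m = new ProxyTimeSketch();
        m.addProxyTime(3_000_000);  // a 3 ms call measured in nanoseconds
        m.addProxyTime(5_000_000);  // a 5 ms call
        System.out.println(m.getProxyAvgMs()); // 4
    }
}
```

Storing milliseconds does discard sub-millisecond precision, which is the trade-off the comment argues is acceptable for these metrics.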
[jira] [Commented] (HDDS-247) Handle CLOSED_CONTAINER_IO exception in ozoneClient
[ https://issues.apache.org/jira/browse/HDDS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592566#comment-16592566 ] genericqa commented on HDDS-247: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 2s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s{color} | {color:green} client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 37s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s{color} | {color:green} client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 15s{color} | {color:red} integration-test in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}118m 13s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ozone.web.client.TestKeys | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-247 | | JIRA Patch URL | htt
[jira] [Commented] (HDFS-13830) Backport HDFS-13141 to branch-3.0: WebHDFS: Add support for getting snasphottable directory list
[ https://issues.apache.org/jira/browse/HDFS-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592562#comment-16592562 ] genericqa commented on HDFS-13830: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 0s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-3.0 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 18s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 12s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 44s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 5s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 18s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 16s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 34s{color} | {color:green} branch-3.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 19s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 43s{color} | {color:orange} root: The patch generated 2 new + 281 unchanged - 2 fixed = 283 total (was 283) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 0s{color} | {color:green} hadoop-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 38s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 21s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}217m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | | | hadoop.hdfs.TestLeaseRecovery2 | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:1776208 | | JIRA Issue | HDFS-13830 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12937128/HDFS-13830.branch-3.0.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 36d705a03007 4.4.0-133-generic #159-Ubuntu SMP
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592561#comment-16592561 ] lindongdong commented on HDFS-13671: Hi, [~kihwal], how is the revert work going?
> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> ---
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.1.0, 3.0.3
> Reporter: Yiqun Lin
> Priority: Major
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>         at org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>         at org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>         at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> Actually the first step should be the more expensive operation and take more time. However, we now always see the NN hang during the remove-block operation.
> Looking into this: we introduced a new structure, {{FoldedTreeSet}}, to get better performance when dealing with FBRs/IBRs. But compared with the earlier implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, since it takes additional time to rebalance the tree nodes. When there are a large number of blocks to be removed/deleted, it looks bad.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide {{getBlockIterator}} to return a block iterator, and no other get operation for a specified block. Do we still need to use {{FoldedTreeSet}} in {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not Update. Maybe we can revert this to the earlier implementation.
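The "remove the blocks chunk by chunk in a loop" step above can be sketched as follows. This is a minimal stand-in, not the actual FSNamesystem code; a TreeSet plays the role of the sorted block set only to show where per-element remove cost (O(log n) plus rebalancing in any balanced tree, which is the cost the issue attributes to FoldedTreeSet#removeAndGet) is paid on every deleted block:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.TreeSet;

public class ChunkedRemovalSketch {
    /** Drain the collected block ids from the block map in fixed-size chunks. */
    static int removeInChunks(TreeSet<Long> blockMap, Deque<Long> toDelete, int chunkSize) {
        int chunks = 0;
        while (!toDelete.isEmpty()) {
            for (int i = 0; i < chunkSize && !toDelete.isEmpty(); i++) {
                // each remove pays the tree's per-element removal cost;
                // with millions of blocks this dominates the deletion
                blockMap.remove(toDelete.poll());
            }
            chunks++; // in the NameNode, the namesystem lock would be yielded here
        }
        return chunks;
    }

    public static void main(String[] args) {
        TreeSet<Long> blockMap = new TreeSet<>();
        Deque<Long> toDelete = new ArrayDeque<>();
        for (long b = 0; b < 10; b++) { blockMap.add(b); toDelete.add(b); }
        System.out.println(removeInChunks(blockMap, toDelete, 3)); // 4 chunks
        System.out.println(blockMap.isEmpty()); // true
    }
}
```

Chunking keeps each lock hold short, but it does not reduce the total per-element removal work, which is why the issue suggests reverting to the earlier structure.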
[jira] [Commented] (HDDS-247) Handle CLOSED_CONTAINER_IO exception in ozoneClient
[ https://issues.apache.org/jira/browse/HDDS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592540#comment-16592540 ] Shashikant Banerjee commented on HDDS-247: -- Thanks [~msingh] for the review. Patch v11 addresses your review comments.
> Handle CLOSED_CONTAINER_IO exception in ozoneClient
> ---
>
> Key: HDDS-247
> URL: https://issues.apache.org/jira/browse/HDDS-247
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-247.00.patch, HDDS-247.01.patch, HDDS-247.02.patch, HDDS-247.03.patch, HDDS-247.04.patch, HDDS-247.05.patch, HDDS-247.06.patch, HDDS-247.07.patch, HDDS-247.08.patch, HDDS-247.09.patch, HDDS-247.10.patch, HDDS-247.11.patch
>
>
> In the case of ongoing writes by an Ozone client to a container, the container might get closed on the Datanodes because of node loss, out-of-space issues, etc. In such cases the operation will fail with a CLOSED_CONTAINER_IO exception, and the Ozone client should try to get the committed length of the block from the Datanodes and update the OM. This Jira aims to address this issue.
[jira] [Updated] (HDDS-247) Handle CLOSED_CONTAINER_IO exception in ozoneClient
[ https://issues.apache.org/jira/browse/HDDS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDDS-247: - Attachment: HDDS-247.11.patch
[jira] [Commented] (HDDS-247) Handle CLOSED_CONTAINER_IO exception in ozoneClient
[ https://issues.apache.org/jira/browse/HDDS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592528#comment-16592528 ] Mukul Kumar Singh commented on HDDS-247: Thanks for working on this [~shashikant]. The latest patch looks really good to me. Some really minor comments; I am +1 on the patch after that.
1) ChunkGroupOutputStream:259,260,276,277,396,648: this is an unrelated change.
2) ChunkGroupOutputStream:292-300: the TODO should be moved inside handleCloseContainerException.
3) ChunkGroupOutputStream:631: setCurrentPosition is not used; can we remove this?
4) ChunkOutputStream:113-114: unrelated change.
5) TestCloseContainerHandlingByClient#validateData: the input stream is not closed here.
6) TestCloseContainerHandlingByClient#waitForContainerClose: the wait-for-close loop should be a separate loop; this will help in closing multiple containers in one iteration faster.
7) TestCloseContainerHandlingByClient:95: I feel fixedLengthString can be removed and replaced with RandomStringUtils.random() to generate key data; this will help in validating with random data.
8) TestOmBlockVersioning: wildcard import.
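The recovery flow this issue describes (on CLOSED_CONTAINER_IO, obtain the committed length of the block from the Datanodes, report it to OM, and continue on a newly allocated block) can be sketched with stubs. None of the class or method names below are the real Ozone client API; they are hypothetical stand-ins for the control flow only:

```java
public class ClosedContainerSketch {
    static class ClosedContainerIOException extends Exception {}

    /** Stand-in for a datanode-backed block whose container closes mid-write. */
    static class StubBlock {
        long committed = 0;          // bytes acknowledged by the datanodes
        final long closeAfter;       // container closes once this many bytes land
        StubBlock(long closeAfter) { this.closeAfter = closeAfter; }
        void write(long len) throws ClosedContainerIOException {
            if (committed + len > closeAfter) {
                committed = closeAfter;              // the partial write was committed
                throw new ClosedContainerIOException();
            }
            committed += len;
        }
    }

    /**
     * Write len bytes; on CLOSED_CONTAINER_IO, take the committed length (as
     * the client would report to OM) and finish the remainder on a new block.
     * Returns the total bytes persisted across both blocks.
     */
    static long writeWithRecovery(StubBlock block, StubBlock newBlock, long len)
            throws ClosedContainerIOException {
        try {
            block.write(len);
            return block.committed;
        } catch (ClosedContainerIOException e) {
            long committedLen = block.committed;     // committed length from the DNs
            newBlock.write(len - committedLen);      // rewrite only the remainder
            return committedLen + newBlock.committed; // OM is updated with both lengths
        }
    }

    public static void main(String[] args) throws Exception {
        StubBlock closing = new StubBlock(40);       // container closes after 40 bytes
        StubBlock fresh = new StubBlock(Long.MAX_VALUE);
        System.out.println(writeWithRecovery(closing, fresh, 100)); // 100
    }
}
```

The essential design point is that no acknowledged data is rewritten: only the bytes beyond the committed length go to the new block.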
[jira] [Commented] (HDFS-13830) Backport HDFS-13141 to branch-3.0: WebHDFS: Add support for getting snasphottable directory list
[ https://issues.apache.org/jira/browse/HDFS-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592519#comment-16592519 ] Siyao Meng commented on HDFS-13830: --- Thanks [~jojochuang] for the comment. Removed HDFS-13280 patch in rev 004.
> Backport HDFS-13141 to branch-3.0: WebHDFS: Add support for getting snasphottable directory list
> ---
>
> Key: HDFS-13830
> URL: https://issues.apache.org/jira/browse/HDFS-13830
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: webhdfs
> Affects Versions: 3.0.3
> Reporter: Siyao Meng
> Assignee: Siyao Meng
> Priority: Major
> Attachments: HDFS-13830.branch-3.0.001.patch, HDFS-13830.branch-3.0.002.patch, HDFS-13830.branch-3.0.003.patch, HDFS-13830.branch-3.0.004.patch
>
>
> HDFS-13141 conflicts with 3.0.3 because of an interface change in HdfsFileStatus.
> This Jira aims to backport the WebHDFS getSnapshottableDirListing() support to branch-3.0.
[jira] [Updated] (HDFS-13830) Backport HDFS-13141 to branch-3.0: WebHDFS: Add support for getting snasphottable directory list
[ https://issues.apache.org/jira/browse/HDFS-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng updated HDFS-13830: -- Attachment: HDFS-13830.branch-3.0.004.patch Status: Patch Available (was: In Progress)
[jira] [Updated] (HDFS-13830) Backport HDFS-13141 to branch-3.0: WebHDFS: Add support for getting snasphottable directory list
[ https://issues.apache.org/jira/browse/HDFS-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng updated HDFS-13830: -- Status: In Progress (was: Patch Available)