[jira] [Updated] (HDDS-2528) Sonar : code smell category issues in CommitWatcher
[ https://issues.apache.org/jira/browse/HDDS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2528:
--------------------------------
Summary: Sonar : code smell category issues in CommitWatcher
(was: Sonar : change return type to interface instead of implementation in CommitWatcher)

> Sonar : code smell category issues in CommitWatcher
> ---------------------------------------------------
>
> Key: HDDS-2528
> URL: https://issues.apache.org/jira/browse/HDDS-2528
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
> Labels: sonar
>
> Sonar issues for CommitWatcher.java:
>
> use interface instead of implementation:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVr&open=AW5md-_8KcVY8lQ4ZsVr
>
> redundant return:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVv&open=AW5md-_8KcVY8lQ4ZsVv
>
> format specifiers instead of concatenation in Log:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVu&open=AW5md-_8KcVY8lQ4ZsVu
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVt&open=AW5md-_8KcVY8lQ4ZsVt
>
> redundant temporary variable:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVs&open=AW5md-_8KcVY8lQ4ZsVs

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
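The smells listed for CommitWatcher.java are common Sonar findings. A minimal sketch of the three code-shape issues, with invented names (this is not the actual CommitWatcher code):

```java
import java.util.HashMap;
import java.util.Map;

class CommitWatcherSketch {
    private final Map<Long, Integer> commitIndexMap = new HashMap<>();

    // Smell: returning the implementation type (HashMap<Long, Integer>).
    // Fix: declare the interface type so callers don't bind to HashMap.
    Map<Long, Integer> getCommitIndexMap() {
        return commitIndexMap;
    }

    // Smell: redundant temporary variable, e.g.
    //   int result = commitIndexMap.size(); return result;
    // Fix: return the expression directly.
    int watchedCount() {
        return commitIndexMap.size();
    }

    // Smell: redundant return at the end of a void method.
    void clear() {
        commitIndexMap.clear();
        // (a trailing "return;" here would be the flagged redundant return)
    }
}
```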
[jira] [Updated] (HDDS-2528) Sonar : change return type to interface instead of implementation in CommitWatcher
[ https://issues.apache.org/jira/browse/HDDS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2528:
--------------------------------
Description:
Sonar issues for CommitWatcher.java:

use interface instead of implementation:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVr&open=AW5md-_8KcVY8lQ4ZsVr

redundant return:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVv&open=AW5md-_8KcVY8lQ4ZsVv

format specifiers instead of concatenation in Log:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVu&open=AW5md-_8KcVY8lQ4ZsVu
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVt&open=AW5md-_8KcVY8lQ4ZsVt

redundant temporary variable:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVs&open=AW5md-_8KcVY8lQ4ZsVs

was:
Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVq&open=AW5md-_8KcVY8lQ4ZsVq
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVr&open=AW5md-_8KcVY8lQ4ZsVr

> Sonar : change return type to interface instead of implementation in
> CommitWatcher
> --------------------------------------------------------------------
>
> Key: HDDS-2528
> URL: https://issues.apache.org/jira/browse/HDDS-2528
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
> Labels: sonar
>
> Sonar issues for CommitWatcher.java:
> use interface instead of implementation:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVr&open=AW5md-_8KcVY8lQ4ZsVr
> redundant return:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVv&open=AW5md-_8KcVY8lQ4ZsVv
> format specifiers instead of concatenation in Log:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVu&open=AW5md-_8KcVY8lQ4ZsVu
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVt&open=AW5md-_8KcVY8lQ4ZsVt
> redundant temporary variable:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVs&open=AW5md-_8KcVY8lQ4ZsVs
[jira] [Updated] (HDDS-2586) Sonar : refactor getAvailableNodesCount in NetworkTopologyImpl to reduce cognitive complexity
[ https://issues.apache.org/jira/browse/HDDS-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2586:
--------------------------------
Description:
Sonar reports CC value of 34:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-2rKcVY8lQ4ZsN2&open=AW5md-2rKcVY8lQ4ZsN2

was:
Sonar reports CC value of 16:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-2rKcVY8lQ4ZsN2&open=AW5md-2rKcVY8lQ4ZsN2

> Sonar : refactor getAvailableNodesCount in NetworkTopologyImpl to reduce
> cognitive complexity
> ------------------------------------------------------------------------
>
> Key: HDDS-2586
> URL: https://issues.apache.org/jira/browse/HDDS-2586
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Critical
>
> Sonar reports CC value of 34:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-2rKcVY8lQ4ZsN2&open=AW5md-2rKcVY8lQ4ZsN2
[jira] [Created] (HDDS-2586) Sonar : refactor getAvailableNodesCount in NetworkTopologyImpl to reduce cognitive complexity
Supratim Deka created HDDS-2586:
-----------------------------------

Summary: Sonar : refactor getAvailableNodesCount in NetworkTopologyImpl to reduce cognitive complexity
Key: HDDS-2586
URL: https://issues.apache.org/jira/browse/HDDS-2586
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar reports CC value of 16:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-2rKcVY8lQ4ZsN2&open=AW5md-2rKcVY8lQ4ZsN2
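Sonar's cognitive-complexity score grows with nesting, so the usual fix for issues like this one is to extract helpers and flatten branches with early returns. A hypothetical illustration of the pattern (not the actual NetworkTopologyImpl code, whose scope/exclusion handling is more involved):

```java
class NodeCounter {
    // Before: one method with nested if/else branches handling empty
    // topologies and exclusion lists, which Sonar scores highly.
    // After: early returns keep every branch at nesting depth one.
    static int availableNodes(int total, int excludedScope, int excludedNodes) {
        if (total <= 0) {
            return 0;                       // early return flattens nesting
        }
        return Math.max(0, total - excludedScope - excludedNodes);
    }
}
```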
[jira] [Created] (HDDS-2585) Sonar : refactor getDistanceCost in NetworkTopologyImpl to reduce cognitive complexity
Supratim Deka created HDDS-2585:
-----------------------------------

Summary: Sonar : refactor getDistanceCost in NetworkTopologyImpl to reduce cognitive complexity
Key: HDDS-2585
URL: https://issues.apache.org/jira/browse/HDDS-2585
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar reports CC value 24:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-2rKcVY8lQ4ZsN0&open=AW5md-2rKcVY8lQ4ZsN0
[jira] [Created] (HDDS-2584) Sonar : refactor chooseNodeInternal in NetworkTopologyImpl to reduce cognitive complexity
Supratim Deka created HDDS-2584:
-----------------------------------

Summary: Sonar : refactor chooseNodeInternal in NetworkTopologyImpl to reduce cognitive complexity
Key: HDDS-2584
URL: https://issues.apache.org/jira/browse/HDDS-2584
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar reports CC value 31 for this method:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-2rKcVY8lQ4ZsNy&open=AW5md-2rKcVY8lQ4ZsNy
[jira] [Updated] (HDDS-2582) Sonar : reduce cognitive complexity of getObject in OzoneConfiguration
[ https://issues.apache.org/jira/browse/HDDS-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2582:
--------------------------------
Description:
Sonar reports CC value 16 :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4nKcVY8lQ4ZsPS&open=AW5md-4nKcVY8lQ4ZsPS

was:
Sonar reports CC value 15 :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4nKcVY8lQ4ZsPS&open=AW5md-4nKcVY8lQ4ZsPS

> Sonar : reduce cognitive complexity of getObject in OzoneConfiguration
> ----------------------------------------------------------------------
>
> Key: HDDS-2582
> URL: https://issues.apache.org/jira/browse/HDDS-2582
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Manager
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Critical
>
> Sonar reports CC value 16 :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4nKcVY8lQ4ZsPS&open=AW5md-4nKcVY8lQ4ZsPS
[jira] [Created] (HDDS-2583) Sonar : refactor getRangeKVs in RocksDBStore to reduce cognitive complexity
Supratim Deka created HDDS-2583:
-----------------------------------

Summary: Sonar : refactor getRangeKVs in RocksDBStore to reduce cognitive complexity
Key: HDDS-2583
URL: https://issues.apache.org/jira/browse/HDDS-2583
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Datanode
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar reports CC value 35:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-0-KcVY8lQ4ZsLe&open=AW5md-0-KcVY8lQ4ZsLe
[jira] [Created] (HDDS-2582) Sonar : reduce cognitive complexity of getObject in OzoneConfiguration
Supratim Deka created HDDS-2582:
-----------------------------------

Summary: Sonar : reduce cognitive complexity of getObject in OzoneConfiguration
Key: HDDS-2582
URL: https://issues.apache.org/jira/browse/HDDS-2582
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Manager
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar reports CC value 15 :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4nKcVY8lQ4ZsPS&open=AW5md-4nKcVY8lQ4ZsPS
[jira] [Updated] (HDDS-2526) Sonar : use format specifiers in Log inside HddsConfServlet
[ https://issues.apache.org/jira/browse/HDDS-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2526:
--------------------------------
Status: Patch Available (was: Open)

> Sonar : use format specifiers in Log inside HddsConfServlet
> -----------------------------------------------------------
>
> Key: HDDS-2526
> URL: https://issues.apache.org/jira/browse/HDDS-2526
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Sonar report :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4jKcVY8lQ4ZsPQ&open=AW5md-4jKcVY8lQ4ZsPQ
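The rule behind this issue: build log messages with format specifiers rather than string concatenation, so the formatting cost is paid only when the log level is enabled. Ozone's logger is SLF4J, whose placeholder is {}; the self-contained sketch below mimics the mechanism with plain JDK classes and a counter that shows when formatting actually happens:

```java
import java.text.MessageFormat;

class LogFormatDemo {
    static boolean infoEnabled = false;   // pretend INFO is switched off
    static int formatCalls = 0;

    static String expensive(Object arg) {
        formatCalls++;                    // counts how often we pay to format
        return String.valueOf(arg);
    }

    // Smell: concatenation - expensive(...) runs even when INFO is off.
    static void infoConcat(Object arg) {
        String msg = "value = " + expensive(arg);
        if (infoEnabled) { /* emit msg */ }
    }

    // Fix: pass the raw argument; format only inside the enabled check,
    // which is what LOG.info("value = {}", arg) does under the hood.
    static void infoFormatted(String pattern, Object arg) {
        if (infoEnabled) {
            String msg = MessageFormat.format(pattern, expensive(arg));
        }
    }
}
```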
[jira] [Updated] (HDDS-2525) Sonar : replace lambda with method reference in SCM BufferPool
[ https://issues.apache.org/jira/browse/HDDS-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2525:
--------------------------------
Status: Patch Available (was: Open)

> Sonar : replace lambda with method reference in SCM BufferPool
> --------------------------------------------------------------
>
> Key: HDDS-2525
> URL: https://issues.apache.org/jira/browse/HDDS-2525
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> As per Sonar, method references are more compact than lambda - this applies
> to java 8, not older versions.
> Sonar report:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_5KcVY8lQ4ZsVn&open=AW5md-_5KcVY8lQ4ZsVn
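The rule in miniature: a lambda that only forwards its argument can be replaced by a method reference on Java 8+. The names below are illustrative, not the actual BufferPool code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MethodRefDemo {
    static List<Integer> lengths(List<String> words) {
        // Smell:  .map(w -> w.length())
        // Fix: the equivalent, more compact method reference.
        return words.stream().map(String::length).collect(Collectors.toList());
    }
}
```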
[jira] [Updated] (HDDS-2524) Sonar : clumsy error handling in BlockOutputStream validateResponse
[ https://issues.apache.org/jira/browse/HDDS-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2524:
--------------------------------
Status: Patch Available (was: Open)

> Sonar : clumsy error handling in BlockOutputStream validateResponse
> -------------------------------------------------------------------
>
> Key: HDDS-2524
> URL: https://issues.apache.org/jira/browse/HDDS-2524
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Client
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Link to Sonar report :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_2KcVY8lQ4ZsVk&open=AW5md-_2KcVY8lQ4ZsVk
[jira] [Created] (HDDS-2532) Sonar : fix issues in OzoneQuota
Supratim Deka created HDDS-2532:
-----------------------------------

Summary: Sonar : fix issues in OzoneQuota
Key: HDDS-2532
URL: https://issues.apache.org/jira/browse/HDDS-2532
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Client
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar issues :

remove runtime exception from declaration.
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4NKcVY8lQ4ZsO_&open=AW5md-4NKcVY8lQ4ZsO_

use primitive boolean expression.
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4NKcVY8lQ4ZsO-&open=AW5md-4NKcVY8lQ4ZsO-
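A sketch of the two smell categories named above, with invented method names (the actual OzoneQuota signatures may differ):

```java
class QuotaSketch {
    // Smell: declaring an unchecked exception is redundant, e.g.
    //   static long parseQuota(String s) throws IllegalArgumentException
    // Fix: drop the throws clause; RuntimeExceptions need no declaration.
    static long parseQuota(String s) {
        return Long.parseLong(s.trim());
    }

    // Smell: boxed Boolean where a primitive expression suffices, e.g.
    //   Boolean valid = Boolean.valueOf(q >= 0); if (valid.booleanValue()) ...
    // Fix: use the primitive boolean expression directly.
    static boolean isValid(long q) {
        return q >= 0;
    }
}
```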
[jira] [Updated] (HDDS-2531) Sonar : remove duplicate string literals in BlockOutputStream
[ https://issues.apache.org/jira/browse/HDDS-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2531:
--------------------------------
Description:
Sonar issue in executePutBlock, duplicate string literal "blockID" :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_1KcVY8lQ4ZsVa&open=AW5md-_1KcVY8lQ4ZsVa

format specifiers in Log:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_2KcVY8lQ4ZsVg&open=AW5md-_2KcVY8lQ4ZsVg

define string constant instead of duplicate string literals.
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_2KcVY8lQ4ZsVb&open=AW5md-_2KcVY8lQ4ZsVb

was:
Sonar issue in executePutBlock, duplicate string literal "blockID" :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_1KcVY8lQ4ZsVa&open=AW5md-_1KcVY8lQ4ZsVa

> Sonar : remove duplicate string literals in BlockOutputStream
> -------------------------------------------------------------
>
> Key: HDDS-2531
> URL: https://issues.apache.org/jira/browse/HDDS-2531
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Client
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
>
> Sonar issue in executePutBlock, duplicate string literal "blockID" :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_1KcVY8lQ4ZsVa&open=AW5md-_1KcVY8lQ4ZsVa
> format specifiers in Log:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_2KcVY8lQ4ZsVg&open=AW5md-_2KcVY8lQ4ZsVg
> define string constant instead of duplicate string literals.
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_2KcVY8lQ4ZsVb&open=AW5md-_2KcVY8lQ4ZsVb
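The duplicate-literal fix in miniature: when a literal like "blockID" recurs in several messages, hoisting it into one constant means a typo in any repeat becomes a compile error instead of a silent divergence. Method names here are invented for illustration:

```java
class PutBlockMessages {
    private static final String BLOCK_ID = "blockID";  // single source of truth

    static String startMsg(long id)  { return "writing " + BLOCK_ID + " " + id; }
    static String commitMsg(long id) { return "committed " + BLOCK_ID + " " + id; }
}
```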
[jira] [Created] (HDDS-2531) Sonar : remove duplicate string literals in BlockOutputStream
Supratim Deka created HDDS-2531:
-----------------------------------

Summary: Sonar : remove duplicate string literals in BlockOutputStream
Key: HDDS-2531
URL: https://issues.apache.org/jira/browse/HDDS-2531
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Client
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar issue in executePutBlock, duplicate string literal "blockID" :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_1KcVY8lQ4ZsVa&open=AW5md-_1KcVY8lQ4ZsVa
[jira] [Updated] (HDDS-2530) Sonar : refactor verifyResourceName in HddsClientUtils to fix Sonar errors
[ https://issues.apache.org/jira/browse/HDDS-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2530:
--------------------------------
Description:
Sonar report :

Reduce cognitive complexity from 33 to 15
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR

https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWQ&open=AW5md_APKcVY8lQ4ZsWQ
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWJ&open=AW5md_APKcVY8lQ4ZsWJ

was:
Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWQ&open=AW5md_APKcVY8lQ4ZsWQ

> Sonar : refactor verifyResourceName in HddsClientUtils to fix Sonar errors
> --------------------------------------------------------------------------
>
> Key: HDDS-2530
> URL: https://issues.apache.org/jira/browse/HDDS-2530
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Client
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
>
> Sonar report :
> Reduce cognitive complexity from 33 to 15
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWQ&open=AW5md_APKcVY8lQ4ZsWQ
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWJ&open=AW5md_APKcVY8lQ4ZsWJ
[jira] [Updated] (HDDS-2530) Sonar : refactor verifyResourceName in HddsClientUtils to fix Sonar errors
[ https://issues.apache.org/jira/browse/HDDS-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2530:
--------------------------------
Summary: Sonar : refactor verifyResourceName in HddsClientUtils to fix Sonar errors
(was: Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive Complexity )

> Sonar : refactor verifyResourceName in HddsClientUtils to fix Sonar errors
> --------------------------------------------------------------------------
>
> Key: HDDS-2530
> URL: https://issues.apache.org/jira/browse/HDDS-2530
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Client
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
>
> Sonar report :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWQ&open=AW5md_APKcVY8lQ4ZsWQ
[jira] [Updated] (HDDS-2530) Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive Complexity
[ https://issues.apache.org/jira/browse/HDDS-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2530:
--------------------------------
Description:
Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWQ&open=AW5md_APKcVY8lQ4ZsWQ

was:
Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR

> Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive
> Complexity
> --------------------------------------------------------------------------
>
> Key: HDDS-2530
> URL: https://issues.apache.org/jira/browse/HDDS-2530
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Client
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
>
> Sonar report :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWQ&open=AW5md_APKcVY8lQ4ZsWQ
[jira] [Updated] (HDDS-2530) Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive Complexity
[ https://issues.apache.org/jira/browse/HDDS-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2530:
--------------------------------
Component/s: Ozone Client

> Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive
> Complexity
> --------------------------------------------------------------------------
>
> Key: HDDS-2530
> URL: https://issues.apache.org/jira/browse/HDDS-2530
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Client
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
>
> Sonar report :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
[jira] [Updated] (HDDS-2530) Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive Complexity
[ https://issues.apache.org/jira/browse/HDDS-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2530:
--------------------------------
Summary: Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive Complexity
(was: Sonar : refactor method to reduce Cognitive )

> Sonar : refactor verifyResourceName in HddsClientUtils to reduce Cognitive
> Complexity
> --------------------------------------------------------------------------
>
> Key: HDDS-2530
> URL: https://issues.apache.org/jira/browse/HDDS-2530
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
>
> Sonar report :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
[jira] [Created] (HDDS-2530) Sonar : refactor method to reduce Cognitive
Supratim Deka created HDDS-2530:
-----------------------------------

Summary: Sonar : refactor method to reduce Cognitive
Key: HDDS-2530
URL: https://issues.apache.org/jira/browse/HDDS-2530
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_APKcVY8lQ4ZsWR&open=AW5md_APKcVY8lQ4ZsWR
[jira] [Created] (HDDS-2529) Sonar : return interface instead of implementation class in XceiverClientRatis getCommintInfoMap
Supratim Deka created HDDS-2529:
-----------------------------------

Summary: Sonar : return interface instead of implementation class in XceiverClientRatis getCommintInfoMap
Key: HDDS-2529
URL: https://issues.apache.org/jira/browse/HDDS-2529
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AKKcVY8lQ4ZsWH&open=AW5md_AKKcVY8lQ4ZsWH
[jira] [Created] (HDDS-2528) Sonar : change return type to interface instead of implementation in CommitWatcher
Supratim Deka created HDDS-2528:
-----------------------------------

Summary: Sonar : change return type to interface instead of implementation in CommitWatcher
Key: HDDS-2528
URL: https://issues.apache.org/jira/browse/HDDS-2528
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVq&open=AW5md-_8KcVY8lQ4ZsVq
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_8KcVY8lQ4ZsVr&open=AW5md-_8KcVY8lQ4ZsVr
[jira] [Created] (HDDS-2527) Sonar : remove redundant temporary assignment in HddsVersionProvider
Supratim Deka created HDDS-2527:
-----------------------------------

Summary: Sonar : remove redundant temporary assignment in HddsVersionProvider
Key: HDDS-2527
URL: https://issues.apache.org/jira/browse/HDDS-2527
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4AKcVY8lQ4ZsO6&open=AW5md-4AKcVY8lQ4ZsO6
[jira] [Created] (HDDS-2526) Sonar : use format specifiers in Log inside HddsConfServlet
Supratim Deka created HDDS-2526:
-----------------------------------

Summary: Sonar : use format specifiers in Log inside HddsConfServlet
Key: HDDS-2526
URL: https://issues.apache.org/jira/browse/HDDS-2526
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Supratim Deka
Assignee: Supratim Deka

Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-4jKcVY8lQ4ZsPQ&open=AW5md-4jKcVY8lQ4ZsPQ
[jira] [Created] (HDDS-2525) Sonar : replace lambda with method reference in SCM BufferPool
Supratim Deka created HDDS-2525:
-----------------------------------

Summary: Sonar : replace lambda with method reference in SCM BufferPool
Key: HDDS-2525
URL: https://issues.apache.org/jira/browse/HDDS-2525
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Supratim Deka
Assignee: Supratim Deka

As per Sonar, method references are more compact than lambda - this applies to java 8, not older versions.

Sonar report:
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_5KcVY8lQ4ZsVn&open=AW5md-_5KcVY8lQ4ZsVn
[jira] [Created] (HDDS-2524) Sonar : clumsy error handling in BlockOutputStream validateResponse
Supratim Deka created HDDS-2524:
-----------------------------------

Summary: Sonar : clumsy error handling in BlockOutputStream validateResponse
Key: HDDS-2524
URL: https://issues.apache.org/jira/browse/HDDS-2524
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Client
Reporter: Supratim Deka
Assignee: Supratim Deka

Link to Sonar report :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-_2KcVY8lQ4ZsVk&open=AW5md-_2KcVY8lQ4ZsVk
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2478:
--------------------------------
Status: Patch Available (was: Open)

> Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
> ------------------------------------------------------------------
>
> Key: HDDS-2478
> URL: https://issues.apache.org/jira/browse/HDDS-2478
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Sonar issues :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV1&open=AW5md_AGKcVY8lQ4ZsV1
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV2&open=AW5md_AGKcVY8lQ4ZsV2
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientSpi.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2478:
--------------------------------
Description:
Sonar issues :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV1&open=AW5md_AGKcVY8lQ4ZsV1
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV2&open=AW5md_AGKcVY8lQ4ZsV2

was:
Sonar issue :
https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV1&open=AW5md_AGKcVY8lQ4ZsV1

> Sonar : remove temporary variable in XceiverClientSpi.sendCommand
> -----------------------------------------------------------------
>
> Key: HDDS-2478
> URL: https://issues.apache.org/jira/browse/HDDS-2478
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
>
> Sonar issues :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV1&open=AW5md_AGKcVY8lQ4ZsV1
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV2&open=AW5md_AGKcVY8lQ4ZsV2
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Supratim Deka updated HDDS-2478:
--------------------------------
Summary: Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
(was: Sonar : remove temporary variable in XceiverClientSpi.sendCommand)

> Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
> ------------------------------------------------------------------
>
> Key: HDDS-2478
> URL: https://issues.apache.org/jira/browse/HDDS-2478
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Minor
>
> Sonar issues :
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV1&open=AW5md_AGKcVY8lQ4ZsV1
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV2&open=AW5md_AGKcVY8lQ4ZsV2
[jira] [Created] (HDDS-2480) Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect
Supratim Deka created HDDS-2480: --- Summary: Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect Key: HDDS-2480 URL: https://issues.apache.org/jira/browse/HDDS-2480 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsWE&open=AW5md_AGKcVY8lQ4ZsWE
[jira] [Created] (HDDS-2479) Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry
Supratim Deka created HDDS-2479: --- Summary: Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry Key: HDDS-2479 URL: https://issues.apache.org/jira/browse/HDDS-2479 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV_&open=AW5md_AGKcVY8lQ4ZsV_
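The rule behind this issue is that testing an exception's type with instanceof inside a broad catch is better expressed as dedicated catch blocks. A hedged sketch, with stand-in types rather than the real sendCommandWithRetry logic:

```java
// Sketch: replace an instanceof type test inside a broad catch
// with a dedicated catch block per exception type.
public class CatchBlockExample {

  // Before: one broad catch plus an instanceof check (the smell).
  static String handleBefore(Runnable action) {
    try {
      action.run();
      return "ok";
    } catch (RuntimeException e) {
      if (e instanceof IllegalStateException) {
        return "retry";
      }
      return "fail";
    }
  }

  // After: the more specific exception gets its own catch block.
  static String handleAfter(Runnable action) {
    try {
      action.run();
      return "ok";
    } catch (IllegalStateException e) {
      return "retry";
    } catch (RuntimeException e) {
      return "fail";
    }
  }
}
```

The specific catch must precede the broader one, since Java matches catch blocks top to bottom.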
[jira] [Created] (HDDS-2478) Sonar : remove temporary variable in XceiverClientSpi.sendCommand
Supratim Deka created HDDS-2478: --- Summary: Sonar : remove temporary variable in XceiverClientSpi.sendCommand Key: HDDS-2478 URL: https://issues.apache.org/jira/browse/HDDS-2478 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue : https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md_AGKcVY8lQ4ZsV1&open=AW5md_AGKcVY8lQ4ZsV1
[jira] [Created] (HDDS-2466) Split OM Key into a Prefix Part and a Name Part
Supratim Deka created HDDS-2466: --- Summary: Split OM Key into a Prefix Part and a Name Part Key: HDDS-2466 URL: https://issues.apache.org/jira/browse/HDDS-2466 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Manager Reporter: Supratim Deka Assignee: Supratim Deka OM stores every key in a key table, which maps the key to a KeyInfo. Splitting the key into a prefix part and a name part, stored in separate tables, serves two purposes: 1. OzoneFS operations can be made efficient by deriving a prefix-tree representation of the pathnames (prefixes) - details of this are outside the current scope. Also, the prefix table can get preferential treatment when it comes to caching. 2. PutKey is not penalised by having to parse the key into each path component - this matters for cases where the dataset is a pure object store. Splitting into a prefix and a name is the minimal work to be done inline during the putKey operation.
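The "minimal work" described above amounts to cutting the key at its last path separator, without parsing every intermediate component. A small sketch of that split, with hypothetical names (not the actual OM code or table layout):

```java
// Sketch of the minimal prefix/name split proposed above: only the last
// path separator is located; intermediate components are not parsed.
public class KeySplitExample {

  // Returns {prefixPart, namePart}; an empty prefix means the key has
  // no path components.
  static String[] split(String key) {
    int idx = key.lastIndexOf('/');
    if (idx < 0) {
      return new String[] {"", key};
    }
    return new String[] {key.substring(0, idx), key.substring(idx + 1)};
  }
}
```

For example, "dir1/dir2/file" would yield prefix "dir1/dir2" and name "file", so a later pass can build the prefix tree from the prefix table alone.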
[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase
[ https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2208: Status: Patch Available (was: Open) > Propagate System Exceptions from OM transaction apply phase > --- > > Key: HDDS-2208 > URL: https://issues.apache.org/jira/browse/HDDS-2208 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The change for HDDS-2206 tracks system exceptions during preExecute phase of > OM request handling. > The current jira is to implement exception propagation once the OM request is > submitted to Ratis - when the handler is running validateAndUpdateCache for > the request.
[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase
[ https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2208: Description: The change for HDDS-2206 tracks system exceptions during preExecute phase of OM request handling. The current jira is to implement exception propagation once the OM request is submitted to Ratis - when the handler is running validateAndUpdateCache for the request. was: applyTransaction handling in the OzoneManagerStateMachine does not propagate exceptions/failures to the initiator. The future which is returned from applyTransaction simply tracks completion of the async executor represented by the "executorService" in OzoneManagerStateMachine.java > Propagate System Exceptions from OM transaction apply phase > --- > > Key: HDDS-2208 > URL: https://issues.apache.org/jira/browse/HDDS-2208 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > > The change for HDDS-2206 tracks system exceptions during preExecute phase of > OM request handling. > The current jira is to implement exception propagation once the OM request is > submitted to Ratis - when the handler is running validateAndUpdateCache for > the request.
[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase
[ https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2208: Summary: Propagate System Exceptions from OM transaction apply phase (was: OzoneManagerStateMachine does not track failures in applyTransaction) > Propagate System Exceptions from OM transaction apply phase > --- > > Key: HDDS-2208 > URL: https://issues.apache.org/jira/browse/HDDS-2208 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > > applyTransaction handling in the OzoneManagerStateMachine does not propagate > exceptions/failures to the initiator. > The future which is returned from applyTransaction simply tracks completion > of the async executor represented by the "executorService" in > OzoneManagerStateMachine.java
[jira] [Updated] (HDDS-2206) Separate handling for OMException and IOException in the Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2206: Status: Patch Available (was: Open) > Separate handling for OMException and IOException in the Ozone Manager > -- > > Key: HDDS-2206 > URL: https://issues.apache.org/jira/browse/HDDS-2206 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > As part of improving error propagation from the OM for ease of > troubleshooting and diagnosis, the proposal is to handle IOExceptions > separately from the business exceptions which are thrown as OMExceptions. > Handling for OMExceptions will not be changed in this jira. > Handling for IOExceptions will include logging the stacktrace on the server, > and propagation to the client under the control of a config parameter. > Similar handling is also proposed for SCMException.
[jira] [Updated] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2175: Labels: (was: pull-request-available) > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise jira for each sub-task.
[jira] [Created] (HDDS-2208) OzoneManagerStateMachine does not track failures in applyTransaction
Supratim Deka created HDDS-2208: --- Summary: OzoneManagerStateMachine does not track failures in applyTransaction Key: HDDS-2208 URL: https://issues.apache.org/jira/browse/HDDS-2208 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Supratim Deka Assignee: Supratim Deka applyTransaction handling in the OzoneManagerStateMachine does not propagate exceptions/failures to the initiator. The future which is returned from applyTransaction simply tracks completion of the async executor represented by the "executorService" in OzoneManagerStateMachine.java
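The gap described here is the difference between a future that merely signals "the executor ran" and one that carries the handler's failure to the caller. A simplified sketch of the intended fix (hypothetical names, not the OzoneManagerStateMachine code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: complete the returned future exceptionally when the apply
// handler fails, instead of only tracking executor completion.
public class ApplyTxnExample {

  static CompletableFuture<String> apply(ExecutorService executor, boolean fail) {
    CompletableFuture<String> future = new CompletableFuture<>();
    executor.submit(() -> {
      try {
        if (fail) {
          throw new IllegalStateException("apply failed");
        }
        future.complete("applied");
      } catch (Exception e) {
        // Propagate the failure to the initiator via the future.
        future.completeExceptionally(e);
      }
    });
    return future;
  }
}
```

A caller joining the future then observes either the applied result or the wrapped exception, rather than an unconditional success.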
[jira] [Created] (HDDS-2206) Separate handling for OMException and IOException in the Ozone Manager
Supratim Deka created HDDS-2206: --- Summary: Separate handling for OMException and IOException in the Ozone Manager Key: HDDS-2206 URL: https://issues.apache.org/jira/browse/HDDS-2206 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager Reporter: Supratim Deka Assignee: Supratim Deka As part of improving error propagation from the OM for ease of troubleshooting and diagnosis, the proposal is to handle IOExceptions separately from the business exceptions which are thrown as OMExceptions. Handling for OMExceptions will not be changed in this jira. Handling for IOExceptions will include logging the stacktrace on the server, and propagation to the client under the control of a config parameter. Similar handling is also proposed for SCMException.
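The proposed split can be pictured as a two-branch catch path: business errors keep their status code, while unexpected IOExceptions are treated as system errors. A rough sketch with a stand-in class (OMExceptionLike is hypothetical, not the real OMException):

```java
import java.io.IOException;

// Sketch of the proposed separation: business exceptions carry a status
// code back to the client; other IOExceptions are mapped to an internal
// error (with the stack trace logged server-side in the real proposal).
public class ErrorSplitExample {

  // Stand-in for OMException: an IOException with an attached status code.
  static class OMExceptionLike extends IOException {
    final String status;
    OMExceptionLike(String status) {
      super(status);
      this.status = status;
    }
  }

  static String handle(IOException e) {
    if (e instanceof OMExceptionLike) {
      return ((OMExceptionLike) e).status; // business error: status code only
    }
    // System error: in the proposal, log the stack trace on the server and
    // optionally propagate it, guarded by a config parameter.
    return "INTERNAL_ERROR";
  }
}
```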
[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939841#comment-16939841 ] Supratim Deka commented on HDDS-2175: - Note from [~aengineer] posted on the github PR: Also are these call stacks something that the end user should ever see? I have always found as user a call stack useless, it might be useful for the developer for debugging purposes, but clients are generally things used by real users. Maybe if these stacks are not logged in the ozone.log, we can log them, provided we can guard them via a config key and by default we do not do that. > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). 
Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise jira for each sub-task.
[jira] [Updated] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2175: Description: Exceptions encountered while processing requests on the OM are categorized as business exceptions and system exceptions. All of the business exceptions are captured as OMException and have an associated status code which is returned to the client. The handling of these is not going to be changed. Currently system exceptions are returned as INTERNAL ERROR to the client with a 1 line message string from the exception. The scope of this jira is to capture system exceptions and propagate the related information(including the complete stack trace) back to the client. There are 3 sub-tasks required to achieve this 1. Separate capture and handling for OMException and the other exceptions(IOException). For system exceptions, use Hadoop IPC ServiceException mechanism to send the stack trace to the client. 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and propagate up to the OzoneManager layer (on the leader). Currently, these exceptions are not being tracked. 3. Handle and propagate exceptions from Ratis. Will raise jira for each sub-task. was: Exceptions encountered while processing requests on the OM are categorized as business exceptions and system exceptions. All of the business exceptions are captured as OMException and have an associated status code which is returned to the client. The handling of these is not going to be changed. Currently system exceptions are returned as INTERNAL ERROR to the client with a 1 line message string from the exception. The scope of this jira is to capture system exceptions and propagate the related information(including the complete stack trace) back to the client. There are 3 sub-tasks required to achieve this 1. Separate capture and handling for OMException and the other exceptions(IOException). 
For system exceptions, use Hadoop IPC ServiceException mechanism to send the stack trace to the client. 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and propagate up to the OzoneManager layer (on the leader). Currently, these exceptions are not being tracked. 3. Handle Exceptions from Ratis and report > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise jira for each sub-task. 
[jira] [Updated] (HDDS-2175) Propagate stack trace for OM Exceptions to the Client
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2175: Description: Exceptions encountered while processing requests on the OM are categorized as business exceptions and system exceptions. All of the business exceptions are captured as OMException and have an associated status code which is returned to the client. The handling of these is not going to be changed. Currently system exceptions are returned as INTERNAL ERROR to the client with a 1 line message string from the exception. The scope of this jira is to capture system exceptions and propagate the related information(including the complete stack trace) back to the client. There are 3 sub-tasks required to achieve this 1. Separate capture and handling for OMException and the other exceptions(IOException). For system exceptions, use Hadoop IPC ServiceException mechanism to send the stack trace to the client. 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and propagate up to the OzoneManager layer (on the leader). Currently, these exceptions are not being tracked. 3. Handle Exceptions from Ratis and report was: Ozone Manager responds with a Status code and the summary message when an exception occurs while running the OM request handlers. The proposal is to respond to the client with the complete stack trace for the exception, as part of the response message. This makes debugging more convenient without requiring code change on the client, because the status code is retained in the response message. 
> Propagate stack trace for OM Exceptions to the Client > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle Exceptions from Ratis and report
[jira] [Updated] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2175: Summary: Propagate System Exceptions from the OzoneManager (was: Propagate stack trace for OM Exceptions to the Client) > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle Exceptions from Ratis and report >
[jira] [Created] (HDDS-2175) Propagate stack trace for OM Exceptions to the Client
Supratim Deka created HDDS-2175: --- Summary: Propagate stack trace for OM Exceptions to the Client Key: HDDS-2175 URL: https://issues.apache.org/jira/browse/HDDS-2175 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Manager Reporter: Supratim Deka Assignee: Supratim Deka Ozone Manager responds with a Status code and the summary message when an exception occurs while running the OM request handlers. The proposal is to respond to the client with the complete stack trace for the exception, as part of the response message. This makes debugging more convenient without requiring code change on the client, because the status code is retained in the response message.
[jira] [Commented] (HDFS-14843) Double Synchronization in BlockReportLeaseManager
[ https://issues.apache.org/jira/browse/HDFS-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928155#comment-16928155 ] Supratim Deka commented on HDFS-14843: -- +1 Thanks for the patch [~belugabehr], looks good to me. > Double Synchronization in BlockReportLeaseManager > - > > Key: HDFS-14843 > URL: https://issues.apache.org/jira/browse/HDFS-14843 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Attachments: HDFS-14843.1.patch > > > {code:java|title=BlockReportLeaseManager.java} > private synchronized long getNextId() { > long id; > do { > id = nextId++; > } while (id == 0); > return id; > } > {code} > This is a private method and is synchronized, however, it is only be accessed > from an already-synchronized method. No need to double-synchronize. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockReportLeaseManager.java#L183-L189 > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockReportLeaseManager.java#L227
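The fix described in the issue is simply dropping the redundant modifier: since getNextId() is only reached from methods that already hold the lock, it need not be synchronized itself. A compact sketch of that pattern (simplified, not the full BlockReportLeaseManager):

```java
// Sketch of the HDFS-14843 fix: the private helper runs only while the
// caller already holds the monitor, so it carries no 'synchronized' of
// its own.
public class LeaseIdExample {
  private long nextId = 1;

  // No 'synchronized' needed: callers are already synchronized.
  private long getNextId() {
    long id;
    do {
      id = nextId++;
    } while (id == 0); // 0 is skipped, reserved as "no lease"
    return id;
  }

  // Public entry point holds the lock for the whole operation.
  public synchronized long requestLease() {
    return getNextId();
  }
}
```

Behavior is unchanged; the edit only removes a lock re-acquisition that the JVM would have treated as a no-op reentrant lock anyway.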
[jira] [Commented] (HDDS-2061) Add hdds.container.chunk.persistdata to ozone-default.xml
[ https://issues.apache.org/jira/browse/HDDS-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919516#comment-16919516 ] Supratim Deka commented on HDDS-2061: - [~adoroszlai], this configuration is strictly for developers and not meant for users/administrators. As such I am not sure if we should include it in ozone-default.xml > Add hdds.container.chunk.persistdata to ozone-default.xml > - > > Key: HDDS-2061 > URL: https://issues.apache.org/jira/browse/HDDS-2061 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Doroszlai, Attila >Assignee: Doroszlai, Attila >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > HDDS-1094 introduced a new config key > ([hdds.container.chunk.persistdata|https://github.com/apache/hadoop/blob/96f7dc1992246a16031f613e55dc39ea0d64acd1/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/HddsConfigKeys.java#L241-L245]), > which needs to be added to {{ozone-default.xml}}, too. > https://github.com/elek/ozone-ci/blob/master/trunk/trunk-nightly-20190830-rr75b/integration/hadoop-ozone/integration-test/org.apache.hadoop.ozone.TestOzoneConfigurationFields.txt
[jira] [Updated] (HDDS-2057) Incorrect Default OM Port in Ozone FS URI Error Message
[ https://issues.apache.org/jira/browse/HDDS-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2057: Labels: pull-request-available (was: ) > Incorrect Default OM Port in Ozone FS URI Error Message > --- > > Key: HDDS-2057 > URL: https://issues.apache.org/jira/browse/HDDS-2057 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available > > The error message displayed from BasicOzoneFilesystem.initialize specifies > 5678 as the OM port. This is not the default port. > "Ozone file system URL " + > "should be one of the following formats: " + > "o3fs://bucket.volume/key OR " + > "o3fs://bucket.volume.om-host.example.com/key OR " + > "o3fs://bucket.volume.om-host.example.com:5678/key"; > > This should be fixed to pull the default value from the configuration > parameter, instead of a hard-coded value.
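The proposed fix is to build the example URL from the configured default rather than the literal 5678. A hedged sketch: the constant below mimics the Ozone config-key style but is declared locally here, and 9862 is assumed to be the default OM RPC port.

```java
// Sketch of the fix: derive the example port in the error message from
// the default-port constant instead of hard-coding 5678.
public class UriMessageExample {

  // Assumption: 9862 is the default OM port; in the real fix this would
  // come from the OM configuration constants, not a local literal.
  static final int OZONE_OM_PORT_DEFAULT = 9862;

  static String uriErrorMessage() {
    return "Ozone file system URL should be one of the following formats: "
        + "o3fs://bucket.volume/key OR "
        + "o3fs://bucket.volume.om-host.example.com/key OR "
        + "o3fs://bucket.volume.om-host.example.com:"
        + OZONE_OM_PORT_DEFAULT + "/key";
  }
}
```

With this shape, a future change to the default port automatically keeps the error message accurate.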
[jira] [Commented] (HDDS-2057) Incorrect Default OM Port in Ozone FS URI Error Message
[ https://issues.apache.org/jira/browse/HDDS-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919132#comment-16919132 ] Supratim Deka commented on HDDS-2057: - [https://github.com/apache/hadoop/pull/1377] not sure why the pull request is not linked to the jira > Incorrect Default OM Port in Ozone FS URI Error Message > --- > > Key: HDDS-2057 > URL: https://issues.apache.org/jira/browse/HDDS-2057 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > The error message displayed from BasicOzoneFilesystem.initialize specifies > 5678 as the OM port. This is not the default port. > "Ozone file system URL " + > "should be one of the following formats: " + > "o3fs://bucket.volume/key OR " + > "o3fs://bucket.volume.om-host.example.com/key OR " + > "o3fs://bucket.volume.om-host.example.com:5678/key"; > > This should be fixed to pull the default value from the configuration > parameter, instead of a hard-coded value.
[jira] [Updated] (HDDS-2057) Incorrect Default OM Port in Ozone FS URI Error Message
[ https://issues.apache.org/jira/browse/HDDS-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2057: Status: Patch Available (was: Open) > Incorrect Default OM Port in Ozone FS URI Error Message > --- > > Key: HDDS-2057 > URL: https://issues.apache.org/jira/browse/HDDS-2057 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > The error message displayed from BasicOzoneFilesystem.initialize specifies > 5678 as the OM port. This is not the default port. > "Ozone file system URL " + > "should be one of the following formats: " + > "o3fs://bucket.volume/key OR " + > "o3fs://bucket.volume.om-host.example.com/key OR " + > "o3fs://bucket.volume.om-host.example.com:5678/key"; > > This should be fixed to pull the default value from the configuration > parameter, instead of a hard-coded value.
[jira] [Updated] (HDDS-2057) Incorrect Default OM Port in Ozone FS URI Error Message
[ https://issues.apache.org/jira/browse/HDDS-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2057: Description: The error message displayed from BasicOzoneFilesystem.initialize specifies 5678 as the OM port. This is not the default port. "Ozone file system URL " + "should be one of the following formats: " + "o3fs://bucket.volume/key OR " + "o3fs://bucket.volume.om-host.example.com/key OR " + "o3fs://bucket.volume.om-host.example.com:5678/key"; This should be fixed to pull the default value from the configuration parameter, instead of a hard-coded value. was: The error message displayed from BasicOzoneFilesystem.initialize specifies 5678 as the OM port. This is not the default port. "Ozone file system URL " + "should be one of the following formats: " + "o3fs://bucket.volume/key OR " + "o3fs://bucket.volume.om-host.example.com/key OR " + "o3fs://bucket.volume.om-host.example.com:5678/key"; This should be fixed to pull the default value from the configuration parameter, instead of a hard-coded value. The same error exists in the documentation in ozonefs.html reference document. > Incorrect Default OM Port in Ozone FS URI Error Message > --- > > Key: HDDS-2057 > URL: https://issues.apache.org/jira/browse/HDDS-2057 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > The error message displayed from BasicOzoneFilesystem.initialize specifies > 5678 as the OM port. This is not the default port. > "Ozone file system URL " + > "should be one of the following formats: " + > "o3fs://bucket.volume/key OR " + > "o3fs://bucket.volume.om-host.example.com/key OR " + > "o3fs://bucket.volume.om-host.example.com:5678/key"; > > This should be fixed to pull the default value from the configuration > parameter, instead of a hard-coded value. 
> > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2057) Incorrect Default OM Port in Ozone FS URI Error Message
[ https://issues.apache.org/jira/browse/HDDS-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2057: Summary: Incorrect Default OM Port in Ozone FS URI Error Message (was: Incorrect Default OM Port in Ozone FS Error Message and ozonefs.html) > Incorrect Default OM Port in Ozone FS URI Error Message > --- > > Key: HDDS-2057 > URL: https://issues.apache.org/jira/browse/HDDS-2057 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > > The error message displayed from BasicOzoneFilesystem.initialize specifies > 5678 as the OM port. This is not the default port. > "Ozone file system URL " + > "should be one of the following formats: " + > "o3fs://bucket.volume/key OR " + > "o3fs://bucket.volume.om-host.example.com/key OR " + > "o3fs://bucket.volume.om-host.example.com:5678/key"; > > This should be fixed to pull the default value from the configuration > parameter, instead of a hard-coded value. > > The same error exists in the documentation in ozonefs.html reference document. > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2057) Incorrect Default OM Port in Ozone FS Error Message and ozonefs.html
Supratim Deka created HDDS-2057: --- Summary: Incorrect Default OM Port in Ozone FS Error Message and ozonefs.html Key: HDDS-2057 URL: https://issues.apache.org/jira/browse/HDDS-2057 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Filesystem Reporter: Supratim Deka Assignee: Supratim Deka The error message displayed from BasicOzoneFilesystem.initialize specifies 5678 as the OM port. This is not the default port. "Ozone file system URL " + "should be one of the following formats: " + "o3fs://bucket.volume/key OR " + "o3fs://bucket.volume.om-host.example.com/key OR " + "o3fs://bucket.volume.om-host.example.com:5678/key"; This should be fixed to pull the default value from the configuration parameter, instead of a hard-coded value. The same error exists in the documentation in ozonefs.html reference document. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
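A minimal sketch of the fix the jira asks for: build the usage message from a default-port constant instead of the hard-coded 5678. The constant name and the 9862 value below are assumptions standing in for the real configuration lookup in Ozone (not the actual patch).

```java
// Sketch only: OM_DEFAULT_PORT stands in for the real configured default
// (the actual fix would read it from the Ozone configuration keys).
public class OzoneFsUriMessage {
  static final int OM_DEFAULT_PORT = 9862; // assumed default OM port

  // Compose the o3fs usage message from the constant, never a literal port.
  static String usageMessage() {
    return "Ozone file system URL should be one of the following formats: "
        + "o3fs://bucket.volume/key OR "
        + "o3fs://bucket.volume.om-host.example.com/key OR "
        + "o3fs://bucket.volume.om-host.example.com:" + OM_DEFAULT_PORT + "/key";
  }

  public static void main(String[] args) {
    System.out.println(usageMessage());
  }
}
```

Because the port appears only once, changing the configured default changes the error message everywhere it is shown.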
[jira] [Commented] (HDDS-168) Add ScmGroupID to Datanode Version File
[ https://issues.apache.org/jira/browse/HDDS-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917369#comment-16917369 ] Supratim Deka commented on HDDS-168: [~hanishakoneru], what is the ScmGroupID being referred to in the description? > Add ScmGroupID to Datanode Version File > --- > > Key: HDDS-168 > URL: https://issues.apache.org/jira/browse/HDDS-168 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > > Add the field {{ScmGroupID}} to the Datanode Version file. This field identifies > the set of SCMs that this datanode talks to, or takes commands from. > This value is not the same as the Cluster ID – since a cluster can technically have > more than one SCM group. > Refer to [~anu]'s > [comment|https://issues.apache.org/jira/browse/HDDS-156?focusedCommentId=16511903&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16511903] > in HDDS-156. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode
[ https://issues.apache.org/jira/browse/HDDS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913161#comment-16913161 ] Supratim Deka commented on HDDS-1094: - Yes, I understand; exactly why I said earlier that it depends on what we wish to test. With this patch, we are trying to enable system-level tests: the data path (not just the Datanodes) together with the metadata components. The 3 sections you list are subsystem scope and are required as well. With this change, we should be able to run system performance tests without investing in beefy storage hardware. Btw, in the HDFS world there is very similar functionality in the form of SimulatedFSDataset. > Performance test infrastructure : skip writing user data on Datanode > > > Key: HDDS-1094 > URL: https://issues.apache.org/jira/browse/HDDS-1094 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Goal: > Make Ozone chunk Read/Write operations CPU/network bound for specially > constructed performance micro benchmarks. > Remove disk bandwidth and latency constraints - running ozone data path > against extreme low-latency & high throughput storage will expose performance > bottlenecks in the flow. But low-latency storage(NVME flash drives, Storage > class memory etc) is expensive and availability is limited. Is there a > workaround which achieves similar running conditions for the software without > actually having the low latency storage? At least for specially constructed > datasets - for example zero-filled blocks (*not* zero-length blocks). > Required characteristics of the solution: > No changes in Ozone client, OM and SCM. Changes limited to Datanode, Minimal > footprint in datanode code. 
> Possible High level Approach: > The ChunkManager and ChunkUtils can enable writeChunk for zero-filled chunks > to be dropped without actually writing to the local filesystem. Similarly, if > readChunk can construct a zero-filled buffer without reading from the local > filesystem whenever it detects a zero-filled chunk. Specifics of how to > detect and record a zero-filled chunk can be discussed on this jira. Also > discuss how to control this behaviour and make it available only for internal > testing. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode
[ https://issues.apache.org/jira/browse/HDDS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912954#comment-16912954 ] Supratim Deka commented on HDDS-1094: - [~anu], it depends, I think. If we want to stress the entire pipeline and see how the whole thing holds up, then cutting out the data in the client won't work. We can think of this as simulating an extremely fast storage device and examining how the rest of the system responds. Not useful? > Performance test infrastructure : skip writing user data on Datanode > > > Key: HDDS-1094 > URL: https://issues.apache.org/jira/browse/HDDS-1094 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Goal: > Make Ozone chunk Read/Write operations CPU/network bound for specially > constructed performance micro benchmarks. > Remove disk bandwidth and latency constraints - running ozone data path > against extreme low-latency & high throughput storage will expose performance > bottlenecks in the flow. But low-latency storage(NVME flash drives, Storage > class memory etc) is expensive and availability is limited. Is there a > workaround which achieves similar running conditions for the software without > actually having the low latency storage? At least for specially constructed > datasets - for example zero-filled blocks (*not* zero-length blocks). > Required characteristics of the solution: > No changes in Ozone client, OM and SCM. Changes limited to Datanode, Minimal > footprint in datanode code. > Possible High level Approach: > The ChunkManager and ChunkUtils can enable writeChunk for zero-filled chunks > to be dropped without actually writing to the local filesystem. 
Similarly, if > readChunk can construct a zero-filled buffer without reading from the local > filesystem whenever it detects a zero-filled chunk. Specifics of how to > detect and record a zero-filled chunk can be discussed on this jira. Also > discuss how to control this behaviour and make it available only for internal > testing. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
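The zero-filled-chunk detection that the jira leaves open to discussion could start from a check like this. The method name and where it would hook into ChunkManager/ChunkUtils are hypothetical; how to record a zero-filled chunk for later reads is exactly the part still to be decided on the jira.

```java
import java.nio.ByteBuffer;

// Sketch: if this returns true, writeChunk could skip the filesystem write
// and readChunk could later synthesize a zero-filled buffer of the same size.
public class ZeroChunkCheck {
  static boolean isZeroFilled(ByteBuffer buf) {
    // Absolute gets: the buffer's position is left untouched for the caller.
    for (int i = buf.position(); i < buf.limit(); i++) {
      if (buf.get(i) != 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    assert isZeroFilled(ByteBuffer.allocate(16));           // fresh buffer: all zeros
    assert !isZeroFilled(ByteBuffer.wrap(new byte[]{0, 1})); // one non-zero byte
  }
}
```

The scan is O(n) per chunk, which is acceptable here since the whole point of the mode is to trade a little CPU for removing disk IO in benchmark runs.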
[jira] [Updated] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode
[ https://issues.apache.org/jira/browse/HDDS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1094: Assignee: Supratim Deka Status: Patch Available (was: Open) > Performance test infrastructure : skip writing user data on Datanode > > > Key: HDDS-1094 > URL: https://issues.apache.org/jira/browse/HDDS-1094 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Goal: > Make Ozone chunk Read/Write operations CPU/network bound for specially > constructed performance micro benchmarks. > Remove disk bandwidth and latency constraints - running ozone data path > against extreme low-latency & high throughput storage will expose performance > bottlenecks in the flow. But low-latency storage(NVME flash drives, Storage > class memory etc) is expensive and availability is limited. Is there a > workaround which achieves similar running conditions for the software without > actually having the low latency storage? At least for specially constructed > datasets - for example zero-filled blocks (*not* zero-length blocks). > Required characteristics of the solution: > No changes in Ozone client, OM and SCM. Changes limited to Datanode, Minimal > footprint in datanode code. > Possible High level Approach: > The ChunkManager and ChunkUtils can enable writeChunk for zero-filled chunks > to be dropped without actually writing to the local filesystem. Similarly, if > readChunk can construct a zero-filled buffer without reading from the local > filesystem whenever it detects a zero-filled chunk. Specifics of how to > detect and record a zero-filled chunk can be discussed on this jira. Also > discuss how to control this behaviour and make it available only for internal > testing. 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1094) Performance test infrastructure : skip writing user data on Datanode
[ https://issues.apache.org/jira/browse/HDDS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1094: Summary: Performance test infrastructure : skip writing user data on Datanode (was: Performance testing infrastructure : Special handling for zero-filled chunks on the Datanode) > Performance test infrastructure : skip writing user data on Datanode > > > Key: HDDS-1094 > URL: https://issues.apache.org/jira/browse/HDDS-1094 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Priority: Major > > Goal: > Make Ozone chunk Read/Write operations CPU/network bound for specially > constructed performance micro benchmarks. > Remove disk bandwidth and latency constraints - running ozone data path > against extreme low-latency & high throughput storage will expose performance > bottlenecks in the flow. But low-latency storage(NVME flash drives, Storage > class memory etc) is expensive and availability is limited. Is there a > workaround which achieves similar running conditions for the software without > actually having the low latency storage? At least for specially constructed > datasets - for example zero-filled blocks (*not* zero-length blocks). > Required characteristics of the solution: > No changes in Ozone client, OM and SCM. Changes limited to Datanode, Minimal > footprint in datanode code. > Possible High level Approach: > The ChunkManager and ChunkUtils can enable writeChunk for zero-filled chunks > to be dropped without actually writing to the local filesystem. Similarly, if > readChunk can construct a zero-filled buffer without reading from the local > filesystem whenever it detects a zero-filled chunk. Specifics of how to > detect and record a zero-filled chunk can be discussed on this jira. Also > discuss how to control this behaviour and make it available only for internal > testing. 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-1740) Handle Failure to Update Ozone Container YAML
[ https://issues.apache.org/jira/browse/HDDS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka resolved HDDS-1740. - Resolution: Not A Problem On the Datanode, Container state changes are driven through KeyValueContainer.updateContainerData(), which always resets the in-memory state of the container to the previous state if the update to the container YAML hits an exception. Also, the container YAML is sync-flushed to persistent storage, as implemented in ContainerDataYAML.createContainerFile(). So I am marking this as not a problem. > Handle Failure to Update Ozone Container YAML > - > > Key: HDDS-1740 > URL: https://issues.apache.org/jira/browse/HDDS-1740 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > > Ensure consistent state in-memory and in the persistent YAML file for the > Container. > If an update to the YAML fails, then the in-memory state also does not change. > This ensures that in every container report, the SCM continues to see that the > specific container is still in the old state. And this triggers a retry of > the state change operation from the SCM. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1798) Propagate failure in writeStateMachineData to Ratis
[ https://issues.apache.org/jira/browse/HDDS-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1798: Status: Patch Available (was: Open) > Propagate failure in writeStateMachineData to Ratis > --- > > Key: HDDS-1798 > URL: https://issues.apache.org/jira/browse/HDDS-1798 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, > writeStateMachineData() returns a future to Ratis. This future does not track > any errors or failures encountered as part of the operation - WriteChunk / > handleWriteChunk(). The error is propagated back to the client in the form of > an error code embedded inside writeChunkResponseProto. But the error goes > undetected and unhandled in the Ratis server. The future handed back to Ratis > is always completed with success. > The goal is to detect any errors in writeStateMachineData in Ratis and treat > it as a failure of the Ratis log, handling for which is already implemented > in HDDS-1603. > -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1818) Instantiate Ozone Containers using Factory pattern
[ https://issues.apache.org/jira/browse/HDDS-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887169#comment-16887169 ] Supratim Deka commented on HDDS-1818: - Useful to implement a test Container implementation which will throw exceptions for specific operations. > Instantiate Ozone Containers using Factory pattern > -- > > Key: HDDS-1818 > URL: https://issues.apache.org/jira/browse/HDDS-1818 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > > Introduce a factory to instantiate Containers in Ozone. > This will be useful in different ways: > # to test higher level functionality, for example test error handling for > situations like HDDS-1798 > # create a simulated container which does not do disk IO for data and is > used to run targeted max throughput tests. As an example, HDDS-1094 -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1818) Instantiate Ozone Containers using Factory pattern
Supratim Deka created HDDS-1818: --- Summary: Instantiate Ozone Containers using Factory pattern Key: HDDS-1818 URL: https://issues.apache.org/jira/browse/HDDS-1818 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Supratim Deka Assignee: Supratim Deka Introduce a factory to instantiate Containers in Ozone. This will be useful in different ways: # to test higher level functionality, for example test error handling for situations like HDDS-1798 # create a simulated container which does not do disk IO for data and is used to run targeted max throughput tests. As an example, HDDS-1094 -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
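The factory idea in this jira can be sketched as below. All names here (Container, KeyValueContainer, SimulatedContainer, the create() signature) are illustrative stand-ins, not the Ozone API; the point is only that callers pick a container implementation by type, so a simulated or fault-injecting implementation can be slotted in for tests.

```java
// Hypothetical shapes: a tiny factory that lets tests swap in a container
// implementation that skips disk IO (HDDS-1094) or throws on demand (HDDS-1798).
public class ContainerFactorySketch {
  interface Container { String describe(); }

  static class KeyValueContainer implements Container {
    public String describe() { return "keyvalue"; }
  }

  // Simulated container: no data IO, for throughput tests and fault injection.
  static class SimulatedContainer implements Container {
    public String describe() { return "simulated"; }
  }

  static Container create(String type) {
    switch (type) {
      case "simulated": return new SimulatedContainer();
      default:          return new KeyValueContainer();
    }
  }

  public static void main(String[] args) {
    assert create("simulated").describe().equals("simulated");
    assert create("keyvalue").describe().equals("keyvalue");
  }
}
```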
[jira] [Created] (HDDS-1798) Propagate failure in writeStateMachineData to Ratis
Supratim Deka created HDDS-1798: --- Summary: Propagate failure in writeStateMachineData to Ratis Key: HDDS-1798 URL: https://issues.apache.org/jira/browse/HDDS-1798 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Supratim Deka Assignee: Supratim Deka Currently, writeStateMachineData() returns a future to Ratis. This future does not track any errors or failures encountered as part of the operation - WriteChunk / handleWriteChunk(). The error is propagated back to the client in the form of an error code embedded inside writeChunkResponseProto. But the error goes undetected and unhandled in the Ratis server. The future handed back to Ratis is always completed with success. The goal is to detect any errors in writeStateMachineData in Ratis and treat it as a failure of the Ratis log, handling for which is already implemented in HDDS-1603. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
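The intended behaviour change can be sketched with a plain CompletableFuture: inspect the response code and complete the future exceptionally so Ratis sees the failure, instead of always completing with success. The Result enum and the method wiring below are illustrative, not the actual ContainerStateMachine patch.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;

// Sketch: propagate a non-SUCCESS write result to Ratis as a failed future
// so the Ratis server can treat it as a log failure (per HDDS-1603).
public class WriteStateMachineDataSketch {
  enum Result { SUCCESS, IO_EXCEPTION } // stand-in for the proto result codes

  static CompletableFuture<Result> writeStateMachineData(Result responseCode) {
    CompletableFuture<Result> raftFuture = new CompletableFuture<>();
    if (responseCode != Result.SUCCESS) {
      // Instead of swallowing the error, fail the future handed back to Ratis.
      raftFuture.completeExceptionally(
          new IOException("writeChunk failed: " + responseCode));
    } else {
      raftFuture.complete(responseCode);
    }
    return raftFuture;
  }

  public static void main(String[] args) {
    assert !writeStateMachineData(Result.SUCCESS).isCompletedExceptionally();
    assert writeStateMachineData(Result.IO_EXCEPTION).isCompletedExceptionally();
  }
}
```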
[jira] [Created] (HDDS-1783) Latency metric for applyTransaction in ContainerStateMachine
Supratim Deka created HDDS-1783: --- Summary: Latency metric for applyTransaction in ContainerStateMachine Key: HDDS-1783 URL: https://issues.apache.org/jira/browse/HDDS-1783 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Reporter: Supratim Deka applyTransaction is invoked from the Ratis pipeline, and the ContainerStateMachine uses an async executor to complete the task. We require a latency metric to track the performance of log apply operations in the state machine. This will measure the end-to-end latency of apply, which includes the queueing delay in the executor queues. Combined with the latency measurement in HddsDispatcher, this will indicate whether the executors are overloaded. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1781) Add ContainerCache metrics in ContainerMetrics
Supratim Deka created HDDS-1781: --- Summary: Add ContainerCache metrics in ContainerMetrics Key: HDDS-1781 URL: https://issues.apache.org/jira/browse/HDDS-1781 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Supratim Deka ContainerCache caches handles to open Container DB instances. This LRU cache is configured with a limited capacity (1024 entries by default). Add metrics to track the performance of this cache (hits vs. misses), and also track the average latency to acquire a DB handle on a cache miss. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
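The counters this jira asks for amount to something like the sketch below. In the real change they would be metrics2 fields inside ContainerMetrics rather than plain atomics; the class and method names here are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: hit/miss counters plus accumulated DB-open latency on misses,
// from which a hit ratio and average miss latency can be derived.
public class CacheMetricsSketch {
  final AtomicLong hits = new AtomicLong();
  final AtomicLong misses = new AtomicLong();
  final AtomicLong missLatencyNanos = new AtomicLong();

  void recordHit() { hits.incrementAndGet(); }

  void recordMiss(long openLatencyNanos) {
    misses.incrementAndGet();
    missLatencyNanos.addAndGet(openLatencyNanos); // time spent opening the DB
  }

  double hitRatio() {
    long total = hits.get() + misses.get();
    return total == 0 ? 0.0 : (double) hits.get() / total;
  }

  public static void main(String[] args) {
    CacheMetricsSketch m = new CacheMetricsSketch();
    m.recordHit();
    m.recordMiss(1_000_000L);
    assert m.hitRatio() == 0.5;
  }
}
```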
[jira] [Commented] (HDDS-1765) destroyPipeline scheduled from finalizeAndDestroyPipeline fails for short dead node interval
[ https://issues.apache.org/jira/browse/HDDS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879243#comment-16879243 ] Supratim Deka commented on HDDS-1765: - similar symptom but not the same problem. Linking for reference. > destroyPipeline scheduled from finalizeAndDestroyPipeline fails for short > dead node interval > > > Key: HDDS-1765 > URL: https://issues.apache.org/jira/browse/HDDS-1765 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Supratim Deka >Priority: Major > > This happens when > OZONE_SCM_PIPELINE_DESTROY_TIMEOUT exceeds the value of > OZONE_SCM_DEADNODE_INTERVAL. This is the case for start-chaos.sh > When a Datanode is shutdown, SCM Stale node handler calls > finalizeAndDestroyPipeline() which schedules destroyPipeline() operation with > a delay > of OZONE_SCM_PIPELINE_DESTROY_TIMEOUT. By the time this gets scheduled, dead > node handler would have destroyed the pipeline. > > {code:java} > 2019-07-05 14:45:16,358 INFO pipeline.SCMPipelineManager > (SCMPipelineManager.java:finalizeAndDestroyPipeline(307)) - destroying > pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: > 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, > networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, > State:CLOSED] > 2019-07-05 14:45:16,363 INFO pipeline.PipelineStateManager > (PipelineStateManager.java:removePipeline(108)) - Pipeline Pipeline[ Id: > ef60537a-0a82-4fea-a574-109c881fa140, Nodes: > 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, > networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, > State:CLOSED] removed from db > ... 
> 2019-07-05 14:46:12,400 WARN pipeline.RatisPipelineUtils > (RatisPipelineUtils.java:destroyPipeline(66)) - Pipeline destroy failed for > pipeline=PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 > dn=7947bf32-faaa-4b34-bf1e-2752a929938c\{ip: 192.168.1.6, host: 192.168.1.6, > networkLocation: /default-rack, certSerialId: null} > 2019-07-05 14:46:12,401 ERROR pipeline.SCMPipelineManager > (Scheduler.java:lambda$schedule$1(70)) - Destroy pipeline failed for > pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: > 7947bf32-faaa-4b34-bf1e-2752a929938c\{ip: 192.168.1.6, host: 192.168.1.6, > networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, > State:OPEN] > org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: > PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 not found > at > org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:132) > at > org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.removePipeline(PipelineStateMap.java:322) > at > org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.removePipeline(PipelineStateManager.java:107) > at > org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.removePipeline(SCMPipelineManager.java:401) > at > org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.destroyPipeline(SCMPipelineManager.java:387) > at > org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.lambda$finalizeAndDestroyPipeline$0(SCMPipelineManager.java:321) > at > org.apache.hadoop.utils.Scheduler.lambda$schedule$1(Scheduler.java:68) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1765) destroyPipeline scheduled from finalizeAndDestroyPipeline fails for short dead node interval
Supratim Deka created HDDS-1765: --- Summary: destroyPipeline scheduled from finalizeAndDestroyPipeline fails for short dead node interval Key: HDDS-1765 URL: https://issues.apache.org/jira/browse/HDDS-1765 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Supratim Deka This happens when OZONE_SCM_PIPELINE_DESTROY_TIMEOUT exceeds the value of OZONE_SCM_DEADNODE_INTERVAL. This is the case for start-chaos.sh When a Datanode is shutdown, SCM Stale node handler calls finalizeAndDestroyPipeline() which schedules destroyPipeline() operation with a delay of OZONE_SCM_PIPELINE_DESTROY_TIMEOUT. By the time this gets scheduled, dead node handler would have destroyed the pipeline. {code:java} 2019-07-05 14:45:16,358 INFO pipeline.SCMPipelineManager (SCMPipelineManager.java:finalizeAndDestroyPipeline(307)) - destroying pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:CLOSED] 2019-07-05 14:45:16,363 INFO pipeline.PipelineStateManager (PipelineStateManager.java:removePipeline(108)) - Pipeline Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:CLOSED] removed from db ... 
2019-07-05 14:46:12,400 WARN pipeline.RatisPipelineUtils (RatisPipelineUtils.java:destroyPipeline(66)) - Pipeline destroy failed for pipeline=PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 dn=7947bf32-faaa-4b34-bf1e-2752a929938c\{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null} 2019-07-05 14:46:12,401 ERROR pipeline.SCMPipelineManager (Scheduler.java:lambda$schedule$1(70)) - Destroy pipeline failed for pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 7947bf32-faaa-4b34-bf1e-2752a929938c\{ip: 192.168.1.6, host: 192.168.1.6, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:OPEN] org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 not found at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:132) at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.removePipeline(PipelineStateMap.java:322) at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.removePipeline(PipelineStateManager.java:107) at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.removePipeline(SCMPipelineManager.java:401) at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.destroyPipeline(SCMPipelineManager.java:387) at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.lambda$finalizeAndDestroyPipeline$0(SCMPipelineManager.java:321) at org.apache.hadoop.utils.Scheduler.lambda$schedule$1(Scheduler.java:68) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1754) getContainerWithPipeline fails with PipelineNotFoundException
[ https://issues.apache.org/jira/browse/HDDS-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka reassigned HDDS-1754: --- Assignee: Supratim Deka (was: Nanda kumar) > getContainerWithPipeline fails with PipelineNotFoundException > - > > Key: HDDS-1754 > URL: https://issues.apache.org/jira/browse/HDDS-1754 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Supratim Deka >Priority: Major > Labels: MiniOzoneChaosCluster > > Once a pipeline is closed or finalized and it was not able to close all the > containers inside the pipeline. > Then getContainerWithPipeline will try to fetch the pipeline state from > pipelineManager after the pipeline has been closed. > {code} > 2019-07-02 20:48:20,370 INFO ipc.Server (Server.java:logException(2726)) - > IPC Server handler 13 on 50130, call Call#17339 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.getContainerWithPipeline > from 192.168.0.2:51452 > org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: > PipelineID=e1a7b16a-48d9-4194-9774-ad49ec9ad78b not found > at > org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:132) > at > org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.getPipeline(PipelineStateManager.java:66) > at > org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.getPipeline(SCMPipelineManager.java:184) > at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.getContainerWithPipeline(SCMClientProtocolServer.java:244) > at > org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.getContainerWithPipeline(StorageContainerLocationProtocolServerSideTranslatorPB.java:144) > at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:16390) > at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1748) Error message for 3 way commit failure is not verbose
[ https://issues.apache.org/jira/browse/HDDS-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka reassigned HDDS-1748: --- Assignee: Supratim Deka > Error message for 3 way commit failure is not verbose > - > > Key: HDDS-1748 > URL: https://issues.apache.org/jira/browse/HDDS-1748 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Supratim Deka >Priority: Major > > The error message for 3 way client commit is not verbose, it should include > blockID and pipeline ID along with node details for debugging. > {code} > 2019-07-02 09:58:12,025 WARN scm.XceiverClientRatis > (XceiverClientRatis.java:watchForCommit(262)) - 3 way commit failed > java.util.concurrent.ExecutionException: > org.apache.ratis.protocol.NotReplicatedException: Request with call Id 39482 > and log index 11562 is not yet replicated to ALL_COMMITTED > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) > at > org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:259) > at > org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchForCommit(CommitWatcher.java:194) > at > org.apache.hadoop.hdds.scm.storage.CommitWatcher.watchOnFirstIndex(CommitWatcher.java:135) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.watchForCommit(BlockOutputStream.java:355) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFullBuffer(BlockOutputStream.java:332) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.write(BlockOutputStream.java:259) > at > org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.write(BlockOutputStreamEntry.java:129) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:211) > at > 
org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:193) > at > org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49) > at java.io.OutputStream.write(OutputStream.java:75) > at > org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:103) > at > org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:147) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.ratis.protocol.NotReplicatedException: Request with > call Id 39482 and log index 11562 is not yet replicated to ALL_COMMITTED > at > org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:245) > at > org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:254) > at > org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:249) > at > org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:519) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > ... 
3 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1740) Handle Failure to Update Ozone Container YAML
Supratim Deka created HDDS-1740: --- Summary: Handle Failure to Update Ozone Container YAML Key: HDDS-1740 URL: https://issues.apache.org/jira/browse/HDDS-1740 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Supratim Deka Ensure consistent state between the in-memory representation and the persistent YAML file for the Container. If an update to the YAML fails, then the in-memory state also does not change. This ensures that in every container report, the SCM continues to see that the specific container is still in the old state, which triggers a retry of the state change operation from the SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
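The invariant described above (in-memory state changes only after the YAML is durably updated, so a failed persist leaves the old state visible to SCM) can be sketched roughly as below. This is a minimal illustration, not the actual Ozone datanode code; the class, interface, and method names are hypothetical stand-ins.

```java
import java.io.IOException;

// Sketch of the update ordering proposed in HDDS-1740: persist the new
// state to the container YAML first, and mutate the in-memory state only
// on success. On failure the container keeps its old state, so the next
// container report still shows the old state and the SCM retries.
class ContainerStateUpdate {
    enum State { OPEN, CLOSING, CLOSED }

    interface YamlStore {
        // Stand-in for rewriting the persistent container YAML file.
        void persistState(State newState) throws IOException;
    }

    private State state = State.OPEN;

    boolean tryUpdateState(State newState, YamlStore store) {
        try {
            store.persistState(newState);   // durable update first
        } catch (IOException e) {
            return false;                   // in-memory state untouched
        }
        state = newState;                   // only now change what SCM will see
        return true;
    }

    State getState() {
        return state;
    }
}
```

Because the in-memory state is the source for container reports, no extra bookkeeping is needed to trigger the SCM retry: the stale-looking report is itself the retry signal.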
[jira] [Assigned] (HDDS-1603) Handle Ratis Append Failure in Container State Machine
[ https://issues.apache.org/jira/browse/HDDS-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka reassigned HDDS-1603: --- Assignee: Supratim Deka > Handle Ratis Append Failure in Container State Machine > -- > > Key: HDDS-1603 > URL: https://issues.apache.org/jira/browse/HDDS-1603 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode, SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > RATIS-573 would add notification to the State Machine on encountering failure > during Log append. > The scope of this jira is to build on RATIS-573 and define the handling for > log append failure in Container State Machine. > 1. Enqueue pipeline unhealthy action to SCM, add a reason code to the message. > 2. Trigger heartbeat to SCM > 3. Notify Ratis volume unhealthy to the Datanode, so that DN can trigger > async volume checker > Changes in the SCM to leverage the additional failure reason code, is outside > the scope of this jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1739) Handle Apply Transaction Failure in State Machine
Supratim Deka created HDDS-1739: --- Summary: Handle Apply Transaction Failure in State Machine Key: HDDS-1739 URL: https://issues.apache.org/jira/browse/HDDS-1739 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Supratim Deka Assignee: Supratim Deka Scope of this jira is to handle failure of applyTransaction() for the Container State Machine. 1. Introduce new Replica state - STALE to indicate container is missing transactions. Mark failed container as STALE. 2. Trigger immediate ICR to SCM 3. Fail new transactions on STALE container 4. Notify volume error to the DN (to trigger background volume check) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
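The four steps enumerated above can be sketched as a small handler. All names here (ReplicaState.STALE, the Reporter callbacks) are illustrative assumptions for this issue's proposal, not the actual Ozone datanode API.

```java
// Sketch of the applyTransaction() failure handling proposed in HDDS-1739:
// mark the replica STALE, send an immediate ICR, notify the volume checker,
// and reject new transactions on the STALE replica.
class ApplyTxFailureSketch {
    enum ReplicaState { OPEN, CLOSED, STALE }

    static class Container {
        ReplicaState state = ReplicaState.OPEN;
    }

    interface Reporter {
        void sendIncrementalContainerReport(Container c);  // immediate ICR to SCM
        void notifyVolumeError();                          // trigger background volume check
    }

    // Called when applyTransaction() fails for this container.
    static void onApplyTransactionFailure(Container c, Reporter reporter) {
        c.state = ReplicaState.STALE;               // container is missing transactions
        reporter.sendIncrementalContainerReport(c); // tell SCM right away
        reporter.notifyVolumeError();
    }

    // New transactions must be rejected on a STALE replica.
    static boolean canAcceptTransaction(Container c) {
        return c.state != ReplicaState.STALE;
    }
}
```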
[jira] [Updated] (HDDS-1621) writeData in ChunkUtils should not use AsynchronousFileChannel
[ https://issues.apache.org/jira/browse/HDDS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1621: Summary: writeData in ChunkUtils should not use AsynchronousFileChannel (was: flushStateMachineData should ensure the write chunks are flushed to disk) > writeData in ChunkUtils should not use AsynchronousFileChannel > -- > > Key: HDDS-1621 > URL: https://issues.apache.org/jira/browse/HDDS-1621 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Supratim Deka >Priority: Major > > Currently, chunk writes are not synced to disk by default. When > flushStateMachineData gets invoked from Ratis, it should also ensure that all > pending chunk writes are flushed to disk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1621) flushStateMachineData should ensure the write chunks are flushed to disk
[ https://issues.apache.org/jira/browse/HDDS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855621#comment-16855621 ] Supratim Deka commented on HDDS-1621: - Keeping the FileChannel around after writeData and passing it back to the State Machine is not really required. dfs.container.chunk.write.sync determines whether the chunk is persisted as soon as the data is written. If this parameter is set to false, it implies the possibility of data loss; this tradeoff is provided to enable higher throughput. In keeping with this understanding, we will limit the change to: 1. invoking a force+close on the channel inside writeData if the sync option is set; 2. changing AsynchronousFileChannel to FileChannel (as explained in the previous comment). > flushStateMachineData should ensure the write chunks are flushed to disk > > > Key: HDDS-1621 > URL: https://issues.apache.org/jira/browse/HDDS-1621 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Supratim Deka >Priority: Major > > Currently, chunk writes are not synced to disk by default. When > flushStateMachineData gets invoked from Ratis, it should also ensure that all > pending chunk writes are flushed to disk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
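The change settled on in the comment above can be sketched as below: a plain FileChannel instead of an AsynchronousFileChannel, with force() invoked before close() when the sync option is on. This is a simplified sketch; the real ChunkUtils.writeData also deals with tmp-file naming, checksums, and metrics, and the method shape here is illustrative.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of writeData after the HDDS-1621 change: a synchronous FileChannel
// (the caller already runs on a chunk-executor thread in Ratis), with
// force() before close() when dfs.container.chunk.write.sync is true.
class ChunkWriteSketch {
    static void writeData(Path chunkFile, ByteBuffer data, boolean doSyncWrite)
            throws IOException {
        try (FileChannel channel = FileChannel.open(chunkFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE)) {
            while (data.hasRemaining()) {
                channel.write(data);
            }
            if (doSyncWrite) {
                // Persist file contents; false = metadata sync not required
                // for chunk data durability.
                channel.force(false);
            }
        } // close() here: completion now implies durability when sync is set
    }
}
```

With this ordering, completing the future handed back to Ratis by writeStateMachineData only happens after the data is durable, when sync is enabled.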
[jira] [Commented] (HDDS-1621) flushStateMachineData should ensure the write chunks are flushed to disk
[ https://issues.apache.org/jira/browse/HDDS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854754#comment-16854754 ] Supratim Deka commented on HDDS-1621: - ChunkManagerImpl is initialised with the default value of DFS_CONTAINER_CHUNK_WRITE_SYNC_KEY (="dfs.container.chunk.write.sync"), which is set to false. This means that, when writing the tmp chunk file, ChunkUtils.writeData does not pass StandardOpenOption.SYNC to the file channel (currently an AsynchronousFileChannel). At the end of writeData, the function invokes close() on the channel and returns with success. This is enough to complete the future handed back to Ratis by writeStateMachineData, which means that completion of the state machine future does not guarantee that the chunk is persisted. We will target two changes: 1. Return a future from writeData containing the FileChannel. flushStateMachineData can then invoke force() on each of the channels in the write chunk future map and then close them. 2. Change AsynchronousFileChannel to FileChannel in ChunkUtils.writeData. This is the right thing to do because writeStateMachineData is already scheduling the write chunk requests on threads from XceiverServerRatis.chunkExecutor. > flushStateMachineData should ensure the write chunks are flushed to disk > > > Key: HDDS-1621 > URL: https://issues.apache.org/jira/browse/HDDS-1621 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Supratim Deka >Priority: Major > > Currently, chunk writes are not synced to disk by default. When > flushStateMachineData gets invoked from Ratis, it should also ensure that all > pending chunk writes are flushed to disk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
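The first of the two changes proposed in this comment (later simplified in the follow-up comment to a force+close inside writeData) amounts to tracking each pending chunk's channel and forcing them all from the flush path. A rough sketch, with purely illustrative names; the real code keys its future map by log index inside ContainerStateMachine:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the originally proposed flushStateMachineData
// handling: keep the FileChannel of every in-flight chunk write, then
// force() and close() each of them when a flush is requested.
class ChunkFlushTracker {
    private final Map<Long, FileChannel> openChunkChannels =
        new ConcurrentHashMap<>();

    // Called when a chunk write has issued its write() calls but the
    // data is not yet known to be durable.
    void register(long logIndex, FileChannel channel) {
        openChunkChannels.put(logIndex, channel);
    }

    // Called from flushStateMachineData: make every pending chunk up to
    // the requested log index durable, then release the channel.
    void flushUpTo(long maxLogIndex) throws IOException {
        for (Map.Entry<Long, FileChannel> e : openChunkChannels.entrySet()) {
            if (e.getKey() <= maxLogIndex) {
                e.getValue().force(false);   // flush chunk data to disk
                e.getValue().close();
                openChunkChannels.remove(e.getKey());
            }
        }
    }
}
```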
[jira] [Updated] (HDDS-1621) flushStateMachineData should ensure the write chunks are flushed to disk
[ https://issues.apache.org/jira/browse/HDDS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1621: Issue Type: Sub-task (was: Bug) Parent: HDDS-1595 > flushStateMachineData should ensure the write chunks are flushed to disk > > > Key: HDDS-1621 > URL: https://issues.apache.org/jira/browse/HDDS-1621 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Shashikant Banerjee >Assignee: Supratim Deka >Priority: Major > > Currently, chunk writes are not synced to disk by default. When > flushStateMachineData gets invoked from Ratis, it should also ensure that all > pending chunk writes are flushed to disk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1603) Handle Ratis Append Failure in Container State Machine
Supratim Deka created HDDS-1603: --- Summary: Handle Ratis Append Failure in Container State Machine Key: HDDS-1603 URL: https://issues.apache.org/jira/browse/HDDS-1603 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode, SCM Reporter: Supratim Deka RATIS-573 would add notification to the State Machine on encountering failure during Log append. The scope of this jira is to build on RATIS-573 and define the handling for log append failure in Container State Machine. 1. Enqueue pipeline unhealthy action to SCM, add a reason code to the message. 2. Trigger heartbeat to SCM 3. Notify Ratis volume unhealthy to the Datanode, so that DN can trigger async volume checker Changes in the SCM to leverage the additional failure reason code, is outside the scope of this jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1595) Handling IO Failures on the Datanode
[ https://issues.apache.org/jira/browse/HDDS-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1595: Attachment: Handling IO Failures on the Datanode.pdf > Handling IO Failures on the Datanode > > > Key: HDDS-1595 > URL: https://issues.apache.org/jira/browse/HDDS-1595 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Priority: Major > Attachments: Handling IO Failures on the Datanode.pdf, Raft IO v2.svg > > > This Jira covers all the changes required to handle IO Failures on the > Datanode. Handling an IO failure on the Datanode involves detecting failures > as they happen and propagating the failure to the appropriate component in > the system - possibly the Client and/or the SCM based on the type of failure. > At a high-level, IO Failure handling has the following goals: > 1. Prevent Inconsistencies and corruption - due to non-handling or > mishandling of failures. > 2. Prevent any data loss - timely detection of failure and propagate correct > error back to the initiator instead of silently dropping the data while the > client assumes the operation is committed. > 3. Contain the disruption in the system - if a disk volume fails on a DN, > operations to the other nodes and volumes should not be affected. > Details pertaining to design and changes required are covered in the attached > pdf document. > A sequence diagram used to analyse the Datanode IO Path is also attached, in > svg format. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1595) Handling IO Failures on the Datanode
Supratim Deka created HDDS-1595: --- Summary: Handling IO Failures on the Datanode Key: HDDS-1595 URL: https://issues.apache.org/jira/browse/HDDS-1595 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Reporter: Supratim Deka Attachments: Raft IO v2.svg This Jira covers all the changes required to handle IO Failures on the Datanode. Handling an IO failure on the Datanode involves detecting failures as they happen and propagating the failure to the appropriate component in the system - possibly the Client and/or the SCM based on the type of failure. At a high-level, IO Failure handling has the following goals: 1. Prevent Inconsistencies and corruption - due to non-handling or mishandling of failures. 2. Prevent any data loss - timely detection of failure and propagate correct error back to the initiator instead of silently dropping the data while the client assumes the operation is committed. 3. Contain the disruption in the system - if a disk volume fails on a DN, operations to the other nodes and volumes should not be affected. Details pertaining to design and changes required are covered in the attached pdf document. A sequence diagram used to analyse the Datanode IO Path is also attached, in svg format. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-700) Support rack awared node placement policy based on network topology
[ https://issues.apache.org/jira/browse/HDDS-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847244#comment-16847244 ] Supratim Deka commented on HDDS-700: Looks like the checkstyle issues reported in patch 03 slipped by and made it into the commit. > Support rack awared node placement policy based on network topology > --- > > Key: HDDS-700 > URL: https://issues.apache.org/jira/browse/HDDS-700 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Xiaoyu Yao >Assignee: Sammi Chen >Priority: Major > Fix For: 0.4.1 > > Attachments: HDDS-700.01.patch, HDDS-700.02.patch, HDDS-700.03.patch > > > Implement a new container placement policy based on the datanode's > network topology. It follows the same rule as HDFS. > By default, with 3 replicas, two replicas will be on the same rack, the third > replica and all the remaining replicas will be on different racks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1534) freon should return non-zero exit code on failure
[ https://issues.apache.org/jira/browse/HDDS-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846789#comment-16846789 ] Supratim Deka commented on HDDS-1534: - +1 Thanks [~nilotpalnandi] for updating the patch. > freon should return non-zero exit code on failure > - > > Key: HDDS-1534 > URL: https://issues.apache.org/jira/browse/HDDS-1534 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Nilotpal Nandi >Priority: Major > Attachments: HDDS-1534.001.patch, HDDS-1534.002.patch > > > Currently freon does not return any non-zero exit code even on failure. > The status shows as "Failed" but the exit code is always zero. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1534) freon should return non-zero exit code on failure
[ https://issues.apache.org/jira/browse/HDDS-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844651#comment-16844651 ] Supratim Deka commented on HDDS-1534: - Thanks [~nilotpalnandi] for the patch. Why not change the type of the private member "exception" to Exception, instead of adding the new member failureCause? This way we do not have to remember setting 2 values whenever there is an error. Also a question about the validator, may be outside the scope of this Jira. Validator errors are just logged, but there is no indication of failure to the main flow. Does this behavior also need to be fixed? > freon should return non-zero exit code on failure > - > > Key: HDDS-1534 > URL: https://issues.apache.org/jira/browse/HDDS-1534 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Nilotpal Nandi >Priority: Major > Attachments: HDDS-1534.001.patch > > > Currently freon does not return any non-zero exit code even on failure. > The status shows as "Failed" but the exit code is always zero. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1454) GC other system pause events can trigger pipeline destroy for all the nodes in the cluster
[ https://issues.apache.org/jira/browse/HDDS-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka reassigned HDDS-1454: --- Assignee: Supratim Deka > GC other system pause events can trigger pipeline destroy for all the nodes > in the cluster > -- > > Key: HDDS-1454 > URL: https://issues.apache.org/jira/browse/HDDS-1454 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.3.0 >Reporter: Mukul Kumar Singh >Assignee: Supratim Deka >Priority: Major > Labels: MiniOzoneChaosCluster > > In a MiniOzoneChaosCluster run it was observed that events like GC pauses or > any other pauses in SCM can mark all the datanodes as stale in SCM. This will > trigger multiple pipeline destroy and will render the system unusable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1559) Include committedBytes to determine Out of Space in VolumeChoosingPolicy
Supratim Deka created HDDS-1559: --- Summary: Include committedBytes to determine Out of Space in VolumeChoosingPolicy Key: HDDS-1559 URL: https://issues.apache.org/jira/browse/HDDS-1559 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Reporter: Supratim Deka Assignee: Supratim Deka This is a follow-up from HDDS-1511 and HDDS-1535. Currently, when creating a new Container, the DN invokes RoundRobinVolumeChoosingPolicy:chooseVolume(). This routine checks for (volume available space > container max size). If no eligible volume is found, the policy throws a DiskOutOfSpaceException. This is the current behaviour. However, the computation of available space does not take into consideration the space that is going to be consumed by writes to existing containers which are still Open and accepting chunk writes. This Jira proposes to enhance the space availability check in chooseVolume by inclusion of committed space (committedBytes in HddsVolume) in the equation. The handling/management of the exception in Ratis will not be modified in this Jira. That will be scoped separately as part of Datanode IO Failure handling work. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
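The enhanced eligibility check proposed above reduces to one inequality. The helper below is a standalone sketch of that inequality, not the actual RoundRobinVolumeChoosingPolicy method; the real policy iterates volumes round-robin and pulls these values from the HddsVolume.

```java
// Sketch of the space check proposed in HDDS-1559: a volume is eligible
// for a new container only if its free space, minus the space already
// promised to existing Open containers (committedBytes), can still hold
// one full container.
class VolumeSpaceCheck {
    static boolean hasSpaceForContainer(long availableBytes,
                                        long committedBytes,
                                        long containerMaxSize) {
        return availableBytes - committedBytes >= containerMaxSize;
    }
}
```

Under the old check (availableBytes >= containerMaxSize alone), a volume could be chosen even though all of its free space was already promised to Open containers still accepting chunk writes.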
[jira] [Created] (HDDS-1535) Space tracking for Open Containers : Handle Node Startup
Supratim Deka created HDDS-1535: --- Summary: Space tracking for Open Containers : Handle Node Startup Key: HDDS-1535 URL: https://issues.apache.org/jira/browse/HDDS-1535 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Reporter: Supratim Deka Assignee: Supratim Deka This is related to HDDS-1511 Space tracking for Open Containers (committed space in the volume) relies on usedBytes in the Container state. usedBytes is not persisted for every update (chunkWrite). So on a node restart the value is stale. The proposal is to: iterate the block DB for each open container during startup and compute the used space. The block DB process will be accelerated by spawning executors for each container. This process will be carried out as part of building the container set during startup. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
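The startup recomputation described above (scan each open container's block DB in parallel to rebuild the stale usedBytes) can be sketched as follows. The Container interface and blockSizesFromDb() are stand-ins for the real Ozone container and its RocksDB block metadata iteration.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of the HDDS-1535 proposal: usedBytes is not persisted per chunk
// write, so on restart each open container's block metadata is scanned,
// with one task per container, to recompute the used space.
class UsedSpaceRecovery {
    interface Container {
        List<Long> blockSizesFromDb();     // stand-in for iterating the block DB
        void setUsedBytes(long usedBytes);
    }

    static void recomputeUsedBytes(List<Container> openContainers) {
        ExecutorService pool = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());
        for (Container c : openContainers) {
            // One task per container, accelerating the DB scans.
            pool.submit(() -> {
                long used = c.blockSizesFromDb().stream()
                    .mapToLong(Long::longValue).sum();
                c.setUsedBytes(used);
            });
        }
        pool.shutdown();
        try {
            // Startup blocks until all containers are accounted for.
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```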
[jira] [Created] (HDDS-1533) JVM exit on TestHddsDatanodeService
Supratim Deka created HDDS-1533: --- Summary: JVM exit on TestHddsDatanodeService Key: HDDS-1533 URL: https://issues.apache.org/jira/browse/HDDS-1533 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Reporter: Supratim Deka JVM exits when running TestHddsDatanodeService https://builds.apache.org/job/hadoop-multibranch/job/PR-812/3/artifact/out/patch-unit-hadoop-hdds.txt test encounters same failure on trunk without my patch for HDDS-1511 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1511) Space tracking for Open Containers in HDDS Volumes
[ https://issues.apache.org/jira/browse/HDDS-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1511: Attachment: HDDS-1511.001.patch > Space tracking for Open Containers in HDDS Volumes > -- > > Key: HDDS-1511 > URL: https://issues.apache.org/jira/browse/HDDS-1511 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1511.000.patch, HDDS-1511.001.patch > > Time Spent: 10m > Remaining Estimate: 0h > > For every HDDS Volume, track the space usage in open containers. Introduce a > counter committedBytes in HddsVolume - this counts the remaining space in > Open containers until they reach max capacity. The counter is incremented (by > container max capacity) for every container create. And decremented (by chunk > size) for every chunk write. > Space tracking for open containers will enable adding a safety check during > container create. > If there is not sufficient free space in the volume, the container create > operation can be failed. > The scope of this jira is to just add the space tracking for Open Containers. > Checking for space and failing container create will be introduced in a > subsequent jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1511) Space tracking for Open Containers in HDDS Volumes
[ https://issues.apache.org/jira/browse/HDDS-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836892#comment-16836892 ] Supratim Deka commented on HDDS-1511: - addressed comment from [~arpitagarwal] in patch 001. will add a pull request as well. > Space tracking for Open Containers in HDDS Volumes > -- > > Key: HDDS-1511 > URL: https://issues.apache.org/jira/browse/HDDS-1511 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1511.000.patch, HDDS-1511.001.patch > > Time Spent: 10m > Remaining Estimate: 0h > > For every HDDS Volume, track the space usage in open containers. Introduce a > counter committedBytes in HddsVolume - this counts the remaining space in > Open containers until they reach max capacity. The counter is incremented (by > container max capacity) for every container create. And decremented (by chunk > size) for every chunk write. > Space tracking for open containers will enable adding a safety check during > container create. > If there is not sufficient free space in the volume, the container create > operation can be failed. > The scope of this jira is to just add the space tracking for Open Containers. > Checking for space and failing container create will be introduced in a > subsequent jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1511) Space tracking for Open Containers in HDDS Volumes
[ https://issues.apache.org/jira/browse/HDDS-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1511: Attachment: HDDS-1511.000.patch Status: Patch Available (was: Open) unit test code added into existing TestContainerPersistence tests. > Space tracking for Open Containers in HDDS Volumes > -- > > Key: HDDS-1511 > URL: https://issues.apache.org/jira/browse/HDDS-1511 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Attachments: HDDS-1511.000.patch > > > For every HDDS Volume, track the space usage in open containers. Introduce a > counter committedBytes in HddsVolume - this counts the remaining space in > Open containers until they reach max capacity. The counter is incremented (by > container max capacity) for every container create. And decremented (by chunk > size) for every chunk write. > Space tracking for open containers will enable adding a safety check during > container create. > If there is not sufficient free space in the volume, the container create > operation can be failed. > The scope of this jira is to just add the space tracking for Open Containers. > Checking for space and failing container create will be introduced in a > subsequent jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-1511) Space tracking for Open Containers in HDDS Volumes
Supratim Deka created HDDS-1511: --- Summary: Space tracking for Open Containers in HDDS Volumes Key: HDDS-1511 URL: https://issues.apache.org/jira/browse/HDDS-1511 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Reporter: Supratim Deka Assignee: Supratim Deka For every HDDS Volume, track the space usage in open containers. Introduce a counter committedBytes in HddsVolume - this counts the remaining space in Open containers until they reach max capacity. The counter is incremented (by container max capacity) for every container create. And decremented (by chunk size) for every chunk write. Space tracking for open containers will enable adding a safety check during container create. If there is not sufficient free space in the volume, the container create operation can be failed. The scope of this jira is to just add the space tracking for Open Containers. Checking for space and failing container create will be introduced in a subsequent jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
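The counter lifecycle described above (increment by the container max capacity at create, decrement by the chunk size on every chunk write) can be sketched as below. This is an illustration of the bookkeeping only; the class and method names are hypothetical, while the real change lives in HddsVolume.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the HDDS-1511 committed-space tracking on a volume:
// committedBytes counts the remaining space in Open containers until
// they reach max capacity.
class HddsVolumeSpace {
    private final AtomicLong committedBytes = new AtomicLong(0);

    // Container create on this volume: reserve the container's full
    // max capacity up front.
    long incCommittedBytes(long containerMaxSize) {
        return committedBytes.addAndGet(containerMaxSize);
    }

    // Chunk write into an Open container: the written bytes are no
    // longer pending, so release them from the reservation.
    long decCommittedBytes(long chunkSize) {
        return committedBytes.addAndGet(-chunkSize);
    }

    long getCommittedBytes() {
        return committedBytes.get();
    }
}
```

An AtomicLong keeps the counter consistent under concurrent chunk writes without taking the volume lock on the hot write path.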
[jira] [Updated] (HDDS-1206) Handle Datanode volume out of space
[ https://issues.apache.org/jira/browse/HDDS-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1206: Summary: Handle Datanode volume out of space (was: need to handle in the client when one of the datanode disk goes out of space) > Handle Datanode volume out of space > --- > > Key: HDDS-1206 > URL: https://issues.apache.org/jira/browse/HDDS-1206 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Nilotpal Nandi >Assignee: Supratim Deka >Priority: Major > > steps taken : > > # create 40 datanode cluster. > # one of the datanodes has less than 5 GB space. > # Started writing key of size 600MB. > operation failed: > Error on the client: > > {noformat} > Fri Mar 1 09:05:28 UTC 2019 Ruuning > /root/hadoop_trunk/ozone-0.4.0-SNAPSHOT/bin/ozone sh key put > testvol172275910-1551431122-1/testbuck172275910-1551431122-1/test_file24 > /root/test_files/test_file24 > original md5sum a6de00c9284708585f5a99b0490b0b23 > 2019-03-01 09:05:39,142 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-03-01 09:05:39,578 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-03-01 09:05:40,368 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-03-01 09:05:40,450 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:6
[jira] [Assigned] (HDDS-1315) datanode process dies if it runs out of disk space
[ https://issues.apache.org/jira/browse/HDDS-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka reassigned HDDS-1315: --- Assignee: Supratim Deka > datanode process dies if it runs out of disk space > -- > > Key: HDDS-1315 > URL: https://issues.apache.org/jira/browse/HDDS-1315 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Sandeep Nemuri >Assignee: Supratim Deka >Priority: Major > > As of now the datanode process dies if it runs out of disk space, which makes > the data present on that DN inaccessible. > datanode logs: > {code:java} > 2019-03-11 04:01:27,141 ERROR org.apache.ratis.server.storage.RaftLogWorker: > Terminating with exit status 1: > fb635e52-e2eb-46b1-b109-a831c10d3bf8-RaftLogWorker failed. > java.io.FileNotFoundException: > /opt/data/meta/ratis/68e315f3-312c-4c9f-a7bd-590194deb5e7/current/log_inprogress_8705582 > (No space left on device) > at java.io.RandomAccessFile.open0(Native Method) > at java.io.RandomAccessFile.open(RandomAccessFile.java:316) > at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) > at > org.apache.ratis.server.storage.LogOutputStream.<init>(LogOutputStream.java:66) > at > org.apache.ratis.server.storage.RaftLogWorker$StartLogSegment.execute(RaftLogWorker.java:436) > at > org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:219) > at java.lang.Thread.run(Thread.java:745) > {code} > {code:java} > 2019-03-11 04:01:25,531 [grpc-default-executor-9192] INFO - Operation: > WriteChunk : Trace ID: : Message: java.nio.file.FileSystemException: > /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/88/chunks/ba29bb91559179cbf7ab5d86cac47ba1_stream_9fb1e802-dca6-46e0-be12-5ac743d8563d_chunk_1.tmp.11076.8705539: > No space left on device : Result: IO_EXCEPTION > 2019-03-11 04:01:25,543 [grpc-default-executor-9192] INFO - Operation: > WriteChunk : Trace ID: : Message: java.nio.file.FileSystemException: > 
/opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/86/chunks/19ef3c1d36eadbc9538116c68c6e494f_stream_c58e8b91-dc18-4b61-918f-ab1eeda41c02_chunk_1.tmp.11076.8705540: > No space left on device : Result: IO_EXCEPTION > 2019-03-11 04:01:25,546 [grpc-default-executor-9192] INFO - Operation: > WriteChunk : Trace ID: : Message: java.nio.file.FileSystemException: > /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/87/chunks/83a6a81f2f703f49a7e0a1413eebfc4c_stream_cae1ed30-c613-4278-8404-c9e37d0b690f_chunk_1.tmp.11076.8705541: > No space left on device : Result: IO_EXCEPTION > {code}
[jira] [Commented] (HDDS-1315) datanode process dies if it runs out of disk space
[ https://issues.apache.org/jira/browse/HDDS-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810444#comment-16810444 ] Supratim Deka commented on HDDS-1315: - Related to disk full handling across Ozone components. > datanode process dies if it runs out of disk space > -- > > Key: HDDS-1315 > URL: https://issues.apache.org/jira/browse/HDDS-1315 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Sandeep Nemuri >Priority: Major > > As of now the datanode process dies if it runs out of disk space, which makes > the data present on that DN inaccessible. > datanode logs: > {code:java} > 2019-03-11 04:01:27,141 ERROR org.apache.ratis.server.storage.RaftLogWorker: > Terminating with exit status 1: > fb635e52-e2eb-46b1-b109-a831c10d3bf8-RaftLogWorker failed. > java.io.FileNotFoundException: > /opt/data/meta/ratis/68e315f3-312c-4c9f-a7bd-590194deb5e7/current/log_inprogress_8705582 > (No space left on device) > at java.io.RandomAccessFile.open0(Native Method) > at java.io.RandomAccessFile.open(RandomAccessFile.java:316) > at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) > at > org.apache.ratis.server.storage.LogOutputStream.<init>(LogOutputStream.java:66) > at > org.apache.ratis.server.storage.RaftLogWorker$StartLogSegment.execute(RaftLogWorker.java:436) > at > org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:219) > at java.lang.Thread.run(Thread.java:745) > {code} > {code:java} > 2019-03-11 04:01:25,531 [grpc-default-executor-9192] INFO - Operation: > WriteChunk : Trace ID: : Message: java.nio.file.FileSystemException: > /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/88/chunks/ba29bb91559179cbf7ab5d86cac47ba1_stream_9fb1e802-dca6-46e0-be12-5ac743d8563d_chunk_1.tmp.11076.8705539: > No space left on device : Result: IO_EXCEPTION > 2019-03-11 04:01:25,543 [grpc-default-executor-9192] INFO - Operation: > WriteChunk : Trace ID: : Message: 
java.nio.file.FileSystemException: > /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/86/chunks/19ef3c1d36eadbc9538116c68c6e494f_stream_c58e8b91-dc18-4b61-918f-ab1eeda41c02_chunk_1.tmp.11076.8705540: > No space left on device : Result: IO_EXCEPTION > 2019-03-11 04:01:25,546 [grpc-default-executor-9192] INFO - Operation: > WriteChunk : Trace ID: : Message: java.nio.file.FileSystemException: > /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/87/chunks/83a6a81f2f703f49a7e0a1413eebfc4c_stream_cae1ed30-c613-4278-8404-c9e37d0b690f_chunk_1.tmp.11076.8705541: > No space left on device : Result: IO_EXCEPTION > {code}
[jira] [Assigned] (HDDS-1206) need to handle in the client when one of the datanode disk goes out of space
[ https://issues.apache.org/jira/browse/HDDS-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka reassigned HDDS-1206: --- Assignee: Supratim Deka (was: Shashikant Banerjee) > need to handle in the client when one of the datanode disk goes out of space > > > Key: HDDS-1206 > URL: https://issues.apache.org/jira/browse/HDDS-1206 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Nilotpal Nandi >Assignee: Supratim Deka >Priority: Major > > steps taken : > > # create 40 datanode cluster. > # one of the datanodes has less than 5 GB space. > # Started writing key of size 600MB. > operation failed: > Error on the client: > > {noformat} > Fri Mar 1 09:05:28 UTC 2019 Ruuning > /root/hadoop_trunk/ozone-0.4.0-SNAPSHOT/bin/ozone sh key put > testvol172275910-1551431122-1/testbuck172275910-1551431122-1/test_file24 > /root/test_files/test_file24 > original md5sum a6de00c9284708585f5a99b0490b0b23 > 2019-03-01 09:05:39,142 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at 
java.lang.Thread.run(Thread.java:748) > 2019-03-01 09:05:39,578 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-03-01 09:05:40,368 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-03-01 09:05:40,450 ERROR storage.BlockOutputStream: Unexpected Storage > Container Exception: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > ContainerID 79 creation failed > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStr
[jira] [Commented] (HDDS-1365) Fix error handling in KeyValueContainerCheck
[ https://issues.apache.org/jira/browse/HDDS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807897#comment-16807897 ] Supratim Deka commented on HDDS-1365: - hello [~linyiqun], I did consider losing the error code. Not every corruption will require specific handling. At best there will be a few categories. When that happens, we can introduce a specific new exception for every such specialised category with its own handler logic. Basically, moving away from the error code approach - using exceptions does make the code cleaner and less clunky. And it should be ok to defer this incremental work until actually required. Yes? > Fix error handling in KeyValueContainerCheck > > > Key: HDDS-1365 > URL: https://issues.apache.org/jira/browse/HDDS-1365 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Attachments: HDDS-1365.000.patch > > > Error handling and propagation in KeyValueContainerCheck needs to be based on > throwing IOException instead of passing an error code to the calling function. > HDDS-1163 implemented the basic framework using a mix of error code return > and exception handling. There is added complexity because exceptions deep > inside the call chain are being caught and translated to error code return > values. The goal is to change all error handling in this class to use > Exceptions.
[jira] [Assigned] (HDDS-1200) Ozone Data Scrubbing : Checksum verification for chunks
[ https://issues.apache.org/jira/browse/HDDS-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka reassigned HDDS-1200: --- Assignee: (was: Supratim Deka) > Ozone Data Scrubbing : Checksum verification for chunks > --- > > Key: HDDS-1200 > URL: https://issues.apache.org/jira/browse/HDDS-1200 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Supratim Deka >Priority: Major > > Background scrubber should read each chunk and verify the checksum.
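A minimal sketch of what "read each chunk and verify the checksum" could look like for the HDDS-1200 scrubber; SHA-256 and the method names below are illustrative assumptions, not the checksum machinery Ozone actually uses.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Sketch of chunk checksum verification for a background scrubber.
// SHA-256 and these names are illustrative, not Ozone's actual Checksum code.
public class ChunkScrubberSketch {

    public static byte[] checksum(byte[] chunkData) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(chunkData);
        } catch (NoSuchAlgorithmException e) {
            // Every JVM is required to ship SHA-256, so this cannot happen.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    // Recompute the checksum of the chunk bytes read back from disk and
    // compare with the checksum stored when the chunk was written.
    public static boolean verifyChunk(byte[] chunkData, byte[] storedChecksum) {
        return Arrays.equals(checksum(chunkData), storedChecksum);
    }
}
```

A mismatch flags the chunk as corrupt; what the scrubber then does with a corrupt chunk (report, re-replicate) is outside this jira's scope.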
[jira] [Updated] (HDDS-1365) Fix error handling in KeyValueContainerCheck
[ https://issues.apache.org/jira/browse/HDDS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-1365: Attachment: HDDS-1365.000.patch Status: Patch Available (was: Open) > Fix error handling in KeyValueContainerCheck > > > Key: HDDS-1365 > URL: https://issues.apache.org/jira/browse/HDDS-1365 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Attachments: HDDS-1365.000.patch > > > Error handling and propagation in KeyValueContainerCheck needs to be based on > throwing IOException instead of passing an error code to the calling function. > HDDS-1163 implemented the basic framework using a mix of error code return > and exception handling. There is added complexity because exceptions deep > inside the call chain are being caught and translated to error code return > values. The goal is to change all error handling in this class to use > Exceptions.
[jira] [Created] (HDDS-1365) Fix error handling in KeyValueContainerCheck
Supratim Deka created HDDS-1365: --- Summary: Fix error handling in KeyValueContainerCheck Key: HDDS-1365 URL: https://issues.apache.org/jira/browse/HDDS-1365 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Supratim Deka Assignee: Supratim Deka Error handling and propagation in KeyValueContainerCheck needs to be based on throwing IOException instead of passing an error code to the calling function. HDDS-1163 implemented the basic framework using a mix of error code return and exception handling. There is added complexity because exceptions deep inside the call chain are being caught and translated to error code return values. The goal is to change all error handling in this class to use Exceptions.
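The refactor HDDS-1365 describes (error-code returns replaced by thrown IOExceptions) can be illustrated with a hypothetical before/after pair; the method names below are made up for illustration and are not the actual KeyValueContainerCheck code.

```java
import java.io.IOException;

// Before/after illustration of the HDDS-1365 refactor. All names here are
// hypothetical stand-ins, not the real KeyValueContainerCheck methods.
public class ErrorHandlingSketch {

    // Before: deep calls translate failures into error codes that every
    // intermediate caller must check and propagate by hand.
    public static int checkChunkWithErrorCode(String chunkPath) {
        if (chunkPath == null) {
            return -1; // "missing chunk" error code
        }
        return 0; // success
    }

    // After: the failure is thrown as an IOException and propagates up the
    // call chain without per-level return-value checks.
    public static void checkChunk(String chunkPath) throws IOException {
        if (chunkPath == null) {
            throw new IOException("chunk file missing");
        }
    }

    public static boolean scanContainer(String chunkPath) {
        try {
            checkChunk(chunkPath);
            return true; // container is healthy
        } catch (IOException e) {
            // One handler at the top of the scan replaces error-code checks
            // threaded through every intermediate method.
            return false;
        }
    }
}
```

This matches the comment above: specialised exception subclasses can be added later for corruption categories that need their own handler logic.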
[jira] [Resolved] (HDDS-1229) Concurrency issues with Background Block Delete
[ https://issues.apache.org/jira/browse/HDDS-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka resolved HDDS-1229. - Resolution: Not A Problem This is not an issue because HDDS-1163 adopted a simple approach to resolving the concurrency with background block delete. Every time a chunk lookup fails when checking the block DB, the checker retries lookup for the specific block. If the block is not found in DB, it means background delete has removed it and the missing chunk is not actually a corruption. > Concurrency issues with Background Block Delete > --- > > Key: HDDS-1229 > URL: https://issues.apache.org/jira/browse/HDDS-1229 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Supratim Deka >Priority: Major > > HDDS-1163 takes a simplistic approach to deal with concurrent block deletes > on a container, > when the metadata scanner is checking existence of chunks for each block in > the Container Block DB. > As part of HDDS-1163 checkBlockDB() just does a retry if any inconsistency is > detected during a concurrency window. The retry is expected to succeed > because the new DB iterator will not include any of the blocks being > processed by the concurrent background delete. If retry fails, then the > inconsistency is ignored expecting the next iteration of the metadata scanner > will avoid running concurrently with the same container. > This Jira is raised to explore a more predictable (yet simple) mechanism to > deal with this concurrency.
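The retry described in the resolution above can be sketched as follows; the Set-based "block DB" and these names are hypothetical stand-ins for the actual RocksDB-backed container metadata code.

```java
import java.util.Set;

// Sketch of the HDDS-1229 resolution: a missing chunk file is only reported
// as corruption if the owning block is still present in the block DB. The
// Set-based "DB" is a stand-in for the real container metadata store.
public class BlockCheckSketch {

    public enum CheckResult { HEALTHY, DELETED_CONCURRENTLY, CORRUPT }

    public static CheckResult checkBlock(String blockId,
                                         Set<String> blockDb,
                                         Set<String> chunkFilesOnDisk) {
        if (chunkFilesOnDisk.contains(blockId)) {
            return CheckResult.HEALTHY;
        }
        // Chunk lookup failed: retry against the block DB before declaring
        // corruption. If background delete already removed the block, the
        // missing chunk is expected, not a corruption.
        if (!blockDb.contains(blockId)) {
            return CheckResult.DELETED_CONCURRENTLY;
        }
        return CheckResult.CORRUPT;
    }
}
```

The scanner only flags CORRUPT when the DB still claims the block, i.e. when the chunk really should exist on disk.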