[jira] [Updated] (HDDS-3762) Intermittent failure in TestDeleteWithSlowFollower
[ https://issues.apache.org/jira/browse/HDDS-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3762: - Labels: pull-request-available (was: ) > Intermittent failure in TestDeleteWithSlowFollower > -- > > Key: HDDS-3762 > URL: https://issues.apache.org/jira/browse/HDDS-3762 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 1.0.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > > TestDeleteWithSlowFollower failed soon after it was re-enabled in HDDS-3330. > {code:title=https://github.com/apache/hadoop-ozone/runs/753363338} > [INFO] Running org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 28.647 s <<< FAILURE! - in > org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower > [ERROR] > testDeleteKeyWithSlowFollower(org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower) > Time elapsed: 0.163 s <<< FAILURE! > java.lang.AssertionError > ... > at org.junit.Assert.assertNotNull(Assert.java:631) > at > org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225) > {code} > CC [~shashikant] [~elek] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] adoroszlai opened a new pull request #1376: HDDS-3762. Intermittent failure in TestDeleteWithSlowFollower
adoroszlai opened a new pull request #1376: URL: https://github.com/apache/hadoop-ozone/pull/1376 ## What changes were proposed in this pull request? Intermittent failure in `testDeleteKeyWithSlowFollower` seems to be caused by: * `DeleteBlocksCommandHandler` increments `invocationCount` near the beginning of `handle()`, and only updates `deleteTransactionId` later * `TestDeleteWithSlowFollower` waits for `invocationCount >= 1`, then asserts `deleteTransactionId` also increased The test is fixed by changing the order in the handler: only increment `invocationCount` at the end, when `deleteTransactionId` is already updated. I think `invocationCount` should be updated together with `totalTime` to provide (slightly more) correct average run time (`getAverageRunTime`). https://issues.apache.org/jira/browse/HDDS-3762 ## How was this patch tested? Passed 50x: https://github.com/adoroszlai/hadoop-ozone/runs/1059570465#step:5:4 Regular CI: https://github.com/adoroszlai/hadoop-ozone/runs/1059560880 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
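The reordering described in the PR can be sketched as follows. This is a hypothetical simplification, not the actual `DeleteBlocksCommandHandler` code (class and method names are illustrative): the handler publishes `deleteTransactionId` before bumping `invocationCount`, so a test that polls `invocationCount` as its "handler ran" signal never observes a stale transaction id.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the ordering fix: state is updated first, and the
// counter the test waits on is incremented last.
public class HandlerOrderingSketch {
    private final AtomicLong deleteTransactionId = new AtomicLong();
    private final AtomicInteger invocationCount = new AtomicInteger();

    public void handle(long txId) {
        // ... perform the actual block deletion work here ...
        deleteTransactionId.set(txId);      // publish state first
        invocationCount.incrementAndGet();  // signal "handled" last
    }

    public int getInvocationCount() { return invocationCount.get(); }
    public long getDeleteTransactionId() { return deleteTransactionId.get(); }

    public static void main(String[] args) {
        HandlerOrderingSketch h = new HandlerOrderingSketch();
        h.handle(42L);
        // Once invocationCount >= 1, deleteTransactionId is already visible.
        System.out.println(h.getInvocationCount() + " " + h.getDeleteTransactionId());
    }
}
```

With the original order (counter first, state later), a poller could see `invocationCount == 1` while `deleteTransactionId` still held its old value, which matches the intermittent `assertNotNull`/assertion failures in the test.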
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481746270 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: that makes sense. Indeed `status` sounds more like a health check and can carry much more information. Consider `OM` is also actually getting, essentially, `roles`. We can start from `roles`
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
adoroszlai commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481709338 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: I think it depends on whether you want to keep this command specific to roles, or may extend the same command in the future with other status info. Probably "roles" is better now, and "status" can be either a separate command or another alias later.
[jira] [Commented] (HDDS-3103) Have multi-raft pipeline calculator to recommend best pipeline number per datanode
[ https://issues.apache.org/jira/browse/HDDS-3103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188980#comment-17188980 ] Rui Wang commented on HDDS-3103: [~timmylicheng] Do you have a plan to work on this JIRA in near term? Could I take this one? > Have multi-raft pipeline calculator to recommend best pipeline number per > datanode > -- > > Key: HDDS-3103 > URL: https://issues.apache.org/jira/browse/HDDS-3103 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Affects Versions: 0.5.0 >Reporter: Li Cheng >Assignee: Li Cheng >Priority: Critical > > PipelinePlacementPolicy should have a calculator method to recommend better > number for pipeline number per node. The number used to come from > ozone.datanode.pipeline.limit in config. SCM should be able to consider how > many ratis dir and the ratis retry timeout to recommend the best pipeline > number for every node.
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481638257 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: I am ok with both. @adoroszlai what do you think?
[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
amaliujia commented on pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685278548 It seems that there is another opinion about what the command name should be, so I will convert this PR to a draft for now to wait for consensus. Thanks for all the reviews so far, and sorry for the confusion.
[GitHub] [hadoop-ozone] amaliujia commented on pull request #1340: HDDS-3188 Enable SCM group with failover proxy for SCM block location.
amaliujia commented on pull request #1340: URL: https://github.com/apache/hadoop-ozone/pull/1340#issuecomment-685271344 This PR overall LGTM
[GitHub] [hadoop-ozone] vivekratnavel commented on pull request #1364: HDDS-4165. GitHub Actions cache does not work outside of workspace
vivekratnavel commented on pull request #1364: URL: https://github.com/apache/hadoop-ozone/pull/1364#issuecomment-685272723 @adoroszlai Thanks for fixing this! +1 LGTM
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
amaliujia commented on a change in pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375#discussion_r481610070 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/GetServiceRolesSubcommand.java ## @@ -30,10 +30,10 @@ import java.util.concurrent.Callable; /** - * Handler of om get-service-roles command. + * Handler of om status command. */ @CommandLine.Command( -name = "getserviceroles", +name = "status", Review comment: That makes sense. Suggestion applied.
[jira] [Assigned] (HDDS-3762) Intermittent failure in TestDeleteWithSlowFollower
[ https://issues.apache.org/jira/browse/HDDS-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai reassigned HDDS-3762: -- Assignee: Attila Doroszlai > Intermittent failure in TestDeleteWithSlowFollower > -- > > Key: HDDS-3762 > URL: https://issues.apache.org/jira/browse/HDDS-3762 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 1.0.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > > TestDeleteWithSlowFollower failed soon after it was re-enabled in HDDS-3330. > {code:title=https://github.com/apache/hadoop-ozone/runs/753363338} > [INFO] Running org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 28.647 s <<< FAILURE! - in > org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower > [ERROR] > testDeleteKeyWithSlowFollower(org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower) > Time elapsed: 0.163 s <<< FAILURE! > java.lang.AssertionError > ... > at org.junit.Assert.assertNotNull(Assert.java:631) > at > org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225) > {code} > CC [~shashikant] [~elek]
[GitHub] [hadoop-ozone] cxorm commented on pull request #1367: HDDS-4169. Fix some minor errors in StorageContainerManager.md
cxorm commented on pull request #1367: URL: https://github.com/apache/hadoop-ozone/pull/1367#issuecomment-685254966 The failed `build-branch / unit (pull_request)` check seems unrelated to the patch. Let's commit it again.
[jira] [Commented] (HDDS-3762) Intermittent failure in TestDeleteWithSlowFollower
[ https://issues.apache.org/jira/browse/HDDS-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188941#comment-17188941 ] Attila Doroszlai commented on HDDS-3762:

I think the bug from the description was fixed by HDDS-3964: committed on Jul 17, no new failures since Jul 14:

{noformat}
2020/06/09/997/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/16/1047/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/22/1077/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/23/1122/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/25/1172/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/06/26/1215/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:225)
2020/07/02/1386/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:216)
2020/07/14/1658/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:216)
{noformat}

However, let me hijack this bug for another intermittent failure of TestDeleteWithSlowFollower:

{noformat}
2020/06/29/1299/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:279)
2020/07/25/2012/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/02/2191/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/03/2211/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/03/2214/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/16/2452/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
2020/08/28/2646/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.txt: at org.apache.hadoop.ozone.client.rpc.TestDeleteWithSlowFollower.testDeleteKeyWithSlowFollower(TestDeleteWithSlowFollower.java:272)
{noformat}

that is:

{code:title=https://github.com/apache/hadoop-ozone/blob/9cef3f63384d643ca8d25ea70d87f5415f92bc88/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java#L272}
Assert.assertTrue(containerData.getDeleteTransactionId() > delTrxId);
{code}

> Intermittent failure in TestDeleteWithSlowFollower
> --
>
> Key: HDDS-3762
> URL: https://issues.apache.org/jira/browse/HDDS-3762
> Project: Hadoop Distributed Data Store
>
[GitHub] [hadoop-ozone] cxorm commented on a change in pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
cxorm commented on a change in pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375#discussion_r481576976 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/GetServiceRolesSubcommand.java ## @@ -30,10 +30,10 @@ import java.util.concurrent.Callable; /** - * Handler of om get-service-roles command. + * Handler of om status command. */ @CommandLine.Command( -name = "getserviceroles", +name = "status", Review comment: Thanks @adoroszlai for the idea, I agree with it.
[GitHub] [hadoop-ozone] captainzmc commented on pull request #868: HDDS-3457. Fix ozonefs put and mkdir KEY_NOT_FOUND issue when ACL enable
captainzmc commented on pull request #868: URL: https://github.com/apache/hadoop-ozone/pull/868#issuecomment-685244606 > Hi @captainzmc, > I enabled ACL and performed mkdir using o3fs, but 'checkAccess' function is never reached. I have put some System.out.println statements in 'checkAccess' function, but nothing is being printed. Can you please help! Hi @aryangupta1998 [Is that the method](https://github.com/apache/hadoop-ozone/blob/34ee8311b0d0a37878fe1fd2e5d8c1b91aa8cc8f/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L1633)? I suggest you print the log with LOG.xxx or enable debugging in IDEA.
[jira] [Updated] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException
[ https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4176: - Labels: pull-request-available (was: ) > Fix failed UT: test2WayCommitForTimeoutException > > > Key: HDDS-4176 > URL: https://issues.apache.org/jira/browse/HDDS-4176 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Labels: pull-request-available > Fix For: 1.1.0 > > > org.apache.ratis.protocol.GroupMismatchException: > 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608) > at > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
timmylicheng commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481557217 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: ozone admin om getserviceroles -id=<> This is what OM does. I have my +1 on ozone admin (om|scm) roles. Status is more like health check.
[jira] [Updated] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException
[ https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-4176: --- Labels: (was: pull-request-available) > Fix failed UT: test2WayCommitForTimeoutException > > > Key: HDDS-4176 > URL: https://issues.apache.org/jira/browse/HDDS-4176 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Fix For: 1.1.0 > > > org.apache.ratis.protocol.GroupMismatchException: > 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608) > at > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)
[jira] [Updated] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException
[ https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-4176: --- Component/s: test > Fix failed UT: test2WayCommitForTimeoutException > > > Key: HDDS-4176 > URL: https://issues.apache.org/jira/browse/HDDS-4176 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Fix For: 1.1.0 > > > org.apache.ratis.protocol.GroupMismatchException: > 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608) > at > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)
[jira] [Resolved] (HDDS-4176) Fix failed UT: test2WayCommitForTimeoutException
[ https://issues.apache.org/jira/browse/HDDS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai resolved HDDS-4176. Fix Version/s: 1.1.0 Resolution: Fixed > Fix failed UT: test2WayCommitForTimeoutException > > > Key: HDDS-4176 > URL: https://issues.apache.org/jira/browse/HDDS-4176 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Labels: pull-request-available > Fix For: 1.1.0 > > > org.apache.ratis.protocol.GroupMismatchException: > 6f2b1ee5-bc2b-491c-bff4-ab0f4ce64709: group-2D066F5AFBD0 not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608) > at > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302)
[GitHub] [hadoop-ozone] adoroszlai merged pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException
adoroszlai merged pull request #1370: URL: https://github.com/apache/hadoop-ozone/pull/1370
[GitHub] [hadoop-ozone] adoroszlai commented on pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException
adoroszlai commented on pull request #1370: URL: https://github.com/apache/hadoop-ozone/pull/1370#issuecomment-685243200 Thanks @runzhiwang for the fix and @amaliujia for the review.
[jira] [Commented] (HDDS-4190) Intermittent failure in TestOMAllocateBlockRequest#testValidateAndUpdateCache
[ https://issues.apache.org/jira/browse/HDDS-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188923#comment-17188923 ] Attila Doroszlai commented on HDDS-4190: Similar assertions (failing intermittently when run locally): * TestOMVolumeSetQuotaRequest.testValidateAndUpdateCacheSuccess(TestOMVolumeSetQuotaRequest.java:101) * TestOMVolumeSetOwnerRequest.testValidateAndUpdateCacheSuccess(TestOMVolumeSetOwnerRequest.java:100) > Intermittent failure in TestOMAllocateBlockRequest#testValidateAndUpdateCache > - > > Key: HDDS-4190 > URL: https://issues.apache.org/jira/browse/HDDS-4190 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Priority: Major > > {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/09/01/2686/unit/hadoop-ozone/ozone-manager/org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.txt} > Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.214 s <<< > FAILURE! - in > org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest > testValidateAndUpdateCache(org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest) > Time elapsed: 0.129 s <<< FAILURE! > java.lang.AssertionError: Values should be different. Actual: 1598964934681 > ... > at org.junit.Assert.assertNotEquals(Assert.java:209) > at > org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.testValidateAndUpdateCache(TestOMAllocateBlockRequest.java:100) > {code}
[jira] [Created] (HDDS-4190) Intermittent failure in TestOMAllocateBlockRequest#testValidateAndUpdateCache
Attila Doroszlai created HDDS-4190: -- Summary: Intermittent failure in TestOMAllocateBlockRequest#testValidateAndUpdateCache Key: HDDS-4190 URL: https://issues.apache.org/jira/browse/HDDS-4190 Project: Hadoop Distributed Data Store Issue Type: Bug Components: test Reporter: Attila Doroszlai {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/09/01/2686/unit/hadoop-ozone/ozone-manager/org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.txt} Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.214 s <<< FAILURE! - in org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest testValidateAndUpdateCache(org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest) Time elapsed: 0.129 s <<< FAILURE! java.lang.AssertionError: Values should be different. Actual: 1598964934681 ... at org.junit.Assert.assertNotEquals(Assert.java:209) at org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.testValidateAndUpdateCache(TestOMAllocateBlockRequest.java:100) {code}
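A plausible explanation for the `assertNotEquals` failure above (an assumption about the failure mode, not taken from the test source) is clock resolution: two millisecond timestamps captured in quick succession can be identical, so asserting that an updated modification time differs from the original is inherently flaky. A minimal sketch:

```java
// Sketch of the suspected flakiness: System.currentTimeMillis() can return
// the same value for two calls made back-to-back, so a test asserting the
// two values are different can fail intermittently on a fast machine.
public class ClockResolutionSketch {
    public static void main(String[] args) {
        long first = System.currentTimeMillis();
        long second = System.currentTimeMillis();
        // first == second is possible; a robust test should assert
        // second >= first, or drive time through an injected fake clock.
        System.out.println(second >= first);
    }
}
```

The matching failure message ("Values should be different. Actual: 1598964934681") is consistent with both timestamps landing on the same millisecond.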
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
adoroszlai commented on a change in pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375#discussion_r481521954 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/om/GetServiceRolesSubcommand.java ## @@ -30,10 +30,10 @@ import java.util.concurrent.Callable; /** - * Handler of om get-service-roles command. + * Handler of om status command. */ @CommandLine.Command( -name = "getserviceroles", +name = "status", Review comment: I think we should keep `getserviceroles` as an alias for compatibility. ```suggestion name = "status", aliases = "getserviceroles", ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
amaliujia commented on pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685211998 Thank you for your review @cxorm!
```
[ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.417 s <<< FAILURE! - in org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest
[ERROR] testValidateAndUpdateCache(org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest) Time elapsed: 0.114 s <<< FAILURE!
java.lang.AssertionError: Values should be different. Actual: 1599003025036
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failEquals(Assert.java:185)
	at org.junit.Assert.assertNotEquals(Assert.java:161)
	at org.junit.Assert.assertNotEquals(Assert.java:198)
	at org.junit.Assert.assertNotEquals(Assert.java:209)
	at org.apache.hadoop.ozone.om.request.key.TestOMAllocateBlockRequest.testValidateAndUpdateCache(TestOMAllocateBlockRequest.java:100)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
```
The failed UT seems unrelated to this change.
[GitHub] [hadoop-ozone] github-actions[bot] closed pull request #912: WIP Patch - HDDS-2949: store dir/key entries in separate tables - first patch onl…
github-actions[bot] closed pull request #912: URL: https://github.com/apache/hadoop-ozone/pull/912
[GitHub] [hadoop-ozone] github-actions[bot] commented on pull request #912: WIP Patch - HDDS-2949: store dir/key entries in separate tables - first patch onl…
github-actions[bot] commented on pull request #912: URL: https://github.com/apache/hadoop-ozone/pull/912#issuecomment-685206902 Thank you very much for the patch. I am closing this PR __temporarily__ as there was no activity recently and it is waiting for response from its author. It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time. It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs. If you need ANY help to finish this PR, please [contact the community](https://github.com/apache/hadoop-ozone#contact) on the mailing list or the slack channel.
[GitHub] [hadoop-ozone] github-actions[bot] commented on pull request #1173: HDDS-3880. Improve OM HA Robot test
github-actions[bot] commented on pull request #1173: URL: https://github.com/apache/hadoop-ozone/pull/1173#issuecomment-685206895 Thank you very much for the patch. I am closing this PR __temporarily__ as there was no activity recently and it is waiting for response from its author. It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time. It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs. If you need ANY help to finish this PR, please [contact the community](https://github.com/apache/hadoop-ozone#contact) on the mailing list or the slack channel.
[GitHub] [hadoop-ozone] github-actions[bot] closed pull request #1173: HDDS-3880. Improve OM HA Robot test
github-actions[bot] closed pull request #1173: URL: https://github.com/apache/hadoop-ozone/pull/1173
[GitHub] [hadoop-ozone] runzhiwang commented on a change in pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException
runzhiwang commented on a change in pull request #1370: URL: https://github.com/apache/hadoop-ozone/pull/1370#discussion_r481493152 ## File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestWatchForCommit.java ## @@ -297,9 +298,11 @@ public void test2WayCommitForTimeoutException() throws Exception { xceiverClient.getPipeline())); reply.getResponse().get(); Assert.assertEquals(3, ratisClient.getCommitInfoMap().size()); +List datanodeDetails = pipeline.getNodes(); Review comment: @adoroszlai Thanks for the review. I have updated the patch.
[GitHub] [hadoop-ozone] cxorm commented on pull request #1075: HDDS-3369. Cleanup old write-path of volume in OM
cxorm commented on pull request #1075: URL: https://github.com/apache/hadoop-ozone/pull/1075#issuecomment-685191223 Thank you @adoroszlai for the advice. Splitting the `OzoneManagerProtocol` interface is huge work, IMHO. I think it is improper to change the interface after the GA release, so I propose closing this PR.
[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
amaliujia commented on pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685183677 Context: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481380452
[GitHub] [hadoop-ozone] amaliujia commented on pull request #1371: HDDS-2922. Balance ratis leader distribution in datanodes
amaliujia commented on pull request #1371: URL: https://github.com/apache/hadoop-ozone/pull/1371#issuecomment-685181887 Thanks @runzhiwang. This is awesome work! I will also try to help review this PR.
[GitHub] [hadoop-ozone] amaliujia commented on pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
amaliujia commented on pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375#issuecomment-685181181 R: @adoroszlai @timmylicheng
[jira] [Updated] (HDDS-4189) Use a unified Cli syntax for both getting OM and SCM status
[ https://issues.apache.org/jira/browse/HDDS-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4189: - Labels: pull-request-available (was: ) > Use a unified Cli syntax for both getting OM and SCM status > --- > > Key: HDDS-4189 > URL: https://issues.apache.org/jira/browse/HDDS-4189 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Labels: pull-request-available > > https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481380452 > suggests a unification for OM and SCM for getting status by Cli. > https://github.com/apache/hadoop-ozone/pull/1346 updated for SCM case. > This JIRA proposes to change > ozone admin om getserviceroles > to > ozone admin om status
[GitHub] [hadoop-ozone] amaliujia opened a new pull request #1375: HDDS-4189. Change `ozone admin om getserviceroles` to `ozone admin om status`
amaliujia opened a new pull request #1375: URL: https://github.com/apache/hadoop-ozone/pull/1375 ## What changes were proposed in this pull request? Change `ozone admin om getserviceroles` to `ozone admin om status` ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-4189 ## How was this patch tested? Unit Test
[GitHub] [hadoop-ozone] cxorm edited a comment on pull request #1233: HDDS-3725. Ozone sh volume client support quota option.
cxorm edited a comment on pull request #1233: URL: https://github.com/apache/hadoop-ozone/pull/1233#issuecomment-685179614 Thank you @captainzmc for updating the PR. Could you rebase it onto the latest master branch? This PR looks good to me, +1. (Would you be so kind as to take a look at this PR, @ChenSammi?)
[GitHub] [hadoop-ozone] cxorm commented on pull request #1233: HDDS-3725. Ozone sh volume client support quota option.
cxorm commented on pull request #1233: URL: https://github.com/apache/hadoop-ozone/pull/1233#issuecomment-685179614 Thank you @captainzmc for updating the PR. Could you rebase it onto the latest master branch? This PR looks good to me, +1. (cc @ChenSammi)
[GitHub] [hadoop-ozone] cxorm commented on a change in pull request #1233: HDDS-3725. Ozone sh volume client support quota option.
cxorm commented on a change in pull request #1233: URL: https://github.com/apache/hadoop-ozone/pull/1233#discussion_r481477852 ## File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/protocol/ClientProtocol.java ## @@ -101,10 +100,11 @@ void createVolume(String volumeName, VolumeArgs args) /** * Set Volume Quota. * @param volumeName Name of the Volume - * @param quota Quota to be set for the Volume + * @param quotaInBytes The maximum size this volume can be used. + * @param quotaInCounts The maximum number of buckets in this volume. * @throws IOException */ - void setVolumeQuota(String volumeName, OzoneQuota quota) + void setVolumeQuota(String volumeName, long quotaInBytes, long quotaInCounts) Review comment: > The previous feature of Ozone on Set quota is virtual, and this interface won't have any actual effect. This feature is incomplete, so no one will use it before. I am fine with it.
[jira] [Created] (HDDS-4189) Use a unified Cli syntax for both getting OM and SCM status
Rui Wang created HDDS-4189: -- Summary: Use a unified Cli syntax for both getting OM and SCM status Key: HDDS-4189 URL: https://issues.apache.org/jira/browse/HDDS-4189 Project: Hadoop Distributed Data Store Issue Type: Task Reporter: Rui Wang Assignee: Rui Wang https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481380452 suggests a unification for OM and SCM for getting status by Cli. https://github.com/apache/hadoop-ozone/pull/1346 updated for SCM case. This JIRA proposes to change ozone admin om getserviceroles to ozone admin om status
[GitHub] [hadoop-ozone] amaliujia commented on pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#issuecomment-685173208 Addressed the following comments:
1. Merged `getRatisStatus` with `GetScmInfo`
2. Adopted the command syntax `ozone admin scm status`
3. Added an acceptance test

@timmylicheng I am not sure how to test an acceptance test. Can you share a way to run it locally?
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481469131 ## File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocol.java ## @@ -230,4 +230,5 @@ Pipeline createReplicationPipeline(HddsProtos.ReplicationType type, */ boolean getReplicationManagerStatus() throws IOException; + List getScmRatisStatus() throws IOException; Review comment: Pushed a new commit to merge this logic into `GetScmInfo`
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481469412 ## File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/shell/TestScmAdminHA.java ## @@ -0,0 +1,66 @@ +package org.apache.hadoop.ozone.shell; Review comment: ack
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481414123 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: Thanks. I will adopt `ozone admin scm status` in this PR and I will send another PR for `ozone admin om status`
[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #1298: HDDS-3869. Use different column families for datanode block and metadata
hanishakoneru commented on a change in pull request #1298: URL: https://github.com/apache/hadoop-ozone/pull/1298#discussion_r476042841 ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerDataYaml.java ## @@ -280,6 +280,9 @@ public Object construct(Node node) { String state = (String) nodes.get(OzoneConsts.STATE); kvData .setState(ContainerProtos.ContainerDataProto.State.valueOf(state)); +String schemaVersion = (String) nodes.get(OzoneConsts.SCHEMA_VERSION); +kvData.setSchemaVersion(schemaVersion); Review comment: When reading an old containerDataYaml which does not contain the schema version field, what value would be returned? IIRC it returns null; in that case we should set it to version V1. ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java ## @@ -159,122 +178,126 @@ public static void parseKVContainerData(KeyValueContainerData kvContainerData, } kvContainerData.setDbFile(dbFile); +if (kvContainerData.getSchemaVersion() == null) { + // If this container has not specified a schema version, it is in the old + // format with one default column family. + kvContainerData.setSchemaVersion(OzoneConsts.SCHEMA_V1); +} + boolean isBlockMetadataSet = false; try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData, config)) { + Table metadataTable = + containerDB.getStore().getMetadataTable(); + // Set pending deleted block count. - byte[] pendingDeleteBlockCount = - containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY); + Long pendingDeleteBlockCount = + metadataTable.get(OzoneConsts.PENDING_DELETE_BLOCK_COUNT); if (pendingDeleteBlockCount != null) { kvContainerData.incrPendingDeletionBlocks( -Longs.fromByteArray(pendingDeleteBlockCount)); +pendingDeleteBlockCount.intValue()); Review comment: Any reason for using intValue here instead of the long value, as incrPendingDeletionBlocks takes a long parameter?
## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/BlockManagerImpl.java ## @@ -262,14 +264,17 @@ public void deleteBlock(Container container, BlockID blockID) throws getBlockByID(db, blockID); // Update DB to delete block and set block count and bytes used. - BatchOperation batch = new BatchOperation(); - batch.delete(blockKey); Review comment: blockKey variable is redundant now and can be removed. ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainer.java ## @@ -487,6 +486,7 @@ public void importContainerData(InputStream input, containerData.setState(originalContainerData.getState()); containerData .setContainerDBType(originalContainerData.getContainerDBType()); + containerData.setSchemaVersion(originalContainerData.getSchemaVersion()); Review comment: I see that schema version is being set in KeyValueContainerUtil#parseKVContainerData. We can explore the option of setting the default schema version (V1) while reading the Yaml itself so that it is never missed. ## File path: hadoop-hdds/container-service/src/test/resources/123-dn-container.db/LOG ## @@ -0,0 +1,284 @@ +2020/08/03-15:13:40.359520 7f80eb7a9700 RocksDB version: 6.8.1 +2020/08/03-15:13:40.359563 7f80eb7a9700 Git sha rocksdb_build_git_sha: +2020/08/03-15:13:40.359566 7f80eb7a9700 Compile date Apr 26 2020 Review comment: LOCK and LOG files are not required to load the DB. ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java ## @@ -159,122 +178,126 @@ public static void parseKVContainerData(KeyValueContainerData kvContainerData, } kvContainerData.setDbFile(dbFile); +if (kvContainerData.getSchemaVersion() == null) { + // If this container has not specified a schema version, it is in the old + // format with one default column family. 
+ kvContainerData.setSchemaVersion(OzoneConsts.SCHEMA_V1); +} + boolean isBlockMetadataSet = false; try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData, config)) { + Table metadataTable = + containerDB.getStore().getMetadataTable(); + // Set pending deleted block count. - byte[] pendingDeleteBlockCount = - containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY); + Long pendingDeleteBlockCount = + metadataTable.get(OzoneConsts.PENDING_DELETE_BLOCK_COUNT); if (pendingDeleteBlockCount != null) {
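The schema-version defaulting discussed in the review comments above can be sketched as follows. This is an illustrative fragment, not the actual Ozone code: the null value models what reading a pre-upgrade container yaml (which lacks the field) would return, and `"1"` stands in for the real `OzoneConsts.SCHEMA_V1` constant.

```java
// An old containerDataYaml has no schemaVersion entry, so the lookup
// yields null; the handler interprets null as the original layout with
// one default column family (schema V1).
String schemaVersion = null;  // models nodes.get(OzoneConsts.SCHEMA_VERSION) on an old file
if (schemaVersion == null) {
    schemaVersion = "1";      // stands in for OzoneConsts.SCHEMA_V1
}
```

Applying the default at yaml-read time, as suggested in the review, means every later consumer sees a non-null version and the check is never missed.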
[GitHub] [hadoop-ozone] prashantpogde commented on a change in pull request #1353: HDDS-4148. Add servlet to return SCM DB checkpoint.
prashantpogde commented on a change in pull request #1353: URL: https://github.com/apache/hadoop-ozone/pull/1353#discussion_r481431139 ## File path: hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/DBCheckpointServlet.java ## @@ -0,0 +1,197 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hdds.utils; + +import javax.servlet.ServletException; +import javax.servlet.http.HttpServlet; +import javax.servlet.http.HttpServletRequest; +import javax.servlet.http.HttpServletResponse; +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; +import java.io.OutputStream; +import java.nio.file.Files; +import java.nio.file.Path; +import java.time.Duration; +import java.time.Instant; +import java.util.stream.Collectors; +import java.util.stream.Stream; + +import org.apache.hadoop.hdds.utils.db.DBCheckpoint; +import org.apache.hadoop.hdds.utils.db.DBStore; + +import org.apache.commons.compress.archivers.ArchiveEntry; +import org.apache.commons.compress.archivers.ArchiveOutputStream; +import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream; +import org.apache.commons.compress.compressors.CompressorException; +import org.apache.commons.compress.compressors.CompressorOutputStream; +import org.apache.commons.compress.compressors.CompressorStreamFactory; +import org.apache.commons.compress.utils.IOUtils; +import org.apache.commons.lang3.StringUtils; +import static org.apache.hadoop.ozone.OzoneConsts.OZONE_DB_CHECKPOINT_REQUEST_FLUSH; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Provides the current checkpoint Snapshot of the OM/SCM DB. (tar.gz) + */ +public class DBCheckpointServlet extends HttpServlet { Review comment: I have refactored OMDBCheckpointServlet in this request itself. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-3938) Flaky TestWatchForCommit#test2WayCommitForTimeoutException
[ https://issues.apache.org/jira/browse/HDDS-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai resolved HDDS-3938. Resolution: Duplicate > Flaky TestWatchForCommit#test2WayCommitForTimeoutException > -- > > Key: HDDS-3938 > URL: https://issues.apache.org/jira/browse/HDDS-3938 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 1.0.0 >Reporter: Siyao Meng >Priority: Major > Labels: 0.7.0 > > In PR#1255 > https://github.com/apache/hadoop-ozone/runs/813994346?check_suite_focus=true: > {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/06/27/1255/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestWatchForCommit-output.txt} > java.util.concurrent.ExecutionException: > org.apache.ratis.protocol.GroupMismatchException: > a498c7dc-27d9-4ae8-a233-895baee1c3ae: group-C4714E1CC0B9 not found. > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:262) > at > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForGroupMismatchException(TestWatchForCommit.java:351) > {code} > In PR#1459 > https://github.com/apache/hadoop-ozone/runs/844177861?check_suite_focus=true: > {code:title=https://github.com/elek/ozone-build-results/blob/master/2020/07/07/1459/it-client/hadoop-ozone/integration-test/org.apache.hadoop.ozone.client.rpc.TestWatchForCommit-output.txt} > java.util.concurrent.ExecutionException: > org.apache.ratis.protocol.GroupMismatchException: > a7b8b74b-f98f-42e2-9f4c-7068bd51e221: group-DCED9E4CDB5B not found. 
> at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:262) > at > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForGroupMismatchException(TestWatchForCommit.java:348) > {code} > And there are two more instances that can be found in > https://elek.github.io/ozone-build-results/. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-4012) FLAKY-UT: TestWatchForCommit#test2WayCommitForTimeoutException
[ https://issues.apache.org/jira/browse/HDDS-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai resolved HDDS-4012. Resolution: Duplicate > FLAKY-UT: TestWatchForCommit#test2WayCommitForTimeoutException > -- > > Key: HDDS-4012 > URL: https://issues.apache.org/jira/browse/HDDS-4012 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 1.0.0 >Reporter: maobaolong >Priority: Major > > [INFO] Running org.apache.hadoop.ozone.client.rpc.TestWatchForCommit > [ERROR] Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 191.617 s <<< FAILURE! - in > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit > [ERROR] > test2WayCommitForTimeoutException(org.apache.hadoop.ozone.client.rpc.TestWatchForCommit) > Time elapsed: 38.847 s <<< ERROR! > org.apache.ratis.protocol.GroupMismatchException: > bc6ce7e8-8a72-4287-9d17-f76681f43526: group-91575AE6096A not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:127) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:274) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:283) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.getRaftServerImpl(ContainerTestHelper.java:593) > at > org.apache.hadoop.ozone.container.ContainerTestHelper.isRatisFollower(ContainerTestHelper.java:608) > at > org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:302) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: 
ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4012) FLAKY-UT: TestWatchForCommit#test2WayCommitForTimeoutException
[ https://issues.apache.org/jira/browse/HDDS-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-4012: --- Labels: (was: pull-request-available) > FLAKY-UT: TestWatchForCommit#test2WayCommitForTimeoutException > -- > > Key: HDDS-4012 > URL: https://issues.apache.org/jira/browse/HDDS-4012 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Affects Versions: 1.0.0 >Reporter: maobaolong >Priority: Major -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1370: HDDS-4176. Fix failed UT: test2WayCommitForTimeoutException
adoroszlai commented on a change in pull request #1370: URL: https://github.com/apache/hadoop-ozone/pull/1370#discussion_r481426506 ## File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestWatchForCommit.java ## @@ -297,9 +298,11 @@ public void test2WayCommitForTimeoutException() throws Exception { xceiverClient.getPipeline())); reply.getResponse().get(); Assert.assertEquals(3, ratisClient.getCommitInfoMap().size()); +List datanodeDetails = pipeline.getNodes(); Review comment: Nit: `nodesInPipeline` may be a better name. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1353: HDDS-4148. Add servlet to return SCM DB checkpoint.
xiaoyuyao commented on a change in pull request #1353: URL: https://github.com/apache/hadoop-ozone/pull/1353#discussion_r481416948 ## File path: hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/DBCheckpointServlet.java ## @@ -0,0 +1,197 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hdds.utils; + +import javax.servlet.ServletException; +import javax.servlet.http.HttpServlet; +import javax.servlet.http.HttpServletRequest; +import javax.servlet.http.HttpServletResponse; +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; +import java.io.OutputStream; +import java.nio.file.Files; +import java.nio.file.Path; +import java.time.Duration; +import java.time.Instant; +import java.util.stream.Collectors; +import java.util.stream.Stream; + +import org.apache.hadoop.hdds.utils.db.DBCheckpoint; +import org.apache.hadoop.hdds.utils.db.DBStore; + +import org.apache.commons.compress.archivers.ArchiveEntry; +import org.apache.commons.compress.archivers.ArchiveOutputStream; +import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream; +import org.apache.commons.compress.compressors.CompressorException; +import org.apache.commons.compress.compressors.CompressorOutputStream; +import org.apache.commons.compress.compressors.CompressorStreamFactory; +import org.apache.commons.compress.utils.IOUtils; +import org.apache.commons.lang3.StringUtils; +import static org.apache.hadoop.ozone.OzoneConsts.OZONE_DB_CHECKPOINT_REQUEST_FLUSH; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Provides the current checkpoint Snapshot of the OM/SCM DB. (tar.gz) + */ +public class DBCheckpointServlet extends HttpServlet { Review comment: There is a lot of overlap between OMDBCheckpointServlet and the new DBCheckpointServlet. Do we have a follow-up JIRA to refactor OMDBCheckpointServlet to use DBCheckpointServlet? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
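The servlet pattern discussed above — taking a DB checkpoint and streaming it back to the HTTP client as a compressed archive — can be sketched roughly as follows. This is a minimal illustration, not the real `DBCheckpointServlet`: it uses only the JDK's `java.util.zip` to stay self-contained (the actual patch uses Apache commons-compress to produce tar.gz), and the `CheckpointStreamer` name and `writeArchive` signature are made up for the sketch.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Illustrative only: stream every regular file under checkpointDir into one
// archive written straight to `out` (in the servlet, `out` would be the
// HttpServletResponse output stream).
final class CheckpointStreamer {
  private CheckpointStreamer() { }

  static void writeArchive(Path checkpointDir, OutputStream out) throws IOException {
    List<Path> files;
    try (Stream<Path> walk = Files.walk(checkpointDir)) {
      files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
    }
    try (ZipOutputStream zip = new ZipOutputStream(out)) {
      for (Path file : files) {
        // Entry names are relative to the checkpoint root.
        zip.putNextEntry(new ZipEntry(checkpointDir.relativize(file).toString()));
        Files.copy(file, zip);
        zip.closeEntry();
      }
    }
  }
}
```

The key property — also of the real servlet — is that the archive is written directly to the response stream, so the checkpoint is never materialized as a second on-disk copy.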
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481414123 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: Thanks. I will adopt `ozone admin scm status` in this PR and send another PR for `ozone admin om status`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
adoroszlai commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481380452 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: `status` or `roles` would probably be enough to indicate the goal of the subcommand, something like ``` ozone admin (om|scm) status ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481378454 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java ## @@ -560,6 +563,13 @@ public boolean getReplicationManagerStatus() { return scm.getReplicationManager().isRunning(); } + @Override + public List getScmRatisStatus() throws IOException { +return scm.getScmHAManager() +.getRatisServer().getRaftPeers() +.stream().map(peer -> peer.getAddress()).collect(Collectors.toList()); Review comment: Got it. Will move this functionality to ScmHAManager. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
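The snippet under review maps each Raft peer to its address string. Stripped of the SCM plumbing, the shape of that transformation is just the following sketch — `Peer` here is a stand-in for the real Ratis `RaftPeer` class, not its actual API:

```java
import java.util.List;
import java.util.stream.Collectors;

// Stand-in for org.apache.ratis.protocol.RaftPeer -- illustrative only.
final class Peer {
  private final String address;
  Peer(String address) { this.address = address; }
  String getAddress() { return address; }
}

// Shape of getScmRatisStatus(): collect each Raft peer's address into a list.
final class RatisStatus {
  private RatisStatus() { }

  static List<String> peerAddresses(List<Peer> peers) {
    return peers.stream()
        .map(Peer::getAddress)  // method reference in place of `peer -> peer.getAddress()`
        .collect(Collectors.toList());
  }
}
```

Moving this into `ScmHAManager`, as suggested in the thread, keeps the protocol server free of Ratis-specific types.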
[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481371486 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: @adoroszlai What kind of improvement to OM HA `getserviceroles` are you thinking of? I am happy to make a separate PR for that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] adoroszlai commented on pull request #1366: HDDS-4167. Acceptance test logs missing if SCM fails to exit safe mode
adoroszlai commented on pull request #1366: URL: https://github.com/apache/hadoop-ozone/pull/1366#issuecomment-685067401 Thanks @elek for reviewing and committing it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
adoroszlai commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481359383 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: I think both this one and OM HA `getserviceroles` can be improved. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-4179) Implement post-finalize SCM logic to allow nodes of only new version to participate in pipelines.
[ https://issues.apache.org/jira/browse/HDDS-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-4179: - Assignee: Prashant Pogde > Implement post-finalize SCM logic to allow nodes of only new version to > participate in pipelines. > - > > Key: HDDS-4179 > URL: https://issues.apache.org/jira/browse/HDDS-4179 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode, SCM >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > Fix For: 1.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-4178) SCM Finalize command implementation.
[ https://issues.apache.org/jira/browse/HDDS-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-4178: - Assignee: Istvan Fajth > SCM Finalize command implementation. > > > Key: HDDS-4178 > URL: https://issues.apache.org/jira/browse/HDDS-4178 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Aravindan Vijayan >Assignee: Istvan Fajth >Priority: Major > Fix For: 1.1.0 > > > * RPC endpoint implementation > * Ratis request to persist MLV, Trigger DN Finalize, Pipeline close. (WHEN > MLV changes) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-4174) Add current HDDS layout version to Datanode heartbeat and registration.
[ https://issues.apache.org/jira/browse/HDDS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-4174: - Assignee: Prashant Pogde > Add current HDDS layout version to Datanode heartbeat and registration. > --- > > Key: HDDS-4174 > URL: https://issues.apache.org/jira/browse/HDDS-4174 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > Fix For: 1.1.0 > > > Add the layout version as a field to proto. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-4175) Implement Datanode Finalization
[ https://issues.apache.org/jira/browse/HDDS-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-4175: - Assignee: Istvan Fajth > Implement Datanode Finalization > --- > > Key: HDDS-4175 > URL: https://issues.apache.org/jira/browse/HDDS-4175 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Aravindan Vijayan >Assignee: Istvan Fajth >Priority: Major > Fix For: 1.1.0 > > > * Create FinalizeCommand in SCM and Datanode protocol. > * Create FinalizeCommand Handler in Datanode. > * Datanode Finalization should FAIL if there are open containers on it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-4173) Implement HDDS Version management using the LayoutVersionManager interface.
[ https://issues.apache.org/jira/browse/HDDS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-4173: - Assignee: Prashant Pogde > Implement HDDS Version management using the LayoutVersionManager interface. > --- > > Key: HDDS-4173 > URL: https://issues.apache.org/jira/browse/HDDS-4173 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode, SCM >Affects Versions: 1.1.0 >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > Fix For: 1.1.0 > > > * Create HDDS Layout Feature Catalog similar to the OM Layout Feature Catalog. > * Any layout change to SCM and Datanode needs to be recorded here as a Layout > Feature. > * This includes new SCM HA requests, new container layouts in DN etc. > * Create a HDDSLayoutVersionManager similar to OMLayoutVersionManager. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4181) Add acceptance tests for upgrade, finalization and downgrade
[ https://issues.apache.org/jira/browse/HDDS-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188636#comment-17188636 ] Aravindan Vijayan commented on HDDS-4181: - Thanks [~elek]. I have seen that changes, and planning to build on top of that. > Add acceptance tests for upgrade, finalization and downgrade > > > Key: HDDS-4181 > URL: https://issues.apache.org/jira/browse/HDDS-4181 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Priority: Major > Fix For: 1.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-4181) Add acceptance tests for upgrade, finalization and downgrade
[ https://issues.apache.org/jira/browse/HDDS-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188636#comment-17188636 ] Aravindan Vijayan edited comment on HDDS-4181 at 9/1/20, 4:49 PM: -- Thanks [~elek]. I have seen those changes, and planning to build on top of that. was (Author: avijayan): Thanks [~elek]. I have seen that changes, and planning to build on top of that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op
[ https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188586#comment-17188586 ] Arpit Agarwal commented on HDDS-4097: - If you always create them, then you are basically interpreting key names as filesystem paths, so they have to be normalized and interpreted as paths. There is no middle ground. > S3/Ozone Filesystem inter-op > > > Key: HDDS-4097 > URL: https://issues.apache.org/jira/browse/HDDS-4097 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Attachments: Ozone FileSystem Paths Enabled.docx, Ozone filesystem > path enabled.xlsx > > > This Jira is to implement changes required to use Ozone buckets when data is > ingested via S3 and use the bucket/volume via OzoneFileSystem. Initial > implementation for this is done as part of HDDS-3955. A few APIs > missed these changes during the implementation of HDDS-3955. > The attached design document discusses each API and what changes are > required. > The Excel sheet lists, for each API, the interfaces through which the OM > API is used and what changes are required to support > inter-operability. > Note: The proposal for delete/rename is still under discussion, not yet > finalized. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.
bharatviswa504 edited a comment on pull request #1328: URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217 @elek There is no issue with this, and I believe we have agreed on the part that we need this: 100% HCFS with a few changes to AWS S3 semantics. I don't see what the problem is in moving forward. The other argument concerns the case when the flag is disabled, to support compromised HCFS and 100% AWS. Let me know if you have any concerns about this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.
bharatviswa504 edited a comment on pull request #1328: URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217 @elek There is no issue with this, and I believe we have agreed on the part we need this like 100% HCFS with few changes to AWS S3 semantics. I don't see what is the problem in moving forward?. As the other argument is when the flag is disabled to support HCFS. Let me know if you have any concerns with this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.
bharatviswa504 edited a comment on pull request #1328: URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217 @elek There is no issue with this. As the other argument is when the flag is disabled to support HCFS. And I believe we have agreed on the part we need this like 100% HCFS with few changes to AWS S3 semantics. I don't see what is the problem in moving forward? Let me know if you have any concerns. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.
bharatviswa504 commented on pull request #1328: URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684950217 @elek There is no issue with this. As the other argument is when the flag is disabled to support HCFS. And I believe we have agreed on the part we need this like 100% HCFS with few changes to AWS S3 semantics. Let me know if you have any concerns. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op
[ https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188544#comment-17188544 ] Marton Elek commented on HDDS-4097: --- > Unfortunately there is no way you can guarantee that. A filesystem client > will need all the intermediate directories to exist for navigating the tree. Is there any problem with always creating the intermediate directories? I see some possible, minor performance problems, but as RocksDB is already the fastest part, this shouldn't be a blocker. Especially as we can support both S3 and HCFS with this approach. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] arp7 commented on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.
arp7 commented on pull request #1328: URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684929618 > What is your opinion about AWS compatibility issues? What exactly is the compatibility issue? If the paths are not interpreted (default behavior), then there is full AWS compatibility. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-4097) S3/Ozone Filesystem inter-op
[ https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188538#comment-17188538 ] Arpit Agarwal edited comment on HDDS-4097 at 9/1/20, 3:12 PM: -- bq. Using simple, acceptable key names (/a/b/c, /a/b/c/d) both s3 and HCFS should work out-of-the box, without any additional settings. Unfortunately there is no way you can guarantee that. A filesystem client will need all the intermediate directories to exist for navigating the tree. was (Author: arpitagarwal): bq. Using simple, acceptable key names (/a/b/c, /a/b/c/d) both s3 and HCFS should work out-of-the box, without any additional settings. Unfortunately there is no way you cannot guarantee that. A filesystem client will need all the intermediate directories to exist for navigating the tree. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op
[ https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188538#comment-17188538 ] Arpit Agarwal commented on HDDS-4097: - bq. Using simple, acceptable key names (/a/b/c, /a/b/c/d) both s3 and HCFS should work out-of-the-box, without any additional settings. Unfortunately there is no way you can guarantee that. A filesystem client will need all the intermediate directories to exist for navigating the tree.
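The point above is that a Hadoop-compatible filesystem client can only navigate the tree if every intermediate directory entry exists. A minimal sketch of what that implies (a hypothetical helper, not Ozone code): a flat S3 key such as `a/b/c/d` requires three implicit directory entries before an HCFS client can walk to it.

```java
import java.util.ArrayList;
import java.util.List;

public class ParentDirs {

    // List the implicit parent directory entries a filesystem client would
    // need in order to navigate to a flat key like "a/b/c/d".
    static List<String> parentDirs(String key) {
        List<String> dirs = new ArrayList<>();
        int idx = key.indexOf('/');
        while (idx >= 0) {
            dirs.add(key.substring(0, idx) + "/"); // "a/", then "a/b/", ...
            idx = key.indexOf('/', idx + 1);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // A key ingested via S3 as "a/b/c/d" implies three directory entries.
        System.out.println(parentDirs("a/b/c/d")); // [a/, a/b/, a/b/c/]
    }
}
```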
[GitHub] [hadoop-ozone] elek commented on pull request #1328: HDDS-4102. Normalize Keypath for lookupKey.
elek commented on pull request #1328: URL: https://github.com/apache/hadoop-ozone/pull/1328#issuecomment-684923522 > @elek Any more comments? Looks like we don't have consensus yet in HDDS-4097. It was discussed yesterday during the community meeting, and I felt that everybody agrees more or less, but I am not sure if @arp7 fully agrees. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4097) S3/Ozone Filesystem inter-op
[ https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188536#comment-17188536 ] Marton Elek commented on HDDS-4097: --- [~arp] It was discussed in more detail during the community sync (the recording is shared on the ozone-dev mailing list). In short, my proposal is the following:
1. Using simple, acceptable key names (/a/b/c, /a/b/c/d) *both s3 and HCFS should work out-of-the-box, without any additional settings*. (Based on my understanding this is not true today, as we need to turn on `ozone.om.enable.filesystem.paths` to get intermediate directories.)
2. There are some conflicts between the AWS S3 and HCFS interfaces. We need a new option to express how to resolve the conflicts. Let's say we have an ozone.key.compatibility setting.
a) ozone.key.compatibility=aws means that we enable (almost) everything which is enabled by AWS S3, but we couldn't show all the keys in the Hadoop interface. For example, if a directory and a key are created with the same prefix (possible with AWS S3), HCFS will show only the directory, not the key.
b) ozone.key.compatibility=hadoop is the opposite: we can validate the path, and throw an exception on the s3 interface if a dir/key are created with the same name.
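The two proposed modes can be sketched as follows. This is purely illustrative: `ozone.key.compatibility`, the `Mode` enum, and the path-to-isDirectory map are assumptions standing in for whatever the real implementation would use, not actual Ozone configuration or classes.

```java
import java.util.Map;
import java.util.TreeMap;

public class KeyCompatibility {

    enum Mode { AWS, HADOOP } // hypothetical values of ozone.key.compatibility

    // entries: existing namespace, path -> isDirectory (dirs end with "/").
    // Decide whether an S3 key create may proceed when a directory with the
    // same name already exists.
    static boolean tryCreateKey(Map<String, Boolean> entries, String key, Mode mode) {
        boolean dirWithSameName = Boolean.TRUE.equals(entries.get(key + "/"));
        if (dirWithSameName && mode == Mode.HADOOP) {
            // hadoop mode: reject on the S3 interface up front,
            // instead of hiding the key from HCFS listings later
            return false;
        }
        // aws mode: accept the key; an HCFS listing would then show only
        // the directory, not this key
        entries.put(key, false);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Boolean> entries = new TreeMap<>();
        entries.put("a/b/", true); // directory a/b already exists
        System.out.println(tryCreateKey(entries, "a/b", Mode.AWS));    // true
        System.out.println(tryCreateKey(entries, "a/b", Mode.HADOOP)); // false
    }
}
```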
[GitHub] [hadoop-ozone] elek commented on pull request #1363: HDDS-3805. [OFS] Remove usage of OzoneClientAdapter interface
elek commented on pull request #1363: URL: https://github.com/apache/hadoop-ozone/pull/1363#issuecomment-684914342 Thanks for the separation @smengcl, I think it's easier to discuss. I am not sure whether it's the same patch I already commented on. I had a comment in the previous PR, where the discussion stopped: >> I'm in favor of A. I'll attempt to remove the usage of OzoneClientAdapter in OFS altogether then. > I am fine with that approach, but let me add some comments to the latest patch. > The naming of BasicRootedOzoneFileSystem and BasicRootedOzoneFileSystemImpl is misleading. Usually the Impl postfix is used when the class implements a well-known interface. There is no such interface here. (It's more like the delegation design pattern, not an implementation.) > As a test: can you please explain the differences between the two classes and their responsibilities? > If not, we don't need two classes. Just remove the Impl, remove the dedicated methods, and directly call the proxy from the original methods of BasicRootedOzoneFileSystem. > Wouldn't it be simpler? This patch seems to use `BasicRootedOzoneClientAdapterImpl`, which is an `Impl` without an interface. Do you need a client adapter here (in the old code we needed an interface and an implementation for classpath separation, but here this separation is removed)? If yes, do you need `Impl` in the name (it doesn't implement anything)?
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
timmylicheng commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481146665 ## File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/shell/TestScmAdminHA.java ## @@ -0,0 +1,66 @@ +package org.apache.hadoop.ozone.shell; Review comment: Apart from UT, we can have an acceptance test to test CLI. You may find examples here: https://github.com/apache/hadoop-ozone/pull/375 ## File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocol.java ## @@ -230,4 +230,5 @@ Pipeline createReplicationPipeline(HddsProtos.ReplicationType type, */ boolean getReplicationManagerStatus() throws IOException; + List getScmRatisStatus() throws IOException; Review comment: Any way to merge it with GetSCMInfo? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4167) Acceptance test logs missing if SCM fails to exit safe mode
[ https://issues.apache.org/jira/browse/HDDS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-4167: -- Fix Version/s: 1.1.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Acceptance test logs missing if SCM fails to exit safe mode > --- > > Key: HDDS-4167 > URL: https://issues.apache.org/jira/browse/HDDS-4167 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 1.1.0 > > > Acceptance test sometimes fails due to SCM not coming out of safe mode. If > this happens, the cluster is stopped without running Robot tests. {{rebot}} > command to process test results fails due to missing input, and acceptance > check is abruptly stopped without fetching docker logs or running tests in > other environments. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] elek merged pull request #1366: HDDS-4167. Acceptance test logs missing if SCM fails to exit safe mode
elek merged pull request #1366: URL: https://github.com/apache/hadoop-ozone/pull/1366
[GitHub] [hadoop-ozone] elek commented on pull request #1366: HDDS-4167. Acceptance test logs missing if SCM fails to exit safe mode
elek commented on pull request #1366: URL: https://github.com/apache/hadoop-ozone/pull/1366#issuecomment-684899079 > Reduce some code duplication between test-all.sh and ozone-mr/test.sh by extracting functions for the shared code being fixed :heart: This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] elek commented on pull request #1374: HDDS-4185. Remove IncrementalByteBuffer from Ozone client
elek commented on pull request #1374: URL: https://github.com/apache/hadoop-ozone/pull/1374#issuecomment-684895095 The key line is this: ``` int effectiveBufferSize = Math.min(bufferSize, maxSize - bufferList.size() * bufferSize); ``` If the block/key is smaller than the `BufferPool` (16 * 4MB), we allocate only the required bytes (the last buffer in the pool is reduced to this `effectiveBufferSize`). If the block size is bigger than the `BufferPool`, we need to allocate the full `BufferPool` anyway. cc @bshashikant
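A worked example of the allocation line quoted above may help. The sizes below (4MB increments, a 6MB key) are illustrative, chosen to be smaller than the 16 * 4MB pool mentioned in the comment; the method is a standalone sketch of that expression, not the actual client code.

```java
public class EffectiveBufferSize {

    // The allocation-size expression from the PR comment:
    // buffersAllocated plays the role of bufferList.size().
    static int effectiveBufferSize(int bufferSize, int maxSize, int buffersAllocated) {
        return Math.min(bufferSize, maxSize - buffersAllocated * bufferSize);
    }

    public static void main(String[] args) {
        int bufferSize = 4 * 1024 * 1024; // 4MB per buffer
        int maxSize = 6 * 1024 * 1024;    // a 6MB key, smaller than the pool

        // First buffer gets the full 4MB; the second is capped at the
        // remaining 2MB, so only 6MB total is allocated for a 6MB key.
        System.out.println(effectiveBufferSize(bufferSize, maxSize, 0) / (1024 * 1024)); // 4
        System.out.println(effectiveBufferSize(bufferSize, maxSize, 1) / (1024 * 1024)); // 2
    }
}
```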
[GitHub] [hadoop-ozone] elek commented on a change in pull request #1336: HDDS-4119. Improve performance of the BufferPool management of Ozone client
elek commented on a change in pull request #1336: URL: https://github.com/apache/hadoop-ozone/pull/1336#discussion_r481179724 ## File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBuffer.java ## @@ -44,9 +45,6 @@ static ChunkBuffer allocate(int capacity) { * When increment <= 0, entire buffer is allocated in the beginning. */ static ChunkBuffer allocate(int capacity, int increment) { -if (increment > 0 && increment < capacity) { - return new IncrementalChunkBuffer(capacity, increment, false); Review comment: See #1374 about the 2nd. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4185) Remove IncrementalByteBuffer from Ozone client
[ https://issues.apache.org/jira/browse/HDDS-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4185: - Labels: pull-request-available (was: ) > Remove IncrementalByteBuffer from Ozone client > -- > > Key: HDDS-4185 > URL: https://issues.apache.org/jira/browse/HDDS-4185 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > During the teragen test it was identified that the IncrementalByteBuffer is > one of the biggest bottlenecks. > In the PR of HDDS-4119 a long conversation started about whether it can be > removed or another solution is needed to optimize it. > This Jira is opened to continue the discussion and either remove or optimize > the IncrementalByteBuffer.
[GitHub] [hadoop-ozone] elek opened a new pull request #1374: HDDS-4185. Remove IncrementalByteBuffer from Ozone client
elek opened a new pull request #1374: URL: https://github.com/apache/hadoop-ozone/pull/1374 ## What changes were proposed in this pull request? During the teragen test it was identified that the IncrementalByteBuffer is one of the biggest bottlenecks. In the PR of HDDS-4119 (#1336) a long conversation started about whether it can be removed or another solution is needed to optimize it. This Jira is opened to continue the discussion and either remove or optimize the IncrementalByteBuffer. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-4185 ## How was this patch tested? Checked with a basic smoke test locally, and with a full green CI run on my branch. Also tested by executing the related existing unit tests.
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
timmylicheng commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481142517 ## File path: hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisStatusSubcommand.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.ozone.admin.scm; + +import java.util.List; +import java.util.concurrent.Callable; +import org.apache.hadoop.hdds.cli.HddsVersionProvider; +import org.apache.hadoop.hdds.scm.client.ScmClient; +import picocli.CommandLine; + +@CommandLine.Command( +name = "listratisstatus", Review comment: Name seems a bit verbose. Let me find a syntax to align with OM HA This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4187) Fix memory leak of recon
[ https://issues.apache.org/jira/browse/HDDS-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4187: - Description: 40 datanodes with 400,000 containers; Recon started with xmx:10G. After several hours, Recon's memory increased to 12G and it hit OOM. The memory leak happens on the heap, and the reason is that Recon is slow to process ContainerReport, so the handler thread's queue grows until OOM. !screenshot-1.png! !screenshot-2.png! was: 40 datanodes with 400, 000 containers, start recon with xmx:10G. After several hours, recon's memory increase to 12G and OOM. Memory leak happens on heap, and the reason is recon is slow to process ContainerReplicaReport, so the queue of thread OOM. !screenshot-1.png! !screenshot-2.png! > Fix memory leak of recon > > > Key: HDDS-4187 > URL: https://issues.apache.org/jira/browse/HDDS-4187 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: screenshot-1.png, screenshot-2.png > > > 40 datanodes with 400,000 containers; Recon started with xmx:10G. After > several hours, Recon's memory increased to 12G and it hit OOM. The memory leak > happens on the heap, and the reason is that Recon is slow to process > ContainerReport, so the handler thread's queue grows until OOM. > !screenshot-1.png! > !screenshot-2.png!
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
timmylicheng commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481136753 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java ## @@ -560,6 +563,13 @@ public boolean getReplicationManagerStatus() { return scm.getReplicationManager().isRunning(); } + @Override + public List getScmRatisStatus() throws IOException { +return scm.getScmHAManager() +.getRatisServer().getRaftPeers() +.stream().map(peer -> peer.getAddress()).collect(Collectors.toList()); Review comment: I can see this method being wrapped in ScmHAManager to show a list of SCM hosts.
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
timmylicheng commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481135941 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java ## @@ -560,6 +563,13 @@ public boolean getReplicationManagerStatus() { return scm.getReplicationManager().isRunning(); } + @Override + public List getScmRatisStatus() throws IOException { +return scm.getScmHAManager() +.getRatisServer().getRaftPeers() +.stream().map(peer -> peer.getAddress()).collect(Collectors.toList()); Review comment: Could potentially throw NPE This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
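The NPE concern raised above could be addressed by guarding the stream chain. A minimal sketch, assuming the peer list may be null before the Ratis server is fully initialized; the `Peer` class is a stand-in for the real Ratis peer type, not the actual SCM code.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class RatisStatus {

    // Stand-in for a Ratis peer with an address; illustrative only.
    static class Peer {
        private final String address;
        Peer(String address) { this.address = address; }
        String getAddress() { return address; }
    }

    // Null-safe variant of the reviewed stream chain: return an empty list
    // when no peers are available yet, instead of risking an NPE.
    static List<String> getScmRatisStatus(List<Peer> raftPeers) {
        if (raftPeers == null) {
            return Collections.emptyList();
        }
        return raftPeers.stream()
            .map(Peer::getAddress)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(getScmRatisStatus(null)); // [] rather than NPE
        System.out.println(getScmRatisStatus(List.of(new Peer("scm1.example:9894"))));
    }
}
```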
[jira] [Updated] (HDDS-4187) Fix memory leak of recon
[ https://issues.apache.org/jira/browse/HDDS-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4187: - Description: 40 datanodes with 400, 000 containers, start recon with xmx:10G. After several hours, recon's memory increase to 12G and OOM. Memory leak happens on heap, and the reason is recon is slow to process ContainerReplicaReport, so the queue of thread OOM. !screenshot-1.png! !screenshot-2.png! was: 40 datanodes with 400, 000 containers, start recon with xmx:10G. After several hours, recon's memory increase to 12G and OOM. Memory leak happens on heap, and the reason is recon is slow to process ContainerReplicaReport, so the queue of thread OOM. > Fix memory leak of recon > > > Key: HDDS-4187 > URL: https://issues.apache.org/jira/browse/HDDS-4187 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: screenshot-1.png, screenshot-2.png > > > 40 datanodes with 400, 000 containers, start recon with xmx:10G. After > several hours, recon's memory increase to 12G and OOM. Memory leak happens on > heap, and the reason is recon is slow to process ContainerReplicaReport, so > the queue of thread OOM. > !screenshot-1.png! > !screenshot-2.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4187) Fix memory leak of recon
[ https://issues.apache.org/jira/browse/HDDS-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4187: - Attachment: screenshot-2.png > Fix memory leak of recon > > > Key: HDDS-4187 > URL: https://issues.apache.org/jira/browse/HDDS-4187 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: screenshot-1.png, screenshot-2.png > > > 40 datanodes with 400, 000 containers, start recon with xmx:10G. After > several hours, recon's memory increase to 12G and OOM. Memory leak happens on > heap, and the reason is recon is slow to process ContainerReplicaReport, so > the queue of thread OOM. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4187) Fix memory leak of recon
[ https://issues.apache.org/jira/browse/HDDS-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4187: - Attachment: screenshot-1.png > Fix memory leak of recon > > > Key: HDDS-4187 > URL: https://issues.apache.org/jira/browse/HDDS-4187 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: screenshot-1.png > > > 40 datanodes with 400, 000 containers, start recon with xmx:10G. After > several hours, recon's memory increase to 12G and OOM. Memory leak happens on > heap, and the reason is recon is slow to process ContainerReplicaReport, so > the queue of thread OOM. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4188) Fix Chinese document broken link.
[ https://issues.apache.org/jira/browse/HDDS-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Huang-Mu updated HDDS-4188: - Description: In the Chinese document *Home/概念/概览* (Concepts/Overview) there is a broken link; see the attached screenshot. !brokenLink.png! was: In Chinese document *Home/概念/概览*, There is a broken link. Look attachment screenshot. > Fix Chinese document broken link. > - > > Key: HDDS-4188 > URL: https://issues.apache.org/jira/browse/HDDS-4188 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Zheng Huang-Mu >Priority: Minor > Attachments: brokenLink.png > > > In the Chinese document *Home/概念/概览* (Concepts/Overview) there is a broken > link; see the attached screenshot. > !brokenLink.png!
[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1346: [WIP] HDDS-4115. CLI command to show current SCM leader and follower status.
timmylicheng commented on a change in pull request #1346: URL: https://github.com/apache/hadoop-ozone/pull/1346#discussion_r481134741 ## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocolServerSideTranslatorPB.java ## @@ -271,6 +273,14 @@ public ScmContainerLocationResponse processRequest( .setGetSafeModeRuleStatusesResponse(getSafeModeRuleStatues( request.getGetSafeModeRuleStatusesRequest())) .build(); + case ScmHAStatus: Review comment: GetSCMRatisRole will be better This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4188) Fix Chinese document broken link.
[ https://issues.apache.org/jira/browse/HDDS-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Huang-Mu updated HDDS-4188: - Attachment: (was: brokenLink.png) > Fix Chinese document broken link. > - > > Key: HDDS-4188 > URL: https://issues.apache.org/jira/browse/HDDS-4188 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Zheng Huang-Mu >Priority: Minor > Attachments: brokenLink.png > > > In Chinese document *Home/概念/概览*, > There is a broken link. > Look attachment screenshot. > !brokenLink.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4188) Fix Chinese document broken link.
[ https://issues.apache.org/jira/browse/HDDS-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Huang-Mu updated HDDS-4188: - Description: In Chinese document *Home/概念/概览*, There is a broken link. Look attachment screenshot. was: In Chinese document *Home/概念/概览*, There is a broken link. Look attachment screenshot. !brokenLink.png! > Fix Chinese document broken link. > - > > Key: HDDS-4188 > URL: https://issues.apache.org/jira/browse/HDDS-4188 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Zheng Huang-Mu >Priority: Minor > Attachments: brokenLink.png > > > In Chinese document *Home/概念/概览*, > There is a broken link. > Look attachment screenshot. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4188) Fix Chinese document broken link.
[ https://issues.apache.org/jira/browse/HDDS-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Huang-Mu updated HDDS-4188: - Attachment: brokenLink.png > Fix Chinese document broken link. > - > > Key: HDDS-4188 > URL: https://issues.apache.org/jira/browse/HDDS-4188 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Zheng Huang-Mu >Priority: Minor > Attachments: brokenLink.png > > > In Chinese document *Home/概念/概览*, > There is a broken link. > Look attachment screenshot. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4188) Fix Chinese document broken link.
[ https://issues.apache.org/jira/browse/HDDS-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Huang-Mu updated HDDS-4188: - Description: In Chinese document *Home/概念/概览*, There is a broken link. Look attachment screenshot. !brokenLink.png! was: In Chinese document *Home/概念/概览*, There is a broken link. Look attachment screenshot. > Fix Chinese document broken link. > - > > Key: HDDS-4188 > URL: https://issues.apache.org/jira/browse/HDDS-4188 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Zheng Huang-Mu >Priority: Minor > Attachments: brokenLink.png > > > In Chinese document *Home/概念/概览*, > There is a broken link. > Look attachment screenshot. > !brokenLink.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-4188) Fix Chinese document broken link.
Zheng Huang-Mu created HDDS-4188: Summary: Fix Chinese document broken link. Key: HDDS-4188 URL: https://issues.apache.org/jira/browse/HDDS-4188 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Zheng Huang-Mu Attachments: brokenLink.png In Chinese document *Home/概念/概览*, There is a broken link. Look attachment screenshot. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-4187) Fix memory leak of recon
[ https://issues.apache.org/jira/browse/HDDS-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4187: - Description: 40 datanodes with 400, 000 containers, start recon with xmx:10G. After several hours, recon's memory increase to 12G and OOM. Memory leak happens on heap, and the reason is recon is slow to process ContainerReplicaReport, so the queue of thread OOM. > Fix memory leak of recon > > > Key: HDDS-4187 > URL: https://issues.apache.org/jira/browse/HDDS-4187 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > > 40 datanodes with 400, 000 containers, start recon with xmx:10G. After > several hours, recon's memory increase to 12G and OOM. Memory leak happens on > heap, and the reason is recon is slow to process ContainerReplicaReport, so > the queue of thread OOM. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org