[ https://issues.apache.org/jira/browse/IGNITE-14794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Pereslegin updated IGNITE-14794:
--------------------------------------
    Description:
Add JMX command to restore a cache group from the snapshot.

Suggested methods
{code:java}
@MXBeanDescription("Restore cluster-wide snapshot.")
public void restoreSnapshot(
    @MXBeanParameter(name = "snpName", description = "Snapshot name.") String name,
    @MXBeanParameter(name = "cacheGroupNames", description = "Optional comma-separated list of cache group names.") String cacheGroupNames);

@MXBeanDescription("Cancel previously started snapshot restore operation.")
public void cancelSnapshotRestore(@MXBeanParameter(name = "snpName", description = "Snapshot name.") String name);
{code}
Since the automatic snapshot restore operation can take a long time, we must be able to track its progress using metrics.

Suggested metrics:
{noformat}
start time
partitions (processed/total)
bytes (processed/total)
end time
{noformat}

Suggested status command output.

[in progress]
{noformat}
Restore operation for snapshot "snapshot_25052021" is still in progress (requestId=0e2d8c06-d44a-4ade-91bf-2b84b367499a).
Progress: 100% completed (66/66 partitions, 3.8/3.8 MB)
Started: 2021-10-05 15:47:47.942
Cache groups: default
Node 11faec83-a304-48f7-aac7-e67bf8800001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node 99066100-890f-41a3-b0cd-4a3d59600000: 100% completed (33/33 partitions, 1.9/1.9 MB)
{noformat}
[error]
{noformat}
Restore operation for snapshot "snapshot_25052021" failed (requestId=b9b312f5-ba34-40e9-bb94-35daacd552c0).
Error: Operation has been canceled by the user.
Started: 2021-10-05 15:51:52.255
Finished: 2021-10-05 15:51:52.782
Cache groups: default
Node e3c8d45b-2ccd-43ba-81ab-ea3bb9e00001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node 884cd446-38c2-4538-9dcd-81509eb00000: 100% completed (33/33 partitions, 1.9/1.9 MB)
{noformat}
[finished]
{noformat}
Restore operation for snapshot "snapshot_25052021" completed successfully (requestId=6adeea86-1ee2-4664-8d7d-3383a484a00a).
Progress: 100% completed (66/66 partitions, 3.8/3.8 MB)
Started: 2021-10-05 15:53:03.352
Finished: 2021-10-05 15:53:03.443
Cache groups: default
Node cc69e33f-de95-42b4-99af-86cf83900001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node b4f3bb36-aef3-4813-a3e9-9f7773600000: 100% completed (33/33 partitions, 1.9/1.9 MB)
{noformat}
[missing snapshot name]
{noformat}
No information about restoring snapshot "snapshot_MISSING" is available.
{noformat}

  was:
Add JMX command to restore a cache group from the snapshot.

Suggested methods
{code:java}
@MXBeanDescription("Restore cluster-wide snapshot.")
public void restoreSnapshot(
    @MXBeanParameter(name = "snpName", description = "Snapshot name.") String name,
    @MXBeanParameter(name = "cacheGroupNames", description = "Optional comma-separated list of cache group names.") String cacheGroupNames);

@MXBeanDescription("Cancel previously started snapshot restore operation.")
public void cancelSnapshotRestore(@MXBeanParameter(name = "snpName", description = "Snapshot name.") String name);
{code}
Since the automatic snapshot restore operation can take a long time, we must be able to track its progress using metrics.

Suggested metrics:
{noformat}
start time
partitions (processed/total)
bytes (processed/total)
end time
{noformat}

Suggested status command output.

[in progress]
{noformat}
Restore operation for snapshot "snapshot_25052021" is still in progress (requestId=0e2d8c06-d44a-4ade-91bf-2b84b367499a).
Progress: 100% completed (66/66 partitions, 3.8/3.8 MB)
Started: 2021-10-05 15:47:47.942
Cache groups: default
Node 11faec83-a304-48f7-aac7-e67bf8800001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node 99066100-890f-41a3-b0cd-4a3d59600000: 100% completed (33/33 partitions, 1.9/1.9 MB)
Command [SNAPSHOT] finished with code: 0
{noformat}
[finished]
{noformat}
Restore operation for snapshot "snapshot_25052021" completed successfully (requestId=6adeea86-1ee2-4664-8d7d-3383a484a00a).
Progress: 100% completed (66/66 partitions, 3.8/3.8 MB)
Started: 2021-10-05 15:53:03.352
Finished: 2021-10-05 15:53:03.443
Cache groups: default
Node cc69e33f-de95-42b4-99af-86cf83900001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node b4f3bb36-aef3-4813-a3e9-9f7773600000: 100% completed (33/33 partitions, 1.9/1.9 MB)
{noformat}
[missing snapshot name]
{noformat}
No information about restoring snapshot "snapshot_MISSING" is available.
{noformat}
[error]
{noformat}
Restore operation for snapshot "snapshot_25052021" failed (requestId=b9b312f5-ba34-40e9-bb94-35daacd552c0).
Error: Operation has been canceled by the user.
Started: 2021-10-05 15:51:52.255
Finished: 2021-10-05 15:51:52.782
Cache groups: default
Node e3c8d45b-2ccd-43ba-81ab-ea3bb9e00001: 100% completed (33/33 partitions, 1.9/1.9 MB)
Node 884cd446-38c2-4538-9dcd-81509eb00000: 100% completed (33/33 partitions, 1.9/1.9 MB)
{noformat}


> Add JMX command and metrics for automatic snapshot restore operation.
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-14794
>                 URL: https://issues.apache.org/jira/browse/IGNITE-14794
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Pavel Pereslegin
>            Assignee: Pavel Pereslegin
>            Priority: Major
>              Labels: iep-43
>             Fix For: 2.12
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add JMX command to restore a cache group from the snapshot.
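
The suggested metrics (partitions processed/total, bytes processed/total) map fairly directly onto the "Progress" line of the suggested status output. The following is a minimal sketch of that mapping, not Ignite code: the class name RestoreProgressSketch and the method progressLine are hypothetical, and it assumes that the percentage is derived from the partition counters and that MB means 1024 * 1024 bytes. The ticket does not specify either point.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Ignite): renders a progress line from the suggested metrics. */
class RestoreProgressSketch {
    /** Bytes per megabyte. Assumption: binary megabytes (1024 * 1024). */
    private static final double MB = 1024.0 * 1024.0;

    /**
     * Builds a line in the shape shown in the suggested status output.
     *
     * @param processedParts Number of processed partitions.
     * @param totalParts Total number of partitions to restore.
     * @param processedBytes Number of processed bytes.
     * @param totalBytes Total number of bytes to restore.
     */
    static String progressLine(int processedParts, int totalParts, long processedBytes, long totalBytes) {
        // Assumption: the percentage is computed from partition counters, not from bytes.
        // A zero total is reported as 0% to avoid division by zero.
        int pct = totalParts == 0 ? 0 : (int)(100L * processedParts / totalParts);

        // Locale.ROOT keeps '.' as the decimal separator regardless of the JVM default locale.
        return String.format(Locale.ROOT, "Progress: %d%% completed (%d/%d partitions, %.1f/%.1f MB)",
            pct, processedParts, totalParts, processedBytes / MB, totalBytes / MB);
    }

    public static void main(String[] args) {
        // Roughly 3.8 MB, matching the size used in the sample output.
        long bytes = (long)(3.8 * MB);

        System.out.println(progressLine(66, 66, bytes, bytes));
    }
}
```

The same helper could be applied per node to produce the "Node ..." lines, given per-node counters.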