[jira] [Updated] (IGNITE-7278) Node failed to recover partition from WAL on unstable topology
[ https://issues.apache.org/jira/browse/IGNITE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Govorukhin updated IGNITE-7278:
---
    Priority: Blocker  (was: Major)

> Node failed to recover partition from WAL on unstable topology
> --
>
> Key: IGNITE-7278
> URL: https://issues.apache.org/jira/browse/IGNITE-7278
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Andrew Mashenkov
> Assignee: Dmitriy Govorukhin
> Priority: Blocker
> Fix For: 2.4
>
> Attachments: page_corrupted2.tar.gz
>
>
> The use case is:
> - A grid with a partitioned cache with 2 backups (or a replicated cache).
> - Node-1 is killed in the middle of a checkpoint and started again.
> - Node-1 detects the unfinished checkpoint and tries to recover it.
> - At this point Node-2 is killed while Node-1's recovery is still in progress.
> - Node-1 fails with an AssertionError.
> Please find attached the logs, the parsed WAL, and a reproducer.
> Can be reproduced with IgnitePdsContinuousRestartTest with minor changes:
> two nodes must be flapping, and the nodes must be killed ungracefully.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (IGNITE-7278) Node failed to recover partition from WAL on unstable topology
[ https://issues.apache.org/jira/browse/IGNITE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Govorukhin updated IGNITE-7278:
---
    Summary: Node failed to recover partition from WAL on unstable topology  (was: Node failed to recover partition from WAL on unstable topology.)

> Node failed to recover partition from WAL on unstable topology
> --
>
> Key: IGNITE-7278
> URL: https://issues.apache.org/jira/browse/IGNITE-7278
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Andrew Mashenkov
> Assignee: Dmitriy Govorukhin
> Priority: Major
> Fix For: 2.4
>
> Attachments: page_corrupted2.tar.gz
[jira] [Updated] (IGNITE-7278) Node failed to recover partition from WAL on unstable topology.
[ https://issues.apache.org/jira/browse/IGNITE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Mashenkov updated IGNITE-7278:
-
    Attachment: page_corrupted2.tar.gz

> Node failed to recover partition from WAL on unstable topology.
> ---
>
> Key: IGNITE-7278
> URL: https://issues.apache.org/jira/browse/IGNITE-7278
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Andrew Mashenkov
> Assignee: Andrew Mashenkov
> Fix For: 2.4
>
> Attachments: page_corrupted2.tar.gz

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)