[
https://issues.apache.org/jira/browse/IGNITE-26421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Pligin reassigned IGNITE-26421:
----------------------------------------
Assignee: Kirill Sizov
> Make node stop time unbounded
> -----------------------------
>
> Key: IGNITE-26421
> URL: https://issues.apache.org/jira/browse/IGNITE-26421
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Assignee: Kirill Sizov
> Priority: Major
> Labels: ignite-3
>
> Currently we may abort a component stop if it doesn't complete within some
> time bound, which makes no sense. E.g. in
> PartitionReplicaLifecycleManager#cleanUpPartitionsResources we give up on
> stopping partitions if it takes longer than 30 seconds:
> {code:java}
> allOf(stopPartitionsFuture).get(30, TimeUnit.SECONDS);{code}
> Sometimes, especially on slow machines, 30 seconds might not be enough. In
> that case some replicas won't be stopped, which will lead to an assertion
> error on Loza stop because some raft groups will still be alive, which is
> not expected.
> Proper behaviour should be as follows:
> * Stop each component for whatever time it takes.
> * If the stop timeout is exceeded, log in greater detail what the node is
> doing: e.g. which partition is stopping, etc.
> * In case of exceptions, trigger the failure handler (FH).
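For illustration only, the behaviour listed above could be sketched roughly as follows. This is not the actual Ignite code: the class UnboundedStop, the methods pending and awaitStop, the map of named stop futures and the log callback are all hypothetical names made up for this sketch. The idea is to keep the time bound only as a warning interval: when it elapses, report which stops are still pending and go on waiting instead of giving up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

/** Hypothetical helper, not part of the Ignite code base. */
final class UnboundedStop {
    /** Returns the names of the stop futures that have not completed yet. */
    static List<String> pending(Map<String, CompletableFuture<Void>> stopFutures) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, CompletableFuture<Void>> e : stopFutures.entrySet()) {
            if (!e.getValue().isDone()) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    /**
     * Waits until every stop future completes, however long that takes.
     * Each time the warn interval elapses, reports what is still stopping
     * and keeps waiting. Returns the number of warnings reported.
     * A failed stop surfaces as ExecutionException, so the caller can pass
     * it on to whatever failure handling it uses.
     */
    static int awaitStop(
            Map<String, CompletableFuture<Void>> stopFutures,
            long warnInterval,
            TimeUnit unit,
            Consumer<String> log
    ) throws InterruptedException, ExecutionException {
        CompletableFuture<Void> all =
                CompletableFuture.allOf(stopFutures.values().toArray(new CompletableFuture[0]));
        int warnings = 0;
        while (true) {
            try {
                all.get(warnInterval, unit);
                return warnings;
            } catch (TimeoutException e) {
                // The bound is only a reporting interval: log details and wait again.
                warnings++;
                log.accept("Stop is taking longer than expected, still stopping: "
                        + pending(stopFutures));
            }
        }
    }
}
```

In this sketch a caller would key the map by something descriptive (e.g. a partition id) so that the warning names exactly what is still stopping. Exceptions from a failed stop are deliberately not swallowed, so they stay visible to the caller.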
--
This message was sent by Atlassian Jira
(v8.20.10#820010)