[
https://issues.apache.org/jira/browse/IGNITE-24942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Pligin reassigned IGNITE-24942:
----------------------------------------
Assignee: Roman Puchkovskiy
> StackOverflowError in PartitionMover
> ------------------------------------
>
> Key: IGNITE-24942
> URL: https://issues.apache.org/jira/browse/IGNITE-24942
> Project: Ignite
> Issue Type: Bug
> Reporter: Roman Puchkovskiy
> Assignee: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
>
> PartitionMover retries on every exception, including non-retryable ones.
> There is no retry limit, and a retry might run in the same thread as the
> failed attempt. If something is broken, the retries can nest without bound,
> which sometimes ends in a StackOverflowError.
> # We need to differentiate which exceptions are retryable and which are not
> # For non-retryable ones, we should call FailureManager right away and stop
> retrying
> # For retryable ones, we should add a retry counter and stop treating an
> exception as retryable once the counter reaches some limit (that is, stop
> retrying and notify FailureManager)
> # Maybe we should initiate each retry in a separate thread pool to avoid
> stack overflow if there are many retries (or simply pick a max retry count
> that is small enough not to trigger stack overflow)
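
The points above could be combined roughly as follows. This is only a minimal sketch, not the actual PartitionMover code: the class name BoundedRetry, the retryable predicate, and the failureHandler callback (a stand-in for FailureManager, whose real API is not shown in this issue) are all hypothetical. It assumes the retried operation returns a CompletableFuture.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper: bounded, classified retries dispatched off the failing thread. */
final class BoundedRetry {
    private final int maxRetries;
    private final Predicate<Throwable> retryable;     // assumption: caller decides what is retryable
    private final Consumer<Throwable> failureHandler; // stand-in for notifying FailureManager
    private final Executor retryExecutor;             // retries run here, not on the failing thread

    BoundedRetry(int maxRetries, Predicate<Throwable> retryable,
            Consumer<Throwable> failureHandler, Executor retryExecutor) {
        this.maxRetries = maxRetries;
        this.retryable = retryable;
        this.failureHandler = failureHandler;
        this.retryExecutor = retryExecutor;
    }

    <T> CompletableFuture<T> run(Supplier<CompletableFuture<T>> action) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(action, 0, result);
        return result;
    }

    private <T> void attempt(Supplier<CompletableFuture<T>> action, int retriesSoFar,
            CompletableFuture<T> result) {
        CompletableFuture<T> attemptFuture;
        try {
            attemptFuture = action.get();
        } catch (RuntimeException e) {
            // Treat a synchronous throw the same way as a failed future.
            attemptFuture = CompletableFuture.failedFuture(e);
        }

        attemptFuture.whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
                return;
            }
            Throwable cause = unwrap(error);
            if (retryable.test(cause) && retriesSoFar < maxRetries) {
                try {
                    // Hand the retry to the executor so the call stack does not grow.
                    retryExecutor.execute(() -> attempt(action, retriesSoFar + 1, result));
                } catch (RejectedExecutionException rejected) {
                    fail(rejected, result);
                }
            } else {
                // Non-retryable, or the retry limit is exhausted: stop and report.
                fail(cause, result);
            }
        });
    }

    private void fail(Throwable cause, CompletableFuture<?> result) {
        try {
            failureHandler.accept(cause);
        } finally {
            result.completeExceptionally(cause);
        }
    }

    private static Throwable unwrap(Throwable t) {
        Throwable current = t;
        while (current instanceof CompletionException && current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }
}
```

With maxRetries set to 3, an operation that always fails with a retryable exception is invoked four times (the initial attempt plus three retries) and the failure handler is notified once. A non-retryable exception stops after the first attempt. Because each retry is submitted to retryExecutor rather than called inline, the stack depth stays constant regardless of the limit chosen.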