[ 
https://issues.apache.org/jira/browse/IGNITE-24942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Pligin reassigned IGNITE-24942:
----------------------------------------

    Assignee: Roman Puchkovskiy

> StackOverflowError in PartitionMover
> ------------------------------------
>
>                 Key: IGNITE-24942
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24942
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Roman Puchkovskiy
>            Assignee: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>
> PartitionMover retries on every exception, including non-retryable ones. 
> There is no retry limit, and a retry may run in the same thread as the 
> failed attempt, which can lead to an infinite retry loop (ending in 
> StackOverflowError) when something is persistently broken.
>  # We need to differentiate which exceptions are retryable and which are not
>  # For non-retryable ones, we should call FailureManager right away and stop 
> retrying
>  # For retryable ones, we should add a retry counter and stop treating the 
> exception as retryable once the counter reaches some limit (that is, stop 
> retrying and notify FailureManager)
>  # Maybe we should initiate retries in a separate thread pool to avoid stack 
> overflow if there are many retries (or simply pick a max retry count that is 
> small enough not to trigger stack overflow)
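
The four points above could be combined roughly as in the following sketch. It is not the actual PartitionMover code: the class name BoundedRetry, the retryable predicate, and the failureHandler callback (a stand-in for notifying FailureManager) are all hypothetical names chosen for illustration, and the real Ignite 3 APIs may look different.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical sketch of a bounded, classified retry; not Ignite code. */
class BoundedRetry {
    private final Executor retryExecutor;             // where retries are scheduled
    private final int maxRetries;                     // retries allowed after the first attempt
    private final Predicate<Throwable> retryable;     // classifies exceptions
    private final Consumer<Throwable> failureHandler; // stand-in for FailureManager

    BoundedRetry(Executor retryExecutor, int maxRetries,
            Predicate<Throwable> retryable, Consumer<Throwable> failureHandler) {
        this.retryExecutor = retryExecutor;
        this.maxRetries = maxRetries;
        this.retryable = retryable;
        this.failureHandler = failureHandler;
    }

    /** Runs the action, retrying retryable failures up to maxRetries times. */
    CompletableFuture<Void> run(Supplier<CompletableFuture<Void>> action) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        attempt(action, 0, result);
        return result;
    }

    private void attempt(Supplier<CompletableFuture<Void>> action, int retriesSoFar,
            CompletableFuture<Void> result) {
        CompletableFuture<Void> future;
        try {
            future = action.get();
        } catch (Throwable t) {
            // Treat a synchronous throw like an asynchronous failure.
            future = CompletableFuture.failedFuture(t);
        }

        future.whenComplete((ignored, err) -> {
            if (err == null) {
                result.complete(null);
                return;
            }
            Throwable cause = unwrap(err);
            // Non-retryable, or retry budget exhausted: report and stop.
            if (!retryable.test(cause) || retriesSoFar >= maxRetries) {
                failureHandler.accept(cause);
                result.completeExceptionally(cause);
                return;
            }
            try {
                // Hop to the executor so the retry starts on a fresh stack
                // instead of nesting inside this callback.
                retryExecutor.execute(() -> attempt(action, retriesSoFar + 1, result));
            } catch (RuntimeException rejected) {
                // For example, the executor was shut down.
                result.completeExceptionally(rejected);
            }
        });
    }

    private static Throwable unwrap(Throwable t) {
        return t instanceof CompletionException && t.getCause() != null ? t.getCause() : t;
    }
}
```

With maxRetries = 3, an action that always fails with a retryable exception is invoked four times (one initial attempt plus three retries) before the failure handler is called once. Because every retry is submitted to the executor rather than invoked from the completion callback directly, the stack depth stays constant regardless of the limit.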



