[ 
https://issues.apache.org/jira/browse/IGNITE-25815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-25815:
---------------------------------------
    Description: 
PartitionMover retries on every exception, including non-retryable ones, which 
sometimes leads to an infinite retry loop if something is broken.
 # We need to differentiate between retryable and non-retryable exceptions.
 # For non-retryable ones, we could call FailureManager right away and stop 
retrying; or we could switch the replica to a special error state, which avoids 
crashing the node while still indicating that something is wrong.
 # For retryable ones, we should add a retry counter and stop treating an 
exception as retryable once the counter reaches some limit (that is, stop 
retrying and notify FailureManager or switch the replica state to error).
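
The three points above could be sketched roughly as follows. This is only an 
illustration of the proposed policy, not the actual PartitionMover code: the 
class BoundedRetrySketch, the Outcome enum, the Attempt interface and the 
maxRetries parameter are all hypothetical names, and the loop is synchronous 
for brevity. Which exceptions count as retryable is left to a predicate 
supplied by the caller.

```java
import java.util.function.Predicate;

/** Hypothetical sketch; none of these names come from the Ignite code base. */
final class BoundedRetrySketch {
    /** Result of running an operation under the retry policy. */
    enum Outcome { SUCCEEDED, FAILED_NON_RETRYABLE, FAILED_RETRIES_EXHAUSTED }

    /** An operation that may throw, standing in for one partition move attempt. */
    interface Attempt {
        void run() throws Exception;
    }

    private final Predicate<Throwable> retryable;
    private final int maxRetries;

    BoundedRetrySketch(Predicate<Throwable> retryable, int maxRetries) {
        this.retryable = retryable;
        this.maxRetries = maxRetries;
    }

    /** Runs the attempt, retrying only retryable failures and at most maxRetries times. */
    Outcome execute(Attempt attempt) {
        int retries = 0;

        while (true) {
            try {
                attempt.run();
                return Outcome.SUCCEEDED;
            } catch (Exception e) {
                if (!retryable.test(e)) {
                    // Point 2: a non-retryable failure stops the loop immediately.
                    return Outcome.FAILED_NON_RETRYABLE;
                }
                if (retries >= maxRetries) {
                    // Point 3: the retry budget is spent, so stop retrying.
                    return Outcome.FAILED_RETRIES_EXHAUSTED;
                }
                retries++;
            }
        }
    }
}
```

On either failed outcome the caller would then do what the description 
proposes: notify FailureManager or switch the replica to an error state. With 
maxRetries set to 3, an operation that always fails with a retryable exception 
is attempted four times in total (one initial attempt plus three retries).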

> Improve exception handling in PartitionMover
> --------------------------------------------
>
>                 Key: IGNITE-25815
>                 URL: https://issues.apache.org/jira/browse/IGNITE-25815
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
