Tan Kim created FLINK-31963:
-------------------------------

             Summary: java.lang.ArrayIndexOutOfBoundsException when scale down 
via autoscaler
                 Key: FLINK-31963
                 URL: https://issues.apache.org/jira/browse/FLINK-31963
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator, Runtime / Checkpointing
         Environment: Flink: 1.17.0
FKO: 1.4.0
StateBackend: RocksDB (Generic Incremental Checkpoint & Unaligned Checkpoint 
enabled)
            Reporter: Tan Kim
         Attachments: jobmanager_error.txt, taskmanager_error.txt

I'm testing the autoscaler through the Kubernetes Operator and have run into 
the following issue.

As you know, when a job is scaled down through the autoscaler, the job manager 
and task manager go down and then come back up.

When this happens, an index out of bounds exception is thrown and the state is 
not restored from a checkpoint.

[~gyfora] told me in the Flink Slack troubleshooting channel that this is 
likely an issue with unaligned checkpoints rather than with the autoscaler, 
but I'm opening this ticket with Gyula for further clarification.

Please see the attached JM and TM error logs.
Thank you.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
