prshnt opened a new pull request, #1101:
URL: https://github.com/apache/flink-kubernetes-operator/pull/1101

   ### What is the purpose of the change
   
   Fix the operator getting stuck in a CANCELLING loop when a job has already 
reached a terminal state (e.g. FAILED, FINISHED) before the cancel request is 
processed. cancelJobOrError now returns a boolean indicating whether the cancel 
is pending (true) or the job was already terminated (false), allowing the 
reconciler to skip the async re-observe wait and proceed directly to cleanup.
   
   ### Brief change log
   • Changed cancelJobOrError return type from void to boolean to distinguish 
between "cancel submitted, wait for async completion" vs "job already gone, 
proceed immediately"
   • Extended isJobTerminated to also match Flink's "already reached another 
terminal state" error message (e.g. HTTP 400 BAD_REQUEST with that text), in 
addition to the existing HTTP CONFLICT check
   • When the job is already missing or terminated during a STATELESS/CANCEL 
suspend, the operator no longer returns CancelResult.pending() — it falls 
through to CancelResult.completed()
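
The three changes above can be illustrated with a small, self-contained sketch. It is not the operator's code: `RestError`, `CancelCall`, the local `CancelResult` enum, and the method bodies are hypothetical stand-ins that borrow the names used in this description. The sketch assumes the terminal-state check is a substring match on the error message, and it leaves out the "job missing" case.

```java
// Hypothetical sketch; none of these types are the real operator or Flink classes.
class CancelSketch {

    /** Stand-in for a REST client failure carrying an HTTP status code. */
    static class RestError extends RuntimeException {
        final int httpStatus;

        RestError(int httpStatus, String message) {
            super(message);
            this.httpStatus = httpStatus;
        }
    }

    /** Stand-in for the REST call that requests cancellation. */
    interface CancelCall {
        void run();
    }

    /** Local stand-in for the pending/completed outcome named in the description. */
    enum CancelResult { PENDING, COMPLETED }

    static final int HTTP_CONFLICT = 409;
    // Text fragment quoted in the description; matching by substring is an assumption.
    static final String TERMINAL_STATE_TEXT = "already reached another terminal state";

    /** True for the existing CONFLICT case or for the terminal-state message. */
    static boolean isJobTerminated(RestError e) {
        if (e.httpStatus == HTTP_CONFLICT) {
            return true;
        }
        String message = e.getMessage();
        return message != null && message.contains(TERMINAL_STATE_TEXT);
    }

    /** Returns true if the cancel is pending, false if the job was already terminated. */
    static boolean cancelJobOrError(CancelCall call) {
        try {
            call.run();
            return true;
        } catch (RestError e) {
            if (isJobTerminated(e)) {
                return false;
            }
            throw e; // any other failure is still an error
        }
    }

    /** Wait for async completion only when a cancel was actually submitted. */
    static CancelResult cancel(CancelCall call) {
        return cancelJobOrError(call) ? CancelResult.PENDING : CancelResult.COMPLETED;
    }

    public static void main(String[] args) {
        // Cancel accepted: the reconciler should wait and re-observe.
        System.out.println(cancel(() -> { }));
        // Job already terminal: skip the wait and proceed to cleanup.
        System.out.println(cancel(() -> {
            throw new RestError(400, "Job already reached another terminal state (FAILED).");
        }));
    }
}
```

Returning a boolean keeps the decision about waiting in the caller, so an already-terminated job never enters the pending path in the first place.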
   
   ### Verifying this change
   This change added tests and can be verified as follows:
   • Added cancelErrorHandlingWithTerminalStateMessage unit test: simulates a 
REST client returning a 400 BAD_REQUEST with the "already reached another 
terminal state" message during cancel, and asserts that the job status 
transitions to FINISHED with the job ID cleared (rather than remaining stuck in 
CANCELLING)
   • Updated existing cancelSessionJobTest to assert the job reaches FINISHED 
state (not CANCELLING) when the job is already gone during a stateless cancel
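
The scenario covered by the new unit test can be approximated without the operator's test harness. The sketch below is only an illustration of the expected status transition: `JobStatus`, `RestClient`, `BadRequest`, and `cancelStateless` are hypothetical names, and the error text is an example containing the quoted fragment, not a verbatim Flink message.

```java
// Hypothetical sketch of the tested scenario; not the operator's actual test.
class TerminalStateCancelTestSketch {

    enum JobState { RUNNING, CANCELLING, FINISHED }

    /** Minimal stand-in for the job status tracked by the operator. */
    static class JobStatus {
        String jobId = "job-1";
        JobState state = JobState.RUNNING;
    }

    /** Stand-in for a 400 BAD_REQUEST response from the REST client. */
    static class BadRequest extends RuntimeException {
        BadRequest(String message) {
            super(message);
        }
    }

    /** Stand-in for the REST client used to cancel a job. */
    interface RestClient {
        void cancel(String jobId);
    }

    /** Stateless cancel: stay in CANCELLING only if the cancel is really pending. */
    static void cancelStateless(JobStatus status, RestClient client) {
        boolean pending;
        try {
            client.cancel(status.jobId);
            pending = true;
        } catch (BadRequest e) {
            String message = e.getMessage();
            if (message == null || !message.contains("already reached another terminal state")) {
                throw e;
            }
            pending = false;
        }
        if (pending) {
            status.state = JobState.CANCELLING;
        } else {
            // Job already gone: finish immediately and clear the job ID.
            status.state = JobState.FINISHED;
            status.jobId = null;
        }
    }

    public static void main(String[] args) {
        JobStatus status = new JobStatus();
        cancelStateless(status, id -> {
            throw new BadRequest("Job " + id + " already reached another terminal state (FAILED).");
        });
        if (status.state != JobState.FINISHED || status.jobId != null) {
            throw new AssertionError("job should be FINISHED with the job ID cleared");
        }
        System.out.println("state=" + status.state + ", jobId=" + status.jobId);
    }
}
```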
   
   ### Does this pull request potentially affect one of the following parts:
   • Dependencies: no
   • Public API / CRD changes: no
   • Core observer or reconciler logic that is regularly executed: yes — the 
cancel/suspend path in AbstractFlinkService
   
   ### Documentation
   • Does this pull request introduce a new feature? no (bug fix)

