joaopamaral opened a new pull request, #6722:
URL: https://github.com/apache/kyuubi/pull/6722
# :mag: Description
## Issue References
This issue was noticed a few times: the batch `state` was set to `ERROR`,
but the `appState` stayed in a non-terminal state (e.g. `RUNNING`) forever,
even though the application (in this case a YARN application) had already
finished.
```json
{
"id": "********",
"user": "****",
"batchType": "SPARK",
"name": "*********",
"appStartTime": 0,
"appId": "********",
"appUrl": "********",
"appState": "RUNNING",
"appDiagnostic": "",
"kyuubiInstance": "*********",
"state": "ERROR",
"createTime": 1725343207318,
"endTime": 1725343300986,
"batchInfo": {}
}
```
This seems to happen when an intermittent failure occurs during the
monitoring step: the batch ends with `ERROR`, but the application metadata is
never updated. The stale `appState` can then be misread as the application
still running. Setting the `appState` to `UNKNOWN` in this case avoids that
misinterpretation.
## Describe Your Solution
This is a simple fix applied during the batch metadata update: if the batch
state is `ERROR` and the `appState` is not in a terminal state, the `appState`
is changed to `UNKNOWN`.
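
The rule can be illustrated with a minimal, language-neutral sketch. This is not the code in this patch: `BatchMetadata`, `reconcile_app_state`, and the contents of `TERMINAL_APP_STATES` are hypothetical names and assumptions chosen only to show the check.

```python
from dataclasses import dataclass, replace

# Assumption: which application states count as terminal. The real set is
# defined inside Kyuubi and may differ from this illustration.
TERMINAL_APP_STATES = frozenset({"FINISHED", "FAILED", "KILLED", "NOT_FOUND"})


@dataclass(frozen=True)
class BatchMetadata:
    """Hypothetical, trimmed-down stand-in for the batch metadata record."""
    state: str
    app_state: str


def reconcile_app_state(meta: BatchMetadata) -> BatchMetadata:
    """Force app_state to UNKNOWN when the batch ended in ERROR but the
    last observed app_state is still non-terminal."""
    if meta.state == "ERROR" and meta.app_state not in TERMINAL_APP_STATES:
        return replace(meta, app_state="UNKNOWN")
    # Any other combination is left untouched.
    return meta
```

With this sketch, a record with `state="ERROR"` and `app_state="RUNNING"` comes back with `app_state="UNKNOWN"`, while a record whose application already reached a terminal state is returned unchanged.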
## Types of changes :bookmark:
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
## Test Plan
#### Behavior Without This Pull Request :coffin:
If an error occurs in a request between Kyuubi and the application (e.g.
through the YARN client), the batch finishes in the `ERROR` state and the
application keeps its last known state (e.g. `RUNNING`).
#### Behavior With This Pull Request :tada:
If an error occurs in a request between Kyuubi and the application (e.g.
through the YARN client) and the batch finishes in the `ERROR` state while the
application is in a non-terminal state, the `appState` is forced to `UNKNOWN`.
#### Related Unit Tests
I tried to implement a unit test that replicates this behavior, but did not
succeed. The test needs to force an exception in the engine request (e.g.
`YarnClient.getApplication`), but only after the application has reached the
`RUNNING` state; alternatively, it could block the connection between Kyuubi
and the engine.
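
One possible shape for such a test double is sketched below. It is only an illustration, not Kyuubi test code: `FlakyAppClient`, `get_app_state`, and the toy `monitor` loop are hypothetical names, and a real test would need to wrap whatever client the engine monitoring actually uses.

```python
class FlakyAppClient:
    """Hypothetical test double: reports RUNNING for the first
    `healthy_calls` polls, then raises on every later poll."""

    def __init__(self, healthy_calls: int = 1) -> None:
        self.healthy_calls = healthy_calls
        self.calls = 0

    def get_app_state(self, app_id: str) -> str:
        self.calls += 1
        if self.calls > self.healthy_calls:
            raise ConnectionError(f"simulated failure while polling {app_id}")
        return "RUNNING"


def monitor(client: FlakyAppClient, app_id: str, max_polls: int = 5):
    """Toy monitoring loop: returns (batch_state, last_seen_app_state)."""
    last_seen = "PENDING"
    for _ in range(max_polls):
        try:
            last_seen = client.get_app_state(app_id)
        except ConnectionError:
            # The batch ends in ERROR and keeps the stale application state,
            # which is the situation described in this pull request.
            return "ERROR", last_seen
    return "RUNNING", last_seen
```

Calling `monitor(FlakyAppClient(healthy_calls=2), "app-1")` in this sketch yields `("ERROR", "RUNNING")`, the stale combination shown in the JSON example.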
---
# Checklist
- [ ] This patch was not authored or co-authored using [Generative
Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**