Pavel Zeger created FLINK-39692:
-----------------------------------
Summary: Several catch blocks log the message but discard the
captured exception
Key: FLINK-39692
URL: https://issues.apache.org/jira/browse/FLINK-39692
Project: Flink
Issue Type: Bug
Components: Kubernetes Operator
Reporter: Pavel Zeger
*Where*
Five sites have a `catch (Exception e)` followed by a `LOG.warn` /
`LOG.error` call that does not pass `e` as the last argument, so the stack
trace is silently discarded:
1. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/snapshot/StateSnapshotReconciler.java`
   line 173 - failure to dispose savepoint.
2. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/state/KubernetesAutoScalerStateStore.java`
   line 393 - failure to decompress scaling data.
3. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/IngressUtils.java`
   line 369 - failure to parse Kubernetes server version.
4. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkConfigManager.java`
   lines 264 and 266 - failure to parse a Flink-version-prefixed config key.
5. `flink-autoscaler/src/main/java/org/apache/flink/autoscaler/tuning/MemoryTuning.java`
   line 90 - failure to parse memory configuration.
When something abnormal happens - a savepoint can't be disposed, scaling data
can't be decompressed, a memory config is rejected - operators want to see the
**root cause**, not just "something failed."
Today, when an operator team gets paged for one of these warnings, they have to:
1. Read the code to figure out what `Exception` types could have been thrown.
2. Re-derive (often incorrectly) what the underlying problem might be.
3. Hope it reproduces while they have a debugger attached.
Passing the exception costs nothing and saves debugging time.
*Proposed fix*
Mechanical, one-line change at each site:
```diff
-LOG.warn("Error while decompressing scaling data, treating as uncompressed");
+LOG.warn("Error while decompressing scaling data, treating as uncompressed", e);
```
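To illustrate what is lost, here is a minimal, self-contained sketch. It uses `java.util.logging` as a stand-in so it runs without extra dependencies; this is an assumption made for the example, not the logging API at the sites above, which call `LOG.warn` / `LOG.error`. The class `SwallowedExceptionDemo`, the `decompress()` helper and its "corrupt header" message are hypothetical. The sketch captures the emitted log record and shows that the throwable is only attached when it is passed explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class SwallowedExceptionDemo {

    /** Collects log records in memory so they can be inspected afterwards. */
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            records.add(record);
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    /** Hypothetical stand-in for an operation that fails, e.g. decompression. */
    static void decompress() {
        throw new IllegalStateException("corrupt header");
    }

    /** Runs the failing operation and returns the single record that was logged. */
    static LogRecord logFailure(boolean passException) {
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // keep the demo output off the console
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);
        try {
            decompress();
        } catch (Exception e) {
            if (passException) {
                // Exception attached: type, message and stack trace are kept.
                log.log(Level.WARNING, "Error while decompressing scaling data", e);
            } else {
                // Message only: the caught exception is dropped.
                log.warning("Error while decompressing scaling data");
            }
        }
        return handler.records.get(0);
    }

    public static void main(String[] args) {
        System.out.println(logFailure(false).getThrown()); // prints "null"
        System.out.println(logFailure(true).getThrown());
        // prints "java.lang.IllegalStateException: corrupt header"
    }
}
```

With SLF4J, the equivalent of the second branch is to pass the exception as the final argument, as in the diff above; SLF4J treats a trailing `Throwable` that is not consumed by a `{}` placeholder as the exception to log.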