Pavel Zeger created FLINK-39692:
-----------------------------------

             Summary: Several catch blocks log the message but discard the captured exception
                 Key: FLINK-39692
                 URL: https://issues.apache.org/jira/browse/FLINK-39692
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
            Reporter: Pavel Zeger


*Where*

In five places, a `catch (Exception e)` block is followed by a `LOG.warn` / 
`LOG.error` call that does not pass `e` as the last argument, so the stack 
trace is silently discarded:

1. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/snapshot/StateSnapshotReconciler.java` line 173 - failure to dispose savepoint.
2. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/state/KubernetesAutoScalerStateStore.java` line 393 - failure to decompress scaling data.
3. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/IngressUtils.java` line 369 - failure to parse Kubernetes server version.
4. `flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkConfigManager.java` lines 264 and 266 - failure to parse a Flink-version-prefixed config key.
5. `flink-autoscaler/src/main/java/org/apache/flink/autoscaler/tuning/MemoryTuning.java` line 90 - failure to parse memory configuration.

When something abnormal happens - a savepoint can't be disposed, scaling data 
can't be decompressed, a memory config is rejected - operators want to see the 
**root cause**, not just "something failed."

Today, when an operator team gets paged for one of these warnings, they have to:

1. Read the code to figure out what `Exception` types could have been thrown.
2. Re-derive (often incorrectly) what the underlying problem might be.
3. Hope it reproduces while they have a debugger attached.

Passing the exception costs nothing and saves debugging time.

*Proposed fix*

Mechanical, one-line change at each site:

```diff
-LOG.warn("Error while decompressing scaling data, treating as uncompressed");
+LOG.warn("Error while decompressing scaling data, treating as uncompressed", e);
```
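
To illustrate what the change buys, here is a minimal, self-contained sketch. It is not the operator's code: it uses `java.util.logging` as a stand-in so it runs without dependencies, whereas the sites above use a `LOG.warn(String, Throwable)`-style call. The class `LogWithCauseSketch`, the failing `decompress` helper, and the `capture` helper are hypothetical names introduced only for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LogWithCauseSketch {
    private static final Logger LOG =
            Logger.getLogger(LogWithCauseSketch.class.getName());
    private static final String MSG =
            "Error while decompressing scaling data, treating as uncompressed";

    /** Keeps log records in memory so the attached throwable can be inspected. */
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();
        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Hypothetical stand-in for a decompression step that always fails. */
    static byte[] decompress(byte[] data) throws IOException {
        throw new IOException("Not in GZIP format");
    }

    /** Current pattern: the message is logged, the exception is dropped. */
    static void logWithoutCause(byte[] data) {
        try {
            decompress(data);
        } catch (Exception e) {
            LOG.log(Level.WARNING, MSG);
        }
    }

    /** Proposed pattern: the exception travels with the log record. */
    static void logWithCause(byte[] data) {
        try {
            decompress(data);
        } catch (Exception e) {
            LOG.log(Level.WARNING, MSG, e);
        }
    }

    /** Runs the action and returns every record it logged. */
    static List<LogRecord> capture(Runnable action) {
        CapturingHandler handler = new CapturingHandler();
        LOG.setUseParentHandlers(false); // keep the demo off the console handler
        LOG.addHandler(handler);
        try {
            action.run();
        } finally {
            LOG.removeHandler(handler);
        }
        return handler.records;
    }

    public static void main(String[] args) {
        LogRecord before = capture(() -> logWithoutCause(new byte[0])).get(0);
        LogRecord after = capture(() -> logWithCause(new byte[0])).get(0);
        System.out.println("without e, thrown attached: " + (before.getThrown() != null));
        System.out.println("with e, thrown attached:    " + (after.getThrown() != null));
    }
}
```

In both variants the message text is identical; the only difference is whether the log record carries the throwable, which is what lets a formatter print the exception type, message, and stack trace.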
