[
https://issues.apache.org/jira/browse/FLINK-39752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gyula Fora closed FLINK-39752.
------------------------------
Assignee: Dennis-Mircea Ciupitu
Resolution: Fixed
Merged to main: 271e269abba59fc807bc533e49574db806b0dbb7
> Thread EventRecorder through FlinkResourceContext instead of holding it on
> the factory
> --------------------------------------------------------------------------------------
>
> Key: FLINK-39752
> URL: https://issues.apache.org/jira/browse/FLINK-39752
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.15.0
> Reporter: Dennis-Mircea Ciupitu
> Assignee: Dennis-Mircea Ciupitu
> Priority: Major
> Labels: pull-request-available
> Fix For: kubernetes-operator-1.16.0
>
>
> h1. Summary
> The operator creates more than one {{EventRecorder}} instance even though the
> recorder is a fixed, operator-scoped dependency that should be created once
> and shared. This leaves an inconsistency where, for some controllers, the
> recorder used by the reconciler and observer is a different instance from the
> one the {{FlinkService}} emits through.
> h1. Background
> {{FlinkResourceContextFactory}} holds a single {{EventRecorder}} and uses it
> when creating the {{FlinkService}}. The FlinkDeployment controller reuses
> that same shared instance, but the FlinkSessionJob and FlinkStateSnapshot
> controllers each create their own recorder via {{EventRecorder.create(...)}}
> in {{FlinkOperator}}. As a result, for those two controllers the recorder
> feeding the reconciler and observer is not the same instance the
> {{FlinkService}} uses.
> h1. Why this matters
> {{EventRecorder}} is currently a stateless dispatcher over the shared
> resource listeners and the Kubernetes client, so today the duplicate
> instances behave identically and the inconsistency is invisible. It becomes a
> real bug the moment any per-instance state is added to {{EventRecorder}} (for
> example an event de-duplication cache or rate limiter), because events
> emitted on the {{FlinkService}} path and events emitted on the controller
> path would then diverge.
> h1. Goal
> Ensure a single operator-scoped {{EventRecorder}} is created once and reused
> by every controller and by the {{FlinkResourceContextFactory}}, removing the
> duplicate per-controller instances. This keeps the recorder a shared fixed
> dependency owned by the factory, without passing it on every
> {{getResourceContext}} call.
> h1. Notes
> This is a behavior-preserving consistency fix. No public API, CRD, or
> reconciliation behavior changes. It is covered by the existing controller
> tests.
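
The divergence described under "Why this matters" can be sketched in a few lines of Java. This is a minimal illustration, not the operator's code: DedupRecorder is a hypothetical stand-in for EventRecorder that carries the kind of per-instance state the issue warns about (a de-duplication cache), the list named sink stands in for the Kubernetes event API, and the event key "job-1/Suspended" is made up.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for EventRecorder, NOT the operator's real class.
// Unlike today's stateless recorder, it keeps a per-instance de-duplication cache.
final class DedupRecorder {
    private final Set<String> seen = new HashSet<>();
    private final List<String> sink; // stands in for the Kubernetes event API

    DedupRecorder(List<String> sink) {
        this.sink = sink;
    }

    // Emits the event only if this instance has not seen the key before.
    boolean emit(String key) {
        if (!seen.add(key)) {
            return false; // duplicate as far as this instance knows
        }
        sink.add(key);
        return true;
    }
}

final class RecorderSketch {
    // Assumed event key, for illustration only.
    private static final String EVENT = "job-1/Suspended";

    // Controller path and service path each hold their own recorder,
    // so neither cache knows what the other already emitted.
    static int emittedWithSeparateRecorders() {
        List<String> sink = new ArrayList<>();
        DedupRecorder controllerRecorder = new DedupRecorder(sink);
        DedupRecorder serviceRecorder = new DedupRecorder(sink);
        controllerRecorder.emit(EVENT);
        serviceRecorder.emit(EVENT);
        return sink.size();
    }

    // Both paths share one operator-scoped recorder, so the cache is consistent.
    static int emittedWithSharedRecorder() {
        List<String> sink = new ArrayList<>();
        DedupRecorder shared = new DedupRecorder(sink);
        shared.emit(EVENT); // controller path
        shared.emit(EVENT); // service path
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("separate recorders: " + emittedWithSeparateRecorders());
        System.out.println("shared recorder: " + emittedWithSharedRecorder());
    }
}
```

With separate instances the same event reaches the sink twice; with one shared instance it reaches the sink once. Because the real recorder is stateless today, this difference is not observable yet, which is why the change is described as behavior-preserving.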