[ 
https://issues.apache.org/jira/browse/FLINK-39752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-39752.
------------------------------
      Assignee: Dennis-Mircea Ciupitu
    Resolution: Fixed

Merged to main in commit 271e269abba59fc807bc533e49574db806b0dbb7

> Thread EventRecorder through FlinkResourceContext instead of holding it on 
> the factory
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-39752
>                 URL: https://issues.apache.org/jira/browse/FLINK-39752
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.15.0
>            Reporter: Dennis-Mircea Ciupitu
>            Assignee: Dennis-Mircea Ciupitu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: kubernetes-operator-1.16.0
>
>
> h1. Summary
> The operator creates more than one {{EventRecorder}} instance even though the 
> recorder is a fixed, operator-scoped dependency that should be created once 
> and shared. This leaves an inconsistency where, for some controllers, the 
> recorder used by the reconciler and observer is a different instance from the 
> one the {{FlinkService}} emits through.
> h1. Background
> {{FlinkResourceContextFactory}} holds a single {{EventRecorder}} and uses it 
> when creating the {{FlinkService}}. The FlinkDeployment controller reuses 
> that same shared instance, but the FlinkSessionJob and FlinkStateSnapshot 
> controllers each create their own recorder via {{EventRecorder.create(...)}} 
> in {{FlinkOperator}}. As a result, for those two controllers the recorder 
> feeding the reconciler and observer is not the same instance the 
> {{FlinkService}} uses.
> h1. Why this matters
> {{EventRecorder}} is currently a stateless dispatcher over the shared 
> resource listeners and the Kubernetes client, so the duplicate instances 
> behave identically and the inconsistency is invisible. It becomes a real 
> bug as soon as any per-instance state is added to {{EventRecorder}} (for 
> example, an event de-duplication cache or a rate limiter), because events 
> emitted on the {{FlinkService}} path and events emitted on the controller 
> path would then be tracked by separate state and could diverge.
> h1. Goal
> Ensure a single operator-scoped {{EventRecorder}} is created once and reused 
> by every controller and by the {{FlinkResourceContextFactory}}, removing the 
> duplicate per-controller instances. This keeps the recorder a shared fixed 
> dependency owned by the factory, without passing it on every 
> {{getResourceContext}} call.
> h1. Notes
> This is a behavior-preserving consistency fix. No public API, CRD, or 
> reconciliation behavior changes. It is covered by the existing controller 
> tests.
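
To illustrate the divergence described under "Why this matters", here is a minimal, self-contained Java sketch. It does not use the operator's real classes: {{DedupRecorder}} is a hypothetical stand-in for an {{EventRecorder}} that has gained a per-instance de-duplication cache, and the resource name and event reason are made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

public class SharedRecorderSketch {

    /** Hypothetical stand-in for an EventRecorder with per-instance de-duplication state. */
    static final class DedupRecorder {
        private final Set<String> seen = new HashSet<>();
        private int emitted = 0;

        /** Emits the event unless this instance has already seen the same resource/reason pair. */
        boolean record(String resource, String reason) {
            if (!seen.add(resource + "/" + reason)) {
                return false; // duplicate, suppressed by this instance's cache
            }
            emitted++;
            return true;
        }

        int emittedCount() {
            return emitted;
        }
    }

    /** Controller path and service path each hold their own recorder (the inconsistent wiring). */
    static int emitWithSeparateRecorders() {
        DedupRecorder controllerRecorder = new DedupRecorder();
        DedupRecorder serviceRecorder = new DedupRecorder();
        controllerRecorder.record("session-job-a", "Suspended");
        serviceRecorder.record("session-job-a", "Suspended");
        // Neither cache knows about the other, so the same event is emitted twice.
        return controllerRecorder.emittedCount() + serviceRecorder.emittedCount();
    }

    /** Both paths share one operator-scoped recorder (the wiring this issue aims for). */
    static int emitWithSharedRecorder() {
        DedupRecorder shared = new DedupRecorder();
        shared.record("session-job-a", "Suspended"); // controller path
        shared.record("session-job-a", "Suspended"); // service path, suppressed as a duplicate
        return shared.emittedCount();
    }

    public static void main(String[] args) {
        System.out.println("separate recorders emitted: " + emitWithSeparateRecorders());
        System.out.println("shared recorder emitted:    " + emitWithSharedRecorder());
    }
}
```

With separate instances the duplicate event slips through because each cache only sees half of the traffic; with one shared instance it is suppressed. The same reasoning would apply to a rate limiter or any other per-instance state.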



