Draft PR)

Sergio Chong Loo Mon, 04 May 2026 10:57:32 -0700

Hi Daniel / Ryan,

Again thanks a lot for the feedback. Here are some thoughts:


The following are excellent points and I indeed think we can work on them 
immediately to incorporate them. I already have some ideas on implementation 
but let me flesh them out more before I share them:
  - WatermarkGenerator
  - Gate Parallelism
  - TransitionMode default (trivial)

Thanks for the Misc Testing proposal (especially with SQL), it will indeed 
facilitate development and make sure the outcome/output is precise. Let me know 
if/when you choose to start this effort (or any other item for that matter).

Exactly-once sinks. This needs explicit testing indeed. I haven't verified this 
one yet. I believe the gate operator should ensure green's sink transactions 
don't begin until blue has completed its final checkpoint and committed. This 
is a good candidate for a dedicated test case before declaring phase 1 stable.

Non-idempotent sinks. I'm still not sure this is a problem meant to be solved 
by Blue/Green deployments. If we think closely about it, even within a single 
Flink pipeline, strict global ordering is only guaranteed within a single 
subtask's partition. The moment you have parallelism > 1, different subtasks 
process different keys/partitions independently and emit records at different 
rates. There's no global wall-clock ordering across subtasks. Flink's watermark 
mechanism gives you event-time ordering, not arrival-order consistency across 
the whole topology. For pipelines where strict ordering to a non-idempotent 
sink is a hard requirement, the right pattern is a savepoint based stop/restart 
rather than concurrent blue/green execution.

On the other hand, assuming we pursue this, your proposal sounds 
straightforward, but using state would imply the new pipeline needs to buffer 
an unbounded amount of data while waiting for the first deployment to finish, 
which makes memory management and state sizing very difficult. The gate 
watermark barrier already provides a temporal ordering guarantee (green doesn't 
advance past blue's watermark), which covers the majority of 
idempotency-sensitive sinks. Perhaps I'm looking at this from an overly 
simplistic perspective, is there a concrete example you can share to illustrate 
the scenario you have in mind? We should definitely keep this topic open as we 
make progress on the other items.

I'll keep you posted on progress for the items at the top.

  ⁃ Sergio


> On Apr 20, 2026, at 8:40 AM, Sergio Chong Loo <[email protected]> wrote:
> 
> Hi @Daniel / @Ryan,
> 
> Thanks a lot for the input. Similarly we’re swamped in a time crunch here but 
> I’ll be taking a deep dive into your feedback hopefully before EoW. 
> 
> Stay tuned!
> 
> - Sergio
> 
>> On Apr 15, 2026, at 3:57 PM, Daniel Rossos <[email protected]> wrote:
>> 
>> Hey Sergio, 
>> 
>> I got around to running locally as well as doing a deeper dive into the 
>> current implementation details (Sorry about the delay). This all looks super 
>> awesome and huge thanks for taking the time to put this together. I have 
>> some comments / questions below. Some I think will be answered as we test 
>> further and some are potential next steps / features that might be nice for 
>> phase 2.
>> 
>> Non-Idempotent sinks
>> I believe we will still have non-idempotent sinks issues by using these gate 
>> functions. Brief recap of when this was raised before, records in the 
>> “green” deployment could be produced before the “blue” deployment causing an 
>> out-of-order delivery in record processing order which could have 
>> implications for downstream sinks. One potential solution I was thinking 
>> about that fits into this PR was to have the gate-operator be stateful and 
>> have your “green” pipeline accumulate messages after watermark barrier and 
>> wait for the “blue” to communicate (via the configmap) that it is done 
>> processing before passing its records on. This would introduce a minimal 
>> processing delay (poll rate of the “green” on configmap), but would ensure 
>> that the ordering of the streams to down stream sinks remains consistent. 
>> Other complications arise here with regards to handling state, but want to 
>> get your opinion here.
>> 
>> WatermarkGenerator
>> Inserting a watermark generator instead of requiring watermark in data. On 
>> the SQL side of things I noticed it is required that a field be a timestamp 
>> field that can be converted into a watermark. I was wondering if an 
>> alternative would be to inject a watermark generator step (generate based 
>> off some other value), that way existing table defs won’t need to be changed 
>> to accommodate this new feature. 
>> 
>> Exactly-once-sinks
>> How does this work with exactly_once downstream sinks compatibility wise? 
>> For example we should test using exactly_once kafka sinks to see if there 
>> will be any conflicts there. 
>> 
>> Gate Parallelism
>> From my understanding, the parallelism of the gate operator has to be same 
>> as the sink/source operator it is tied to. With autoscaling how do these 
>> stay in sync? Could this cause problems?  Can this gate become a performance 
>> bottleneck somehow?
>> 
>> TransitionMode Default
>> `transitionMode` not being set to default `BASIC` means all phase 1 
>> Blue-Green deployment would be broken specs on upgrade. I think we will want 
>> to include that default
>> 
>> Misc Testing
>> This is something for the future (and something I might play around with as 
>> I test), is adding a case to the e2e-tests that creates a Flink pipeline 
>> (sql and/or non-sql) and checks output from sink to ensure there are no 
>> duplicates. I don’t have a convenient pipeline on hand
>> 
>> Going forward, I am going to try to get a good test pipeline created and 
>> test this on our sandbox environment to see if I can get a good e2e run on 
>> prod-like conditions. 
>> 
>> Thanks again,
>> 
>> Daniel
>> 
>> On Wed, Apr 1, 2026 at 12:25 PM Sergio Chong Loo <[email protected] 
>> <mailto:[email protected]>> wrote:
>>> Absolutely no rush, take your time.
>>> 
>>> I’m also still working on some details around error handling. I’ll reach 
>>> out offline so you can always work on the latest.
>>> 
>>> Thank you both as well for the feedback!
>>> 
>>> - Sergio
>>> 
>>> 
>>>> On Apr 1, 2026, at 8:16 AM, Daniel Rossos <[email protected] 
>>>> <mailto:[email protected]>> wrote:
>>>> 
>>>> Hi Sergio,
>>>> 
>>>> Sorry about the delay on my end (just got back after a few weeks off). I 
>>>> have caught up with + agree with all the feedback Ryan provided in this 
>>>> thread.
>>>> 
>>>> The Gate auto-injection idea is really cool. I'm going to try and make 
>>>> some time this week to test out your PR on my side and provide feedback 
>>>> and thoughts.
>>>> 
>>>> Thanks again for driving this,
>>>> Daniel
>>>> 
>>>> 
>>>> On Thu, Mar 26, 2026 at 11:13 AM Ryan van Huuksloot via dev 
>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>> Awesome! Thanks for the update, Sergio. I'm excited to see the plan - it 
>>>>> is
>>>>> a cool idea so I'm glad it is working.
>>>>> 
>>>>> Ryan van Huuksloot
>>>>> Staff Engineer, Infrastructure | Streaming Platform
>>>>> [image: Shopify]
>>>>> <https://www.shopify.com/?utm_medium=salessignatures&utm_source=hs_email>
>>>>> 
>>>>> 
>>>>> On Thu, Mar 26, 2026 at 1:46 AM Sergio Chong Loo <[email protected] 
>>>>> <mailto:[email protected]>>
>>>>> wrote:
>>>>> 
>>>>> > @Ryan / @Daniel,
>>>>> >
>>>>> > Good news! The “GateInjectorPipelineExecutor” idea is successful!!
>>>>> >
>>>>> > While the original approach of simply activating it with
>>>>> > “execution.target” did not quite work, I was able to implement it via 
>>>>> > the *Instrumentation
>>>>> > API with a Java Agent* that injects it… the user doesn’t have to touch
>>>>> > their pipelines and I added 2 options, at least for now, to place/inject
>>>>> > the Gate after the source or before the sink (complex DAG cases with
>>>>> > multiple sources or sinks for now are not supported).
>>>>> >
>>>>> > I’m documenting everything and prepping the Draft PR for your review,
>>>>> > probably a couple more days.
>>>>> >
>>>>> > Thanks, stay tuned.
>>>>> >
>>>>> > - Sergio
>>>>> >
>>>>> >
>>>>> > On Mar 16, 2026, at 3:30 PM, Sergio Chong Loo <[email protected] 
>>>>> > <mailto:[email protected]>> wrote:
>>>>> >
>>>>> > Thanks for the ideas and the offer to help out Ryan! It’s invaluable to
>>>>> > learn about how other users/teams scenarios.
>>>>> >
>>>>> > Indeed I have to pursue and evaluate the GateInjectorExecutor 
>>>>> > nonetheless
>>>>> > for our internal development. Ideally it’d be great if the user can 
>>>>> > simply
>>>>> > “invoke” the functionality, even give the user an option to specify 
>>>>> > “where"
>>>>> > the gating mechanism to be placed (e.g. right after the source or before
>>>>> > sink), or for the most flexibility they can incorporate and place the 
>>>>> > Gate
>>>>> > manually just like it is now.
>>>>> >
>>>>> > I’ll share the progress asap and we can all take it from there (this
>>>>> > should not exceed a couple weeks). I’ll definitely need more of your
>>>>> > feedback to verify this with Flink SQL.
>>>>> >
>>>>> > Thanks again,
>>>>> > Sergio
>>>>> >
>>>>> >
>>>>> > On Mar 16, 2026, at 7:01 AM, Ryan van Huuksloot <
>>>>> > [email protected] <mailto:[email protected]>> 
>>>>> > wrote:
>>>>> >
>>>>> > Hi Sergio,
>>>>> >
>>>>> > re: 1.1
>>>>> > My thought is that a BlueGreen Mixin isn't Kubernetes specific and could
>>>>> > be reused by other deployment control planes. However, I do agree that
>>>>> > attaching it to the sink has other implications so I am happy to pivot 
>>>>> > if
>>>>> > we can find an alternative solution.
>>>>> >
>>>>> > re: 2
>>>>> > I'm happy to leave it out of the Phase 2 implementation, but I think it
>>>>> > should be possible. For example we use Phase 1 with cross cluster
>>>>> > migrations today. Phase 2 within a single cluster isn't particularly 
>>>>> > useful
>>>>> > for us.
>>>>> >
>>>>> > re: GateInjectorExecutor
>>>>> > This sounds like a neat idea. I need to read more about how it would 
>>>>> > work
>>>>> > but from a high level, injecting an operator before your sinks sounds 
>>>>> > like
>>>>> > a good idea. Better isolation, possible with SQL, no mixins, etc.
>>>>> >
>>>>> > I will mention that part of the reason I want it before the sinks is
>>>>> > because nine out of ten people building pipelines struggle to understand
>>>>> > where their state is and how Phase 2 would affect the correctness of 
>>>>> > their
>>>>> > state depending on where they put the gate. I understand that if you 
>>>>> > have a
>>>>> > remote lookup and want to save bandwidth, you could optimize your 
>>>>> > pipeline
>>>>> > by moving the gate before the remote call; however, that seems like an
>>>>> > optimization that can be made later.
>>>>> >
>>>>> > Thanks for driving this! Let me know how we can help.
>>>>> >
>>>>> > Ryan van Huuksloot
>>>>> > Staff Engineer, Infrastructure | Streaming Platform
>>>>> > [image: Shopify]
>>>>> > <https://www.shopify.com/?utm_medium=salessignatures&utm_source=hs_email>
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 16, 2026 at 2:21 AM Sergio Chong Loo <[email protected] 
>>>>> > <mailto:[email protected]>>
>>>>> > wrote:
>>>>> >
>>>>> >> Hi Ryan
>>>>> >>
>>>>> >> Thanks a lot for these details. For sure some of these observations
>>>>> >> popped up during our initial discussions, and that’s why our initial 
>>>>> >> goal
>>>>> >> was to introduce this as simple as possible and gradually enhance it to
>>>>> >> cover gaps.
>>>>> >>
>>>>> >> Allow me to address your concerns:
>>>>> >>
>>>>> >>    1. I’m happy you stressed the point of “disruption to existing
>>>>> >>    pipelines”. However, there’s a few points about attempting to build 
>>>>> >> this
>>>>> >>    functionality into the sinks (or sources) right off the bat (read 
>>>>> >> further
>>>>> >>    below for my alternative):
>>>>> >>       1. Kubernetes centric: as of now the Blue/Green Deployments
>>>>> >>       support is a Kubernetes specific solution, adding a mixin 
>>>>> >> directly
>>>>> >>       available to sinks would “leak” this support outside of K8s
>>>>> >>       2. A sink being aware of these deployment phases violates single
>>>>> >>       responsibility, but more importantly…
>>>>> >>       3. Flink currently has many connectors, with the majority being
>>>>> >>       maintained outside of the Flink code base, by separate teams, 
>>>>> >> separate
>>>>> >>       repos, separate release cycles. This would complicate things 
>>>>> >> significantly
>>>>> >>       as to try and add support for this for every potential flink 
>>>>> >> connector
>>>>> >>       project out there would be a cumbersome. Blue/Green Phase 2 then 
>>>>> >> only would
>>>>> >>       works with "gate-aware" sinks.
>>>>> >>    2. I’d leave the conversation about migrating jobs between K8s
>>>>> >>    clusters outside of this scope, even Phase 1 is meant to only work 
>>>>> >> in a
>>>>> >>    single cluster…
>>>>> >>    3. Watermarking, excellent point, it’s indeed a requirement so I’ll
>>>>> >>    make sure this is validated where applicable (by the concrete
>>>>> >>    implementation)
>>>>> >>
>>>>> >>
>>>>> >> Having said what I said about point 1.1 above, I’m currently working on
>>>>> >> an approach which uses a “GateInjectorPipelineExecutor” so to speak; in
>>>>> >> other words a custom PipelineExecutor that would be shipped with the 
>>>>> >> K8s
>>>>> >> Operator, invoked by Flink Configuration (via “execution.target:”). 
>>>>> >> This
>>>>> >> custom piece would instantiate and inject the Gate at a fixed point in 
>>>>> >> the
>>>>> >> StreamGraph right before job submission. I still have to validate and
>>>>> >> ensure a few things are correctly taken care of (like Type Information,
>>>>> >> etc.) but the theory looks promising.
>>>>> >>
>>>>> >> For the most part this works well with Flink SQL (same configuration),
>>>>> >> here’s my estimation:
>>>>> >>
>>>>> >> tEnv.executeSql("INSERT INTO my_sink ...")
>>>>> >>     └─> SQL planner → ExecNodeGraph → Transformation[]
>>>>> >>           └─> StreamGraph
>>>>> >>                 └─> GateInjectorExecutor injects GateProcessFunction
>>>>> >>                       └─> StreamGraph' (mutated) → JobGraph
>>>>> >>                             └─> Submit Job
>>>>> >>
>>>>> >> I’m aiming to share some updates along these lines in the next few 
>>>>> >> weeks
>>>>> >> but hopefully this falls inline with your objectives/thoughts overall.
>>>>> >>
>>>>> >> Sergio
>>>>> >>
>>>>> >>
>>>>> >> On Mar 6, 2026, at 3:36 PM, Ryan van Huuksloot via dev <
>>>>> >> [email protected] <mailto:[email protected]>> wrote:
>>>>> >>
>>>>> >> Hi Sergio,
>>>>> >> Thanks for starting this conversation.
>>>>> >>
>>>>> >> A few thoughts regarding BlueGreen Phase 2:
>>>>> >> 1. The Gate Operator is interesting but I don't like that we would 
>>>>> >> have to
>>>>> >> modify users' pipelines for them to use Phase 2. This gate function 
>>>>> >> seems
>>>>> >> like it could be a Mixin that connectors would implement. If you want 
>>>>> >> to
>>>>> >> use Phase 2, your sinks must implement this Mixin. I understand that a
>>>>> >> unique GateFunction has pros, but it works less well with FlinkSQL - 
>>>>> >> and
>>>>> >> the trade-off doesn't seem worthwhile.
>>>>> >> 2. Regarding the ConfigMap. We should consider a solution that supports
>>>>> >> migrating Flink jobs between Kubernetes clusters. Otherwise Phase 2 is
>>>>> >> only
>>>>> >> useful for in cluster operations.
>>>>> >> 3. Watermarking is a requirement. Will the Flink Kubernetes Operator
>>>>> >> validate that the pipeline is using watermarks?
>>>>> >>
>>>>> >> What happens when idleness is configured? Watermarks will get ignored 
>>>>> >> from
>>>>> >>
>>>>> >> these “slow” subtasks and advance, could records from the ignored 
>>>>> >> subtasks
>>>>> >> eventually be lost?
>>>>> >> Yes they would be lost, but that would happen irrespective of Phase 2.
>>>>> >>
>>>>> >> I'll have more thoughts after we discuss the Gate Operator, as that is
>>>>> >> crucial to the FLIP right now.
>>>>> >>
>>>>> >> Ryan van Huuksloot
>>>>> >> Staff Engineer, Infrastructure | Streaming Platform
>>>>> >> [image: Shopify]
>>>>> >> <https://www.shopify.com/?utm_medium=salessignatures&utm_source=hs_email>
>>>>> >>
>>>>> >>
>>>>> >> On Mon, Mar 2, 2026 at 6:52 PM Sergio Chong Loo <[email protected] 
>>>>> >> <mailto:[email protected]>>
>>>>> >> wrote:
>>>>> >>
>>>>> >> Bumping this (Advanced Blue/Green deployments - FLIP-504) thread after
>>>>> >> making some code adjustments.
>>>>> >>
>>>>> >> FYI @drossos <https://github.com/drossos> @ryanvanhuuksloot <
>>>>> >> https://github.com/ryanvanhuuksloot> I’d like to get your feedback 
>>>>> >> since
>>>>> >> I know you’re interested in this feature.
>>>>> >>
>>>>> >> Thanks,
>>>>> >> - Sergio
>>>>> >>
>>>>> >>
>>>>> >> On Dec 5, 2025, at 2:31 PM, Sergio Chong Loo <[email protected] 
>>>>> >> <mailto:[email protected]>>
>>>>> >>
>>>>> >> wrote:
>>>>> >>
>>>>> >>
>>>>> >> Hi folks,
>>>>> >>
>>>>> >> FLIP-503 (already merged) introduced the Basic Blue/Green Deployment
>>>>> >>
>>>>> >> functionality to the Flink K8s Operator. It was very straightforward,
>>>>> >> simply transitioning to the second deployment once it's considered 
>>>>> >> stable.
>>>>> >>
>>>>> >>
>>>>> >> FLIP-504 is an Advanced version added on top of 503 and brings about 
>>>>> >> the
>>>>> >>
>>>>> >> notion of "record-level" coordination between the 2 deployments to 
>>>>> >> have no
>>>>> >> data duplication and exactly once semantics while preserving a smooth
>>>>> >> transition.
>>>>> >>
>>>>> >>
>>>>> >> The main goals are:
>>>>> >>    • For the community to take a quick look at the current
>>>>> >>
>>>>> >> functionality (previously mentioned at the Flink Forward 2025 
>>>>> >> Conference)
>>>>> >>
>>>>> >>    • To get feedback and improvement suggestions
>>>>> >>
>>>>> >> Flip 504 details:
>>>>> >>
>>>>> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=337677650
>>>>> >>
>>>>> >>
>>>>> >> Draft PR: https://github.com/apache/flink-kubernetes-operator/pull/1043
>>>>> >>
>>>>> >> Thank you!
>>>>> >> - Sergio
>>>>> >>
>>>>> >>
>>>>> >
>

Re: FLIP-504: Blue/Green Deployments for Flink on Kubernetes - Phase 2 (WIP/Draft PR)

Reply via email to