Hi everyone,

I'd like to start a discussion on an umbrella FLIP[1] that lays out a
direction for evolving Flink into a data engine that natively supports AI
workloads.

The short version: user workloads are shifting from BI analytics to
multimodal data processing centered on model inference, and this triggers
cascading changes across the stack — multimodal data flowing through
pipelines, heterogeneous CPU/GPU resources, vectorized execution, and
inference tasks that run for seconds to minutes on Spot instances. The
proposal sketches an evolution along five directions (development paradigm,
data model, heterogeneous resources, execution engine, fault tolerance),
decomposed into 11 sub-FLIPs organized into three layers: core runtime
primitives, AI workload expression and execution, and production-grade
operational guarantees. Most sub-FLIPs have no hard dependencies on each
other and can be advanced in parallel.

A note on scope, since it's an umbrella:

- In scope here: whether the evolution directions are reasonable, whether
each sub-FLIP's motivation and proposed approach are well-founded, and
whether the boundaries and dependencies between sub-FLIPs are clear.
- Out of scope here: detailed designs, API specifics, and implementation
plans of individual sub-FLIPs — those will go through their own FLIPs.
- Consensus criterion: agreement on the overall direction is sufficient for
the umbrella to pass; passing it does not lock in any sub-FLIP's design —
sub-FLIPs may still be adjusted, deferred, or withdrawn as they progress.

All proposed changes are incremental — no existing API or behavior is
removed or altered. Compatibility details are covered at the end of the
document.

Looking forward to your feedback on the overall direction and the layering.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421957275

Thanks,
Guowei
