[
https://issues.apache.org/jira/browse/FLINK-39961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Purshotam Shah updated FLINK-39961:
-----------------------------------
Component/s: Table SQL / API
> Support routing across models in Flink SQL (ML_PREDICT)
> --------------------------------------------------------
>
> Key: FLINK-39961
> URL: https://issues.apache.org/jira/browse/FLINK-39961
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / API
> Reporter: Purshotam Shah
> Priority: Major
>
> *The problem*
> FLIP-526 added model inference to Flink SQL (CREATE MODEL, ML_PREDICT,
> ML_EVALUATE), but a query is bound to a single, statically chosen model.
> There is no way to route a request among several candidate models from SQL,
> and no learned (ML-based) way to make that choice. Today, selection has to be
> hard-coded to one model or handled outside the query.
> *Why it's worth doing*
> - Cost and quality: automatically send simple requests to a small/cheap model
> and hard ones to a stronger model, instead of paying for the strongest model
> on every row or fixing one model for all traffic.
> - SQL-native: keep the routing decision inside the query, so SQL users get it
> without external orchestration or dropping to a programmatic API.
> - Scales better than static rules: a learned router adapts to the workload
> rather than relying on hand-written conditions that go stale.
> - Builds directly on the existing model functions (FLIP-526 / ML_PREDICT)
> rather than introducing a parallel mechanism.
> *What we plan to do*
> - Let a set of candidate models plus a routing strategy be declared in SQL,
> and have ML_PREDICT pick the model per request rather than being pinned to
> one.
> - Support multiple strategies: condition/rule-based selection and a learned
> (ML) router that scores the request and chooses a model.
> - Reuse the existing ML_PREDICT execution path to invoke the chosen model, so
> routing is a selection layer on top of the current model functions, not a
> new inference mechanism.
> - Degrade gracefully: fall back to a configured default model when the router
> can't decide or a chosen model fails.
> - Make the decision observable (which model served each request) via
> metrics/logging.
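
The selection layer described in the plan above can be illustrated with a
small, language-neutral sketch. This is not Flink code and not the proposed
SQL syntax, which this issue does not define yet. Every name below (Router,
choose, predict, the model names, the "served" counter) is hypothetical and
only shows the intended ordering: rules first, then an optional learned
scorer, then the configured default, with a fallback when the chosen model
fails and a per-model count for observability.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for "invoke a model": request in, prediction out.
Model = Callable[[str], str]


@dataclass
class Router:
    """Illustrative selection layer; all names here are assumptions."""
    models: Dict[str, Model]
    default: str
    # Rule-based strategy: the first predicate that matches picks the model.
    rules: List[Tuple[Callable[[str], bool], str]] = field(default_factory=list)
    # Learned strategy stand-in: returns a model name, or None if undecided.
    scorer: Optional[Callable[[str], Optional[str]]] = None
    # Observability stand-in: how many requests each model actually served.
    served: Dict[str, int] = field(default_factory=dict)

    def choose(self, request: str) -> str:
        for predicate, name in self.rules:
            if predicate(request):
                return name
        if self.scorer is not None:
            name = self.scorer(request)
            if name in self.models:
                return name
        # Router could not decide: use the configured default model.
        return self.default

    def predict(self, request: str) -> Tuple[str, str]:
        name = self.choose(request)
        try:
            result = self.models[name](request)
        except Exception:
            if name == self.default:
                raise  # the default itself failed; nothing left to fall back to
            name = self.default
            result = self.models[name](request)
        self.served[name] = self.served.get(name, 0) + 1
        return name, result


def small(request: str) -> str:
    return "small:" + request


def large(request: str) -> str:
    return "large:" + request


def flaky(request: str) -> str:
    raise RuntimeError("model unavailable")


router = Router(
    models={"small": small, "large": large, "flaky": flaky},
    default="small",
    rules=[
        (lambda r: len(r) > 20, "large"),        # long requests go to the larger model
        (lambda r: r.startswith("!"), "flaky"),  # contrived rule to exercise the fallback
    ],
)
```

In this sketch, router.predict("hi") is served by the default model, a request
longer than 20 characters is routed to "large", and a request starting with
"!" selects the failing model and is then served by the default. How the real
feature would declare candidates and strategies in SQL is left open here.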
--
This message was sent by Atlassian Jira
(v8.20.10#820010)