[ https://issues.apache.org/jira/browse/FLINK-39961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Purshotam Shah updated FLINK-39961:
-----------------------------------
    Component/s: Table SQL / API

>  Support routing across models in Flink SQL (ML_PREDICT)
> --------------------------------------------------------
>
>                 Key: FLINK-39961
>                 URL: https://issues.apache.org/jira/browse/FLINK-39961
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>            Reporter: Purshotam Shah
>            Priority: Major
>
> *The problem*
> FLIP-526 added model inference to Flink SQL (CREATE MODEL, ML_PREDICT,
> ML_EVALUATE), but a query is bound to a single, statically chosen model.
> There is no way to route a request among several candidate models from SQL,
> and no learned (ML-based) way to make that choice. Today, selection has to
> be hard-coded to one model or handled outside the query.
> *Why it's worth doing*
> - Cost and quality: automatically send simple requests to a small/cheap
>   model and hard ones to a stronger model, instead of paying for the
>   strongest model on every row or fixing one model for all traffic.
> - SQL-native: keep the routing decision inside the query, so SQL users get
>   it without external orchestration or dropping to a programmatic API.
> - Scales better than static rules: a learned router adapts to the workload
>   rather than relying on hand-written conditions that go stale.
> - Builds directly on the existing model functions (FLIP-526 / ML_PREDICT)
>   rather than introducing a parallel mechanism.
> *What we plan to do*
> - Let a set of candidate models plus a routing strategy be declared in SQL,
>   and have ML_PREDICT pick the model per request rather than being pinned
>   to one.
> - Support multiple strategies: condition/rule-based selection and a learned
>   (ML) router that scores the request and chooses a model.
> - Reuse the existing ML_PREDICT execution path to invoke the chosen model,
>   so routing is a selection layer on top of the current model functions,
>   not a new inference mechanism.
> - Degrade gracefully: fall back to a configured default model when the
>   router can't decide or a chosen model fails.
> - Make the decision observable (which model served each request) via
>   metrics/logging.
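
The selection layer described in the plan above can be illustrated with a small sketch. It is plain Python, not Flink code and not the proposed SQL syntax, which the issue does not specify. Every name in it (Router, choose, predict, served, the model names and the rules) is hypothetical and exists only to show rule-based selection, fallback to a default model, and a per-model counter standing in for metrics.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical types: a model maps a request to a response, and a rule
# returns the name of a model or None when it has no opinion.
Model = Callable[[str], str]
Rule = Callable[[str], Optional[str]]


@dataclass
class Router:
    models: Dict[str, Model]
    default: str
    rules: List[Rule] = field(default_factory=list)
    # Counts which model actually served each request (a stand-in for metrics).
    served: Counter = field(default_factory=Counter)

    def choose(self, request: str) -> str:
        """Return the first model named by a rule, else the default."""
        for rule in self.rules:
            name = rule(request)
            if name in self.models:
                return name
        return self.default

    def predict(self, request: str) -> str:
        """Invoke the chosen model, falling back to the default on failure."""
        name = self.choose(request)
        try:
            result = self.models[name](request)
        except Exception:
            if name == self.default:
                raise  # nothing left to fall back to
            name = self.default
            result = self.models[name](request)
        self.served[name] += 1
        return result


def small(text: str) -> str:
    return f"small:{text}"


def large(text: str) -> str:
    return f"large:{text}"


def flaky(text: str) -> str:
    raise RuntimeError("model unavailable")


router = Router(
    models={"small": small, "large": large, "flaky": flaky},
    default="small",
    rules=[
        lambda t: "flaky" if t.startswith("!") else None,
        lambda t: "large" if len(t) > 20 else None,
    ],
)
```

Calling router.predict on a short request uses the default model, a request longer than 20 characters goes to the larger model, and a request routed to the failing model is served by the default instead. A learned router would replace the rule list with a scoring function, and the served counter would be replaced by real metrics or logging.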



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
