Hi, Swapna.

Thanks for the quick response.

>> we could also expose a very generic Python model provider, where users
can plug in their Python implementation.

As I understand it, the current design relies on SPI (Service Provider
Interface) to load model providers, which means users must provide a JAR
containing a custom Factory implementation. This raises the question of how
Python users could contribute their own model providers within this
framework. I’d be interested in exploring how we might support such
extensibility for Python-based implementations.
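
To make the question concrete, below is a rough sketch of what the Java SPI
side requires today. The interface names, the nested Context type, and the
"python.function" option are only my reading of FLIP-437 and are illustrative
rather than the actual API, but it shows why packaging a JAR is currently
unavoidable:

// Illustrative sketch only: ModelProviderFactory / ModelProvider /
// PythonModelProvider and the option name below are assumptions, not the
// final API. The factory is discovered via Java SPI, so the JAR must also
// ship a service entry such as
// META-INF/services/org.apache.flink.table.factories.Factory.
import java.util.Collections;
import java.util.Set;

import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.table.factories.ModelProviderFactory; // package approximate
import org.apache.flink.table.ml.ModelProvider;               // package approximate

public class MyPythonModelProviderFactory implements ModelProviderFactory {

    @Override
    public String factoryIdentifier() {
        // Referenced from the model definition, e.g. 'provider' = 'my-python'.
        return "my-python";
    }

    @Override
    public Set<ConfigOption<?>> requiredOptions() {
        return Collections.emptySet();
    }

    @Override
    public Set<ConfigOption<?>> optionalOptions() {
        return Collections.emptySet();
    }

    @Override
    public ModelProvider createModelProvider(ModelProviderFactory.Context context) {
        // A generic Python provider could read the user's entry point from the
        // model options and hand it to a Python operator at runtime.
        String entryPoint = context.getCatalogModel()
                .getOptions()
                .get("python.function"); // hypothetical option name
        return new PythonModelProvider(entryPoint); // hypothetical provider
    }
}

For a Python user, the open question is how to provide the equivalent of such
a factory without having to write and package Java code, so that only the
Python side (e.g. a PyPredictFunction implementation) needs to be supplied.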

>> I will refine it and open a FLIP, so we can discuss more details.

Looking forward to hearing from you.

Best,
Shengkai


Swapna Marru <[email protected]> 于2025年9月24日周三 04:44写道:

> Hi Shengkai,
>
> Thanks so much for the response and the details.
>
> The enhancements you proposed make perfect sense.
>
> I was thinking along the same lines, and I am interested in contributing to
> this. I have a rough design and a POC implementation around this. From my
> initial digging, we would need to extend PythonTableFunctionOperator to
> encapsulate the Model Context with it.
> I will refine it and open a FLIP, so we can discuss more details.
>
> >> *Introduce a pluggable Model Provider interface*
>
> Sounds good. To start with, we could also expose a very generic Python
> model provider, where users can plug in their Python implementation. Users
> can bring in their own generic Python implementation of PyPredictFunction.
>
> >> *Add built-in model providers for PyTorch and TensorFlow in the Flink
> >> repository*
>
> Yes, I agree, this is very much needed. Also, I see
>
> https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/dev/python/table/udfs/vectorized_python_udfs/
> but there is currently no implementation of a vectorized table function
> available. Providing one seems promising for using resources efficiently.
>
> >> *Optimize resource usage via cached files and batched inference*
>
> On Mon, Sep 22, 2025 at 7:56 PM Shengkai Fang <[email protected]> wrote:
>
> > Hi, Swapna.
> >
> > Supporting local models has always been a key milestone on our journey
> > toward Flink AI. After an offline discussion with @DianFu, we found that
> > the current API is not well-suited for PyFlink, as PyFlink prefers to
> > provide its own Python operators.
> >
> > To better accommodate this requirement and improve extensibility, we
> > propose the following enhancements:
> >
> > *Introduce a pluggable Model Provider interface*
> > Extend the model management framework by adding a standardized interface
> > for model providers, enabling custom implementations — such as a
> > PythonModelProvider — to supply native Python operators (e.g.,
> > PythonTableFunctionOperator).
> >
> > *Add built-in model providers for PyTorch and TensorFlow in the Flink
> > repository*
> > Implement and integrate first-party model providers for popular frameworks
> > like PyTorch and TensorFlow. These providers will implement the new
> > interface and be maintained within the main Flink codebase.
> >
> > *Optimize resource usage via cached files and batched inference*
> > Leverage `pipeline.cached-files` to manage model weight files efficiently,
> > avoiding redundant downloads across tasks. Additionally, support batched
> > inference to maximize hardware utilization (e.g., GPU/TPU throughput) and
> > improve overall performance.
> >
> > Best,
> > Shengkai
> >
> >
> > On Mon, Sep 22, 2025 at 11:36 PM Swapna Marru <[email protected]> wrote:
> >
> > > Hi Devs,
> > >
> > >
> > > I am interested in learning more about the MODEL, ML_PREDICT, and
> > > ML_EVALUATE functionality added in the following FLIP.
> > >
> > >
> > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-437%3A+Support+ML+Models+in+Flink+SQL
> > >
> > >
> > > I see the original FLIP has extensibility for local model providers in
> > > Flink.
> > >
> > >
> > > Is there a way to do pluggable local model providers in Python? For
> > > example, generating embeddings using Sentence transformer models running
> > > locally in Flink.
> > >
> > >
> > > An option could be to introduce a Model Provider factory implementation
> > > in Java that internally uses a predict function in Python. But I see this
> > > requires a lot of work for Java-to-Python communication/translation
> > > inside the provider.
> > >
> > >
> > > Something like a PythonRuntimeProvider, along with PredictRuntimeProvider /
> > > AsyncRuntimeProvider, which could handle Java -> Python translation out of
> > > the box, would be helpful to de-duplicate that effort.
> > >
> > >
> > > Could you please point me to any existing discussions related to this,
> > > or to any other ways to achieve the same? Please share your thoughts.
> > >
> > >
> > > -Thanks,
> > >
> > > Swapna Marru
> > >
> >
>
