Thanks Shengkai,

you're right about the generic entry point, and Swapna will soon open a PR
that includes a reference implementation for that provider. The FLIP refers
to it as the Generic Python Model Provider, but perhaps the wording is not
100% clear.

The idea here is exactly what you're suggesting: to let users run any model
through that generic provider by extending PyPredictFunction:

from typing import List

from pyflink.common import Row

# PyPredictFunction is provided by the framework proposed in FLIP-552; its
# import path will be part of the upcoming reference implementation PR.

class MyCustomModel(PyPredictFunction):

    def open(self, context):
        """
        Initialize the function and load your custom model.

        :param context: A context object that provides access to model
            properties and other runtime information.
        """
        ...

    def predict(self, data: Row) -> List[Row]:
        """
        :param data: The input data row for prediction (expects a text field)
        :return: A list of rows containing the prediction results
        """
        ...
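
For illustration, here is a minimal sketch of how the two methods could be
filled in. The pickle-based loading, the model file name, and the
context.get_property accessor are assumptions for the example, not the
final FLIP API:

import os
import pickle
from typing import List

from pyflink.common import Row

# (PyPredictFunction import omitted, as above.)

class MyCustomModel(PyPredictFunction):

    def open(self, context):
        # Hypothetical accessor; the exact context API is still under
        # discussion in this thread.
        model_dir = context.get_property('model-directory-path')
        # Assumed artifact name; any serialized model works here.
        with open(os.path.join(model_dir, 'model.pkl'), 'rb') as f:
            self._model = pickle.load(f)

    def predict(self, data: Row) -> List[Row]:
        # Hypothetical model returning a (label, score) pair; the field
        # names match the CREATE MODEL schema below.
        label, score = self._model.predict(data['text'])
        return [Row(prediction=label, confidence=float(score))]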

and referencing it in SQL:

CREATE MODEL my_python_model
INPUT (text STRING)
OUTPUT (prediction STRING, confidence FLOAT)
WITH (
   'provider' = 'generic-python',
   'model-directory-path' = '/path/to/your/model',
   'python-predict-class' = 'mymodule.MyCustomModel',
   'properties.model_version' = '1'
);
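
Under the hood, the generic entry point can resolve 'python-predict-class'
with standard dynamic loading, which is what keeps the provider generic. A
rough sketch (illustrative only; the actual wiring will be in the PR):

import importlib

def load_predict_function(class_path: str):
    # Split 'mymodule.MyCustomModel' into module and class name.
    module_name, class_name = class_path.rsplit('.', 1)
    # Import the user's module (assumed to be on the Python path, e.g. via
    # 'model-directory-path') and look up the class.
    module = importlib.import_module(module_name)
    predict_class = getattr(module, class_name)
    # Instantiate the user's PyPredictFunction subclass.
    return predict_class()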

Linking the FLIP for reference:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers

On Tue, Oct 14, 2025 at 8:53 PM Shengkai Fang <[email protected]> wrote:
>
> Hi Matyas and Swapna,
>
> Please see my response inline.
>
> >This is an interesting idea; however, we are proposing using the
> > provider model because it aligns with Flink's existing Java-based
> > architecture for discovering plugins. A Java entry point is required
> > to launch the Python code, and this is the standard way to do it.
>
> A Java entry point is required. However, I don't mean that every Python
> model implementation needs its own Java entry point. Instead, we could use
> a generic Java entry point to start the Python process, and let this entry
> point discover available Python model implementations dynamically.
> In this way, model developers would only need to focus on their Python
> code, without having to write a Java factory wrapper.
>
> > We can certainly learn from Beam's design, but directly reusing it
> > would add a very heavy dependency and be difficult to integrate
> > cleanly into Flink's native processing model.
>
> Flink already uses Apache Beam in PyFlink. For example, you can see
> pyflink-udf-runner.sh, which is the startup script used to launch the
> Python process at runtime.
>
> >This is actually a great idea and essential for usability. Just to
> > double check on your suggestion, your proposal is to have an explicit
> > PredictRuntimeContext for dynamic model file downloading?
>
> Yes! I think passing PredictRuntimeContext in open is a good idea.
>
> Best,
> Shengkai
>
>
> Hao Li <[email protected]> 于2025年10月15日周三 07:08写道:
>
> > Hi Swapna,
> >
> > Some other suggestions and questions:
> >
> > 1. Maybe change `getPythonPredictFunction` to `createPythonPredictFunction`
> > to align with `createPredictFunction` in `PredictRuntimeProvider`.
> > 2. Is it possible to enforce that the returned Java `PythonFunction` from
> > `createPythonFunction` implements the Python `PredictFunction`?
> >
> > Hao
> >
> > On Tue, Oct 14, 2025 at 2:52 PM Swapna Marru <[email protected]>
> > wrote:
> >
> > > Thanks Matyas and Shengkai.
> > >
> > >
> > > > 1. I'm wondering whether we could extend the SQL API to change how
> > > > Python models are loaded. For example, we could allow users to write:
> > > >
> > > > ```
> > > > CREATE MODEL my_pytorch_model
> > > > WITH (
> > > >    'type' = 'pytorch'
> > > > ) LANGUAGE PYTHON;
> > > > ```
> > >
> > > I kind of thought about this, but one initial concern I had with this
> > > model is: will a model provider be completely implemented in Python
> > > itself? When we refer to PyTorch, HuggingFace, or ONNX model providers,
> > > for example, do we need different behavior or optimizations related to
> > > PredictRuntimeContext or model config building, batching, or resource
> > > scheduling decisions, which need to be done in Java Flink entry points?
> > >
> > >
> > > > 2. Beam already supports TensorFlow, ONNX, and many built-in models.
> > > > Can we reuse Beam's utilities to build Flink prediction functions [1]?
> > >
> > > Thanks, I will take a look at this to understand how it works and if we
> > > can learn from that design.
> > >
> > > > 3. It would be better if we introduced a PredictRuntimeContext to help
> > > > users download required weight files.
> > >
> > > Currently I have this as model config (set_model_config), which is
> > > passed to PredictFunction in Python, but PredictRuntimeContext seems
> > > more suitable. I will look into passing PredictRuntimeContext in open,
> > > similar to the RuntimeContext for UDFs.
> > >
> > >
> > > > 4. In ML, users typically perform inference on batches of data.
> > > > Therefore, per-record evaluation may not be necessary. How about we
> > > > just introduce an API like [2]?
> > >
> > > Yes, I completely agree on this. I was first aiming for agreement on the
> > > model creation and provider API, and will then look at this in more detail.
> > >
> > > On Tue, Oct 14, 2025 at 10:39 AM Őrhidi Mátyás <[email protected]>
> > > wrote:
> > >
> > > > Hey Shengkai,
> > > >
> > > > Thank you for your observations. This proposal is mostly driven by
> > > > Swapna, but I can also share my thoughts here; please find them
> > > > inline.
> > > >
> > > > Cheers,
> > > > Matyas
> > > >
> > > > On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]>
> > > > wrote:
> > > > >
> > > > > Hi, Matyas.
> > > > >
> > > > > Thanks for the proposal. I have some suggestions about it.
> > > > >
> > > > > 1. I'm wondering whether we could extend the SQL API to change how
> > > > > Python models are loaded. For example, we could allow users to write:
> > > > >
> > > > > ```
> > > > > CREATE MODEL my_pytorch_model
> > > > > WITH (
> > > > >    'type' = 'pytorch'
> > > > > ) LANGUAGE PYTHON;
> > > > > ```
> > > > > In this case, we wouldn't rely on Java SPI to load the Python model
> > > > > provider. However, I'm not sure whether Python has a similar
> > > > > mechanism to SPI that avoids hardcoding class paths.
> > > >
> > > > This is an interesting idea; however, we are proposing using the
> > > > provider model because it aligns with Flink's existing Java-based
> > > > architecture for discovering plugins. A Java entry point is required
> > > > to launch the Python code, and this is the standard way to do it.
> > > >
> > > > > 2. Beam already supports TensorFlow, ONNX, and many built-in models.
> > > > > Can we reuse Beam's utilities to build Flink prediction functions [1]?
> > > >
> > > > We can certainly learn from Beam's design, but directly reusing it
> > > > would add a very heavy dependency and be difficult to integrate
> > > > cleanly into Flink's native processing model.
> > > >
> > > > > 3. It would be better if we introduced a PredictRuntimeContext to
> > > > > help users download required weight files.
> > > >
> > > > This is actually a great idea and essential for usability. Just to
> > > > double check on your suggestion, your proposal is to have an explicit
> > > > PredictRuntimeContext for dynamic model file downloading?
> > > >
> > > > >
> > > > > 4. In ML, users typically perform inference on batches of data.
> > > > > Therefore, per-record evaluation may not be necessary. How about we
> > > > > just introduce an API like [2]?
> > > >
> > > > I agree completely. The row-by-row API is just a starting point, and
> > > > we should aim to prioritize support for efficient batch inference to
> > > > ensure good performance for real-world models.
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > > [1] https://beam.apache.org/documentation/ml/about-ml/
> > > > > [2]
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Swapna Marru <[email protected]> wrote on Tue, Oct 14, 2025 at 11:53:
> > > > >
> > > > > > Thanks Matyas.
> > > > > >
> > > > > > Hao,
> > > > > >
> > > > > > The proposal is to provide a generic framework. The interfaces
> > > > > > PythonPredictRuntimeProvider, PythonPredictFunction, and
> > > > > > PredictFunction (in Python) are defined to provide the base for
> > > > > > that framework.
> > > > > >
> > > > > > generic-python is one of the implementations, registered similarly
> > > > > > to openai in the original FLIP. It is not a concrete end-to-end
> > > > > > implementation, though. It can be used:
> > > > > > 1. As a reference implementation for other complete, end-to-end,
> > > > > > concrete model provider implementations.
> > > > > > 2. Out of the box for simple Python model implementations, to avoid
> > > > > > a boilerplate Java provider implementation.
> > > > > >
> > > > > > I will also open a PR with the current implementation changes, so
> > > > > > it's clearer for further discussion.
> > > > > >
> > > > > > -Thanks,
> > > > > > M.Swapna
> > > > > >
> > > > > > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > > > > > >
> > > > > > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Swapna, I can help you to create a FLIP page.
> > > > > > > >
> > > > > > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Hi Swapna,
> > > > > > > > >
> > > > > > > > > Thanks for the proposal. Can you put it in a FLIP and start a
> > > > > > > > > discussion thread for it?
> > > > > > > > >
> > > > > > > > > From an initial look, I'm a bit confused whether this is a
> > > > > > > > > concrete implementation for "generic-python" or a generic
> > > > > > > > > framework to handle Python predict functions, because
> > > > > > > > > everything seems concrete, like
> > > > > > > > > `GenericPythonModelProviderFactory` and
> > > > > > > > > `GenericPythonModelProvider`, except the final Python predict
> > > > > > > > > function.
> > > > > > > > >
> > > > > > > > > Also, if `GenericPythonModelProviderFactory` is predefined,
> > > > > > > > > do you predefine the required and optional options for it?
> > > > > > > > > Will it be inflexible if predefined?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Hao
> > > > > > > > >
> > > > > > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <[email protected]>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Shengkai,
> > > > > > > > > >
> > > > > > > > > > Documented the initial proposal here:
> > > > > > > > > >
> > > > > > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > > > > > >
> > > > > > > > > > Please review and let me know your thoughts.
> > > > > > > > > >
> > > > > > > > > > -Thanks,
> > > > > > > > > > Swapna
> > > > > > > > > >
> > > > > > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I see your point, and I agree that your proposal is
> > > > > > > > > > > feasible. However, there is one limitation to consider:
> > > > > > > > > > > the current loading mechanism first discovers all
> > > > > > > > > > > available factories on the classpath and then filters
> > > > > > > > > > > them based on the user-specified identifiers.
> > > > > > > > > > >
> > > > > > > > > > > In most practical scenarios, we would likely have only
> > > > > > > > > > > one generic factory (e.g., a GenericPythonModelFactory)
> > > > > > > > > > > present in the classpath. This means the framework would
> > > > > > > > > > > be able to load either PyTorch or TensorFlow
> > > > > > > > > > > models—whichever is defined within that single generic
> > > > > > > > > > > implementation—but not both simultaneously unless
> > > > > > > > > > > additional mechanisms are introduced.
> > > > > > > > > > >
> > > > > > > > > > > This doesn't block the proposal, but it's something worth
> > > > > > > > > > > noting as we design the extensibility model. We may want
> > > > > > > > > > > to explore ways to support multiple user-defined
> > > > > > > > > > > providers more seamlessly in the future.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Shengkai
> > > > > > > > > > >