Hi Geng Biao,

Thanks for the interest and the PoC implementation. Sorry, I was away for a
while and am currently in between some other work; I will be able to get back
to this starting next week. Meanwhile, I would like to share my previous PoC
implementation:
https://github.com/apache/flink/commit/bb2d5c4aa096a4eb3a89adb6a5bfecd8e87d3262

I will take a look at your implementation. In previous discussions with
ShengKai, one more thing we wanted to explore is the flexibility of having the
model provider factory also in Python, instead of having it in Java as proposed
in the FLIP. This is something I haven't yet found time to dig deeper into.

Yes, I agree on this.
>> I notice that the parameter `model-directory-path` is introduced. It should
>> work, but I see that some popular python model libraries like `transformers`
>> and `vllm` tend to directly use the parameter name `model`, which can be a
>> local path or a simple name like `Qwen/Qwen3-0.6B`. They then check whether
>> `model` is a path or a simple name and automatically download it to a local
>> cache directory if necessary. Considering use cases based on these libraries,
>> maybe `model-directory-path` can be renamed to `model`, which can be either a
>> real path or a simple model name.
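For reference, the convention in those libraries looks roughly like the sketch
below: a single `model` argument that accepts either a local directory or a hub
model id, with downloading and caching handled by the library itself (this is
illustrative only; the Qwen id and the local path are just examples):

```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Either form works for `model`: a hub id is downloaded to the local cache
# on first use, while a local directory is loaded directly.
model = "Qwen/Qwen3-0.6B"            # or e.g. "/opt/models/qwen3-0.6b"

tokenizer = AutoTokenizer.from_pretrained(model)
lm = AutoModelForCausalLM.from_pretrained(model)

# vLLM follows the same convention for its `model` argument:
# from vllm import LLM
# llm = LLM(model=model)
```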
Yes, PredictRuntimeContext is more generic and extensible. I will update the
FLIP soon.
>> There is a `set_model_config` in the python PredictFunction interface. I
>> find that in the previous discussion, you are considering passing
>> PredictRuntimeContext in `open`. I do agree with this choice as well, which
>> makes the logic clearer, and users can just use the configs in the `open`
>> method to init their model. I just find that the FLIP still has
>> `set_model_config`, so I want to double check it here.
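To make that concrete, below is a minimal sketch of the open()-based flow,
using today's PyFlink TableFunction and FunctionContext as stand-ins for the
proposed PredictFunction/PredictRuntimeContext; the "model" job parameter name
and the sentence-transformers embedding example are just placeholders here,
not something the FLIP defines:

```
from pyflink.table import DataTypes
from pyflink.table.udf import TableFunction, udtf
from sentence_transformers import SentenceTransformer


class EmbeddingPredictFunction(TableFunction):
    """Reads its model configuration in open(); no set_model_config needed."""

    def __init__(self):
        self.encoder = None

    def open(self, function_context):
        # "model" is a placeholder option name; it can be a local path or a
        # simple model name, mirroring the discussion above.
        model = function_context.get_job_parameter("model", "all-MiniLM-L6-v2")
        self.encoder = SentenceTransformer(model)

    def eval(self, content: str):
        embedding = self.encoder.encode(content)
        yield content, embedding.tolist()


embedding_udtf = udtf(EmbeddingPredictFunction(),
                      result_types=[DataTypes.STRING(),
                                    DataTypes.ARRAY(DataTypes.FLOAT())])
```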
-Thanks,
M.Swapna

On Tue, Dec 2, 2025 at 4:48 AM Geng Biao <[email protected]> wrote:

> Hi folks,
>
> I am writing to follow up with this FLIP. I recently implemented a PoC for
> the ML_PREDICT using a local python model based on the current PyFlink udtf
> framework.
> Codes can be found here:
> https://github.com/bgeng777/flink/commit/8eb002944ee8ce863680923ff53622eb10813777
> I believe the proposal can work and want to share some findings:
> My implementation follows most of the SQL section design of the FLIP (the
> only difference is that I use the 'model' option instead of
> 'model-directory-path'; see my previous reply in the thread for details),
> which I think is expressive enough.
>
> In the python part, as ML_PREDICT is a specialized UDTF and PyFlink also
> supports UDTF, it is natural to directly reuse the udtf <
> https://github.com/apache/flink/blob/master/flink-python/pyflink/table/udf.py#L692>
> decorator to define a python predict class. One point is that, as the SQL
> primitive has a defined OUTPUT schema, we may need to enhance the PyFlink
> udtf interface to allow users to use it without specifying `result_types`
> and directly use the definition from the DDL. Another point is that we need
> to pass some model parameters (e.g. temperature, GPU/CPU settings) to the
> Python process, so we may need to enhance PyFlink to support such usage so
> that users can directly get the model parameters in the UDTF. After
> introducing such improvements, we can define the python logic like this:
>
> ```
> class HuggingFaceFunc(TableFunction):
>     def __init__(self):
>         self.model = None
>         self.tokenizer = None
>         self.pipeline = None
>
>     def open(self, runtime_context):
>         model_dir = runtime_context.get_job_parameter("model", None)
>         ...
>         self.tokenizer = AutoTokenizer.from_pretrained(
>             model_dir, trust_remote_code=True
>         )
>         self.model = AutoModelForCausalLM.from_pretrained(
>             model_dir, device_map=device
>         )
>         self.pipeline = pipeline(
>             "text-generation", model=self.model, tokenizer=self.tokenizer
>         )
>         assert self.model is not None
>
>     def eval(self, content: str, comment: str):
>         output = self.pipeline(content)[0]["generated_text"]
>         yield output, len(output)
>
>
> HuggingFaceModelUDTF = udtf(HuggingFaceFunc())
> ```
>
> I have implemented 2 python predict classes, one based on HuggingFace and
> another based on vLLM, to verify this design (codes can be found here <
> https://github.com/bgeng777/flink/blob/bgeng777/python-ml-predict-poc/flink-end-to-end-tests/flink-python-test/python/huggingface_udtf.py>).
> So to conclude, I am wondering if we really need a new Python `class
> PredictFunction(TableFunction)`.
>
> Thanks for reading and looking forward to your thoughts.
>
> Best,
> Biao Geng
>
>
>
> > On Nov 21, 2025, at 19:28, Geng Biao <[email protected]> wrote:
> >
> > Hi Swapna,
> >
> > Thanks for the FLIP and sorry for the late reply. After reading it, I
> > have some comments about the proposal.
> >
> > 1. I notice that the parameter `model-directory-path` is introduced. It
> > should work, but I see that some popular python model libraries like
> > `transformers` and `vllm` tend to directly use the parameter name `model`,
> > which can be a local path or a simple name like `Qwen/Qwen3-0.6B`. They
> > then check whether `model` is a path or a simple name and automatically
> > download it to a local cache directory if necessary. Considering use cases
> > based on these libraries, maybe `model-directory-path` can be renamed to
> > `model`, which can be either a real path or a simple model name.
> >
> > 2. There is a `set_model_config` in the python PredictFunction
> > interface. I find that in the previous discussion, you are considering
> > passing PredictRuntimeContext in `open`. I do agree with this choice as
> > well, which makes the logic clearer, and users can just use the configs in
> > the `open` method to init their model. I just find that the FLIP still has
> > `set_model_config`, so I want to double check it here.
> >
> >
> >
> > Thanks for your time!
> >
> > Best,
> > Biao Geng
> >
> >> On Sep 22, 2025, at 23:36, Swapna Marru <[email protected]> wrote:
> >>
> >> Hi Devs,
> >>
> >>
> >> I am interested in learning more about the MODEL, ML_PREDICT, and
> >> ML_EVALUATE functionalities added in the following FLIP.
> >>
> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-437%3A+Support+ML+Models+in+Flink+SQL
> >>
> >>
> >> I see the original FLIP has extensibility to local model providers in
> >> Flink.
> >>
> >>
> >> Is there a way to do pluggable local model providers in Python? Like,
> >> say, generate embeddings using Sentence Transformer models running
> >> locally in Flink.
> >>
> >>
> >> An option could be to introduce a Model Provider factory implementation
> >> in Java that internally uses a predict function in Python. But I see this
> >> puts in a lot of work related to Java-to-Python communication/translation
> >> inside the provider.
> >>
> >>
> >> Something like PythonRuntimeProvider along with PredictRuntimeProvider /
> >> AsyncRuntimeProvider, which can handle Java -> Python translations out of
> >> the box, would be helpful to de-duplicate that effort.
> >>
> >>
> >> Can you please point to any existing discussions related to this, or any
> >> other ways to achieve the same? Please share your thoughts.
> >>
> >>
> >> -Thanks,
> >>
> >> Swapna Marru
> >
> >
