Hi Swapna,

Thanks for your reply! After exploring batching and reviewing your POC code, I’d like to revise my earlier response: I now think introducing a new Python interface (e.g., PredictFunction) would be very helpful. It would not only make the current interface easier to use, sparing users from calling udtf() manually, but also let us hide some implementation details and offer an API better suited to the ML_PREDICT / model-inference scenario.
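To make this more concrete, here is a rough sketch of the shape I have in mind. It is only an illustration under my own assumptions (the wrapping behavior and the way the model options reach open() are not yet fixed in the FLIP), not the final interface:

```
from pyflink.table.udf import TableFunction


class PredictFunction(TableFunction):
    """Sketch of a dedicated base class for local Python model inference.

    The idea is that the framework wraps a subclass into a UDTF on the
    user's behalf, derives the result types from the model's OUTPUT schema
    in the DDL, and hands the model options (e.g. 'model', 'temperature',
    device settings) to open() via the runtime context, so users neither
    call udtf() themselves nor repeat the schema in Python.
    """

    def open(self, runtime_context):
        # Load/initialize the model from the options exposed by the
        # runtime context (the exact accessor is still to be decided).
        pass

    def eval(self, *args):
        # Yield prediction rows that match the model's OUTPUT schema.
        raise NotImplementedError
```

With something like this, the udtf() wrapping, the result_types handling and the model-option plumbing from my PoC could live behind the interface instead of being repeated in every user function.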
Best,
Biao Geng

> On Dec 10, 2025, at 01:11, Swapna Marru <[email protected]> wrote:
>
> Hi
>
> Thanks Geng Biao for the interest and the poc implementation. Sorry I was
> away for a while and in between some other stuff currently.
> I will be able to look back on this starting next week.
> Meanwhile, would like to share my previous POC impl,
> https://github.com/apache/flink/commit/bb2d5c4aa096a4eb3a89adb6a5bfecd8e87d3262
>
>
> I will take a look at your implementation.
> In previous discussions with ShengKai, one more thing we wanted to explore
> is the flexibility of having the model provider factory also in Python,
> instead of having it in Java as proposed in the FLIP.
> This is something I haven't yet gotten time to dig deeper into.
>
> Yes, I agree on this.
>> I notice that the parameter `model-directory-path` is introduced. It
>> should work but I see that some popular python model libraries like
>> `transformer` and `vllm` tend to directly use the parameter name `model`
>> which could be a local path or a simple name like `Qwen/Qwen3-0.6B`. Then
>> they check if `model` is a path or a simple name and automatically download
>> it to a local cache directory if necessary. Considering use cases based on
>> these libraries, maybe `model-directory-path` can be renamed to `model`
>> which can be either a real path or a simple model name.
>
> Yes, PredictRuntimeContext is more generic and extensible. I will
> update the FLIP soon.
>> There is a `set_model_config` in the python PredictFunction interface.
>> I find that in the previous discussion, you are considering passing
>> PredictRuntimeContext in `open`. I do agree with this choice as well, which
>> makes the logic more clear and users can just use the configs in the `open`
>> method to init their model. I just find that the FLIP still has the
>> `set_model_config` so just want to double check it here.
>
> -Thanks,
> M.Swapna
>
>
> On Tue, Dec 2, 2025 at 4:48 AM Geng Biao <[email protected]> wrote:
>
>> Hi folks,
>>
>> I am writing to follow up on this FLIP. I recently implemented a PoC for
>> ML_PREDICT using a local python model based on the current PyFlink udtf
>> framework.
>> Codes can be found here:
>> https://github.com/bgeng777/flink/commit/8eb002944ee8ce863680923ff53622eb10813777
>> I believe the proposal can work and want to share some findings:
>> My implementation follows most of the SQL section design of the FLIP (the
>> only difference is that I use the 'model' option instead of
>> 'model-directory-path'; see my previous reply in the thread for details),
>> which I think is expressive enough.
>>
>> In the python part, as ML_PREDICT is a specialized UDTF and PyFlink
>> also supports UDTF, it is natural to directly reuse the udtf
>> <https://github.com/apache/flink/blob/master/flink-python/pyflink/table/udf.py#L692>
>> decorator to define a python predict class. One point is that, as the SQL
>> primitive has a defined OUTPUT schema, we may need to enhance the PyFlink
>> udtf interface to allow users to use it without specifying `result_types`
>> and directly use the definition from the DDL. Another point is that we need
>> to pass some model parameters (e.g. temperature, GPU/CPU settings) to the
>> Python process, so we may need to enhance PyFlink to support such usage so
>> that in the UDTF, users can directly get the model parameters.
>> After introducing such improvements, we can define the python logic like this:
>>
>> ```
>> class HuggingFaceFunc(TableFunction):
>>     def __init__(self):
>>         self.model = None
>>         self.tokenizer = None
>>         self.pipeline = None
>>
>>     def open(self, runtime_context):
>>         model_dir = runtime_context.get_job_parameter("model", None)
>>         ...
>>         self.tokenizer = AutoTokenizer.from_pretrained(
>>             model_dir, trust_remote_code=True
>>         )
>>
>>         self.model = AutoModelForCausalLM.from_pretrained(
>>             model_dir, device_map=device
>>         )
>>         self.pipeline = pipeline(
>>             "text-generation", model=self.model, tokenizer=self.tokenizer
>>         )
>>         assert self.model is not None
>>
>>     def eval(self, content: str, comment: str):
>>         output = self.pipeline(content)[0]["generated_text"]
>>         yield output, len(output)
>>
>>
>> HuggingFaceModelUDTF = udtf(HuggingFaceFunc())
>> ```
>>
>> I have implemented 2 python predict classes, one using HuggingFace and
>> another based on vLLM, to verify this design (codes can be found here:
>> https://github.com/bgeng777/flink/blob/bgeng777/python-ml-predict-poc/flink-end-to-end-tests/flink-python-test/python/huggingface_udtf.py).
>> So to conclude, I am wondering if we really need a new Python
>> `class PredictFunction(TableFunction)`.
>>
>> Thanks for reading and looking forward to your thoughts.
>>
>> Best,
>> Biao Geng
>>
>>
>>> On Nov 21, 2025, at 19:28, Geng Biao <[email protected]> wrote:
>>>
>>> Hi Swapna,
>>>
>>> Thanks for the FLIP and sorry for the late reply. After reading it, I
>>> have some comments about the proposal.
>>>
>>> 1. I notice that the parameter `model-directory-path` is introduced. It
>>> should work but I see that some popular python model libraries like
>>> `transformer` and `vllm` tend to directly use the parameter name `model`
>>> which could be a local path or a simple name like `Qwen/Qwen3-0.6B`. Then
>>> they check if `model` is a path or a simple name and automatically download
>>> it to a local cache directory if necessary. Considering use cases based on
>>> these libraries, maybe `model-directory-path` can be renamed to `model`
>>> which can be either a real path or a simple model name.
>>>
>>> 2. There is a `set_model_config` in the python PredictFunction
>>> interface. I find that in the previous discussion, you are considering
>>> passing PredictRuntimeContext in `open`. I do agree with this choice as
>>> well, which makes the logic more clear and users can just use the configs in
>>> the `open` method to init their model. I just find that the FLIP still has
>>> the `set_model_config` so just want to double check it here.
>>>
>>>
>>>
>>> Thanks for your time!
>>>
>>> Best,
>>> Biao Geng
>>>
>>>> On Sep 22, 2025, at 23:36, Swapna Marru <[email protected]> wrote:
>>>>
>>>> Hi Devs,
>>>>
>>>>
>>>> I am interested in learning more about MODEL, ML_PREDICT, and ML_EVALUATE
>>>> functionalities added in the following FLIP.
>>>>
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-437%3A+Support+ML+Models+in+Flink+SQL
>>>>
>>>>
>>>> I see the original FLIP has extensibility to local model providers in Flink.
>>>>
>>>>
>>>> Is there a way to do pluggable local model providers in Python? Like, say,
>>>> generate embeddings using Sentence transformer models running locally in
>>>> Flink.
>>>>
>>>>
>>>> An option could be to introduce a Model Provider factory implementation in
>>>> Java that internally uses a predict function in Python. But I see this
>>>> puts in a lot of work related to Java to Python communication/translation
>>>> inside the provider.
>>>>
>>>>
>>>> Something like PythonRuntimeProvider along with PredictRuntimeProvider /
>>>> AsyncRuntimeProvider which can handle Java -> Python translations out of
>>>> the box would be helpful to de-duplicate that effort.
>>>>
>>>>
>>>> Can you please point to, if there are any discussions related to this
>>>> already? Or any other ways to achieve the same? Please share your
>>>> thoughts.
>>>>
>>>>
>>>> -Thanks,
>>>>
>>>> Swapna Marru
