Hi Swapna,

Some other suggestions and questions:

1. Maybe rename `getPythonPredictFunction` to `createPythonPredictFunction` to align with `createPredictFunction` in `PredictRuntimeProvider`.
2. Is it possible to enforce that the Java `PythonFunction` returned from `createPythonFunction` implements the Python `PredictFunction`? (Rough sketch of what I mean below.)
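To make (1) and (2) concrete, here is roughly the shape I have in mind. This is only a sketch; the interface and method names below are illustrative, not the actual interfaces proposed in the FLIP:

```java
// Illustrative sketch only, not the proposed FLIP-552 API.
// Idea: mirror PredictRuntimeProvider#createPredictFunction with a create*
// method, and have the returned Java-side handle declare which Python class
// it wraps plus the Python base class that class must extend, so the
// framework can verify the PredictFunction contract before running the job.

/** Java-side handle pointing at a user-supplied Python predict class. */
interface PythonPredictFunctionHandle {

    /** Fully qualified Python class name, e.g. "my_pkg.MyPredictFunction" (hypothetical). */
    String pythonClassName();

    /** Fully qualified Python base class the target must extend (module path here is hypothetical). */
    default String requiredPythonBaseClass() {
        return "pyflink.table.ml.PredictFunction";
    }
}

/** Python counterpart of PredictRuntimeProvider. */
interface PythonPredictRuntimeProvider {

    /** Named to align with PredictRuntimeProvider#createPredictFunction. */
    PythonPredictFunctionHandle createPythonPredictFunction();
}
```

With something along these lines, a Python class that does not extend the expected base class could be rejected at planning time instead of failing inside the Python worker.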
Hao

On Tue, Oct 14, 2025 at 2:52 PM Swapna Marru <[email protected]> wrote:

> Thanks Matyas and Shengkai.
>
> > 1. I'm wondering whether we could extend the SQL API to change how Python models are loaded. For example, we could allow users to write:
> >
> > ```
> > CREATE MODEL my_pytorch_model
> > WITH (
> >   'type' = 'pytorch'
> > ) LANGUAGE PYTHON;
> > ```
>
> I kind of thought about this, but one initial concern I had with this model is: will a model provider be completely implemented in Python itself?
> When we refer to PyTorch or HuggingFace or ONNX model providers, for example, do we need different behavior or optimizations related to PredictRuntimeContext/model config building, batching, or resource scheduling decisions, which need to be done in Java Flink entry points?
>
> > 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can we reuse Beam's utilities to build Flink prediction functions[1]?
>
> Thanks, I will take a look at this to understand how it works and whether we can learn from that design.
>
> > 3. It would be better if we introduced a PredictRuntimeContext to help users download required weight files.
>
> Currently I have this as model config (`set_model_config`), which will be passed to the PredictFunction in Python. But PredictRuntimeContext seems more suitable.
> I will look into passing a PredictRuntimeContext in `open`, similar to the RuntimeContext for UDFs.
>
> > 4. In ML, users typically perform inference on batches of data. Therefore, per-record evaluation may not be necessary. How about we just introduce an API like [2]?
>
> Yes, I completely agree on this. I was first aiming for agreement on the model creation and provider API, and will then look at this in more detail.
>
> On Tue, Oct 14, 2025 at 10:39 AM Őrhidi Mátyás <[email protected]> wrote:
>
> > Hey Shengkai,
> >
> > Thank you for your observations. This proposal is mostly driven by Swapna, but I could also share my thoughts here; please find them inline.
> >
> > Cheers,
> > Matyas
> >
> > On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]> wrote:
> > >
> > > Hi, Matyas.
> > >
> > > Thanks for the proposal. I have some suggestions about it.
> > >
> > > 1. I'm wondering whether we could extend the SQL API to change how Python models are loaded. For example, we could allow users to write:
> > >
> > > ```
> > > CREATE MODEL my_pytorch_model
> > > WITH (
> > >   'type' = 'pytorch'
> > > ) LANGUAGE PYTHON;
> > > ```
> > > In this case, we wouldn't rely on Java SPI to load the Python model provider. However, I'm not sure whether Python has a similar mechanism to SPI that avoids hardcoding class paths.
> >
> > This is an interesting idea; however, we are proposing the provider model because it aligns with Flink's existing Java-based architecture for discovering plugins. A Java entry point is required to launch the Python code, and this is the standard way to do it.
> >
> > > 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can we reuse Beam's utilities to build Flink prediction functions[1]?
> >
> > We can certainly learn from Beam's design, but directly reusing it would add a very heavy dependency and be difficult to integrate cleanly into Flink's native processing model.
> >
> > > 3. It would be better if we introduced a PredictRuntimeContext to help users download required weight files.
> >
> > This is actually a great idea and essential for usability. Just to double-check on your suggestion: your proposal is to have an explicit PredictRuntimeContext for dynamic model file downloading?
> >
> > > 4. In ML, users typically perform inference on batches of data. Therefore, per-record evaluation may not be necessary. How about we just introduce an API like [2]?
> >
> > I agree completely. The row-by-row API is just a starting point, and we should aim to prioritize support for efficient batch inference to ensure good performance for real-world models.
> >
> > > Best,
> > > Shengkai
> > >
> > > [1] https://beam.apache.org/documentation/ml/about-ml/
> > > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
> > >
> > > On Tue, Oct 14, 2025 at 11:53, Swapna Marru <[email protected]> wrote:
> > >
> > > > Thanks Matyas.
> > > >
> > > > Hao,
> > > >
> > > > The proposal is to provide a generic framework.
> > > > Interfaces -> PythonPredictRuntimeProvider / PythonPredictFunction / PredictFunction (in Python) are defined to provide a base for that framework.
> > > >
> > > > generic-python is one of the implementations, registered similarly to openai in the original FLIP.
> > > > This is, though, not a concrete end-to-end implementation. It can be used:
> > > > 1. As a reference implementation for other complete, end-to-end, concrete model provider implementations.
> > > > 2. For simple Python model implementations, out of the box, to avoid a boilerplate Java provider implementation.
> > > >
> > > > I will also open a PR with the current implementation changes, so it's clearer for further discussion.
> > > >
> > > > -Thanks,
> > > > M.Swapna
> > > >
> > > > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]> wrote:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > > > >
> > > > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]> wrote:
> > > > > >
> > > > > > Swapna, I can help you to create a FLIP page.
> > > > > >
> > > > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi Swapna,
> > > > > > >
> > > > > > > Thanks for the proposal. Can you put it in a FLIP and start a discussion thread for it?
> > > > > > >
> > > > > > > From an initial look, I'm a bit confused whether this is a concrete implementation for "generic-python" or a generic framework to handle Python predict functions, because everything seems concrete, like `GenericPythonModelProviderFactory` and `GenericPythonModelProvider`, except the final Python predict function.
> > > > > > >
> > > > > > > Also, if `GenericPythonModelProviderFactory` is predefined, do you predefine the required and optional options for it? Will it be inflexible if predefined?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Hao
> > > > > > >
> > > > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi ShengKai,
> > > > > > > >
> > > > > > > > Documented the initial proposal here:
> > > > > > > >
> > > > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > > > >
> > > > > > > > Please review and let me know your thoughts.
> > > > > > > >
> > > > > > > > -Thanks,
> > > > > > > > Swapna
> > > > > > > >
> > > > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > I see your point, and I agree that your proposal is feasible. However, there is one limitation to consider: the current loading mechanism first discovers all available factories on the classpath and then filters them based on the user-specified identifiers.
> > > > > > > > >
> > > > > > > > > In most practical scenarios, we would likely have only one generic factory (e.g., a GenericPythonModelFactory) present on the classpath. This means the framework would be able to load either PyTorch or TensorFlow models (whichever is defined within that single generic implementation), but not both simultaneously unless additional mechanisms are introduced.
> > > > > > > > >
> > > > > > > > > This doesn't block the proposal, but it's something worth noting as we design the extensibility model. We may want to explore ways to support multiple user-defined providers more seamlessly in the future.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Shengkai
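P.S. Going back to my earlier question about predefined options, and to Shengkai's note above on factory discovery: for concreteness, this is roughly the split I was picturing for the generic factory. It is only a sketch; the interface name, identifier, and option handling below are illustrative, not the actual Flink factory API or the FLIP-552 design.

```java
import java.util.Map;
import java.util.Set;

// Illustrative sketch only, not the actual Flink factory interfaces.
// The generic factory would be discovered like any other factory via Java SPI
// (META-INF/services) and selected by its identifier; the open question is how
// much of its option set can stay free-form for arbitrary Python models.
interface GenericPythonFactorySketch {

    /** Identifier referenced from the model options, e.g. something like 'provider' = 'generic-python'. */
    String factoryIdentifier();

    /** Small fixed set of options every generic-python model must provide, e.g. the Python class to load. */
    Set<String> requiredOptions();

    /** All remaining options are forwarded untouched to the Python PredictFunction as model config. */
    Map<String, String> forwardModelConfig(Map<String, String> allOptions);
}
```

A split like this (a small fixed set of required options plus pass-through model config) might be one way to keep a single predefined factory from becoming too inflexible.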
