[ 
https://issues.apache.org/jira/browse/FLINK-38067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kartikey Pant updated FLINK-38067:
----------------------------------
    Attachment: Screenshot 2025-07-11 at 8.45.23 PM.png

> Release Testing: Verify FLIP-525: Model ML_PREDICT, ML_EVALUATE 
> Implementation Design
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-38067
>                 URL: https://issues.apache.org/jira/browse/FLINK-38067
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Hao Li
>            Assignee: Kartikey Pant
>            Priority: Major
>         Attachments: Screenshot 2025-07-11 at 8.44.09 PM-1.png, Screenshot 
> 2025-07-11 at 8.44.09 PM.png, Screenshot 2025-07-11 at 8.45.23 PM.png, 
> Screenshot 2025-07-11 at 8.48.53 PM.png, Screenshot 2025-07-11 at 8.48.59 
> PM-1.png, Screenshot 2025-07-11 at 8.48.59 PM.png, Screenshot 2025-07-11 at 
> 8.49.12 PM.png
>
>
> h3. Follow-up test for https://issues.apache.org/jira/browse/FLINK-37777
> h3. Start cluster and SQL client
> 1. Switch to the 2.1 release branch and build from source with:
> {code:java}
> ./mvnw clean install -Pfast -DskipTests -Dscala-2.12 {code}
> *NOTE:* The fix in [https://github.com/apache/flink/pull/26770] is needed 
> first. Then copy the flink-model-openai JAR file from .m2/repository to 
> build-target/lib before step 2.
>  
> 2. Start the cluster and SQL client with:
> {code:java}
> build-target/bin/start-cluster.sh
> build-target/bin/sql-client.sh {code}
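> A minimal sketch of the JAR copy mentioned in the note under step 1, to be 
> run from the Flink source root before starting the cluster. The repository 
> sub-path org/apache/flink/flink-model-openai and the default 
> ~/.m2/repository location are assumptions, not taken from this issue; 
> adjust them to your build.
> {code:bash}
> # Assumed locations; override via environment variables if yours differ.
> M2_REPO="${M2_REPO:-$HOME/.m2/repository}"
> FLINK_LIB="${FLINK_LIB:-build-target/lib}"
> # Pick the first main JAR, skipping test and source artifacts.
> jar="$(find "$M2_REPO/org/apache/flink/flink-model-openai" \
>   -name 'flink-model-openai-*.jar' ! -name '*tests*' ! -name '*sources*' \
>   2>/dev/null | head -n 1 || true)"
> if [ -n "$jar" ] && [ -d "$FLINK_LIB" ]; then
>   cp "$jar" "$FLINK_LIB/" && echo "copied $jar to $FLINK_LIB"
> else
>   echo "flink-model-openai JAR or $FLINK_LIB not found" >&2
> fi
> {code}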
> h3. Create table
> {code:java}
> create table source (text string)
> with (
> 'connector' = 'filesystem',
> 'path' = 'file:///<to your file path>',
> 'format' = 'raw'
> ); {code}
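> A minimal sketch for preparing a small input file for the source table. The 
> directory /tmp/flink-ml-predict-test and the sample sentences are 
> hypothetical. With the raw format, each line of the file should become one 
> row of the text column.
> {code:bash}
> # Hypothetical directory; override INPUT_DIR to use another location.
> INPUT_DIR="${INPUT_DIR:-/tmp/flink-ml-predict-test}"
> mkdir -p "$INPUT_DIR"
> # Write three short sentences, one per line.
> printf '%s\n' 'Hello, world.' 'Good morning.' 'Thank you very much.' \
>   > "$INPUT_DIR/input.txt"
> # Print the URI to paste into the 'path' option of the table above.
> echo "file://$INPUT_DIR/input.txt"
> {code}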
> h3. Create model
> {code:java}
> create model translate_model1
> input (i string)
> output (o string)
> with (
> 'provider' = 'openai',
> 'endpoint' = 'https://api.openai.com/v1/chat/completions',
> 'api-key' = '<your api key>',
> 'model' = 'gpt-4.1',
> 'system-prompt' = 'translate to Chinese'
> ); {code}
> h3. Test Prediction
> 1. With timeout
> {code:java}
> select * from ml_predict(table source, model translate_model1, 
> descriptor(text), map['timeout', '5s']); {code}
> 2. Without timeout
> {code:java}
> select * from ml_predict(table source, model translate_model1, 
> descriptor(text)); {code}
> 3. Sync mode should fail
> {code:java}
> select * from ml_predict(table source, model translate_model1, 
> descriptor(text), map['async', 'false']); {code}
> 4. Other options
> {code:java}
> select * from ml_predict(table source, model translate_model1, 
> descriptor(text), map['async', 'true', 'max-concurrent-operations', '1', 
> 'output-mode', 'ALLOW_UNORDERED', 'timeout', '5s']);  {code}
> h3. Test model output column has same name as table input column
> {code:java}
> create model translate_model2
> input (i string)
> output (text string)
> with (
> 'provider' = 'openai',
> 'endpoint' = 'https://api.openai.com/v1/chat/completions',
> 'api-key' = '<your api key>',
> 'model' = 'gpt-4.1',
> 'system-prompt' = 'translate to Chinese'
> ); {code}
> {code:java}
> select * from ml_predict(table source, model translate_model2, 
> descriptor(text), map['timeout', '5s']); {code}
> Output columns should be _text_ and {_}text0{_}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
