[
https://issues.apache.org/jira/browse/FLINK-38067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004738#comment-18004738
]
Kartikey Pant edited comment on FLINK-38067 at 7/11/25 4:38 PM:
----------------------------------------------------------------
Hi [~lihaosky],
I successfully ran all the test cases as described. For the testing, I used the
{{confluent/model-uber-cp}} branch from the provided CP fix
([https://github.com/apache/flink/pull/26782]).
As per the instructions, I first applied the fix from
[https://github.com/apache/flink/pull/26770] and then copied the
{{flink-model-openai}} JAR file from my {{.m2/repository}} to the
{{build-target/lib}} directory before starting the cluster and the SQL client.
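The JAR copy step could be scripted roughly as follows. This is a minimal sketch, not part of the original instructions: the function name {{copy_model_jar}}, the default repository location {{$HOME/.m2/repository}}, and the {{flink-model-openai-*.jar}} file-name pattern are assumptions and may need adjusting for a local build.

```shell
# Hypothetical helper (not from the issue): copy locally built
# flink-model-openai JAR(s) from a Maven repository into the lib
# directory of the Flink distribution.
# Usage: copy_model_jar [maven_repo_dir] [flink_lib_dir]
copy_model_jar() {
    cmj_repo_dir="${1:-$HOME/.m2/repository}"   # assumed default location
    cmj_lib_dir="${2:-build-target/lib}"

    if [ ! -d "$cmj_lib_dir" ]; then
        echo "lib directory not found: $cmj_lib_dir" >&2
        return 1
    fi

    # Print and copy every matching JAR, skipping sources and tests JARs.
    # The file-name pattern is an assumption about the artifact naming.
    cmj_found=$(find "$cmj_repo_dir" -type f \
        -name 'flink-model-openai-*.jar' \
        ! -name '*-sources.jar' ! -name '*-tests.jar' \
        -print -exec cp {} "$cmj_lib_dir/" \;) || true

    if [ -z "$cmj_found" ]; then
        echo "no flink-model-openai JAR found under $cmj_repo_dir" >&2
        return 1
    fi
    printf 'copied:\n%s\n' "$cmj_found"
}
```

Calling {{copy_model_jar}} with no arguments from the Flink source root uses the assumed defaults. If the local repository holds several versions of the artifact, all of them are copied, so passing a narrower directory as the first argument is safer.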
The attached screenshots demonstrate the successful execution of each test case:
* *Test Prediction (With timeout):* The query with a {{'5s'}} timeout ran
successfully, producing the correct translated output.
** !Screenshot 2025-07-11 at 9.56.38 PM.png!
** !Screenshot 2025-07-11 at 9.56.27 PM.png!
* *Test Prediction (Without timeout):* The query without a timeout also
completed successfully.
** !Screenshot 2025-07-11 at 8.48.59 PM.png!
** !Screenshot 2025-07-11 at 8.48.53 PM.png!
* *Sync mode should fail:* The query with {{MAP['async', 'false']}} failed as
expected, with an error stating that the provider does not support sync mode.
** !Screenshot 2025-07-11 at 8.49.12 PM.png!
* *Other options:* The query with multiple options
({{max-concurrent-operations}}, {{output-mode}}, etc.) also ran successfully.
** !Screenshot 2025-07-11 at 8.55.06 PM.png!
** !Screenshot 2025-07-11 at 8.55.01 PM.png!
* *Test model output column has same name as table input column:* In the final
test case, the output column was correctly renamed to {{text0}} to avoid a name
collision with the input column {{text}}.
** !Screenshot 2025-07-11 at 8.56.28 PM.png!
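The renaming seen in the last test case can be pictured as a uniquify pass over the combined list of input and output column names. The sketch below only illustrates that idea and is not Flink's actual implementation: the function name {{uniquify_columns}} and the rule of appending the smallest unused integer suffix, starting at 0, are assumptions chosen to reproduce the observed {{text}} / {{text0}} result.

```shell
# Illustrative only: make a list of column names unique by appending a
# numeric suffix (0, 1, 2, ...) to any name that was already used.
# Names are assumed to contain no whitespace.
uniquify_columns() {
    uc_seen=" "   # space-delimited set of names emitted so far
    uc_out=""
    for uc_name in "$@"; do
        uc_candidate="$uc_name"
        uc_i=0
        # Keep trying name0, name1, ... while the candidate is taken.
        while case "$uc_seen" in *" $uc_candidate "*) true ;; *) false ;; esac; do
            uc_candidate="$uc_name$uc_i"
            uc_i=$((uc_i + 1))
        done
        uc_seen="$uc_seen$uc_candidate "
        uc_out="${uc_out:+$uc_out }$uc_candidate"
    done
    printf '%s\n' "$uc_out"
}
```

For example, {{uniquify_columns text text}} prints {{text text0}}, matching the column names reported for the name-collision test.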
> Release Testing: Verify FLIP-525: Model ML_PREDICT, ML_EVALUATE
> Implementation Design
> -------------------------------------------------------------------------------------
>
> Key: FLINK-38067
> URL: https://issues.apache.org/jira/browse/FLINK-38067
> Project: Flink
> Issue Type: Sub-task
> Reporter: Hao Li
> Assignee: Kartikey Pant
> Priority: Major
> Attachments: Screenshot 2025-07-11 at 8.44.09 PM-1.png, Screenshot
> 2025-07-11 at 8.44.09 PM.png, Screenshot 2025-07-11 at 8.48.53 PM.png,
> Screenshot 2025-07-11 at 8.48.59 PM-1.png, Screenshot 2025-07-11 at 8.48.59
> PM.png, Screenshot 2025-07-11 at 8.49.12 PM.png, Screenshot 2025-07-11 at
> 8.55.01 PM.png, Screenshot 2025-07-11 at 8.55.06 PM.png, Screenshot
> 2025-07-11 at 8.56.28 PM.png, Screenshot 2025-07-11 at 9.56.27 PM.png,
> Screenshot 2025-07-11 at 9.56.38 PM.png
>
>
> h3. Follow-up to the test for https://issues.apache.org/jira/browse/FLINK-37777
> h3. Start cluster and sql client
> 1. Switch to the 2.1 release branch and build from source with
> {code:java}
> ./mvnw clean install -Pfast -DskipTests -Dscala-2.12 {code}
> *NOTE:* Before step 2, apply the fix from
> [https://github.com/apache/flink/pull/26770] and then copy the
> flink-model-openai JAR file from .m2/repository to build-target/lib.
>
> 2. Start cluster and sql client with
> {code:java}
> build-target/bin/start-cluster.sh
> build-target/bin/sql-client.sh {code}
> h3. Create table
> {code:java}
> create table source (text string)
> with (
> 'connector' = 'filesystem',
> 'path' = 'file:///<to your file path>',
> 'format' = 'raw'
> ); {code}
> h3. Create model
> {code:java}
> create model translate_model1
> input (i string)
> output (o string)
> with (
> 'provider' = 'openai',
> 'endpoint' = 'https://api.openai.com/v1/chat/completions',
> 'api-key' = '<your api key>',
> 'model' = 'gpt-4.1',
> 'system-prompt' = 'translate to Chinese'
> ); {code}
> h3. Test Prediction
> 1. With timeout
> {code:java}
> select * from ml_predict(table source, model translate_model1,
> descriptor(text), map['timeout', '5s']); {code}
> 2. Without timeout
> {code:java}
> select * from ml_predict(table source, model translate_model1,
> descriptor(text)); {code}
> 3. Sync mode should fail
> {code:java}
> select * from ml_predict(table source, model translate_model1,
> descriptor(text), map['async', 'false']); {code}
> 4. Other options
> {code:java}
> select * from ml_predict(table source, model translate_model1,
> descriptor(text), map['async', 'true', 'max-concurrent-operations', '1',
> 'output-mode', 'ALLOW_UNORDERED', 'timeout', '5s']); {code}
> h3. Test model output column has same name as table input column
> {code:java}
> create model translate_model2
> input (i string)
> output (text string)
> with (
> 'provider' = 'openai',
> 'endpoint' = 'https://api.openai.com/v1/chat/completions',
> 'api-key' = '<your api key>',
> 'model' = 'gpt-4.1',
> 'system-prompt' = 'translate to Chinese'
> ); {code}
> {code:java}
> select * from ml_predict(table source, model translate_model2,
> descriptor(text), map['timeout', '5s']); {code}
> Output columns should be _text_ and _text0_.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)