Martijn Visser created FLINK-40064:
--------------------------------------

             Summary: Migrate PyFlink DataStream e2e test from Kafka to 
FileSource/FileSink and re-enable it
                 Key: FLINK-40064
                 URL: https://issues.apache.org/jira/browse/FLINK-40064
             Project: Flink
          Issue Type: Bug
          Components: API / Python, Test Infrastructure
            Reporter: Martijn Visser
            Assignee: Martijn Visser


FLINK-40048 removed flink-sql-connector-kafka from flink-sql-client-test, 
causing test_pyflink.sh to fail on every run. The only consumer of that jar was 
the "Test PyFlink DataStream job" case, disabled since FLINK-36185 because 
data_stream_job.py still used the legacy FlinkKafkaConsumer/FlinkKafkaProducer.

We should rewrite data_stream_job.py to the in-repo FileSource 
(text_line_format) and unified FileSink, re-enable the case with deterministic 
file-based verification (auto-watermark interval 0, bounded source, job runs to 
FINISHED), and remove the Kafka scaffolding from test_pyflink.sh. 
kafka_sql_common.sh remains (used by test_confluent_schema_registry.sh). 
Restores the nightly leg and PyFlink DataStream 
KeyedProcessFunction/event-time-timer coverage dormant since Sept 2024.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to