[ https://issues.apache.org/jira/browse/FLINK-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498056#comment-17498056 ]
Dawid Wysakowicz commented on FLINK-26301:
------------------------------------------

A very basic job works fine. I created parquet files with {{StreamingFileSink}} containing some random data. I read that data back using the {{AvroParquetReaders}} format. I ran the job in STREAMING and BATCH mode, with and without file monitoring. While running the job in STREAMING mode with checkpoints enabled, I killed one TaskManager and the job was restored fine.

> Test AvroParquet format
> -----------------------
>
>                 Key: FLINK-26301
>                 URL: https://issues.apache.org/jira/browse/FLINK-26301
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>            Reporter: Jing Ge
>            Assignee: Dawid Wysakowicz
>            Priority: Blocker
>              Labels: release-testing
>             Fix For: 1.15.0
>
>
> The following scenarios are worthwhile to test:
> * Start a simple job with a none/at-least-once/exactly-once delivery guarantee, read Avro Generic/Specific/Reflect records, and write them to an arbitrary sink.
> * Start the above job with bounded/unbounded data.
> * Start the above job in streaming/batch execution mode.
>
> This format works with FileSource [2] and can only be used with the DataStream API. Normal parquet files can be used as test files. The schema introduced at [1] could be used.
>
> References:
> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/parquet/
> [2] https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/filesystem/

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
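A minimal sketch of the kind of verification job described above (not the exact code Dawid ran): it reads Avro {{GenericRecord}}s from parquet files via {{FileSource}} with {{AvroParquetReaders}} and sends them to an arbitrary sink. The schema, input path, and intervals are illustrative assumptions, not values from the test.

```java
import java.time.Duration;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AvroParquetReadTestJob {
    public static void main(String[] args) throws Exception {
        // Illustrative Avro schema; any schema matching the parquet files works.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"name\",\"type\":\"string\"},"
          + "{\"name\":\"favoriteNumber\",\"type\":\"int\"}]}");

        // FileSource wrapping the AvroParquet stream format.
        // monitorContinuously() makes the source unbounded (file monitoring);
        // omit it to get a bounded source for the BATCH-mode run.
        FileSource<GenericRecord> source = FileSource
            .forRecordStreamFormat(
                AvroParquetReaders.forGenericRecord(schema),
                new Path("/tmp/parquet-input"))          // hypothetical input dir
            .monitorContinuously(Duration.ofSeconds(10))
            .build();

        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000); // required for the TM-kill recovery check

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "parquet-source")
           .map(GenericRecord::toString) // arbitrary downstream transformation
           .print();                     // stand-in for "an arbitrary sink"

        env.execute("AvroParquet format release test");
    }
}
```

Killing a TaskManager while this runs in STREAMING mode with checkpointing enabled exercises the restore path mentioned in the comment.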