[ https://issues.apache.org/jira/browse/FLINK-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498109#comment-17498109 ]
Jing Ge edited comment on FLINK-26301 at 2/25/22, 12:46 PM:
------------------------------------------------------------

> How can I use the AvroReadSupport? There is no method to pass configuration
> in AvroParquetReaders nor in AvroParquetRecordFormat

Got your point. Previously, I thought it was possible to use a schema like ClassB.SCHEMA$ in https://stackoverflow.com/a/36871563/4250114 to enable projection, but that only works for GenericRecord. I would suggest supporting projection in the next release. I will document it.

> Experimental is used to mark features that are in early stages of development
> and we are not confident about its readiness for production use. After
> playing around with AvroParquetReaders I find it not ready for production
> use. I don't see any relation with StreamFormat here. E.g. not all connectors
> need to be PublicEvolving from day one, just because the Source API is
> PublicEvolving.

As I said previously, I agree to mark AvroParquetReaders as Experimental. AvroParquetRecordFormat will be changed to package private.

Many thanks for the feedback. The questions are excellent and valuable.


was (Author: jingge):

> How can I use the AvroReadSupport? There is no method to pass configuration
> in AvroParquetReaders nor in AvroParquetRecordFormat

Got your point. Previously, I thought it was possible to use a schema like ClassB.SCHEMA$ in https://stackoverflow.com/a/36871563/4250114 to enable projection, but that only works for GenericRecord. I would suggest supporting projection in the next release. I will document it.

> Experimental is used to mark features that are in early stages of development
> and we are not confident about its readiness for production use. After
> playing around with AvroParquetReaders I find it not ready for production
> use. I don't see any relation with StreamFormat here. E.g. not all connectors
> need to be PublicEvolving from day one, just because the Source API is
> PublicEvolving.

As I said previously, I agree to mark AvroParquetReaders as Experimental. AvroParquetRecordFormat will be package private.

Many thanks for the feedback. The questions are excellent and valuable.


> Test AvroParquet format
> -----------------------
>
>                 Key: FLINK-26301
>                 URL: https://issues.apache.org/jira/browse/FLINK-26301
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>            Reporter: Jing Ge
>            Assignee: Dawid Wysakowicz
>            Priority: Blocker
>              Labels: release-testing
>             Fix For: 1.15.0
>
>
> The following scenarios are worthwhile to test
> * Start a simple job with None/At-least-once/exactly-once delivery guarantee,
> read Avro Generic/Specific/Reflect records, and write them to an arbitrary
> sink.
> * Start the above job with bounded/unbounded data.
> * Start the above job with streaming/batch execution mode.
>
> This format works with FileSource[2] and can only be used with DataStream.
> Normal parquet files can be used as test files. Schema introduced at [1]
> could be used.
>
> Reference:
> [1] [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/parquet/]
> [2] [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/filesystem/]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)