[ 
https://issues.apache.org/jira/browse/FLINK-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497568#comment-17497568
 ] 

Jing Ge edited comment on FLINK-26301 at 2/24/22, 5:08 PM:
-----------------------------------------------------------

> I am not saying you should use AvroReadSupport instead of AvroParquetReader. 
> I am talking about using a static method to properly set up the configuration 
> passed to AvroParquetReader: https://stackoverflow.com/a/36871563/4250114

> In the end, what I advocate is to explicitly specify what this format is good 
> for, what the limitations are, and how to use it (where to get the schema 
> from).

I don't see any conflict. It seems users can still use AvroReadSupport to set 
the projection. It is a matter of how to use AvroParquetReader itself. Do you 
mean we should write this into our documentation?


> Test AvroParquet format
> -----------------------
>
>                 Key: FLINK-26301
>                 URL: https://issues.apache.org/jira/browse/FLINK-26301
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>            Reporter: Jing Ge
>            Assignee: Dawid Wysakowicz
>            Priority: Blocker
>              Labels: release-testing
>             Fix For: 1.15.0
>
>
> The following scenarios are worth testing:
>  * Start a simple job with none/at-least-once/exactly-once delivery guarantee 
> that reads Avro Generic/Specific/Reflect records and writes them to an 
> arbitrary sink.
>  * Start the above job with bounded/unbounded data.
>  * Start the above job with streaming/batch execution mode.
>  
> This format works with FileSource [2] and can only be used with DataStream. 
> Regular Parquet files can be used as test files. The schema introduced at [1] 
> could be used.
>  
> References:
> [1] [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/parquet/]
> [2] [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/filesystem/]
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
