[ 
https://issues.apache.org/jira/browse/FLINK-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497996#comment-17497996
 ] 

Dawid Wysakowicz edited comment on FLINK-26301 at 2/25/22, 9:06 AM:
--------------------------------------------------------------------

> I didn't see any conflict. It seems the user can still use AvroReadSupport to set 
> the projection. It is a question of how to use the AvroParquet reader itself. Do you 
> mean we should write it into our documentation?

How can I use the {{AvroReadSupport}}? There is no method to pass a configuration 
in either {{AvroParquetReaders}} or {{AvroParquetRecordFormat}}.

> It makes sense to mark AvroParquetReaders as Experimental. The class 
> AvroParquetRecordFormat is used via the PublicEvolving interface 
> StreamFormat. Why should we use Experimental in this case?

{{Experimental}} is used to mark features that are in the early stages of 
development and whose readiness for production use we are not confident about. 
After playing around with {{AvroParquetReaders}}, I find it not ready for 
production use. I don't see any relation to {{StreamFormat}} here. For example, 
not all connectors need to be PublicEvolving from day one just because the 
Source API is PublicEvolving.


> Test AvroParquet format
> -----------------------
>
>                 Key: FLINK-26301
>                 URL: https://issues.apache.org/jira/browse/FLINK-26301
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>            Reporter: Jing Ge
>            Assignee: Dawid Wysakowicz
>            Priority: Blocker
>              Labels: release-testing
>             Fix For: 1.15.0
>
>
> The following scenarios are worthwhile to test
>  * Start a simple job with none/at-least-once/exactly-once delivery guarantee 
> that reads Avro Generic/Specific/Reflect records and writes them to an 
> arbitrary sink.
>  * Start the above job with bounded/unbounded data.
>  * Start the above job with streaming/batch execution mode.
>  
> This format works with FileSource [2] and can only be used with DataStream. 
> Normal Parquet files can be used as test files. The schema introduced at [1] 
> could be used.
>  
> References:
> [1] [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/parquet/]
> [2] [https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/filesystem/]
>  
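
The scenarios listed in the issue description can be read as a small test matrix. The sketch below enumerates it in plain Java, assuming the four dimensions are meant to be combined with each other. The class name {{AvroParquetTestMatrix}} and the method {{scenarios()}} are hypothetical and not part of Flink. It also assumes that batch execution mode is only paired with bounded input, so those combinations are skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: lists the test combinations
// named in the issue description as human-readable strings.
public class AvroParquetTestMatrix {

    /** Builds every combination of the four dimensions from the issue description. */
    static List<String> scenarios() {
        String[] guarantees = {"none", "at-least-once", "exactly-once"};
        String[] recordTypes = {"Generic", "Specific", "Reflect"};
        String[] inputs = {"bounded", "unbounded"};
        String[] modes = {"streaming", "batch"};

        List<String> result = new ArrayList<>();
        for (String guarantee : guarantees) {
            for (String recordType : recordTypes) {
                for (String input : inputs) {
                    for (String mode : modes) {
                        // Assumption: batch execution is only paired with bounded input.
                        if (mode.equals("batch") && input.equals("unbounded")) {
                            continue;
                        }
                        result.add(String.join(" / ", guarantee, recordType, input, mode));
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> all = scenarios();
        all.forEach(System.out::println);
        System.out.println(all.size() + " scenarios");
    }
}
```

Running {{main}} prints one line per combination, which can serve as a checklist while testing. Dropping the batch/unbounded filter would give the full cross product instead.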



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
