dawidwys commented on issue #8387: [FLINK-11982] [table] File System 
Connector's support JSON Format and JSON file BatchTableSource
URL: https://github.com/apache/flink/pull/8387#issuecomment-491266766
 
 
   Hi @ambition119 
   I agree with @StephanEwen that this PR mixes the concepts of 
`connector` (file) and `format` (json).
   
   I don't quite understand the comment that `JsonRowFormatFactory` does 
not support the File System Connector. First, there is no file system connector; 
second, this is a format factory, so it should have no notion of a 
connector.
   
   I think you could rework your `JsonBatchTableSource` and 
`JsonRowInputFormat` to accept any `DeserializationSchema<Row>` similar to how 
`org.apache.flink.streaming.connectors.kafka.KafkaTableSourceBase` works.
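
   To illustrate the split I have in mind, here is a minimal, self-contained sketch. It does not use Flink classes: `RecordDeserializer` is a hypothetical stand-in that mirrors only the `deserialize(byte[])` method of `DeserializationSchema<T>` (the real method also declares `IOException`), and `LineSource` and `CommaFormat` are made-up names, not part of this PR. The point is only that the reading side takes the format as a constructor argument and knows nothing about JSON.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FormatAgnosticSourceSketch {

    /** Hypothetical stand-in for the deserialize(byte[]) part of DeserializationSchema<T>. */
    interface RecordDeserializer<T> {
        T deserialize(byte[] message);
    }

    /** Connector side: splits text into lines and delegates decoding to the given format. */
    static final class LineSource<T> {
        private final RecordDeserializer<T> deserializer;

        LineSource(RecordDeserializer<T> deserializer) {
            this.deserializer = deserializer;
        }

        List<T> readAll(String fileContent) {
            List<T> rows = new ArrayList<>();
            for (String line : fileContent.split("\n")) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                rows.add(deserializer.deserialize(line.getBytes(StandardCharsets.UTF_8)));
            }
            return rows;
        }
    }

    /** Format side: a toy format (assumption for illustration) that splits a record on commas. */
    static final class CommaFormat implements RecordDeserializer<List<String>> {
        @Override
        public List<String> deserialize(byte[] message) {
            return Arrays.asList(new String(message, StandardCharsets.UTF_8).split(","));
        }
    }

    public static void main(String[] args) {
        // The same LineSource works with any RecordDeserializer, e.g. a JSON one.
        LineSource<List<String>> source = new LineSource<>(new CommaFormat());
        System.out.println(source.readAll("a,b\nc,d\n"));
    }
}
```

   With this shape, swapping the format means passing a different deserializer; the source itself stays unchanged.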
   
   Another issue, on a more conceptual level, is that the proposed 
`RowInputFormat` assumes that each line in a file is a separate record. I agree 
this is probably the most common case, but it is not the only one. 
   Files can also be written with a different layout (e.g. Parquet), so we 
should make that distinction at the connector level as well.
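
   A rough sketch of what keeping the record layout separate could look like, again without Flink classes. `RecordLayout`, `DelimitedLayout` and `LengthPrefixedLayout` are hypothetical names, and the length-prefixed layout is only a toy example of a non-line layout; it is not how Parquet stores data. Each layout just yields raw record bytes, which could then be handed to any format.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RecordLayoutSketch {

    /** Hypothetical connector-level abstraction: turns raw file bytes into record bytes. */
    interface RecordLayout {
        List<byte[]> split(byte[] fileContent);
    }

    /** Records separated by a configurable single-byte delimiter, e.g. '\n'. */
    static final class DelimitedLayout implements RecordLayout {
        private final byte delimiter;

        DelimitedLayout(byte delimiter) {
            this.delimiter = delimiter;
        }

        @Override
        public List<byte[]> split(byte[] content) {
            List<byte[]> records = new ArrayList<>();
            int start = 0;
            for (int i = 0; i <= content.length; i++) {
                boolean atEnd = (i == content.length);
                if (atEnd || content[i] == delimiter) {
                    if (i > start) { // drop empty records
                        records.add(Arrays.copyOfRange(content, start, i));
                    }
                    start = i + 1;
                }
            }
            return records;
        }
    }

    /** Each record is preceded by a 4-byte big-endian length, so it may contain any byte. */
    static final class LengthPrefixedLayout implements RecordLayout {
        @Override
        public List<byte[]> split(byte[] content) {
            List<byte[]> records = new ArrayList<>();
            ByteBuffer buffer = ByteBuffer.wrap(content);
            while (buffer.remaining() >= 4) {
                int length = buffer.getInt();
                if (length < 0 || length > buffer.remaining()) {
                    throw new IllegalArgumentException("Truncated or corrupt record");
                }
                byte[] record = new byte[length];
                buffer.get(record);
                records.add(record);
            }
            return records;
        }
    }
}
```

   A record containing a newline breaks the delimited layout but not the length-prefixed one, which is why the layout is a separate decision from the format.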
   
   I'm guessing you might have been inspired by `CsvTableSource`, but it was 
one of the first `TableSource`s, written before we decided to split 
connectors and formats, and its design is flawed. It also uses 
`RowCsvInputFormat`, which, as I said in the previous paragraph, applies custom 
block (in this case line) splitting based on a configurable delimiter.
