[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-05-31 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354506#comment-17354506
 ] 

Etienne Chauchot commented on FLINK-21393:
--

[~AHeise] can you assign the ticket to me?

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-03-22 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17306084#comment-17306084
 ] 

Etienne Chauchot commented on FLINK-21393:
--

[~rmetzger] I just realized that I'm not assigned to the ticket despite the PR 
I submitted. Can you please assign it to me?

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-03-11 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299670#comment-17299670
 ] 

Etienne Chauchot commented on FLINK-21393:
--

Hi [~Wosinsan], I just submitted the PR as promised. I'd be happy if you wanted 
to take a look.

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-03-10 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298732#comment-17298732
 ] 

Etienne Chauchot commented on FLINK-21393:
--

Hi all, just a quick update: I'm back from vacation, and I discovered a problem 
with field projection, either in my new code or in ParquetInputFormat. I should 
submit the PR by the end of the week, once it's fixed.

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-02-24 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289941#comment-17289941
 ] 

Etienne Chauchot commented on FLINK-21393:
--

[~Wosinsan] I'd be happy if you wanted to be part of the reviewers. I'll ping 
you when the PR is ready.

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-02-23 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289729#comment-17289729
 ] 

Timo Walther commented on FLINK-21393:
--

What is the status of the new source interfaces? Is it worth implementing a 
`ParquetInputFormat`? As far as I know, the `InputFormat` interface will not be 
maintained in the mid-term. [~jqin] [~sewen] do you know more?

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-02-23 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289722#comment-17289722
 ] 

Etienne Chauchot commented on FLINK-21393:
--

[~Wosinsan] I'm sorry I started implementing this even though you offered to do 
it; I needed it for a benchmark that has to be done by the end of the week.

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-02-23 Thread Dominik Wosiński (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289420#comment-17289420
 ] 

Dominik Wosiński commented on FLINK-21393:
--

[~rmetzger] feel free to assign it to [~echauchot], since most of the work is 
done in his PR already ;)

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-02-23 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289277#comment-17289277
 ] 

Robert Metzger commented on FLINK-21393:
--

I'm not sure how to decide who's going to implement this improvement. First 
come, first served?

I'm also not really able to review such a PR. [~dwysakowicz] or [~twalthr], 
could you take a look at this and review the PR once it's there?

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-02-23 Thread Etienne Chauchot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289178#comment-17289178
 ] 

Etienne Chauchot commented on FLINK-21393:
--

[~rmetzger] Hi, can you please assign this to me? I have a PR almost ready (it 
lacks the tests).


> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.





[jira] [Commented] (FLINK-21393) Implement ParquetAvroInputFormat

2021-02-18 Thread Dominik Wosiński (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286461#comment-17286461
 ] 

Dominik Wosiński commented on FLINK-21393:
--

Happy to add that.

> Implement ParquetAvroInputFormat 
> -
>
> Key: FLINK-21393
> URL: https://issues.apache.org/jira/browse/FLINK-21393
> Project: Flink
>  Issue Type: Improvement
>Reporter: Dominik Wosiński
>Priority: Minor
> Fix For: 1.13.0
>
>
>  Currently, there are several classes extending ParquetInputFormat like 
> `ParquetPojoInputFormat` or `ParquetRowInputFormat`, but there is no class 
> that would allow us to read the parquet Avro without creating additional 
> mappings and so on.


