Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-27 Thread via GitHub


JingGe merged PR #24795:
URL: https://github.com/apache/flink/pull/24795


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-27 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2133329919

   @JingGe  CI Passed, I dindn't realize the bot just editted the same comment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-24 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2130263489

   > @ViktorCosenza did you rebase?
   
   Yes, i did an interactive rebase and then force-pushed the squashed commits


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-24 Thread via GitHub


JingGe commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2130068966

   @ViktorCosenza did you rebase?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-24 Thread via GitHub


JingGe commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2130067792

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-24 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2129462557

   @JingGe Are you able to trigger the CI manually? I think I't wasnt triggered 
after the squash because no changes were detected.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-23 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2126981554

   > Thanks for the hint! Would you mind if I picked up some of your code to 
paimon?
   Sure, go ahead!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-23 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2126972794

   > @ViktorCosenza would you please squash your commits and rebase it? Once it 
passed the CI, I will merge the PR. Thanks!
   
   Done! But I wasn't able to trigger the CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-23 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2126966568

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-23 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2126965517

   @flinkbot run


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-23 Thread via GitHub


JingGe commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2126441145

   > > > Do you have any hints about the background info?
   > > 
   > > 
   > > Not really, I found out this issue because we were trying to save 
Parquet files to S3 and the writer would fail due to empty array being invalid 
in Parquet. When stepping in the debugger I realized that the methods were 
empty.
   > > After digging a bit i found 2 PRs here that jointly solved the problem, 
but both inactive. I simply created this PR merging both of them to facilitate 
the review/merge process
   > 
   > Also, as @xccui mentioned, there is a very similar issue/code with methods 
missing on apache-paimon, so maybe this code was copied around a bit without 
these methods
   
   Thanks for the hint! Would you mind if I pick up some of your code to paimon?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-23 Thread via GitHub


JingGe commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2126427307

   @ViktorCosenza would you please squash your commits and rebase it? Once it 
passed the CI, I will merge the PR. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-23 Thread via GitHub


JingGe commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2126415976

   I talked with @JingsongLi offline. It seems that at that time there was no 
such requirement and therefore skipped those implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-22 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2125705836

   I see, Ive got the impression that they were forgotten, not purposely left
   out because no tests covered writing nested structures ( if there were, the
   tests would fail and the methods would have been implemented before)
   
   Did you got the impression it wasn’t added on purpose? I could add more
   tests if you think it would help
   
   
   On Wed, 22 May 2024 at 17:26 Jing Ge ***@***.***> wrote:
   
   > It looks like those methods were skipped on purpose in #17542
   >  and #17542 (comment)
   > 
   >
   > —
   > Reply to this email directly, view it on GitHub
   > , or
   > unsubscribe
   > 

   > .
   > You are receiving this because you were mentioned.Message ID:
   > ***@***.***>
   >
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-22 Thread via GitHub


JingGe commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2125686671

   It looks like those methods were skipped on purpose in #17542 and 
https://github.com/apache/flink/pull/17542#issuecomment-1954552466


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-22 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2125670540

   > > Do you have any hints about the background info?
   > 
   > Not really, I found out this issue because we were trying to save Parquet 
files to S3 and the writer would fail due to empty array being invalid in 
Parquet. When stepping in the debugger I realized that the methods were empty.
   > 
   > After digging a bit i found 2 PRs here that jointly solved the problem, 
but both inactive. I simply created this PR merging both of them to facilitate 
the review/merge process
   
   Also, as @xccui mentioned, there is a very similar issue/code with methods 
missing on apache-paimon, so maybe this code was copied around a bit without 
these methods


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-22 Thread via GitHub


JingGe commented on code in PR #24795:
URL: https://github.com/apache/flink/pull/24795#discussion_r1610586619


##
flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/row/ParquetRowDataWriter.java:
##
@@ -381,9 +381,16 @@ private MapWriter(LogicalType keyType, LogicalType 
valueType, GroupType groupTyp
 
 @Override
 public void write(RowData row, int ordinal) {
-recordConsumer.startGroup();
+writeMapData(row.getMap(ordinal));
+}
 
-MapData mapData = row.getMap(ordinal);
+@Override
+public void write(ArrayData arrayData, int ordinal) {

Review Comment:
   NIT: Not sure why those write(ArrayData arrayData, int ordinal) methods 
didn't get implemented. It should be trivial to do it while building those 
Writers. Do you have any hints about the background info?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-22 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2125663861

   > Do you have any hints about the background info?
   
   Not really, I found out this issue because we were trying to save Parquet 
files to S3 and the writer would fail due to empty array being invalid in 
Parquet. When stepping in the debugger I realized that the methods were empty.
   
   After digging a bit i found 2 PRs here that jointly solved the problem, but 
both inactive. I simply created this PR merging both of them to facilitate the 
review/merge process
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-22 Thread via GitHub


JingGe commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2125655594

   Thanks @ViktorCosenza for taking care of it. The PR looks overall good. Not 
sure why those `write(ArrayData arrayData, int ordinal)` methods didn't get 
implemented. It should be trivial to do it while building those Writers. Do you 
have any hints about the background info?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-22 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2124999154

   > Thanks! The PR looks good to me. Hi @JingsongLi, please also take a look. 
I think we can port this to fix 
[apache/paimon#1730](https://github.com/apache/paimon/issues/1730)
   
   Hey @xccui, thanks for the quick review! Just checking in to keep this PR 
alive.
   Is there anything I can do to help get this merged?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-15 Thread via GitHub


flinkbot commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2113596687

   
   ## CI report:
   
   * c3b588c4a44d135ccebe6acafa75e3b1a9aa4895 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org