[jira] [Assigned] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-25 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2347: Assignee: Atour Mousavi Gourabi > Add interface layer between Parquet and Hadoop Configuration >

[jira] [Updated] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-25 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2347: Fix Version/s: 1.14.0 > Add interface layer between Parquet and Hadoop Configuration >

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779751#comment-17779751 ] ASF GitHub Bot commented on PARQUET-2347: wgtmac commented on PR #1141: URL: h

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-25 Thread via GitHub
wgtmac commented on PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#issuecomment-1780440620 I will merge this if there is no more comment by the end of this week. Thanks @amousavigourabi for working on this! -- This is an automated message from the Apache Git Service. To res

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779747#comment-17779747 ] ASF GitHub Bot commented on PARQUET-2366: wgtmac commented on PR #1174: URL: h

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-25 Thread via GitHub
wgtmac commented on PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#issuecomment-1780438406 cc @gszadovszky @ggershinsky @shangxinli if this interests you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: Max bitwidth for delta encoding

2023-10-25 Thread wish maple
Hi Ed, As [1] says, DELTA_BINARY_PACKED might not be suitable for all cases. [3] also discusses the same problem. I think that, because we already have data like this, this could be kept compatible. Also, [2] introduces some optimizations for DELTA_BINARY_PACKED. Besides, maybe we can introduce PFor m

Re: Max bitwidth for delta encoding

2023-10-25 Thread Gang Wu
Hi Ed, My concern with changing the spec is that existing writer implementations have already produced Parquet files of the kind the change intends to avoid. So it would take a long time to deprecate the old writers, while any reader implementation should always be able to decode legacy files. Best, Gang On W

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779723#comment-17779723 ] ASF GitHub Bot commented on PARQUET-2366: ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-25 Thread via GitHub
ConeyLiu commented on code in PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#discussion_r1372506213 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/IndexCache.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [VOTE][Format] Add Float16 type to specification

2023-10-25 Thread Ben Harkins
For anyone that didn't see the [RESULT] thread [1], this vote has passed. [1] https://lists.apache.org/thread/odm5pmxssyd9kw1wvgdkg8hd044czqnk On Tue, Oct 10, 2023 at 7:01 AM Uwe L. Korn wrote: > +1 (binding) > > On Sat, Oct 7, 2023, at 5:49 AM, Daniel Weeks wrote: > > +1 > > > > On Fri, Oct 6,

[jira] [Commented] (PARQUET-2365) Fixes NPE when rewriting column without column index

2023-10-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779550#comment-17779550 ] ASF GitHub Bot commented on PARQUET-2365: wgtmac commented on code in PR #1173

Re: [PR] PARQUET-2365 : Fixes NPE when rewriting column without column index [parquet-mr]

2023-10-25 Thread via GitHub
wgtmac commented on code in PR #1173: URL: https://github.com/apache/parquet-mr/pull/1173#discussion_r1371930026 ## parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java: ## @@ -543,6 +546,11 @@ public static ColumnIndex build( *

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779539#comment-17779539 ] ASF GitHub Bot commented on PARQUET-2366: wgtmac commented on code in PR #1174

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-25 Thread via GitHub
wgtmac commented on code in PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#discussion_r1371883961 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/IndexCache.java: ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

[jira] [Commented] (PARQUET-758) [Format] HALF precision FLOAT Logical type

2023-10-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779369#comment-17779369 ] ASF GitHub Bot commented on PARQUET-758: gszadovszky commented on PR #184: URL:

Re: [PR] PARQUET-758: Add Float16/Half-float logical type [parquet-format]

2023-10-25 Thread via GitHub
gszadovszky commented on PR #184: URL: https://github.com/apache/parquet-format/pull/184#issuecomment-1778641456 @benibus, could you officially close the vote on the mailing list so it is clear that it has passed? @anjakefala, since we have 3 approvals already on this PR, any committer ca