[VOTE] Release Apache Parquet Format 2.10.0 RC0

2023-11-15 Thread Gang Wu
Hi everyone, I propose the following RC to be released as the official Apache Parquet Format 2.10.0 release. The commit id is b9c4fa81c3be13dc98760c92b037fa4dd465cef8 * This corresponds to the tag: apache-parquet-format-2.10.0-rc0 * https://github.com/apache/parquet-format/tree/b9c4fa81c3be13dc98

[jira] [Resolved] (PARQUET-2379) [Format] Update changelog for 2.10.0

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2379. -- Fix Version/s: format-2.10.0 Resolution: Fixed > [Format] Update changelog for 2.10.0 > --

[jira] [Commented] (PARQUET-2379) [Format] Update changelog for 2.10.0

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786603#comment-17786603 ] ASF GitHub Bot commented on PARQUET-2379: - wgtmac merged PR #219: URL: https://

Re: [PR] PARQUET-2379: [Format] Update changelog for 2.10.0 [parquet-format]

2023-11-15 Thread via GitHub
wgtmac merged PR #219: URL: https://github.com/apache/parquet-format/pull/219 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@parquet.ap

[jira] [Commented] (PARQUET-2379) [Format] Update changelog for 2.10.0

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786600#comment-17786600 ] ASF GitHub Bot commented on PARQUET-2379: - wgtmac commented on PR #219: URL: ht

Re: [PR] PARQUET-2379: [Format] Update changelog for 2.10.0 [parquet-format]

2023-11-15 Thread via GitHub
wgtmac commented on PR #219: URL: https://github.com/apache/parquet-format/pull/219#issuecomment-1813747211 Thanks @mapleFU @Fokko @gszadovszky! I'll merge this.

[jira] [Commented] (PARQUET-2380) Decouple RewriteOptions from Hadoop classes

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786489#comment-17786489 ] ASF GitHub Bot commented on PARQUET-2380: - amousavigourabi opened a new pull re

[PR] PARQUET-2380: Decouple rewriter from Hadoop [parquet-mr]

2023-11-15 Thread via GitHub
amousavigourabi opened a new pull request, #1195: URL: https://github.com/apache/parquet-mr/pull/1195 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references

[jira] [Commented] (PARQUET-2374) Add metrics support for parquet file reader

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786460#comment-17786460 ] ASF GitHub Bot commented on PARQUET-2374: - parthchandra commented on PR #1187:

Re: [PR] PARQUET-2374: Add metrics support for parquet file reader [parquet-mr]

2023-11-15 Thread via GitHub
parthchandra commented on PR #1187: URL: https://github.com/apache/parquet-mr/pull/1187#issuecomment-1813018561 Thank you @wgtmac !

[jira] [Commented] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786452#comment-17786452 ] ASF GitHub Bot commented on PARQUET-2382: - Fokko commented on code in PR #1194:

Re: [PR] PARQUET-2382: Remove the deprecated OriginalType [parquet-mr]

2023-11-15 Thread via GitHub
Fokko commented on code in PR #1194: URL: https://github.com/apache/parquet-mr/pull/1194#discussion_r1394558775 ## parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java: ## @@ -1715,15 +1729,15 @@ private void buildChildren(Types.GroupBui

[jira] [Commented] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786450#comment-17786450 ] ASF GitHub Bot commented on PARQUET-2382: - Fokko opened a new pull request, #11

[PR] PARQUET-2382: Remove the deprecated OriginalType [parquet-mr]

2023-11-15 Thread via GitHub
Fokko opened a new pull request, #1194: URL: https://github.com/apache/parquet-mr/pull/1194 For Iceberg we're adding nanosecond timestamps. During my investigation in Parquet, I noticed that there are two ways of declaring logical types:

[jira] [Created] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-15 Thread Fokko Driesprong (Jira)
Fokko Driesprong created PARQUET-2382: - Summary: Remove the deprecated OriginalType Key: PARQUET-2382 URL: https://issues.apache.org/jira/browse/PARQUET-2382 Project: Parquet Issue Type:

[jira] [Commented] (PARQUET-2379) [Format] Update changelog for 2.10.0

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786414#comment-17786414 ] ASF GitHub Bot commented on PARQUET-2379: - wgtmac commented on PR #219: URL: ht

Re: [PR] PARQUET-2379: [Format] Update changelog for 2.10.0 [parquet-format]

2023-11-15 Thread via GitHub
wgtmac commented on PR #219: URL: https://github.com/apache/parquet-format/pull/219#issuecomment-1812785646 I have updated the status of all JIRAs belonging to the release. The changelog is updated as well. Please take a look again. Thanks!

[jira] [Updated] (PARQUET-2221) [Format] Encoding spec incorrect for dictionary fallback

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2221: - Fix Version/s: (was: format-2.10.0) > [Format] Encoding spec incorrect for dictionary fallback > --

[jira] [Resolved] (PARQUET-2313) Bump actions/setup-java from 1 to 3

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2313. -- Assignee: Gang Wu Resolution: Fixed > Bump actions/setup-java from 1 to 3 > ---

[jira] [Resolved] (PARQUET-2344) Bump to Thrift 0.19.0

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2344. -- Resolution: Fixed > Bump to Thrift 0.19.0 > - > > Key: PARQUET-23

[jira] [Resolved] (PARQUET-2287) Bump maven-shade-plugin from 2.2 to 3.4.1

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2287. -- Fix Version/s: format-2.10.0 (was: 1.14.0) Resolution: Fixed > Bump mav

[jira] [Resolved] (PARQUET-2286) Bump apache-rat-plugin from 0.12 to 0.15

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2286. -- Fix Version/s: format-2.10.0 (was: 1.14.0) Resolution: Fixed > Bump apa

[jira] [Resolved] (PARQUET-2285) Add dependabot

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2285. -- Fix Version/s: format-2.10.0 (was: 1.14.0) Resolution: Fixed > Add depe

[jira] [Resolved] (PARQUET-2284) Bump junit from 4.10 to 4.13.1

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2284. -- Fix Version/s: format-2.10.0 (was: 1.14.0) Resolution: Fixed > Bump jun

[jira] [Resolved] (PARQUET-2270) Bump Thrift to 0.18.1

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2270. -- Resolution: Fixed > Bump Thrift to 0.18.1 > - > > Key: PARQUET-22

[jira] [Resolved] (PARQUET-2271) Bump Parquet POM to 29

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2271. -- Resolution: Fixed > Bump Parquet POM to 29 > -- > > Key: PARQUET-

[jira] [Updated] (PARQUET-2005) Upgrade thrift to 0.14.1

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2005: - Fix Version/s: format-2.10.0 > Upgrade thrift to 0.14.1 > > >

[jira] [Resolved] (PARQUET-2369) Clarify Support for Pages Compressed with Multiple GZIP Members

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2369. -- Assignee: Raphael Taylor-Davies Resolution: Fixed > Clarify Support for Pages Compressed with M

[jira] [Assigned] (PARQUET-2264) Update specification to allow DecimalType scale == precision

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2264: Assignee: Devin Smith > Update specification to allow DecimalType scale == precision > -

[jira] [Resolved] (PARQUET-2264) Update specification to allow DecimalType scale == precision

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2264. -- Fix Version/s: format-2.10.0 Resolution: Fixed > Update specification to allow DecimalType sca

[jira] [Updated] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2241: - Fix Version/s: format-2.10.0 > ByteStreamSplitDecoder broken in presence of nulls > ---

[jira] [Resolved] (PARQUET-2215) Document how DELTA_BINARY_PACKED handles overflow for deltas

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2215. -- Fix Version/s: format-2.10.0 Resolution: Fixed > Document how DELTA_BINARY_PACKED handles over

[jira] [Updated] (PARQUET-2215) Document how DELTA_BINARY_PACKED handles overflow for deltas

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2215: - Issue Type: Improvement (was: New Feature) > Document how DELTA_BINARY_PACKED handles overflow for del

[jira] [Updated] (PARQUET-2257) [Format] Add bloom_filter_length to ColumnMetaData

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2257: - Issue Type: Improvement (was: New Feature) > [Format] Add bloom_filter_length to ColumnMetaData >

[jira] [Updated] (PARQUET-2261) [Format] Add statistics that reflect decoded size to metadata

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2261: - Issue Type: New Feature (was: Improvement) > [Format] Add statistics that reflect decoded size to meta

[jira] [Updated] (PARQUET-758) [Format] HALF precision FLOAT Logical type

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-758: Issue Type: New Feature (was: Improvement) > [Format] HALF precision FLOAT Logical type > ---

[jira] [Updated] (PARQUET-758) [Format] HALF precision FLOAT Logical type

2023-11-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-758: Fix Version/s: format-2.10.0 > [Format] HALF precision FLOAT Logical type > --

[jira] [Created] (PARQUET-2381) Deprecate methods relying on Hadoop classes when alternatives using more generic Parquet interfaces are available

2023-11-15 Thread Atour Mousavi Gourabi (Jira)
Atour Mousavi Gourabi created PARQUET-2381: -- Summary: Deprecate methods relying on Hadoop classes when alternatives using more generic Parquet interfaces are available Key: PARQUET-2381 URL: https://issue

[jira] [Created] (PARQUET-2380) Decouple RewriteOptions from Hadoop classes

2023-11-15 Thread Atour Mousavi Gourabi (Jira)
Atour Mousavi Gourabi created PARQUET-2380: -- Summary: Decouple RewriteOptions from Hadoop classes Key: PARQUET-2380 URL: https://issues.apache.org/jira/browse/PARQUET-2380 Project: Parquet

[jira] [Updated] (PARQUET-2369) Clarify Support for Pages Compressed with Multiple GZIP Members

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated PARQUET-2369: Labels: pull-request-available (was: ) > Clarify Support for Pages Compressed with Multi

[jira] [Updated] (PARQUET-2369) Clarify Support for Pages Compressed with Multiple GZIP Members

2023-11-15 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated PARQUET-2369: Fix Version/s: format-2.10.0 > Clarify Support for Pages Compressed with Multiple GZIP Me

[jira] [Updated] (PARQUET-2369) Clarify Support for Pages Compressed with Multiple GZIP Members

2023-11-15 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated PARQUET-2369: Component/s: parquet-format > Clarify Support for Pages Compressed with Multiple GZIP Mem

[jira] [Updated] (PARQUET-2369) Clarify Support for Pages Compressed with Multiple GZIP Members

2023-11-15 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated PARQUET-2369: Priority: Major (was: Trivial) > Clarify Support for Pages Compressed with Multiple GZIP

[jira] [Commented] (PARQUET-2379) [Format] Update changelog for 2.10.0

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786246#comment-17786246 ] ASF GitHub Bot commented on PARQUET-2379: - gszadovszky commented on PR #219: UR

Re: [PR] PARQUET-2379: [Format] Update changelog for 2.10.0 [parquet-format]

2023-11-15 Thread via GitHub
gszadovszky commented on PR #219: URL: https://github.com/apache/parquet-format/pull/219#issuecomment-1812041743 @wgtmac, When I am doing a release, I usually start with the list of commits, as you've done. Then I update the Jira accordingly (we usually forget to set the proper target release).

[jira] [Commented] (PARQUET-2379) [Format] Update changelog for 2.10.0

2023-11-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786240#comment-17786240 ] ASF GitHub Bot commented on PARQUET-2379: - wgtmac commented on PR #219: URL: ht

Re: [PR] PARQUET-2379: [Format] Update changelog for 2.10.0 [parquet-format]

2023-11-15 Thread via GitHub
wgtmac commented on PR #219: URL: https://github.com/apache/parquet-format/pull/219#issuecomment-1812023825 Thanks for the suggestion! @gszadovszky PARQUET- has two different commits associated with it. Let me consolidate them. I just scanned all commits since last release