[jira] [Commented] (PARQUET-758) [Format] HALF precision FLOAT Logical type

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732886#comment-17732886 ] ASF GitHub Bot commented on PARQUET-758: gszadovszky commented on PR #184: URL:

[GitHub] [parquet-format] gszadovszky commented on pull request #184: PARQUET-758: Add Float16/Half-float logical type

2023-06-14 Thread via GitHub
gszadovszky commented on PR #184: URL: https://github.com/apache/parquet-format/pull/184#issuecomment-1592472334 I think that compatibility in Parquet file format is such a strong requirement that extending primitive types is simply not an option. (I agree though, if I would introduce a new

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732659#comment-17732659 ] ASF GitHub Bot commented on PARQUET-: - mapleFU opened a new pull request, #

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732660#comment-17732660 ] ASF GitHub Bot commented on PARQUET-: - mapleFU commented on PR #211: URL: h

[GitHub] [parquet-format] mapleFU commented on pull request #211: PARQUET-2222: Fix broken link for Plain Boolean

2023-06-14 Thread via GitHub
mapleFU commented on PR #211: URL: https://github.com/apache/parquet-format/pull/211#issuecomment-1591712882 @pitrou @gszadovszky @wgtmac This patch fixes the RLE link in PLAIN and also fixes all internal links. It can be seen in https://github.com/mapleFU/parquet-format/blob/parquet/f

[GitHub] [parquet-format] mapleFU opened a new pull request, #211: PARQUET-2222: Fix broken link for Plain Boolean

2023-06-14 Thread via GitHub
mapleFU opened a new pull request, #211: URL: https://github.com/apache/parquet-format/pull/211 Make sure you have checked _all_ steps below. ### Jira https://issues.apache.org/jira/browse/PARQUET- ### Commits - [x] My commits all reference Jira issues in their

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-14 Thread Xuwei Fu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732652#comment-17732652 ] Xuwei Fu commented on PARQUET-: --- Yes this answers my question. I think arrow parq

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732617#comment-17732617 ] ASF GitHub Bot commented on PARQUET-2249: - mapleFU commented on code in PR #196

[GitHub] [parquet-format] mapleFU commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-06-14 Thread via GitHub
mapleFU commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1229883979 ## src/main/thrift/parquet.thrift: ## @@ -966,6 +985,23 @@ struct ColumnIndex { /** A list containing the number of null values for each page **/ 5: option

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732614#comment-17732614 ] ASF GitHub Bot commented on PARQUET-2249: - pitrou commented on code in PR #196:

[GitHub] [parquet-format] pitrou commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-06-14 Thread via GitHub
pitrou commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1229881617 ## src/main/thrift/parquet.thrift: ## @@ -886,16 +891,25 @@ union ColumnOrder { * FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison * * (*) Because

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732613#comment-17732613 ] ASF GitHub Bot commented on PARQUET-2249: - pitrou commented on code in PR #196:

[GitHub] [parquet-format] pitrou commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-06-14 Thread via GitHub
pitrou commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1229880276 ## src/main/thrift/parquet.thrift: ## @@ -966,6 +985,23 @@ struct ColumnIndex { /** A list containing the number of null values for each page **/ 5: optiona

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732610#comment-17732610 ] ASF GitHub Bot commented on PARQUET-2249: - pitrou commented on code in PR #196:

[GitHub] [parquet-format] pitrou commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-06-14 Thread via GitHub
pitrou commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1229878920 ## src/main/thrift/parquet.thrift: ## @@ -966,6 +985,23 @@ struct ColumnIndex { /** A list containing the number of null values for each page **/ 5: optiona

[jira] [Commented] (PARQUET-758) [Format] HALF precision FLOAT Logical type

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732592#comment-17732592 ] ASF GitHub Bot commented on PARQUET-758: pitrou commented on PR #184: URL: https

[GitHub] [parquet-format] pitrou commented on pull request #184: PARQUET-758: Add Float16/Half-float logical type

2023-06-14 Thread via GitHub
pitrou commented on PR #184: URL: https://github.com/apache/parquet-format/pull/184#issuecomment-1591494344 > FWIW, I rather think it should be a physical type for the following reasons: > > * encodings are currently only defined on the physical type, not the logical one. So allowing

[jira] [Commented] (PARQUET-2270) Bump Thrift to 0.18.1

2023-06-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732544#comment-17732544 ] ASF GitHub Bot commented on PARQUET-2270: - steveloughran commented on PR #201:

[GitHub] [parquet-format] steveloughran commented on pull request #201: PARQUET-2270: Bump Thrift to 0.16.0

2023-06-14 Thread via GitHub
steveloughran commented on PR #201: URL: https://github.com/apache/parquet-format/pull/201#issuecomment-1591225024 For anyone who needs a version of thrift built for mac x86 (which also works on arm64 through emulation), here is a copy of the relevant homebrew directory before homebrew went

Re: building parquet macbook m1 with thrift 0.15.0

2023-06-14 Thread Aaron Niskode-Dossett
My solution was to write a docker container to build parquet on my macbook. I spent a couple of hours trying and failing to build it directly and got a docker solution working in far less time. On Wed, Jun 14, 2023 at 7:50 AM Steve Loughran wrote: > How do people get a version of the native thr

building parquet macbook m1 with thrift 0.15.0

2023-06-14 Thread Steve Loughran
How do people get a version of the native thrift binaries onto their macbook such that parquet builds? 1. Homebrew is on 0.18.1, and if you try to build with that you can see that thrift has added some new things to implement. 2. Try to rebuild thrift 0.15 and you end up in cmake pain

[jira] [Commented] (PARQUET-2128) Bump Thrift to 0.16.0

2023-06-14 Thread Steve Loughran (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732509#comment-17732509 ] Steve Loughran commented on PARQUET-2128: - homebrew doesn't have anything < 0..

[jira] [Comment Edited] (PARQUET-2223) Parquet Data Masking for Column Encryption

2023-06-14 Thread Jiashen Zhang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654675#comment-17654675 ] Jiashen Zhang edited comment on PARQUET-2223 at 6/14/23 8:02 AM:

[jira] [Updated] (PARQUET-2223) Parquet Data Masking for Column Encryption

2023-06-14 Thread Jiashen Zhang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiashen Zhang updated PARQUET-2223: --- Description: h1. Background h2. What is Data Masking? Data masking is a technique used to

[jira] [Updated] (PARQUET-2223) Parquet Data Masking for Column Encryption

2023-06-14 Thread Jiashen Zhang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiashen Zhang updated PARQUET-2223: --- Issue Type: New Feature (was: Task) Priority: Major (was: Minor) > Parquet Data Mas