[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704433#comment-17704433 ] ASF GitHub Bot commented on PARQUET-: - wgtmac commented on PR #193: URL: ht

[GitHub] [parquet-format] wgtmac commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages

2023-03-23 Thread via GitHub
wgtmac commented on PR #193: URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1482199554 > No problem, I think that's RLE_DICTIONARY's problem. Just ignore it Sounds good. @pitrou Do you have any comment? I just want to make sure all reviewers are happy on t

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704427#comment-17704427 ] ASF GitHub Bot commented on PARQUET-: - mapleFU commented on PR #193: URL: h

[GitHub] [parquet-format] mapleFU commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages

2023-03-23 Thread via GitHub
mapleFU commented on PR #193: URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1482193618 No problem, I think that's RLE_DICTIONARY's problem. Just ignore it -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704402#comment-17704402 ] ASF GitHub Bot commented on PARQUET-2249: - zhongyujiang commented on PR #196: U

[GitHub] [parquet-format] zhongyujiang commented on pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
zhongyujiang commented on PR #196: URL: https://github.com/apache/parquet-format/pull/196#issuecomment-1482138488 @JFinis Thanks for your reply. I just realized that the page value count is stored in the page header, not in the column index. I overlooked your comments above before I asked the q

[jira] [Commented] (PARQUET-2149) Implement async IO for Parquet file reader

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704399#comment-17704399 ] ASF GitHub Bot commented on PARQUET-2149: - wgtmac commented on PR #968: URL: ht

[GitHub] [parquet-mr] wgtmac commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2023-03-23 Thread via GitHub
wgtmac commented on PR #968: URL: https://github.com/apache/parquet-mr/pull/968#issuecomment-1482128840 @parthchandra Do you have time to resolve the conflicts? I think it would be nice to be included in the next release.

[jira] [Commented] (PARQUET-2149) Implement async IO for Parquet file reader

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704398#comment-17704398 ] ASF GitHub Bot commented on PARQUET-2149: - parthchandra commented on PR #968: U

[GitHub] [parquet-mr] parthchandra commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2023-03-23 Thread via GitHub
parthchandra commented on PR #968: URL: https://github.com/apache/parquet-mr/pull/968#issuecomment-1482127933 > Sorry to interrupt, just wondering if this PR can work with [S3AsyncClient](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3AsyncClient.html) by any

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704198#comment-17704198 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on PR #196: URL: ht

[GitHub] [parquet-format] JFinis commented on pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on PR #196: URL: https://github.com/apache/parquet-format/pull/196#issuecomment-1481370812 @zhongyujiang (as I can't answer your comment directly). Here is the problem with your suggestion of checking `nanCount == valueCount` to detect only-NaN pages: > @mapleFU To

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704194#comment-17704194 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704193#comment-17704193 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146342719 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summa

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146342719 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summa

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704158#comment-17704158 ] ASF GitHub Bot commented on PARQUET-2249: - zhongyujiang commented on PR #196: U

[GitHub] [parquet-format] zhongyujiang commented on pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
zhongyujiang commented on PR #196: URL: https://github.com/apache/parquet-format/pull/196#issuecomment-1481237476 > Thus, to solve the problem of only-NaN pages, the comments in the spec are extended to mandate the following behavior: > > Once a writer writes the nan_count/nan_counts

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704120#comment-17704120 ] ASF GitHub Bot commented on PARQUET-2249: - mapleFU commented on code in PR #196

[GitHub] [parquet-format] mapleFU commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
mapleFU commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146114852 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summ

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704117#comment-17704117 ] ASF GitHub Bot commented on PARQUET-2249: - mapleFU commented on code in PR #196

[GitHub] [parquet-format] mapleFU commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
mapleFU commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146105914 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summ

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704114#comment-17704114 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146095493 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summa

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704111#comment-17704111 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1145999358 ## src/main/thrift/parquet.thrift: ## @@ -952,6 +961,9 @@ struct ColumnIndex { * Such more compact values must still be valid values within the column's * l

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704110#comment-17704110 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704109#comment-17704109 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146076533 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summa

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146080282 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summa

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704107#comment-17704107 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146076533 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summa

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704106#comment-17704106 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146076533 ## README.md: ## @@ -163,18 +163,25 @@ following rules: [Thrift definition](src/main/thrift/parquet.thrift) in the `ColumnOrder` union. They are summa

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704100#comment-17704100 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146067341 ## src/main/thrift/parquet.thrift: ## @@ -886,16 +888,23 @@ union ColumnOrder { * FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison * * (*) Because

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704073#comment-17704073 ] ASF GitHub Bot commented on PARQUET-2249: - mapleFU commented on code in PR #196

[GitHub] [parquet-format] mapleFU commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
mapleFU commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146014814 ## src/main/thrift/parquet.thrift: ## @@ -886,16 +888,23 @@ union ColumnOrder { * FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison * * (*) Becaus

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704071#comment-17704071 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704070#comment-17704070 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146008777 ## src/main/thrift/parquet.thrift: ## @@ -886,16 +888,23 @@ union ColumnOrder { * FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison * * (*) Because

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146010836 ## src/main/thrift/parquet.thrift: ## @@ -952,6 +961,9 @@ struct ColumnIndex { * Such more compact values must still be valid values within the column's * l

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704069#comment-17704069 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704068#comment-17704068 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146010836 ## src/main/thrift/parquet.thrift: ## @@ -952,6 +961,9 @@ struct ColumnIndex { * Such more compact values must still be valid values within the column's * l

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1145999358 ## src/main/thrift/parquet.thrift: ## @@ -952,6 +961,9 @@ struct ColumnIndex { * Such more compact values must still be valid values within the column's * l

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704067#comment-17704067 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704066#comment-17704066 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1145999358 ## src/main/thrift/parquet.thrift: ## @@ -952,6 +961,9 @@ struct ColumnIndex { * Such more compact values must still be valid values within the column's * l

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1146008777 ## src/main/thrift/parquet.thrift: ## @@ -886,16 +888,23 @@ union ColumnOrder { * FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison * * (*) Because

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704065#comment-17704065 ] ASF GitHub Bot commented on PARQUET-2249: - JFinis commented on code in PR #196:

[GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
JFinis commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1145999358 ## src/main/thrift/parquet.thrift: ## @@ -952,6 +961,9 @@ struct ColumnIndex { * Such more compact values must still be valid values within the column's * l

[jira] [Commented] (PARQUET-2149) Implement async IO for Parquet file reader

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704031#comment-17704031 ] ASF GitHub Bot commented on PARQUET-2149: - hazelnutsgz commented on PR #968: UR

[GitHub] [parquet-mr] hazelnutsgz commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2023-03-23 Thread via GitHub
hazelnutsgz commented on PR #968: URL: https://github.com/apache/parquet-mr/pull/968#issuecomment-1480892881 Sorry to interrupt, just wondering if this PR can work with [S3AsyncClient](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3AsyncClient.html) by any c

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-03-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704022#comment-17704022 ] ASF GitHub Bot commented on PARQUET-2249: - wgtmac commented on code in PR #196:

[GitHub] [parquet-format] wgtmac commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics

2023-03-23 Thread via GitHub
wgtmac commented on code in PR #196: URL: https://github.com/apache/parquet-format/pull/196#discussion_r1145920913 ## src/main/thrift/parquet.thrift: ## @@ -952,6 +961,9 @@ struct ColumnIndex { * Such more compact values must still be valid values within the column's * l