[GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
gszadovszky commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1113966806 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) {

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692011#comment-17692011 ] ASF GitHub Bot commented on PARQUET-2237: - gszadovszky commented on code in PR

[jira] [Assigned] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow

2023-02-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-2247: - Assignee: Gabor Szadovszky > Fail-fast if CapacityByteArrayOutputStream write

[GitHub] [parquet-mr] gszadovszky merged pull request #1031: PARQUET-2247: Fail-fast if CapacityByteArrayOutputStream write overflow

2023-02-22 Thread via GitHub
gszadovszky merged PR #1031: URL: https://github.com/apache/parquet-mr/pull/1031 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@parquet

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1113972176 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Resolved] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow

2023-02-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-2247. --- Resolution: Fixed > Fail-fast if CapacityByteArrayOutputStream write overflow > ---

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692015#comment-17692015 ] ASF GitHub Bot commented on PARQUET-2237: - wgtmac commented on code in PR #1023

[jira] [Commented] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692014#comment-17692014 ] ASF GitHub Bot commented on PARQUET-2247: - gszadovszky merged PR #1031: URL: ht

[jira] [Assigned] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow

2023-02-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-2247: - Assignee: dzcxzl (was: Gabor Szadovszky) > Fail-fast if CapacityByteArrayOutp

[GitHub] [parquet-mr] wgtmac commented on pull request #1030: PARQUET-2246: Add short circuit logic to column index filter

2023-02-22 Thread via GitHub
wgtmac commented on PR #1030: URL: https://github.com/apache/parquet-mr/pull/1030#issuecomment-1439613076 cc @gszadovszky -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[jira] [Commented] (PARQUET-2246) Add short circuit logic to column index filter

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692019#comment-17692019 ] ASF GitHub Bot commented on PARQUET-2246: - wgtmac commented on PR #1030: URL: h

[GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
gszadovszky commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114002851 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) {

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692029#comment-17692029 ] ASF GitHub Bot commented on PARQUET-2237: - gszadovszky commented on code in PR

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114020234 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692036#comment-17692036 ] ASF GitHub Bot commented on PARQUET-2237: - wgtmac commented on code in PR #1023

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114037215 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/compat/PredicateEvaluation.java: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (A

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692041#comment-17692041 ] ASF GitHub Bot commented on PARQUET-2237: - wgtmac commented on code in PR #1023

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1114063818 ## parquet-generator/src/main/java/org/apache/parquet/encoding/vectorbitpacking/BitPackingGenerator512Vector.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692059#comment-17692059 ] ASF GitHub Bot commented on PARQUET-2159: - wgtmac commented on code in PR #1011

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1114064922 ## parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Found

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692061#comment-17692061 ] ASF GitHub Bot commented on PARQUET-2159: - wgtmac commented on code in PR #1011

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114066105 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692062#comment-17692062 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114070539 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692065#comment-17692065 ] ASF GitHub Bot commented on PARQUET-2237: - wgtmac commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114066105 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692071#comment-17692071 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692078#comment-17692078 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692082#comment-17692082 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692083#comment-17692083 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692084#comment-17692084 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692086#comment-17692086 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114096781 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692088#comment-17692088 ] ASF GitHub Bot commented on PARQUET-2237: - wgtmac commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692089#comment-17692089 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692090#comment-17692090 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114106111 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692095#comment-17692095 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114106111 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692101#comment-17692101 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114087072 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692102#comment-17692102 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
gszadovszky commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114171229 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) {

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692124#comment-17692124 ] ASF GitHub Bot commented on PARQUET-2237: - gszadovszky commented on code in PR

[GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
gszadovszky commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1114179342 ## parquet-generator/src/main/java/org/apache/parquet/encoding/vectorbitpacking/BitPackingGenerator512Vector.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Ap

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692131#comment-17692131 ] ASF GitHub Bot commented on PARQUET-2159: - gszadovszky commented on code in PR

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1114220877 ## parquet-generator/src/main/java/org/apache/parquet/encoding/vectorbitpacking/BitPackingGenerator512Vector.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692151#comment-17692151 ] ASF GitHub Bot commented on PARQUET-2159: - wgtmac commented on code in PR #1011

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114230889 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692155#comment-17692155 ] ASF GitHub Bot commented on PARQUET-2237: - wgtmac commented on code in PR #1023

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #1019: PARQUET-2103: Fix crypto exception in print toPrettyJSON

2023-02-22 Thread via GitHub
ggershinsky commented on code in PR #1019: URL: https://github.com/apache/parquet-mr/pull/1019#discussion_r1114249805 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/metadata/FileMetaData.java: ## @@ -50,16 +51,23 @@ public final class FileMetaData implements Serializa

[jira] [Commented] (PARQUET-2103) crypto exception in print toPrettyJSON

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692165#comment-17692165 ] ASF GitHub Bot commented on PARQUET-2103: - ggershinsky commented on code in PR

[GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
gszadovszky commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114293572 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) {

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692189#comment-17692189 ] ASF GitHub Bot commented on PARQUET-2237: - gszadovszky commented on code in PR

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114365987 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692205#comment-17692205 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692207#comment-17692207 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-22 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1114365987 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java: ## @@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) { T val

[GitHub] [parquet-mr] parthchandra commented on a diff in pull request #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing

2023-02-22 Thread via GitHub
parthchandra commented on code in PR #1032: URL: https://github.com/apache/parquet-mr/pull/1032#discussion_r1114670273 ## parquet-common/src/main/java/org/apache/parquet/bytes/CapacityByteArrayOutputStream.java: ## @@ -164,6 +164,15 @@ public CapacityByteArrayOutputStream(int in

[GitHub] [parquet-mr] parthchandra commented on a diff in pull request #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing

2023-02-22 Thread via GitHub
parthchandra commented on code in PR #1032: URL: https://github.com/apache/parquet-mr/pull/1032#discussion_r1114670273 ## parquet-common/src/main/java/org/apache/parquet/bytes/CapacityByteArrayOutputStream.java: ## @@ -164,6 +164,15 @@ public CapacityByteArrayOutputStream(int in

[jira] [Commented] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692282#comment-17692282 ] ASF GitHub Bot commented on PARQUET-2164: - parthchandra commented on code in PR

[jira] [Commented] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692284#comment-17692284 ] ASF GitHub Bot commented on PARQUET-2164: - parthchandra commented on code in PR

[GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
jiangjiguang commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115144554 ## parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692425#comment-17692425 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on code in PR

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing

2023-02-22 Thread via GitHub
wgtmac commented on code in PR #1032: URL: https://github.com/apache/parquet-mr/pull/1032#discussion_r1115168960 ## parquet-common/src/main/java/org/apache/parquet/bytes/CapacityByteArrayOutputStream.java: ## @@ -164,6 +164,15 @@ public CapacityByteArrayOutputStream(int initialS

[jira] [Commented] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692437#comment-17692437 ] ASF GitHub Bot commented on PARQUET-2164: - wgtmac commented on code in PR #1032

[GitHub] [parquet-mr] jatin-bhateja commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
jatin-bhateja commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115229390 ## parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Softwar

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692485#comment-17692485 ] ASF GitHub Bot commented on PARQUET-2159: - jatin-bhateja commented on code in P

[GitHub] [parquet-mr] jatin-bhateja commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
jatin-bhateja commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115229390 ## parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Softwar

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692488#comment-17692488 ] ASF GitHub Bot commented on PARQUET-2159: - jatin-bhateja commented on code in P

[GitHub] [parquet-mr] gszadovszky merged pull request #1027: PARQUET-2243: Support zstd-jni in DirectCodecFactory

2023-02-22 Thread via GitHub
gszadovszky merged PR #1027: URL: https://github.com/apache/parquet-mr/pull/1027 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@parquet

[jira] [Resolved] (PARQUET-2243) Support zstd-jni in DirectCodecFactory

2023-02-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-2243. --- Resolution: Fixed > Support zstd-jni in DirectCodecFactory > --

[jira] [Commented] (PARQUET-2243) Support zstd-jni in DirectCodecFactory

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692509#comment-17692509 ] ASF GitHub Bot commented on PARQUET-2243: - gszadovszky merged PR #1027: URL: ht

[GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
jiangjiguang commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115317241 ## parquet-generator/src/main/java/org/apache/parquet/encoding/vectorbitpacking/BitPackingGenerator512Vector.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the A

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692538#comment-17692538 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on code in PR

[GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-22 Thread via GitHub
gszadovszky commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r111532 ## parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692545#comment-17692545 ] ASF GitHub Bot commented on PARQUET-2159: - gszadovszky commented on code in PR