[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gang Wu reassigned PARQUET-2241:
Assignee: Gang Wu
> ByteStreamSplitDecoder broken in presence of nulls
>
[
https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687342#comment-17687342
]
ASF GitHub Bot commented on PARQUET-2237:
-
yabola commented on code in PR #1023:
URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1102921044
##
parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java:
##
@@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) {
T
[
https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687341#comment-17687341
]
ASF GitHub Bot commented on PARQUET-2237:
-
yabola commented on code in PR #1023:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687340#comment-17687340
]
ASF GitHub Bot commented on PARQUET-2237:
-
yabola commented on code in PR #1023:
URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1102921044
##
parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java:
##
@@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) {
T
[
https://issues.apache.org/jira/browse/PARQUET-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687306#comment-17687306
]
ASF GitHub Bot commented on PARQUET-2229:
-
shangxinli merged PR #1021:
URL: https://github.com/apache/parquet-mr/pull/1021
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail:
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687304#comment-17687304
]
ASF GitHub Bot commented on PARQUET-2241:
-
shangxinli merged PR #192:
URL: https://github.com/apache/parquet-format/pull/192
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687228#comment-17687228
]
ASF GitHub Bot commented on PARQUET-2241:
-
emkornfield commented on PR #192:
URL: https://github.com/apache/parquet-format/pull/192#issuecomment-1426163629
Seems OK to me.
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687174#comment-17687174
]
ASF GitHub Bot commented on PARQUET-2241:
-
mapleFU commented on PR #192:
URL: https://github.com/apache/parquet-format/pull/192#issuecomment-1426068317
I think we should check that no implementation adds extra padding. At least
C++, Rust, and parquet-mr do not seem to pad the end of the data.
[
https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687131#comment-17687131
]
ASF GitHub Bot commented on PARQUET-2237:
-
yabola commented on code in PR #1023:
URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1102921044
##
parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java:
##
@@ -289,8 +320,14 @@ public <T extends Comparable<T>> Boolean visit(Lt<T> lt) {
T
[
https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687119#comment-17687119
]
ASF GitHub Bot commented on PARQUET-2237:
-
yabola commented on code in PR #1023:
URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1102881433
##
parquet-hadoop/src/main/java/org/apache/parquet/filter2/compat/PredicateEvaluation.java:
##
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation
[
https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687115#comment-17687115
]
ASF GitHub Bot commented on PARQUET-2237:
-
yabola commented on PR #1023:
URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1425942666
@wgtmac Sorry, the `Boolean` type has to be used here so that we can
distinguish `BLOCK_MIGHT_MATCH` from `BLOCK_MUST_MATCH`. Here is an example:
```
Boolean b1 = new Boolean(true);
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687113#comment-17687113
]
ASF GitHub Bot commented on PARQUET-2241:
-
pitrou commented on PR #192:
URL: https://github.com/apache/parquet-format/pull/192#issuecomment-1425935594
cc @wjones127
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687109#comment-17687109
]
ASF GitHub Bot commented on PARQUET-2241:
-
wgtmac commented on PR #192:
URL: https://github.com/apache/parquet-format/pull/192#issuecomment-1425920012
cc @shangxinli @gszadovszky @ggershinsky @pitrou @emkornfield
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687106#comment-17687106
]
ASF GitHub Bot commented on PARQUET-2241:
-
wgtmac opened a new pull request, #192:
URL: https://github.com/apache/parquet-format/pull/192
This proposes to explicitly state that no padding is allowed within a data page.
That makes it easier for a BYTE_STREAM_SPLIT decoder to decode a page with nulls.
In this way, it can simply get the number
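The description above is cut off in this listing, so the following is only a rough sketch of why the absence of padding helps, not the code from the pull request or from parquet-mr. It assumes 4-byte little-endian floats and that the byte array holds nothing but the BYTE_STREAM_SPLIT streams. Under that assumption, the number of stored (non-null) values can be derived from the data length alone. The class and method names (`ByteStreamSplitSketch`, `encodeFloats`, `decodeFloats`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; not the parquet-mr implementation.
class ByteStreamSplitSketch {

    // Encodes floats as 4 concatenated byte streams: stream b holds
    // byte b of every value, in value order.
    static byte[] encodeFloats(float[] values) {
        final int width = Float.BYTES;
        final int count = values.length;
        byte[] out = new byte[count * width];
        ByteBuffer buf = ByteBuffer.allocate(width).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < count; i++) {
            buf.putFloat(0, values[i]);
            for (int b = 0; b < width; b++) {
                out[b * count + i] = buf.get(b);
            }
        }
        return out;
    }

    // Decodes the streams back into floats. Because no padding is assumed,
    // the value count is simply encoded.length / 4; nulls are not stored
    // in the data, so this count excludes them.
    static float[] decodeFloats(byte[] encoded) {
        final int width = Float.BYTES;
        if (encoded.length % width != 0) {
            throw new IllegalArgumentException(
                "data length " + encoded.length + " is not a multiple of " + width);
        }
        final int count = encoded.length / width;
        float[] out = new float[count];
        ByteBuffer buf = ByteBuffer.allocate(width).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < count; i++) {
            for (int b = 0; b < width; b++) {
                buf.put(b, encoded[b * count + i]);
            }
            out[i] = buf.getFloat(0);
        }
        return out;
    }
}
```

If trailing padding were permitted, the division in `decodeFloats` would no longer yield the true value count, and the stream offsets `b * count` would point at the wrong bytes.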
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gang Wu updated PARQUET-2241:
-
Fix Version/s: (was: format-2.10.0)
> ByteStreamSplitDecoder broken in presence of nulls
>
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gang Wu updated PARQUET-2241:
-
Component/s: parquet-mr
> ByteStreamSplitDecoder broken in presence of nulls
>
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17686941#comment-17686941
]
Gang Wu commented on PARQUET-2241:
--
Have you seen any relevant issue in production? [~gershinsky]
[
https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17686938#comment-17686938
]
Gang Wu commented on PARQUET-2241:
--
It seems that the *ByteStreamSplitValuesReader* in the parquet-mr