[
https://issues.apache.org/jira/browse/PARQUET-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730385#comment-17730385
]
ASF GitHub Bot commented on PARQUET-675:
nevi-me closed pull request #165: PARQUET-675: Specify Interval LogicalType
URL: https://github.com/apache/parquet-format/pull/165
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment
wgtmac commented on PR #1068:
URL: https://github.com/apache/parquet-mr/pull/1068#issuecomment-1581769831
> Should we deprecate the dependency? The last release of `elephant-bird` is
> from March 2018
Sorry for the late reply. I think it should be deprecated as it is not
maintained any
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730337#comment-17730337
]
ASF GitHub Bot commented on PARQUET-2305:
wgtmac commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1222352757
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -31,10 +32,12 @@
import org.apache.parquet.io.api.Converter;
import org
You probably need to be more specific about which language bindings you are
using. I think the C++ community is just starting to work on being able to
write out bloom filters (so it isn't supported yet in C++, Python, R, Ruby,
etc.).
The way I read the specification, yes each single value should be ad
@Micah Does that mean that columns of type array already get a bloom filter
on each single value?
I am using Apache Arrow in particular to deal with Parquet files
On Wed, 7 Jun 2023, 16:00 Micah Kornfield wrote:
> Hi Marco,
> Could you describe how your proposal differs from tokenizing the t
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730253#comment-17730253
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1222036621
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -86,32 +89,71 @@ class ProtoMessageConverter extends GroupConverter {
tddfan closed pull request #1108: Parquet2proto ignore unkown fields
URL: https://github.com/apache/parquet-mr/pull/1108
tddfan opened a new pull request, #1108:
URL: https://github.com/apache/parquet-mr/pull/1108
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them in
[
https://issues.apache.org/jira/browse/PARQUET-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730227#comment-17730227
]
ASF GitHub Bot commented on PARQUET-2171:
steveloughran commented on PR #1103:
URL: https://github.com/apache/parquet-mr/pull/1103#issuecomment-1581240289
Hadoop API shim; now separate ASF repo there.
https://github.com/apache/hadoop-api-shim
wgtmac commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221872560
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoConstants.java:
##
@@ -26,6 +26,7 @@ public final class ProtoConstants {
public static final String
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730189#comment-17730189
]
ASF GitHub Bot commented on PARQUET-2305:
wgtmac commented on code in PR #1102
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730184#comment-17730184
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730185#comment-17730185
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221865678
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -86,32 +89,71 @@ class ProtoMessageConverter extends GroupConverter {
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730183#comment-17730183
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730182#comment-17730182
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221864629
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -86,32 +89,71 @@ class ProtoMessageConverter extends GroupConverter {
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730181#comment-17730181
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730180#comment-17730180
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221863218
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -124,13 +166,15 @@ public void start() {
@Override
public void en
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730177#comment-17730177
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221862896
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -86,32 +89,71 @@ class ProtoMessageConverter extends GroupConverter {
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730179#comment-17730179
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221861354
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoParquetReader.java:
##
@@ -71,6 +77,13 @@ protected Builder(InputFile file) {
super(file);
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730175#comment-17730175
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730174#comment-17730174
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221860150
##
parquet-protobuf/src/test/java/org/apache/parquet/proto/ProtoSchemaEvolutionTest.java:
##
@@ -65,4 +66,68 @@ public void testEnumSchemaWriteV1ReadV2() throws IOExc
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221859356
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoParquetReader.java:
##
@@ -37,11 +37,17 @@
public static ParquetReader.Builder builder(Path file)
[
https://issues.apache.org/jira/browse/PARQUET-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730170#comment-17730170
]
ASF GitHub Bot commented on PARQUET-2305:
tddfan commented on code in PR #1102:
URL: https://github.com/apache/parquet-mr/pull/1102#discussion_r1221851162
##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -86,32 +89,71 @@ class ProtoMessageConverter extends GroupConverter {
tddfan closed pull request #1107: Parquet2proto ignore unkown fields
URL: https://github.com/apache/parquet-mr/pull/1107
tddfan opened a new pull request, #1107:
URL: https://github.com/apache/parquet-mr/pull/1107
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them in
Hi Marco,
Could you describe how your proposal differs from tokenizing the target
string and storing the list of tokens in a column that has a bloom filter
attached? I think this should be supportable today by the format at least
if not existing libraries.
Thanks,
Micah
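To make the tokenize-then-filter idea above concrete, here is a minimal sketch in plain Python. It is an illustration only: `TokenBloomFilter`, `tokenize`, and `build_filter` are hypothetical names, the whitespace tokenizer is an assumption, and the toy filter is not Parquet's split-block bloom filter format nor any library's API.

```python
import hashlib


class TokenBloomFilter:
    """Toy Bloom filter over string tokens (hypothetical, not Parquet's format)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, token):
        # Derive one bit position per hash index by salting a single hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{token}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, token):
        for pos in self._positions(token):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, token):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(token)
        )


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def build_filter(rows):
    # Add every token of every row, as if each row stored its token list
    # in a column with a bloom filter attached.
    bloom = TokenBloomFilter()
    for row in rows:
        for token in tokenize(row):
            bloom.add(token)
    return bloom
```

A query literal would be tokenized the same way, and a row group could be skipped only if at least one of its tokens is reported as definitely absent. A filter of this kind can give false positives but never false negatives.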
On Wednesday, June 7, 202
Hi Marco,
That sounds interesting!
However, this requires the Parquet implementation to be able to tokenize
both strings to write and literals in the filters. The actual efficiency
depends on the data distribution. I am also concerned with the possible
explosion of distinct values introduced by s