[jira] [Commented] (PARQUET-1809) Add new APIs for nested predicate pushdown

2020-03-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051023#comment-17051023 ] Gabor Szadovszky commented on PARQUET-1809: --- I am afraid it is not only the

[jira] [Resolved] (PARQUET-1806) [C++] [CI] Improve fuzzing seed corpus

2020-03-04 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved PARQUET-1806. - Fix Version/s: cpp-1.6.0 Resolution: Fixed Issue resolved by pull request 6526 [

Re: Provide pluggable APIs to support user customized compression codec

2020-03-04 Thread Gabor Szadovszky
Hi, My problem with this idea is that I cannot see how we can control that a customized codec would compress the data in the specified way so every reader that supports the codec can read it. We already have an issue about an incompatibility between the java and cpp implementations of the LZ4 comp

Re: Provide pluggable APIs to support user customized compression codec

2020-03-04 Thread Radev, Martin
Hi Xin, thanks for the interest in extending Parquet. I suppose this is only about the Parquet Writer/Reader implementation, not about changes to the Parquet specification. I would like to know whether offloading the task of compressing/decompressing some data is really beneficial performance

[jira] [Commented] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051383#comment-17051383 ] ASF GitHub Bot commented on PARQUET-1808: - koiralo commented on pull request #7

[jira] [Updated] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated PARQUET-1808: Labels: pull-request-available (was: ) > SimpleGroup.toString() uses String += and so ha

[jira] [Commented] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-04 Thread Shankar Koirala (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051389#comment-17051389 ] Shankar Koirala commented on PARQUET-1808: -- [~tiddman] nice finding. [~tiddman

[jira] [Commented] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051412#comment-17051412 ] ASF GitHub Bot commented on PARQUET-1808: - koiralo commented on pull request #7

[jira] [Commented] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051429#comment-17051429 ] ASF GitHub Bot commented on PARQUET-1808: - koiralo commented on pull request #7

[jira] [Comment Edited] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-04 Thread Shankar Koirala (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051389#comment-17051389 ] Shankar Koirala edited comment on PARQUET-1808 at 3/4/20, 5:07 PM: --

[jira] [Created] (PARQUET-1810) [C++] Fix undefined behaviour on invalid enum values (OSS-Fuzz)

2020-03-04 Thread Antoine Pitrou (Jira)
Antoine Pitrou created PARQUET-1810: --- Summary: [C++] Fix undefined behaviour on invalid enum values (OSS-Fuzz) Key: PARQUET-1810 URL: https://issues.apache.org/jira/browse/PARQUET-1810 Project: Parq

[jira] [Updated] (PARQUET-1810) [C++] Fix undefined behaviour on invalid enum values (OSS-Fuzz)

2020-03-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated PARQUET-1810: Labels: pull-request-available (was: ) > [C++] Fix undefined behaviour on invalid enum v

[jira] [Commented] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-04 Thread Randy Tidd (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051550#comment-17051550 ] Randy Tidd commented on PARQUET-1808: - [~gszadovszky] thanks for the response. We d

[jira] [Commented] (PARQUET-1809) Add new APIs for nested predicate pushdown

2020-03-04 Thread DB Tsai (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051567#comment-17051567 ] DB Tsai commented on PARQUET-1809: -- cc [~rdblue] In Spark, we don't run into issue

[jira] [Comment Edited] (PARQUET-1809) Add new APIs for nested predicate pushdown

2020-03-04 Thread DB Tsai (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051567#comment-17051567 ] DB Tsai edited comment on PARQUET-1809 at 3/4/20, 7:47 PM: --- c

[jira] [Commented] (PARQUET-1809) Add new APIs for nested predicate pushdown

2020-03-04 Thread Ryan Blue (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051585#comment-17051585 ] Ryan Blue commented on PARQUET-1809: I think it should be fine to allow this. While