[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2021-04-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324404#comment-17324404 ] ASF GitHub Bot commented on PARQUET-41: --- jbapple commented on pull request #757: URL:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2021-04-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324126#comment-17324126 ] ASF GitHub Bot commented on PARQUET-41: --- shannonwells edited a comment on pull request #757: URL:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2021-04-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324125#comment-17324125 ] ASF GitHub Bot commented on PARQUET-41: --- shannonwells commented on pull request #757: URL:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2021-02-01 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276862#comment-17276862 ] Nicholas Chammas commented on PARQUET-41: - Thanks for the link [~yumwang]. That

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2021-02-01 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276854#comment-17276854 ] Yuming Wang commented on PARQUET-41: [~nchammas] You can check the related configuration parameters:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2021-02-01 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276842#comment-17276842 ] Nicholas Chammas commented on PARQUET-41: - Where is the user documentation for all the bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2020-03-18 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061558#comment-17061558 ] Gabor Szadovszky commented on PARQUET-41: - [~junma], the target release for this feature is

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2020-03-16 Thread Jun Ma (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060490#comment-17060490 ] Jun Ma commented on PARQUET-41: --- [~gszadovszky]/[~junjie], what's the release timeline for this feature? 

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2020-02-26 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045530#comment-17045530 ] Gabor Szadovszky commented on PARQUET-41: - [~junjie], feature branch for parquet-mr has been

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2020-02-26 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045527#comment-17045527 ] ASF GitHub Bot commented on PARQUET-41: --- gszadovszky commented on pull request #757: PARQUET-41:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-10-11 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647258#comment-16647258 ] ASF GitHub Bot commented on PARQUET-41: --- majetideepak closed pull request #113: PARQUET-41: Add

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-10-11 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647248#comment-16647248 ] ASF GitHub Bot commented on PARQUET-41: --- majetideepak opened a new pull request #113: PARQUET-41:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-10-11 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647243#comment-16647243 ] ASF GitHub Bot commented on PARQUET-41: --- majetideepak closed pull request #112: PARQUET-41: Add

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-10-03 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636560#comment-16636560 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust closed pull request #99: PARQUET-41: add bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-10-03 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636508#comment-16636508 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust opened a new pull request #112: PARQUET-41: Add

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-09-25 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627622#comment-16627622 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust closed pull request #62: PARQUET-41: Add bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-07-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550201#comment-16550201 ] Junjie Chen commented on PARQUET-41: [~aniket486], Thanks for watching this. Yes, I'm still

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-07-19 Thread Aniket Mokashi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550166#comment-16550166 ] Aniket Mokashi commented on PARQUET-41: --- [~junjie] - thanks for driving this project! Are you

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517638#comment-16517638 ] Junjie Chen commented on PARQUET-41: [~jbapple], I just created a new parquet-format PR since

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-19 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517637#comment-16517637 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust opened a new pull request #99: PARQUET-41: add

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-19 Thread Jim Apple (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517298#comment-16517298 ] Jim Apple commented on PARQUET-41: -- Is there an updated PR for parquet-format that matches the open PRs

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-15 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513908#comment-16513908 ] Junjie Chen commented on PARQUET-41: Thanks [~jbapple]. Since the jira may contain several

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-12 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509737#comment-16509737 ] Junjie Chen commented on PARQUET-41: Hi Here is benchmark link:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-23 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486936#comment-16486936 ] Junjie Chen commented on PARQUET-41: [~jbapple], I understood your point, I will do benchmark to

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-23 Thread Jim Apple (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486884#comment-16486884 ] Jim Apple commented on PARQUET-41: -- In response to [~junjie]'s question above, "Sure, it is feasible,

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-23 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486862#comment-16486862 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on issue #432: PARQUET-41: Add bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-23 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486844#comment-16486844 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on issue #425: PARQUET-41:Add Bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-23 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486836#comment-16486836 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust closed pull request #484: PARQUET-41: rebase to

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-23 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486833#comment-16486833 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust opened a new pull request #484: PARQUET-41: rebase

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-22 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483791#comment-16483791 ] Junping Du commented on PARQUET-41: --- Thanks [~Ferd] for quick update. The plan sounds good.

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-18 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16481309#comment-16481309 ] Ferdinand Xu commented on PARQUET-41: - [~djp] I didn't have cycles to move it forward recently.

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-18 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480460#comment-16480460 ] Junping Du commented on PARQUET-41: --- This is a very critical feature from performance perspective.

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-03 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462365#comment-16462365 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on issue #425: PARQUET-41:Add Bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-03 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462359#comment-16462359 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on issue #432: PARQUET-41: Add bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-03 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462316#comment-16462316 ] ASF GitHub Bot commented on PARQUET-41: --- BenoitHanotte commented on issue #425: PARQUET-41:Add Bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-03 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462312#comment-16462312 ] ASF GitHub Bot commented on PARQUET-41: --- BenoitHanotte commented on issue #425: PARQUET-41:Add Bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343563#comment-16343563 ] ASF GitHub Bot commented on PARQUET-41: --- daedric commented on issue #432: PARQUET-41: Add bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343497#comment-16343497 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on issue #432: PARQUET-41: Add bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343376#comment-16343376 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on a change in pull request #432:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343306#comment-16343306 ] ASF GitHub Bot commented on PARQUET-41: --- daedric commented on a change in pull request #432:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343305#comment-16343305 ] ASF GitHub Bot commented on PARQUET-41: --- daedric commented on a change in pull request #432:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338722#comment-16338722 ] Junjie Chen commented on PARQUET-41: Sure, it is feasible, then we are comparing bloom filter vs

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Jim Apple (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338709#comment-16338709 ] Jim Apple commented on PARQUET-41: -- Why not tweak that logic in parquet-mr to allow dictionary encoding

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338706#comment-16338706 ] Junjie Chen commented on PARQUET-41: In Parquet-mr, when we set dictionary encoding to true, the

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Jim Apple (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338680#comment-16338680 ] Jim Apple commented on PARQUET-41: -- Could you elaborate on "A column with large cardinality can not even

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338670#comment-16338670 ] Junjie Chen commented on PARQUET-41: Hi [~jbapple], AFAIK, we don't have benchmark progress to compare

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338631#comment-16338631 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on a change in pull request #432:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Jim Apple (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338244#comment-16338244 ] Jim Apple commented on PARQUET-41: -- IIRC, there was a plan to create an end-to-end benchmark of an MR or

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-19 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331898#comment-16331898 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust opened a new pull request #432: PARQUET-41: Add

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-18 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331602#comment-16331602 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust closed pull request #431: PARQUET-41: Add bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-18 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331598#comment-16331598 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust opened a new pull request #431: PARQUET-41: Add

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-08-28 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143463#comment-16143463 ] Junjie Chen commented on PARQUET-41: please see initial PR:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-08-27 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143288#comment-16143288 ] Junjie Chen commented on PARQUET-41: Add related [design

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-07-20 Thread Jim Apple (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095685#comment-16095685 ] Jim Apple commented on PARQUET-41: -- In response to your request for a benchmark, see

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-07-20 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095614#comment-16095614 ] Ferdinand Xu commented on PARQUET-41: - Thanks [~jbapple] for the suggestions. {noformat} As a result,

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-07-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094006#comment-16094006 ] Junjie Chen commented on PARQUET-41: Thanks Jim. Very useful links and example code!

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-07-19 Thread Jim Apple (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093432#comment-16093432 ] Jim Apple commented on PARQUET-41: -- We might want to consider "Cache-, Hash- and Space-Efficient Bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-22 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019813#comment-16019813 ] Ryan Blue commented on PARQUET-41: -- Yeah, I'll help review it.

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-22 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019812#comment-16019812 ] Ferdinand Xu commented on PARQUET-41: - The pull request for PARQUET-319 is out of date which requires

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-22 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019802#comment-16019802 ] Ryan Blue commented on PARQUET-41: -- [~Ferd], it shouldn't matter that the bloom filter is stored at the

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-22 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019792#comment-16019792 ] Ferdinand Xu commented on PARQUET-41: - Thanks [~rdblue] for your comments. bq. For example, if you

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018218#comment-16018218 ] Junjie Chen commented on PARQUET-41: Hi [~rdblue] In telecom example, query column is not unique if

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-18 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016790#comment-16016790 ] Junjie Chen commented on PARQUET-41: Hi [~rdblue] The distinct values in each column is increasing

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-18 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016458#comment-16016458 ] Ryan Blue commented on PARQUET-41: -- [~junjie] & [~Ferd], it would be great to get a bit more data on

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-18 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015449#comment-16015449 ] Junjie Chen commented on PARQUET-41: Hi [~rdblue] We have a real use case from a Telecom company which

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-11 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007446#comment-16007446 ] Ferdinand Xu commented on PARQUET-41: - bq. This is only applicable for columns that aren't

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-10 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005755#comment-16005755 ] Ferdinand Xu commented on PARQUET-41: - It's very useful when trying to filter non-partitioning column.

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-10 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005480#comment-16005480 ] Ryan Blue commented on PARQUET-41: -- [~costimuraru], dictionary-based filters were added that satisfy much

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-10 Thread Constantin Muraru (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005452#comment-16005452 ] Constantin Muraru commented on PARQUET-41: -- Any news on this one? This would be great.

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2016-02-15 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148130#comment-15148130 ] Ferdinand Xu commented on PARQUET-41: - Hi [~rdblue], I have a basic idea about how to estimate the

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-11-17 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15009099#comment-15009099 ] Ryan Blue commented on PARQUET-41: -- [~Ferd], I think we need a design doc for this feature and some data

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-07-07 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616251#comment-14616251 ] Ferdinand Xu commented on PARQUET-41: - Hi [~rdblue], I have some thoughts for the

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-29 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605183#comment-14605183 ] Ferdinand Xu commented on PARQUET-41: - Hi [~rdblue], really appreciate for your long

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-29 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605258#comment-14605258 ] Ferdinand Xu commented on PARQUET-41: - I did a check for some entries in the 1st page

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-29 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606674#comment-14606674 ] Ryan Blue commented on PARQUET-41: -- I should also point out there's a table on the first

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-26 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602532#comment-14602532 ] Ryan Blue commented on PARQUET-41: -- Thanks for working on this, [~Ferd], it's great to be

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-24 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14599093#comment-14599093 ] Ferdinand Xu commented on PARQUET-41: - Hi, I have updated the PR for multiple bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-24 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14600367#comment-14600367 ] Ryan Blue commented on PARQUET-41: -- I don't think the counting bloom filter idea is worth

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-23 Thread Prateek Rungta (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597767#comment-14597767 ] Prateek Rungta commented on PARQUET-41: --- Hey [~Ferd], I did a quick glance through

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-23 Thread Jason Altekruse (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597859#comment-14597859 ] Jason Altekruse commented on PARQUET-41: I did not get a chance to look through

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-23 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597906#comment-14597906 ] Ryan Blue commented on PARQUET-41: -- Interesting, I hadn't heard about the counting bloom

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-23 Thread Jason Altekruse (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597901#comment-14597901 ] Jason Altekruse commented on PARQUET-41: This might have been a little confusing,

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-23 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598800#comment-14598800 ] Ferdinand Xu commented on PARQUET-41: - Hi [~nezihyigitbasi] [~jaltekruse], I don’t

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-17 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14590073#comment-14590073 ] Ryan Blue commented on PARQUET-41: -- Great, thanks [~Ferd]! Could you also tell us a bit

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2015-06-16 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14589347#comment-14589347 ] Ferdinand Xu commented on PARQUET-41: - Hi guys, The pull request for parquet-format-mr