[jira] [Commented] (PARQUET-1805) Refactor the configuration for bloom filters

Gabor Szadovszky (Jira) Mon, 01 Feb 2021 01:55:04 -0800


    [ 
https://issues.apache.org/jira/browse/PARQUET-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276213#comment-17276213
 ]


Gabor Szadovszky commented on PARQUET-1805:
-------------------------------------------

Oh, I got it, thanks [~junjie]. I felt it was more logical this way. The 
"major" configuration applies to all columns, and the "column-specific" one 
overrides it for individual columns. Since the "major" one is false by default, 
you only need to enable the bloom filters for the columns one by one. You don't 
even need to set `parquet.bloom.filter.enabled`, only the column-specific ones. 
We've tried to describe this in the 
[README|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/README.md].
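
To illustrate the precedence described above, here is a minimal sketch that models the lookup with a plain map instead of a Hadoop `Configuration`. It assumes the column-specific key is formed as `parquet.bloom.filter.enabled#<column path>`, as described in the README. The class name, the helper `isBloomFilterEnabled`, and the column names `user_id` and `comment` are hypothetical and are not part of parquet-mr.

```java
import java.util.HashMap;
import java.util.Map;

public class BloomFilterConfigSketch {
    // The "major" key from the comment above; it is false by default.
    static final String ENABLED = "parquet.bloom.filter.enabled";

    // Hypothetical helper (not parquet-mr API): a column-specific setting,
    // if present, takes precedence over the "major" setting.
    static boolean isBloomFilterEnabled(Map<String, String> conf, String columnPath) {
        // Assumed key format: parquet.bloom.filter.enabled#<column path>
        String columnValue = conf.get(ENABLED + "#" + columnPath);
        if (columnValue != null) {
            return Boolean.parseBoolean(columnValue);
        }
        // Fall back to the "major" setting, defaulting to false.
        return Boolean.parseBoolean(conf.getOrDefault(ENABLED, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Only the column-specific key is set; the "major" key is left unset.
        conf.put(ENABLED + "#user_id", "true");

        System.out.println(isBloomFilterEnabled(conf, "user_id")); // prints "true"
        System.out.println(isBloomFilterEnabled(conf, "comment")); // prints "false"
    }
}
```

With a real Hadoop `Configuration`, the equivalent would presumably be setting only the column-specific keys and leaving `parquet.bloom.filter.enabled` untouched.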

> Refactor the configuration for bloom filters
> --------------------------------------------
>
>                 Key: PARQUET-1805
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1805
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Gabor Szadovszky
>            Assignee: Gabor Szadovszky
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>
> Refactor the hadoop configuration for bloom filters according to PARQUET-1784.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
