Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
......................................................................


Patch Set 7:

> Patch Set 7:
>
> > > Patch Set 7:
>  > >
>  > > Perf results:
>  > > ...
>  >
>  > I'm surprised that only a few queries saw significant speedups. Is
>  > this in line with what you saw with Parquet runtime filters on
>  > TPC-H? Or are we losing a lot by using min/max instead of bloom or
>  > in-list style filters?
>
> Not sure about bloom filters perf, though I can run those numbers for 
> comparison.

I haven't looked at this patch, but had a question about the design:

Are we still pushing Bloom filters across a join to prevent shuffling of data?
Or are we now pushing _only_ min/max filters?

It seems there is value in pushing both: the Bloom filter, evaluated on the
other side of the join so that rows that cannot match are never shuffled, and
the min/max filter, pushed all the way down to the scanner to reduce I/O.

Not sure if the patch is already doing this.
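
To make the question concrete, here is a minimal Python sketch of what I mean
by pushing both filters. It is not taken from the patch and is not how Impala
or Kudu implement this: the class names, the `build_filters` and `probe`
helpers, and the Bloom filter parameters are all hypothetical, and the
"scanner" is just a list comprehension standing in for a range predicate
evaluated at the storage layer.

```python
import hashlib


class MinMaxFilter:
    """Tracks the smallest and largest join key seen on the build side."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def insert(self, key):
        self.lo = key if self.lo is None else min(self.lo, key)
        self.hi = key if self.hi is None else max(self.hi, key)

    def may_contain(self, key):
        # An empty filter rejects everything; otherwise do a range check.
        return self.lo is not None and self.lo <= key <= self.hi


class BloomFilter:
    """Toy Bloom filter; sizes are illustrative assumptions, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def may_contain(self, key):
        # May return false positives, never false negatives.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def build_filters(build_keys):
    """Build both filters from the build side of the join."""
    min_max, bloom = MinMaxFilter(), BloomFilter()
    for key in build_keys:
        min_max.insert(key)
        bloom.insert(key)
    return min_max, bloom


def probe(probe_keys, min_max, bloom):
    """Apply min/max at the (simulated) scanner, then Bloom before the shuffle."""
    scanned = [k for k in probe_keys if min_max.may_contain(k)]
    shuffled = [k for k in scanned if bloom.may_contain(k)]
    return scanned, shuffled


build_keys = [100, 105, 110]
min_max, bloom = build_filters(build_keys)
scanned, shuffled = probe(range(1000), min_max, bloom)
print(len(scanned), "rows leave the scanner;", len(shuffled), "rows are shuffled")
```

In this sketch the min/max filter alone lets every key between 100 and 110
through the scanner, while the Bloom filter can drop most of the in-range keys
that have no match before they reach the exchange. The exact number of rows
shuffled depends on Bloom false positives, so I have not stated one.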


--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mjac...@apache.org>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Tue, 24 Oct 2017 18:32:26 +0000
Gerrit-HasComments: No
