Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9440 )

Change subject: KUDU-2312: Scan predicate application ordering is 
non-deterministic
......................................................................


Patch Set 4:

I did a little research, and it appears that Impala does order 
predicates based on selectivity. See 
https://gist.github.com/89344b31e455a72a831761b5c6ef75f7

That said, I checked a version of Kudu without this patch, and the scans 
dashboard shows that the predicates were being evaluated in the same order 
(field5 followed by field4) regardless of what Impala said in the explain plan. 
I also checked an RPC trace and can see that the actual RPC shows the 
predicates in the order [field4, field5] regardless of what the explain plan 
says. So it seems likely that our client is dropping the order somewhere during 
the scan optimization phase.

If we changed this to use std::stable_sort instead of tie-breaking on index, 
would we end up retaining the client's predicate order, do you think? Or would 
it be a bigger change? It certainly seems valuable to take advantage of 
Impala's stats knowledge in our order of application.
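
To make the question concrete, here is a rough sketch of the two options. It 
is not the code in this patch: the Pred struct, its fields, and the function 
names are all made up for illustration, and "cost" stands in for whatever 
selectivity estimate the real sort uses.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a column predicate; not Kudu's actual type.
struct Pred {
  std::string column;  // column name
  int col_idx;         // position of the column in the schema
  int cost;            // assumed selectivity estimate; lower is applied first
};

// Helper: extract the column names in their current order.
std::vector<std::string> Names(const std::vector<Pred>& preds) {
  std::vector<std::string> out;
  for (const Pred& p : preds) out.push_back(p.column);
  return out;
}

// Option A: sort by cost and break ties on schema index. The result is
// deterministic, but the order the caller supplied is discarded.
std::vector<std::string> OrderByIndexTieBreak(std::vector<Pred> preds) {
  std::sort(preds.begin(), preds.end(), [](const Pred& a, const Pred& b) {
    if (a.cost != b.cost) return a.cost < b.cost;
    return a.col_idx < b.col_idx;
  });
  return Names(preds);
}

// Option B: stable sort on cost alone. Predicates with equal cost keep the
// relative order they had in the input.
std::vector<std::string> OrderStable(std::vector<Pred> preds) {
  std::stable_sort(preds.begin(), preds.end(),
                   [](const Pred& a, const Pred& b) { return a.cost < b.cost; });
  return Names(preds);
}
```

With an input of field5 then field4 at equal cost, option A would apply 
field4 first and option B would keep field5 first. Note that option B only 
preserves Impala's order if the predicates still arrive at the sort in that 
order, which the RPC trace above suggests is not currently the case.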


--
To view, visit http://gerrit.cloudera.org:8080/9440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I99b2cabecd8626cad7e11fbdd492af7276e08348
Gerrit-Change-Number: 9440
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Sat, 03 Mar 2018 01:52:37 +0000
Gerrit-HasComments: No
