Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4731: Crash when sorting on non-deterministic expr
......................................................................

Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/5914/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4731: Crash when sorting on non-deterministic expr
This may also fix IMPALA-4728 or IMPALA-397.

It might be good to add a targeted-perf test to confirm that it fixes the perf bug IMPALA-4728. I think Greg Rahn had some examples of analytic functions where lazy materialisation was very slow.


http://gerrit.cloudera.org:8080/#/c/5914/1/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

Line 194: } else {
This is a weird corner case, but I think we should drop LiteralExprs instead of materialising them. It's plausible that real-world queries might have order-by expressions that can be rewritten into constants.


Line 207: substituteOrderingExprs(substOrderBy, analyzer);
I think we only need to substitute the SlotRefs in the first branch of the if() above, so we could just do the substitution there. Otherwise it's a bit hard to reason about materializedOrderingExprs containing a mix of SlotRefs referring to the old and new tuples.


http://gerrit.cloudera.org:8080/#/c/5914/1/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

Line 162: def test_order_by_random(self):
I think the example in IMPALA-397 would also be a good test: there, rand() is in the select list, so we can verify that the order is correct.


--
To view, visit http://gerrit.cloudera.org:8080/5914

Gerrit-MessageType: comment
Gerrit-Change-Id: I5dcda32fc7770d42fc500ce87fc54d58e5b5dc00
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes
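
Illustration for the SortInfo.java comments (lines 194 and 207): a rough sketch of the idea in Python, not the frontend's Java and not the code in the patch. The classes SlotRef, Literal, and FnCall, the function plan_ordering_exprs, and the _sort_slot_N naming are all hypothetical stand-ins. The branch structure of the real SortInfo code is not shown in this message, so this only shows the general shape: constants are dropped, and the substitution happens in the same branch that materialises the expression.

```python
from dataclasses import dataclass


# Hypothetical toy expression types; these are NOT Impala's analysis classes.
@dataclass(frozen=True)
class SlotRef:
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class FnCall:
    name: str
    deterministic: bool = True


def plan_ordering_exprs(ordering_exprs):
    """Split ORDER BY exprs into (sort_keys, materialized).

    - Literals are dropped: a constant cannot change the row order.
    - Non-deterministic calls are materialized once into a new slot, and the
      sort key is replaced by a reference to that slot in the same branch.
    - Everything else is kept unchanged and left for lazy evaluation.
    """
    sort_keys = []
    materialized = []  # (new slot, original expr) pairs
    for expr in ordering_exprs:
        if isinstance(expr, Literal):
            continue  # drop the constant instead of materializing it
        if isinstance(expr, FnCall) and not expr.deterministic:
            slot = SlotRef(f"_sort_slot_{len(materialized)}")
            materialized.append((slot, expr))
            sort_keys.append(slot)  # substitute here, not in a later pass
        else:
            sort_keys.append(expr)
    return sort_keys, materialized


if __name__ == "__main__":
    exprs = [SlotRef("a"), Literal(1), FnCall("rand", deterministic=False)]
    keys, mat = plan_ordering_exprs(exprs)
    print(keys)
    print(mat)
```

Because the substitution is done where the new slot is created, every entry in the materialized list pairs exactly one new-tuple slot with one original expression, which avoids the mixed old/new references that the line 207 comment is worried about.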