[ https://issues.apache.org/jira/browse/IMPALA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966631#comment-16966631 ]
Zoltán Borók-Nagy commented on IMPALA-8755:
-------------------------------------------

[~arodoni] Yeah, I think we should wait until the feature is usable.

> Implement Z-ordering for Impala
> -------------------------------
>
>                 Key: IMPALA-8755
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8755
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Norbert Luksa
>            Priority: Major
>
> Implement Z-ordering for Impala: [https://en.wikipedia.org/wiki/Z-order_curve]
> A Z-order curve defines an ordering on multi-dimensional data. Data sorted
> that way can be efficiently filtered by min/max statistics on the columns
> participating in the ordering.
> Impala currently only supports lexicographic ordering via the SORT BY clause.
> This strongly prefers the first column, i.e. given the "SORT BY A, B, C"
> clause => A will be totally ordered (hence filtering on A will be very
> efficient), but values belonging to B and C will be scattered throughout the
> data set (hence filtering on B or C will barely do any good).
> We could add a new clause, e.g. a "ZSORT BY" clause to Impala that writes the
> data in Z-order.
> "ZSORT BY A, B, C" would cluster the rows in a way that filtering on A, B, or
> C would be equally efficient.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
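
The difference between lexicographic and Z-order clustering described in the issue can be illustrated with a small sketch. This is not Impala code: the helpers `z_value` and `count_chunks_hit` are hypothetical names, the columns are assumed to be small non-negative integers, and fixed-size chunks of rows stand in for whatever storage unit carries the min/max statistics.

```python
def z_value(coords, bits=8):
    """Interleave the low `bits` bits of each non-negative integer coordinate."""
    z = 0
    ndims = len(coords)
    for bit in range(bits):
        for dim, c in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * ndims + dim.
            z |= ((c >> bit) & 1) << (bit * ndims + dim)
    return z


def count_chunks_hit(rows, col, value, chunk_size):
    """Count chunks whose [min, max] range on column `col` contains `value`."""
    hits = 0
    for start in range(0, len(rows), chunk_size):
        vals = [row[col] for row in rows[start:start + chunk_size]]
        if min(vals) <= value <= max(vals):
            hits += 1  # min/max statistics cannot rule this chunk out
    return hits


# Assumed toy data: every (A, B) pair with A and B in 0..15, in chunks of 16 rows.
rows = [(a, b) for a in range(16) for b in range(16)]
lex_sorted = sorted(rows)                                   # like SORT BY A, B
z_sorted = sorted(rows, key=lambda r: z_value(r, bits=4))   # Z-order on (A, B)

for name, data in (("lexicographic", lex_sorted), ("z-order", z_sorted)):
    hit_a = count_chunks_hit(data, col=0, value=5, chunk_size=16)
    hit_b = count_chunks_hit(data, col=1, value=5, chunk_size=16)
    print(f"{name}: A = 5 reads {hit_a} of 16 chunks, B = 5 reads {hit_b} of 16 chunks")
```

On this toy data the lexicographic order reads 1 chunk for a filter on A but all 16 for a filter on B, while the Z-order reads 4 chunks for either column, which is the trade-off the issue description points at.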