[
https://issues.apache.org/jira/browse/PHOENIX-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016098#comment-14016098
]
James Taylor commented on PHOENIX-995:
--------------------------------------
These are looking very good, [~tdsilva]. Thanks so much for the contributions.
bq. was not able to understand how the preservesOrder() is supposed to be
implemented. Does OrderPreserving.YES mean that if inputs to the function are
ordered in a particular way, applying the function will not re-order the
outputs with respect to the inputs? However, can they be sorted differently, e.g. INVERT
has OrderPreserving.YES, even though it inverts the bits of the input?
Yes, exactly what you said: if inputs to the function are ordered in a
particular way, applying the function will not re-order the outputs with
respect to the inputs. This holds regardless of the SortOrder, though, which is
why INVERT can still return OrderPreserving.YES. It's basically a
determination of whether or not the rows need to be sorted. The ASC/DESC option
in ORDER BY takes into account whether or not the SortOrder matches.
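To make the idea concrete, here is a minimal standalone sketch in Java. It is not Phoenix code: the class and method names are hypothetical, and it assumes fixed-width byte values compared as unsigned bytes. It treats a function as keeping rows in sequence if the outputs for sorted inputs move in a single direction, which is my reading of why a bit-inverting function can still qualify.

```java
import java.util.function.UnaryOperator;

// Hypothetical illustration only; none of these names exist in Phoenix.
class OrderPreservingSketch {
    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Flips every bit, as a stand-in for an INVERT-style function.
    static byte[] invert(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) out[i] = (byte) ~in[i];
        return out;
    }

    // True if the outputs for the sorted inputs never change direction,
    // i.e. they are all non-descending or all non-ascending.
    static boolean keepsRowsInSequence(byte[][] sortedInputs, UnaryOperator<byte[]> f) {
        boolean sawAscending = false;
        boolean sawDescending = false;
        for (int i = 1; i < sortedInputs.length; i++) {
            int c = compareBytes(f.apply(sortedInputs[i - 1]), f.apply(sortedInputs[i]));
            if (c < 0) sawAscending = true;
            if (c > 0) sawDescending = true;
        }
        return !(sawAscending && sawDescending);
    }

    public static void main(String[] args) {
        byte[][] sorted = {{0, 1}, {0, 2}, {1, 0}, {1, 1}};
        // Bit inversion reverses the sequence consistently: prints true.
        System.out.println(keepsRowsInSequence(sorted, OrderPreservingSketch::invert));
        // Swapping the two bytes scrambles the sequence: prints false.
        System.out.println(keepsRowsInSequence(sorted, b -> new byte[] {b[1], b[0]}));
    }
}
```

The inverted outputs come back in exactly the reverse sequence, so no re-sort is needed, only awareness of the direction. The byte-swapping function mixes directions, so its outputs would have to be sorted.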
I believe LPAD can be OrderPreserving.YES as long as the amount of padding
being applied is constant (isStateless & isDeterministic are both true). I'm
not sure about your ENCODE function. A BIGINT sorts naturally with its value.
Does a base62 encoded BIGINT sort the same way?
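One way to explore that question is a small standalone Java sketch. Everything here is an assumption for illustration: the digit alphabet (0-9, then A-Z, then a-z, which ascends in ASCII order), the handling of non-negative values only, and the helper names. The ENCODE in the patch may well differ.

```java
// Hypothetical illustration only; not the ENCODE/LPAD from the patch.
class Base62SortSketch {
    // Assumed alphabet: digits ascend in ASCII order, so digit order matches character order.
    static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Encodes a non-negative long as a base 62 string without leading zeros.
    static String encode(long n) {
        if (n < 0) throw new IllegalArgumentException("negative values not handled in this sketch");
        if (n == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            sb.append(ALPHABET.charAt((int) (n % 62)));
            n /= 62;
        }
        return sb.reverse().toString();
    }

    // Left-pads s with fill up to length; longer inputs are returned unchanged (an assumption).
    static String lpad(String s, int length, char fill) {
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < length; i++) sb.append(fill);
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        String a = encode(61); // "z"
        String b = encode(62); // "10"
        // Unpadded: "z" sorts after "10", although 61 < 62.
        System.out.println(a.compareTo(b) > 0);
        // Padded to a constant width with the zero digit: the numeric order is restored.
        System.out.println(lpad(a, 10, '0').compareTo(lpad(b, 10, '0')) < 0);
    }
}
```

Under these assumptions, the variable-length encoding alone does not sort like the BIGINT, but the encoding padded to a constant width with '0' does, at least for non-negative values.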
The getKeyFormationTraversalIndex() is a way for a built-in function to define
how it interacts with the formation of the start key/stop key when a row key
column is used as an argument. An example would be an expression like: s LIKE
'a%'. In this case, assuming s is the leading PK column, we'd know that
the start key would be 'a' and the stop key would be 'b' (exclusive). The
getKeyFormationTraversalIndex() allows these kinds of optimizations to be
expressed (as opposed to falling back to a full table scan).
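As a rough illustration of the LIKE 'a%' case, the following standalone Java sketch derives an exclusive stop key from a prefix by incrementing its last byte that is not 0xFF. The class and method names are hypothetical, and this is not how Phoenix itself is implemented; it only shows the arithmetic behind "start key 'a', stop key 'b'".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not Phoenix's key-formation code.
class PrefixScanSketch {
    // Returns the smallest key that sorts after every key starting with prefix,
    // or null if no such key exists (empty or all-0xFF prefix), meaning "scan to the end".
    static byte[] exclusiveStopKey(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        for (int i = stop.length - 1; i >= 0; i--) {
            if (stop[i] != (byte) 0xFF) {
                stop[i]++;
                // Drop any trailing 0xFF bytes that could not be incremented.
                return Arrays.copyOf(stop, i + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] start = "a".getBytes(StandardCharsets.UTF_8);
        byte[] stop = exclusiveStopKey(start);
        // For the prefix 'a' the scan range is ['a', 'b').
        System.out.println(new String(start, StandardCharsets.UTF_8)
            + " .. " + new String(stop, StandardCharsets.UTF_8));
    }
}
```

A scan bounded this way touches only rows whose leading key bytes begin with the prefix, which is the saving over a full table scan.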
If you don't envision LPAD or ENCODE(num,'base62') being used in a WHERE
clause, it's kind of moot, in which case you can just return NO_TRAVERSAL. If,
on the other hand, you think there will be expressions like WHERE
ENCODE(num,'base62') = 'abcdefg', then it might make sense to implement it.
Given that ENCODE will be used more as a key generator, this seems unlikely, so
I'd advise just starting with NO_TRAVERSAL.
> ADD ENCODE AND LPAD functions
> ------------------------------
>
> Key: PHOENIX-995
> URL: https://issues.apache.org/jira/browse/PHOENIX-995
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Thomas D'Silva
> Attachments: PHOENIX-995.patch
>
>
> Add ENCODE(input number, format encodeformat) which can be used to convert a
> base 10 number to a base 62 number
> Add LPAD(input string, length int [, fill string]) which can be used to left
> pad an input string.
> Together these two functions can be used to generate IDs using sequences, for
> example:
> {code:sql}
> CREATE SEQUENCE foo.bar START WITH 0 INCREMENT BY 62;
> SELECT LPAD(ENCODE(NEXT VALUE FOR foo.bar,'BASE62'), 10,'0') FROM
> SYSTEM."SEQUENCE";
> {code}