[
https://issues.apache.org/jira/browse/PHOENIX-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086041#comment-15086041
]
ramkrishna.s.vasudevan commented on PHOENIX-1544:
-------------------------------------------------
Generally, how does indexing on an array work in any SQL implementation? Indexing
on a specific non-ARRAY column I can understand. Suppose there is a BIGINT array
column holding these values:
[456789, 2343435,667878]
[2343435, 667878, 456789]
[11112222, 222233333, 444444444, 77778888]
There is an index on this BIGINT array.
Now the select query will say WHERE this BIGINT col = '[11112222, 222233333,
444444444, 77778888]'.
So in the index table for this array, we would check whether there is any row
that starts with '11112222' and ends with '77778888'. And that is why you were
saying we will also add the array index (the element's position) to the row key
of the array index table, right?
Without that index entry, we cannot get the exact match by comparing the
serialized format of the array given in the select clause?
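
To make the question concrete, here is a minimal Python sketch of the lookup I
have in mind. It is not Phoenix code: `build_index` and `find_exact` are
hypothetical names, the index is modeled as an in-memory map keyed by
(element, position), and I am assuming the array length is available somewhere
so that a longer array sharing the same prefix is not reported as a match.

```python
from collections import defaultdict


def build_index(rows):
    """Build a toy index with one entry per array element.

    Returns (index, lengths), where index maps (element, position) to the
    set of row ids holding that element at that position, and lengths maps
    each row id to its array length (an assumption of this sketch).
    """
    index = defaultdict(set)
    lengths = {}
    for row_id, array in rows.items():
        lengths[row_id] = len(array)
        for position, element in enumerate(array):
            index[(element, position)].add(row_id)
    return index, lengths


def find_exact(index, lengths, target):
    """Return the row ids whose array equals target, element by element."""
    # Start from rows of the right length, then narrow by each position.
    candidates = {r for r, n in lengths.items() if n == len(target)}
    for position, element in enumerate(target):
        candidates &= index.get((element, position), set())
        if not candidates:
            break
    return candidates


# The three example arrays from above, keyed by a made-up row id.
rows = {
    1: [456789, 2343435, 667878],
    2: [2343435, 667878, 456789],
    3: [11112222, 222233333, 444444444, 77778888],
}
index, lengths = build_index(rows)
match = find_exact(index, lengths, [11112222, 222233333, 444444444, 77778888])
```

Rows 1 and 2 hold the same elements in a different order, so an index keyed on
the element alone could not tell them apart; with the position in the key,
looking up [456789, 2343435, 667878] narrows to row 1 only.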
> Support indexing of an ARRAY
> ----------------------------
>
> Key: PHOENIX-1544
> URL: https://issues.apache.org/jira/browse/PHOENIX-1544
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
>
> Needs to be fleshed out more, but I think we could support indexing array
> data to improve query performance. We could generate an index row per array
> element, tacking on the position of the array element in the row key.
> For example, given the array: ARRAY['a','b','c','a'] you could generate the
> following row keys (where the space is a null byte) when an INDEX is created
> over it:
> {code}
> a 0
> b 1
> c 2
> a 3
> {code}
> Because the data is immutable, we don't need to worry about keeping it in
> sync with changes to the array (which would be difficult).
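
As a rough illustration of the row-key scheme in the description above, the
following Python sketch builds one key per array element as element bytes, a
null byte, then the position. The function name `index_row_keys` is
hypothetical, and encoding the position as an ASCII decimal digit is only an
assumption made to mirror the listing; the description does not say how the
position would be serialized.

```python
def index_row_keys(array):
    """Return one row key per element: element, null byte, position."""
    keys = []
    for position, element in enumerate(array):
        # Assumed layout: UTF-8 element, 0x00 separator, ASCII position.
        key = element.encode("utf-8") + b"\x00" + str(position).encode("ascii")
        keys.append(key)
    return keys


keys = index_row_keys(["a", "b", "c", "a"])
# Row keys in a sorted store are ordered byte by byte, so entries for the
# same element end up next to each other.
sorted_keys = sorted(keys)
```

Since the keys sort by element first, both entries for 'a' (positions 0 and 3)
are adjacent in `sorted_keys`, so a scan on the prefix 'a' plus the null byte
would find every position at which 'a' occurs.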
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)