[
https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars Hofhansl reopened PHOENIX-6458:
------------------------------------
> Using global indexes for queries with uncovered columns
> -------------------------------------------------------
>
> Key: PHOENIX-6458
> URL: https://issues.apache.org/jira/browse/PHOENIX-6458
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 5.1.0
> Reporter: Kadir Ozdemir
> Assignee: Kadir OZDEMIR
> Priority: Major
> Fix For: 4.17.0, 5.2.0, 5.1.3
>
> Attachments: PHOENIX-6458.master.001.patch,
> PHOENIX-6458.master.002.patch, PHOENIX-6458.master.addendum.patch
>
>
> The Phoenix query optimizer does not use a global index for a query that
> references columns not covered by that index unless the query carries an
> index hint for it. With the hint, the optimizer rewrites the query so that
> the index is used within a subquery. The Phoenix client retrieves the row
> keys of the index rows that satisfy this subquery and pushes them into the
> Phoenix server caches of the data table regions. Finally, on the server
> side, data table rows are scanned and joined with the index rows using
> HashJoin. Depending on the selectivity of the original query, this join
> operation may still scan a large number of data table rows.
> Eliminating these data table scans would be a significant improvement. To do
> that, instead of rewriting the query, the Phoenix optimizer simply treats the
> global index as a covered index for the given query. The optimizer then
> chooses the index table for the query, especially when the index row key
> prefix length is greater than the data row key prefix length for the query.
> On the server side, the index table is scanned using the index row key
> ranges implied by the query, and the index row keys are then mapped to the
> data table row keys (note that an index row key includes all the data row
> key columns). Finally, the corresponding data table rows are retrieved using
> server-to-server RPCs. PHOENIX-6458 (this Jira) retrieves the data table
> rows one by one using the HBase get operation. PHOENIX-6501 replaces this get
> operation with a scan operation to reduce the number of server-to-server
> RPC calls.
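
The server-side flow described above can be sketched roughly as follows.
This is a toy in-memory model, not Phoenix or HBase code: the tables, the
column names, and every function name (scan_index, to_data_key,
fetch_one_by_one, fetch_batched) are hypothetical, and the batching is only
an assumed stand-in to show why grouping lookups can lower the RPC count.

```python
from bisect import bisect_left

# Toy data table: data row key (a tuple) -> full row. Hypothetical contents.
DATA_TABLE = {
    ("k1",): {"id": "k1", "city": "Oslo", "name": "Ann"},
    ("k2",): {"id": "k2", "city": "Rome", "name": "Bob"},
    ("k3",): {"id": "k3", "city": "Oslo", "name": "Cy"},
    ("k4",): {"id": "k4", "city": "Lima", "name": "Dee"},
}

# Toy global index on "city". Each index row key is the indexed value
# followed by all data row key columns, kept in sorted order.
INDEX_TABLE = sorted((row["city"],) + key for key, row in DATA_TABLE.items())


def scan_index(value):
    """Range-scan the index for keys whose leading column equals value."""
    start = bisect_left(INDEX_TABLE, (value,))
    matches = []
    for index_key in INDEX_TABLE[start:]:
        if index_key[0] != value:
            break
        matches.append(index_key)
    return matches


def to_data_key(index_key, num_indexed_columns=1):
    """Map an index row key to a data row key by dropping the indexed columns."""
    return index_key[num_indexed_columns:]


def fetch_one_by_one(data_keys):
    """One simulated RPC per data row (modelled on a per-row get)."""
    rows, rpc_calls = [], 0
    for key in data_keys:
        rows.append(DATA_TABLE[key])
        rpc_calls += 1
    return rows, rpc_calls


def fetch_batched(data_keys, batch_size=100):
    """One simulated RPC per batch of data rows (assumed batching model)."""
    rows, rpc_calls = [], 0
    for i in range(0, len(data_keys), batch_size):
        batch = data_keys[i:i + batch_size]
        rows.extend(DATA_TABLE[key] for key in batch)
        rpc_calls += 1
    return rows, rpc_calls


if __name__ == "__main__":
    # Roughly: SELECT name FROM t WHERE city = 'Oslo', where "name" is
    # not covered by the index on "city".
    data_keys = [to_data_key(k) for k in scan_index("Oslo")]
    for fetch in (fetch_one_by_one, fetch_batched):
        rows, rpc_calls = fetch(data_keys)
        print(fetch.__name__, [r["name"] for r in rows], rpc_calls)
```

In this model both fetch functions return the same rows; only the number of
simulated calls differs, which is the effect the description attributes to
replacing the get operation with a scan.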