[ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--------------------------------------
    Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=600!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=600! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=250,width=250!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=250,width=250! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.


> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> ------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5796
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5796
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.0.0, 4.15.0
>            Reporter: Chinmay Kulkarni
>            Priority: Major
>         Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=600!
> *Which is a full table scan on the base table 't' since 'c' is not a covered 
> column in the global index*
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=600! 
> In the first query, an optimization can be to *query the index table, get the 
> start and stop keys of the base table and then issue a range scan/(bunch of 
> point lookups) on the base table* instead of doing a full table scan on the 
> base table like we currently do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to