[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: PHOENIX-5793-v3.patch

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5793-v2.patch, PHOENIX-5793-v3.patch
>
>
> For a join SQL query like "A join B", the implementation of SortMergeJoinPlan 
> currently initializes the two child iterators A and B one after the other.
> By initializing A and B in parallel, we can improve performance in two 
> aspects:
> 1) The initialization times of the two children overlap.
> 2) If one child query returns an empty result, the other child query can be 
> canceled, since the final join result must also be empty.
>  
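A minimal, self-contained sketch of the idea, not the actual SortMergeJoinPlan change in the attached patch: the class, method, and use of plain Lists and a local executor are illustrative assumptions standing in for Phoenix's internal iterators and thread pool.

{code:java}
// Illustrative sketch only: initialize both join children in parallel and
// short-circuit when one side is empty. Names and types are hypothetical.
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelJoinInitSketch {
    static Iterator<String> init(Callable<List<String>> lhs, Callable<List<String>> rhs)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<List<String>> lhsFuture = pool.submit(lhs); // both children start
            Future<List<String>> rhsFuture = pool.submit(rhs); // initializing at once
            List<String> lhsRows = lhsFuture.get();
            if (lhsRows.isEmpty()) {
                // Fast return: an inner join with an empty side is empty,
                // so the other child can be canceled early.
                rhsFuture.cancel(true);
                return Collections.emptyIterator();
            }
            List<String> rhsRows = rhsFuture.get();
            // ...the real plan would now run the sort-merge join over both children...
            return lhsRows.iterator();
        } finally {
            pool.shutdownNow();
        }
    }
}
{code}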



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5765) Add unit tests for prepareIndexMutationsForRebuild() of IndexRebuildRegionScanner

2020-03-23 Thread Weiming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wang updated PHOENIX-5765:
--
Attachment: (was: PHOENIX-5765.4.x-HBase-1.5.v1.patch)

> Add unit tests for prepareIndexMutationsForRebuild() of 
> IndexRebuildRegionScanner
> -
>
> Key: PHOENIX-5765
> URL: https://issues.apache.org/jira/browse/PHOENIX-5765
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Swaroopa Kadam
>Assignee: Weiming Wang
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Unit-tests for prepareIndexMutationsForRebuild



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5765) Add unit tests for prepareIndexMutationsForRebuild() of IndexRebuildRegionScanner

2020-03-23 Thread Weiming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wang updated PHOENIX-5765:
--
Attachment: PHOENIX-5765.4.x-HBase-1.5.v1.patch

> Add unit tests for prepareIndexMutationsForRebuild() of 
> IndexRebuildRegionScanner
> -
>
> Key: PHOENIX-5765
> URL: https://issues.apache.org/jira/browse/PHOENIX-5765
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Swaroopa Kadam
>Assignee: Weiming Wang
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Unit-tests for prepareIndexMutationsForRebuild



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5765) Add unit tests for prepareIndexMutationsForRebuild() of IndexRebuildRegionScanner

2020-03-23 Thread Weiming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wang updated PHOENIX-5765:
--
Attachment: (was: PHOENIX-5765.4.x-HBase-1.5.v1.patch)

> Add unit tests for prepareIndexMutationsForRebuild() of 
> IndexRebuildRegionScanner
> -
>
> Key: PHOENIX-5765
> URL: https://issues.apache.org/jira/browse/PHOENIX-5765
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Swaroopa Kadam
>Assignee: Weiming Wang
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Unit-tests for prepareIndexMutationsForRebuild



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5765) Add unit tests for prepareIndexMutationsForRebuild() of IndexRebuildRegionScanner

2020-03-23 Thread Weiming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wang updated PHOENIX-5765:
--
Attachment: (was: PHOENIX-5765.4.x-HBase-1.5.v1.patch)

> Add unit tests for prepareIndexMutationsForRebuild() of 
> IndexRebuildRegionScanner
> -
>
> Key: PHOENIX-5765
> URL: https://issues.apache.org/jira/browse/PHOENIX-5765
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Swaroopa Kadam
>Assignee: Weiming Wang
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Unit-tests for prepareIndexMutationsForRebuild



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Summary: Possible query optimization when projecting uncovered columns and 
querying on indexed columns  (was: Possible query optimization when query 
projects uncovered columns and queries on indexed columns)

> Possible query optimization when projecting uncovered columns and querying on 
> indexed columns
> -
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
> *Which is a full table scan on the base table 't'* since we cannot use the 
> global index as 'c' is not a covered column in the global index.
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, can there be an optimization to *query the index table, 
> get the start and stop keys of the base table and then issue a range 
> scan/(bunch of point lookups) on the base table* instead of doing a full 
> table scan on the base table like we currently do?
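For illustration, a hedged JDBC sketch of what the proposed two-step access could look like if done by hand; this is not an existing Phoenix code path. It assumes the inner lookup of 'a' by 'b' is served by uncov_index_t (both columns are in the index row key, as the second EXPLAIN above shows), while the second step does point lookups on the base table for the uncovered column 'c'.

{code:java}
// Hypothetical two-step access: read matching primary keys via the index,
// then do point lookups on the base table for the uncovered column c.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class TwoStepLookupSketch {
    static List<Integer> selectCForB(Connection conn, String b) throws SQLException {
        List<Integer> keys = new ArrayList<>();
        // Step 1: 'a' and 'b' are both in the index row key, so this should be a
        // range scan on UNCOV_INDEX_T rather than a full scan of the base table.
        try (PreparedStatement ps = conn.prepareStatement("SELECT A FROM T WHERE B = ?")) {
            ps.setString(1, b);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    keys.add(rs.getInt(1));
                }
            }
        }
        // Step 2: point lookups on the base table by primary key for column c.
        List<Integer> result = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement("SELECT C FROM T WHERE A = ?")) {
            for (int a : keys) {
                ps.setInt(1, a);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        result.add(rs.getInt(1));
                    }
                }
            }
        }
        return result;
    }
}
{code}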



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5718) GetTable builds a table excluding the given clientTimeStamp

2020-03-23 Thread Sandeep Guggilam (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Guggilam updated PHOENIX-5718:
--
Attachment: PHOENIX-5718.master.v2.patch

> GetTable builds a table excluding the given clientTimeStamp
> ---
>
> Key: PHOENIX-5718
> URL: https://issues.apache.org/jira/browse/PHOENIX-5718
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.16.0
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-5718.4.x-HBase-1.3.v1.patch, 
> PHOENIX-5718.4.x-HBase-1.3.v2.patch, PHOENIX-5718.4.x.v1.patch, 
> PHOENIX-5718.4.x.v1.patch, PHOENIX-5718.master.v1.patch, 
> PHOENIX-5718.master.v2.patch
>
>
> Here is the scenario tested:
>  # Bring up a server with 4.16 where the new columns exist in the code but 
> were not added as part of the upgrade path
>  # Connect with a 4.14 client
>  # Connect with a 4.16 client - this throws an exception because the new 
> columns added in 4.16 were not added as part of the upgrade path
>  # The code then force-updates the cache in the 
> PhoenixStatement#executeQuery() method
>  # buildTable now removes even the columns added as part of 4.15: we pass the 
> clientTimeStamp to buildTable (say 29 is the timestamp of the column added 
> for 4.15), but the scan reads rows EXCLUDING the passed clientTimeStamp, 
> because the Scan#setTimeRange method treats the end timestamp as exclusive 
> (see the sketch below)
> The clientTimeStamp is passed to buildTable in the 
> MetaDataEndPointImpl#doGetTable method
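A small sketch of the exclusive-end behaviour described in step 5. This is not the Phoenix buildTable code, only the relevant HBase Scan API; the class and method names are made up, and the "+1" shown in the comment is one possible remedy rather than the committed fix.

{code:java}
// Scan#setTimeRange(min, max) treats max as exclusive, so cells written exactly
// at clientTimeStamp are skipped unless the caller passes clientTimeStamp + 1.
import java.io.IOException;
import org.apache.hadoop.hbase.client.Scan;

class BuildTableScanSketch {
    static Scan scanUpTo(long clientTimeStamp) throws IOException {
        Scan scan = new Scan();
        // Behaviour described above: [0, clientTimeStamp) EXCLUDES the cells
        // stamped at clientTimeStamp itself (e.g. the 4.15 column at ts=29).
        scan.setTimeRange(0, clientTimeStamp);
        // One possible fix: scan [0, clientTimeStamp + 1) so those cells are included.
        // scan.setTimeRange(0, clientTimeStamp + 1);
        return scan;
    }
}
{code}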



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5718) GetTable builds a table excluding the given clientTimeStamp

2020-03-23 Thread Sandeep Guggilam (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Guggilam updated PHOENIX-5718:
--
Attachment: (was: PHOENIX-5718.master.v2.patch)

> GetTable builds a table excluding the given clientTimeStamp
> ---
>
> Key: PHOENIX-5718
> URL: https://issues.apache.org/jira/browse/PHOENIX-5718
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.16.0
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-5718.4.x-HBase-1.3.v1.patch, 
> PHOENIX-5718.4.x-HBase-1.3.v2.patch, PHOENIX-5718.4.x.v1.patch, 
> PHOENIX-5718.4.x.v1.patch, PHOENIX-5718.master.v1.patch, 
> PHOENIX-5718.master.v2.patch
>
>
> Here is the scenario tested:
>  # Bring up a server with 4.16 where the new columns exist in the code but 
> were not added as part of the upgrade path
>  # Connect with a 4.14 client
>  # Connect with a 4.16 client - this throws an exception because the new 
> columns added in 4.16 were not added as part of the upgrade path
>  # The code then force-updates the cache in the 
> PhoenixStatement#executeQuery() method
>  # buildTable now removes even the columns added as part of 4.15: we pass the 
> clientTimeStamp to buildTable (say 29 is the timestamp of the column added 
> for 4.15), but the scan reads rows EXCLUDING the passed clientTimeStamp, 
> because the Scan#setTimeRange method treats the end timestamp as exclusive
> The clientTimeStamp is passed to buildTable in the 
> MetaDataEndPointImpl#doGetTable method



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5718) GetTable builds a table excluding the given clientTimeStamp

2020-03-23 Thread Sandeep Guggilam (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Guggilam updated PHOENIX-5718:
--
Attachment: PHOENIX-5718.4.x.v1.patch

> GetTable builds a table excluding the given clientTimeStamp
> ---
>
> Key: PHOENIX-5718
> URL: https://issues.apache.org/jira/browse/PHOENIX-5718
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.16.0
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-5718.4.x-HBase-1.3.v1.patch, 
> PHOENIX-5718.4.x-HBase-1.3.v2.patch, PHOENIX-5718.4.x.v1.patch, 
> PHOENIX-5718.4.x.v1.patch, PHOENIX-5718.master.v1.patch, 
> PHOENIX-5718.master.v2.patch
>
>
> Here is the scenario tested:
>  # Bring up a server with 4.16 where the new columns exist in the code but 
> were not added as part of the upgrade path
>  # Connect with a 4.14 client
>  # Connect with a 4.16 client - this throws an exception because the new 
> columns added in 4.16 were not added as part of the upgrade path
>  # The code then force-updates the cache in the 
> PhoenixStatement#executeQuery() method
>  # buildTable now removes even the columns added as part of 4.15: we pass the 
> clientTimeStamp to buildTable (say 29 is the timestamp of the column added 
> for 4.15), but the scan reads rows EXCLUDING the passed clientTimeStamp, 
> because the Scan#setTimeRange method treats the end timestamp as exclusive
> The clientTimeStamp is passed to buildTable in the 
> MetaDataEndPointImpl#doGetTable method



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5718) GetTable builds a table excluding the given clientTimeStamp

2020-03-23 Thread Sandeep Guggilam (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Guggilam updated PHOENIX-5718:
--
Attachment: (was: PHOENIX-5718.master.v2.patch)

> GetTable builds a table excluding the given clientTimeStamp
> ---
>
> Key: PHOENIX-5718
> URL: https://issues.apache.org/jira/browse/PHOENIX-5718
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.16.0
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-5718.4.x-HBase-1.3.v1.patch, 
> PHOENIX-5718.4.x-HBase-1.3.v2.patch, PHOENIX-5718.4.x.v1.patch, 
> PHOENIX-5718.4.x.v1.patch, PHOENIX-5718.master.v1.patch, 
> PHOENIX-5718.master.v2.patch
>
>
> Here is the scenario tested:
>  # Bring up a server with 4.16 where the new columns exist in the code but 
> were not added as part of the upgrade path
>  # Connect with a 4.14 client
>  # Connect with a 4.16 client - this throws an exception because the new 
> columns added in 4.16 were not added as part of the upgrade path
>  # The code then force-updates the cache in the 
> PhoenixStatement#executeQuery() method
>  # buildTable now removes even the columns added as part of 4.15: we pass the 
> clientTimeStamp to buildTable (say 29 is the timestamp of the column added 
> for 4.15), but the scan reads rows EXCLUDING the passed clientTimeStamp, 
> because the Scan#setTimeRange method treats the end timestamp as exclusive
> The clientTimeStamp is passed to buildTable in the 
> MetaDataEndPointImpl#doGetTable method



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5765) Add unit tests for prepareIndexMutationsForRebuild() of IndexRebuildRegionScanner

2020-03-23 Thread Weiming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wang updated PHOENIX-5765:
--
Attachment: PHOENIX-5765.4.x-HBase-1.5.v1.patch

> Add unit tests for prepareIndexMutationsForRebuild() of 
> IndexRebuildRegionScanner
> -
>
> Key: PHOENIX-5765
> URL: https://issues.apache.org/jira/browse/PHOENIX-5765
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Swaroopa Kadam
>Assignee: Weiming Wang
>Priority: Major
> Attachments: PHOENIX-5765.4.x-HBase-1.5.v1.patch, 
> PHOENIX-5765.4.x-HBase-1.5.v1.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Unit-tests for prepareIndexMutationsForRebuild



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
*Which is a full table scan on the base table 't'* since we cannot use the 
global index as 'c' is not a covered column in the global index.

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 

In the first query, can there be an optimization to *query the index table, get 
the start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do?

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
*Which is a full table scan on the base table 't'* since we cannot use the 
global index as 'c' is not a covered column in the global index.

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.


> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
> *Which is a full table scan on the base table 't'* since we cannot use the 
> global index as 'c' is not a covered column in the global index.
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, can there be an optimization to *query the index table, 
> get the start and stop keys of the base table and then issue a range 
> scan/(bunch of point lookups) on the base table* instead of doing a full 
> table scan on the base table like we currently do?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
*Which is a full table scan on the base table 't'* since we cannot use the 
global index as 'c' is not a covered column in the global index.

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.


> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
> *Which is a full table scan on the base table 't'* since we cannot use the 
> global index as 'c' is not a covered column in the global index.
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, an optimization can be to *query the index table, get the 
> start and stop keys of the base table and then issue a range scan/(bunch of 
> point lookups) on the base table* instead of doing a full table scan on the 
> base table like we currently do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=800!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=800! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.


> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
> *Which is a full table scan on the base table 't' since 'c' is not a covered 
> column in the global index*
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, an optimization can be to *query the index table, get the 
> start and stop keys of the base table and then issue a range scan/(bunch of 
> point lookups) on the base table* instead of doing a full table scan on the 
> base table like we currently do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=800!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=800! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=600!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=600! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.


> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=800!
> *Which is a full table scan on the base table 't' since 'c' is not a covered 
> column in the global index*
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=800! 
> In the first query, an optimization can be to *query the index table, get the 
> start and stop keys of the base table and then issue a range scan/(bunch of 
> point lookups) on the base table* instead of doing a full table scan on the 
> base table like we currently do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=600!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=600! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=250,width=250!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=250,width=250! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.


> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=600!
> *Which is a full table scan on the base table 't' since 'c' is not a covered 
> column in the global index*
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=600! 
> In the first query, an optimization can be to *query the index table, get the 
> start and stop keys of the base table and then issue a range scan/(bunch of 
> point lookups) on the base table* instead of doing a full table scan on the 
> base table like we currently do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=250,width=250!
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=250,width=250! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png! 
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.


> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=250,width=250!
> *Which is a full table scan on the base table 't' since 'c' is not a covered 
> column in the global index*
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=250,width=250! 
> In the first query, an optimization can be to *query the index table, get the 
> start and stop keys of the base table and then issue a range scan/(bunch of 
> point lookups) on the base table* instead of doing a full table scan on the 
> base table like we currently do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Description: 
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}
You'd see the following explain plan:
 !Screen Shot 2020-03-23 at 3.25.38 PM.png! 
*Which is a full table scan on the base table 't' since 'c' is not a covered 
column in the global index*

*However, projecting columns contained fully within the index pk is correctly a 
range scan:*
{code:sql}
explain select a,b from t where b='abc';
{code}
produces the following explain plan:
 !Screen Shot 2020-03-23 at 3.32.24 PM.png! 

In the first query, an optimization can be to *query the index table, get the 
start and stop keys of the base table and then issue a range scan/(bunch of 
point lookups) on the base table* instead of doing a full table scan on the 
base table like we currently do.

  was:
Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}



> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png! 
> *Which is a full table scan on the base table 't' since 'c' is not a covered 
> column in the global index*
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png! 
> In the first query, an optimization can be to *query the index table, get the 
> start and stop keys of the base table and then issue a range scan/(bunch of 
> point lookups) on the base table* instead of doing a full table scan on the 
> base table like we currently do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Attachment: Screen Shot 2020-03-23 at 3.32.24 PM.png

> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Attachment: Screen Shot 2020-03-23 at 3.25.38 PM.png

> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Attachment: (was: Screen Shot 2020-03-23 at 3.25.38 PM.png)

> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-5796:
--
Attachment: (was: Screen Shot 2020-03-23 at 3.25.38 PM.png)

> Possible query optimization when query projects uncovered columns and queries 
> on indexed columns
> 
>
> Key: PHOENIX-5796
> URL: https://issues.apache.org/jira/browse/PHOENIX-5796
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Priority: Major
> Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5796) Possible query optimization when query projects uncovered columns and queries on indexed columns

2020-03-23 Thread Chinmay Kulkarni (Jira)
Chinmay Kulkarni created PHOENIX-5796:
-

 Summary: Possible query optimization when query projects uncovered 
columns and queries on indexed columns
 Key: PHOENIX-5796
 URL: https://issues.apache.org/jira/browse/PHOENIX-5796
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 4.15.0, 5.0.0
Reporter: Chinmay Kulkarni
 Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
2020-03-23 at 3.25.38 PM.png

Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to it 
using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.

Create a base table like:
{code:sql}
create table t (a integer primary key, b varchar(10), c integer);
{code}

Create an uncovered index on top of it like:
{code:sql}
create index uncov_index_t on t(b);
{code}

Now if you issue the query:
{code:sql}
explain select c from t where b='abc';
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5765) Add unit tests for prepareIndexMutationsForRebuild() of IndexRebuildRegionScanner

2020-03-23 Thread Weiming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wang updated PHOENIX-5765:
--
Attachment: (was: PHOENIX-5765.4.x-HBase-1.5.v1.patch)

> Add unit tests for prepareIndexMutationsForRebuild() of 
> IndexRebuildRegionScanner
> -
>
> Key: PHOENIX-5765
> URL: https://issues.apache.org/jira/browse/PHOENIX-5765
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Swaroopa Kadam
>Assignee: Weiming Wang
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Unit-tests for prepareIndexMutationsForRebuild



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5746) Update release documentation to include versions information

2020-03-23 Thread Sandeep Guggilam (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Guggilam updated PHOENIX-5746:
--
Attachment: PHOENIX-5746.v2.patch

> Update release documentation to include versions information
> 
>
> Key: PHOENIX-5746
> URL: https://issues.apache.org/jira/browse/PHOENIX-5746
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.16.0
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-5746.patch, PHOENIX-5746.v2.patch
>
>
> We need to update the release documentation so that the VERSIONS information 
> includes the current version (major, minor, or patch version change) and the 
> client versions that are compatible with the current version



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (PHOENIX-5785) Remove TTL check in QueryCompiler when doing an SCN / Lookback query

2020-03-23 Thread Geoffrey Jacoby (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby resolved PHOENIX-5785.
--
Fix Version/s: 4.16.0
   Resolution: Fixed

Merged PR to 4.x. (Master isn't needed because PHOENIX-5645 isn't in master.) 

> Remove TTL check in QueryCompiler when doing an SCN / Lookback query
> 
>
> Key: PHOENIX-5785
> URL: https://issues.apache.org/jira/browse/PHOENIX-5785
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Major
> Fix For: 4.16.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As a sanity check, the Phoenix client verifies that the SCN for a query is not 
> before the TTL of any table involved in the query. This causes problems if 
> access control is enabled and the current user doesn't have ADMIN or CREATE 
> privileges, because HBase requires schema-altering privileges to _read_ the 
> full schema in getTableDescriptor. 
> According to the HBase community, this is because sensitive config parameters 
> can be stored in table descriptor properties, such as those used in HBase 
> encryption. See HBASE-24018, HBASE-8692, and HBASE-9182 for previous 
> discussion, and PHOENIX-5750 for a previous instance where this has affected 
> Phoenix. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (PHOENIX-5795) Supporting selective queries for index rows updated concurrently

2020-03-23 Thread Kadir OZDEMIR (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR reassigned PHOENIX-5795:
--

Assignee: Kadir OZDEMIR

> Supporting selective queries for index rows updated concurrently
> 
>
> Key: PHOENIX-5795
> URL: https://issues.apache.org/jira/browse/PHOENIX-5795
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Kadir OZDEMIR
>Assignee: Kadir OZDEMIR
>Priority: Critical
>
> From the consistent indexing design (PHOENIX-5156) perspective, two or more 
> pending updates from different batches on the same data row are concurrent if 
> and only if for all of these updates the data table row state is read from 
> HBase under the row lock and for none of them the row lock has been acquired 
> the second time for updating the data table. In other words, all of them are 
> in the first update phase concurrently. For concurrent updates, the first two 
> update phases are done but the last update phase is skipped. This means the 
> data table row will be updated by these updates but the corresponding index 
> table rows will be left with the unverified status. Then, the read repair 
> process will repair these unverified index rows during scans.
> In addition to leaving index rows unverified, the concurrent updates may 
> generate index rows with incorrect row keys. For example, consider that an 
> application issues the very first two upserts on the same row concurrently 
> and the second upsert does not include one or more of the indexed columns. 
> When these updates arrive concurrently at IndexRegionObserver, the existing 
> row state is null for both of these updates. This means the index updates 
> will be generated solely from the pending updates. The partial upsert with 
> missing indexed columns will generate an index row by assuming the missing 
> indexed columns have null values, and this assumption may not be true, as the 
> other concurrent upsert may have non-null values for the indexed columns. 
> After issuing the concurrent updates, if the application attempts to read 
> back the row using a selective query on the index table, and this selective 
> query maps to an HBase scan that does not scan these unverified rows because 
> of their incorrect row keys, the application will not get the row content 
> back correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5795) Supporting selective queries for index rows updated concurrently

2020-03-23 Thread Kadir OZDEMIR (Jira)
Kadir OZDEMIR created PHOENIX-5795:
--

 Summary: Supporting selective queries for index rows updated 
concurrently
 Key: PHOENIX-5795
 URL: https://issues.apache.org/jira/browse/PHOENIX-5795
 Project: Phoenix
  Issue Type: Sub-task
Reporter: Kadir OZDEMIR


From the consistent indexing design (PHOENIX-5156) perspective, two or more 
pending updates from different batches on the same data row are concurrent if 
and only if, for all of these updates, the data table row state is read from 
HBase under the row lock and for none of them has the row lock been acquired 
a second time for updating the data table. In other words, all of them are in 
the first update phase concurrently. For concurrent updates, the first two 
update phases are done but the last update phase is skipped. This means the 
data table row will be updated by these updates but the corresponding index 
table rows will be left with the unverified status. Then, the read repair 
process will repair these unverified index rows during scans.

In addition to leaving index rows unverified, the concurrent updates may 
generate index rows with incorrect row keys. For example, consider that an 
application issues the very first two upserts on the same row concurrently, and 
the second upsert does not include one or more of the indexed columns. When 
these updates arrive concurrently at IndexRegionObserver, the existing row 
state would be null for both of them. This means the index updates will be 
generated solely from the pending updates. The partial upsert with missing 
indexed columns will generate an index row by assuming the missing indexed 
columns have null values, and this assumption may not be true, as the other 
concurrent upsert may have non-null values for those columns. After issuing the 
concurrent updates, if the application attempts to read the row back using a 
selective query on the index table, and that query maps to an HBase scan that 
skips these unverified rows because of their incorrect row keys, the 
application will not get the row content back correctly.
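
A hypothetical sketch of this race; the JDBC URL, table, and column names are 
made up, and the two upserts are shown sequentially for brevity but stand for 
two clients issuing them at the same time:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ConcurrentUpsertExample {
    static final String URL = "jdbc:phoenix:localhost"; // placeholder

    public static void main(String[] args) throws SQLException {
        // Client 1: full upsert that sets the indexed column.
        try (Connection c1 = DriverManager.getConnection(URL)) {
            c1.createStatement().executeUpdate(
                "UPSERT INTO T (ID, INDEXED_COL, OTHER_COL) VALUES ('row1', 'v1', 'a')");
            c1.commit();
        }
        // Client 2, racing with client 1: partial upsert that omits INDEXED_COL.
        try (Connection c2 = DriverManager.getConnection(URL)) {
            c2.createStatement().executeUpdate(
                "UPSERT INTO T (ID, OTHER_COL) VALUES ('row1', 'b')");
            c2.commit();
        }
        // If both batches read the (still null) row state under the row lock before
        // either writes, the partial upsert builds its index row key as if
        // INDEXED_COL were null, leaving an unverified index row with a wrong key.
    }
}
{code}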



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5791) Eliminate false invalid row detection due to concurrent updates

2020-03-23 Thread Kadir OZDEMIR (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR updated PHOENIX-5791:
---
Description: 
IndexTool verification generates an expected list of index mutations from the 
data table rows and uses this list to check if index table rows are consistent 
with the data table. To do that, it follows these steps:
 # The data table rows are scanned with a raw scan. This raw scan is configured 
to read all versions of rows. 
 # For each scanned row, the cells that are scanned are grouped into two sets: 
put and delete. The put set is the set of put cells and the delete set is the 
set of delete cells.
 # The put and delete sets for a given row are further grouped based on their 
timestamps into put and delete mutations such that all the cells in a mutation 
have the same timestamp. 
 # The put and delete mutations are then sorted within a single list. Mutations 
in this list are sorted in ascending order of their timestamp. 

The above process assumes that for each data table update, the index table will 
be updated with the correct index row key. However, this assumption does not 
hold in the presence of concurrent updates.

From the consistent indexing design (PHOENIX-5156) perspective, two or more 
pending updates from different batches on the same data row are concurrent if 
and only if, for all of these updates, the data table row state is read from 
HBase under the row lock and for none of them has the row lock been acquired 
a second time for updating the data table. In other words, all of them are in 
the first update phase concurrently. For concurrent updates, the first two 
update phases are done but the last update phase is skipped. This means the 
data table row will be updated by these updates but the corresponding index 
table rows will be left with the unverified status. Then, the read repair 
process will repair these unverified index rows during scans.

Since the expected index mutations are derived from the data table row after 
these concurrent mutations are applied, the expected list would not match the 
actual list of index mutations.
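
A minimal sketch of the per-row grouping in steps 2-4 above (an assumed helper 
written for illustration, not the actual IndexRebuildRegionScanner code):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;

public class MutationGrouping {
    /**
     * Groups the raw-scanned cells of one data row into per-timestamp buckets,
     * keeping put cells and delete cells apart. The TreeMap keeps the buckets
     * in ascending timestamp order, mirroring the sorted mutation list above.
     */
    static SortedMap<Long, List<Cell>> groupByTimestamp(List<Cell> rowCells,
                                                        boolean deleteCells) {
        SortedMap<Long, List<Cell>> grouped = new TreeMap<>();
        for (Cell cell : rowCells) {
            if (CellUtil.isDelete(cell) == deleteCells) {
                grouped.computeIfAbsent(cell.getTimestamp(), ts -> new ArrayList<>())
                       .add(cell);
            }
        }
        return grouped;
    }
}
{code}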

 

  was:
IndexTool verification generates an expected list of index mutations from the 
data table rows and uses this list to check if index table rows are consistent 
with the data table. To do that, it follows these steps:
 # The data table rows are scanned with a raw scan. This raw scan is configured 
to read all versions of rows. 
 # For each scanned row, the cells that are scanned are grouped into two sets: 
put and delete. The put set is the set of put cells and the delete set is the 
set of delete cells.
 # The put and delete sets for a given row are further grouped based on their 
timestamps into put and delete mutations such that all the cells in a mutation 
have the same timestamp. 
 # The put and delete mutations are then sorted within a single list. Mutations 
in this list are sorted in ascending order of their timestamp. 

The above process assumes that for each data table update, the index table will 
be updated with the correct index row key. However, this assumption does not 
hold in the presence of concurrent updates.

From the consistent indexing design (PHOENIX-5156) perspective, two or more 
pending updates from different batches on the same data row are concurrent if 
and only if, for all of these updates, the data table row state is read from 
HBase under the row lock and for none of them has the row lock been acquired 
a second time for updating the data table. In other words, all of them are in 
the first update phase concurrently. For concurrent updates, the first two 
update phases are done but the last update phase is skipped. This means the 
data table row will be updated by these updates but the corresponding index 
table rows will be left with the unverified status. Then, the read repair 
process will repair these unverified index rows during scans.

In addition to leaving index rows unverified, the concurrent updates may 
generate index rows with incorrect row keys. For example, consider that an 
application issues the very first two upserts on the same row concurrently, and 
the second upsert does not include one or more of the indexed columns. When 
these updates arrive concurrently at IndexRegionObserver, the existing row 
state would be found null for both of them. This means the index updates will 
be generated solely from the pending updates. The partial upsert with missing 
indexed columns will generate an index row by assuming the missing indexed 
columns have null values, and this assumption may not be true, as the other 
concurrent upsert may have non-null values for those columns. 

Since the expected index mutations are derived from the data table row after 
these concurrent mutations are applied, the expected list would not match the 
actual list of index mutations. 

[jira] [Assigned] (PHOENIX-5794) Create a threshold for non async index creation, that can be modified in configs

2020-03-23 Thread Richard Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Antal reassigned PHOENIX-5794:
--

Attachment: PHOENIX-5794.master.v1.patch
  Assignee: Richard Antal

> Create a threshold for non async index creation, that can be modified in 
> configs
> 
>
> Key: PHOENIX-5794
> URL: https://issues.apache.org/jira/browse/PHOENIX-5794
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Richard Antal
>Assignee: Richard Antal
>Priority: Major
> Attachments: PHOENIX-5794.master.v1.patch
>
>
> Issue:
> When a user tried to create an index on a huge Phoenix table, the region 
> servers crashed, which led to multiple regions going into the RIT state. 
>  
> Solution:
> If the expected byte read size is higher than the limit, we raise an exception 
> to notify the user that the index should be created asynchronously.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5794) Create a threshold for non async index creation, that can be modified in configs

2020-03-23 Thread Richard Antal (Jira)
Richard Antal created PHOENIX-5794:
--

 Summary: Create a threshold for non async index creation, that can 
be modified in configs
 Key: PHOENIX-5794
 URL: https://issues.apache.org/jira/browse/PHOENIX-5794
 Project: Phoenix
  Issue Type: Improvement
Reporter: Richard Antal


Issue:

When a user tried to create an index on a huge Phoenix table, the region 
servers crashed, which led to multiple regions going into the RIT state. 

 

Solution:

If the expected byte read size is higher than the limit, we raise an exception 
to notify the user that the index should be created asynchronously.
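
A rough sketch of such a guard; the property name, default, and the source of 
the byte estimate are assumptions for illustration, not the attached patch:

{code:java}
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;

public class SyncIndexSizeGuard {
    // Hypothetical property name; the actual patch may use a different key.
    static final String MAX_SYNC_INDEX_BYTES = "phoenix.index.create.sync.max.bytes";

    /**
     * Rejects a synchronous (non-ASYNC) CREATE INDEX when the estimated number of
     * bytes to read from the data table exceeds the configured threshold.
     */
    static void checkSyncIndexAllowed(Configuration conf, long estimatedBytesToRead)
            throws SQLException {
        long maxBytes = conf.getLong(MAX_SYNC_INDEX_BYTES, Long.MAX_VALUE);
        if (estimatedBytesToRead > maxBytes) {
            throw new SQLException("Estimated scan size " + estimatedBytesToRead
                + " bytes exceeds " + maxBytes
                + " bytes; create the index with CREATE INDEX ... ASYNC instead");
        }
    }
}
{code}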



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: PHOENIX-5793-v2.patch

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
> Attachments: PHOENIX-5793-v2.patch
>
>
> For a join SQL like A join B, the implementation of SortMergeJoinPlan 
> currently inits the two iterators A and B one by one.
> By initializing A and B in parallel, we can improve performance in two 
> aspects.
> 1) By overlapping the time spent in initialization.
> 2) If one child query returns a null (empty) result, the other child query 
> can be canceled, since the final result must be null.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Feng updated PHOENIX-5793:
---
Attachment: (was: PHOENIX-5793-v1.patch)

> Support parallel init and fast null return for SortMergeJoinPlan.
> -
>
> Key: PHOENIX-5793
> URL: https://issues.apache.org/jira/browse/PHOENIX-5793
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Chen Feng
>Assignee: Chen Feng
>Priority: Minor
>
> For a join SQL like A join B, the implementation of SortMergeJoinPlan 
> currently inits the two iterators A and B one by one.
> By initializing A and B in parallel, we can improve performance in two 
> aspects.
> 1) By overlapping the time spent in initialization.
> 2) If one child query returns a null (empty) result, the other child query 
> can be canceled, since the final result must be null.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-5793) Support parallel init and fast null return for SortMergeJoinPlan.

2020-03-23 Thread Chen Feng (Jira)
Chen Feng created PHOENIX-5793:
--

 Summary: Support parallel init and fast null return for 
SortMergeJoinPlan.
 Key: PHOENIX-5793
 URL: https://issues.apache.org/jira/browse/PHOENIX-5793
 Project: Phoenix
  Issue Type: Improvement
Reporter: Chen Feng
Assignee: Chen Feng


For a join SQL like A join B, the implementation of SortMergeJoinPlan currently 
inits the two iterators A and B one by one.

By initializing A and B in parallel, we can improve performance in two aspects.

1) By overlapping the time spent in initialization.

2) If one child query returns a null (empty) result, the other child query can 
be canceled, since the final result must be null.
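
A rough sketch of the idea using plain Java iterators; these are not the actual 
SortMergeJoinPlan types or the attached patch:

{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelJoinInit {
    /**
     * Initializes both child iterators in parallel. If the left side turns out to
     * be empty, the right side is cancelled, since the (inner) join result must
     * then be empty as well. For brevity this waits on the left side first.
     */
    static <T> List<Iterator<T>> initChildren(Callable<Iterator<T>> lhs,
                                              Callable<Iterator<T>> rhs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Iterator<T>> lhsFuture = pool.submit(lhs);
            Future<Iterator<T>> rhsFuture = pool.submit(rhs);
            Iterator<T> lhsIter = lhsFuture.get();
            if (!lhsIter.hasNext()) {
                // Fast null return: cancel the other side, the join must be empty.
                rhsFuture.cancel(true);
                return Collections.emptyList();
            }
            return Arrays.asList(lhsIter, rhsFuture.get());
        } finally {
            pool.shutdown();
        }
    }
}
{code}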

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)