[ 
https://issues.apache.org/jira/browse/PHOENIX-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanuj Khurana updated PHOENIX-7694:
-----------------------------------
    Description: 
Using a PK column with DESC sort order causes a combinatorial explosion of keys 
generated when using skip scan. 

 
{code:java}
CREATE TABLE IF NOT EXISTS FOO (
    ID CHAR(15) NOT NULL,
    CREATED_DATE DATE NOT NULL,
    ENTITY_ID CHAR(15) NOT NULL,
    CREATED_BY VARCHAR,
    DATA VARCHAR
    CONSTRAINT PK PRIMARY KEY (ID, CREATED_DATE DESC, ENTITY_ID));

EXPLAIN SELECT * FROM FOO
WHERE (ID, CREATED_DATE, ENTITY_ID) IN ((?, ?, ?), (?, ?, ?));

CLIENT PARALLEL 1-WAY POINT LOOKUP ON 8 KEYS OVER FOO
    SERVER FILTER BY (ID, CREATED_DATE, ENTITY_ID) IN
(X'69645f3020202020202020202020207ffffffffffffc17656e746974795f3020202020202020',X'69645f3120202020202020202020207ffffffffffffc16656e746974795f3120202020202020'){code}
We are looking up 2 keys, but we generate a skip scan with 8 keys (2*2*2 for 3 
PK columns). This is not a correctness issue, because we also pass a 
RowKeyExpression filter with the actual row keys; it is a performance issue. 
Generating a skip scan with a large number of keys can consume a lot of memory, 
and we have seen OOM issues in the past. 
[PHOENIX-6751|https://issues.apache.org/jira/browse/PHOENIX-6751] introduced a 
check that forces a range scan if the number of generated keys exceeds 50k. The 
purpose of this Jira is to see if we can eliminate the key explosion with DESC 
PK columns.
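
The 2*2*2 arithmetic above can be illustrated with a toy model. This is only a sketch of the counting, assuming the skip scan combines the distinct values of each PK column independently. It is not Phoenix code, and the function names and date values are hypothetical placeholders.

```python
from itertools import product


def per_column_slots(tuples):
    """Collect the distinct values seen at each PK column position."""
    return [sorted(set(column)) for column in zip(*tuples)]


def cross_product_keys(tuples):
    """Keys produced when each column's values are combined independently."""
    return list(product(*per_column_slots(tuples)))


# Two requested row keys (ID, CREATED_DATE, ENTITY_ID); dates are placeholders.
requested = [
    ("id_0", "2025-01-01", "entity_0"),
    ("id_1", "2025-01-02", "entity_1"),
]

expanded = cross_product_keys(requested)
print(len(requested), "requested keys ->", len(expanded), "candidate keys")
```

In this model the candidate set grows as the product of the per-column value counts, so an IN list of n fully distinct tuples over 3 PK columns yields n*n*n candidates, while only n of them are actually wanted.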

 

 

  was:
Using a PK column with DESC sort order causes a combinatorial explosion of keys 
generated when using skip scan. 

 
{code:java}
CREATE TABLE IF NOT EXISTS FOO (
    ID CHAR(15) NOT NULL,
    CREATED_DATE DATE NOT NULL,
    ENTITY_ID CHAR(15) NOT NULL,
    CREATED_BY VARCHAR,
    DATA VARCHAR
    CONSTRAINT PK PRIMARY KEY (ID, CREATED_DATE DESC, ENTITY_ID));

EXPLAIN SELECT * FROM FOO
WHERE (ID, CREATED_DATE, ENTITY_ID) IN ((?, ?, ?), (?, ?, ?));

CLIENT PARALLEL 1-WAY POINT LOOKUP ON 8 KEYS OVER FOO
    SERVER FILTER BY (ID, CREATED_DATE, ENTITY_ID) IN
(X'69645f3020202020202020202020207ffffffffffffc17656e746974795f3020202020202020',X'69645f3120202020202020202020207ffffffffffffc16656e746974795f3120202020202020'){code}
We are looking up 2 keys, but we generate a skip scan with 8 keys (2*2*2 for 3 
PK columns). This is not a correctness issue, because we also pass a 
RowKeyExpression filter with the actual row keys; it is a performance issue. 
Generating a skip scan with a large number of keys can consume a lot of memory, 
and we have seen OOM issues in the past. 
[PHOENIX-6751|https://issues.apache.org/jira/browse/PHOENIX-6751]
 

 

 

 


> Key explosion when a descending sort order column is part of a composite PK
> ---------------------------------------------------------------------------
>
>                 Key: PHOENIX-7694
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7694
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.2.0, 5.2.1, 5.3.0
>            Reporter: Tanuj Khurana
>            Priority: Major
>
> Using a PK column with DESC sort order causes a combinatorial explosion of 
> keys generated when using skip scan. 
>  
> {code:java}
> CREATE TABLE IF NOT EXISTS FOO (
>     ID CHAR(15) NOT NULL,
>     CREATED_DATE DATE NOT NULL,
>     ENTITY_ID CHAR(15) NOT NULL,
>     CREATED_BY VARCHAR,
>     DATA VARCHAR
>     CONSTRAINT PK PRIMARY KEY (ID, CREATED_DATE DESC, ENTITY_ID));
>
> EXPLAIN SELECT * FROM FOO
> WHERE (ID, CREATED_DATE, ENTITY_ID) IN ((?, ?, ?), (?, ?, ?));
>
> CLIENT PARALLEL 1-WAY POINT LOOKUP ON 8 KEYS OVER FOO
>     SERVER FILTER BY (ID, CREATED_DATE, ENTITY_ID) IN
> (X'69645f3020202020202020202020207ffffffffffffc17656e746974795f3020202020202020',X'69645f3120202020202020202020207ffffffffffffc16656e746974795f3120202020202020'){code}
> We are looking up 2 keys, but we generate a skip scan with 8 keys (2*2*2 for 
> 3 PK columns). This is not a correctness issue, because we also pass a 
> RowKeyExpression filter with the actual row keys; it is a performance issue. 
> Generating a skip scan with a large number of keys can consume a lot of 
> memory, and we have seen OOM issues in the past. 
> [PHOENIX-6751|https://issues.apache.org/jira/browse/PHOENIX-6751] introduced 
> a check that forces a range scan if the number of generated keys exceeds 50k. 
> The purpose of this Jira is to see if we can eliminate the key explosion with 
> DESC PK columns.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
