[ https://issues.apache.org/jira/browse/PHOENIX-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Istvan Toth updated PHOENIX-6797:
---------------------------------
    Description:
For a salted table with a composite PK, queries using a PK prefix are effectively turned into full scans.
We should scan only the salt key + PK prefix range for each salt key.

For example, given the salted table
{noformat}
CREATE TABLE T (ID1 VARCHAR(64) not null, ID2 VARCHAR(15) not null, ID3 VARCHAR(24), V1 DATE,
CONSTRAINT pk PRIMARY KEY (ID1, ID2, ID3)) SALT_BUCKETS = 31;{noformat}
and a select based on ID1:
{noformat}
select * from T where id1='whatever';{noformat}
Phoenix will do a range scan over the following, which is effectively a full scan.
{noformat}
[0,'whatever'] - [30,'whatever']{noformat}
However, we only really need to scan the far smaller
{noformat}
[0,'whatever'] - [0,'whateves']
,[1,'whatever'] - [1,'whateves']
..
,[30,'whatever'] - [30,'whateves']{noformat}
ranges.

  was:
For a salted table with a composite PK, queries using a PK prefix are turned into basically full scans.
We should scan only the salt key + PK prefix range for each key.

i.e we have the salted table
{noformat}
CREATE TABLE T (ID1 VARCHAR(64) not null, ID2 VARCHAR(15) not null, ID3 VARCHAR(24), V1 DATE,
CONSTRAINT pk PRIMARY KEY (ID1, ID2, ID3)) SALT_BUCKETS = 31;{noformat}
and we do a select based on ID1:
{noformat}
select * from T where id1='whatever';{noformat}
Phoenix will do a range scan over the following, which is basically a full scan.

[0,'whatever'] - [30,'whatever']

However, we only really need to scan the far smaller
{noformat}
[0,'whatever'] - [0,'whateves']
,[1,'whatever'] - [1,'whateves']
..
,[30,'whatever'] - [30,'whateves']{noformat}
ranges.
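The per-bucket ranges proposed in the description can be sketched as follows. This is only an illustration of the idea, not Phoenix code: the function names `next_key` and `salted_prefix_ranges` are hypothetical, the salt is assumed to be a single leading byte, and Phoenix's actual row key encoding (for example, separator bytes after variable-length columns) is ignored.

```python
def next_key(prefix: bytes) -> bytes:
    """Return the smallest byte string greater than every key starting with prefix."""
    key = bytearray(prefix)
    while key:
        if key[-1] != 0xFF:
            key[-1] += 1          # e.g. b"whatever" -> b"whateves"
            return bytes(key)
        key.pop()                 # a trailing 0xFF cannot be incremented, so carry
    raise ValueError("prefix is empty or all 0xFF: no finite upper bound")


def salted_prefix_ranges(prefix: bytes, salt_buckets: int):
    """Build one [start, stop) range per salt bucket for a PK-prefix lookup.

    Assumption: the salt occupies exactly one leading byte of the row key.
    """
    stop = next_key(prefix)
    ranges = []
    for bucket in range(salt_buckets):
        salt = bytes([bucket])
        ranges.append((salt + prefix, salt + stop))
    return ranges


if __name__ == "__main__":
    for start, stop in salted_prefix_ranges(b"whatever", 31)[:2]:
        print(start, "-", stop)
```

With `SALT_BUCKETS = 31` this yields 31 narrow ranges, one per salt byte, instead of the single wide range from `[0,'whatever']` to `[30,'whatever']`.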
> Optimize rowkey prefix selects for salted tables
> ------------------------------------------------
>
>                 Key: PHOENIX-6797
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6797
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>            Reporter: Istvan Toth
>            Priority: Major
>
> For a salted table with a composite PK, queries using a PK prefix are
> effectively turned into full scans.
> We should scan only the salt key + PK prefix range for each salt key.
> For example, given the salted table
> {noformat}
> CREATE TABLE T (ID1 VARCHAR(64) not null, ID2 VARCHAR(15) not null, ID3
> VARCHAR(24), V1 DATE,
> CONSTRAINT pk PRIMARY KEY (ID1, ID2, ID3)) SALT_BUCKETS = 31;{noformat}
> and a select based on ID1:
> {noformat}
> select * from T where id1='whatever';{noformat}
> Phoenix will do a range scan over the following, which is effectively a full
> scan.
> {noformat}
> [0,'whatever'] - [30,'whatever']{noformat}
> However, we only really need to scan the far smaller
> {noformat}
> [0,'whatever'] - [0,'whateves']
> ,[1,'whatever'] - [1,'whateves']
> ..
> ,[30,'whatever'] - [30,'whateves']{noformat}
> ranges.