[ https://issues.apache.org/jira/browse/PHOENIX-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mehdi Salarkia updated PHOENIX-5950: ------------------------------------ Description: For a table with a composite primary {code:java} CREATE TABLE MY_TABLE (K1 INTEGER NOT NULL, K2 VARCHAR NOT NULL,K3 INTEGER NOT NULL, V1 DECIMAL, CONSTRAINT pk PRIMARY KEY (K1, K2, K3)) {code} when a view is created that has some (and not all) of primary key columns {code:java} CREATE VIEW MY_VIEW(v2 VARCHAR, V3 VARCHAR ) AS SELECT * FROM MY_TABLE WHERE K2 = 'A' {code} if you run a query on the view without providing all the primary columns {code:java} EXPLAIN SELECT K1, K2, K3, V1 FROM MY_VIEW WHERE (K1,K3) IN ((1,2),(3,4)); +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN SKIP SCAN ON 2 KEYS OVER MY_TABLE [1,'A'] - [3,'A'] | null | null | null | | SERVER FILTER BY (K1, K3) IN ([128,0,0,1,128,0,0,2],[128,0,0,3,128,0,0,4]) | null | null | null | +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ 2 rows selected (0.047 seconds) {code} the query generated is scan rather than a point look up same query on the parent table (with all the keys) looks like this {code:java} EXPLAIN SELECT K1, K2, K3, V1 FROM MY_TABLE WHERE (K1,K2,K3) IN ((1,'A',2),(3,'A',4)); +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | CLIENT 1-CHUNK 2 ROWS 268 BYTES PARALLEL 1-WAY ROUND ROBIN POINT LOOKUP ON 2 KEYS OVER MY_TABLE | 268 | 2 | 0 | +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ 1 row selected (0.025 seconds) {code} The issue is view condition is always added as `AND` to user provided where clause and in this case query optimizer is failing to optimize this query to a point look up. ---------------------------------------------------------------*[POSSIBLE WORKAROUND]*--------------------------------------------------------------------------- One possible workaround is to provide all the pk (including the view pk columns) {code:java} EXPLAIN SELECT K1, K2, K3, V1, V2, V3 FROM MY_VIEW WHERE (K1,K2,K3) IN ((1,'A',2),(3,'A',4)); +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | CLIENT 1-CHUNK 2 ROWS 632 BYTES PARALLEL 1-WAY ROUND ROBIN POINT LOOKUP ON 2 KEYS OVER MY_TABLE | 632 | 2 | 0 | +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ {code} but as you can see the projected _EST_BYTES_READ_ goes up because the underlying query that gets executed is something like: {code:java} SELECT K1, K2, K3, V1, V2, V3 FROM MY_VIEW WHERE (K1,K2,K3) IN ((1,'A',2),(3,'A',4)) AND K2 = 'A'; {code} and certainly the `_AND K2 = 'A'_` is redundant. ----------------------------------------------------------------*[PROPOSED SOLUTION]*-------------------------------------------------------------------------------- we can make the view condition to be injected into any partial primary key lookup (tuple style conditions) respecting the same order for columns defined in the parent table was: For a table with a composite primary {code:java} CREATE TABLE MY_TABLE (K1 INTEGER NOT NULL, K2 VARCHAR NOT NULL,K3 INTEGER NOT NULL, V1 DECIMAL, CONSTRAINT pk PRIMARY KEY (K1, K2, K3)) {code} when a view is created that has some (and not all) of primary key columns {code:java} CREATE VIEW MY_VIEW(v2 VARCHAR, V3 VARCHAR ) AS SELECT * FROM MY_TABLE WHERE K2 = 'A' {code} if you run a query on the view without providing all the primary columns {code:java} EXPLAIN SELECT K1, K2, K3, V1 FROM MY_VIEW WHERE (K1,K3) IN ((1,2),(3,4)); +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN SKIP SCAN ON 2 KEYS OVER MY_TABLE [1,'A'] - [3,'A'] | null | null | null | | SERVER FILTER BY (K1, K3) IN ([128,0,0,1,128,0,0,2],[128,0,0,3,128,0,0,4]) | null | null | null | +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ 2 rows selected (0.047 seconds) {code} the query generated is scan rather than a point look up same query on the parent table looks like this {code:java} EXPLAIN SELECT K1, K2, K3, V1 FROM MY_TABLE WHERE (K1,K2,K3) IN ((1,'A',2),(3,'A',4)); +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ | CLIENT 1-CHUNK 2 ROWS 268 BYTES PARALLEL 1-WAY ROUND ROBIN POINT LOOKUP ON 2 KEYS OVER MY_TABLE | 268 | 2 | 0 | +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ 1 row selected (0.025 seconds) {code} The issue is view condition is always added as `AND` to user provided where clause and in this case query optimizer is failing to optimize this query to a point look up. > View With Where Clause On A Table With Composite Key Should Be Able To > Optimize Queries > ---------------------------------------------------------------------------------------- > > Key: PHOENIX-5950 > URL: https://issues.apache.org/jira/browse/PHOENIX-5950 > Project: Phoenix > Issue Type: Bug > Reporter: Mehdi Salarkia > Priority: Major > > For a table with a composite primary > {code:java} > CREATE TABLE MY_TABLE (K1 INTEGER NOT NULL, K2 VARCHAR NOT NULL,K3 INTEGER > NOT NULL, V1 DECIMAL, CONSTRAINT pk PRIMARY KEY (K1, K2, K3)) > {code} > when a view is created that has some (and not all) of primary key columns > {code:java} > CREATE VIEW MY_VIEW(v2 VARCHAR, V3 VARCHAR ) AS SELECT * FROM MY_TABLE WHERE > K2 = 'A' > {code} > if you run a query on the view without providing all the primary columns > {code:java} > EXPLAIN SELECT K1, K2, K3, V1 FROM MY_VIEW WHERE (K1,K3) IN ((1,2),(3,4)); > +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > | PLAN > | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN SKIP SCAN ON 2 KEYS OVER MY_TABLE > [1,'A'] - [3,'A'] | null | null | null | > | SERVER FILTER BY (K1, K3) IN > ([128,0,0,1,128,0,0,2],[128,0,0,3,128,0,0,4]) | null > | null | null | > +------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > 2 rows selected (0.047 seconds) > {code} > the query generated is scan rather than a point look up > same query on the parent table (with all the keys) looks like this > {code:java} > EXPLAIN SELECT K1, K2, K3, V1 FROM MY_TABLE WHERE (K1,K2,K3) IN > ((1,'A',2),(3,'A',4)); > +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > | PLAN > | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > | CLIENT 1-CHUNK 2 ROWS 268 BYTES PARALLEL 1-WAY ROUND ROBIN POINT LOOKUP ON > 2 KEYS OVER MY_TABLE | 268 | 2 | 0 | > +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > 1 row selected (0.025 seconds) > {code} > The issue is view condition is always added as `AND` to user provided where > clause and in this case query optimizer is failing to optimize this query to > a point look up. > > > ---------------------------------------------------------------*[POSSIBLE > WORKAROUND]*--------------------------------------------------------------------------- > One possible workaround is to provide all the pk (including the view pk > columns) > {code:java} > EXPLAIN SELECT K1, K2, K3, V1, V2, V3 FROM MY_VIEW WHERE (K1,K2,K3) IN > ((1,'A',2),(3,'A',4)); > +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > | PLAN > | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > | CLIENT 1-CHUNK 2 ROWS 632 BYTES PARALLEL 1-WAY ROUND ROBIN POINT LOOKUP ON > 2 KEYS OVER MY_TABLE | 632 | 2 | 0 | > +--------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+ > {code} > but as you can see the projected _EST_BYTES_READ_ goes up because the > underlying query that gets executed is something like: > {code:java} > SELECT K1, K2, K3, V1, V2, V3 FROM MY_VIEW WHERE (K1,K2,K3) IN > ((1,'A',2),(3,'A',4)) AND K2 = 'A'; > {code} > and certainly the `_AND K2 = 'A'_` is redundant. > ----------------------------------------------------------------*[PROPOSED > SOLUTION]*-------------------------------------------------------------------------------- > we can make the view condition to be injected into any partial primary key > lookup (tuple style conditions) respecting the same order for columns defined > in the parent table -- This message was sent by Atlassian Jira (v8.3.4#803005)