[jira] [Updated] (PHOENIX-6980) Push TopN down into joins where applicable

Istvan Toth (Jira) Fri, 16 Jun 2023 00:36:04 -0700


     [ 
https://issues.apache.org/jira/browse/PHOENIX-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Istvan Toth updated PHOENIX-6980:
---------------------------------
    Description: 
This came up with an non-merging hinted uncovered index query, which may not be 
directly applicable in the latest versions{-}, but the same problem is 
reproducible with a simple IN subquery, which uses the same{-} plan:
On second thought, this only applies to indexed queries, otherwise there is no 
guarantee that there is a 1:1 match between the two table rows.
{noformat}
CREATE TABLE A(A_ID INTEGER NOT NULL PRIMARY KEY, A_V1 INTEGER);
CREATE TABLE B(B_ID INTEGER NOT NULL PRIMARY KEY, B_V1 INTEGER);
explain select * from A where (A_ID, A_V1) in (select B_ID, B_V1 from b ) order 
by a_v1 limit 10;
+-----------------------------------------------------------------+----------------+---------------+-------------+
|                              PLAN                               | 
EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+-----------------------------------------------------------------+----------------+---------------+-------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER A                  | null        
   | null          | null        |
|     SERVER TOP 10 ROWS SORTED BY [A.A_V1]                       | null        
   | null          | null        |
| CLIENT MERGE SORT                                               | null        
   | null          | null        |
| CLIENT LIMIT 10                                                 | null        
   | null          | null        |
|     SKIP-SCAN-JOIN TABLE 0                                      | null        
   | null          | null        |
|         CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER B          | null        
   | null          | null        |
|             SERVER AGGREGATE INTO DISTINCT ROWS BY [B_ID, B_V1] | null        
   | null          | null        |
|         CLIENT MERGE SORT                                       | null        
   | null          | null        |
|     DYNAMIC SERVER FILTER BY A.A_ID IN ($48.$50)                | null        
   | null          | null        |
+-----------------------------------------------------------------+----------------+---------------+-------------+
{noformat}
 

 

  was:
This came up with an non-merging hinted uncovered index query, which may not be 
directly applicable in the latest versions, but the same problem is 
reproducible with a simple IN subquery, which uses the same plan:
{noformat}
CREATE TABLE A(A_ID INTEGER NOT NULL PRIMARY KEY, A_V1 INTEGER);
CREATE TABLE B(B_ID INTEGER NOT NULL PRIMARY KEY, B_V1 INTEGER);
explain select * from A where (A_ID, A_V1) in (select B_ID, B_V1 from b ) order 
by a_v1 limit 10;
+-----------------------------------------------------------------+----------------+---------------+-------------+
|                              PLAN                               | 
EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+-----------------------------------------------------------------+----------------+---------------+-------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER A                  | null        
   | null          | null        |
|     SERVER TOP 10 ROWS SORTED BY [A.A_V1]                       | null        
   | null          | null        |
| CLIENT MERGE SORT                                               | null        
   | null          | null        |
| CLIENT LIMIT 10                                                 | null        
   | null          | null        |
|     SKIP-SCAN-JOIN TABLE 0                                      | null        
   | null          | null        |
|         CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER B          | null        
   | null          | null        |
|             SERVER AGGREGATE INTO DISTINCT ROWS BY [B_ID, B_V1] | null        
   | null          | null        |
|         CLIENT MERGE SORT                                       | null        
   | null          | null        |
|     DYNAMIC SERVER FILTER BY A.A_ID IN ($48.$50)                | null        
   | null          | null        |
+-----------------------------------------------------------------+----------------+---------------+-------------+
{noformat}
This is super inefficient, the topN part should be pushed down into the inner 
query.

 


> Push TopN down into joins where applicable
> ------------------------------------------
>
>                 Key: PHOENIX-6980
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6980
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Istvan Toth
>            Assignee: Istvan Toth
>            Priority: Major
>
> This came up with an non-merging hinted uncovered index query, which may not 
> be directly applicable in the latest versions{-}, but the same problem is 
> reproducible with a simple IN subquery, which uses the same{-} plan:
> On second thought, this only applies to indexed queries, otherwise there is 
> no guarantee that there is a 1:1 match between the two table rows.
> {noformat}
> CREATE TABLE A(A_ID INTEGER NOT NULL PRIMARY KEY, A_V1 INTEGER);
> CREATE TABLE B(B_ID INTEGER NOT NULL PRIMARY KEY, B_V1 INTEGER);
> explain select * from A where (A_ID, A_V1) in (select B_ID, B_V1 from b ) 
> order by a_v1 limit 10;
> +-----------------------------------------------------------------+----------------+---------------+-------------+
> |                              PLAN                               | 
> EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> +-----------------------------------------------------------------+----------------+---------------+-------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER A                  | null      
>      | null          | null        |
> |     SERVER TOP 10 ROWS SORTED BY [A.A_V1]                       | null      
>      | null          | null        |
> | CLIENT MERGE SORT                                               | null      
>      | null          | null        |
> | CLIENT LIMIT 10                                                 | null      
>      | null          | null        |
> |     SKIP-SCAN-JOIN TABLE 0                                      | null      
>      | null          | null        |
> |         CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER B          | null      
>      | null          | null        |
> |             SERVER AGGREGATE INTO DISTINCT ROWS BY [B_ID, B_V1] | null      
>      | null          | null        |
> |         CLIENT MERGE SORT                                       | null      
>      | null          | null        |
> |     DYNAMIC SERVER FILTER BY A.A_ID IN ($48.$50)                | null      
>      | null          | null        |
> +-----------------------------------------------------------------+----------------+---------------+-------------+
> {noformat}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (PHOENIX-6980) Push TopN down into joins where applicable

Reply via email to