[ 
https://issues.apache.org/jira/browse/KYLIN-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928348#comment-17928348
 ] 

Guoliang Sun commented on KYLIN-6044:
-------------------------------------

h3. RootCause

Based on the SQL in the use case, let's compare the RelRoot with the optimized 
logical query plan after the CBO phase:  
h4. RelRoot
{code:java}
LogicalSort(fetch=[500])
  LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
    LogicalJoin(condition=[=($1, $3)], joinType=[left])
      LogicalProject(ACCT_BAL_CURR=[$0], DT=[1])
        LogicalAggregate(group=[{}], ACCT_BAL_CURR=[SUM($0)])
          LogicalProject(D_YEAR=[$4])
            OlapTableScan(table=[[SSB, DATES]], ctx=[], fields=[[0, 1, 2, 3, 4, 
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
      LogicalProject(ACCT_ID_CURR=[$0], DT=[1])
        LogicalAggregate(group=[{}], ACCT_ID_CURR=[SUM($0)])
          LogicalProject(D_YEARMONTHNUM=[$5])
            OlapTableScan(table=[[SSB, DATES]], ctx=[], fields=[[0, 1, 2, 3, 4, 
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]]) {code}
h4. CBO
{code:java}
OlapToEnumerableConverter
  OlapLimitRel(ctx=[], fetch=[500])
    OlapAggregateRel(group-set=[[]], groups=[null], EXPR$0=[COUNT()], ctx=[])
      OlapJoinRel(condition=[=($0, $1)], joinType=[left], ctx=[])
        OlapProjectRel($f0=[1], ctx=[])
          OlapAggregateRel(group-set=[[]], groups=[null], ctx=[])
            OlapTableScan(table=[[SSB, DATES]], ctx=[], fields=[[0, 1, 2, 3, 4, 
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
        OlapProjectRel($f0=[1], ctx=[])
          OlapAggregateRel(group-set=[[]], groups=[null], ctx=[])
            OlapTableScan(table=[[SSB, DATES]], ctx=[], fields=[[0, 1, 2, 3, 4, 
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]]) {code}
It is clear that `LogicalAggregate(SUM)` + `LogicalProject` -> 
`OlapAggregateRel`, while `OlapAggregateRel` loses the `agg` operator `sum`.  

After debugging the relevant optimization rules in the CBO phase, the issue was 
traced to the optimization rule **ProjectAggregateMergeRule**. The purpose of 
this rule is to remove aggregation operators that are not needed in the final 
query. For more details, see the Calcite community issue: [Add a rule to merge 
a Project onto an 
Aggregate|https://issues.apache.org/jira/browse/CALCITE-4154]. After 
optimization, unnecessary collection and computation processes are reduced.  

Looking back at the SQL, the final query is for `count`, which indeed does not 
require the metrics `sum(D_YEAR)` and `sum(D_YEARMONTHNUM)`. Therefore, it is 
reasonable for these to be optimized away by this rule. However, because 
`OlapAggregateRel` lacks the aggregation operator, Kylin is unable to correctly 
determine and collect the comparison information required for aggregate 
indexes. This becomes a negative optimization for Kylin, forcing it to rely 
solely on detailed indexes to answer the query.

> Subqueries containing JOIN may lead to incorrect query results
> --------------------------------------------------------------
>
>                 Key: KYLIN-6044
>                 URL: https://issues.apache.org/jira/browse/KYLIN-6044
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>            Reporter: Guoliang Sun
>            Priority: Major
>
> It was discovered that a specific SQL query hits inconsistent indexes in a 
> particular version compared to earlier versions. Further simplification of 
> the SQL revealed that this issue might lead to incorrect query results.  
>  
> {code:java}
> SELECT count(1)
> FROM (
>     SELECT 1
>     FROM (
>         SELECT sum(D_YEAR) as acct_bal_CURR
>             ,1 AS DT
>         FROM SSB.DATES
>         ) a
>     left JOIN (
>         SELECT sum(D_YEARMONTHNUM) acct_id_CURR
>             ,1 AS DT
>         FROM SSB.DATES 
>         ) b ON a.DT = b.DT
>     )
> LIMIT 500{code}
> The model contains two SUM metrics from the SQL, and the correct query result 
> should be 1.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to