[jira] [Updated] (CALCITE-6176) 'exists' in 'join on' has a wrong result with CoreRules.JOIN_SUB_QUERY_TO_CORRELATE

SimonAlexs (Jira) Sun, 24 Dec 2023 22:00:17 -0800


     [ 
https://issues.apache.org/jira/browse/CALCITE-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


SimonAlexs updated CALCITE-6176:
--------------------------------
    Description: 
In calcite 1.36.0, using flowing rules:

        CoreRules.JOIN_SUB_QUERY_TO_CORRELATE

        CoreRules.JOIN_CONDITION_PUSH

I tested some wrong cases(the expected result and actual result are in comment):
 * expected result: the result in mysql;
 * actual result: the result in calcite1.36.0

{code:java}
-- not exists, expected: 1, 2 
--                        actual:   1, null
select *
from (select 1 id) t1
left join (select 2 id) t2
      on not exists(select *
                    from (select 3 id) p
                    where p.id=t2.id)

-- or exists, expected: 1, null
--                       actual:   1, 2
select *
from (select 1 id) t1
left join (select 2 id) t2
      on t1.id=t2.id or exists(select *
                               from (select 3 id) p
                               where p.id=t2.id)

-- or not exists, expected: 1, 2 
--                           actual:   1, null
select *
from (select 1 id) t1
left join (select 2 id) t2
      on t1.id=t2.id or not exists(select *
                                   from (select 3 id) p
                                   where p.id=t2.id)

-- and not exists, expected: 1, null
--                           2, 2
--                   actual: 1, null
--                           2, null
select *
from (select 1 id
      union all
      select 2) t1
left join (select 2 id) t2
      on t1.id=t2.id and not exists(select *
                                from (select 3 id) p
                                where p.id=t1.id){code}
This seems a bug.

The rel of case 2 is as below. The reason I guess is that, The '$1' in 'IS NOT 
NULL($1)' of line 3 may be wrong.Because In the rel, the $1 represents the 
first column of the right table, which is usually not null depends on right 
input data. However, it should represent the column name 'i' in right table 
which means the 'exists' result.

This is only my guessing.The real reason for this is needed to find by users 
who really understand Calcite.

LogicalProject(id=[$0], id0=[$1])
  LogicalProject(id=[$0], id0=[$1])
    LogicalJoin(condition=[OR(=($0, $1), IS NOT NULL($1))], joinType=[left])
      LogicalValues(tuples=[[\\{ 1 }]])
      LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[\\{1}])
        LogicalValues(tuples=[[\\{ 2 }]])
        LogicalProject(i=[true])
          LogicalFilter(condition=[=($0, $cor0.id0)])
            LogicalValues(tuples=[[\\{ 3 }]])

 

My whole code is:
{code:java}
 
Properties properties = new Properties();
properties.put(CalciteConnectionProperty.MODEL.camelName(), "inline:");
Connection calciteConnection = DriverManager.getConnection("jdbc:calcite:", 
properties);
Statement calciteStatement = calciteConnection.createStatement();
CalcitePrepare.Context prepareContext = 
calciteStatement.unwrap(CalciteServerStatement.class).createPrepareContext();
CalciteSchema pSpaceSchema = prepareContext.getRootSchema().getSubSchema("ps", 
false);
final FrameworkConfig config = Frameworks.newConfigBuilder()
                               .parserConfig(SqlParser.config()
                               .withLex(Lex.MYSQL)
                               .withConformance(SqlConformanceEnum.MYSQL_5))
                               .sqlValidatorConfig(SqlValidator.Config.DEFAULT)
                               .defaultSchema(pSpaceSchema.plus())
                               .build();
Planner planner = Frameworks.getPlanner(config);
SqlNode parsedSql = planner.parse(sql);
SqlNode validatedSql = planner.validate(parsedSql);
RelRoot relRoot = planner.rel(validatedSql);
RelNode originRel = relRoot.rel;
HepProgram program = HepProgram.builder()
                    .addRuleCollection(Arrays.asList(
                       CoreRules.JOIN_SUB_QUERY_TO_CORRELATE
                       , CoreRules.JOIN_CONDITION_PUSH
                     ))
                     .build();
HepPlanner hepPlanner = new HepPlanner(program);
hepPlanner.setRoot(originRel);
RelNode optimizedRel = hepPlanner.findBestExp(); {code}

  was:
In calcite 1.36.0, using flowing rules:

        CoreRules.JOIN_SUB_QUERY_TO_CORRELATE

        CoreRules.JOIN_CONDITION_PUSH

I tested 3 wrong cases(the expected result and actual result are in comment):
 * expected result: the result in mysql;
 * actual result: the result in calcite1.36.0

{code:java}
-- condition: not exists, expected: 1, 2 
--                        actual:   1, null
select *
from (select 1 id) t1
left join (select 2 id) t2
      on not exists(select *
                    from (select 3 id) p
                    where p.id=t2.id)

-- condition: or exists, expected: 1, null
--                       actual:   1, 2
select *
from (select 1 id) t1
left join (select 2 id) t2
      on t1.id=t2.id or exists(select *
                               from (select 3 id) p
                               where p.id=t2.id)

-- condition: or not exists, expected: 1, 2 
--                           actual:   1, null
select *
from (select 1 id) t1
left join (select 2 id) t2
      on t1.id=t2.id or not exists(select *
                                   from (select 3 id) p
                                   where p.id=t2.id){code}
This seems a bug.

The rel of case 2 is as below. The reason I guess is that, The '$1' in 'IS NOT 
NULL($1)' of line 3 may be wrong.Because In the rel, the $1 represents the 
first column of the right table, which is usually not null depends on right 
input data. However, it should represent the column name 'i' in right table 
which means the 'exists' result.

This is only my guessing.The real reason for this is needed to find by users 
who really understand Calcite.

LogicalProject(id=[$0], id0=[$1])
  LogicalProject(id=[$0], id0=[$1])
    LogicalJoin(condition=[OR(=($0, $1), IS NOT NULL($1))], joinType=[left])
      LogicalValues(tuples=[[\{ 1 }]])
      LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[\{1}])
        LogicalValues(tuples=[[\{ 2 }]])
        LogicalProject(i=[true])
          LogicalFilter(condition=[=($0, $cor0.id0)])
            LogicalValues(tuples=[[\{ 3 }]])

 

My whole code is:
{code:java}
 
Properties properties = new Properties();
properties.put(CalciteConnectionProperty.MODEL.camelName(), "inline:");
Connection calciteConnection = DriverManager.getConnection("jdbc:calcite:", 
properties);
Statement calciteStatement = calciteConnection.createStatement();
CalcitePrepare.Context prepareContext = 
calciteStatement.unwrap(CalciteServerStatement.class).createPrepareContext();
CalciteSchema pSpaceSchema = prepareContext.getRootSchema().getSubSchema("ps", 
false);
final FrameworkConfig config = Frameworks.newConfigBuilder()
                               .parserConfig(SqlParser.config()
                               .withLex(Lex.MYSQL)
                               .withConformance(SqlConformanceEnum.MYSQL_5))
                               .sqlValidatorConfig(SqlValidator.Config.DEFAULT)
                               .defaultSchema(pSpaceSchema.plus())
                               .build();
Planner planner = Frameworks.getPlanner(config);
SqlNode parsedSql = planner.parse(sql);
SqlNode validatedSql = planner.validate(parsedSql);
RelRoot relRoot = planner.rel(validatedSql);
RelNode originRel = relRoot.rel;
HepProgram program = HepProgram.builder()
                    .addRuleCollection(Arrays.asList(
                       CoreRules.JOIN_SUB_QUERY_TO_CORRELATE
                       , CoreRules.JOIN_CONDITION_PUSH
                     ))
                     .build();
HepPlanner hepPlanner = new HepPlanner(program);
hepPlanner.setRoot(originRel);
RelNode optimizedRel = hepPlanner.findBestExp(); {code}


> 'exists' in 'join on' has a wrong result with 
> CoreRules.JOIN_SUB_QUERY_TO_CORRELATE
> -----------------------------------------------------------------------------------
>
>                 Key: CALCITE-6176
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6176
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.36.0
>            Reporter: SimonAlexs
>            Priority: Critical
>
> In calcite 1.36.0, using flowing rules:
>         CoreRules.JOIN_SUB_QUERY_TO_CORRELATE
>         CoreRules.JOIN_CONDITION_PUSH
> I tested some wrong cases(the expected result and actual result are in 
> comment):
>  * expected result: the result in mysql;
>  * actual result: the result in calcite1.36.0
> {code:java}
> -- not exists, expected: 1, 2 
> --                        actual:   1, null
> select *
> from (select 1 id) t1
> left join (select 2 id) t2
>       on not exists(select *
>                     from (select 3 id) p
>                     where p.id=t2.id)
> -- or exists, expected: 1, null
> --                       actual:   1, 2
> select *
> from (select 1 id) t1
> left join (select 2 id) t2
>       on t1.id=t2.id or exists(select *
>                                from (select 3 id) p
>                                where p.id=t2.id)
> -- or not exists, expected: 1, 2 
> --                           actual:   1, null
> select *
> from (select 1 id) t1
> left join (select 2 id) t2
>       on t1.id=t2.id or not exists(select *
>                                    from (select 3 id) p
>                                    where p.id=t2.id)
> -- and not exists, expected: 1, null
> --                           2, 2
> --                   actual: 1, null
> --                           2, null
> select *
> from (select 1 id
>       union all
>       select 2) t1
> left join (select 2 id) t2
>       on t1.id=t2.id and not exists(select *
>                                 from (select 3 id) p
>                                 where p.id=t1.id){code}
> This seems a bug.
> The rel of case 2 is as below. The reason I guess is that, The '$1' in 'IS 
> NOT NULL($1)' of line 3 may be wrong.Because In the rel, the $1 represents 
> the first column of the right table, which is usually not null depends on 
> right input data. However, it should represent the column name 'i' in right 
> table which means the 'exists' result.
> This is only my guessing.The real reason for this is needed to find by users 
> who really understand Calcite.
> LogicalProject(id=[$0], id0=[$1])
>   LogicalProject(id=[$0], id0=[$1])
>     LogicalJoin(condition=[OR(=($0, $1), IS NOT NULL($1))], joinType=[left])
>       LogicalValues(tuples=[[\\{ 1 }]])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[\\{1}])
>         LogicalValues(tuples=[[\\{ 2 }]])
>         LogicalProject(i=[true])
>           LogicalFilter(condition=[=($0, $cor0.id0)])
>             LogicalValues(tuples=[[\\{ 3 }]])
>  
> My whole code is:
> {code:java}
>  
> Properties properties = new Properties();
> properties.put(CalciteConnectionProperty.MODEL.camelName(), "inline:");
> Connection calciteConnection = DriverManager.getConnection("jdbc:calcite:", 
> properties);
> Statement calciteStatement = calciteConnection.createStatement();
> CalcitePrepare.Context prepareContext = 
> calciteStatement.unwrap(CalciteServerStatement.class).createPrepareContext();
> CalciteSchema pSpaceSchema = 
> prepareContext.getRootSchema().getSubSchema("ps", false);
> final FrameworkConfig config = Frameworks.newConfigBuilder()
>                                .parserConfig(SqlParser.config()
>                                .withLex(Lex.MYSQL)
>                                .withConformance(SqlConformanceEnum.MYSQL_5))
>                                
> .sqlValidatorConfig(SqlValidator.Config.DEFAULT)
>                                .defaultSchema(pSpaceSchema.plus())
>                                .build();
> Planner planner = Frameworks.getPlanner(config);
> SqlNode parsedSql = planner.parse(sql);
> SqlNode validatedSql = planner.validate(parsedSql);
> RelRoot relRoot = planner.rel(validatedSql);
> RelNode originRel = relRoot.rel;
> HepProgram program = HepProgram.builder()
>                     .addRuleCollection(Arrays.asList(
>                        CoreRules.JOIN_SUB_QUERY_TO_CORRELATE
>                        , CoreRules.JOIN_CONDITION_PUSH
>                      ))
>                      .build();
> HepPlanner hepPlanner = new HepPlanner(program);
> hepPlanner.setRoot(originRel);
> RelNode optimizedRel = hepPlanner.findBestExp(); {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (CALCITE-6176) 'exists' in 'join on' has a wrong result with CoreRules.JOIN_SUB_QUERY_TO_CORRELATE

Reply via email to