[jira] [Created] (CALCITE-1887) Detect transitive join conditions via expressions

Claus Stadler (JIRA) Wed, 12 Jul 2017 03:16:16 -0700

Claus Stadler created CALCITE-1887:
--------------------------------------

             Summary: Detect transitive join conditions via expressions
                 Key: CALCITE-1887
                 URL: https://issues.apache.org/jira/browse/CALCITE-1887
             Project: Calcite
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.13.0
            Reporter: Claus Stadler
            Assignee: Julian Hyde



Given table aliases ta and tb, column names ca and cb, and an arbitrary 
(deterministic) expression expr, Calcite should be able to infer join 
conditions by transitivity:

{noformat}
ta.ca = expr AND tb.cb = expr -> ta.ca = tb.cb
{noformat}
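
The rule above can be sketched in plain Java. This is only an illustration of the inference, not Calcite code: the names TransitiveEquality, Eq and infer are hypothetical, and expressions are compared by their trimmed text, which is a simplifying assumption (a real planner would compare expression trees and check determinism).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TransitiveEquality {
    /** One conjunct of the form "column = expr" (hypothetical helper type). */
    public static final class Eq {
        final String column;
        final String expr;

        public Eq(String column, String expr) {
            this.column = column;
            this.expr = expr;
        }
    }

    /** Derives "c1 = c2" for columns that are equated to the same expression. */
    public static List<String> infer(List<Eq> conjuncts) {
        // Group columns by the expression they are equated to, keeping input order.
        Map<String, List<String>> byExpr = new LinkedHashMap<>();
        for (Eq eq : conjuncts) {
            byExpr.computeIfAbsent(eq.expr.trim(), k -> new ArrayList<>()).add(eq.column);
        }
        List<String> derived = new ArrayList<>();
        for (List<String> cols : byExpr.values()) {
            // Equating each later column to the first one is enough;
            // all remaining pairs follow by transitivity.
            for (int i = 1; i < cols.size(); i++) {
                if (!cols.get(0).equals(cols.get(i))) {
                    derived.add(cols.get(0) + " = " + cols.get(i));
                }
            }
        }
        return derived;
    }

    public static void main(String[] args) {
        List<Eq> where = Arrays.asList(new Eq("ta.ca", "expr"), new Eq("tb.cb", "expr"));
        System.out.println(infer(where)); // prints [ta.ca = tb.cb]
    }
}
```

Columns equated to different expressions end up in different groups, so no condition is derived between them.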

The use case for us stems from SPARQL to SQL rewriting, where SPARQL queries 
such as

{code:java}
SELECT * {
  dbr:Leipzig a ?type .
  dbr:Leipzig dbo:mayor ?mayor
}
{code}
result in an SQL query similar to

{noformat}
SELECT * FROM s.rdf a, s.rdf b WHERE a.s = 'dbr:Leipzig' AND b.s = 'dbr:Leipzig'
{noformat}

A consequence of the join condition not being recognized is that Apache Flink 
does not find an executable plan to process the query.

Self-contained example:
{code:java}
package my.pkg; // "package" is a reserved word and cannot be used as a package name segment

import org.apache.calcite.adapter.java.ReflectiveSchema;
import org.apache.calcite.plan.RelOptUtil;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.RelRoot;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParser;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;
import org.junit.Test;


public class TestCalciteJoin {
    public static class Triple {
        public String s;
        public String p;
        public String o;

        public Triple(String s, String p, String o) {
            super();
            this.s = s;
            this.p = p;
            this.o = o;
        }

    }

    public static class TestSchema {
        public final Triple[] rdf = {new Triple("s", "p", "o")};
    }


    @Test
    public void testCalciteJoin() throws Exception {
        SchemaPlus rootSchema = Frameworks.createRootSchema(true);

        rootSchema.add("s", new ReflectiveSchema(new TestSchema()));

        Frameworks.ConfigBuilder configBuilder = Frameworks.newConfigBuilder();
        configBuilder.defaultSchema(rootSchema);
        FrameworkConfig frameworkConfig = configBuilder.build();

        SqlParser.ConfigBuilder parserConfig =
            SqlParser.configBuilder(frameworkConfig.getParserConfig());
        parserConfig
            .setCaseSensitive(false)
            .setConfig(parserConfig.build());

        Planner planner = Frameworks.getPlanner(frameworkConfig);

        // SELECT * FROM s.rdf a, s.rdf b WHERE a.s = 5 AND b.s = 5
        SqlNode sqlNode = planner.parse("SELECT * FROM \"s\".\"rdf\" \"a\", "
            + "\"s\".\"rdf\" \"b\" WHERE \"a\".\"s\" = 5 AND \"b\".\"s\" = 5");
        planner.validate(sqlNode);
        RelRoot relRoot = planner.rel(sqlNode);
        RelNode relNode = relRoot.project();
        System.out.println(RelOptUtil.toString(relNode));
    }
}
{code}



Actual plan:
{code:java}
LogicalProject(s=[$0], p=[$1], o=[$2], s0=[$3], p0=[$4], o0=[$5])
  LogicalFilter(condition=[AND(=($0, 5), =($3, 5))])
    LogicalJoin(condition=[true], joinType=[inner])
      EnumerableTableScan(table=[[s, rdf]])
      EnumerableTableScan(table=[[s, rdf]])
{code}

Expected plan fragment:
{code:java}
    LogicalJoin(condition=[=($0, $3)], joinType=[inner])
{code}





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
