The join is a sign that the conversion from SqlNode to RelNode has actually 
also decorrelated your query, converting the IN into a JOIN.

If you look at PlannerImpl you will see this for the rel() function, which you 
call last:

 @Override public RelRoot rel(SqlNode sql) {
    ensure(State.STATE_4_VALIDATED);
    SqlNode validatedSqlNode =
        requireNonNull(this.validatedSqlNode,
            "validatedSqlNode is null. Need to call #validate() first");
    final RexBuilder rexBuilder = createRexBuilder();
    final RelOptCluster cluster =
        RelOptCluster.create(requireNonNull(planner, "planner"),
            rexBuilder);
    final SqlToRelConverter.Config config =
        sqlToRelConverterConfig.withTrimUnusedFields(false);
    final SqlToRelConverter sqlToRelConverter =
        new SqlToRelConverter(this, validator,
            createCatalogReader(), cluster, convertletTable, config);
    RelRoot root =
        sqlToRelConverter.convertQuery(validatedSqlNode, false, true);
    root = root.withRel(sqlToRelConverter.flattenTypes(root.rel, true));
    final RelBuilder relBuilder =
        config.getRelBuilderFactory().create(cluster, null);
    root =
        root.withRel(RelDecorrelator.decorrelateQuery(root.rel, relBuilder));
    state = State.STATE_5_CONVERTED;
    return root;
  }

I suspect that the highlighted step does this transformation.

One thing you can do is write an alternative version of this "rel" function 
which doesn't decorrelate the query at the end.

I don't know if you found the blog post I wrote on some of the Calcite program 
representations: https://www.feldera.com/blog/calcite-irs. It won't answer this 
question, but may be handy.

Mihai

________________________________
From: Diego Fernandez <[email protected]>
Sent: Tuesday, October 22, 2024 9:37 PM
To: [email protected] <[email protected]>
Subject: Help with filter that is turned into a LogicalJoin -> 
(LogicalTableScan, (LogicalAggregate -> LogicalUnion))

Hi,

We're using Calcite to parse a query and extract the selected Metrics & 
Dimensions and any filters to then make a request to our Semantic Layer. We 
really deal with fairly basic SQL, where we mostly ignore any agg functions 
(the aggs are defined in the SL), don't allow joins (a "table" just contains 
all available metrics and dimensions as columns), and ignore any group bys.

We mainly use various visitors to traverse through the parsed query to find all 
these things. We built it mainly through trial and error using the debugger and 
following random tidbits we've found online. We have everything working, but 
recently ran into a bug in our filter parsing when a query contains a big IN 
clause in the filter. It seems that instead of using a Filter, it's doing a 
LogicalJoin -> (LogicalTableScan, (LogicalAggregate -> LogicalUnion)) and I 
don't know how to parse that as a filter. With a smaller set of dates in the IN 
clause, it generates a Filter and I can parse that just fine. My guess is that 
this is due to some optimization rule; I'd love to either disable it, or find 
some way to turn the join back into a filter. Any help would be greatly 
appreciated!

The query is something like:

```
        SELECT "ALL"."Dimension" AS "Dimension",
          SUM("ALL"."Metric") AS "Metric"
        FROM "METRICS"."ALL" "ALL"
        WHERE ("ALL"."Metric Time" IN
               ((DATE '2024-08-01'), (DATE '2024-08-02'), (DATE '2024-08-03'), 
(DATE '2024-08-04'), (DATE '2024-08-05'), (DATE '2024-08-06'),
                (DATE '2024-08-07'), (DATE '2024-08-08'), (DATE '2024-08-09'), 
(DATE '2024-08-10'), (DATE '2024-08-11'), (DATE '2024-08-12'),
                (DATE '2024-08-13'), (DATE '2024-08-14'), (DATE '2024-08-15'), 
(DATE '2024-08-16'), (DATE '2024-08-17'), (DATE '2024-08-18'),
                (DATE '2024-08-19'), (DATE '2024-08-20'), (DATE '2024-08-21'), 
(DATE '2024-08-22'), (DATE '2024-08-23'), (DATE '2024-08-24'),
                (DATE '2024-08-25'), (DATE '2024-08-26'), (DATE '2024-08-27'), 
(DATE '2024-08-28'), (DATE '2024-08-29'), (DATE '2024-08-30'),
                (DATE '2024-08-31')))
        GROUP BY 1
```

And here's some useful debugger output:
[image.png]

This is how we generate the plan:

```
    private fun getPlan(query: String, namedSchema: NamedSchema): RelRoot {
        val rootSchema = Frameworks.createRootSchema(true)
        rootSchema.add(namedSchema.name, namedSchema.schema)
        // Lenient conformance helps lower parsing errors such as groupBy 
ordinal on date_trunc
        // fields
        val conformance = SqlConformanceEnum.LENIENT

        // I'm not sure if Babel parser is actually needed here, but found 
recommendations for it
        // online
        val parserConfig =
            
SqlParser.Config.DEFAULT.withParserFactory(SqlBabelParserImpl.FACTORY)
                .withConformance(conformance)
        val validatorConfig = 
SqlValidator.Config.DEFAULT.withConformance(conformance)
        val operatorTable =
            SqlLibraryOperatorTableFactory.INSTANCE.getOperatorTable(
                SqlLibrary.STANDARD,
                SqlLibrary.POSTGRESQL,
                SqlLibrary.BIG_QUERY
            )
        val frameworkConfig =
            Frameworks.newConfigBuilder()
                .parserConfig(parserConfig)
                .sqlValidatorConfig(validatorConfig)
                .defaultSchema(rootSchema)
                .operatorTable(operatorTable)
                .ruleSets(listOf())
                .programs(listOf())
                .costFactory(null)
                .build()
        val planner = Frameworks.getPlanner(frameworkConfig)
        val parsed = planner.parse(query)
        val validated = planner.validate(parsed)
        return planner.rel(validated)
    }
```



--



Diego Fernandez

Senior Software Engineer - Semantic Layer

P: 336-324-0111<tel:336-324-0111>

www.getdbt.com<http://www.getdbt.com/?utm_medium=email&utm_source=opensense>

[https://smart.dbtlabs.com/v2/imagebucket/dbtlabs.com/dbt-labs.png]<http://www.getdbt.com/?utm_medium=email&utm_source=opensense>

Connect with us

[github]<https://github.com/dbt-labs/dbt>

[linkedin]<http://linkedin.com/company/dbtlabs/>

[https://smart.dbtlabs.com/v2/imagebucket/dbtlabs.com/twitter.png]<https://twitter.com/dbt_labs>

[https://smart.dbtlabs.com/v2/imagebucket/dbtlabs.com/youtube.png]<https://www.youtube.com/c/dbt-labs>

[2025 State of Analytics Engineering 
Report]<https://smart.dbtlabs.com/v2/a/2025_state_of_analytics_report/67187d9a6bcf4a2a6738ac12-zA1Wz/httpswww.getdbt.comresourcesreportsstate-of-analytics-engineering-2024utm_mediumemailutm_sourceopensenseutm_campaignfy-2025_the-state-of-analytics-engineering-report_awutm_content____utm_term___>
 

Reply via email to