Hi,

We're using Calcite to parse a query and extract the selected Metrics &
Dimensions and any filters to then make a request to our Semantic Layer. We
really deal with fairly basic SQL, where we mostly ignore any agg functions
(the aggs are defined in the SL), don't allow joins (a "table" just
contains all available metrics and dimensions as columns), and ignore any
group bys.

We mainly use various visitors to traverse through the parsed query to find
all these things. We built it mainly through trial and error using the
debugger and following random tidbits we've found online. We have
everything working, but recently ran into a bug in our filter parsing when
a query contains a big IN clause in the filter. It seems that instead of
using a Filter, it's doing a LogicalJoin -> (LogicalTableScan,
(LogicalAggregate -> LogicalUnion)) and I don't know how to parse that as a
filter. With a smaller set of dates in the IN clause, it generates a Filter
and I can parse that just fine. My guess is that this is due to some
optimization rule; I'd love to either disable it, or find some way to turn
the join back into a filter. Any help would be greatly appreciated!

The query is something like:

```
        SELECT "ALL"."Dimension" AS "Dimension",
          SUM("ALL"."Metric") AS "Metric"
        FROM "METRICS"."ALL" "ALL"
        WHERE ("ALL"."Metric Time" IN
               ((DATE '2024-08-01'), (DATE '2024-08-02'), (DATE
'2024-08-03'), (DATE '2024-08-04'), (DATE '2024-08-05'), (DATE
'2024-08-06'),
                (DATE '2024-08-07'), (DATE '2024-08-08'), (DATE
'2024-08-09'), (DATE '2024-08-10'), (DATE '2024-08-11'), (DATE
'2024-08-12'),
                (DATE '2024-08-13'), (DATE '2024-08-14'), (DATE
'2024-08-15'), (DATE '2024-08-16'), (DATE '2024-08-17'), (DATE
'2024-08-18'),
                (DATE '2024-08-19'), (DATE '2024-08-20'), (DATE
'2024-08-21'), (DATE '2024-08-22'), (DATE '2024-08-23'), (DATE
'2024-08-24'),
                (DATE '2024-08-25'), (DATE '2024-08-26'), (DATE
'2024-08-27'), (DATE '2024-08-28'), (DATE '2024-08-29'), (DATE
'2024-08-30'),
                (DATE '2024-08-31')))
        GROUP BY 1
```

And here's some useful debugger output:
[image: image.png]

This is how we generate the plan:

```
    private fun getPlan(query: String, namedSchema: NamedSchema): RelRoot {
        val rootSchema = Frameworks.createRootSchema(true)
        rootSchema.add(namedSchema.name, namedSchema.schema)
        // Lenient conformance helps lower parsing errors such as groupBy
ordinal on date_trunc
        // fields
        val conformance = SqlConformanceEnum.LENIENT

        // I'm not sure if Babel parser is actually needed here, but found
recommendations for it
        // online
        val parserConfig =

SqlParser.Config.DEFAULT.withParserFactory(SqlBabelParserImpl.FACTORY)
                .withConformance(conformance)
        val validatorConfig =
SqlValidator.Config.DEFAULT.withConformance(conformance)
        val operatorTable =
            SqlLibraryOperatorTableFactory.INSTANCE.getOperatorTable(
                SqlLibrary.STANDARD,
                SqlLibrary.POSTGRESQL,
                SqlLibrary.BIG_QUERY
            )
        val frameworkConfig =
            Frameworks.newConfigBuilder()
                .parserConfig(parserConfig)
                .sqlValidatorConfig(validatorConfig)
                .defaultSchema(rootSchema)
                .operatorTable(operatorTable)
                .ruleSets(listOf())
                .programs(listOf())
                .costFactory(null)
                .build()
        val planner = Frameworks.getPlanner(frameworkConfig)
        val parsed = planner.parse(query)
        val validated = planner.validate(parsed)
        return planner.rel(validated)
    }
```



-- 

Diego Fernandez
Senior Software Engineer - Semantic Layer
336-324-0111
 

Reply via email to