Good! :) A RelShuttle is another way of transforming relational expressions (using a Visitor design pattern, rather than a planner). For "simple", very specific, well-located conversions, it can be a good option. I'd suggest to take a look at RelShuttle, RelShuttleImpl, RelHomogenousShuttle, or some of the implementations within Calcite-core to have a better idea. You would need to override the appropriate "visit" method(s), and return the corresponding equivalent RelNode (as you did in your rule).
On Sat, Aug 24, 2024 at 1:34 PM Yarin Benado <[email protected]> wrote: > Bingo Ruben! > Matching on Project+Filter+TableScan (and then Project+Filter+Join...) > works! > > Can you elaborate on the RelShuttle approach though? How would that work? > Transforming a Logical plan to another Logical plan outside the planner? > > On Sat, Aug 24, 2024 at 3:16 PM Ruben Q L <[email protected]> wrote: > > > When you call "call.transformTo(newRelNode)", the newRelNode needs to be > an > > equivalent relational expression (it shall produce the same result, with > > the same rowType, as the original). > > What if your rule matches Project+Filter+TableScan, and you generate the > > newFilter, and on top of that a newProject, which keeps the original > > rowType? > > As I said before, a RelShuttle could be a different approach (with more > > flexibility). > > > > > > On Sat, Aug 24, 2024 at 11:07 AM Yarin Benado <[email protected]> wrote: > > > > > I'll try to better explain my goal here. > > > Here's the scenario: > > > > > > My input SQL is this: > > > > > > SELECT source, name, email > > > FROM contacts > > > WHERE name = 'John' > > > > > > Since the internal PostgreSQL schema is a lot more complex than this > (for > > > few reasons such as supporting dynamic user-defined fields and other > > > performance aspects), the actual SQL that should be executed is this: > > > > > > SELECT field_11, field_2, field_3 > > > FROM contacts c > > > JOIN contacts_indexed_fields cif ON c.id = cif.contact_id > > > WHERE cif.field_id = 'field_1' and string_value = 'John' > > > > > > So from this: > > > === Input Logical Plan: > > > LogicalProject(SOURCE=[$3], NAME=[$2], EMAIL=[$4]) > > > LogicalFilter(condition=[=($2, 'John')]) > > > LogicalTableScan(table=[[contacts]]) > > > > > > To this: > > > === Desired Output Logical Plan: > > > LogicalProject(FIELD_1=[$6], FIELD_2=[$7], FIELD_3=[$8]) > > > LogicalFilter(condition=[AND(=($11, 'field_1'), =($12, 'John'))]) > > > LogicalJoin(condition=[=($0, $10)], joinType=[inner]) > > > LogicalTableScan(table=[[contacts]]) > > > LogicalTableScan(table=[[contacts_indexed_fields]]) > > > > > > I'm able to create the new LogicalFilter and LogicalProject nodes - but > > > calling "call.transformTo(newFilter)" throws an exception as the new > > > filter's input rowtype s not the same: > > > > > > (link to Full exception and code snippet here: > > > https://gist.github.com/yarinb/f532112cf2f0d994c89eaf15f3c8d5f0) > > > > > > java.lang.AssertionError: Cannot add expression of different type to > set: > > > set type is RecordType(BIGINT NOT NULL id, BIGINT NOT NULL contact_id, > > > VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT NULL email, > > > VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1, VARCHAR NOT NULL > > > field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4) NOT NULL > > > > > > expression type is RecordType(BIGINT NOT NULL id, BIGINT NOT NULL > > > contact_id, VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT > > > NULL email, VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1, > > VARCHAR > > > NOT NULL field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4, > > > BIGINT NOT NULL id0, VARCHAR NOT NULL field_id, VARCHAR NOT NULL > > > string_value, DECIMAL(19, 0) NOT NULL numeric_value, TIMESTAMP_TZ(0) > NOT > > > NULL date_value) NOT NULL > > > > > > set is rel#6:LogicalFilter.NONE(input=HepRelVertex#5,condition==($2, > > > 'John')) > > > expression is LogicalFilter(condition=[=($12, 'John')]) > > > LogicalJoin(condition=[=($6, $11)], joinType=[inner]) > > > LogicalTableScan(table=[[contacts]]) > > > LogicalTableScan(table=[[contacts_indexed_fields]]) > > > > > > Type mismatch: the field sizes are not equal. > > > rowtype of original rel: RecordType(BIGINT NOT NULL id, BIGINT NOT NULL > > > contact_id, VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT > > > NULL email, VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1, > > VARCHAR > > > NOT NULL field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4) > NOT > > > NULL > > > rowtype of new rel: RecordType(BIGINT NOT NULL id, BIGINT NOT NULL > > > contact_id, VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT > > > NULL email, VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1, > > VARCHAR > > > NOT NULL field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4, > > > BIGINT NOT NULL id0, VARCHAR NOT NULL field_id, VARCHAR NOT NULL > > > string_value, DECIMAL(19, 0) NOT NULL numeric_value, TIMESTAMP_TZ(0) > NOT > > > NULL date_value) NOT NULL > > > > > > > > > On Thu, Aug 22, 2024 at 8:49 PM Ruben Q L <[email protected]> wrote: > > > > > > > If you mean you wish to transform A=>B where A's rowType is not the > > same > > > as > > > > B's rowType, then you won't be able to do that via a planner rule > > > (because > > > > there are checks in place which prevent precisely that, since it > would > > > mean > > > > a mistake in the conversion under normal circumstances). > > > > > > > > If you really know what you're doing, maybe you can achieve that via > > > > RelShuttle? > > > > > > > > > >
