Good! :)

A RelShuttle is another way of transforming relational expressions (using a
Visitor design pattern, rather than a planner).
For "simple", very specific, well-located conversions, it can be a good
option. I'd suggest to take a look at RelShuttle, RelShuttleImpl,
RelHomogenousShuttle, or some of the implementations within Calcite-core to
have a better idea.
You would need to override the appropriate "visit" method(s), and return
the corresponding equivalent RelNode (as you did in your rule).



On Sat, Aug 24, 2024 at 1:34 PM Yarin Benado <[email protected]> wrote:

> Bingo Ruben!
> Matching on Project+Filter+TableScan (and then Project+Filter+Join...)
> works!
>
> Can you elaborate on the RelShuttle approach though? How would that work?
> Transforming a Logical plan to another Logical plan outside the planner?
>
> On Sat, Aug 24, 2024 at 3:16 PM Ruben Q L <[email protected]> wrote:
>
> > When you call "call.transformTo(newRelNode)", the newRelNode needs to be
> an
> > equivalent relational expression (it shall produce the same result, with
> > the same rowType, as the original).
> > What if your rule matches Project+Filter+TableScan, and you generate the
> > newFilter, and on top of that a newProject, which keeps the original
> > rowType?
> > As I said before, a RelShuttle could be a different approach (with more
> > flexibility).
> >
> >
> > On Sat, Aug 24, 2024 at 11:07 AM Yarin Benado <[email protected]> wrote:
> >
> > > I'll try to better explain my goal here.
> > > Here's the scenario:
> > >
> > > My input SQL is this:
> > >
> > > SELECT source, name, email
> > > FROM contacts
> > > WHERE name = 'John'
> > >
> > > Since the internal PostgreSQL schema is a lot more complex than this
> (for
> > > few reasons such as supporting dynamic user-defined fields and other
> > > performance aspects), the actual SQL that should be executed is this:
> > >
> > > SELECT field_11, field_2, field_3
> > > FROM contacts c
> > > JOIN contacts_indexed_fields cif ON c.id = cif.contact_id
> > > WHERE cif.field_id = 'field_1' and string_value = 'John'
> > >
> > > So from this:
> > > === Input Logical Plan:
> > > LogicalProject(SOURCE=[$3], NAME=[$2], EMAIL=[$4])
> > >   LogicalFilter(condition=[=($2, 'John')])
> > >     LogicalTableScan(table=[[contacts]])
> > >
> > > To this:
> > > === Desired Output Logical Plan:
> > > LogicalProject(FIELD_1=[$6], FIELD_2=[$7], FIELD_3=[$8])
> > >   LogicalFilter(condition=[AND(=($11, 'field_1'), =($12, 'John'))])
> > >     LogicalJoin(condition=[=($0, $10)], joinType=[inner])
> > >       LogicalTableScan(table=[[contacts]])
> > >       LogicalTableScan(table=[[contacts_indexed_fields]])
> > >
> > > I'm able to create the new LogicalFilter and LogicalProject nodes - but
> > > calling "call.transformTo(newFilter)" throws an exception as the new
> > > filter's input rowtype s not the same:
> > >
> > > (link to Full exception and code snippet here:
> > > https://gist.github.com/yarinb/f532112cf2f0d994c89eaf15f3c8d5f0)
> > >
> > > java.lang.AssertionError: Cannot add expression of different type to
> set:
> > > set type is RecordType(BIGINT NOT NULL id, BIGINT NOT NULL contact_id,
> > > VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT NULL email,
> > > VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1, VARCHAR NOT NULL
> > > field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4) NOT NULL
> > >
> > > expression type is RecordType(BIGINT NOT NULL id, BIGINT NOT NULL
> > > contact_id, VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT
> > > NULL email, VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1,
> > VARCHAR
> > > NOT NULL field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4,
> > > BIGINT NOT NULL id0, VARCHAR NOT NULL field_id, VARCHAR NOT NULL
> > > string_value, DECIMAL(19, 0) NOT NULL numeric_value, TIMESTAMP_TZ(0)
> NOT
> > > NULL date_value) NOT NULL
> > >
> > > set is rel#6:LogicalFilter.NONE(input=HepRelVertex#5,condition==($2,
> > > 'John'))
> > > expression is LogicalFilter(condition=[=($12, 'John')])
> > >   LogicalJoin(condition=[=($6, $11)], joinType=[inner])
> > >     LogicalTableScan(table=[[contacts]])
> > >     LogicalTableScan(table=[[contacts_indexed_fields]])
> > >
> > > Type mismatch: the field sizes are not equal.
> > > rowtype of original rel: RecordType(BIGINT NOT NULL id, BIGINT NOT NULL
> > > contact_id, VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT
> > > NULL email, VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1,
> > VARCHAR
> > > NOT NULL field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4)
> NOT
> > > NULL
> > > rowtype of new rel: RecordType(BIGINT NOT NULL id, BIGINT NOT NULL
> > > contact_id, VARCHAR NOT NULL name, VARCHAR NOT NULL source, VARCHAR NOT
> > > NULL email, VARCHAR NOT NULL account_id, VARCHAR NOT NULL field_1,
> > VARCHAR
> > > NOT NULL field_2, VARCHAR NOT NULL field_3, VARCHAR NOT NULL field_4,
> > > BIGINT NOT NULL id0, VARCHAR NOT NULL field_id, VARCHAR NOT NULL
> > > string_value, DECIMAL(19, 0) NOT NULL numeric_value, TIMESTAMP_TZ(0)
> NOT
> > > NULL date_value) NOT NULL
> > >
> > >
> > > On Thu, Aug 22, 2024 at 8:49 PM Ruben Q L <[email protected]> wrote:
> > >
> > > > If you mean you wish to transform A=>B where A's rowType is not the
> > same
> > > as
> > > > B's rowType, then you won't be able to do that via a planner rule
> > > (because
> > > > there are checks in place which prevent precisely that, since it
> would
> > > mean
> > > > a mistake in the conversion under normal circumstances).
> > > >
> > > > If you really know what you're doing, maybe you can achieve that via
> > > > RelShuttle?
> > > >
> > >
> >
>

Reply via email to