Hi Mihai and Julian,
and thanks for getting back to me.
I'm glad to hear that I'm on the right track by not using Rules and the
Planner for this, thanks for clarifying.
I got it working, and now my tests are passing. Thinking I was being smart,
I changed your suggestion from
"SELECT " + filterExpression + " FROM " + table
into
"SELECT * FROM " + table + " WHERE " + filterExpression
since that felt more in line with what I'm doing. But that breaks down when
filterExpression is, for example, just "true" as the optimizer removes the
LogicalFilter altogether. So I'm going to stick with your approach.
I have to admit that concatenating strings like this is a bit
uncomfortable, however. Will this be subject to SQL injections? For a
malicous user's table, she might specify this filter expression:
* FROM sensitive_table --
But that will luckily not give her access to sensitive_table since we're
only extracting the first RexNode and not the resulting TableScan. But
perhaps there are other tricks I'm not considering?
This is a good starting point, thank you for helping out!
K.
On Mon, Mar 17, 2025 at 7:25 PM Julian Hyde <[email protected]> wrote:
> It seems that your requirement is for an ‘expression parser’ that works in
> the context of a table, and goes from SQL text via AST and validation to a
> RexNode.
>
> You can achieve this by generating a query, “select “ + expression + “
> from “ + table, and sending it through the parse/validate/SQL-to-rel
> process, after which you will surely receive a plan that looks like this:
>
> Project(rex)
> TableScan(table)
>
> From this plan you can mine ‘rex’ and splice it into other queries. If you
> do this ahead of time, the RexNode will refer to types in a different type
> factory (a soluble problem).
>
> Julian
>
> > On Mar 17, 2025, at 9:35 AM, Mihai Budiu <[email protected]> wrote:
> >
> > I don't think the planner is the right place; in general planner
> transformations preserve semantics, whereas your transformation won't
> necessarily. Planner rules are also designed for optimization, and here
> there is nothing to optimize.
> >
> > The visitor-based APIs are probably the main tool you need.
> > https://www.feldera.com/blog/calcite-irs#9099c1483ca6
> >
> > Mihai
> >
> > ________________________________
> > From: Kristian Lein-Mathisen <[email protected]>
> > Sent: Monday, March 17, 2025 7:57 AM
> > To: [email protected] <[email protected]>
> > Subject: Parsing expression inside RelShuttle
> >
> > Hi!
> >
> > I'm new to Calcite and I'd like to use it to transform incoming SQL
> queries
> > to conform to some generic policy rules, like wrapping all table-scans
> with
> > a row-level filter. I hope this is the right place to ask questions like
> > this. Here's an example of what I'm trying to acheive:
> >
> > -- admin says: ALTER TABLE orders SET TBLPROPERTIES (policy.filter =
> > 'country IN user_countries')
> > SELECT * FROM orders -- <-- from user
> > SELECT * FROM (SELECT * FROM orders WHERE country IN user_countries())
> --
> > <-- to backend
> >
> > This is fairly simple to acheive when you have a static filter
> expression.
> > Here's what I've got working:
> >
> > public static RelNode policyFilter(RelNode root, RelBuilder builder) {
> > return root.accept(new RelShuttleImpl() {
> > @Override
> > public RelNode visit(TableScan scan) {
> > RelNode n = super.visit(scan);
> > RexNode x = builder.in(builder.field("country"),
> > builder.literal("se"),
> > builder.literal("nl"),
> > builder.literal("dk"));
> > return builder
> > .push(n)
> > .filter(x)
> > .build();
> > }
> > });
> > }
> >
> > This works, provided all my tables have a "country" column. What I'd like
> > to do, however, is allow user-defined filter expressions: make "RexNode
> x"
> > from a per-table String. I'd like this to be evaluated in the context of
> > the current table (so you can reference column names) and session (so
> > current_user(), for example, is available). I've looked at the Planner,
> but
> > it doesn't seem it's exposing enough of its internals (I don't have
> access
> > to its validator.validateExpression, for example).
> >
> > Note that I'm new to Calcite, so I could be going around this completely
> > wrong. Looking at other uses of Calcite, I'm getting the impression that
> I
> > should be extending RelNodes and adding some Rules. I've also assumed
> that
> > I should be doing this with RelNodes instead of SqlNodes.
> >
> > For the record, Kyuubi does something very similar using Spark here:
> >
> https://github.com/apache/kyuubi/blob/master/extensions/spark/kyuubi-spark-authz/src/main/scala/org/apache/kyuubi/plugin/spark/authz/rule/rowfilter/RuleApplyRowFilter.scala#L42
> >
> > Thanks for your feedback,
> > - Kris
>
>