Hello Zhen Chen,

Thank you for the quick response!

I don't think I explained clearly what we want to achieve. What we want is
the ability to decide, for each individual condition of the filter, whether
that specific condition should be pushed down. We envision adding something
like the `withCondition` predicate to the `FilterJoinRule` configuration,
which would be evaluated once for each of the filter's conditions. The
`containsHeavyUDF` call in my original example is only an illustration; the
actual predicate would be defined in the rule configuration on our end and
would depend on the logic of the specific use case. This would give us
finer-grained control over which conditions get pushed down, instead of an
"all or nothing" approach. Ultimately, we don't want to change the default
behavior (we still want the benefits of pushing filters down); we want more
flexibility in deciding whether a given condition should be pushed down.
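
To make the per-condition idea concrete, here is a minimal, dependency-free
sketch. It is not Calcite code: conditions are plain strings rather than
`RexNode`s, and `PerConditionPushdownSketch`, `Split`, `split`, and the
string-based `containsHeavyUDF` are hypothetical names used only for
illustration. In Calcite itself I would expect the conjuncts to come from
something like `RelOptUtil.conjunctions`, but how this would be wired into
`FilterJoinRule` is an open question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Illustration only; none of these types are Calcite classes. */
public class PerConditionPushdownSketch {

    /** Conjuncts to push below the join, and conjuncts kept in the Filter above it. */
    public static final class Split {
        public final List<String> pushed = new ArrayList<>();
        public final List<String> retained = new ArrayList<>();
    }

    /** Evaluates the user-supplied predicate once per conjunct, not once per filter. */
    public static Split split(List<String> conjuncts, Predicate<String> canPush) {
        Split result = new Split();
        for (String conjunct : conjuncts) {
            if (canPush.test(conjunct)) {
                result.pushed.add(conjunct);
            } else {
                result.retained.add(conjunct);
            }
        }
        return result;
    }

    /** Hypothetical stand-in for use-case-specific logic (assumed UDF name). */
    public static boolean containsHeavyUDF(String conjunct) {
        return conjunct.contains("HEAVY_UDF(");
    }

    public static void main(String[] args) {
        // WHERE a.id > 10 AND HEAVY_UDF(a.payload) = 1 AND b.region = 'EU'
        List<String> conjuncts =
            List.of("a.id > 10", "HEAVY_UDF(a.payload) = 1", "b.region = 'EU'");
        Split s = split(conjuncts, c -> !containsHeavyUDF(c));
        System.out.println("pushed:   " + s.pushed);
        System.out.println("retained: " + s.retained);
    }
}
```

With this shape, the cheap conditions are still pushed down, and only the
condition that calls the heavy UDF stays above the join.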

You also brought up a good point about filters being pushed below other
operators. For our specific use case, we are only interested in solving
this for JOINs, as we haven't identified similar challenges with other
operators.

Thank you,
Danylo

On Mon, 13 Oct 2025 at 12:18, 我 <[email protected]> wrote:

> Hi Danylo,
>
> Based on my current understanding of Calcite, the pushdown of conditions
> from Filter clauses can occur not only below Join operators but also below
> operators such as Aggregate, Project, SetOp, etc. Many operators and rules
> are involved in this process.
>
> I'm uncertain whether adding containsHeavyUDF solely to the
> FILTER_INTO_JOIN rule would fully address your requirements. This approach
> would also prevent all conditions in the current Filter from being pushed
> down.
>
> If you are using HepPlanner and only intend to modify the FILTER_INTO_JOIN
> rule, you could copy the code of FILTER_INTO_JOIN and implement your own
> logic.
>
> If you are using VolcanoPlanner, you might consider modifying the cost
> calculation in Filter#computeSelfCost. When encountering what you consider
> HeavyUDFs, you could increase the cost of this Filter, and VolcanoPlanner
> would then select the optimal plan based on these cost calculations.
>
> I haven't tested this approach myself, but I offer this idea for your
> consideration.
>
> Best,
>
> Zhen Chen
>
> At 2025-10-13 17:51:20, "Danylo Naumenko"
> <[email protected]> wrote:
> >Hello,
> >
> >We're currently working with Calcite and have a use case where we use
> >expensive User-Defined Functions in our `WHERE` clauses.
> >
> >We've noticed that Calcite's default behavior is to eagerly push these
> >filters down the plan. However, for these specific "heavy" UDFs, this
> >optimization can sometimes be detrimental to performance, as the cost of
> >executing the function on many rows outweighs the benefit of filtering
> >early.
> >
> >We were wondering if `FilterJoinRule` could be made more configurable? For
> >example, this might look something like this:
> >```
> >FilterJoinRule.Config MY_CONFIG = CoreRules.FILTER_INTO_JOIN.config
> >    .withPredicate(...)
> >    .withCondition(call -> {
> >        Filter filter = call.rel(0);
> >        // User-defined logic to inspect the filter
> >        return !containsHeavyUDF(filter.getCondition());
> >    });
> >```
> >
> >If the `withCondition` predicate returns `false`, that specific filter
> >does not get pushed down.
> >
> >Does this seem like a reasonable addition?
> >
> >Best regards,
> >Danylo
>
