Re: Trait propagation in heterogeneous plans

Vladimir Ozerov Thu, 06 May 2021 04:03:18 -0700

Hi,

I'd like to stress that I am not trying to argue about subjective
concepts at all. Quite the opposite - I would like to agree or disagree on
a set of objective facts and find the solution. Specifically, from what I
saw in Calcite's codebase and real projects, I assert the following:


   1. Calcite-based projects may use custom traits.
   2. Enumerable in its current state cannot propagate any traits except
   for collation. The relevant code is simply missing from the product; it
   was never implemented.
   3. Despite (2), Enumerable rules/operators may demand unsupported traits
   from inputs, or expose unsupported traits, which may lead to problems on
   the user side (an example is in the first message of this thread).

Do you agree with these points?

If we are in agreement here, then I propose only one thing - fix (3),
because it affects real-life integrations. The fix is trivial:

   - Make sure that Enumerable operators never set non-default trait values
   for anything except for collation. For example, EnumerableProjectRule
   creates an operator with the correct trait set, whilst
   EnumerableFilterRule propagates unsupported traits.
   - Replace RelNode.getTraitSet with RelOptCluster.traitSet when deducing
   the desired input trait set in Enumerable rules.

These two fixes would ensure that we never have any non-default values of
any traits except for collation in Enumerable operators. On the one hand,
it fixes (3). On the other hand, it doesn't break anything, because thanks
to (2) there is nothing to break.
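To make the difference concrete, here is a toy sketch in plain Java. It
does not use Calcite classes: the map-based trait set, the "custom" trait,
and the method names propagating/resetting are all hypothetical and only
model the two behaviours described above. How collation is carried over
is simplified as well.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TraitSketch {
    // Toy stand-in for the cluster's default trait set.
    // NOT Calcite's RelTraitSet; trait names and values are made up.
    static Map<String, String> defaults() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("convention", "NONE");
        m.put("collation", "[]");
        m.put("custom", "ANY"); // hypothetical user-defined trait
        return m;
    }

    // Returns a copy of the trait set with one trait replaced.
    static Map<String, String> replace(Map<String, String> traits,
                                       String name, String value) {
        Map<String, String> copy = new LinkedHashMap<>(traits);
        copy.put(name, value);
        return copy;
    }

    // Models a rule that starts from the node's own trait set:
    // every non-default trait, including "custom", leaks through.
    static Map<String, String> propagating(Map<String, String> nodeTraits) {
        return replace(nodeTraits, "convention", "ENUMERABLE");
    }

    // Models the proposed behaviour: start from the defaults and
    // keep only collation, so unsupported traits are reset.
    static Map<String, String> resetting(Map<String, String> nodeTraits) {
        Map<String, String> t = replace(defaults(), "convention", "ENUMERABLE");
        return replace(t, "collation", nodeTraits.get("collation"));
    }

    public static void main(String[] args) {
        Map<String, String> logical = replace(
            replace(defaults(), "collation", "[0 ASC]"),
            "custom", "HASH[0]");
        System.out.println("propagating: " + propagating(logical));
        System.out.println("resetting:   " + resetting(logical));
    }
}
```

In this model, propagating() keeps the non-default "custom" value that
Enumerable cannot actually honour, whereas resetting() exposes only the
convention and the collation.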

Does it make sense to you?

Regards,
Vladimir.


Thu, 6 May 2021 at 10:35, Vladimir Sitnikov <sitnikov.vladi...@gmail.com>:

> Vladimir,
>
> I generally agree with what you are saying,
>
> >Enumerable backend provides a clear and consistent contract: we support
> collation and reset everything
>
> That sounds like a way to go until there's a way to externalize "input
> trait enforcement" rules.
> "output" traits are simpler since they can be computed with metadataquery
> (however, we still hard-code the set of computed traits).
> It might be worth trying to compute all the traits known to the planner.
>
> However, Enumerable could play well with in-core distribution trait as
> well, so there's no need to limit enumerable to "collation only".
>
> If you don't like in-core distribution trait, you just do not use it.
> There's not much sense in limiting enumerable to collation only.
>
> Vladimir
>
