I think the goal should be to achieve trait propagation without relying
on the add-then-remove strategy. Consider a simple query with a two-table
join followed by a group-by. If I want to use a merge join and a
streaming aggregate, one {hash-distribute, sort} pair will be added on
each side of the merge join and another such pair for the 2-phase
streaming aggregate.

So, altogether 6 enforcer nodes (3 for sort, 3 for distribution) could
potentially be added for a simple query and then removed later if the
required traits are already available from the input. That add-then-remove
churn would add overhead to planning. Ideally,
this should be done through a requirements-driven mechanism where the
parent operator asks the child to satisfy the collation or distribution
requirement and the child recursively asks its descendants to satisfy it.
It should be the responsibility of the child to decide whether to add the
enforcer (if it cannot provide the trait natively), not that of the parent.
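
To make the idea concrete, here is a rough sketch in Python. It is not Drill
or Calcite code: every name in it (Traits, Node, enforce, scan, merge_join,
streaming_agg) is made up for illustration, the aggregate is modeled as
single-phase for brevity, and I assume a hash exchange destroys any existing
sort order. Each operator takes the traits its parent requires, asks its own
inputs for what it needs, and the child side alone decides whether an
enforcer is necessary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Traits:
    sort_key: Optional[str] = None   # collation
    hash_key: Optional[str] = None   # distribution


@dataclass
class Node:
    name: str
    provides: Traits
    inputs: Tuple["Node", ...] = ()


def enforce(node, required):
    """Child-side decision: wrap `node` only for traits it does not provide."""
    if required.hash_key and node.provides.hash_key != required.hash_key:
        # Assumption: redistributing rows loses any existing sort order.
        node = Node("HashExchange", Traits(None, required.hash_key), (node,))
    if required.sort_key and node.provides.sort_key != required.sort_key:
        node = Node("Sort", Traits(required.sort_key, node.provides.hash_key), (node,))
    return node


def scan(table, native=Traits()):
    def satisfy(required):
        return enforce(Node(f"Scan({table})", native), required)
    return satisfy


def merge_join(left, right, key):
    def satisfy(required):
        need = Traits(sort_key=key, hash_key=key)
        # Recursively ask both inputs to deliver the traits the join needs.
        node = Node("MergeJoin", need, (left(need), right(need)))
        return enforce(node, required)
    return satisfy


def streaming_agg(child, key):
    def satisfy(required):
        need = Traits(sort_key=key, hash_key=key)
        node = Node("StreamAgg", need, (child(need),))
        return enforce(node, required)
    return satisfy


def count_enforcers(node):
    own = 1 if node.name in ("HashExchange", "Sort") else 0
    return own + sum(count_enforcers(i) for i in node.inputs)


# Worst case: nothing is available natively and the group-by key differs.
worst = streaming_agg(merge_join(scan("t1"), scan("t2"), "k"), "g")(Traits())

# Best case: both inputs are already sorted and distributed on the join key,
# and the group-by uses the same key, so no enforcer is ever created.
ready = Traits(sort_key="k", hash_key="k")
best = streaming_agg(
    merge_join(scan("t1", ready), scan("t2", ready), "k"), "k"
)(Traits())
```

In the worst case this builds the same 6 enforcers described above; in the
best case it builds none, so there is nothing to remove afterwards.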

Aman

On Mon, Mar 14, 2016 at 3:09 PM, Jacques Nadeau <jacq...@dremio.com> wrote:

> Hey All,
>
> I've been thinking about the SubsetTransformer pattern [1] that we use in
> Drill to ensure trait propagation. It was discussed here in Calcite [2].
>
> Julian felt that the correct solution (and the patch he ultimately
> applied) was to use a create and then remove behavior. Take a look at his
> revision to my test here [3] where he adds the SortRemoveRule in order to
> remove an extraneous Sort operation.
>
> It seems like we need to either introduce a new mechanism in Calcite to
> accomplish this or we need to adopt the removal behavior. (I also believe
> there are a small set of situations where we insert distribution for
> parallelization purposes as opposed to a requirement for a particular
> operation... we'll need to determine how those work and figure out how to
> express them correctly in this removal pattern.)
>
> Thoughts?
>
> [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java
> [2] https://issues.apache.org/jira/browse/CALCITE-606
> [3] https://github.com/julianhyde/calcite/commit/fb203dc4b9aea89bfed839c22ae3e285044df400#diff-9494b27dde1061ef95e3853cb6222b5bR103
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
