Hi Benchao,

Thanks for the input. I believe there are cases where hashagg and sortagg differ in performance and runtime. The information about Flink is helpful!

Another question: do you know of other database systems that implement both hashagg and sortagg, or that have a cost model to choose between them?

I found that another open-source query optimizer, GP Orca, includes them in its cost model: https://github.com/greenplum-db/gporca/blob/2f8d2f6e3a9a466588efe8b4814d12188ce7ed2f/libgpdbcost/src/CCostModelGPDB.cpp
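
To make the question concrete, here is a rough sketch of the kind of cost distinction I have in mind. Everything in it is an assumption for illustration: the class name AggCostSketch, the per-row weights, and the formulas are made up and are not taken from Calcite, Flink, or GP Orca. It only shows the general idea that a hash aggregate pays per input row plus per group, while a sort-based aggregate is cheap on already-sorted input and pays roughly n log n otherwise.

```java
public class AggCostSketch {
    // Hypothetical per-row CPU weights; not taken from any real optimizer.
    static final double HASH_COST_PER_ROW = 1.0;
    static final double COMPARE_COST_PER_ROW = 0.2;

    /** Hash aggregation: hash every input row, keep one table entry per group. */
    public static double hashAggCost(double inputRows, double groups) {
        return inputRows * HASH_COST_PER_ROW + groups;
    }

    /**
     * Sort-based aggregation: a single streaming pass if the input is already
     * sorted on the group keys; otherwise an n * log2(n) sort comes first.
     */
    public static double sortAggCost(double inputRows, boolean inputSorted) {
        double scan = inputRows * COMPARE_COST_PER_ROW;
        if (inputSorted) {
            return scan;
        }
        double log2n = Math.log(Math.max(inputRows, 2.0)) / Math.log(2.0);
        return inputRows * log2n * COMPARE_COST_PER_ROW + scan;
    }

    /** Returns "hash" or "sort", whichever is cheaper under this toy model. */
    public static String cheaper(double inputRows, double groups, boolean inputSorted) {
        return hashAggCost(inputRows, groups) <= sortAggCost(inputRows, inputSorted)
                ? "hash" : "sort";
    }

    public static void main(String[] args) {
        System.out.println(cheaper(1_000_000, 1_000, true));
        System.out.println(cheaper(1_000_000, 1_000, false));
    }
}
```

Under these made-up weights the sorted-input case favors the sort-based aggregate and the unsorted case favors the hash aggregate, which is the kind of decision I would expect a planner to make from costs.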

Thanks!
Jigao




On 2023/02/21 14:46:06 Benchao Li wrote:
> Hi Jigao,
>
> It seems we have not implemented different cost models for them yet, hence
> they cannot be chosen via costs.
>
> Calcite Enumerable is not a distributed implementation, and is not supposed
> to be high performance. I guess there may not be much difference between
> using hash agg and sort merge agg. But I think we can accept such a
> contribution to make this complete, and a good example for other projects.
>
> Besides, Flink has implemented different cost models for hash and sort
> merge aggregation[1][2], which you can take a look at.
>
> [1]
> https://github.com/apache/flink/blob/5cda70d873c9630c898d765633ec7a6cfe53e3c6/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/nodes/physical/batch/BatchPhysicalHashAggregateBase.scala#L59
> [2]
> https://github.com/apache/flink/blob/5cda70d873c9630c898d765633ec7a6cfe53e3c6/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/nodes/physical/batch/BatchPhysicalSortAggregateBase.scala#L56
>
> Jigao Luo wrote on Tue, Feb 21, 2023 at 14:00:
>
> > Hi Calcite Contributors,
> >
> > I have a question about Aggregation Enumerable Rules (Converter Rule).
> > We know that EnumerableAggregate and EnumerableSortedAggregate are
> > physical operators of a logical Aggregate operator. I would like to know
> > if we have implemented rules or costs to select one of them when
> > building a physical plan.
> >
> > So this is what I mean:
> > ```
> > planner.addRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
> > planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
> > /// then in [Physical plan], one of them is selected depending on the query
> > ```
> >
> > Thanks!
> > Jigao
> >
>
>
> --
>
> Best,
> Benchao Li
>
