Currently, Calcite uses the Project operator and all kinds of 
ProjectXXXTransposeRule to prune unused columns. Every operator's output columns 
reference the child operator's columns by index. If a Project operator has a 
Filter as its child and we push the project down under the Filter, we get 
Project-Filter-Project-FilterInput. All the newly generated RelNodes will 
trigger rule matches. E.g. if we already applied ReduceExpressionsRule to the 
filter, but the input ref indexes in the new filter's RexCall changed, we have 
to apply ReduceExpressionsRule to the new filter again, even if there is nothing 
that can be reduced. Similarly, the operator transpose/merge rules will be 
triggered again for the new operators. This can trigger a lot of rule matches.
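
To make the index shift concrete, here is a toy sketch in Python. It is not 
Calcite code: expressions are plain tuples, and the helper names (remap_refs, 
used_refs, push_project_under_filter) are made up for illustration. It only 
shows why the filter condition looks "new" after the project is pushed down.

```python
# Toy model, NOT Calcite's API. ("ref", i) stands in for an input ref like $i,
# ("lit", v) for a literal, and any other tuple is (operator, *operands).

def remap_refs(expr, mapping):
    """Rewrite every ("ref", i) through mapping (old index -> new index)."""
    if expr[0] == "ref":
        return ("ref", mapping[expr[1]])
    if expr[0] == "lit":
        return expr
    op, *args = expr
    return (op, *(remap_refs(a, mapping) for a in args))


def used_refs(expr):
    """Collect the input indexes an expression reads."""
    if expr[0] == "ref":
        return {expr[1]}
    if expr[0] == "lit":
        return set()
    return set().union(*(used_refs(a) for a in expr[1:]))


def push_project_under_filter(project_cols, condition):
    """Project(cols) over Filter(cond) becomes
    Project(top) over Filter(new_cond) over Project(bottom)."""
    # The bottom project keeps what the top project and the filter both need.
    bottom = sorted(set(project_cols) | used_refs(condition))
    mapping = {old: new for new, old in enumerate(bottom)}
    new_condition = remap_refs(condition, mapping)
    top = [mapping[c] for c in project_cols]
    return top, new_condition, bottom


# Filter "$5 > 10" over a wider input; the top project keeps $0 and $3.
condition = (">", ("ref", 5), ("lit", 10))
top, new_condition, bottom = push_project_under_filter([0, 3], condition)

print(bottom)         # columns kept by the pushed-down project
print(new_condition)  # same predicate, but the input ref index has shifted
print(new_condition == condition)
```

The predicate is semantically unchanged, yet it no longer compares equal to the 
old one, so a planner that identifies expressions by their text or digest sees 
a brand-new filter and matches rules against it again.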

A MEMO group (RelSet) represents logically equivalent expressions. All the 
expressions in one group should share the same logical properties, e.g. 
functional dependencies, constraint properties etc. But currently they do not 
share them; the properties are computed for each individual operator.
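
A rough sketch of the difference, again as a toy in Python rather than Calcite 
code. The Group class, the CountingDeriver helper, and the placeholder 
property are all hypothetical; the point is only that a group-level cache 
derives logical properties once instead of once per member.

```python
# Toy memo group, NOT Calcite's RelSet. All names here are hypothetical.

class CountingDeriver:
    """Stand-in for an expensive property derivation; counts its invocations."""

    def __init__(self):
        self.calls = 0

    def __call__(self, expr):
        self.calls += 1
        # Placeholder result; a real optimizer would derive keys, FDs, etc.
        return {"unique_keys": [("id",)]}


class Group:
    """A set of logically equivalent expressions sharing one property cache."""

    def __init__(self, exprs):
        self.exprs = list(exprs)
        self._props = None

    def logical_props(self, derive):
        # Derive from any one member; equivalence means the result is valid
        # for every expression in the group.
        if self._props is None:
            self._props = derive(self.exprs[0])
        return self._props


group = Group(["Join(A, B)", "Join(B, A)", "Join(A, B) with pushed filter"])

# Per-operator computation: one derivation for every member.
per_operator = CountingDeriver()
for expr in group.exprs:
    per_operator(expr)

# Group-level computation: one derivation, reused by every member.
shared = CountingDeriver()
props = [group.logical_props(shared) for _ in group.exprs]

print(per_operator.calls, shared.calls)
```

With many equivalent expressions per group, the per-operator count grows with 
the group size while the shared count stays at one.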

Without resolving those issues, space pruning won't help much.

There is a lot of room for improvement. I hope the community can join the 
effort to make Calcite better. 

- Haisheng

------------------------------------------------------------------
From: Roman Kondakov<kondako...@mail.ru.INVALID>
Date: 2020-01-10 19:39:51
To: <dev@calcite.apache.org>
Subject: Re: [DISCUSS] Proposal to add API to force rules matching specific rels

@Haisheng, could you please clarify what you mean by these points?

> - the poor-design of column pruning, 
> - lack of group properties etc.

I guess I'm not aware of these problems.

-- 
Kind Regards
Roman Kondakov


On 08.01.2020 02:21, Haisheng Yuan wrote:
>> @Haisheng, are you doing something like that?
> Kind of, but not exactly. It is about on-demand trait propagation.
> 
> @Roman seems to be keen on space pruning for Calcite. But IMHO, for now, the 
> main reason for Calcite's poor performance is not the lack of branch & bound 
> space pruning, but 
> - rule applying on physical nodes, 
> - randomness of rule matching,
> - the poor-design of column pruning, 
> - lack of on-demand trait propagation, 
> - lack of group properties etc.
> 
> We tried a change similar to Roman's on our product. We completely removed 
> rule match importance and its comparison, and split the optimization into 3 
> phases (exploration, implementation, enforcement) with a specific 
> top-down/bottom-up order; it achieved an almost 100% speedup.
> Even @vlsi's RexNode normalization can improve it to some degree.
> 
> Calcite currently generates only 1 join-order alternative for 6-way joins in 
> testJoinManyWay, not even the top 10, 100 or N! ordering alternatives, but it 
> still can't finish within a reasonable amount of time when abstract converter 
> is allowed. If there is only 1 join-order alternative, the query optimizer 
> should finish the optimization quickly even for clique or chain queries with 
> 20-way joins, without space pruning. But this is not the case for Calcite.
> 
> Simply put, space pruning is important for optimization, especially for 
> join reordering, but it is not an urgent issue for Calcite.
> 
> - Haisheng
> 
> ------------------------------------------------------------------
> From: Roman Kondakov<kondako...@mail.ru.INVALID>
> Date: 2020-01-08 02:39:19
> To: <dev@calcite.apache.org>
> Subject: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
> 
> I forgot to mention that this approach was inspired by Stamatis's idea [1]
> 
> [1]
> https://ponymail-vm.apache.org/_GUI_/thread.html/d8f8bc0efd091c0750534ca5cd224f4dfe8940c9d0a99ce486516fd5@%3Cdev.calcite.apache.org%3E
> 
> 
