Re: Re: Discussion about IMPALA-10789: Early materialize expressions in ScanNode

2021-07-20 Thread Xianqing
"It is difficult to cost - late materialisation can be a lot cheaper if there are predicates"

We can materialize the expression after the predicates are executed. For example:

select count(expr), sum(expr) from (select a+b+c expr from T where a>10) T where T.expr>20

1. Scan column a and execute a>1

Re: Re: Discussion about IMPALA-10789: Early materialize expressions in ScanNode

2021-07-20 Thread Tim Armstrong
I agree in that scenario where predicates are pushed down to the scan. It can be more of a trade-off if there are selective predicates later in the plan, e.g. after a join or aggregation node - in that case it depends on how expensive the expression is and how selective the predicates are. I defin