"It is difficult to cost - late materialisation can be a lot cheaper if there 
are predicates"<br/>We can materialize the expression after the predicates is 
executed. For example,<br/>select count(expr), sum(expr) from (select a+b+c 
expr from T where a&gt;10) T where T.expr&gt;20<br/>1. Scan column a and 
execute a&gt;10, then scan b, c<br/>2. Then Calculate a + b + c and execute 
expr&gt;20
At 2021-07-20 12:14:18, "Tim Armstrong" <tim.g.armstr...@gmail.com> wrote:
>I believe that this optimisation could be very beneficial. It is difficult
>to cost - late materialisation can be a lot cheaper if there are predicates.
>
>Pruning columns would be very useful for some queries just on its own. We
>tried to implement a related optimisation -
>https://issues.apache.org/jira/browse/IMPALA-2138 - and it showed some good
>performance benefits, but the implementation in the planner was quite
>complex because it tried to insert UNION operators at many points in the
>plan.
>
>We were hoping that https://issues.apache.org/jira/browse/IMPALA-3902 would
>be the long-term solution to getting more parallelism so that it doesn't
>matter whether expression evaluation is in the scan or not, but probably it
>makes sense to get the community's opinion on that.
>
>On Mon, 19 Jul 2021 at 20:35, hexianqing <hexianqing...@126.com> wrote:
>
>> I have submitted the following issue,
>> https://issues.apache.org/jira/browse/IMPALA-10789 <
>> https://issues.apache.org/jira/browse/IMPALA-10789>
>>
>> In some cases, the performance is better that early materialize
>> expressions in ScanNode.
>> For example,
>> SELECT SUM(col), COUNT(col), MIN(col), MAX(col) FROM ( SELECT
>> CAST(regexp_extract(string_col, '(\\d+)', 0) AS bigint) col FROM
>> functional_parquet.alltypesagg ) t
>> The expression only needs to be evaluated once if materialize expressions
>> in ScanNode.
>> I have roughly implemented this feature and the performance has improved
>> significantly.
>> I would like to discuss whether this feature can be contributed.
>>
>> Thanks!
>> Xianqing

Reply via email to