[
https://issues.apache.org/jira/browse/KYLIN-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865928#comment-17865928
]
pengfei.zhan edited comment on KYLIN-5875 at 7/15/24 7:59 AM:
--------------------------------------------------------------
The execution plan generated by VolcanoPlanner can be unstable. Given the
following sql
{code:sql}
select PART_DT, YEAR (CAL_DT) as y, YEAR (CAL_DT) * 100 + MONTH (CAL_DT) as ym,
YEAR (CAL_DT) * 10000 + MONTH (CAL_DT) * 100 + DAYOFMONTH(CAL_DT) as ymd,
CAL_DT, sum (price)
from SSB_STEP.KYLIN_SALES_SAMPLE a inner join SSB_STEP.KYLIN_CAL_DT_SAMPLE b
on a.PART_DT = b.CAL_DT
where (cast ("CAL_DT" as date) BETWEEN '2012-02-01'
and '2012-02-29'
and cast ("CAL_DT" as date) BETWEEN '2012-01-01'
and '2012-12-31' )
group by PART_DT, CAL_DT
order by PART_DT
LIMIT 500
{code}
The execution plan you get before kylin5.0-alpha might look like this
{code:java}
raceId: c114c6d5-071e-4cf2-c17b-34c9f22be1ae 2023-11-17T18:51:28,797 INFO
[project_38744] [Query 7eea9760-219b-dfb5-7433-8ecf4c95bf5e-332]
util.QueryContextCutter : select layout candidate for 1 olapContext cost 82 ms
traceId: c114c6d5-071e-4cf2-c17b-34c9f22be1ae 2023-11-17T18:51:28,797 INFO
[project_38744] [Query 7eea9760-219b-dfb5-7433-8ecf4c95bf5e-332]
exec.SparderPlanExec : OlapToEnumerableConverter
OlapProjectRel(PART_DT=[$0], Y=[EXTRACT(FLAG(YEAR), $1)],
YM=[+(*(EXTRACT(FLAG(YEAR), $1), 100), EXTRACT(FLAG(MONTH), $1))],
YMD=[+(+(*(EXTRACT(FLAG(YEAR), $1), 10000), *(EXTRACT(FLAG(MONTH), $1), 100)),
EXTRACT(FLAG(DAY), $1))], CAL_DT=[$1], EXPR$5=[$2], ctx=[0@NDataflow
[model_38744_01]])
OlapLimitRel(ctx=[0@NDataflow [model_38744_01]], fetch=[500])
OlapSortRel(sort0=[$0], dir0=[ASC-nulls-first], ctx=[0@NDataflow
[model_38744_01]])
OlapAggregateRel(group-set=[[0, 1]],
groups=[[SSB_STEP.KYLIN_SALES_SAMPLE.PART_DT,
SSB_STEP.KYLIN_CAL_DT_SAMPLE.CAL_DT]], EXPR$5=[SUM($2)], ctx=[0@NDataflow
[model_38744_01]])
OlapProjectRel(PART_DT=[$1], CAL_DT=[$22], PRICE=[$6],
ctx=[0@NDataflow [model_38744_01]])
OlapFilterRel(condition=[AND(>=($22, 2012-02-01), <=($22,
2012-02-29))], ctx=[0@NDataflow [model_38744_01]])
OlapJoinRel(condition=[=($1, $22)], joinType=[inner],
ctx=[0@NDataflow [model_38744_01]])
OlapTableScan(table=[[SSB_STEP, KYLIN_SALES_SAMPLE]],
ctx=[0@NDataflow [model_38744_01]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]])
OlapTableScan(table=[[SSB_STEP, KYLIN_CAL_DT_SAMPLE]],
ctx=[0@NDataflow [model_38744_01]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, 92, 93, 94, 95, 96, 97, 98, 99]])
{code}
The execution plan you get at kylin-5.0.0 might look like this
{code:java}
[49047] 2023-11-17T19:11:50,375 INFO [SuggestRunner-p-5-t-1]
util.QueryContextCutter : Cut context OlapToEnumerableConverter
OlapProjectRel(PART_DT=[$0], Y=[EXTRACT(FLAG(YEAR), $1)],
YM=[+(*(EXTRACT(FLAG(YEAR), $1), 100), EXTRACT(FLAG(MONTH), $1))],
YMD=[+(+(*(EXTRACT(FLAG(YEAR), $1), 10000), *(EXTRACT(FLAG(MONTH), $1), 100)),
EXTRACT(FLAG(DAY), $1))], CAL_DT=[$1], EXPR$5=[$2], ctx=[0@null])
OlapLimitRel(ctx=[0@null], fetch=[500])
OlapSortRel(sort0=[$0], dir0=[ASC-nulls-first], ctx=[0@null])
OlapAggregateRel(group-set=[[0, 1]], groups=[null], EXPR$5=[SUM($2)],
ctx=[0@null])
OlapProjectRel(PART_DT=[$1], CAL_DT=[$12], PRICE=[$6], ctx=[0@null])
OlapFilterRel(condition=[AND(>=(CAST($12):DATE,
CAST('2012-02-01'):DATE NOT NULL), <=(CAST($12):DATE, CAST('2012-02-29'):DATE
NOT NULL), >=(CAST($12):DATE, CAST('2012-01-01'):DATE NOT NULL),
<=(CAST($12):DATE, CAST('2012-12-31'):DATE NOT NULL))], ctx=[0@null])
OlapJoinRel(condition=[=($1, $12)], joinType=[inner],
ctx=[0@null])
OlapTableScan(table=[[SSB_STEP, KYLIN_SALES_SAMPLE]],
ctx=[0@null], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]])
OlapTableScan(table=[[SSB_STEP, KYLIN_CAL_DT_SAMPLE]],
ctx=[0@null], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99]])
{code}
It can be seen that the contents of OlapFilterRel in these two execution plans
are different, but completely equivalent. And VolcanoPlanner gives what it
thinks the best execution plan is actually different, depending on the
execution cost of the execution plan. The previous version got the first
execution plan, while the new version gets the second, and the cost estimate
depends on the matching of column types and constant types. It looks like the
new execution plan is less costly when compared to the same type. Currently
kylin does not have a particularly good way of evaluating the cost of execution
plan selection, so renaming an execution plan or adding a new execution plan
may cause the execution plan to become unstable. We don't need to be too
concerned about this - is it possible that a change in calcite or kylin's code,
and the evaluation of the cost, caused the first plan to be chosen again? Is it
possible, and how can we avoid this problem?
Looking back, what does this condition do? In kylin, it's used for segment
pruning or partition pruning, so to start with, it turns out that kylin has
constructed a class FilterConditionExpander to generate some simplified
expressions. For example, the above condition will eventually be simplified to
AND(>=($12, date '2012-02-01'), <=($12, date '2012-02-29'), >=($12, date
'2012-01-01'), <=($12, date '2012-12-31'). However, the class definition of
simpleCall is incomplete, and the following two are recognized
* cast(column as datatype) <op> literal
* column <op> literal
The following is not considered a simpleCall
* cast(column as datatype) <op> cast(literal as datatype)
* column <op> cast(literal as datatype)
Above, column means column, literal means constant, and <op> means comparison
operator.
h1. 1st
add support for two types of simpleCall that are not supported, i.e. extract
literal from cast(literal as datatype), and eventually all four cases will be
converted to column <op> literal for subsequent segment pruning.
h1. 2nd
ReduceExpressionRule optimization rules can only optimize for LogicalProject,
LogicalJoin, LogicalFilter. OlapProjectRel, OlapJoinRel, and OlapFilterRel do
not inherit from these three classes, so the optimization fails.
Define OlapReduceExpressionRule rule to override ReduceExpressionRule rule for
OlapRel. This is why some previous code had expressions that should have been
optimized in theory but were not.
h1. 3rd
There is a lot of useless transformation logic in the SegmentPruningRule, which
leads to slow execution on this side of the SegmentPruningRule, and can be
occasional in CI.
Fixes: 1) For date string to be converted to timestamp constant, we need to
judge whether it has hours, minutes and seconds, if not, we need to fill in the
blanks, otherwise it may lead to the RexSimplify#simplifyAnds call to take a
long time; 2) FilterConditionExpander class converts RexNode to CNF expression,
and then it converts each sub-CNF expression into a CNF expression, and then it
converts each sub-CNF expression into a CNF expression. expression and extract
OlapContext#expandedFilterConditions information for each sub-CNF expression,
there are a large number of repeated calls, poor performance, the use of a map
in a computation to avoid the operation of creating the same object over and
over again.
h1. 4th
For the following sql, from
NBitmapFunctionForCalciteExecTest#testIntersectCountForFalseFilter
{code:sql}
select
intersect_count_v2(TEST_COUNT_DISTINCT_BITMAP, LSTG_FORMAT_NAME,
array['FP-.*GTC', 'Others'], 'REGEXP') as b,
intersect_count_v2(TEST_COUNT_DISTINCT_BITMAP, LSTG_FORMAT_NAME,
array['FP-GTC|FP-non GTC', 'Others'], 'RAWSTRING') as c
from test_kylin_fact where 1=2
{code}
Since the second problem has been solved to simplify the filter condition
taking effect, the onMatch method of FilterReduceExpressionsRule is found to be
{code:java}
Values values = (Values) createEmptyRelOrEquivalent(call, filter);
call.transformTo(values.copy(call.rel(0).getTraitSet(), values.getInputs()));
{code}
to
{code:java}
call.transformTo(createEmptyRelOrEquivalent(call, filter));
{code}
There is no need to copy, copying leads to incorrect traitSet and UT failure.
was (Author: JIRAUSER294653):
The execution plan generated by VolcanoPlanner can be unstable. Given the
following sql
{code:sql}
select PART_DT, YEAR (CAL_DT) as y, YEAR (CAL_DT) * 100 + MONTH (CAL_DT) as ym,
YEAR (CAL_DT) * 10000 + MONTH (CAL_DT) * 100 + DAYOFMONTH(CAL_DT) as ymd,
CAL_DT, sum (price)
from SSB_STEP.KYLIN_SALES_SAMPLE a inner join SSB_STEP.KYLIN_CAL_DT_SAMPLE b
on a.PART_DT = b.CAL_DT
where (cast ("CAL_DT" as date) BETWEEN '2012-02-01'
and '2012-02-29'
and cast ("CAL_DT" as date) BETWEEN '2012-01-01'
and '2012-12-31' )
group by PART_DT, CAL_DT
order by PART_DT
LIMIT 500
{code}
The execution plan you get before kylin5.0-alpha might look like this
{code:java}
raceId: c114c6d5-071e-4cf2-c17b-34c9f22be1ae 2023-11-17T18:51:28,797 INFO
[project_38744] [Query 7eea9760-219b-dfb5-7433-8ecf4c95bf5e-332]
util.QueryContextCutter : select layout candidate for 1 olapContext cost 82 ms
traceId: c114c6d5-071e-4cf2-c17b-34c9f22be1ae 2023-11-17T18:51:28,797 INFO
[project_38744] [Query 7eea9760-219b-dfb5-7433-8ecf4c95bf5e-332]
exec.SparderPlanExec : OlapToEnumerableConverter
OlapProjectRel(PART_DT=[$0], Y=[EXTRACT(FLAG(YEAR), $1)],
YM=[+(*(EXTRACT(FLAG(YEAR), $1), 100), EXTRACT(FLAG(MONTH), $1))],
YMD=[+(+(*(EXTRACT(FLAG(YEAR), $1), 10000), *(EXTRACT(FLAG(MONTH), $1), 100)),
EXTRACT(FLAG(DAY), $1))], CAL_DT=[$1], EXPR$5=[$2], ctx=[0@NDataflow
[model_38744_01]])
OlapLimitRel(ctx=[0@NDataflow [model_38744_01]], fetch=[500])
OlapSortRel(sort0=[$0], dir0=[ASC-nulls-first], ctx=[0@NDataflow
[model_38744_01]])
OlapAggregateRel(group-set=[[0, 1]],
groups=[[SSB_STEP.KYLIN_SALES_SAMPLE.PART_DT,
SSB_STEP.KYLIN_CAL_DT_SAMPLE.CAL_DT]], EXPR$5=[SUM($2)], ctx=[0@NDataflow
[model_38744_01]])
OlapProjectRel(PART_DT=[$1], CAL_DT=[$22], PRICE=[$6],
ctx=[0@NDataflow [model_38744_01]])
OlapFilterRel(condition=[AND(>=($22, 2012-02-01), <=($22,
2012-02-29))], ctx=[0@NDataflow [model_38744_01]])
OlapJoinRel(condition=[=($1, $22)], joinType=[inner],
ctx=[0@NDataflow [model_38744_01]])
OlapTableScan(table=[[SSB_STEP, KYLIN_SALES_SAMPLE]],
ctx=[0@NDataflow [model_38744_01]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]])
OlapTableScan(table=[[SSB_STEP, KYLIN_CAL_DT_SAMPLE]],
ctx=[0@NDataflow [model_38744_01]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, 92, 93, 94, 95, 96, 97, 98, 99]])
{code}
The execution plan you get at kylin-5.0.0 might look like this
{code:java}
[49047] 2023-11-17T19:11:50,375 INFO [SuggestRunner-p-5-t-1]
util.QueryContextCutter : Cut context OlapToEnumerableConverter
OlapProjectRel(PART_DT=[$0], Y=[EXTRACT(FLAG(YEAR), $1)],
YM=[+(*(EXTRACT(FLAG(YEAR), $1), 100), EXTRACT(FLAG(MONTH), $1))],
YMD=[+(+(*(EXTRACT(FLAG(YEAR), $1), 10000), *(EXTRACT(FLAG(MONTH), $1), 100)),
EXTRACT(FLAG(DAY), $1))], CAL_DT=[$1], EXPR$5=[$2], ctx=[0@null])
OlapLimitRel(ctx=[0@null], fetch=[500])
OlapSortRel(sort0=[$0], dir0=[ASC-nulls-first], ctx=[0@null])
OlapAggregateRel(group-set=[[0, 1]], groups=[null], EXPR$5=[SUM($2)],
ctx=[0@null])
OlapProjectRel(PART_DT=[$1], CAL_DT=[$12], PRICE=[$6], ctx=[0@null])
OlapFilterRel(condition=[AND(>=(CAST($12):DATE,
CAST('2012-02-01'):DATE NOT NULL), <=(CAST($12):DATE, CAST('2012-02-29'):DATE
NOT NULL), >=(CAST($12):DATE, CAST('2012-01-01'):DATE NOT NULL),
<=(CAST($12):DATE, CAST('2012-12-31'):DATE NOT NULL))], ctx=[0@null])
OlapJoinRel(condition=[=($1, $12)], joinType=[inner],
ctx=[0@null])
OlapTableScan(table=[[SSB_STEP, KYLIN_SALES_SAMPLE]],
ctx=[0@null], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]])
OlapTableScan(table=[[SSB_STEP, KYLIN_CAL_DT_SAMPLE]],
ctx=[0@null], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99]])
{code}
It can be seen that the contents of OlapFilterRel in these two execution plans
are different, but completely equivalent. And VolcanoPlanner gives what it
thinks the best execution plan is actually different, depending on the
execution cost of the execution plan. The previous version got the first
execution plan, while the new version gets the second, and the cost estimate
depends on the matching of column types and constant types. It looks like the
new execution plan is less costly when compared to the same type. Currently
kylin does not have a particularly good way of evaluating the cost of execution
plan selection, so renaming an execution plan or adding a new execution plan
may cause the execution plan to become unstable. We don't need to be too
concerned about this - is it possible that a change in calcite or kylin's code,
and the evaluation of the cost, caused the first plan to be chosen again? Is it
possible, and how can we avoid this problem?
Looking back, what does this condition do? In kylin, it's used for segment
pruning or partition pruning, so to start with, it turns out that kylin has
constructed a class FilterConditionExpander to generate some simplified
expressions. For example, the above condition will eventually be simplified to
AND(>=($12, date '2012-02-01'), <=($12, date '2012-02-29'), >=($12, date
'2012-01-01'), <=($12, date '2012-12-31'). However, the class definition of
simpleCall is incomplete, and the following two are recognized
* cast(column as datatype) <op> literal
* column <op> literal
The following is not considered a simpleCall
* cast(column as datatype) <op> cast(literal as datatype)
* column <op> cast(literal as datatype)
Above, column means column, literal means constant, and <op> means comparison
operator.
Fix: add support for two types of simpleCall that are not supported, i.e.
extract literal from cast(literal as datatype), and eventually all four cases
will be converted to column <op> literal for subsequent segment pruning.
> Enhance the simplification method of filter condition for segment pruning
> -------------------------------------------------------------------------
>
> Key: KYLIN-5875
> URL: https://issues.apache.org/jira/browse/KYLIN-5875
> Project: Kylin
> Issue Type: Improvement
> Components: Query Engine
> Affects Versions: 5.0-alpha
> Reporter: pengfei.zhan
> Assignee: pengfei.zhan
> Priority: Major
> Fix For: 5.0.0
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)