[ https://issues.apache.org/jira/browse/KYLIN-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690706#comment-17690706 ]
ASF GitHub Bot commented on KYLIN-5387:
---------------------------------------

pfzhan commented on code in PR #2089:
URL: https://github.com/apache/kylin/pull/2089#discussion_r1110716640


##########
src/spark-project/engine-spark/src/main/java/org/apache/kylin/engine/spark/job/NSparkCubingJob.java:
##########
@@ -422,4 +431,53 @@ static class NSparkCubingJobStep {
         private final AbstractExecutable secondStorage;
         private final AbstractExecutable cleanUpTransactionalTable;
     }
+
+    private static void enableCostBasedPlannerIfNeed(NDataflow df, Set<NDataSegment> segments, NSparkCubingJob job) {
+        // need run the cost based planner:
+        // 1. config enable the cube planner
+        // 2. the model dose not have the `layout_cost_based_pruned_list`
+        // 3. rule index has agg group
+        // 4. just only one segment to be built/refresh(other case will throw exception)
+        IndexPlan indexPlan = df.getIndexPlan();
+        KylinConfig kylinConfig = indexPlan.getConfig();
+        boolean needCostRecommendIndex = indexPlan.getRuleBasedIndex() != null
+                && indexPlan.getRuleBasedIndex().getLayoutsOfCostBasedList() == null
+                && !indexPlan.getRuleBasedIndex().getAggregationGroups().isEmpty();
+        if (kylinConfig.enableCostBasedIndexPlanner() && needCostRecommendIndex
+                && canEnablePlannerJob(job.getJobType())) {
+            // must run the cost based planner
+            if (segments.size() == 1) {
+                if (noBuildingSegmentExist(df.getProject(), job.getTargetSubject(), kylinConfig)) {
+                    // check the count of rowkey:
+                    // if the count of row key exceed the 63, throw exception
+                    if (indexPlan.getRuleBasedIndex().countOfIncludeDimension() > (Long.SIZE - 1)) {
+                        throw new RuntimeException(String.format(
+                                "The count of row key %d can't be larger than 63, when use the cube planner",
+                                indexPlan.getRuleBasedIndex().countOfIncludeDimension()));
+                    }
+                    // Add the parameter `P_JOB_ENABLE_PLANNER` which is used to decide whether to use the cube planner
+                    job.setParam(NBatchConstants.P_JOB_ENABLE_PLANNER, Boolean.TRUE.toString());
+                } else {
+                    throw new RuntimeException(
+                            "There are running job for this model when submit the build job with cost based planner, "
+                                    + "please wait for other jobs to finish or cancel them");
+                }
+            } else {
+                throw new RuntimeException("The number of segments to be built or refreshed must be 1, "

Review Comment:
   Same as above.


> Migrate cube planner phase 1 to kylin5
> --------------------------------------
>
>                 Key: KYLIN-5387
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5387
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>    Affects Versions: 5.0-alpha
>            Reporter: Kun Liu
>            Assignee: Kun Liu
>            Priority: Major
>
> Kylin 3.1 supports the cube planner, which recommends cuboids when building
> the first segment.
>
> We need to migrate the cost-based algorithm to Kylin 5 and apply the cube
> planner algorithm to the indexes in Kylin 5.
>
> Design doc:
> https://docs.google.com/document/d/1wUNd8U1u-w8T-qQUReplPhDFkETj3AtFr_Shex501ls/edit?usp=sharing


--
This message was sent by Atlassian Jira
(v8.20.10#820010)