[ 
https://issues.apache.org/jira/browse/KYLIN-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu resolved KYLIN-5536.
---------------------------------
    Resolution: Fixed

> Kylin query optimization, by limiting the data range of max query, improve 
> query efficiency
> -------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5536
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5536
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Query Engine
>    Affects Versions: 5.0-alpha
>            Reporter: Yaguang Jia
>            Assignee: Yaguang Jia
>            Priority: Major
>             Fix For: 5.0-beta
>
>
> h2. Dev design
> 1. Add the configuration kylin.query.max-measure-segment-pruner-before-days
> It limits the time range, in days, that the query scans. The default value is 
> -1, which turns the optimization off. When it is set to 0, no data is 
> scanned. An invalid value (e.g. 0.1) also leaves the optimization off. The 
> setting can be defined at three levels: model, project, and system, in 
> decreasing order of priority.
> 2. Where is the optimization applied?
> In the segment pruner, at 
> org.apache.kylin.query.routing.RealizationPruner#pruneSegments
> 3. Which queries are optimized?
> select <max(partDT)> from T [where xxx]
> The query must select max() of the time partition column; the where 
> condition is optional; there must be no group by column.
> 4. When the configuration is set, which segments are selected to answer the 
> query?
> Starting from the last (newest) segment, segments are selected according to 
> the configured time range.
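
The design above can be read as the following Java sketch. It is not Kylin's actual code in RealizationPruner#pruneSegments. The Segment class, the method names, and the rule that the window is counted back from the end of the newest segment are assumptions made for illustration. So is the choice that the first level that sets a value (model, then project, then system) wins even when that value is invalid.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MaxMeasurePrunerSketch {

    /** Sentinel meaning "optimization off" (the documented default, -1). */
    static final int DISABLED = -1;

    /** Hypothetical stand-in for a segment covering [startMs, endMs). */
    static final class Segment {
        final String name;
        final long startMs;
        final long endMs;

        Segment(String name, long startMs, long endMs) {
            this.name = name;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    /** Unset, non-integer (e.g. "0.1"), and negative values all disable the optimization. */
    static int parseDays(String raw) {
        if (raw == null) {
            return DISABLED;
        }
        try {
            int days = Integer.parseInt(raw.trim());
            return days < 0 ? DISABLED : days;
        } catch (NumberFormatException e) {
            return DISABLED;
        }
    }

    /** Model overrides project, project overrides system (assumed: first non-null value wins). */
    static int resolveDays(String modelValue, String projectValue, String systemValue) {
        String raw = modelValue != null ? modelValue
                : projectValue != null ? projectValue : systemValue;
        return parseDays(raw);
    }

    /** Keeps only segments that end within the last `days` days before the newest segment's end. */
    static List<Segment> prune(List<Segment> segments, int days) {
        if (days < 0 || segments.isEmpty()) {
            return new ArrayList<>(segments); // optimization off: keep everything
        }
        long latestEnd = Long.MIN_VALUE;
        for (Segment s : segments) {
            latestEnd = Math.max(latestEnd, s.endMs);
        }
        long cutoff = latestEnd - Duration.ofDays(days).toMillis();
        List<Segment> kept = new ArrayList<>();
        for (Segment s : segments) {
            if (s.endMs > cutoff) { // with days == 0 nothing passes, so no data is scanned
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long day = Duration.ofDays(1).toMillis();
        List<Segment> segments = Arrays.asList(
                new Segment("s1", 0, day),
                new Segment("s2", day, 2 * day),
                new Segment("s3", 2 * day, 3 * day));
        int days = resolveDays(null, "1", "30"); // project-level value is used
        for (Segment s : prune(segments, days)) {
            System.out.println(s.name);
        }
    }
}
```

With three one-day segments and a project-level value of 1, only the newest segment (s3) survives pruning, so max(partDT) would be answered from that segment alone.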



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
