[
https://issues.apache.org/jira/browse/KYLIN-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832355#comment-17832355
]
pengfei.zhan edited comment on KYLIN-5766 at 3/30/24 2:19 AM:
--------------------------------------------------------------
h1. Design
The SQL is parsed with JavaCC to produce a "normalized" SQL string, and that
string is used as the cache key. Normalization applies the following rules:
* Remove general comments (already implemented in the previous SQL parsing
step).
* Replace any run of spaces, line feeds, tabs, carriage returns, and form feeds
with a single space.
* Surround each of the operators "+", "-", "*", "/", "%", "=", ">=", "<=", "!=",
"<>", "||" with exactly one space on the left and one space on the right.
* Surround each parenthesis, ( and ), with exactly one space on the left and
one space on the right.
* Normalize each comma so that it has no space on the left and exactly one
space on the right; for example, test ,test1 becomes test, test1.
* Leave escaped (backtick-quoted) identifiers such as `2 + 3 ` unchanged.
Consequently, `2 + 3 ` and `2 + 3 ` with different internal spacing are
different SQL and cannot hit each other's cache entries.
For example, the following two queries are identical after normalization:
{code:sql}
-- sql1
select user ,
count(*) from /*comments
comments
*/ demo group by user
-- sql2
select user, count(*) -- comments
from demo group by user
{code}
The normalized cache key is:
{code:sql}
select user, count ( * ) from demo group by user
{code}
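The rules above can be sketched roughly as follows. This is a minimal illustration only: it uses a regular-expression tokenizer instead of the JavaCC grammar described in the design, the class name SqlCacheKeyNormalizer and its normalize method are hypothetical, and comment removal is included here only as a stand-in for the earlier parsing step. Single-quoted string literals are not protected in this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the actual Kylin implementation.
final class SqlCacheKeyNormalizer {

    // Alternation order matters: quoted identifiers and comments are tried
    // first, and two-character operators before one-character ones.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<quoted>`[^`]*`)"
            + "|(?<comment>/\\*.*?\\*/|--[^\\n]*)"
            + "|(?<op>>=|<=|!=|<>|\\|\\||[-+*/%=()])"
            + "|(?<comma>,)"
            + "|(?<ws>\\s+)"
            + "|(?<other>.)",
            Pattern.DOTALL);

    static String normalize(String sql) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(sql);
        while (m.find()) {
            if (m.group("quoted") != null) {
                out.append(m.group());        // escaped identifier: keep verbatim
            } else if (m.group("comment") != null || m.group("ws") != null) {
                appendSpace(out);             // comments and whitespace collapse to one space
            } else if (m.group("op") != null) {
                appendSpace(out);             // one space on each side of operators and parentheses
                out.append(m.group());
                appendSpace(out);
            } else if (m.group("comma") != null) {
                trimTrailingSpace(out);       // no space before a comma, one after
                out.append(',');
                appendSpace(out);
            } else {
                out.append(m.group());        // any other character is copied as-is
            }
        }
        trimTrailingSpace(out);
        return out.toString();
    }

    // Appends a space unless the output is empty or already ends with one.
    private static void appendSpace(StringBuilder out) {
        if (out.length() > 0 && out.charAt(out.length() - 1) != ' ') {
            out.append(' ');
        }
    }

    // Removes a single trailing space, if present.
    private static void trimTrailingSpace(StringBuilder out) {
        if (out.length() > 0 && out.charAt(out.length() - 1) == ' ') {
            out.setLength(out.length() - 1);
        }
    }

    public static void main(String[] args) {
        String sql1 = "-- sql1\nselect user ,\ncount(*) from /*comments\ncomments\n*/ demo group by user";
        String sql2 = "-- sql2\nselect user, count(*) -- comments\nfrom demo group by user";
        System.out.println(normalize(sql1));
        System.out.println(normalize(sql1).equals(normalize(sql2)));
    }
}
```

Because the sketch never emits two consecutive spaces outside quoted identifiers, no final whitespace-collapsing pass is needed, and the content between backticks is never touched.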
> Normalize query cache key
> -------------------------
>
> Key: KYLIN-5766
> URL: https://issues.apache.org/jira/browse/KYLIN-5766
> Project: Kylin
> Issue Type: Improvement
> Components: Query Engine
> Affects Versions: 5.0-beta
> Reporter: pengfei.zhan
> Assignee: pengfei.zhan
> Priority: Major
> Fix For: 5.0.0
>
>