[ 
https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10190:
-----------------------------------
    Attachment: HIVE-10190.01.patch

Attached a patch that walks the AST using BFS. However, I doubt it will solve 
the problem. I had assumed that dividing the whole string into small per-node 
tokens would save some of the shift actions between adjacent AST nodes. 
However, it seems that Java's "contains" is based on "indexOf", whose 
worst-case complexity is O(mn): 
http://stackoverflow.com/questions/12752274/java-indexofstring-str-method-complexity
Thus, even after dividing, we cannot save much. [~gopalv], could you please 
try the patch and see if it improves things? If not, I will try a parser-based 
solution. Thanks.
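
For illustration only, here is a minimal sketch of a BFS check that avoids 
substring search altogether by comparing each node's token name exactly 
against a set. This is not the attached patch. The Node class is a 
hypothetical stand-in for Hive's ASTNode, and the names AstTokenCheck, 
UNSUPPORTED and isSupported are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AstTokenCheck {
    /** Hypothetical stand-in for an AST node; NOT Hive's ASTNode. */
    static final class Node {
        final String token;
        final List<Node> children = new ArrayList<>();

        Node(String token, Node... kids) {
            this.token = token;
            this.children.addAll(Arrays.asList(kids));
        }
    }

    // Token names taken from the issue description below.
    static final Set<String> UNSUPPORTED = new HashSet<>(
            Arrays.asList("TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE"));

    /** Returns false as soon as BFS reaches a node with an unsupported token. */
    static boolean isSupported(Node root) {
        if (root == null) {
            return true; // nothing to reject in an empty tree
        }
        Deque<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node node = queue.poll();
            // Exact set lookup per node instead of String.contains on the whole tree.
            if (UNSUPPORTED.contains(node.token)) {
                return false;
            }
            queue.addAll(node.children);
        }
        return true;
    }

    public static void main(String[] args) {
        Node plain = new Node("TOK_QUERY", new Node("TOK_FROM"));
        Node sampled = new Node("TOK_QUERY",
                new Node("TOK_FROM", new Node("TOK_TABLESPLITSAMPLE")));
        System.out.println(isSupported(plain));
        System.out.println(isSupported(sampled));
    }
}
```

Each node is visited at most once and no tree-sized string is built, so the 
cost grows with the number of nodes rather than with the length of the 
stringified AST. Whether this matches how token names or types are exposed 
on the real AST is an assumption that would need checking against the code.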

> CBO: AST mode checks for TABLESAMPLE with 
> AST.toString().contains("TOK_TABLESPLITSAMPLE")
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-10190
>                 URL: https://issues.apache.org/jira/browse/HIVE-10190
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 1.2.0
>            Reporter: Gopal V
>            Assignee: Pengcheng Xiong
>            Priority: Trivial
>              Labels: perfomance
>         Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch
>
>
> {code}
>   public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
>     String astTree = ast.toStringTree();
>     // if any of following tokens are present in AST, bail out
>     String[] tokens = { "TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE" };
>     for (String token : tokens) {
>       if (astTree.contains(token)) {
>         return false;
>       }
>     }
>     return true;
>   }
> {code}
> This is a problem for a SQL query whose AST form is larger than its text 
> (~700kb).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
