[ https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=477932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477932 ]
ASF GitHub Bot logged work on HIVE-24031:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Sep/20 15:02
            Start Date: 02/Sep/20 15:02
    Worklog Time Spent: 10m
      Work Description: zabetak opened a new pull request #1424:
URL: https://github.com/apache/hive/pull/1424

### What changes were proposed in this pull request?
1. Drop the defensive copy of children inside ASTNode#getChildren.
2. Protect clients from accidentally modifying the list via an unmodifiable collection.

### Why are the changes needed?
Profiling shows that the vast majority of time is spent creating defensive copies of the node expression list inside ASTNode#getChildren.

The method is called extensively from various places in the code, especially those walking over the expression tree, so it needs to be efficient.

Most of the time, creating defensive copies is not necessary. In those cases (if any) where the list needs to be modified, clients should make a copy themselves.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
The test was added in a separate branch since it is not meant to be committed upstream, for the following reasons:
- the query for reproducing the problem takes up a few MBs
- it requires some changes in the default configurations.

To run the test, use the following commands:
```
git checkout -b HIVE-24031-TEST master
git pull g...@github.com:zabetak/hive.git HIVE-24031-PLUS-TEST
mvn clean install -DskipTests
cd itests
mvn clean install -DskipTests
cd qtest
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=big_query_with_array_constructor.q -Dtest.output.overwrite
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 477932)
    Time Spent: 40m  (was: 0.5h)

> Infinite planning time on syntactically big queries
> ---------------------------------------------------
>
>                 Key: HIVE-24031
>                 URL: https://issues.apache.org/jira/browse/HIVE-24031
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: ASTNode_getChildren_cost.png, query_big_array_constructor.nps
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Syntactically big queries (~1 million tokens), such as the query shown below, lead to very big (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
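
As an illustration of the change described in the pull request, the following Java sketch contrasts a getter that copies its child list on every call with one that returns a read-only view. The `Node` class and both getter names are hypothetical stand-ins chosen for this example and are not Hive's actual `ASTNode` code; `Collections.unmodifiableList` is the only real JDK call it relies on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ChildrenViewSketch {

    /** Hypothetical tree node used only for illustration; not Hive's ASTNode. */
    static final class Node {
        final String label;
        private final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        void addChild(Node child) {
            children.add(child);
        }

        /** Defensive copy: allocates and fills a new list on every call, O(n). */
        List<Node> getChildrenCopy() {
            return new ArrayList<>(children);
        }

        /** Read-only view over the same backing list: no copying, O(1) per call. */
        List<Node> getChildrenView() {
            return Collections.unmodifiableList(children);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("array");
        for (int i = 1; i <= 3; i++) {
            root.addChild(new Node("item" + i));
        }

        // Mutating the copy leaves the node's own children untouched.
        List<Node> copy = root.getChildrenCopy();
        copy.add(new Node("extra"));

        // Mutating the view is rejected instead of silently changing the tree.
        List<Node> view = root.getChildrenView();
        try {
            view.add(new Node("extra"));
        } catch (UnsupportedOperationException e) {
            System.out.println("view rejected the modification");
        }
        System.out.println("children visible through the view: " + view.size());
    }
}
```

With a getter of this shape, a caller that really needs a mutable list would make the copy itself, for example `new ArrayList<>(node.getChildrenView())`, so the copying cost is paid only where it is needed rather than on every tree walk.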