[ https://issues.apache.org/jira/browse/HIVE-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278044#comment-14278044 ]
Wei Zheng commented on HIVE-9382:
---------------------------------
The problem occurs once the 10 rows (in this example) have already been fetched
and output: FetchTask still tries to retrieve more rows, because the minimum
number of rows for each task is greater than 0. That is wrong, because no more
rows are needed. FetchTask.fetch should simply return without doing anything.
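
The early return described above might look roughly like the following sketch.
It is not Hive's actual FetchTask code: the class LimitedFetcher, its fields,
and the fetch signature are hypothetical names chosen for illustration. It only
shows the idea that once the LIMIT has been satisfied, fetch returns
immediately instead of asking the source for more rows.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a fetch task; NOT Hive's real FetchTask.
class LimitedFetcher {
    private final Iterator<String> source; // rows produced by the job (assumed)
    private final int limit;               // value of the LIMIT clause
    private int totalRows = 0;             // rows already handed to the caller

    LimitedFetcher(Iterator<String> source, int limit) {
        this.source = source;
        this.limit = limit;
    }

    /**
     * Appends at most maxRows rows to out.
     * Returns false when no row was fetched by this call.
     */
    boolean fetch(List<String> out, int maxRows) {
        // Early return: the limit is already satisfied, so do no further work.
        if (totalRows >= limit) {
            return false;
        }
        int fetched = 0;
        while (fetched < maxRows && totalRows < limit && source.hasNext()) {
            out.add(source.next());
            fetched++;
            totalRows++;
        }
        return fetched > 0;
    }
}
```

With a limit of 3 and five available rows, two calls with maxRows = 2 return
three rows in total, and any later call returns false without touching the
source.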
> Query got rerun with Global Limit optimization on and Fetch optimization off
> ----------------------------------------------------------------------------
>
> Key: HIVE-9382
> URL: https://issues.apache.org/jira/browse/HIVE-9382
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 0.14.0
> Reporter: Wei Zheng
> Assignee: Wei Zheng
>
> When Global Limit optimization is enabled, and Fetch Optimization for Simple
> Queries is off or not applicable, some queries with a LIMIT clause run twice.
> set hive.limit.optimize.enable=true;
> set hive.fetch.task.conversion=none;
> For example,
> {code:sql}
> hive> select * from t1 limit 10;
> Query ID = wzheng_20150107185252_4a6d0e65-9e58-464b-9ed3-9177740c30a9
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id application_1420567249453_0039)
> --------------------------------------------------------------------------------
> VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
> --------------------------------------------------------------------------------
> Map 1 .......... SUCCEEDED 1 1 0 0 0 0
> --------------------------------------------------------------------------------
> VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 0.41 s
> --------------------------------------------------------------------------------
> OK
> 201208 99848 119820 32627 982976 509206 0.000100898
> 201208 99745 119820 32627 982976 509206 0.000100898
> 201208 99739 119820 32627 982976 509206 0.000100898
> 201208 99847 119820 32627 982976 509206 0.000100898
> 201208 613588 119820 32627 982976 509206 0.000100898
> 201208 99809 119820 32627 982976 509206 0.000100898
> 201208 99725 119820 32627 982976 509206 0.000100898
> 201208 99666 119820 32627 982976 509206 0.000100898
> 201208 99743 119820 32627 982976 509206 0.000100898
> 201208 99801 119820 32627 982976 509206 0.000100898
> Retry query with a different approach...
> Query ID = wzheng_20150107185252_8a77f793-cad7-4c6b-b64a-07d8310970b9
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id application_1420567249453_0039)
> --------------------------------------------------------------------------------
> VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
> --------------------------------------------------------------------------------
> Map 1 .......... SUCCEEDED 309 309 0 0 0 0
> --------------------------------------------------------------------------------
> VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 2.04 s
> --------------------------------------------------------------------------------
> OK
> 201208 99848 119820 32627 982976 509206 0.000100898
> 201208 99745 119820 32627 982976 509206 0.000100898
> 201208 99739 119820 32627 982976 509206 0.000100898
> 201208 99847 119820 32627 982976 509206 0.000100898
> 201208 613588 119820 32627 982976 509206 0.000100898
> 201208 99809 119820 32627 982976 509206 0.000100898
> 201208 99725 119820 32627 982976 509206 0.000100898
> 201208 99666 119820 32627 982976 509206 0.000100898
> 201208 99743 119820 32627 982976 509206 0.000100898
> 201208 99801 119820 32627 982976 509206 0.000100898
> Time taken: 2.748 seconds, Fetched: 10 row(s)
> {code}