[jira] [Commented] (HIVE-4002) Fetch task aggregation for simple group by query

Edward Capriolo (JIRA) Mon, 29 Jul 2013 19:48:20 -0700

    [ https://issues.apache.org/jira/browse/HIVE-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723315#comment-13723315 ]


Edward Capriolo commented on HIVE-4002:
---------------------------------------

[~navis] Sorry, I dropped the ball on this review. Can you rebase?
                
> Fetch task aggregation for simple group by query
> ------------------------------------------------
>
>                 Key: HIVE-4002
>                 URL: https://issues.apache.org/jira/browse/HIVE-4002
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-4002.D8739.1.patch, HIVE-4002.D8739.2.patch
>
>
> Aggregation queries with no group-by clause (for example, select count(*) 
> from src) execute the final aggregation in a single reduce task. But the 
> input is too small even for a single reducer, because most UDAFs generate 
> just a single row for map-side aggregation. If the final fetch task can 
> aggregate the outputs from the map tasks, the shuffle time can be removed.
> This optimization transforms an operator tree such as
> TS-FIL-SEL-GBY1-RS-GBY2-SEL-FS + FETCH-TASK
> into 
> TS-FIL-SEL-GBY1-FS + FETCH-TASK(GBY2-SEL-LS)
> With the patch, the time taken for the auto_join_filters.q test was reduced 
> to 6 min (from 10 min before).
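
The plan rewrite described above can be illustrated with a minimal Python sketch. This is not Hive code: the function names are hypothetical stand-ins for the GBY1 (map-side) and GBY2 (final) operators, and the lists stand in for input splits. It only shows why a count(*) with no group-by can be finished by merging one partial result per map task, with no shuffle in between.

```python
from typing import Iterable, List


def map_side_count(rows: Iterable[object]) -> int:
    """Hypothetical stand-in for GBY1: one partial count per map task."""
    return sum(1 for _ in rows)


def fetch_side_merge(partials: Iterable[int]) -> int:
    """Hypothetical stand-in for GBY2 running inside the fetch task.

    Merges the single-row partial results written by the map tasks.
    """
    return sum(partials)


def count_star(splits: List[List[object]]) -> int:
    """Simulate TS-...-GBY1-FS per split, then FETCH-TASK(GBY2-...)."""
    partials = [map_side_count(split) for split in splits]
    return fetch_side_merge(partials)


if __name__ == "__main__":
    # Three simulated input splits; the last one is empty.
    splits = [["a", "b", "c"], ["d"], []]
    print(count_star(splits))
```

Each simulated map task contributes exactly one value, so the final merge touches only as many rows as there are map tasks, which is the case the description argues is too small to justify a reducer.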

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
