[jira] Commented: (HIVE-549) Parallel Execution Mechanism

Zheng Shao (JIRA) Tue, 01 Dec 2009 12:27:45 -0800

    [ https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784400#action_12784400 ]


Zheng Shao commented on HIVE-549:
---------------------------------

We cannot use JobControl because not all Hive tasks are map-reduce jobs.

In some sense, Driver is reimplementing JobControl to accommodate all the 
different types of Hive tasks. The alternative is to extend JobControl.
I think implementing Driver from scratch (as we already did) is more flexible 
at this point.

This is good information though. At some point we should revisit Driver and 
JobControl to see how we can refactor the code better.
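
To make the point above concrete, here is a rough sketch of what a JobControl-like scheduler over arbitrary tasks could look like. This is not the actual Driver code and not the HIVE549 patch: the class TaskGraphSketch, the Task type, runAll, and demoTasks are all hypothetical names made up for illustration. It only assumes that each task knows its parent tasks and that a task body can be anything runnable, map-reduce or not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; none of these names come from Hive or Hadoop.
class TaskGraphSketch {

    // Stand-in for a Hive task. The body may be a map-reduce job, a file
    // move, a DDL operation, or anything else that can be run.
    static final class Task {
        final String name;
        final Runnable body;
        final List<Task> parents;

        Task(String name, Runnable body, Task... parents) {
            this.name = name;
            this.body = body;
            this.parents = Arrays.asList(parents);
        }
    }

    // Launches each task once all of its parents have finished, so tasks
    // with no dependency on each other run concurrently.
    // Returns the task names in completion order.
    static List<String> runAll(List<Task> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CompletionService<Task> done = new ExecutorCompletionService<>(pool);
        Set<Task> pending = new LinkedHashSet<>(tasks);
        Set<Task> finished = new HashSet<>();
        List<String> order = new ArrayList<>();
        int running = 0;
        try {
            while (!pending.isEmpty() || running > 0) {
                // Submit every pending task whose parents are all finished.
                for (Iterator<Task> it = pending.iterator(); it.hasNext();) {
                    Task t = it.next();
                    if (finished.containsAll(t.parents)) {
                        it.remove();
                        done.submit(t.body, t);
                        running++;
                    }
                }
                if (running == 0) {
                    throw new IllegalStateException("cycle or unknown parent");
                }
                // Block until any one running task completes.
                Task t = done.take().get();
                running--;
                finished.add(t);
                order.add(t.name);
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return order;
    }

    // The UNION ALL case from the issue: two independent selects, then the
    // union, then a non-map-reduce step that moves the result.
    static List<Task> demoTasks() {
        Runnable noop = () -> { };
        Task left = new Task("select-left", noop);
        Task right = new Task("select-right", noop);
        Task union = new Task("union", noop, left, right);
        Task move = new Task("move", noop, union);
        return Arrays.asList(left, right, union, move);
    }

    public static void main(String[] args) {
        System.out.println(runAll(demoTasks(), 2));
    }
}
```

In this sketch the two selects may finish in either order, but "union" always runs after both of them and "move" always runs last. Extending JobControl instead would mean forcing the non-map-reduce steps into its job abstraction, which is the flexibility trade-off mentioned above.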


> Parallel Execution Mechanism
> ----------------------------
>
>                 Key: HIVE-549
>                 URL: https://issues.apache.org/jira/browse/HIVE-549
>             Project: Hadoop Hive
>          Issue Type: Wish
>          Components: Query Processor
>    Affects Versions: 0.3.0
>            Reporter: Adam Kramer
>            Assignee: Chaitanya Mishra
>         Attachments: HIVE549-v6.patch
>
>
> In a massively parallel database system, it would be awesome to also 
> parallelize some of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT 
> statements, effectively you could run those statements in parallel. There's 
> no situation (that I can think of, but I don't have a formal proof) in which 
> the left statement would rely on the right statement, or vice versa. So, they 
> could be run at the same time...and perhaps they should be. Or, perhaps there 
> should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
