[jira] Commented: (HIVE-964) handle skewed keys for a join in a separate job

Namit Jain (JIRA) Sun, 10 Jan 2010 21:56:18 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798578#action_12798578
 ]


Namit Jain commented on HIVE-964:
---------------------------------

I did not look at this in great detail, but here are some high-level comments:

1. Changes in ExecDriver are not needed
2. Skew Join should be an optimization step. I remember that we initially
   thought about this and said it might be easy to do it at the end, but it
   makes more sense to plug it into the optimization phase. It can be the
   last optimization step, so we can assume that map join conversions etc.
   have already been done.
3. Conditional Task needs some rework. Since execute is not getting called
   recursively, the same thing should happen for initialize. It would be
   great if we could fold it into execute, though I am not sure how.
4. The number of jobs etc. should be correct: a conditional task is not a
   single job, but 'n' jobs.


> handle skewed keys for a join in a separate job
> -----------------------------------------------
>
>                 Key: HIVE-964
>                 URL: https://issues.apache.org/jira/browse/HIVE-964
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>         Attachments: hive-964-2009-12-17.txt, hive-964-2009-12-28-2.patch, 
> hive-964-2009-12-29-4.patch, hive-964-2010-01-08.patch
>
>
> The skewed keys can be written to a temporary table or file, and a follow-up 
> conditional task can be used to perform the join on those keys.
> As a first step, JDBM can be used for those keys.
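
The idea in the issue description can be sketched roughly as follows. This is
only an illustration in plain Python, not Hive's implementation: the function
name skew_aware_join, the skew_threshold parameter, and the in-memory
"spilled" list (standing in for the temporary table or file) are all
assumptions made for the example.

```python
from collections import Counter, defaultdict


def skew_aware_join(left, right, skew_threshold):
    """Inner-join two lists of (key, value) pairs.

    Rows whose key occurs more than `skew_threshold` times on the left
    side are set aside and joined in a separate follow-up step.
    Returns (joined_rows, followup_ran).
    """
    # Decide which keys count as skewed (hypothetical rule: row count on the left).
    left_counts = Counter(key for key, _ in left)
    skewed_keys = {k for k, n in left_counts.items() if n > skew_threshold}

    # Index the right side by key so both passes can look rows up.
    right_index = defaultdict(list)
    for key, value in right:
        right_index[key].append(value)

    joined = []
    spilled = []  # stands in for the temporary table or file

    # Main pass: join ordinary keys, set skewed rows aside.
    for key, value in left:
        if key in skewed_keys:
            spilled.append((key, value))
        else:
            for right_value in right_index.get(key, ()):
                joined.append((key, value, right_value))

    # Follow-up pass: runs only when something was set aside,
    # mirroring the "conditional task" in the description.
    followup_ran = bool(spilled)
    for key, value in spilled:
        for right_value in right_index.get(key, ()):
            joined.append((key, value, right_value))

    return joined, followup_ran
```

With a low threshold the follow-up pass handles the heavy keys; with a high
threshold nothing is set aside and the follow-up pass is skipped. The joined
rows are the same either way, only the order may differ.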

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
