[
https://issues.apache.org/jira/browse/HIVE-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401262#comment-13401262
]
alex gemini commented on HIVE-3086:
-----------------------------------
The design is very complicated, IMO. What if we have a big table, logs, and a
small table, users, where users has a column 'age', and we issue a query that
is skewed by age? We can't pre-partition the big table for that. This design
doesn't handle that case, right? I guess what we want is custom partitioning at
runtime. For the above example, we need a custom partitioner (or some hint) to
tell the query planner that we want to partition the users table on the
'userid,age' columns and the logs table on the 'userid' column. The partition
number for the same userid must be the same in both tables so that they can be
joined later.
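
To make the co-partitioning requirement concrete, here is a minimal sketch in plain Python, not Hive code. It assumes the partition number is derived from userid alone, which is one simple way to guarantee that the same userid gets the same partition number in both tables; how 'age' would enter the partitioning is not specified above, so it is left out. The names NUM_PARTITIONS, partition_for, partition_rows and co_partitioned_join are hypothetical, as are the row layouts.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, shared by both tables


def partition_for(userid, num_partitions=NUM_PARTITIONS):
    """Return a deterministic partition number that depends only on userid."""
    return zlib.crc32(userid.encode("utf-8")) % num_partitions


def partition_rows(rows, num_partitions=NUM_PARTITIONS):
    """Group rows (dicts with a 'userid' key) by their partition number."""
    parts = defaultdict(list)
    for row in rows:
        parts[partition_for(row["userid"], num_partitions)].append(row)
    return parts


def co_partitioned_join(users, logs, num_partitions=NUM_PARTITIONS):
    """Join logs to users one partition at a time.

    Both tables use the same partition function, so all rows for a given
    userid land in the same partition number and each partition can be
    joined without looking at any other partition.
    """
    user_parts = partition_rows(users, num_partitions)
    log_parts = partition_rows(logs, num_partitions)
    joined = []
    for p in range(num_partitions):
        # Build a lookup for the (small) users side of this partition.
        users_by_id = {u["userid"]: u for u in user_parts.get(p, [])}
        for log in log_parts.get(p, []):
            user = users_by_id.get(log["userid"])
            if user is not None:
                joined.append({**log, "age": user["age"]})
    return joined


if __name__ == "__main__":
    users = [{"userid": "u1", "age": 30}, {"userid": "u2", "age": 41}]
    logs = [
        {"userid": "u1", "event": "a"},
        {"userid": "u2", "event": "b"},
        {"userid": "u1", "event": "c"},
    ]
    for row in co_partitioned_join(users, logs):
        print(row)
```

If the two tables used different partition functions or different partition counts, rows for the same userid could end up in different partition numbers and the per-partition join would silently miss matches.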
> Skewed Join Optimization
> ------------------------
>
> Key: HIVE-3086
> URL: https://issues.apache.org/jira/browse/HIVE-3086
> Project: Hive
> Issue Type: New Feature
> Reporter: Nadeem Moidu
> Assignee: Nadeem Moidu
>
> During a join operation, if one of the columns has a skewed key, it can cause
> that particular reducer to become the bottleneck. The following feature will
> address it:
> https://cwiki.apache.org/confluence/display/Hive/Skewed+Join+Optimization