[ 
https://issues.apache.org/jira/browse/HIVE-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806047#action_12806047
 ] 

He Yongqiang commented on HIVE-1110:
------------------------------------

By introducing a boolean vector that records which tables have already produced 
a skew key, each reducer can tell how many tables have skew keys. That count, 
reported through the counter in that reducer, is the minimum number of skew 
jobs that will be started. So if we take the largest counter across all 
reducers, it gives the number of final jobs needed.
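
A minimal sketch of this idea, written in Python for brevity rather than taken 
from the attached patch. The function names, and the assumption that each skew 
event arrives as a table index, are hypothetical and only illustrate the 
bookkeeping described above.

```python
from typing import Iterable, List


def reducer_skew_table_count(num_tables: int, skew_events: Iterable[int]) -> int:
    """Return how many distinct tables produced a skew key in one reducer.

    skew_events is assumed to hold the index of the table each skew key
    came from; the same table may appear many times.
    """
    seen: List[bool] = [False] * num_tables  # one flag per join input table
    for table_index in skew_events:
        seen[table_index] = True  # repeated skew keys do not change the flag
    return sum(seen)  # number of flags that are set


def final_skew_jobs(per_reducer_counts: Iterable[int]) -> int:
    """Take the largest per-reducer count, as suggested in the comment."""
    return max(per_reducer_counts, default=0)


# Three reducers joining three tables (made-up events, not real job output).
counts = [
    reducer_skew_table_count(3, [0, 0, 2]),  # tables 0 and 2 had skew keys
    reducer_skew_table_count(3, []),         # no skew key seen
    reducer_skew_table_count(3, [1]),        # only table 1 had a skew key
]
jobs_needed = final_skew_jobs(counts)
```

With these made-up events the per-reducer counts are 2, 0 and 1, so the 
largest counter is 2.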

>>just increment the counter every time you see a new key.
This may be better, because I have sometimes seen the counter be inaccurate: 
even though there was a skew key and the counter got updated, it still reported 
zero. If we increment the counter multiple times, the reducer is hopefully more 
likely to report a non-zero counter.
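
For comparison, a minimal Python sketch of the quoted alternative, incrementing 
once per new skew key. The threshold-based skew test and the names below are 
assumptions for illustration, not code from the patch.

```python
from collections import Counter
from typing import Hashable, Iterable


def count_new_skew_keys(keys: Iterable[Hashable], threshold: int) -> int:
    """Increment a counter once for every key that becomes a skew key.

    A key is assumed to become a skew key when its row count reaches
    the (hypothetical) threshold.
    """
    rows_per_key: Counter = Counter()
    skew_counter = 0
    for key in keys:
        rows_per_key[key] += 1
        # Equality fires exactly once per key, at the moment it crosses over.
        if rows_per_key[key] == threshold:
            skew_counter += 1
    return skew_counter


# Keys "a" and "c" reach the threshold of 3 rows; "b" does not.
sample_keys = ["a", "a", "a", "b", "c", "c", "c", "c"]
skew_key_count = count_new_skew_keys(sample_keys, threshold=3)
```

Here the counter is incremented twice, once for "a" and once for "c", so a 
reducer that saw several skew keys updates the counter several times.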

> add counters to show that skew join triggered
> ---------------------------------------------
>
>                 Key: HIVE-1110
>                 URL: https://issues.apache.org/jira/browse/HIVE-1110
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>
>         Attachments: hive-1110.patch
>
>
> It would be very useful to debug, and quickly find out if the skew join was 
> triggered.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
