[ 
https://issues.apache.org/jira/browse/PIG-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying He updated PIG-955:
------------------------

    Description: SkewedPartitioner doesn't handle the skewed keys in the partition table 
correctly. This can cause data loss.  (was: Fragmented replicated join has a 
few limitations:
 - One of the tables needs to be loaded into memory
 - Join is limited to two tables

Skewed join partitions the table and joins the records in the reduce phase. It 
computes a histogram of the key space to account for skewing in the input 
records. Further, it adjusts the number of reducers depending on the key 
distribution.

We need to implement the skewed join in Pig.)
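
For illustration only, the partitioning idea described above (build a histogram of the keys, then give skewed keys more than one reducer) can be sketched as follows. This is not Pig's SkewedPartitioner. The names build_partition_table, SkewAwarePartitioner and skew_threshold are hypothetical, and the round-robin spreading and CRC32 fallback are assumptions made for the sketch.

```python
from collections import Counter
from itertools import count
from zlib import crc32


def build_partition_table(sampled_keys, num_reducers, skew_threshold):
    """Map each skewed key to a (start, width) range of reducers.

    A key counts as skewed when its share of the sample is at least
    skew_threshold (a hypothetical tuning knob, not a Pig setting).
    """
    histogram = Counter(sampled_keys)
    total = len(sampled_keys)
    table = {}
    next_start = 0
    for key, freq in histogram.most_common():
        share = freq / total
        if share < skew_threshold:
            break  # most_common() is sorted, so the remaining keys are not skewed
        # Give the key a number of reducers roughly proportional to its share.
        width = max(1, min(num_reducers, round(share * num_reducers)))
        table[key] = (next_start % num_reducers, width)
        next_start += width
    return table


class SkewAwarePartitioner:
    """Spread skewed keys over their reducer range; hash all other keys."""

    def __init__(self, table, num_reducers):
        self.table = table
        self.num_reducers = num_reducers
        # One round-robin counter per skewed key.
        self.counters = {key: count() for key in table}

    def partition(self, key):
        if key in self.table:
            start, width = self.table[key]
            offset = next(self.counters[key]) % width
            return (start + offset) % self.num_reducers
        # Deterministic fallback for keys that are not in the table.
        return crc32(repr(key).encode("utf-8")) % self.num_reducers


sample = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
table = build_partition_table(sample, num_reducers=4, skew_threshold=0.5)
partitioner = SkewAwarePartitioner(table, num_reducers=4)
print(table)
print([partitioner.partition("a") for _ in range(4)])
```

In this sketch every record must land on some reducer in range, and a skewed key must always be looked up in the table. A join built on it would also need to send the matching records of the other table to every reducer in the skewed key's range, otherwise rows are silently dropped.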

> Skewed join generates incorrect results
> ---------------------------------------
>
>                 Key: PIG-955
>                 URL: https://issues.apache.org/jira/browse/PIG-955
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Ying He
>         Attachments: PIG-955.patch
>
>
> SkewedPartitioner doesn't handle the skewed keys in the partition table correctly. This 
> can cause data loss.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
