Hi,
  You can try a FULL OUTER JOIN on rank, provided the number of distinct ids 
is small.

For example,
All rows that have an id of 12324 go into one relation and all rows that have an 
id of 12325 go into another. A FULL OUTER JOIN of those two relations on rank 
would then give what you need, with a null wherever a rank is missing for one id.
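
To make the join concrete, here is a rough Python sketch (not Pig) of what a 
FULL OUTER JOIN on rank produces for the first two ids in your sample. The 
names full_outer_join_on_rank and cell are made up for illustration, and the 
sketch assumes each id has at most one row per rank.

```python
# Illustration only: mimics the result of a FULL OUTER JOIN on rank for two ids.
# This is plain Python, not Pig; the helper names are hypothetical.

# Sample rows (id, rank, value) copied from the question.
rows = [
    (12324, 1, 1582), (12324, 2, 1142), (12324, 4, 1292), (12324, 5, 1134),
    (12325, 1, 1582), (12325, 2, 1142), (12325, 3, 1292), (12325, 4, 1134),
    (12325, 5, 1183),
]


def full_outer_join_on_rank(rows, id1, id2):
    """Return (rank, value1, value2) tuples; None marks a missing rank."""
    # One "relation" per id, keyed by rank (assumes one row per id and rank).
    left = {rank: value for (i, rank, value) in rows if i == id1}
    right = {rank: value for (i, rank, value) in rows if i == id2}
    # Full outer join: keep every rank that appears on either side.
    ranks = sorted(set(left) | set(right))
    return [(r, left.get(r), right.get(r)) for r in ranks]


def cell(v):
    """Render a missing value as a blank."""
    return "" if v is None else str(v)


joined = full_outer_join_on_rank(rows, 12324, 12325)
for rank, v1, v2 in joined:
    print(f"{cell(v1):<10} {cell(v2)}")
```

Rank 3 exists only for 12325, so the left column comes out blank on that row. 
With more than two ids you would have to repeat the join once per extra id, 
which is why this only stays manageable for a small number of ids.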

Thanks,
Kannappan

On Nov 7, 2013, at 4:08 PM, Siddhi Borkar <[email protected]> 
wrote:

> 
> Hi,
> 
> Please ignore my previous mail.
> We have the following sample data, which has to be transformed into an output 
> format using a Pig script.
> 
> Id    rank    Value
> 12324 1       1582
> 12324 2       1142
> 12324 4       1292
> 12324 5       1134
> 12325 1       1582
> 12325 2       1142
> 12325 3       1292
> 12325 4       1134
> 12325 5       1183
> 12326 1       1582
> 12326 2       1142
> 12326 3       1292
> 12326 4       1134
> 12326 5       1183
> 
> We need to compare the values (of the value column) per rank for each id.
> The output needs to be generated in the following format
> 
> 
> Id1            Id2
> value_rank1    value_rank1
> value_rank2    value_rank2
> value_rank3    value_rank3
> ...            ...
> value_rankn    value_rankn
> 
> 
> 
> For example:
> 
> 12324      12325
> 1582       1582
> 1142       1142
>            1292
> 1292       1134
> 1134       1183
> 
> There has to be a blank value for any missing rank for a particular id.
> 
> Is there any way to achieve this?
> 
> Thanks,
> Siddhi
