[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173987#comment-13173987
 ] 

Gianmarco De Francisci Morales commented on PIG-2353:
-----------------------------------------------------

Actually I was thinking that RANK would only do the counting and appending.
This way you could get a sort + rank with
{code}
B = RANK ( ORDER A BY <column> ASC);
{code}

But you could also load your dataset from a file and rank it directly, without 
imposing any specific order first:
{code}
A = LOAD 'path/to/file';
B = RANK A;
C = ORDER B BY <column>;
{code}

This, for example, gives you the permutation that was used to sort the dataset, 
which might be useful.
Also, RANK would let you create a data column that records the ordering your 
data already has.
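
To make the second case concrete, here is a rough sketch of the semantics I have 
in mind, written in plain Python rather than Pig. The rank() helper, the sample 
rows and the 1-based numbering are all assumptions made for illustration only; 
nothing here reflects what the attached patch actually does.

```python
# Plain-Python model of the proposed semantics; this is not Pig code.
def rank(rows):
    """Append a 1-based position to each tuple, keeping the input order."""
    return [tuple(row) + (i,) for i, row in enumerate(rows, start=1)]

# Hypothetical sample data: (name, score)
A = [("carol", 30), ("alice", 10), ("bob", 20)]

B = rank(A)                        # models: B = RANK A;
C = sorted(B, key=lambda t: t[1])  # models: C = ORDER B BY score;

# Reading the appended field of C from top to bottom gives the permutation
# that the sort applied to the original rows.
permutation = [t[-1] for t in C]
print(permutation)
```

Ranking after the sort instead, as in the first snippet, would simply number the 
sorted rows consecutively.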
                
> RANK function like in SQL
> -------------------------
>
>                 Key: PIG-2353
>                 URL: https://issues.apache.org/jira/browse/PIG-2353
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique, 
> increasing identifier without gaps, like what RANK does for SQL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
