[
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173987#comment-13173987
]
Gianmarco De Francisci Morales commented on PIG-2353:
-----------------------------------------------------
Actually I was thinking that RANK would only do the counting and appending.
This way you could get a sort + rank with
{code}
B = RANK ( ORDER A BY <column> ASC);
{code}
But you could also load your dataset from a file and rank it directly, without
imposing any specific order:
{code}
A = LOAD 'path/to/file';
B = RANK A;
C = ORDER B BY <column>;
{code}
This, for example, gives you the permutation that was used to sort the dataset,
which might be useful.
Also, RANK would let you create a data column that reflects the ordering
already present in your data.
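To make the two usages concrete, here is a minimal sketch of the proposed semantics, written in Python rather than Pig Latin so it can be run standalone. The `rank` helper, the sample tuples, and the choice of 1-based numbering are all assumptions for illustration, not part of any patch.

```python
from operator import itemgetter


def rank(rows):
    """Append a 1-based position to each tuple, keeping the input order.

    Hypothetical stand-in for the proposed RANK: it only counts and appends.
    """
    return [row + (i,) for i, row in enumerate(rows, start=1)]


# Hypothetical sample data: (name, score) tuples in file order.
a = [("carol", 30), ("alice", 10), ("bob", 20)]

# Sort, then rank: the appended column is the rank under that ordering.
sorted_then_ranked = rank(sorted(a, key=itemgetter(1)))

# Rank, then sort: the appended column keeps each tuple's original position,
# so reading it off in sorted order yields the permutation the sort applied.
ranked_then_sorted = sorted(rank(a), key=itemgetter(1))
permutation = [row[-1] for row in ranked_then_sorted]
```

In the second case the appended column survives the sort, which is what lets you recover where each tuple originally sat.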
> RANK function like in SQL
> -------------------------
>
> Key: PIG-2353
> URL: https://issues.apache.org/jira/browse/PIG-2353
> Project: Pig
> Issue Type: New Feature
> Reporter: Gianmarco De Francisci Morales
> Attachments: PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique,
> increasing identifier without gaps, like what RANK does for SQL.