[ 
https://issues.apache.org/jira/browse/PIG-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669660#comment-13669660
 ] 

Rohini Palaniswamy commented on PIG-3257:
-----------------------------------------

Alan,
   Why don't we do it as a sequence instead of generating random numbers? 
Something like mapid-<sequence> or reduceid-<sequence>, i.e. the first mapper 
will produce 0-0, 0-1, ..., 0-10000 and the second mapper will produce 1-0, 
1-1, ..., 1-10000. Just an idea, and we can think of a better implementation. 
The IDs will not be in sequence across the job anyway, but they will be in 
sequence within each map and can be used as a UUID across the job, one that is 
repeatable if the job is run with the same number of mappers/reducers. This 
would avoid all the problems of using random numbers, and avoid human mistakes 
from writing a script without understanding the internals of how the UUID is 
going to work, which I don't think a user should be bothered with. 
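
A minimal sketch of the taskid-<sequence> scheme described above, in plain 
Java. The class and method names are hypothetical and are not taken from 
PIG-3257.patch. The task index is assumed to be passed in by the caller; how a 
real UDF would read it from the job configuration is left out here.

```java
// Sketch only: SequenceIdGenerator and next() are illustrative names,
// not part of Pig or of the attached patch.
final class SequenceIdGenerator {
    // Index of the map or reduce task; assumed to be supplied by the caller.
    private final int taskIndex;
    // Next sequence number to hand out within this task.
    private long sequence = 0;

    SequenceIdGenerator(int taskIndex) {
        this.taskIndex = taskIndex;
    }

    // Returns "<taskIndex>-<sequence>" and advances the per-task counter.
    String next() {
        return taskIndex + "-" + (sequence++);
    }
}
```

A generator created with new SequenceIdGenerator(1) returns 1-0 on the first 
call to next() and 1-1 on the second. IDs from different tasks cannot collide 
as long as each task gets a distinct index, and a rerun with the same task 
layout and input order yields the same IDs.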
                
> Add unique identifier UDF
> -------------------------
>
>                 Key: PIG-3257
>                 URL: https://issues.apache.org/jira/browse/PIG-3257
>             Project: Pig
>          Issue Type: Improvement
>          Components: internal-udfs
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>             Fix For: 0.12
>
>         Attachments: PIG-3257.patch
>
>
> It would be good to have a Pig function to generate unique identifiers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
