[
https://issues.apache.org/jira/browse/PIG-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669660#comment-13669660
]
Rohini Palaniswamy commented on PIG-3257:
-----------------------------------------
Alan,
Why don't we do it as a sequence instead of generating random numbers? We could
do something like mapid-<sequence> or reduceid-<sequence>, i.e. the first mapper
will produce 0-0, 0-1, ..., 0-10000 and the second mapper will produce 1-0, 1-1,
..., 1-10000. This is just an idea, and we can think of a better implementation.
The IDs will not be in sequence across the job anyway, but they will be in
sequence within each map task and can be used as a UUID across the job, one that
is repeatable if the job is run with the same number of mappers/reducers. This
would avoid all the problems of using random numbers, and it would avoid the
human mistake of writing a script without understanding the internals of how the
UUID is going to work, which I don't think a user should have to be bothered with.
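To make the mapid-<sequence> idea concrete, here is a minimal sketch in plain Java. It is not the attached patch: the class name TaskSequenceId and the taskIndex argument are hypothetical, and how a real UDF would obtain its map or reduce task index is left out on purpose.

```java
/**
 * Hypothetical sketch of per-task sequential identifiers of the form
 * "<taskIndex>-<sequence>". Not taken from PIG-3257.patch.
 */
final class TaskSequenceId {
    // Assumed to be the index of the map or reduce task (0 for the first mapper).
    private final int taskIndex;
    // Counter local to this task; starts at 0 and grows by one per call.
    private long sequence = 0;

    TaskSequenceId(int taskIndex) {
        this.taskIndex = taskIndex;
    }

    /** Returns the next identifier for this task, e.g. "0-0", then "0-1". */
    String next() {
        String id = taskIndex + "-" + sequence;
        sequence++;
        return id;
    }

    public static void main(String[] args) {
        TaskSequenceId firstMapper = new TaskSequenceId(0);
        TaskSequenceId secondMapper = new TaskSequenceId(1);
        System.out.println(firstMapper.next());   // prints 0-0
        System.out.println(firstMapper.next());   // prints 0-1
        System.out.println(secondMapper.next());  // prints 1-0
    }
}
```

Because each task only counts its own records, no coordination between tasks is needed, and the same input split processed by the same task index yields the same IDs on a rerun.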
> Add unique identifier UDF
> -------------------------
>
> Key: PIG-3257
> URL: https://issues.apache.org/jira/browse/PIG-3257
> Project: Pig
> Issue Type: Improvement
> Components: internal-udfs
> Reporter: Alan Gates
> Assignee: Alan Gates
> Fix For: 0.12
>
> Attachments: PIG-3257.patch
>
>
> It would be good to have a Pig function to generate unique identifiers.