[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491301#comment-13491301 ]
Jonathan Coveney commented on PIG-3014:
---------------------------------------
(As a side note: I was thinking about this, and IF (big if) Hadoop can guarantee
an atomic write (I don't think it can?), then we only need one file. Each
mapper attempts to read it; if it is empty, the mapper appends its current time,
and then it reads the first datetime in that file. The atomic write would avoid
a race condition, since every mapper ends up reading the same first entry. If
writing isn't atomic, though, you'd have to abuse some other atomic action for
coordination, a la the delete above. In fact, we could even make this a generic
API that lets you coordinate some runtime value across UDF invocations but, once
again, it's not really a pattern we want to encourage.)
Now I sort of want to do this just for the challenge of it...
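
For illustration only, here is a rough sketch of the single-file idea, using
the local filesystem through java.nio as a stand-in for HDFS. The class name
SharedTimestampSketch and the method agreeOnTime are made up for this example
and are not part of Pig or Hadoop. The whole thing rests on the same big "if"
as above: the append is assumed to be atomic, which HDFS may well not
guarantee.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; neither the class nor the method exists in Pig or Hadoop.
final class SharedTimestampSketch {

    // Returns the time all callers agree on: the first record in the shared file.
    static long agreeOnTime(Path shared, long candidateMillis) {
        try {
            // If no time has been recorded yet, append our own candidate.
            if (!Files.exists(shared) || Files.size(shared) == 0) {
                byte[] record = (candidateMillis + "\n").getBytes(StandardCharsets.UTF_8);
                // ASSUMPTION: this append is atomic, so concurrent writers never
                // interleave bytes. That holds here only as an illustration on a
                // local file; it is not a claim about HDFS.
                Files.write(shared, record,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            }
            // Several callers may have appended; the first record wins for everyone.
            return Long.parseLong(
                    Files.readAllLines(shared, StandardCharsets.UTF_8).get(0));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path shared = Paths.get(System.getProperty("java.io.tmpdir"),
                "pig-now-" + System.nanoTime() + ".txt");
        // Two simulated "mappers" with different local clocks.
        long a = agreeOnTime(shared, System.currentTimeMillis());
        long b = agreeOnTime(shared, System.currentTimeMillis() + 5000);
        System.out.println("mappers agree: " + (a == b));
        shared.toFile().delete();
    }
}
```

Note that two callers can both see an empty file and both append. That is
fine under the atomicity assumption, because each of them then reads back the
same first line.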
> CurrentTime() UDF has undesirable characteristics
> -------------------------------------------------
>
> Key: PIG-3014
> URL: https://issues.apache.org/jira/browse/PIG-3014
> Project: Pig
> Issue Type: Bug
> Reporter: Jonathan Coveney
> Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3014-0.patch
>
>
> As part of the explanation of the new DateTime datatype I noticed that we had
> added a CurrentTime() UDF. The issue with this UDF is that it returns the
> current time _of every exec invocation_, which can lead to confusing results.
> In PIG-1431 I proposed a way such that every instance of the same NOW() will
> return the same time, which I think is better. Would enjoy thoughts.