Re: Why does `now()` produce different times within the same query?

Jonathan Haddad Fri, 02 Dec 2016 07:54:17 -0800

This isn't about using the same UUID though. It's about the timestamp bits
in the UUID.


What the use case is for generating multiple UUIDs in a single row? Why do
you need to extract the timestamp out of both?
On Fri, Dec 2, 2016 at 10:24 AM Edward Capriolo <edlinuxg...@gmail.com>
wrote:

>
> On Thu, Dec 1, 2016 at 11:09 AM, Sylvain Lebresne <sylv...@datastax.com>
> wrote:
>
> On Thu, Dec 1, 2016 at 4:44 PM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
>
>
> I am not sure you saw my reply on thread but I believe everyone's needs
> can be met I will copy that here:
>
>
> I saw it, but the real problem that was raised initially was not that of
> UDF and of allowing both behavior. It's a matter of people being confused
> by the behavior of a non-UDF function, now(), and suggesting it should be
> changed.
>
> The Hive idea is interesting I guess, and we can switch to discussing
> that, but it's a different problem really and I'm not a fond of derailing
> threads. I will just note though that if we're not talking about a
> confusion issue but rather how to get a timeuuid to be fixed within a
> statement, then there is much much more trivial solution: generate it
> client side. The `now()` function is a small convenience but there is
> nothing you cannot do without it client side, and that actually basically
> stands for almost any use of (non aggregate) function in Cassandra
> currently.
>
>
>
>
> "Food for thought: Hive's UDFs introduced an annotation  
> @UDFType(deterministic
> = false)
>
>
> http://dmtolpeko.com/2014/10/15/invoking-stateful-udf-at-map-and-reduce-side-in-hive/
>
> The effect is the query planner can see when such a UDF is in use and
> determine the value once at the start of a very long query."
>
> Essentially hive had a similar if not identical problem, during a long
> running distributed process like map/reduce some users wanted the semantics
> of:
>
> 1) Each call should have a new timestamps
>
> While other users wanted the semantics of:
>
> 2) Each call should generate the same timestamp
>
> The solution implemented was to add an annotation to udf such that the
> query planner would pick up the annotation and act accordingly.
>
> (Here is a related issue https://issues.apache.org/jira/browse/HIVE-1986
>
> As a result you can essentially implement two UDFS
>
> @UDFType(deterministic = false)
> public class UDFNow
>
> and for the other people
>
> @UDFType(deterministic = true)
> public class UDFNowOnce extends UDFNow
>
> Both user cases are met in a sensible way.
>
>
>
> The `now()` function is a small convenience but there is nothing you
> cannot do without it client side, and that actually basically stands for
> almost any use of (non aggregate) function in Cassandra currently.
>
> Casandra's changing philosophy over which entity should create such
> information client/server/driver does not make this problem easy.
>
> If you take into account that you have users who do not understand all the
> intricacy of uuid the problem is compounded. IE How does one generate a
> UUID each c#, python, java etc? with the 47 random bits of bla bla. That is
> not super easy information to find. Maybe you find a stack overflow post
> that actually gives bad advice etc.
>
> Many times in Cassandra you are using a uuid because you do not have a
> unique key in the insert and you wish to create one. If you are inserting
> more then a single record using that same UUID and you do not want the
> burden of wanting to do it yourself you would have to do write>>read>>write
> which is an anti-pattern.
>

Re: Why does `now()` produce different times within the same query?

Reply via email to