[ 
https://issues.apache.org/jira/browse/CASSANDRA-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040646#comment-14040646
 ] 

Sylvain Lebresne commented on CASSANDRA-6108:
---------------------------------------------

bq. How do we safely manage the life cycle of the ids?

Not easily for sure. I suppose we could add a per-client "I'm out" message in 
the protocol and say that if your clients are not well-behaved and don't send 
those messages, then well, fix them, but that's clearly not ideal. In fact, 
when I mentioned this idea (which was in no way a well thought-out thing), I 
was really thinking of IDs per connection, not per client, but that would 
require a lot more IDs than is necessary for correctness, which is definitely 
far from ideal.

All that said, I think it's worth taking a step back on this ticket, on what we 
need and what our constraints are. Typically, this ticket per se, to have a 
time64 CQL type, is not, imho, the most important thing we care about. What we 
need is a way to make our cell timestamps cluster-wide unique both for 
CASSANDRA-6123 and for CASSANDRA-7056, and we obviously want that new "better" 
timestamp to not be overly big. This ticket is more a "by the way, if we add 
that and it's more compact than a timeuuid, then let's expose it for columns 
too".

In particular, I don't think the fact that it's a 64-bit ID should be an 
absolute requirement.

And in fact, I would suggest that we *seriously* consider just using a 
timeuuid. Yes, a timeuuid is 128 bits long, which at face value feels excessive 
for a per-cell thing. However, we can relatively easily optimize their storage 
(at least on disk, and probably in memory too with some additional effort): the 
"clock and sequence" part of the timeuuid will basically be our per-client 
unique ID, for which a per-sstable dictionary should be reasonably efficient, 
and the timestamp can be stored as a delta from a per-sstable epoch. Overall, 
it should be relatively easy to get something more compact than what we 
currently have, which is imo a good bar for "acceptable" in terms of 
compactness.
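
To make the storage idea above concrete, here is a minimal sketch of what a 
dictionary-plus-delta encoding could look like. It is only an illustration 
under stated assumptions, not a proposed sstable format: the function names 
(split_timeuuid, encode_block, decode_block) are hypothetical, a "block" is 
just a Python list standing in for one sstable, and the low 64 bits of the 
UUID (clock sequence plus node) are assumed to play the role of the per-client 
unique ID.

```python
import uuid

LOW_64_MASK = 0xFFFFFFFFFFFFFFFF


def split_timeuuid(u):
    """Return (60-bit timestamp, low 64 bits) of a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    return u.time, u.int & LOW_64_MASK


def encode_block(uuids):
    """Encode timeuuids as (epoch, dictionary, [(delta, dict_index), ...]).

    Assumption: one call stands in for one sstable. The epoch is the smallest
    timestamp in the block, so every delta is non-negative.
    """
    parts = [split_timeuuid(u) for u in uuids]
    epoch = min((ts for ts, _ in parts), default=0)
    dictionary = []  # distinct low-64-bit values, in first-seen order
    index = {}       # low-64-bit value -> position in dictionary
    cells = []
    for ts, low in parts:
        if low not in index:
            index[low] = len(dictionary)
            dictionary.append(low)
        cells.append((ts - epoch, index[low]))
    return epoch, dictionary, cells


def decode_block(epoch, dictionary, cells):
    """Rebuild the original timeuuids from the compact form."""
    result = []
    for delta, idx in cells:
        ts = epoch + delta
        time_low = ts & 0xFFFFFFFF
        time_mid = (ts >> 32) & 0xFFFF
        time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
        high = (time_low << 32) | (time_mid << 16) | time_hi_version
        result.append(uuid.UUID(int=(high << 64) | dictionary[idx]))
    return result
```

In this sketch each cell only carries a small delta and a dictionary index, 
so the per-cell cost depends on how many distinct clients wrote to the block 
and how wide its time range is, rather than on the full 128 bits.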

But a bonus is that if we do that systematically for timeuuid, existing tables 
using timeuuid will become more compact too. Basically, instead of inventing 
something new and asking everyone to use that from now on (which is pretty 
painful for users), let's reuse what we have, which is somewhat standard, and 
optimize it. The big advantage is simplicity: for drivers, which won't have to 
implement new, potentially complex schemes (all drivers have a timeuuid 
generator already), for users, who won't have to use a new type, and probably 
for us too, as that's probably the simpler route. As far as I'm concerned, 
those advantages outweigh the downside of having a slightly less compact 
representation than if we were to hand-craft something custom.




> Create timeid64 type
> --------------------
>
>                 Key: CASSANDRA-6108
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6108
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 2.1.1
>
>
> As discussed in CASSANDRA-6106, we could create a 64-bit type with 48 bits of 
> timestamp and 16 bits of unique coordinator id.  This would give us a 
> unique-per-cluster value that could be used as a more compact replacement for 
> many TimeUUID uses.
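
For reference, the 48 + 16 bit layout described in the ticket could be 
sketched as below. This is only an illustration: the names pack_timeid64 and 
unpack_timeid64 are hypothetical, and both the millisecond unit and the choice 
to put the timestamp in the high bits (so that values sort by time first) are 
assumptions, since the ticket does not specify them.

```python
TIMESTAMP_BITS = 48
COORDINATOR_BITS = 16


def pack_timeid64(timestamp_ms, coordinator_id):
    """Pack a 48-bit timestamp and a 16-bit coordinator id into one integer."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 48 bits")
    if not 0 <= coordinator_id < (1 << COORDINATOR_BITS):
        raise ValueError("coordinator id does not fit in 16 bits")
    # Timestamp in the high bits: comparing packed values compares time first.
    return (timestamp_ms << COORDINATOR_BITS) | coordinator_id


def unpack_timeid64(value):
    """Return (timestamp_ms, coordinator_id) from a packed value."""
    return value >> COORDINATOR_BITS, value & ((1 << COORDINATOR_BITS) - 1)
```

With this layout, two coordinators writing in the same millisecond still 
produce distinct values, as long as their 16-bit ids differ.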



