[ https://issues.apache.org/jira/browse/CASSANDRA-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804990#comment-16804990 ]
Andreas Andersen commented on CASSANDRA-15064: ---------------------------------------------- The problem is a real world problem for us, and the timeuuid's in the example are not synthetic. We use the gocql client library for golang, which generates timeuuid's in this way. It's the function at https://github.com/gocql/gocql/blob/master/uuid.go#L121 we use to generate our timeuuid's from timestamps, and we rely on the sequence number to be incremented on each call. Our use case is sorting images on a timeline. Our table more or less look like: {noformat} CREATE TABLE timeline ( user_id uuid, image_timestamp timeuuid, image_id uuid, -- Lots of other metadata fields here... PRIMARY KEY(user_id, image_timestamp) ) WITH CLUSTERING ORDER BY(t ASC); {noformat} The images are sorted by the time that they were taken. However, images taken quickly in series by the same camera might all have equal timestamps. In this case, we want to order them by some other property to show them in the correct chronological order. As of today we use the filename for this. In order to keep the table structure simple, we just sort them in the correct order before inserting them, relying on each newly generated timeuuid being greater than the last one. Our client side code more or less look like: {noformat} type image struct { user_id uuid image_id uuid taken_at timestamp filename string } images := []image // Sort images in the order we want to store them in cassandra sort(images, [taken_at, filename]) // Now give every image a unique timeuuid and insert them into cassandra. We // rely on the timeuuid generator to generate id's in ascending order in case of // multiple images with the same timestamp. for i := range images { image_timestamp := uuidFromTime(i.taken_at) queryCassandra( "INSERT INTO timeline(user_id, image_timestamp, image_id) VALUES(?, ?, ?)", i.user_id, image_timestamp, i.image_id ) } {noformat} Our use case is kind of special, since we at the time of insertion have all images that will ever be created for a single user, which allows us to insert them in this way. Creating a table in the following way would give us the same result: {noformat} CREATE TABLE timeline ( user_id uuid, image_timestamp timestamp, sort_number int, image_id uuid, -- Lots of other metadata fields here... PRIMARY KEY(user_id, image_timestamp, sort_number) ) WITH CLUSTERING ORDER BY(t ASC); {noformat} In practicality this is not really a big issue for us. We can just change the column type from timeuuid to a uuid, and everything will behave the way that we expect. While reading the code for gocql.UUIDFromTime() I also realize the our implementation is broken, as the sequence number will wrap at times, since it's only 14 bits long :) > Wrong ordering for timeuuid fields > ---------------------------------- > > Key: CASSANDRA-15064 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15064 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Schema > Reporter: Andreas Andersen > Assignee: Jon Meredith > Priority: Normal > Attachments: example.cql > > > Hi! > We're seeing some strange behavior for the ordering of timeuuid fields. They > seem to be sorted in the wrong order when the clock_seq_low field in a > timeuuid goes from 7f to 80. Consider the following example: > {noformat} > cqlsh:test> show version; > [cqlsh 5.0.1 | Cassandra 3.11.4 | CQL spec 3.4.4 | Native protocol v4] > cqlsh:test> CREATE TABLE t ( > ... partition int, > ... t timeuuid, > ... i int, > ... > ... PRIMARY KEY(partition, t) > ... ) > ... WITH CLUSTERING ORDER BY(t ASC); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b57e-f0def1d0755e, 1); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b57f-f0def1d0755e, 2); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b580-f0def1d0755e, 3); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b581-f0def1d0755e, 4); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b582-f0def1d0755e, 5); > cqlsh:test> SELECT * FROM t WHERE partition = 1 ORDER BY t ASC; > > partition | t | i > -----------+--------------------------------------+--- > 1 | 84e2c963-4ef9-11e9-b580-f0def1d0755e | 3 > 1 | 84e2c963-4ef9-11e9-b581-f0def1d0755e | 4 > 1 | 84e2c963-4ef9-11e9-b582-f0def1d0755e | 5 > 1 | 84e2c963-4ef9-11e9-b57e-f0def1d0755e | 1 > 1 | 84e2c963-4ef9-11e9-b57f-f0def1d0755e | 2 > > (5 rows) > cqlsh:test> > {noformat} > The expected behavior is that the rows are returned in the same order as they > were inserted (we inserted them with their clustering key in an ascending > order). Instead, the order "wraps" in the middle. > This issue only arises when the 9th octet (clock_seq_low) in the uuid goes > from 7f to 80. A guess would be that the comparison is implemented as a > signed integer instead of an unsigned integer, as 0x7f = 127 and 0x80 = -128. > According to the RFC, the field should be treated as an unsigned integer: > [https://tools.ietf.org/html/rfc4122#section-4.1.2] > Changing the field from a timeuuid to a uuid gives the expected correct > behavior: > {noformat} > cqlsh:test> CREATE TABLE t ( > ... partition int, > ... t uuid, > ... i int, > ... > ... PRIMARY KEY(partition, t) > ... ) > ... WITH CLUSTERING ORDER BY(t ASC); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b57e-f0def1d0755e, 1); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b57f-f0def1d0755e, 2); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b580-f0def1d0755e, 3); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b581-f0def1d0755e, 4); > cqlsh:test> INSERT INTO t(partition, t, i) VALUES(1, > 84e2c963-4ef9-11e9-b582-f0def1d0755e, 5); > cqlsh:test> SELECT * FROM t WHERE partition = 1 ORDER BY t ASC; > > partition | t | i > -----------+--------------------------------------+--- > 1 | 84e2c963-4ef9-11e9-b57e-f0def1d0755e | 1 > 1 | 84e2c963-4ef9-11e9-b57f-f0def1d0755e | 2 > 1 | 84e2c963-4ef9-11e9-b580-f0def1d0755e | 3 > 1 | 84e2c963-4ef9-11e9-b581-f0def1d0755e | 4 > 1 | 84e2c963-4ef9-11e9-b582-f0def1d0755e | 5 > > (5 rows) > cqlsh:test>{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org