[ https://issues.apache.org/jira/browse/CASSANDRA-20100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902815#comment-17902815 ]
Caleb Rackliffe commented on CASSANDRA-20100:
---------------------------------------------

{{LongType#asComparableBytes()}} uses {{ByteSource.variableLengthInteger()}}.

> Is the fundamental problem the truncated types here?

The nominal problem is that we index a reversed byte-ordered representation when {{ReversedType}} is in play for the types we don't explicitly have to truncate. In any case, I don't think we ever needed to do this (the reversing, rather than just using the base type comparable bytes). My plan is to file a follow-up Jira that does away with that extra work, while this Jira simply corrects the problems in query construction without changing the on-disk format.

> Query construction is broken for SAI indexes on reversed types with fixed-length encodings
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20100
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20100
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/2i Index, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>         Attachments: ci_summary.html
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> SAI indexes values in byte-comparable form, both in the in-memory trie that
> sits alongside the Memtable, and in the on-disk SSTable-adjacent indexes. In
> most cases, this means literally using {{asComparableBytes()}} from the type
> of the indexed column. There are, however, a few types that use a custom
> byte-comparable form, namely {{inet}}, {{bigint}}, {{varint}}, and
> {{decimal}}, to make sure we're dealing with a fixed-length piece of data for
> the numeric (balanced tree) index.
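To make the terms above concrete, here is a minimal Java sketch of what a fixed-length byte-comparable form and its reversed counterpart can look like. It is an illustration only: the class {{ReversedTermSketch}} and its methods {{encode}}, {{invert}}, and {{matchesGreaterThan}} are hypothetical names, and the sign-flipped big-endian encoding is an assumed stand-in, not the encoding Cassandra or SAI actually uses.

```java
import java.util.Arrays;

// Illustrative only: a hypothetical fixed-length, byte-comparable encoding for
// a 64-bit integer. This is NOT Cassandra's actual encoding or API.
final class ReversedTermSketch {

    // Encode a long as 8 big-endian bytes with the sign bit flipped, so that
    // unsigned lexicographic byte order matches signed numeric order.
    static byte[] encode(long value) {
        long flipped = value ^ Long.MIN_VALUE;
        byte[] out = new byte[Long.BYTES];
        for (int i = Long.BYTES - 1; i >= 0; i--) {
            out[i] = (byte) flipped;
            flipped >>>= 8;
        }
        return out;
    }

    // Invert every byte, which reverses the unsigned lexicographic order of
    // equal-length encodings (an assumed stand-in for a "reversed" form).
    static byte[] invert(byte[] bytes) {
        byte[] out = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = (byte) ~bytes[i];
        }
        return out;
    }

    // Range check that assumes the indexed term is in base-type (non-reversed)
    // form: the bound is encoded with the base encoding and never inverted.
    static boolean matchesGreaterThan(byte[] indexedTerm, long bound) {
        return Arrays.compareUnsigned(indexedTerm, encode(bound)) > 0;
    }

    public static void main(String[] args) {
        byte[] five = encode(5L);
        byte[] minusOne = encode(-1L);
        // Base form: -1 sorts before 5.
        System.out.println(Arrays.compareUnsigned(minusOne, five) < 0);
        // Inverted form: the order flips.
        System.out.println(Arrays.compareUnsigned(invert(minusOne), invert(five)) > 0);
        // Term and bound in the same form: 7 > 5 is found.
        System.out.println(matchesGreaterThan(encode(7L), 5L));
        // Term and bound in different forms: the same comparison fails.
        System.out.println(matchesGreaterThan(invert(encode(7L)), 5L));
    }
}
```

In this sketch the last check returns false even though 7 is greater than 5, because the term and the bound were encoded under different assumptions about reversal. That is the general kind of mismatch at issue here; the sketch does not reproduce the exact code path in SAI.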
> If we index one of these types as a reversed clustering key, however, we
> don't write terms as reversed comparable bytes, and this breaks some
> assumptions during query construction and post-filtering, where we generally
> assume that {{asComparableBytes()}} will reverse terms before they are
> indexed. We can make a short-term fix here without changing anything about
> the on-disk format by making sure we interpret these special types as being
> non-reversed (i.e. through the lens of their base types).
> In the longer term, it might make sense to standardize on indexing everything
> in a non-reversed fashion in the index itself, although this might push some
> complexity into post-filtering, where we are going to have to filter data
> coming out of the normal read path anyway.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org