[ 
https://issues.apache.org/jira/browse/CASSANDRA-20100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902815#comment-17902815
 ] 

Caleb Rackliffe commented on CASSANDRA-20100:
---------------------------------------------

{{LongType#asComparableBytes()}} uses {{ByteSource.variableLengthInteger()}}.

> Is the fundamental problem the truncated types here?

The nominal problem is that, when {{ReversedType}} is in play, we index a 
reversed byte-ordered representation for the types we don't explicitly have 
to truncate. In any case, I don't think we ever needed the reversing; indexing 
the base type's comparable bytes would have been enough. My plan is to file a 
follow-up Jira that does away with that extra work, while this Jira simply 
corrects the problems in query construction without changing the on-disk format.
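
To make the mismatch concrete, here is a minimal, self-contained sketch. It 
does not use Cassandra's classes: {{ReversedTermSketch}}, {{encode()}}, 
{{reverse()}}, {{countBelow()}} and {{countAbove()}} are hypothetical names, 
the fixed-length encoding (big-endian with the sign bit flipped) is only an 
assumption standing in for the real one, and bitwise inversion is just one 
simple way to reverse the order of fixed-length terms. It shows what can go 
wrong when a {{v > 5}} bound is built as if terms were reversed while the 
index holds non-reversed terms.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class ReversedTermSketch
{
    // Assumed fixed-length byte-comparable form of a long:
    // big-endian bytes with the sign bit flipped, so unsigned
    // byte order matches signed numeric order.
    static byte[] encode(long value)
    {
        long flipped = value ^ Long.MIN_VALUE;
        byte[] out = new byte[Long.BYTES];
        for (int i = Long.BYTES - 1; i >= 0; i--)
        {
            out[i] = (byte) flipped;
            flipped >>>= 8;
        }
        return out;
    }

    // Assumed "reversed" form: invert every byte, which flips the
    // unsigned byte order of equal-length terms.
    static byte[] reverse(byte[] term)
    {
        byte[] out = new byte[term.length];
        for (int i = 0; i < term.length; i++)
            out[i] = (byte) ~term[i];
        return out;
    }

    // Number of stored terms strictly below the bound (unsigned byte order).
    static int countBelow(byte[][] stored, byte[] bound)
    {
        int n = 0;
        for (byte[] term : stored)
            if (Arrays.compareUnsigned(term, bound) < 0)
                n++;
        return n;
    }

    // Number of stored terms strictly above the bound (unsigned byte order).
    static int countAbove(byte[][] stored, byte[] bound)
    {
        int n = 0;
        for (byte[] term : stored)
            if (Arrays.compareUnsigned(term, bound) > 0)
                n++;
        return n;
    }

    public static void main(String[] args)
    {
        // Index contents written WITHOUT reversing: 1, 5, 9.
        byte[][] plain = { encode(1), encode(5), encode(9) };
        // The same values as they would look if reversing had been applied.
        byte[][] reversed = { reverse(encode(1)), reverse(encode(5)), reverse(encode(9)) };

        // "v > 5" built as if terms were reversed: term < reverse(encode(5)).
        byte[] reversedBound = reverse(encode(5));

        // Reversed bound against non-reversed terms: misses the row with 9.
        System.out.println(countBelow(plain, reversedBound));    // prints 0
        // Reversed bound against reversed terms: consistent, finds 9.
        System.out.println(countBelow(reversed, reversedBound)); // prints 1
        // Base-type bound against non-reversed terms: consistent, finds 9.
        System.out.println(countAbove(plain, encode(5)));        // prints 1
    }
}
```

Either consistent pairing returns the right row; only mixing a reversed bound 
with non-reversed terms loses it, which is why reading these types through 
their base type is enough for a fix that leaves the on-disk format alone.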

> Query construction is broken for SAI indexes on reversed types with 
> fixed-length encodings
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20100
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20100
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/2i Index, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>         Attachments: ci_summary.html
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> SAI indexes values in byte-comparable form, both in the in-memory trie that 
> sits alongside the Memtable, and in the on-disk SSTable-adjacent indexes. In 
> most cases, this means literally using {{asComparableBytes()}} from the type 
> of the indexed column. There are, however, a few types that use a custom 
> byte-comparable form, namely {{inet}}, {{bigint}}, {{varint}}, and 
> {{decimal}}, to make sure we're dealing with a fixed-length piece of data for 
> the numeric (balanced tree) index.
>
> If we index one of these types as a reversed clustering key, however, we 
> don't write terms as reversed comparable bytes, and this breaks some 
> assumptions during query construction and post-filtering, where we generally 
> assume that {{asComparableBytes()}} will reverse terms before they are 
> indexed. We can make a short-term fix here without changing anything about 
> the on-disk format by making sure we interpret these special types as being 
> non-reversed (i.e. through the lens of their base types).
>
> In the longer term, it might make sense to standardize on indexing everything 
> in a non-reversed fashion in the index itself, although this might push some 
> complexity into post-filtering, where we are going to have to filter data 
> coming out of the normal read path anyway.


