[ 
https://issues.apache.org/jira/browse/CASSANDRA-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589935#comment-14589935
 ] 

Benedict commented on CASSANDRA-9499:
-------------------------------------

bq. Doesn't shifting the continuation bits to the first byte penalize the range 
of 1-byte values? 

No. You have to have a single continuation "bit" for either encoding. They're 
literally equivalent, just with their bit positions swapped around; in the 
single byte case they are exactly identical.

Run-length encoding means you count the number of set (or unset - this would 
actually be cleaner) bits contiguously at the top of the first byte; the 
position of the first bit that isn't (or is) set tells you how many more bytes 
you need to read. i.e. if the first bit is unset, you're done, and the 
remaining 7 bits are the value. If all 8 bits are set, we need to read a full 
long.
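
To make the scheme concrete, here is a minimal sketch of the "set bits" 
variant. It is illustrative only and not the patch itself: the class and 
method names are hypothetical, the value is treated as unsigned, and the 
big-endian ordering of the trailing bytes is an assumption.

```java
final class VIntSketch {
    // Hypothetical helper: how many bytes follow the first byte.
    // With n extra bytes (n < 8) we have 7 + 7n value bits; n == 8 is a full long.
    static int extraBytes(long value) {
        int bits = 64 - Long.numberOfLeadingZeros(value | 1); // at least 1 bit
        return Math.min(8, (bits - 1) / 7);
    }

    static byte[] encode(long value) {
        int extra = extraBytes(value);
        byte[] out = new byte[extra + 1];
        long v = value;
        // Fill the trailing bytes from the end, least significant byte last.
        for (int i = extra; i >= 1; i--) {
            out[i] = (byte) v;
            v >>>= 8;
        }
        // The first byte starts with `extra` set bits; an unset bit ends the run
        // unless all 8 bits are set. Whatever is left of v fits below the run.
        int prefix = (0xFF << (8 - extra)) & 0xFF;
        out[0] = (byte) (prefix | v);
        return out;
    }

    static long decode(byte[] in) {
        int first = in[0] & 0xFF;
        // Count the contiguous set bits at the top of the first byte (0..8).
        int extra = Integer.numberOfLeadingZeros(~first & 0xFF) - 24;
        // Keep only the value bits below the run and its terminating unset bit.
        long value = first & (0x7F >> extra);
        for (int i = 1; i <= extra; i++) {
            value = (value << 8) | (in[i] & 0xFF);
        }
        return value;
    }
}
```

Under these assumptions, values up to 127 take one byte with no shifting or 
looping, and any value that needs more than 56 bits takes nine bytes in total.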

This encoding gives us pretty ideal behaviour: cheap operation over the 
single byte representation, followed by consistent behaviour across all the 
remaining possible values. It also lets us easily avoid wasting a whole byte 
for the final bit of a long.
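
The point about the final bit can be checked with a little arithmetic. The 
sketch below is illustrative only, with hypothetical names; it compares the 
encoded size of an unsigned value under a per-byte continuation bit (7 value 
bits in every byte) with the size under the run-length prefix described above.

```java
final class VIntSizes {
    // Significant bits in the unsigned value, counting at least one.
    static int bits(long value) {
        return 64 - Long.numberOfLeadingZeros(value | 1);
    }

    // Per-byte continuation bit: every byte carries 7 value bits.
    static int continuationSize(long value) {
        return (bits(value) + 6) / 7;
    }

    // Run-length prefix: 7 + 7n value bits for n < 8 extra bytes,
    // and a full 64 bits once the first byte is all set bits.
    static int prefixSize(long value) {
        return 1 + Math.min(8, (bits(value) - 1) / 7);
    }
}
```

The two sizes agree for every value of up to 63 significant bits. They differ 
only when all 64 bits are needed: the continuation-bit form then takes ten 
bytes, the last one carrying a single value bit, while the prefix form takes 
nine.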

This is equivalent to Aleksey's characterisation.

> Introduce writeVInt method to DataOutputStreamPlus
> --------------------------------------------------
>
>                 Key: CASSANDRA-9499
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9499
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Ariel Weisberg
>            Priority: Minor
>             Fix For: 3.0 beta 1
>
>
> CASSANDRA-8099 really could do with a writeVInt method, for both fixing 
> CASSANDRA-9498 but also efficiently encoding timestamp/deletion deltas. It 
> should be possible to make an especially efficient implementation against 
> BufferedDataOutputStreamPlus.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
