[ 
https://issues.apache.org/jira/browse/CASSANDRA-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375771#comment-14375771
 ] 

Benedict edited comment on CASSANDRA-6936 at 3/23/15 11:43 AM:
---------------------------------------------------------------

The more I think about future storage changes, the more of a pain and a 
headache this becomes. I would like to reassess the possibility of making 
everything byte-order comparable. How widely deployed are custom AbstractType 
implementations where the comparator makes a difference? It seems that 
dropping support for just this (and instead having the user define an ASC/DESC 
order on the fields for maps/sets/tables within a UDT, for instance) would 
give us the ability to deliver it universally.
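
To make "byte-order comparable" concrete, here is a minimal sketch of one 
common way to encode a signed 32-bit integer so that plain unsigned, 
lexicographic byte comparison agrees with numeric order, with an optional 
DESC variant. The function name `encode_int32` and the `descending` flag are 
hypothetical, chosen for illustration; this is not Cassandra's actual 
serialization format.

```python
import struct

def encode_int32(value: int, descending: bool = False) -> bytes:
    """Hypothetical encoder: unsigned byte order of the result matches numeric order."""
    # Big-endian two's complement puts the most significant byte first.
    raw = struct.pack(">i", value)
    # Flipping the sign bit makes negative values sort before non-negative ones
    # when the bytes are compared as unsigned.
    encoded = bytes([raw[0] ^ 0x80]) + raw[1:]
    if descending:
        # For fixed-width values, inverting every byte reverses the order.
        encoded = bytes(b ^ 0xFF for b in encoded)
    return encoded

values = [3, -1, 0, -2147483648, 2147483647, 256, -256]
# Python compares bytes objects lexicographically as unsigned bytes,
# so sorting by the encoded form needs no type-specific comparator.
asc = sorted(values, key=encode_int32)
desc = sorted(values, key=lambda v: encode_int32(v, descending=True))
```

With an encoding like this, ASC/DESC becomes a property of how the bytes are 
written rather than of a per-type comparator.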

As far as I am aware, we're the only database that hamstrings itself with 
this limitation (or permission). I would like to byte-prefix compress our 
index file (because as standard it unnecessarily takes up a significant 
proportion of the size of the data it indexes, inflating the number of disk 
accesses and reducing the effective capacity of the key cache), but this isn't 
possible without a majority of fields supporting it. Even then, special 
casing for those that do not is a headache and adds code complexity. It also 
pollutes the icache and branch predictors (not just with the proliferation of 
variants, but in the logic to select between them). This is not to be 
underestimated: it's surprising how many icache misses you can get on a simple 
in-memory stress workload, which underrepresents the variation in a normal 
deployment. vtune rates our utilisation of chips pretty poorly, and this is a 
major contributor. The same is true for optimising merges (we get 
significantly better algorithmic complexity with far fewer changes if the 
comparable fields are byte-prefix comparable), and for compressing clustering 
columns in data files on disk. I am certain I will encounter more scenarios 
before long.
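
As a rough illustration of the byte-prefix compression mentioned above, the 
following sketch stores each key in a byte-sorted run as the length of the 
prefix it shares with the previous key plus the remaining suffix. The helper 
names and the sample keys are hypothetical, and this is not the index format 
Cassandra uses; it only shows why neighbours in byte order compress well.

```python
from typing import List, Tuple

def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i

def prefix_compress(sorted_keys: List[bytes]) -> List[Tuple[int, bytes]]:
    """Encode each key as (shared prefix length with previous key, suffix)."""
    out = []
    prev = b""
    for key in sorted_keys:
        shared = common_prefix_len(prev, key)
        out.append((shared, key[shared:]))
        prev = key
    return out

def prefix_decompress(entries: List[Tuple[int, bytes]]) -> List[bytes]:
    """Rebuild the original keys from (shared length, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys

# Keys sorted by unsigned byte order; adjacent keys share long prefixes.
keys = sorted([b"user:1001", b"user:1002", b"user:1010", b"video:7"])
compressed = prefix_compress(keys)
restored = prefix_decompress(compressed)
```

The scheme relies on the keys being ordered by their raw bytes: if a type's 
comparator disagrees with byte order, adjacent entries need not share 
prefixes, and lookups cannot compare against the compressed form directly.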

I think the cumulative performance wins here would be _very_ significant for 
all workloads (compaction, disk reads and in-memory reads would all benefit 
significantly from this change).

CASSANDRA-8099, CASSANDRA-8731, CASSANDRA-8906 and CASSANDRA-8915 all help, but 
none will help as much, and each adds its own complexity, whereas this would 
_simplify_, which I think is important (for us as well as the CPU).


> Make all byte representations of types comparable by their unsigned byte 
> representation only
> --------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6936
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6936
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>
> This could be a painful change, but is necessary for implementing a 
> trie-based index, and settling for less would be suboptimal; it also should 
> make comparisons cheaper all-round, and since comparison operations are 
> pretty much the majority of C*'s business, this should be easily felt (see 
> CASSANDRA-6553 and CASSANDRA-6934 for an example of some minor changes with 
> major performance impacts). No copying/special casing/slicing should mean 
> fewer opportunities to introduce performance regressions as well.
> Since I have slated for 3.0 a lot of non-backwards-compatible sstable 
> changes, hopefully this shouldn't be too much more of a burden.


