[jira] [Comment Edited] (CASSANDRA-6936) Make all byte representations of types comparable by their unsigned byte representation only
[ https://issues.apache.org/jira/browse/CASSANDRA-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17560942#comment-17560942 ]

Ivan Senic edited comment on CASSANDRA-6936 at 6/30/22 9:12 AM:

Do I understand correctly that this will first be available in the `4.2` release, which is scheduled to go out in a year?

was (Author: JIRAUSER281556):
Do I understand good that this will be first available in the `4.2` release that is scheduled to go out in a year?

> Make all byte representations of types comparable by their unsigned byte representation only
>
> Key: CASSANDRA-6936
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6936
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Core
> Reporter: Benedict Elliott Smith
> Assignee: Branimir Lambov
> Priority: Normal
> Labels: compaction, performance
> Fix For: 4.2
> Time Spent: 25h
> Remaining Estimate: 0h
>
> This could be a painful change, but is necessary for implementing a trie-based index, and settling for less would be suboptimal; it also should make comparisons cheaper all-round, and since comparison operations are pretty much the majority of C*'s business, this should be easily felt (see CASSANDRA-6553 and CASSANDRA-6934 for an example of some minor changes with major performance impacts). No copying/special casing/slicing should mean fewer opportunities to introduce performance regressions as well.
> Since I have slated for 3.0 a lot of non-backwards-compatible sstable changes, hopefully this shouldn't be too much more of a burden.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
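To illustrate why the issue description ties a trie-based index to byte-comparable keys, here is a minimal sketch in Python. It is not Cassandra's implementation; the `ByteTrie` class and its methods are hypothetical names used only for illustration. The point it shows is that a trie keyed on raw bytes returns keys in unsigned lexicographic byte order, so it can only serve as an ordered index if that byte order is also the type's comparison order.

```python
class ByteTrie:
    """Toy trie over raw bytes (hypothetical, for illustration only)."""

    def __init__(self):
        self.children = {}     # byte value (0-255) -> ByteTrie
        self.terminal = False  # True if a key ends at this node

    def insert(self, key: bytes) -> None:
        node = self
        for b in key:
            node = node.children.setdefault(b, ByteTrie())
        node.terminal = True

    def keys(self, prefix: bytes = b""):
        # A key that ends here sorts before every longer key sharing its prefix.
        if self.terminal:
            yield prefix
        # Visiting children in ascending byte value gives unsigned byte order.
        for b in sorted(self.children):
            yield from self.children[b].keys(prefix + bytes([b]))
```

Iterating `keys()` after inserting keys in any order yields them sorted by unsigned bytes, with no type-specific comparator involved.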
[jira] [Comment Edited] (CASSANDRA-6936) Make all byte representations of types comparable by their unsigned byte representation only
[ https://issues.apache.org/jira/browse/CASSANDRA-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375771#comment-14375771 ]

Benedict edited comment on CASSANDRA-6936 at 3/23/15 11:43 AM:

So, the more often I think of future storage changes, the more this becomes a pain and a headache. I would like to reassess the possibility of making everything byte-order comparable. How widely deployed are custom AbstractType implementations where the comparator makes a difference? Because it seems dropping support for just this (and having the user define an ASC/DESC order on the fields for maps/sets/tables within a UDT instead, for instance) would give us the ability to deliver it universally. As far as I am aware, we're the only database that hamstrings itself with this limitation (or permission).

I would like to byte-prefix compress our index file (because as standard it unnecessarily takes up a significant proportion of the size of the data it indexes, inflating the number of disk accesses and reducing the effective capacity of the key cache), but this isn't possible without a majority of fields supporting this. Even then, if we have special casing for those that do not, this means a headache and added code complexity. It also pollutes the icache and branch predictors (not just with the inflation of variants, but in the logic to select between them). This is not to be understated: it's surprising how many icache misses you can get on a simple in-memory stress workload, which is underrepresentative of the variation for a normal deployment. vtune rates our utilisation of chips pretty poorly, and this is a major contributor.

The same is true for optimising merges (we get significantly better algorithmic complexity with far fewer changes if the comparable fields are byte-prefix comparable), and for compressing clustering columns in data files on disk. I am certain I will encounter more scenarios before long.
I think the cumulative performance wins here would be really _very_ significant, for all workloads (compaction, disk reads and in-memory reads all have significant wins from this change). CASSANDRA-8099, CASSANDRA-8731, CASSANDRA-8906 and CASSANDRA-8915 all help, but none will help as significantly - and each adds its own complexity, whereas this would _simplify_, which I think is important (for us as well as the CPU).

Make all byte representations of types comparable by their unsigned byte representation only

Key: CASSANDRA-6936
URL: https://issues.apache.org/jira/browse/CASSANDRA-6936
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Labels: performance
Fix For: 3.0
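The byte-prefix compression of the index file mentioned in the comment above can be sketched as front coding: each key is stored as the number of leading bytes it shares with the previous key, plus the remaining suffix. This is a minimal Python sketch under that assumption, not Cassandra's on-disk format; the function names are hypothetical. It only pays off when neighbouring keys in comparator order are also neighbours in byte order, which is what byte-order comparability would guarantee.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def front_encode(sorted_keys):
    """Encode byte-sorted keys as (shared_prefix_length, suffix) pairs."""
    prev = b""
    entries = []
    for key in sorted_keys:
        shared = common_prefix_len(prev, key)
        entries.append((shared, key[shared:]))
        prev = key
    return entries


def front_decode(entries):
    """Rebuild the original keys from (shared_prefix_length, suffix) pairs."""
    prev = b""
    keys = []
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys
```

For example, encoding the sorted keys `b"user:1000"` and `b"user:1001"` stores the second one as the pair `(8, b"1")`, since the first eight bytes are shared.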
[jira] [Comment Edited] (CASSANDRA-6936) Make all byte representations of types comparable by their unsigned byte representation only
[ https://issues.apache.org/jira/browse/CASSANDRA-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305367#comment-14305367 ]

Aleksey Yeschenko edited comment on CASSANDRA-6936 at 2/4/15 4:03 PM:

Additionally, I wouldn't want to layer extra conversion logic on top of CASSANDRA-8099, which is already under way. We will have bugs there (in the back-and-forth conversion of mutations and read commands). We are still catching bugs of this kind from CASSANDRA-3237. You don't want to make things worse by having this on top, in a single release.

was (Author: iamaleksey):
Additionally, I wouldn't want to layer extra conversion logic on top of the already happening CASSANDRA-8099. We will have bugs there (in back and forth convertion of mutations and read commands). We are still catching bugs of this kind from CASSANDRA-3237. You don't want to make things worth by having this on top, in a single release.
[jira] [Comment Edited] (CASSANDRA-6936) Make all byte representations of types comparable by their unsigned byte representation only
[ https://issues.apache.org/jira/browse/CASSANDRA-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305181#comment-14305181 ]

Benedict edited comment on CASSANDRA-6936 at 2/4/15 2:49 PM:

> Maybe a time will come where comparisons are our main bottleneck but we're not there atm and future storage changes will probably impact this as well.

We are there already. Speak to [~jblangs...@datastax.com] and [~jshook], for instance, who have each been working with users for whom the CPU cost of comparisons is the performance bottleneck. One of these customers is seeing a blistering 4MB/s of compaction throughput with their CPUs maxed out. The other had to stop using collections entirely. Comparisons are pretty much the main time sink for C* when working with clustering columns, and especially collections.

The big problem fields are int, bigint and timestamp. All of these are very commonly used, and trivial to make byte-order comparable. The optimisations made a little while back had a significant impact on the CPU cost of merges, and they all depend on byte-order comparability of every clustering column on the table. For such small fields the cost of the virtual invocation is a significant percentage of the time spent, since the data will generally be in cache, having just been read off disk. We can avoid multiple such virtual invocations if all of the fields are byte-order comparable. It also improves instruction cache occupancy for these common methods, since they all go through the same codepath (at the time of making those optimisations, instruction cache misses were actually a significant problem, and likely worse on a live server with a more varied workload).

Future storage changes largely depend on it too for delivering the best performance, as the binary trie is likely to be the most significant win. Further, CASSANDRA-8731 can perhaps exploit the nature of these fields to reduce costs of merging even further.
That all said, CASSANDRA-8731 may well help get some of the way there by itself, depending on how things pan out. was (Author: benedict): bq. Maybe a time will come where comparisons are our main bottleneck but we're not there atm and future storage changes will probably impact this as well. We are there already. Speak to [~jblangs...@datastax.com], for instance, who's been working with two users recently seeing CPU costs of comparison bottleneck performance. One of these customers is seeing a blistering 4MB/s of compaction throughput with their CPUs maxed out. Comparisons are pretty much the main time sink for c* when working with clustering columns, and especially collections. The big problem fields are int, bigint and timestamp. All of these are very commonly used, and trivial to make byte-order comparable. The optimisations made a little while back had a significant impact on CPU cost of merges, and they all depend on byte-order comaprability of every clustering column on the table. For such small fields the cost of the virtual invocation is a significant percentage of the time spent since the data will generally be in cache, having just been read off disk. We can avoid multiple such virtual invocations if all of the fields are byte-order comparable. It also improves instruction cache occupancy for these common methods, since they all go through the same codepath (at the time of making those optimisations, instruction cache misses were actually a significant problem, and likely worse on a live server with a more varied workload). Future storage changes largely depend on it too for delivering the best performance, as the binary trie is likely to be the most significant win. Further CASSANDRA-8731 can perhaps exploit the nature of these fields to reduce costs of merging even further. 
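The claim in the comment above that int, bigint and timestamp are trivial to make byte-order comparable can be illustrated with a sign-bit flip on a big-endian encoding. The Python sketch below is an illustration only, not the encoding Cassandra adopted; the function names are hypothetical, and a timestamp is assumed here to be a signed 64-bit value that would go through the same 64-bit path.

```python
import struct


def encode_int32(value: int) -> bytes:
    """Encode a signed 32-bit int so unsigned byte order equals numeric order."""
    # Adding 2**31 flips the sign bit of the two's-complement form,
    # mapping [-2**31, 2**31) onto [0, 2**32); then write it big-endian.
    return struct.pack(">I", (value + (1 << 31)) & 0xFFFFFFFF)


def decode_int32(data: bytes) -> int:
    """Invert encode_int32."""
    return struct.unpack(">I", data)[0] - (1 << 31)


def encode_int64(value: int) -> bytes:
    """Same idea for a signed 64-bit value (bigint; timestamp by assumption)."""
    return struct.pack(">Q", (value + (1 << 63)) & 0xFFFFFFFFFFFFFFFF)


def decode_int64(data: bytes) -> int:
    """Invert encode_int64."""
    return struct.unpack(">Q", data)[0] - (1 << 63)
```

Python compares `bytes` objects as unsigned byte strings, so `sorted(values, key=encode_int32)` gives the same order as `sorted(values)`. Plain big-endian two's complement does not have this property, because negative numbers start with a set high bit and would sort after positive ones.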