[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986441#comment-16986441 ]
Blake Eggleston edited comment on CASSANDRA-15393 at 12/3/19 1:14 AM: ---------------------------------------------------------------------- [~benedict], I’d like your opinion on this. Instead of adding a generic {{Value}} wrapper, I tried using the same opaque value / value accessor pattern I’d used on CASSANDRA-15390 to generalize serialization logic. Code is [here|https://github.com/bdeggleston/cassandra/tree/15393-accessor]. Decoupling the serialization logic from the memory abstraction like this makes it possible to realize all the allocation improvements of using byte arrays for cell values without locking us into byte arrays or adding overhead of a wrapper object (which incurs a ~6% allocation penalty with these tests). We also avoid worrying about megamorphic call sites. As a bonus, this pattern means we can convert clustering prefix to use byte arrays as well without a lot of extra work, which saves an additional 5% of allocations, on top of the existing ~18% from cells. So that’s the good news, the bad news is that it touches a lot of serialization code, and is just a larger patch in general than simply converting cells to use byte arrays (although I’m not sure how well that would have worked with the slab allocator in practice). I still need to chase down a handful of failing tests and there are several rough edges that need to be worked out. was (Author: bdeggleston): [~benedict], I’d like your opinion on this. Instead of adding a generic {{Value}} wrapper, I tried using the same opaque value / value accessor pattern I’d used on CASSANDRA-15390 to generalize serialization logic. Code is [here|https://github.com/bdeggleston/cassandra/tree/15393-accessor]. Decoupling the serialization logic from the memory abstraction like this makes it possible to realize all the allocation improvements of using byte arrays for cell values without locking us into byte arrays or adding overhead of a wrapper object (which incurs a ~6% allocation penalty with these tests). We also avoid worrying about megamorphic call sites. As a bonus, this pattern means we can convert clustering prefix to use byte arrays as well without a lot of extra work, which saves an additional 5% of allocations, on top of the existing ~18% from cells. So that’s the good news, the bad news is that it touches a lot of serialization code, and is just a larger patch in general than simply converting cells to use byte arrays (although I’m not sure how well that would have worked with the slab allocator in practice). I still need to chase down a handful of failing tests and there are some rough edges around how value/accessors are handled via generic wildcards that need to be worked out. > Add byte array backed cells > --------------------------- > > Key: CASSANDRA-15393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction > Reporter: Blake Eggleston > Assignee: Blake Eggleston > Priority: Normal > Fix For: 4.0 > > > We currently materialize all values as on heap byte buffers. Byte buffers > have a fairly high overhead given how frequently they’re used, and on the > compaction and local read path we don’t do anything that needs them. Use of > byte buffer methods only happens on the coordinator. Using cells that are > backed by byte arrays instead in these situations reduces compaction and read > garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org