[ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986441#comment-16986441
 ] 

Blake Eggleston edited comment on CASSANDRA-15393 at 12/3/19 1:14 AM:
----------------------------------------------------------------------

[~benedict], I’d like your opinion on this. Instead of adding a generic 
{{Value}} wrapper, I tried using the same opaque value / value accessor pattern 
I’d used on CASSANDRA-15390 to generalize serialization logic. Code is 
[here|https://github.com/bdeggleston/cassandra/tree/15393-accessor].

Decoupling the serialization logic from the memory abstraction like this makes 
it possible to realize all the allocation improvements of using byte arrays for 
cell values without locking us into byte arrays or adding overhead of a wrapper 
object (which incurs a ~6% allocation penalty with these tests). We also avoid 
worrying about megamorphic call sites.

As a bonus, this pattern means we can convert clustering prefix to use byte 
arrays as well without a lot of extra work, which saves an additional 5% of 
allocations, on top of the existing ~18% from cells.

So that’s the good news, the bad news is that it touches a lot of serialization 
code, and is just a larger patch in general than simply converting cells to use 
byte arrays (although I’m not sure how well that would have worked with the 
slab allocator in practice). I still need to chase down a handful of failing 
tests and there are several rough edges that need to be worked out.


was (Author: bdeggleston):
[~benedict], I’d like your opinion on this. Instead of adding a generic 
{{Value}} wrapper, I tried using the same opaque value / value accessor pattern 
I’d used on CASSANDRA-15390 to generalize serialization logic. Code is 
[here|https://github.com/bdeggleston/cassandra/tree/15393-accessor].

Decoupling the serialization logic from the memory abstraction like this makes 
it possible to realize all the allocation improvements of using byte arrays for 
cell values without locking us into byte arrays or adding overhead of a wrapper 
object (which incurs a ~6% allocation penalty with these tests). We also avoid 
worrying about megamorphic call sites.

As a bonus, this pattern means we can convert clustering prefix to use byte 
arrays as well without a lot of extra work, which saves an additional 5% of 
allocations, on top of the existing ~18% from cells.

So that’s the good news, the bad news is that it touches a lot of serialization 
code, and is just a larger patch in general than simply converting cells to use 
byte arrays (although I’m not sure how well that would have worked with the 
slab allocator in practice). I still need to chase down a handful of failing 
tests and there are some rough edges around how value/accessors are handled via 
generic wildcards that need to be worked out.

> Add byte array backed cells
> ---------------------------
>
>                 Key: CASSANDRA-15393
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0
>
>
> We currently materialize all values as on heap byte buffers. Byte buffers 
> have a fairly high overhead given how frequently they’re used, and on the 
> compaction and local read path we don’t do anything that needs them. Use of 
> byte buffer methods only happens on the coordinator. Using cells that are 
> backed by byte arrays instead in these situations reduces compaction and read 
> garbage up to 22% in many cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to