[ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175656#comment-17175656
 ] 

Blake Eggleston commented on CASSANDRA-15393:
---------------------------------------------

bq. I'm sorry, if my objections sound harsh. But my point is that it's better 
to fix the whole heap-pressure-nightmare with (de)serialization 
(reads/writes/re-serializations/compaction/etc) in the next major release.

It’s fine, we should be talking about these things. I disagree with the idea 
that we should delay improving compaction allocations because we could 
implement a better solution at some point in the future. It's a textbook 
example of “letting perfect be the enemy of good enough”. The C* project has a 
problem with favoring rewrites over incremental improvements. Compaction 
heap pressure is a real operational problem that causes a lot of headaches, and 
there is real value in mitigating it as part of 4.0, even if it can be further 
improved in the future. 

There's certainly risk here, but I think it's being a bit overstated. The 
changes here are wide, but not particularly deep. The most complex parts are 
probably the collection serializers and other places where we're now having to 
do offset bookkeeping. These should be carefully reviewed, but they're hardly 
unverifiable.
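
To make "offset bookkeeping" concrete, here is a minimal sketch, not code from the patch. It assumes a hypothetical layout (a 4-byte big-endian element count, then each element prefixed by a 4-byte length); the class and method names are made up for illustration. A ByteBuffer tracks its own position, whereas with a plain byte array the reader has to advance and bounds-check the offset by hand, and that is the part that needs careful review.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not Cassandra's serializer or wire format.
final class OffsetBookkeepingSketch {
    // Read a big-endian int at an explicit offset; the caller owns the offset.
    public static int readInt(byte[] src, int offset) {
        return ((src[offset] & 0xff) << 24)
             | ((src[offset + 1] & 0xff) << 16)
             | ((src[offset + 2] & 0xff) << 8)
             |  (src[offset + 3] & 0xff);
    }

    // Decode [count][len][bytes][len][bytes]... from a byte array.
    public static List<byte[]> readList(byte[] src) {
        int offset = 0;
        int count = readInt(src, offset);
        offset += 4; // skip the element count
        if (count < 0)
            throw new IllegalArgumentException("negative element count");

        List<byte[]> elements = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int length = readInt(src, offset);
            offset += 4; // skip the length prefix
            if (length < 0 || length > src.length - offset)
                throw new IllegalArgumentException("bad element length at offset " + offset);

            byte[] element = new byte[length];
            System.arraycopy(src, offset, element, 0, length);
            offset += length; // skip the element body
            elements.add(element);
        }
        return elements;
    }
}
```

Every `offset +=` line is a place where an off-by-one could hide, but each one is local and easy to cover with a round-trip test.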

bq. Unrelatedly, [Blake Eggleston](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=bdeggleston), 
is there a reason you didn't go the whole hog and just get rid of the 
ByteBuffer versions of everything?

1) That would have been a much larger change, and I wanted to limit scope. 
IIRC, replacing partition keys would have been a lot of work for a 
comparatively small gc win.
2) ByteBuffers are still useful in some places, specifically where we're 
using allocators.
3) I've been working on a flyweight reader in my free time that eliminates 
another 95% of garbage and uses ByteBuffers. This will be ready for 4.next, 
but using it should be optional.
4) There's value in decoupling data format from data logic. For instance, this 
would allow us to compare native and bytebuffer values without requiring the 
allocation of an intermediate buffer.
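
As a rough illustration of point 4, here is a minimal sketch, not the API from the patch. It assumes a small hypothetical `Accessor` interface that knows how to read bytes out of a given backing type, so the comparison logic is written once and never copies either side into a temporary buffer. All names are made up for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: the names here are not taken from the patch.
final class AccessorSketch {
    // Reads bytes out of a value of backing type V.
    public interface Accessor<V> {
        int size(V value);
        byte getByte(V value, int offset);
    }

    public static final Accessor<byte[]> ARRAY = new Accessor<byte[]>() {
        public int size(byte[] value) { return value.length; }
        public byte getByte(byte[] value, int offset) { return value[offset]; }
    };

    public static final Accessor<ByteBuffer> BUFFER = new Accessor<ByteBuffer>() {
        public int size(ByteBuffer value) { return value.remaining(); }
        // Absolute get: does not move the buffer's position.
        public byte getByte(ByteBuffer value, int offset) {
            return value.get(value.position() + offset);
        }
    };

    // Unsigned lexicographic comparison across two possibly different backing types.
    public static <L, R> int compare(L left, Accessor<L> la, R right, Accessor<R> ra) {
        int common = Math.min(la.size(left), ra.size(right));
        for (int i = 0; i < common; i++) {
            int l = la.getByte(left, i) & 0xff;
            int r = ra.getByte(right, i) & 0xff;
            if (l != r)
                return Integer.compare(l, r);
        }
        return Integer.compare(la.size(left), ra.size(right));
    }
}
```

A call such as `AccessorSketch.compare(bytes, AccessorSketch.ARRAY, buffer, AccessorSketch.BUFFER)` compares a byte array against a ByteBuffer directly; a third accessor for native memory could be added without touching `compare`.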

> Add byte array backed cells
> ---------------------------
>
>                 Key: CASSANDRA-15393
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0-beta
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We currently materialize all values as on-heap byte buffers. Byte buffers 
> have a fairly high overhead given how frequently they’re used, and on the 
> compaction and local read path we don’t do anything that needs them. Use of 
> byte buffer methods only happens on the coordinator. Using cells that are 
> backed by byte arrays instead in these situations reduces compaction and read 
> garbage by up to 22% in many cases.


