[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175656#comment-17175656 ]
Blake Eggleston commented on CASSANDRA-15393:
---------------------------------------------

bq. I'm sorry, if my objections sound harsh. But my point is that it's better to fix the whole heap-pressure-nightmare with (de)serialization (reads/writes/re-serializations/compaction/etc) in the next major release.

It’s fine, we should be talking about these things. I disagree with the idea that we should delay improving compaction allocations because we could implement a better solution at some point in the future. It's a textbook example of “letting perfect be the enemy of good enough”. The C* project has a problem with favoring rewrites over incremental improvements. Compaction heap pressure is a real operational problem that causes a lot of headaches, and there is real value in mitigating it as part of 4.0, even if it can be further improved in the future.

There's certainly risk here, but I think it's being a bit overstated. The changes here are wide, but not particularly deep. The most complex parts are probably the collection serializers and other places where we're now having to do offset bookkeeping. These should be carefully reviewed, but they're hardly unverifiable.

bq. Unrelatedly, [Blake Eggleston](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=bdeggleston), is there a reason you didn't go the whole hog and just get rid of the ByteBuffer versions of everything?

1) That would have been a much larger change, and I wanted to limit scope. IIRC, replacing partition keys would have been a lot of work for a comparatively small GC win.
2) ByteBuffers are still useful in some places, specifically where we're using allocators.
3) I've been working on a flyweight reader in my free time that reduces garbage by another 95% and uses ByteBuffers. This will be ready for 4.next, but using it should be optional.
4) There's value in decoupling data format from data logic.
For instance, this would allow us to compare native and bytebuffer values without requiring the allocation of an intermediate buffer.

> Add byte array backed cells
> ---------------------------
>
>                 Key: CASSANDRA-15393
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0-beta
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We currently materialize all values as on heap byte buffers. Byte buffers
> have a fairly high overhead given how frequently they’re used, and on the
> compaction and local read path we don’t do anything that needs them. Use of
> byte buffer methods only happens on the coordinator. Using cells that are
> backed by byte arrays instead in these situations reduces compaction and read
> garbage up to 22% in many cases.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
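
As a rough illustration of point 4 in the comment above (decoupling data format from data logic), here is a minimal Java sketch of comparing values with different backing stores without copying either one into an intermediate buffer. All names here (`BytesAccessor`, `ArrayAccessor`, `BufferAccessor`) are hypothetical and are not taken from the patch; the sketch covers only `byte[]` and `ByteBuffer`, and a native-memory accessor would presumably be a third implementation of the same interface.

```java
import java.nio.ByteBuffer;

/** Hypothetical accessor: knows how to read bytes out of a value of type V. */
interface BytesAccessor<V> {
    int size(V value);

    byte getByte(V value, int offset);

    /**
     * Unsigned lexicographic comparison of two values that may use
     * different backing types. Reads byte by byte through the accessors,
     * so nothing is copied or allocated.
     */
    static <L, R> int compare(L left, BytesAccessor<L> la, R right, BytesAccessor<R> ra) {
        int common = Math.min(la.size(left), ra.size(right));
        for (int i = 0; i < common; i++) {
            int cmp = Integer.compare(la.getByte(left, i) & 0xFF, ra.getByte(right, i) & 0xFF);
            if (cmp != 0)
                return cmp;
        }
        // Equal on the shared prefix: the shorter value sorts first.
        return Integer.compare(la.size(left), ra.size(right));
    }
}

/** Accessor for values held directly in a byte array. */
final class ArrayAccessor implements BytesAccessor<byte[]> {
    static final ArrayAccessor INSTANCE = new ArrayAccessor();

    public int size(byte[] value) { return value.length; }

    public byte getByte(byte[] value, int offset) { return value[offset]; }
}

/** Accessor for values held in a ByteBuffer (reads relative to its position). */
final class BufferAccessor implements BytesAccessor<ByteBuffer> {
    static final BufferAccessor INSTANCE = new BufferAccessor();

    public int size(ByteBuffer value) { return value.remaining(); }

    public byte getByte(ByteBuffer value, int offset) {
        // Absolute get: does not move the buffer's position.
        return value.get(value.position() + offset);
    }
}
```

With something along these lines, comparison logic is written once against the accessor and can be called as `BytesAccessor.compare(array, ArrayAccessor.INSTANCE, buffer, BufferAccessor.INSTANCE)`, regardless of how each side stores its bytes.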