[ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175291#comment-17175291
 ] 

Robert Stupp commented on CASSANDRA-15393:
------------------------------------------

This one compiles without errors just by omitting the generic type information:
{code}
        ValueAccessor acc = ByteArrayAccessor.instance;
        CounterContext.headerLength(ByteBuffer.allocate(10), acc);
{code}
^ This is a verbose example e.g. of the method signatures 
{{o.a.c.db.rows.RangeTombstoneMarker#openBound}} + {{closeBound}} / 
{{o.a.c.db.Clusterable#clustering}} - and actual production code like the above 
(e.g. in {{o.a.c.db.MutableDeletionInfo.Builder#add}}). It seems that the 
existing production code base including the unchanged parts with all call sites 
need a full re-review to validate that this change is safe.

Another source of errors is that positions & offsets are manually handled now. 
For example 
[here|https://github.com/apache/cassandra/pull/712/files#diff-a9c2759ec4c4663dd4a24d5e0f89358fR225]
 or 
[here|https://github.com/apache/cassandra/pull/712/files#diff-1e045cbe2fa513c1d2e7ac63c1b764ccR58].
 Just think about merging a change from the 3.x line into trunk but not adding 
the necessary position/offsets arithmetics.

As I said, there's definitive value in reducing heap pressure in critical hot 
paths. But my concern that the approach is error prone due to the issues 
mentioned here still stands.
It might have been less of an issue, if the whole code base would have proper 
unit testing for all production code paths with all potential inputs or even 
fuzzy testing, but that's unfortunately not the case.

The risks are:
* Mis-computing offsets+positions can lead to potentially unnoticed data 
corruption, overflows, etc that initially go unnoticed
* Spurious new runtime errors in rare situations (barely tested ones) leading 
to ClassCastExceptions

I'm sorry, if my objections sound harsh. But my point is that it's better to 
fix the whole heap-pressure-nightmare with (de)serialization 
(reads/writes/re-serializations/compaction/etc) in the next major release.

> Add byte array backed cells
> ---------------------------
>
>                 Key: CASSANDRA-15393
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0-beta
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We currently materialize all values as on heap byte buffers. Byte buffers 
> have a fairly high overhead given how frequently they’re used, and on the 
> compaction and local read path we don’t do anything that needs them. Use of 
> byte buffer methods only happens on the coordinator. Using cells that are 
> backed by byte arrays instead in these situations reduces compaction and read 
> garbage up to 22% in many cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to