[jira] [Commented] (CASSANDRA-15393) Add byte array backed cells

Caleb Rackliffe (Jira) Fri, 28 Aug 2020 20:27:31 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186875#comment-17186875
 ]


Caleb Rackliffe commented on CASSANDRA-15393:
---------------------------------------------

[~bdeggleston] There are the [items 
above|https://issues.apache.org/jira/browse/CASSANDRA-15393?focusedCommentId=17185489&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17185489]
 to resolve, of course, but here's my full feedback on the latest raft of 
commits...

*Complexity*

- The semantics of {{ValueAccessor#size()}} are a little odd that it always 
indicates the length of the array in the byte array case, but the number of 
bytes left in a byte buffer. Some of the usages are just checking if the value 
is empty, which makes me wonder if having an isEmpty() get rid of some of the 
awkwardness. On the other hand, most usages occur right before we read.
- There are some switch statements we might be able to avoid in the two 
accessor implementations if we encode the {{copy()}} and {{compare}} strategies 
in {{BackingKind}}.
- How valuable is {{ValueAccessor.copy()}} when we can make calls like 
{{srcAccessor.copyTo(src, srcOffset, dst, dstAccessor, dstOffset, size)}} call 
directly from {{CounterContext}}? Same for {{compareUnsigned()}}. Want to make 
sure we aren't cluttering the accessor interface.
- This is really an extension of the conversation above, but I'm worried that a 
handful of places where we only use {{ByteBuffer}} being changed to use the 
more abstract value/accessor combination are not only problematic for 
readability, but are adding more surface area for testing. 
{{TupleType#split()}}, for instance, might not be covered too well, but we 
could probably avoid additional tests and get to the immediate goal faster here 
if we simply stay in {{ByteBuffer}} space and avoid the new offset logic in the 
first place. Other examples of this could be 
{{CounterContext#clearAllLocal()}}, {{ClusteringPrefix.Deserializer}} (byte 
array only), and {{Clustering.Serializer#deserialize()}}.

*Formatting*

- Horizontal alignment in {{ClusteringIndexNamesFilter#forPaging()}} on the 
ternary.
- Might have to line wrap some parameter lists in 
{{RangeTombstoneBoundaryMarker}}.

*Documentation*

- Brief JavaDoc for {{ByteBufferAccessor#copyTo()}} might be helpful. 
- JavaDoc for {{Unfiltered#clustering()}}, perhaps just to explain that we 
couldn't just use the {{clustering()}} signature in {{Clusterable}}?
- What level of JavaDoc do we want for {{ValueAccessor}} and {{ObjectFactory}} 
at the interface and method level? Certainly a lot of these methods are pretty 
straightforward, but it's also a pretty critical public interface.

*Miscellaneous*

- A number of places that call {{serializeBuffer()}} with both arguments could 
just call the version with one argument.
- There are some cases, like in {{CassandraIndex.Indexer#indexCell()}}, where 
it would be nice to bind {{Clustering}} and {{Cell}} to the same type, but I 
couldn't find a clean way to do it.
- {{BufferClusteringBoundOrBoundary#exclusiveCloseInclusiveOpen()}} looks 
unused.
- {{CollectionSerializer}} has quite a few methods that don't do anything w/ 
{{version}}. (Perhaps there will be another serialized form later, so I won't 
argue hard to remove.)
- {{SinglePartitionSliceBuilder#sliceBuilder}} looks like it can be {{final}}.
- The {{Relationship}} enum in {{CounterContext}} doesn't need to have a 
{{static}} modifier?
- {{ByteArrayUtil#writeWithLength()}} is unused.
- What about renaming {{ComparatorSet}} to {{ValueComparators}} and the field 
in {{AbstractType}} to {{comparators}} (which I think we do in one other place 
already)?
- What are the {{// FIXME: value input buffer}} comments in 
{{DurationSerializer}} about?
- {{putByte()}}, {{putDouble()}}, {{putFloat()}}, {{putLong()}} are neither 
used nor tested. (The underlying {{ByteArrayUtil}} stuff lingering even if 
unused seems less an issue.)
- {{import java.nio.ByteBuffer}} use unused in {{RangeTombstoneMarker}}.

*Testing*

- Did we want to make {{CQL3TypeLiteralTest}} deterministic with the fixed zero 
seed?
- {{ByteArrayUtilTest}} is empty?
- {{logger}} is unused in {{ReadCommandTest}}
- The new randomized tests here (ex. {{ValueAccessorTest}}) might be able to 
reuse some of the infrastructure [~dcapwell] created in CASSANDRA-16064 
(generating random primitives, etc.) It might also just be nice to make QT the 
standard way we do randomization for true unit tests (free seed printing, etc.) 
I'm not making a hard suggestion that we re-write, since the new tests look 
pretty straightforward as is.

> Add byte array backed cells
> ---------------------------
>
>                 Key: CASSANDRA-15393
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0-beta
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We currently materialize all values as on heap byte buffers. Byte buffers 
> have a fairly high overhead given how frequently they’re used, and on the 
> compaction and local read path we don’t do anything that needs them. Use of 
> byte buffer methods only happens on the coordinator. Using cells that are 
> backed by byte arrays instead in these situations reduces compaction and read 
> garbage up to 22% in many cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-15393) Add byte array backed cells

Reply via email to