[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186875#comment-17186875 ]
Caleb Rackliffe commented on CASSANDRA-15393: --------------------------------------------- [~bdeggleston] There are the [items above|https://issues.apache.org/jira/browse/CASSANDRA-15393?focusedCommentId=17185489&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17185489] to resolve, of course, but here's my full feedback on the latest raft of commits... *Complexity* - The semantics of {{ValueAccessor#size()}} are a little odd that it always indicates the length of the array in the byte array case, but the number of bytes left in a byte buffer. Some of the usages are just checking if the value is empty, which makes me wonder if having an isEmpty() get rid of some of the awkwardness. On the other hand, most usages occur right before we read. - There are some switch statements we might be able to avoid in the two accessor implementations if we encode the {{copy()}} and {{compare}} strategies in {{BackingKind}}. - How valuable is {{ValueAccessor.copy()}} when we can make calls like {{srcAccessor.copyTo(src, srcOffset, dst, dstAccessor, dstOffset, size)}} call directly from {{CounterContext}}? Same for {{compareUnsigned()}}. Want to make sure we aren't cluttering the accessor interface. - This is really an extension of the conversation above, but I'm worried that a handful of places where we only use {{ByteBuffer}} being changed to use the more abstract value/accessor combination are not only problematic for readability, but are adding more surface area for testing. {{TupleType#split()}}, for instance, might not be covered too well, but we could probably avoid additional tests and get to the immediate goal faster here if we simply stay in {{ByteBuffer}} space and avoid the new offset logic in the first place. Other examples of this could be {{CounterContext#clearAllLocal()}}, {{ClusteringPrefix.Deserializer}} (byte array only), and {{Clustering.Serializer#deserialize()}}. *Formatting* - Horizontal alignment in {{ClusteringIndexNamesFilter#forPaging()}} on the ternary. - Might have to line wrap some parameter lists in {{RangeTombstoneBoundaryMarker}}. *Documentation* - Brief JavaDoc for {{ByteBufferAccessor#copyTo()}} might be helpful. - JavaDoc for {{Unfiltered#clustering()}}, perhaps just to explain that we couldn't just use the {{clustering()}} signature in {{Clusterable}}? - What level of JavaDoc do we want for {{ValueAccessor}} and {{ObjectFactory}} at the interface and method level? Certainly a lot of these methods are pretty straightforward, but it's also a pretty critical public interface. *Miscellaneous* - A number of places that call {{serializeBuffer()}} with both arguments could just call the version with one argument. - There are some cases, like in {{CassandraIndex.Indexer#indexCell()}}, where it would be nice to bind {{Clustering}} and {{Cell}} to the same type, but I couldn't find a clean way to do it. - {{BufferClusteringBoundOrBoundary#exclusiveCloseInclusiveOpen()}} looks unused. - {{CollectionSerializer}} has quite a few methods that don't do anything w/ {{version}}. (Perhaps there will be another serialized form later, so I won't argue hard to remove.) - {{SinglePartitionSliceBuilder#sliceBuilder}} looks like it can be {{final}}. - The {{Relationship}} enum in {{CounterContext}} doesn't need to have a {{static}} modifier? - {{ByteArrayUtil#writeWithLength()}} is unused. - What about renaming {{ComparatorSet}} to {{ValueComparators}} and the field in {{AbstractType}} to {{comparators}} (which I think we do in one other place already)? - What are the {{// FIXME: value input buffer}} comments in {{DurationSerializer}} about? - {{putByte()}}, {{putDouble()}}, {{putFloat()}}, {{putLong()}} are neither used nor tested. (The underlying {{ByteArrayUtil}} stuff lingering even if unused seems less an issue.) - {{import java.nio.ByteBuffer}} use unused in {{RangeTombstoneMarker}}. *Testing* - Did we want to make {{CQL3TypeLiteralTest}} deterministic with the fixed zero seed? - {{ByteArrayUtilTest}} is empty? - {{logger}} is unused in {{ReadCommandTest}} - The new randomized tests here (ex. {{ValueAccessorTest}}) might be able to reuse some of the infrastructure [~dcapwell] created in CASSANDRA-16064 (generating random primitives, etc.) It might also just be nice to make QT the standard way we do randomization for true unit tests (free seed printing, etc.) I'm not making a hard suggestion that we re-write, since the new tests look pretty straightforward as is. > Add byte array backed cells > --------------------------- > > Key: CASSANDRA-15393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction > Reporter: Blake Eggleston > Assignee: Blake Eggleston > Priority: Normal > Fix For: 4.0-beta > > Time Spent: 20m > Remaining Estimate: 0h > > We currently materialize all values as on heap byte buffers. Byte buffers > have a fairly high overhead given how frequently they’re used, and on the > compaction and local read path we don’t do anything that needs them. Use of > byte buffer methods only happens on the coordinator. Using cells that are > backed by byte arrays instead in these situations reduces compaction and read > garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org