[ 
https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013292#comment-17013292
 ] 

Benedict Elliott Smith commented on CASSANDRA-15389:
----------------------------------------------------

bq. Good idea about complex data accumulation in collect stats. I ended up 
adding and using BiLongAccumulator to make all cases garbage free.
 
Sounds great!  Though looking at the code currently posted, I think it’s still 
capturing a reference to the {{collector}}?  I think you may have simply not 
posted your latest changes.

bq. This seems like it makes sense, but everything may just be looking like a 
nail to me, so let me know if you have a better idea
 
My personal view is it's absolutely fine to duplicate code somewhere like this, 
and I’ve done that in some patches I'll be posting soon.  I think clarity for 
the reader is also valuable, and if the methods are maintained next to each 
other and clearly marked as copy/paste variants because we don't have a macro 
language, I think that's fine and probably preferable.
 
However, I think in this case it's fine either way.  Looking at your 
implementation I realise it’s possible to simply define {{accept(Consumer<V> c, 
V v)}} in terms of {{accept(BiConsumer<V1,V2> c, V1 v1, V2 v2)}} by simply 
invoking {{accept(Consumer::accept, c, v)}}, which should be more than clear 
enough.

It’s a shame this is a recursive function so we won’t be able to benefit from 
inlining, and will probably incur an extra virtual invocation cost.  It might 
be that we anyway win by splitting implementations, but it may not be worth 
overthinking.
 
FWIW, I think it might be helpful to treat {{apply}} to the same restructure 
you’ve done to {{accumulate}} - which helped me a great deal in reading that.
 
{{isStopSentinel}}: do we use {{Long.MIN_VALUE}} anymore?  Should we also 
consider optionally permitting a stop sentinel to be provided, so we can make 
{{minDeletionTime}} more efficient?  I don’t think we have anywhere that needs 
more than one stop sentinel anymore, so I think it’s probably an acceptable 
complication.
 
nit: {{ComplexColumnData.accumulate}} should we call the parameter e.g. 
{{initialValue}} to distinguish it from those {{accumulate}} methods that take 
a value to start from in the tree?

> Minimize BTree iterator allocations
> -----------------------------------
>
>                 Key: CASSANDRA-15389
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0
>
>
> Allocations of BTree iterators contribute a lot amount of garbage to the 
> compaction and read paths.
> This patch removes most btree iterator allocations on hot paths by:
>  • using Row#apply where appropriate on frequently called methods 
> (Row#digest, Row#validateData
>  • adding BTree accumulate method. Like the apply method, this method walks 
> the btree with a function that takes and returns a long argument, this 
> eliminates iterator allocations without adding helper object allocations 
> (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize, 
> BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats, 
> UnfilteredSerializer#serializedRowBodySize) as well as eliminating the 
> allocation of helper objects in places where apply was used previously^[1]^.
>  • Create map of columns in SerializationHeader, this lets us avoid 
> allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%
>  
> [1] the memory test does measure memory allocated by lambdas capturing objects



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to