[ 
https://issues.apache.org/jira/browse/CASSANDRA-15213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013139#comment-17013139
 ] 

Benedict Elliott Smith commented on CASSANDRA-15213:
----------------------------------------------------

I'm inclined to agree that a simple approach is best, and simply offering a 
tunable parameter to modify the number of stripes on startup might be 
sufficient.  Here's some further minor feedback/suggestions:

# It's faster to use {{addAndGet}} than {{compareAndSet}} because it's 
guaranteed to succeed - the increment occurs whilst holding the cache line in 
an exclusive state, so it will always execute as quickly as the first failed 
{{compareAndSet}}
# Combined with this, it probably makes most sense to pick a "random" bucket 
using e.g. {{Thread.currentThread().getId() & (stripes-1)}}, to always increment
# Finally, it might make sense to "shuffle" the ordinary buckets as well, so 
that logically adjacent buckets are not spatially adjacent; since a histogram 
is _likely_ to receive updates to mostly oscillating around a given median 
point, this means there won't be any false-sharing competition between two 
updates to two logically adjacent buckets

I think at this point it then makes most sense to improve the benchmarks, in 
one case to account for memory latency of lookup, and in another to produce 
more realistic distributions of values.


> DecayingEstimatedHistogramReservoir Inefficiencies
> --------------------------------------------------
>
>                 Key: CASSANDRA-15213
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15213
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Metrics
>            Reporter: Benedict Elliott Smith
>            Assignee: Jordan West
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> * {{LongAdder}} introduced to trunk consumes 9MiB of heap without user 
> schemas, and this will grow significantly under contention and user schemas 
> with many tables.  This is because {{LongAdder}} is a very heavy class 
> designed for single contended values.  
>  ** This can likely be improved significantly, without significant loss of 
> performance in the contended case, by simply increasing the size of our 
> primitive backing array and providing multiple buckets, with each thread 
> picking a bucket to increment, or simply multiple backing arrays.  Probably a 
> better way still to do this would be to introduce some competition detection 
> to the update, much like {{LongAdder}} utilises, that increases the number of 
> backing arrays under competition.
>  ** To save memory this approach could partition the space into chunks that 
> are likely to be updated together, so that we do not need to duplicate the 
> entire array under competition.
>  * Similarly, binary search is costly and a measurable cost as a share of the 
> new networking work (without filtering it was > 10% of the CPU used overall). 
>  We can compute an approximation floor(log2 n / log2 1.2) extremely cheaply, 
> to save the random memory access costs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to