[jira] [Commented] (OAK-7262) LockBasedScheduler#getHeadNodeState poor performance due to lock contention in commitTimeHistogram implementation

Andrei Dulceanu (JIRA) Tue, 13 Feb 2018 05:44:50 -0800

    [ 
https://issues.apache.org/jira/browse/OAK-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362338#comment-16362338
 ]


Andrei Dulceanu commented on OAK-7262:
--------------------------------------

[~frm], looking at possible {{Reservoir}} implementations needed for backing up 
the {{Histogram}} used for bookkeeping the commit time, I came across of 
{{UniformReservoir}}, which provides a random sampling reservoir, "which offers 
a 99.9% confidence level with a 5% margin of error assuming a normal 
distribution" (according to the JavaDoc). I think this is a reasonable choice, 
since it uses under the hood an {{AtomicLongArray}} (no locks acquired for 
important operations such as #size, #getSnapshot, etc.).

My idea is to replace the implementation and start measuring the effect by 
running some of our benchmarks (patched vs trunk) to asses whether there's a 
performance gain or not.

WDYT?

> LockBasedScheduler#getHeadNodeState poor performance due to lock contention 
> in commitTimeHistogram implementation
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: OAK-7262
>                 URL: https://issues.apache.org/jira/browse/OAK-7262
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Major
>             Fix For: 1.9.0, 1.10
>
>
> With OAK-4732, we slightly prioritised reads, allowing callers of 
> {{LockBasedScheduler#getHeadNodeState}} to wait for a certain amount of time 
> on the commit semaphore, in order to fetch the latest head. The amount of 
> time to wait for was computed as the median of the latest 1000 commits, but 
> the implementation was flexible enough to replace the 0.5 quantile with a 
> different value.
> The initial implementation used {{SynchronizedDescriptiveStatistics}} from 
> {{commons-math3}}. In OAK-6430, we replaced that with a {{Histogram}} backed 
> by a {{SlindingWindowReservoir}} from the dropwizard metric library 
> ({{io.dropwizard.metrics}}). 
> One of the problems observed with the current implementation is that, under 
> heavy loading, the performance of {{LockBasedScheduler#getHeadNodeState}} 
> decreases due to lock contention in {{commitTimeHistogram}} implementation. 
> Namely,  {{SlidingWindowReservoir#getSnapshot}} seems to be a bottleneck. It 
> copies an array of 1000 elements at each invocation of 
> {{LockBasedScheduler#getHeadNodeState}}, acquiring and releasing a lock for 
> each element in the array.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (OAK-7262) LockBasedScheduler#getHeadNodeState poor performance due to lock contention in commitTimeHistogram implementation

Reply via email to