[ 
https://issues.apache.org/jira/browse/COUCHDB-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233915#comment-15233915
 ] 

ASF GitHub Bot commented on COUCHDB-2984:
-----------------------------------------

GitHub user banjiewen opened a pull request:

    https://github.com/apache/couchdb-mem3/pull/19

    Improve mem3_sync event listener performance

    There are three fundamental changes here, reflected in the latter three 
commits. These changes were motivated by observing and profiling mem3_sync's 
event listener pid during high (10k per second) write throughput load tests on 
databases with high (300+) q values.
    
    We could conceivably pick and choose any of the three approaches here.  The 
`read_concurrency` flag seems like a no-brainer and the `ets:select/2` switch 
looks like a win in my limited testing.
    
    I'm ambivalent about the frequency/delay work; it turned out surprisingly 
subtle (to avoid dropping the "last" update on an otherwise-idle host) and 
required more LOC than I'd preferred, since the simplest solution wasn't 
compatible with the way the event listener was being spawned previously.
    
    COUCHDB-2984

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/banjiewen/couchdb-mem3 2984-mem3-sync-event-listener-perf

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-mem3/pull/19.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19
    
----
commit d3ce2273c0c1eba5b4107e7bb0a83aaa1736cc6a
Author: Benjamin Anderson <[email protected]>
Date:   2016-04-10T03:55:58Z

    Refactor mem3_sync events to dedicated module
    
    COUCHDB-2984

commit 78caab5ff91cb743da9a8dc211d6e814dfead120
Author: Benjamin Anderson <[email protected]>
Date:   2016-04-10T05:44:58Z

    Reduce frequency of mem3_sync:push/2 calls
    
    In high-throughput scenarios on databases with large q values the
    mem3_sync event listener becomes overloaded with messages due to the
    poor performance of the shard selection logic.
    
    It's not strictly necessary to sync on every update, but we do need to
    be careful not to lose updates by keeping history too naively. This
    patch adds a configurable delay and push frequency to reduce pressure on
    the mem3_sync event listener.
    
    COUCHDB-2984

commit 8e800221ba85593c33c611c91ea1372982aa2956
Author: Benjamin Anderson <[email protected]>
Date:   2016-04-10T06:08:39Z

    Use ets:select/2 to retrieve shards by name
    
    The result of mem3_shards:for_db/1 on databases with high q values can
    be very large, resulting in suboptimal performance for high-volume
    callers.
    
    mem3_sync_event_listener is only interested in a small subset of the
    result of mem3_shards:for_db/1; moving this filter into an ets:select/2
    call improves performance significantly.
    
    COUCHDB-2984

commit 0f43fa8136ce83fc6aa775204723b23dea0d325e
Author: Benjamin Anderson <[email protected]>
Date:   2016-04-10T06:21:58Z

    Add read_concurrency option to mem3_shards table
    
    This table sees a great deal of activity from various subsystems -
    turning on read_concurrency should be a win.
    
    COUCHDB-2984

----
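
As a rough illustration of the ets:select/2 and read_concurrency commits listed above, the following minimal sketch contrasts fetching every shard for a database and filtering in the caller with letting ETS filter by shard name through a match specification. The module name, the table name, the three-field record, and the sample shard names are all assumptions made for this example; the real mem3 #shard{} record and mem3_shards table differ, and this is not the code from the pull request.

```erlang
-module(shard_select_sketch).
-export([demo/0]).

%% Hypothetical, simplified record; the real mem3 #shard{} has more fields.
-record(shard, {name, node, dbname}).

demo() ->
    %% A bag keyed on dbname, created with read_concurrency enabled.
    Tab = ets:new(shards_sketch,
                  [bag, {keypos, #shard.dbname}, {read_concurrency, true}]),
    ets:insert(Tab, [
        #shard{name = <<"shards/00000000-7fffffff/db1">>, node = n1, dbname = <<"db1">>},
        #shard{name = <<"shards/00000000-7fffffff/db1">>, node = n2, dbname = <<"db1">>},
        #shard{name = <<"shards/80000000-ffffffff/db1">>, node = n1, dbname = <<"db1">>}
    ]),
    Name = <<"shards/00000000-7fffffff/db1">>,

    %% Approach 1: copy every shard for the database out of ETS,
    %% then filter in the calling process.
    All = ets:lookup(Tab, <<"db1">>),
    Filtered = [S || #shard{name = N} = S <- All, N =:= Name],

    %% Approach 2: let ETS filter with a match specification, so only
    %% the matching records are copied into the calling process.
    Spec = [{#shard{name = Name, dbname = <<"db1">>, _ = '_'}, [], ['$_']}],
    Selected = ets:select(Tab, Spec),

    %% Both approaches return the same set of records.
    true = lists:sort(Filtered) =:= lists:sort(Selected),
    ets:delete(Tab),
    Selected.
```

Calling shard_select_sketch:demo() returns the two records whose name matches. With only three rows the two approaches cost about the same; the difference should matter only when the per-database shard list is large, as in the high-q case described in this issue.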


> mem3_sync event listener performance degrades with high q values
> ----------------------------------------------------------------
>
>                 Key: COUCHDB-2984
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2984
>             Project: CouchDB
>          Issue Type: Improvement
>            Reporter: Benjamin Anderson
>
> High throughput applications on databases with high (300+) q values have a 
> tendency to cause very poor performance. While I don't fully understand the 
> issue at hand, one clear manifestation is in mem3_sync's event listener. With 
> high q values, the shard "selection" routine (the <<"shards/", _/binary>> 
> head of handle_event/3) will bottleneck on calls to mem3_shards:for_db/1 due 
> to the large (tens of KB) shard maps in ETS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
