GitHub user banjiewen opened a pull request:
https://github.com/apache/couchdb-mem3/pull/19
Improve mem3_sync event listener performance
There are three fundamental changes here, reflected in the last three
commits. These changes were motivated by observing and profiling mem3_sync's
event listener pid during load tests with high write throughput (10k writes
per second) on databases with high q values (300+).
We could conceivably pick and choose any of the three approaches here. The
`read_concurrency` flag seems like a no-brainer and the `ets:select/2` switch
looks like a win in my limited testing.
I'm ambivalent about the frequency/delay work; it turned out surprisingly
subtle (to avoid dropping the "last" update on an otherwise-idle host) and
required more LOC than I'd preferred, since the simplest solution wasn't
compatible with the way the event listener was being spawned previously.
COUCHDB-2984
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/banjiewen/couchdb-mem3 2984-mem3-sync-event-listener-perf
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/couchdb-mem3/pull/19.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19
----
commit d3ce2273c0c1eba5b4107e7bb0a83aaa1736cc6a
Author: Benjamin Anderson
Date: 2016-04-10T03:55:58Z
Refactor mem3_sync events to dedicated module
COUCHDB-2984
commit 78caab5ff91cb743da9a8dc211d6e814dfead120
Author: Benjamin Anderson
Date: 2016-04-10T05:44:58Z
Reduce frequency of mem3_sync:push/2 calls
In high-throughput scenarios on databases with large q values the
mem3_sync event listener becomes overloaded with messages due to the
poor performance of the shard selection logic.
It's not strictly necessary to sync on every update, but we do need to
be careful not to lose updates by keeping history too naively. This
patch adds a configurable delay and push frequency to reduce pressure on
the mem3_sync event listener.
COUCHDB-2984
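
To make the delay/frequency idea concrete, here is a minimal sketch in Python rather than Erlang. It is not the code in this patch: the class name, the `delay` and `frequency` parameters, and their exact semantics are assumptions made for illustration. It shows one way to coalesce repeated updates per shard while still guaranteeing that the final update on an otherwise idle host is eventually pushed, because pending entries are flushed by a periodic tick rather than by the next incoming event.

```python
class CoalescingPusher:
    """Hypothetical model of a throttled push scheduler (not the mem3 code).

    Updates are recorded per shard and pushed from a periodic tick, so a
    burst of updates to one shard results in a single push.
    """

    def __init__(self, push, delay=5.0, frequency=1.0):
        self.push = push            # callback invoked once per due shard
        self.delay = delay          # assumed: seconds an update waits before it is due
        self.frequency = frequency  # assumed: minimum seconds between flushes
        self.pending = {}           # shard name -> time of first unflushed update
        self.last_flush = float("-inf")

    def updated(self, shard, now):
        # Repeated updates to the same shard collapse into one pending entry.
        self.pending.setdefault(shard, now)

    def tick(self, now):
        """Run from a timer, independent of incoming events.

        Returns the list of shards pushed on this tick.
        """
        if now - self.last_flush < self.frequency:
            return []
        due = [s for s, t in self.pending.items() if now - t >= self.delay]
        for shard in due:
            del self.pending[shard]
            self.push(shard)
        if due:
            self.last_flush = now
        return due


# Usage: a burst of 1000 updates to one shard yields a single push.
pushed = []
pusher = CoalescingPusher(pushed.append, delay=5.0, frequency=1.0)
for i in range(1000):
    pusher.updated("shard-a", now=i * 0.001)
pusher.tick(now=5.0)
```

Because the flush is driven by the timer, an update that arrives just before the host goes idle stays in `pending` until a later tick pushes it; nothing depends on a subsequent event arriving.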
commit 8e800221ba85593c33c611c91ea1372982aa2956
Author: Benjamin Anderson
Date: 2016-04-10T06:08:39Z
Use ets:select/2 to retrieve shards by name
The result of mem3_shards:for_db/1 on databases with high q values can
be very large, resulting in suboptimal performance for high-volume
callers.
mem3_sync_event_listener is only interested in a small subset of the
result of mem3_shards:for_db/1; moving this filter into an ets:select/2
call improves performance significantly.
COUCHDB-2984
commit 0f43fa8136ce83fc6aa775204723b23dea0d325e
Author: Benjamin Anderson
Date: 2016-04-10T06:21:58Z
Add read_concurrency option to mem3_shards table
This table sees a great deal of activity from various subsystems -
turning on read_concurrency should be a win.
COUCHDB-2984
----