This is an automated email from the ASF dual-hosted git repository.

ronny pushed a commit to branch nouveau4win
in repository https://gitbox.apache.org/repos/asf/couchdb.git

commit 1712fc5d167772d5f7642c05866fd9ff39f33081
Author: Nick Vatamaniuc <[email protected]>
AuthorDate: Fri Jul 7 15:27:36 2023 -0400

    Replace Folsom and improve performance
    
    Folsom histograms are a major bottleneck under high concurrency, as
    described in #4650. This was noticed during performance testing, confirmed
    using Erlang VM lock counting, then verified by creating a test release
    with histogram update logic commented out [1].
    
    CouchDB doesn't use most of the Folsom statistics and metrics; we only use
    counters, gauges and one type of sliding-window sampling histogram.
    Instead of trying to re-design and update Folsom, which is a generic stats
    and metrics library, we take a simpler approach: implement just the three
    metric types we need, then remove the Folsom and Bear dependencies
    altogether.
    
    All the metric types we re-implement are based on two relatively new
    Erlang/OTP features: counters [2] and persistent terms [3]. Counters are
    mutable arrays of integers, which allow fast concurrent updates, and
    persistent terms allow fast, global, constant-time access to Erlang terms.
    
    Gauges and counters are implemented as counter arrays with one element.
    Histograms are represented as counter arrays where each array element is a
    histogram bin. Since we're dealing with sliding time window histograms, we
    have a tuple of counter arrays, where each time instant (each second) is a
    counter array. The overall histogram object then looks something like:
    
    ```
    Histogram = {
         1          = [1, 2, ..., ?BIN_COUNT]
         2          = [1, 2, ..., ?BIN_COUNT]
         ...
         TimeWindow = [1, 2, ..., ?BIN_COUNT]
    }
    ```
    
    To keep the structure immutable we need to set a limit on both the number
    of bins and the time window size. To limit the number of bins we need to
    set some minimum and maximum value limits. Since almost all our histograms
    record access times in milliseconds, we pick a range from 10 microseconds
    up to over one hour. Histogram bin widths increase exponentially in order
    to keep a reasonable precision across the whole range of values. This
    encoding is similar to how floating point numbers work. Additional details
    on how this works are described in the `couch_stats_histogram.erl` module.
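    As a rough illustration of the binning scheme (a Python sketch, not the
    actual Erlang code; the bin count and function name here are assumptions
    for illustration):

```python
import math

# Hypothetical constants: values in milliseconds, from 10 microseconds
# (0.01 ms) up to one hour, spread over an assumed 64 bins.
MIN_VAL = 0.01
MAX_VAL = 3_600_000.0
BIN_COUNT = 64

# Each successive bin covers BASE times more of the value range, so bin
# widths grow exponentially, much like floating point exponents.
BASE = (MAX_VAL / MIN_VAL) ** (1.0 / BIN_COUNT)

def bin_index(value):
    # Clamp into range, then take a logarithm to find the bin.
    value = min(max(value, MIN_VAL), MAX_VAL)
    return min(int(math.log(value / MIN_VAL, BASE)), BIN_COUNT - 1)

# Small values land in narrow bins, large values in wide ones.
assert bin_index(MIN_VAL) == 0
assert bin_index(MAX_VAL) == BIN_COUNT - 1
```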
    
    To keep the histogram object structure immutable, the time window is used
    in a circular fashion. The time parameter to the histogram `update/3`
    function is the monotonic clock time, and the histogram time index is
    computed as `Time rem TimeWindow`. So, as the monotonic time advances
    forward, the histogram time index will loop around. This comes with a
    minor annoyance: we have to allocate a larger time window to accommodate a
    process which cleans stale (expired) histogram entries, possibly with some
    extra buffer to ensure the interval currently being updated and the
    interval ready to be cleaned do not overlap. This periodic cleanup is
    performed in the couch_stats_server process.
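    The circular indexing can be sketched in Python (illustrative only; the
    actual implementation lives in `couch_stats_histogram.erl`, and the
    buffer size here is an assumption):

```python
# Seconds of data we actually report, plus extra slots so the interval
# being updated and the interval being cleaned never overlap.
INTERVAL = 10
CLEANUP_BUFFER = 3  # assumed size, for illustration
TIME_WINDOW = INTERVAL + CLEANUP_BUFFER

def time_index(monotonic_sec):
    # Equivalent of `Time rem TimeWindow`: as monotonic time advances,
    # the index wraps around the fixed-size window.
    return monotonic_sec % TIME_WINDOW

assert time_index(0) == 0
assert time_index(TIME_WINDOW) == 0      # wrapped around
assert time_index(TIME_WINDOW + 5) == 5
```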
    
    Besides performance, the new histograms have two other improvements over
    the Folsom ones:
    
      - They record every single value. The previous histograms sampled,
        recording mostly just the first 1024 values during each time instant
        (second).
    
      - They are mergeable. Multiple histograms can be merged with
        corresponding bins summed together. This could allow cluster-wide
        histogram summaries, or gathering histograms from individual
        processes and then combining them at the end in a central process.
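    The merge operation can be sketched in Python as follows (illustrative
    only; a histogram is modeled here as a list of per-second slots, each a
    list of bin counters):

```python
def merge(hist_a, hist_b):
    # Sum corresponding bins, slot by slot. Both histograms must share
    # the same time window and bin count for the merge to make sense.
    return [
        [a + b for a, b in zip(slot_a, slot_b)]
        for slot_a, slot_b in zip(hist_a, hist_b)
    ]

# Two tiny histograms with 2 time slots and 3 bins each:
h1 = [[1, 0, 2], [0, 3, 0]]
h2 = [[0, 1, 1], [2, 0, 4]]
assert merge(h1, h2) == [[1, 1, 3], [2, 3, 4]]
```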
    
    Another performance improvement in this commit is eliminating the need to
    periodically flush or scrape stats in the background in both the
    couch_stats and prometheus apps. Fetching stats from persistent terms and
    counters takes less than 5 milliseconds, and the sliding time window
    histograms always return the last 10 seconds of data no matter when the
    stats are queried, so this work is now done only when the stats are
    actually requested.
    
    Since the Folsom library was abstracted away behind a couch_stats API, the
    rest of the applications do not need to be updated. They still call
    `couch_stats:update_histogram/2`, `couch_stats:increment_counter/1`, etc.
    
    Previously couch_stats did not have any tests at all. Folsom and Bear had
    some tests, but I don't think we ever ran those test suites. To rectify
    the situation, this commit adds tests to cover the functionality. All the
    newly added or updated modules should have near or exactly 100% test
    coverage.
    
    [1] https://github.com/apache/couchdb/issues/4650#issue-1764685693
    [2] https://www.erlang.org/doc/man/counters.html
    [3] https://www.erlang.org/doc/man/persistent_term.html
---
 Makefile                                           |   2 +-
 Makefile.win                                       |   2 +-
 build-aux/print-committerlist.sh                   |   2 +-
 mix.exs                                            |   4 +-
 rebar.config.script                                |   1 -
 rel/reltool.config                                 |   4 -
 src/chttpd/src/chttpd_node.erl                     |  13 +-
 src/couch_prometheus/src/couch_prometheus.app.src  |   2 +-
 ..._prometheus_server.erl => couch_prometheus.erl} |  92 +----
 src/couch_prometheus/src/couch_prometheus.hrl      |  15 -
 src/couch_prometheus/src/couch_prometheus_http.erl |   5 +-
 src/couch_prometheus/src/couch_prometheus_sup.erl  |   4 +-
 src/couch_prometheus/src/couch_prometheus_util.erl |   2 -
 .../test/eunit/couch_prometheus_e2e_tests.erl      |  11 +-
 src/couch_stats/README.md                          |  21 +-
 src/couch_stats/src/couch_stats.app.src            |   2 +-
 src/couch_stats/src/couch_stats.erl                | 209 ++++++----
 src/couch_stats/src/couch_stats.hrl                |  14 -
 src/couch_stats/src/couch_stats_aggregator.erl     | 162 --------
 src/couch_stats/src/couch_stats_counter.erl        |  67 +++
 src/couch_stats/src/couch_stats_gauge.erl          |  54 +++
 src/couch_stats/src/couch_stats_histogram.erl      | 457 +++++++++++++++++++++
 src/couch_stats/src/couch_stats_httpd.erl          |   9 -
 src/couch_stats/src/couch_stats_math.erl           | 406 ++++++++++++++++++
 src/couch_stats/src/couch_stats_server.erl         | 250 +++++++++++
 src/couch_stats/src/couch_stats_sup.erl            |   2 +-
 src/couch_stats/src/couch_stats_util.erl           | 190 +++++++++
 27 files changed, 1592 insertions(+), 410 deletions(-)

diff --git a/Makefile b/Makefile
index 8cd1d6f68..12c433056 100644
--- a/Makefile
+++ b/Makefile
@@ -73,7 +73,7 @@ DESTDIR=
 
 # Rebar options
 apps=
-skip_deps=folsom,meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse
+skip_deps=meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse
 suites=
 tests=
 
diff --git a/Makefile.win b/Makefile.win
index 39e4fd6c9..4c8cb6f04 100644
--- a/Makefile.win
+++ b/Makefile.win
@@ -78,7 +78,7 @@ DESTDIR=
 
 # Rebar options
 apps=
-skip_deps=folsom,meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse,local
+skip_deps=meck,mochiweb,triq,proper,snappy,bcrypt,hyper,ibrowse,local
 suites=
 tests=
 
diff --git a/build-aux/print-committerlist.sh b/build-aux/print-committerlist.sh
index f6abc4c78..ba3eddaa9 100755
--- a/build-aux/print-committerlist.sh
+++ b/build-aux/print-committerlist.sh
@@ -40,7 +40,7 @@ function get_contributors {
 
 function print_comitter_list {
   # list of external repos that we exclude
-  local EXCLUDE=("bear" "folsom" "goldrush" "ibrowse" "jiffy" "lager" "meck" "mochiweb" "snappy")
+  local EXCLUDE=("goldrush" "ibrowse" "jiffy" "lager" "meck" "mochiweb" "snappy")
   local EXCLUDE=$(printf "\|%s" "${EXCLUDE[@]}")
   local EXCLUDE=${EXCLUDE:2}
   local SUBREPOS=$(ls src/ | grep -v "$EXCLUDE")
diff --git a/mix.exs b/mix.exs
index a2102aac6..7adb0e318 100644
--- a/mix.exs
+++ b/mix.exs
@@ -146,7 +146,6 @@ defmodule CouchDBTest.Mixfile do
       "unicode_util_compat",
       "b64url",
       "exxhash",
-      "bear",
       "mochiweb",
       "snappy",
       "rebar",
@@ -155,8 +154,7 @@ defmodule CouchDBTest.Mixfile do
       "meck",
       "khash",
       "hyper",
-      "fauxton",
-      "folsom"
+      "fauxton"
     ]
 
     deps |> Enum.map(fn app -> "src/#{app}" end)
diff --git a/rebar.config.script b/rebar.config.script
index d5ff1193a..9098e6db7 100644
--- a/rebar.config.script
+++ b/rebar.config.script
@@ -152,7 +152,6 @@ DepDescs = [
 {fauxton,          {url, "https://github.com/apache/couchdb-fauxton"},
                    {tag, "v1.2.9"}, [raw]},
 %% Third party deps
-{folsom,           "folsom",           {tag, "CouchDB-0.8.4"}},
 {hyper,            "hyper",            {tag, "CouchDB-2.2.0-7"}},
 {ibrowse,          "ibrowse",          {tag, "CouchDB-4.4.2-5"}},
 {jiffy,            "jiffy",            {tag, "CouchDB-1.0.9-2"}},
diff --git a/rel/reltool.config b/rel/reltool.config
index d84ef597c..da9ad6d3b 100644
--- a/rel/reltool.config
+++ b/rel/reltool.config
@@ -28,7 +28,6 @@
         %% couchdb
         b64url,
         exxhash,
-        bear,
         chttpd,
         config,
         couch,
@@ -46,7 +45,6 @@
         dreyfus,
         ets_lru,
         fabric,
-        folsom,
         global_changes,
         hyper,
         ibrowse,
@@ -92,7 +90,6 @@
     %% couchdb
     {app, b64url, [{incl_cond, include}]},
     {app, exxhash, [{incl_cond, include}]},
-    {app, bear, [{incl_cond, include}]},
     {app, chttpd, [{incl_cond, include}]},
     {app, config, [{incl_cond, include}]},
     {app, couch, [{incl_cond, include}]},
@@ -110,7 +107,6 @@
     {app, dreyfus, [{incl_cond, include}]},
     {app, ets_lru, [{incl_cond, include}]},
     {app, fabric, [{incl_cond, include}]},
-    {app, folsom, [{incl_cond, include}]},
     {app, global_changes, [{incl_cond, include}]},
     {app, hyper, [{incl_cond, include}]},
     {app, ibrowse, [{incl_cond, include}]},
diff --git a/src/chttpd/src/chttpd_node.erl b/src/chttpd/src/chttpd_node.erl
index 4cb01e012..46850fc4e 100644
--- a/src/chttpd/src/chttpd_node.erl
+++ b/src/chttpd/src/chttpd_node.erl
@@ -159,7 +159,6 @@ handle_node_req(#httpd{path_parts = [_, _Node, <<"_config">>, _Section, _Key | _
     chttpd:send_error(Req, not_found);
 % GET /_node/$node/_stats
 handle_node_req(#httpd{method = 'GET', path_parts = [_, Node, <<"_stats">> | Path]} = Req) ->
-    flush(Node, Req),
     Stats0 = call_node(Node, couch_stats, fetch, []),
     Stats = couch_stats_httpd:transform_stats(Stats0),
     Nested = couch_stats_httpd:nest(Stats),
@@ -169,8 +168,8 @@ handle_node_req(#httpd{method = 'GET', path_parts = [_, Node, <<"_stats">> | Pat
 handle_node_req(#httpd{path_parts = [_, _Node, <<"_stats">>]} = Req) ->
     send_method_not_allowed(Req, "GET");
 handle_node_req(#httpd{method = 'GET', path_parts = [_, Node, <<"_prometheus">>]} = Req) ->
-    Metrics = call_node(Node, couch_prometheus_server, scrape, []),
-    Version = call_node(Node, couch_prometheus_server, version, []),
+    Metrics = call_node(Node, couch_prometheus, scrape, []),
+    Version = call_node(Node, couch_prometheus, version, []),
     Type = "text/plain; version=" ++ Version,
     Header = [{<<"Content-Type">>, ?l2b(Type)}],
     chttpd:send_response(Req, 200, Header, Metrics);
@@ -261,14 +260,6 @@ call_node(Node, Mod, Fun, Args) when is_atom(Node) ->
             Else
     end.
 
-flush(Node, Req) ->
-    case couch_util:get_value("flush", chttpd:qs(Req)) of
-        "true" ->
-            call_node(Node, couch_stats_aggregator, flush, []);
-        _Else ->
-            ok
-    end.
-
 get_stats() ->
     Other =
         erlang:memory(system) -
diff --git a/src/couch_prometheus/src/couch_prometheus.app.src b/src/couch_prometheus/src/couch_prometheus.app.src
index 9d3a36582..24d0c1d9d 100644
--- a/src/couch_prometheus/src/couch_prometheus.app.src
+++ b/src/couch_prometheus/src/couch_prometheus.app.src
@@ -14,7 +14,7 @@
     {description, "Aggregated metrics info for Prometheus consumption"},
     {vsn, git},
     {registered, []},
-    {applications, [kernel, stdlib, folsom, couch_stats, couch_log, mem3, couch]},
+    {applications, [kernel, stdlib, couch_stats, couch_log, mem3, couch]},
     {mod, {couch_prometheus_app, []}},
     {env, []}
 ]}.
diff --git a/src/couch_prometheus/src/couch_prometheus_server.erl b/src/couch_prometheus/src/couch_prometheus.erl
similarity index 84%
rename from src/couch_prometheus/src/couch_prometheus_server.erl
rename to src/couch_prometheus/src/couch_prometheus.erl
index 1649898c7..ee9a2b6ce 100644
--- a/src/couch_prometheus/src/couch_prometheus_server.erl
+++ b/src/couch_prometheus/src/couch_prometheus.erl
@@ -10,9 +10,7 @@
 % License for the specific language governing permissions and limitations under
 % the License.
 
--module(couch_prometheus_server).
-
--behaviour(gen_server).
+-module(couch_prometheus).
 
 -import(couch_prometheus_util, [
     couch_to_prom/3,
@@ -26,72 +24,15 @@
     version/0
 ]).
 
--export([
-    start_link/0,
-    init/1,
-    handle_call/3,
-    handle_cast/2,
-    handle_info/2,
-    code_change/3,
-    terminate/2
-]).
-
 -ifdef(TEST).
 -export([
     get_internal_replication_jobs_stat/0
 ]).
 -endif.
 
--include("couch_prometheus.hrl").
-
-start_link() ->
-    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
-
--record(st, {
-    metrics,
-    refresh
-}).
-
-init([]) ->
-    Metrics = refresh_metrics(),
-    RT = update_refresh_timer(),
-    {ok, #st{metrics = Metrics, refresh = RT}}.
+-define(PROMETHEUS_VERSION, "2.0").
 
 scrape() ->
-    {ok, Metrics} = gen_server:call(?MODULE, scrape),
-    Metrics.
-
-version() ->
-    ?PROMETHEUS_VERSION.
-
-handle_call(scrape, _from, #st{metrics = Metrics} = State) ->
-    {reply, {ok, Metrics}, State};
-handle_call(refresh, _from, #st{refresh = OldRT} = State) ->
-    timer:cancel(OldRT),
-    Metrics = refresh_metrics(),
-    RT = update_refresh_timer(),
-    {reply, ok, State#st{metrics = Metrics, refresh = RT}};
-handle_call(Msg, _From, State) ->
-    {stop, {unknown_call, Msg}, error, State}.
-
-handle_cast(Msg, State) ->
-    {stop, {unknown_cast, Msg}, State}.
-
-handle_info(refresh, #st{refresh = OldRT} = State) ->
-    timer:cancel(OldRT),
-    Metrics = refresh_metrics(),
-    RT = update_refresh_timer(),
-    {noreply, State#st{metrics = Metrics, refresh = RT}};
-handle_info(Msg, State) ->
-    {stop, {unknown_info, Msg}, State}.
-
-terminate(_Reason, _State) ->
-    ok.
-
-code_change(_OldVsn, State, _Extra) ->
-    {ok, State}.
-
-refresh_metrics() ->
     CouchDB = get_couchdb_stats(),
     System = couch_stats_httpd:to_ejson(get_system_stats()),
     couch_prometheus_util:to_bin(
@@ -103,6 +44,9 @@ refresh_metrics() ->
         )
     ).
 
+version() ->
+    ?PROMETHEUS_VERSION.
+
 get_couchdb_stats() ->
     Stats = lists:sort(couch_stats:fetch()),
     lists:flatmap(
@@ -416,29 +360,3 @@ get_distribution_stats() ->
 get_ets_stats() ->
     NumTabs = length(ets:all()),
     to_prom(erlang_ets_table, gauge, "number of ETS tables", NumTabs).
-
-drain_refresh_messages() ->
-    receive
-        refresh -> drain_refresh_messages()
-    after 0 ->
-        ok
-    end.
-
-update_refresh_timer() ->
-    drain_refresh_messages(),
-    RefreshTime = 1000 * config:get_integer("prometheus", "interval", ?REFRESH_INTERVAL),
-    erlang:send_after(RefreshTime, self(), refresh).
-
--ifdef(TEST).
-
--include_lib("couch/include/couch_eunit.hrl").
-
-drain_refresh_messages_test() ->
-    self() ! refresh,
-    {messages, Mq0} = erlang:process_info(self(), messages),
-    ?assert(lists:member(refresh, Mq0)),
-    drain_refresh_messages(),
-    {messages, Mq1} = erlang:process_info(self(), messages),
-    ?assert(not lists:member(refresh, Mq1)).
-
--endif.
diff --git a/src/couch_prometheus/src/couch_prometheus.hrl b/src/couch_prometheus/src/couch_prometheus.hrl
deleted file mode 100644
index 0970f4469..000000000
--- a/src/couch_prometheus/src/couch_prometheus.hrl
+++ /dev/null
@@ -1,15 +0,0 @@
-% Licensed under the Apache License, Version 2.0 (the "License"); you may not
-% use this file except in compliance with the License. You may obtain a copy of
-% the License at
-%
-%   http://www.apache.org/licenses/LICENSE-2.0
-%
-% Unless required by applicable law or agreed to in writing, software
-% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-% License for the specific language governing permissions and limitations under
-% the License.
-
--define(REFRESH_INTERVAL, 5).
--define(PROMETHEUS_VERSION, "2.0").
-
diff --git a/src/couch_prometheus/src/couch_prometheus_http.erl b/src/couch_prometheus/src/couch_prometheus_http.erl
index b3df1ea4b..bd961249d 100644
--- a/src/couch_prometheus/src/couch_prometheus_http.erl
+++ b/src/couch_prometheus/src/couch_prometheus_http.erl
@@ -19,7 +19,6 @@
     handle_request/1
 ]).
 
--include("couch_prometheus.hrl").
 -include_lib("couch/include/couch_db.hrl").
 
 start_link() ->
@@ -63,13 +62,13 @@ handle_request(MochiReq) ->
     end.
 
 send_prometheus(MochiReq, Node) ->
-    Type = "text/plain; version=" ++ ?PROMETHEUS_VERSION,
+    Type = "text/plain; version=" ++ couch_prometheus:version(),
     Headers =
         couch_httpd:server_header() ++
             [
                 {<<"Content-Type">>, ?l2b(Type)}
             ],
-    Body = call_node(Node, couch_prometheus_server, scrape, []),
+    Body = call_node(Node, couch_prometheus, scrape, []),
     send_resp(MochiReq, 200, Headers, Body).
 
 send_resp(MochiReq, Status, ExtraHeaders, Body) ->
diff --git a/src/couch_prometheus/src/couch_prometheus_sup.erl b/src/couch_prometheus/src/couch_prometheus_sup.erl
index 45a884fad..3bd7361d8 100644
--- a/src/couch_prometheus/src/couch_prometheus_sup.erl
+++ b/src/couch_prometheus/src/couch_prometheus_sup.erl
@@ -27,9 +27,7 @@ start_link() ->
 init([]) ->
     {ok, {
         {one_for_one, 5, 10},
-        [
-            ?CHILD(couch_prometheus_server, worker)
-        ] ++ maybe_start_prometheus_http()
+        [] ++ maybe_start_prometheus_http()
     }}.
 
 maybe_start_prometheus_http() ->
diff --git a/src/couch_prometheus/src/couch_prometheus_util.erl b/src/couch_prometheus/src/couch_prometheus_util.erl
index 51a902163..633bb036d 100644
--- a/src/couch_prometheus/src/couch_prometheus_util.erl
+++ b/src/couch_prometheus/src/couch_prometheus_util.erl
@@ -20,8 +20,6 @@
     to_prom_summary/2
 ]).
 
--include("couch_prometheus.hrl").
-
 couch_to_prom([couch_log, level, alert], Info, _All) ->
     to_prom(couch_log_requests_total, counter, "number of logged messages", {
         [{level, alert}], val(Info)
diff --git a/src/couch_prometheus/test/eunit/couch_prometheus_e2e_tests.erl b/src/couch_prometheus/test/eunit/couch_prometheus_e2e_tests.erl
index d24a01b20..6e1faf107 100644
--- a/src/couch_prometheus/test/eunit/couch_prometheus_e2e_tests.erl
+++ b/src/couch_prometheus/test/eunit/couch_prometheus_e2e_tests.erl
@@ -88,8 +88,6 @@ setup_prometheus(WithAdditionalPort) ->
     % It's already started by default, so restart to pick up config
     ok = application:stop(couch_prometheus),
     ok = application:start(couch_prometheus),
-    % Flush so that stats aggregator starts using the new, shorter interval
-    couch_stats_aggregator:flush(),
     Ctx.
 
 t_chttpd_port(Port) ->
@@ -175,17 +173,18 @@ t_starts_with_couchdb(Port) ->
     ).
 
 t_survives_mem3_sync_termination(_) ->
-    ServerPid = whereis(couch_prometheus_server),
-    ?assertNotEqual(undefined, ServerPid),
     ?assertNotEqual(undefined, whereis(mem3_sync)),
     ok = supervisor:terminate_child(mem3_sup, mem3_sync),
     ?assertEqual(undefined, whereis(mem3_sync)),
     ?assertMatch(
         [[_, _], <<"couchdb_internal_replication_jobs 0">>],
-        couch_prometheus_server:get_internal_replication_jobs_stat()
+        couch_prometheus:get_internal_replication_jobs_stat()
     ),
     {ok, _} = supervisor:restart_child(mem3_sup, mem3_sync),
-    ?assertEqual(ServerPid, whereis(couch_prometheus_server)).
+    ?assertMatch(
+        [[_, _], <<"couchdb_internal_replication_jobs", _/binary>>],
+        couch_prometheus:get_internal_replication_jobs_stat()
+    ).
 
 node_local_url(Port) ->
     Addr = config:get("chttpd", "bind_address", "127.0.0.1"),
diff --git a/src/couch_stats/README.md b/src/couch_stats/README.md
index 53c9ea4f4..3df81c981 100644
--- a/src/couch_stats/README.md
+++ b/src/couch_stats/README.md
@@ -1,18 +1,13 @@
 # couch_stats
 
-couch_stats is a simple statistics collection app for Erlang applications. Its
-core API is a thin wrapper around a stat storage library (currently Folsom,) but
-abstracting over that library provides several benefits:
+couch_stats is a simple statistics collection app for Erlang applications. It
+uses https://www.erlang.org/doc/man/counters.html to implement counters,
+gauges and histograms. By default histograms record 10 seconds worth of data,
+with a granularity of 1 second.
 
-* All references to stat storage are in one place, so it's easy to swap
-  the module out.
-
-* Some common patterns, such as tying a process's lifetime to a counter value,
-  are straightforward to support.
-
-* Configuration can be managed in a single place - for example, it's much easier
-  to ensure that all histogram metrics use a 10-second sliding window if those
-  metrics are instantiated/configured centrally.
+Stats can be fetched with `couch_stats:fetch()`. That returns the current
+values of all the counters and gauges as well as the histogram statistics for
+the last 10 seconds.
 
 ## Adding a metric
 
@@ -26,4 +21,4 @@ abstracting over that library provides several benefits:
 
 2. Tell couch_stats to use your description file via application configuration.
 
-2. Instrument your code with the helper functions in `couch_stats.erl`.
+3. Instrument your code with the helper functions in `couch_stats.erl`.
diff --git a/src/couch_stats/src/couch_stats.app.src b/src/couch_stats/src/couch_stats.app.src
index 990f8de62..a54fac734 100644
--- a/src/couch_stats/src/couch_stats.app.src
+++ b/src/couch_stats/src/couch_stats.app.src
@@ -14,7 +14,7 @@
     {description, "Simple statistics collection"},
     {vsn, git},
     {registered, [couch_stats_aggregator, couch_stats_process_tracker]},
-    {applications, [kernel, stdlib, folsom]},
+    {applications, [kernel, stdlib]},
     {mod, {couch_stats_app, []}},
     {env, []}
 ]}.
diff --git a/src/couch_stats/src/couch_stats.erl b/src/couch_stats/src/couch_stats.erl
index e0303fc0f..29a402449 100644
--- a/src/couch_stats/src/couch_stats.erl
+++ b/src/couch_stats/src/couch_stats.erl
@@ -13,14 +13,9 @@
 -module(couch_stats).
 
 -export([
-    start/0,
-    stop/0,
     fetch/0,
     reload/0,
     sample/1,
-    new/2,
-    delete/1,
-    list/0,
     increment_counter/1,
     increment_counter/2,
     decrement_counter/1,
@@ -29,102 +24,174 @@
     update_gauge/2
 ]).
 
--include("couch_stats.hrl").
-
--type response() :: ok | {error, unknown_metric}.
+-type response() :: ok | {error, unknown_metric} | {error, invalid_metric}.
 -type stat() :: {any(), [{atom(), any()}]}.
 
-start() ->
-    application:start(couch_stats).
-
-stop() ->
-    application:stop(couch_stats).
-
 fetch() ->
-    couch_stats_aggregator:fetch().
+    Seconds = couch_stats_util:histogram_interval_sec(),
+    StartSec = now_sec() - (Seconds - 1),
+    % Last -1 is because the interval ends are inclusive
+    couch_stats_util:fetch(stats(), StartSec, Seconds).
 
 reload() ->
-    couch_stats_aggregator:reload().
+    couch_stats_server:reload().
 
 -spec sample(any()) -> stat().
 sample(Name) ->
-    [{Name, Info}] = folsom_metrics:get_metric_info(Name),
-    sample_type(Name, proplists:get_value(type, Info)).
-
--spec new(atom(), any()) -> ok | {error, metric_exists | unsupported_type}.
-new(counter, Name) ->
-    case folsom_metrics:new_counter(Name) of
-        ok -> ok;
-        {error, Name, metric_already_exists} -> {error, metric_exists}
-    end;
-new(histogram, Name) ->
-    Time = config:get_integer("stats", "interval", ?DEFAULT_INTERVAL),
-    case folsom_metrics:new_histogram(Name, slide_uniform, {Time, 1024}) of
-        ok -> ok;
-        {error, Name, metric_already_exists} -> {error, metric_exists}
-    end;
-new(gauge, Name) ->
-    case folsom_metrics:new_gauge(Name) of
-        ok -> ok;
-        {error, Name, metric_already_exists} -> {error, metric_exists}
-    end;
-new(_, _) ->
-    {error, unsupported_type}.
-
-delete(Name) ->
-    folsom_metrics:delete_metric(Name).
-
-list() ->
-    folsom_metrics:get_metrics_info().
+    Seconds = couch_stats_util:histogram_interval_sec(),
+    StartSec = now_sec() - (Seconds - 1),
+    % Last -1 is because the interval ends are inclusive
+    couch_stats_util:sample(Name, stats(), StartSec, Seconds).
 
 -spec increment_counter(any()) -> response().
 increment_counter(Name) ->
-    notify_existing_metric(Name, {inc, 1}, counter).
+    increment_counter(Name, 1).
 
 -spec increment_counter(any(), pos_integer()) -> response().
 increment_counter(Name, Value) ->
-    notify_existing_metric(Name, {inc, Value}, counter).
+    case couch_stats_util:get_counter(Name, stats()) of
+        {ok, Ctx} -> couch_stats_counter:increment(Ctx, Value);
+        {error, Error} -> {error, Error}
+    end.
 
 -spec decrement_counter(any()) -> response().
 decrement_counter(Name) ->
-    notify_existing_metric(Name, {dec, 1}, counter).
+    decrement_counter(Name, 1).
 
 -spec decrement_counter(any(), pos_integer()) -> response().
 decrement_counter(Name, Value) ->
-    notify_existing_metric(Name, {dec, Value}, counter).
+    case couch_stats_util:get_counter(Name, stats()) of
+        {ok, Ctx} -> couch_stats_counter:decrement(Ctx, Value);
+        {error, Error} -> {error, Error}
+    end.
+
+-spec update_gauge(any(), number()) -> response().
+update_gauge(Name, Value) ->
+    case couch_stats_util:get_gauge(Name, stats()) of
+        {ok, Ctx} -> couch_stats_gauge:update(Ctx, Value);
+        {error, Error} -> {error, Error}
+    end.
 
 -spec update_histogram
     (any(), number()) -> response();
     (any(), function()) -> any().
 update_histogram(Name, Fun) when is_function(Fun, 0) ->
-    Begin = os:timestamp(),
+    Begin = erlang:monotonic_time(),
     Result = Fun(),
-    Duration = timer:now_diff(os:timestamp(), Begin) div 1000,
-    case notify_existing_metric(Name, Duration, histogram) of
+    Dt = erlang:monotonic_time() - Begin,
+    Duration = erlang:convert_time_unit(Dt, native, millisecond),
+    case update_histogram(Name, Duration) of
         ok ->
             Result;
         {error, unknown_metric} ->
-            throw({unknown_metric, Name})
+            throw({unknown_metric, Name});
+        {error, invalid_metric} ->
+            throw({invalid_metric, Name})
     end;
 update_histogram(Name, Value) when is_number(Value) ->
-    notify_existing_metric(Name, Value, histogram).
-
--spec update_gauge(any(), number()) -> response().
-update_gauge(Name, Value) ->
-    notify_existing_metric(Name, Value, gauge).
-
--spec notify_existing_metric(any(), any(), any()) -> response().
-notify_existing_metric(Name, Op, Type) ->
-    try
-        ok = folsom_metrics:notify_existing_metric(Name, Op, Type)
-    catch
-        _:_ ->
-            error_logger:error_msg("unknown metric: ~p", [Name]),
-            {error, unknown_metric}
+    case couch_stats_util:get_histogram(Name, stats()) of
+        {ok, Ctx} -> couch_stats_histogram:update(Ctx, now_sec(), Value);
+        {error, Error} -> {error, Error}
     end.
 
--spec sample_type(any(), atom()) -> stat().
-sample_type(Name, histogram) ->
-    folsom_metrics:get_histogram_statistics(Name);
-sample_type(Name, _) ->
-    folsom_metrics:get_metric_value(Name).
+stats() ->
+    couch_stats_util:stats().
+
+now_sec() ->
+    erlang:monotonic_time(second).
+
+-ifdef(TEST).
+
+-include_lib("couch/include/couch_eunit.hrl").
+
+couch_stats_test_() ->
+    {
+        foreach,
+        fun setup/0,
+        fun teardown/1,
+        [
+            ?TDEF_FE(t_fetch_metrics),
+            ?TDEF_FE(t_sample_metrics),
+            ?TDEF_FE(t_reload),
+            ?TDEF_FE(t_increment_counter),
+            ?TDEF_FE(t_decrement_counter),
+            ?TDEF_FE(t_update_gauge),
+            ?TDEF_FE(t_update_histogram),
+            ?TDEF_FE(t_update_histogram_fun),
+            ?TDEF_FE(t_access_invalid_metrics)
+        ]
+    }.
+
+setup() ->
+    test_util:start_couch([couch_replicator]).
+
+teardown(Ctx) ->
+    config:delete("stats", "interval", _Persist = false),
+    test_util:stop_couch(Ctx).
+
+t_fetch_metrics(_) ->
+    Metrics = fetch(),
+    ?assertEqual(map_size(stats()), length(Metrics)),
+    ?assertMatch([{_, [{value, _}, {type, _}, {desc, _}]} | _], Metrics).
+
+t_sample_metrics(_) ->
+    Hist = sample([fsync, time]),
+    ?assertMatch([{_Name, _Val} | _], Hist),
+
+    Count = sample([fsync, count]),
+    ?assert(is_integer(Count)),
+    ?assert(Count >= 0),
+
+    ?assertEqual(0, sample([couch_replicator, jobs, total])).
+
+t_reload(_) ->
+    % This is tested in detail in couch_stats_server.
+    ?assertEqual(ok, reload()).
+
+t_increment_counter(_) ->
+    [increment_counter([fsync, count]) || _ <- lists:seq(1, 1000)],
+    ?assert(sample([fsync, count]) > 1000).
+
+t_decrement_counter(_) ->
+    [decrement_counter([fsync, count]) || _ <- lists:seq(1, 10000)],
+    ?assert(sample([fsync, count]) < 10).
+
+t_update_gauge(_) ->
+    application:stop(couch_replicator),
+    % We don't want replicator to reset the gauge back to 0
+    update_gauge([couch_replicator, jobs, total], 42),
+    ?assertEqual(42, sample([couch_replicator, jobs, total])).
+
+t_update_histogram(_) ->
+    [update_histogram([fsync, time], rand:uniform(1000)) || _ <- lists:seq(1, 1000)],
+    Hist = sample([fsync, time]),
+    N = proplists:get_value(n, Hist),
+    ?assert(is_integer(N)),
+    ?assert(N >= 1000).
+
+t_update_histogram_fun(_) ->
+    Fun = fun() -> timer:sleep(rand:uniform(2)) end,
+    [update_histogram([fsync, time], Fun) || _ <- lists:seq(1, 100)],
+    Hist = sample([fsync, time]),
+    N = proplists:get_value(n, Hist),
+    ?assert(is_integer(N)),
+    ?assert(N >= 100).
+
+t_access_invalid_metrics(_) ->
+    Fun = fun() -> ok end,
+    ?assertThrow(unknown_metric, sample([invalid])),
+    ?assertEqual({error, unknown_metric}, increment_counter([invalid], 100)),
+    ?assertEqual({error, unknown_metric}, decrement_counter([invalid], 100)),
+    ?assertEqual({error, unknown_metric}, update_gauge([invalid], 100)),
+    ?assertEqual({error, unknown_metric}, update_histogram([invalid], 100)),
+    ?assertThrow({unknown_metric, _}, update_histogram([invalid], Fun)),
+    % Invalid metric types
+    ?assertEqual({error, invalid_metric}, increment_counter([fsync, time], 100)),
+    ?assertEqual({error, invalid_metric}, decrement_counter([fsync, time], 100)),
+    ?assertEqual({error, invalid_metric}, update_gauge([fsync, count], 100)),
+    ?assertEqual({error, invalid_metric}, update_histogram([fsync, count], 100)),
+    ?assertThrow({invalid_metric, _}, update_histogram([fsync, count], Fun)),
+    InvalidMetrics = #{[bad] => {invalid, <<"desc">>}},
+    ?assertThrow({unknown_metric, _}, couch_stats_util:create_metrics(InvalidMetrics)).
+
+-endif.
diff --git a/src/couch_stats/src/couch_stats.hrl b/src/couch_stats/src/couch_stats.hrl
deleted file mode 100644
index 3cffe99f1..000000000
--- a/src/couch_stats/src/couch_stats.hrl
+++ /dev/null
@@ -1,14 +0,0 @@
-% Licensed under the Apache License, Version 2.0 (the "License"); you may not
-% use this file except in compliance with the License. You may obtain a copy of
-% the License at
-%
-%   http://www.apache.org/licenses/LICENSE-2.0
-%
-% Unless required by applicable law or agreed to in writing, software
-% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-% License for the specific language governing permissions and limitations under
-% the License.
-
--define(DEFAULT_INTERVAL, 10).
--define(RELOAD_INTERVAL, 600).
diff --git a/src/couch_stats/src/couch_stats_aggregator.erl b/src/couch_stats/src/couch_stats_aggregator.erl
deleted file mode 100644
index 34b28bfd6..000000000
--- a/src/couch_stats/src/couch_stats_aggregator.erl
+++ /dev/null
@@ -1,162 +0,0 @@
-% Licensed under the Apache License, Version 2.0 (the "License"); you may not
-% use this file except in compliance with the License. You may obtain a copy of
-% the License at
-%
-%   http://www.apache.org/licenses/LICENSE-2.0
-%
-% Unless required by applicable law or agreed to in writing, software
-% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-% License for the specific language governing permissions and limitations under
-% the License.
-
--module(couch_stats_aggregator).
-
--behaviour(gen_server).
-
--export([
-    fetch/0,
-    flush/0,
-    reload/0
-]).
-
--export([
-    start_link/0,
-    init/1,
-    handle_call/3,
-    handle_cast/2,
-    handle_info/2,
-    code_change/3,
-    terminate/2
-]).
-
--include("couch_stats.hrl").
-
--record(st, {
-    descriptions,
-    stats,
-    collect_timer,
-    reload_timer
-}).
-
-fetch() ->
-    {ok, Stats} = gen_server:call(?MODULE, fetch),
-    Stats.
-
-flush() ->
-    gen_server:call(?MODULE, flush).
-
-reload() ->
-    gen_server:call(?MODULE, reload).
-
-start_link() ->
-    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
-
-init([]) ->
-    {ok, Descs} = reload_metrics(),
-    CT = erlang:send_after(get_interval(collect), self(), collect),
-    RT = erlang:send_after(get_interval(reload), self(), reload),
-    {ok, #st{descriptions = Descs, stats = [], collect_timer = CT, reload_timer = RT}}.
-
-handle_call(fetch, _from, #st{stats = Stats} = State) ->
-    {reply, {ok, Stats}, State};
-handle_call(flush, _From, State) ->
-    {reply, ok, collect(State)};
-handle_call(reload, _from, #st{reload_timer = OldRT} = State) ->
-    timer:cancel(OldRT),
-    {ok, Descriptions} = reload_metrics(),
-    RT = update_timer(reload),
-    {reply, ok, State#st{descriptions = Descriptions, reload_timer = RT}};
-handle_call(Msg, _From, State) ->
-    {stop, {unknown_call, Msg}, error, State}.
-
-handle_cast(Msg, State) ->
-    {stop, {unknown_cast, Msg}, State}.
-
-handle_info(collect, State) ->
-    {noreply, collect(State)};
-handle_info(reload, State) ->
-    {ok, Descriptions} = reload_metrics(),
-    {noreply, State#st{descriptions = Descriptions}};
-handle_info(Msg, State) ->
-    {stop, {unknown_info, Msg}, State}.
-
-terminate(_Reason, _State) ->
-    ok.
-
-code_change(_OldVsn, State, _Extra) ->
-    {ok, State}.
-
-comparison_set(Metrics) ->
-    sets:from_list(
-        [{Name, proplists:get_value(type, Props)} || {Name, Props} <- Metrics]
-    ).
-
-reload_metrics() ->
-    Current = load_metrics_for_applications(),
-    CurrentSet = comparison_set(Current),
-    Existing = couch_stats:list(),
-    ExistingSet = comparison_set(Existing),
-    ToDelete = sets:subtract(ExistingSet, CurrentSet),
-    ToCreate = sets:subtract(CurrentSet, ExistingSet),
-    sets:fold(
-        fun({Name, _}, _) ->
-            couch_stats:delete(Name),
-            nil
-        end,
-        nil,
-        ToDelete
-    ),
-    sets:fold(
-        fun({Name, Type}, _) ->
-            couch_stats:new(Type, Name),
-            nil
-        end,
-        nil,
-        ToCreate
-    ),
-    {ok, Current}.
-
-load_metrics_for_applications() ->
-    Apps = [element(1, A) || A <- application:loaded_applications()],
-    lists:foldl(
-        fun(AppName, Acc) ->
-            case load_metrics_for_application(AppName) of
-                error -> Acc;
-                Descriptions -> Descriptions ++ Acc
-            end
-        end,
-        [],
-        Apps
-    ).
-
-load_metrics_for_application(AppName) ->
-    case code:priv_dir(AppName) of
-        {error, _Error} ->
-            error;
-        Dir ->
-            case file:consult(Dir ++ "/stats_descriptions.cfg") of
-                {ok, Descriptions} ->
-                    Descriptions;
-                {error, _Error} ->
-                    error
-            end
-    end.
-
-collect(#st{collect_timer = OldCT} = State) ->
-    timer:cancel(OldCT),
-    Stats = lists:map(
-        fun({Name, Props}) ->
-            {Name, [{value, couch_stats:sample(Name)} | Props]}
-        end,
-        State#st.descriptions
-    ),
-    CT = update_timer(collect),
-    State#st{stats = Stats, collect_timer = CT}.
-
-update_timer(Type) ->
-    Interval = get_interval(Type),
-    erlang:send_after(Interval, self(), Type).
-
-get_interval(reload) -> 1000 * ?RELOAD_INTERVAL;
-get_interval(collect) -> 1000 * config:get_integer("stats", "interval", ?DEFAULT_INTERVAL).
diff --git a/src/couch_stats/src/couch_stats_counter.erl b/src/couch_stats/src/couch_stats_counter.erl
new file mode 100644
index 000000000..3d0ac01af
--- /dev/null
+++ b/src/couch_stats/src/couch_stats_counter.erl
@@ -0,0 +1,67 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(couch_stats_counter).
+
+-export([
+    new/0,
+    increment/2,
+    decrement/2,
+    read/1
+]).
+
+new() ->
+    counters:new(1, [write_concurrency]).
+
+increment(Ctx, Val) when is_integer(Val) ->
+    counters:add(Ctx, 1, Val).
+
+decrement(Ctx, Val) when is_integer(Val) ->
+    counters:sub(Ctx, 1, Val).
+
+read(Ctx) ->
+    counters:get(Ctx, 1).
+
+-ifdef(TEST).
+
+-include_lib("couch/include/couch_eunit.hrl").
+
+counter_test() ->
+    C = new(),
+
+    ?assertEqual(0, read(C)),
+
+    increment(C, 1),
+    ?assertEqual(1, read(C)),
+
+    increment(C, 2),
+    ?assertEqual(3, read(C)),
+
+    decrement(C, 2),
+    ?assertEqual(1, read(C)),
+
+    decrement(C, 1),
+    ?assertEqual(0, read(C)),
+
+    decrement(C, 1),
+    ?assertEqual(-1, read(C)),
+
+    decrement(C, 2),
+    ?assertEqual(-3, read(C)),
+
+    increment(C, -2),
+    ?assertEqual(-5, read(C)),
+
+    decrement(C, -5),
+    ?assertEqual(0, read(C)).
+
+-endif.
diff --git a/src/couch_stats/src/couch_stats_gauge.erl b/src/couch_stats/src/couch_stats_gauge.erl
new file mode 100644
index 000000000..4ca245908
--- /dev/null
+++ b/src/couch_stats/src/couch_stats_gauge.erl
@@ -0,0 +1,54 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(couch_stats_gauge).
+
+-export([
+    new/0,
+    update/2,
+    read/1
+]).
+
+new() ->
+    counters:new(1, [write_concurrency]).
+
+update(Ctx, Val) when is_integer(Val) ->
+    counters:put(Ctx, 1, Val).
+
+read(Ctx) ->
+    counters:get(Ctx, 1).
+
+-ifdef(TEST).
+
+-include_lib("couch/include/couch_eunit.hrl").
+
+counter_test() ->
+    G = new(),
+
+    ?assertEqual(0, read(G)),
+
+    update(G, 0),
+    ?assertEqual(0, read(G)),
+
+    update(G, 1),
+    ?assertEqual(1, read(G)),
+
+    update(G, 0),
+    ?assertEqual(0, read(G)),
+
+    update(G, 1 bsl 10),
+    ?assertEqual(1 bsl 10, read(G)),
+
+    update(G, -1),
+    ?assertEqual(-1, read(G)).
+
+-endif.
diff --git a/src/couch_stats/src/couch_stats_histogram.erl b/src/couch_stats/src/couch_stats_histogram.erl
new file mode 100644
index 000000000..5f5d35179
--- /dev/null
+++ b/src/couch_stats/src/couch_stats_histogram.erl
@@ -0,0 +1,457 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+% This module implements windowed base-2 histograms using Erlang counters [1].
+%
+% Base-2 histograms use power of 2 exponentially increasing bin widths. This
+% allows capturing a range of values from microseconds to hours with a
+% relatively small number of bins. The same principle is used when encoding
+% floating point numbers [2]. In fact, our histograms rely on the ease of
+% constructing and manipulating binary representations of 64 bit floats in
+% Erlang to do all of their heavy lifting.
+%
+% As a refresher, the standard (IEEE 754) 64 bit floating point representation
+% looks something like:
+%
+%  sign  exponent    mantissa
+%  [s64] [e63...e53] [m52...m1]
+%  <-1-> <---11----> <---52--->
+%
+%
+% The simplest scheme might be to use the exponent to select the histogram bin
+% and throw away the mantissa bits. However, in that case bin sizes end up
+% growing a bit too fast and we lose resolution quickly. To increase the
+% resolution, use a few of the most significant bits from the mantissa. For
+% example, use 3 more mantissa bits for a total of 14 bits: [e63...e53] +
+% [m52, m51, m50]:
+%
+%  sign  exponent    mantissa
+%  [s64] [e63...e53] [m52, m51, m50, m49...m1]
+%  <-1-> <-----------14------------> <--49--->
+%        ^^^^^^^ bin index ^^^^^^^^^
+%
+% With Erlang's wonderful binary matching capabilities this becomes a
+% one-liner:
+%
+%    <<0:1, BinIndex:14, _/bitstring>> = <<Val/float>>
+%
+% The internal implementation is a tuple of counters:new(?BIN_COUNT)
+% elements. The tuple size is determined by the time window parameter when the
+% histogram is created. After a histogram object is created, its Erlang term
+% structure is static so it's suitable to be stored in a persistent term [3].
+% The structure looks something like:
+%
+%  {
+%     1          = [1, 2, ..., ?BIN_COUNT]
+%     2          = [1, 2, ..., ?BIN_COUNT]
+%     ...
+%     TimeWindow = [1, 2, ..., ?BIN_COUNT]
+%  }
+%
+% The representation can also be regarded as a table with rows as abstract time
+% units: 1 = 1st second, 2 = 2nd second, etc, and the columns as histogram
+% bins. The update/3 function takes a value and a current time as parameters.
+% The time is used to select which row to update, and the value picks which
+% bin to increment.
+%
+% In practice, the time window would be used as a circular buffer. The time
+% parameter to the update/3 function might be the system monotonic clock time
+% and then the histogram time index is computed as `Time rem TimeWindow`. So,
+% as the monotonic time is advancing forward, the histogram time index will
+% loop around. This comes with a minor annoyance of having to allocate a
+% larger time window to accommodate some process which cleans stale (expired)
+% histogram entries, possibly with some extra buffers to ensure the currently
+% updated interval and the interval ready to be cleaned would not overlap.
+%
+% Reading a histogram can be done via the read/3 function. The function takes a
+% start time and an interval. All the histogram entries in the interval will be
+% summed together, bin by bin, and returned as a new counters object. This
+% functionality can be used to gather and merge multiple histograms together.
+%
+% To get a stats summary across a time window use the stats/3 function. Just
+% like the read/3 function, it takes a start time and a time interval over
+% which to summarize the data.
+%
+% In addition to new/1, update/3, read/3 and stats/3, there are simpler
+% variants which default to using WindowSize = 1. The intent is that they would be
+% used when a simple histogram is needed without the time window functionality.
+%
+% [1] https://www.erlang.org/doc/man/counters.html
+% [2] https://en.wikipedia.org/wiki/IEEE_754
+% [3] https://www.erlang.org/doc/man/persistent_term.html
+
+-module(couch_stats_histogram).
+
+-export([
+    new/0,
+    new/1,
+
+    update/2,
+    update/3,
+
+    stats/1,
+    stats/3,
+
+    read/1,
+    read/3,
+
+    clear/1,
+    clear/3,
+
+    bin_min/1,
+    bin_max/1,
+    bin_middle/1,
+
+    calc_bin_offset/0,
+    calc_bin_count/0,
+    get_bin_boundaries/0
+]).
+
+% When updating constants, comment out this directive; otherwise it might
+% prevent module loading with the intermediate/new constant values.
+%
+-on_load(check_constants/0).
+
+-define(FLOAT_SIZE, 64).
+
+% 11 standard float64 exponent bits + 3 more extra msb mantissa bits
+%
+-define(INDEX_BITS, 14).
+
+% Some practical min and max values to be able to have a fixed number of
+% histogram bins. When used for timing, typical units are milliseconds, so use
+% 0.01 msec as the minimum, and 4M msec (over 1 hour) for the maximum.
+%
+-define(MIN_VAL, 0.01).
+-define(MAX_VAL, 4000000.0).
+
+% These are computed from previously defined constants. Recompute them with
+% calc_bin_offset() and calc_bin_count(), respectively, if any of the constants
+% above change.
+%
+-define(BIN_OFFSET, -8129).
+-define(BIN_COUNT, 230).
+
+% Public API
+
+new() ->
+    new(1).
+
+new(TimeWindow) when is_integer(TimeWindow), TimeWindow >= 1 ->
+    list_to_tuple([counter() || _ <- lists:seq(1, TimeWindow)]).
+
+update(Ctx, Val) ->
+    update(Ctx, 1, Val).
+
+update(Ctx, Time, Val) when is_integer(Val) ->
+    update(Ctx, Time, float(Val));
+update(Ctx, Time, Val) when is_integer(Time), is_float(Val) ->
+    Val1 = min(max(Val, ?MIN_VAL), ?MAX_VAL),
+    Counter = hist_at(Ctx, Time),
+    counters:add(Counter, bin_index(Val1), 1).
+
+read(Ctx) ->
+    read(Ctx, 1, 1).
+
+read(Ctx, Time, Ticks) when is_integer(Time), is_integer(Ticks), Ticks >= 1 ->
+    Ticks1 = min(tuple_size(Ctx), Ticks),
+    read_fold(Ctx, Time, Ticks1, counter()).
+
+stats(Ctx) ->
+    stats(Ctx, 1, 1).
+
+stats(Ctx, Time, Ticks) when is_integer(Time), is_integer(Ticks), Ticks >= 1 ->
+    Counter = read(Ctx, Time, Ticks),
+    couch_stats_math:summary(Counter, ?BIN_COUNT).
+
+clear(Ctx) ->
+    clear(Ctx, 1, 1).
+
+clear(Ctx, Time, Ticks) when is_integer(Time), is_integer(Ticks), Ticks >= 1 ->
+    Ticks1 = min(tuple_size(Ctx), Ticks),
+    clear_fold(Ctx, Time, Ticks1).
+
+% Utility functions
+
+% Use this to recompute ?BIN_OFFSET if ?INDEX_BITS or ?MIN_VAL changes.
+%
+calc_bin_offset() ->
+    <<0:1, I:?INDEX_BITS, _/bitstring>> = <<?MIN_VAL/float>>,
+    % "1" because counter indices start at 1
+    1 - I.
+
+% Use this to recompute ?BIN_COUNT if ?INDEX_BITS, ?MIN_VAL, or ?MAX_VAL changes.
+%
+calc_bin_count() ->
+    bin_index(?MAX_VAL).
+
+get_bin_boundaries() ->
+    [{bin_min(I), bin_max(I)} || I <- lists:seq(1, ?BIN_COUNT)].
+
+% Private functions
+
+% Called from the -on_load() directive. Verifies that our constants are sane;
+% if some were not updated, module loading will throw an error.
+%
+check_constants() ->
+    case is_float(?MIN_VAL) of
+        true -> ok;
+        false -> error({min_val_is_not_a_float, ?MIN_VAL})
+    end,
+    case ?MIN_VAL > 0.0 of
+        true -> ok;
+        false -> error({min_val_must_be_positive, ?MIN_VAL})
+    end,
+    case calc_bin_count() of
+        ?BIN_COUNT -> ok;
+        OtherBinCount -> error({bin_count_stale, ?BIN_COUNT, OtherBinCount})
+    end,
+    case calc_bin_offset() of
+        ?BIN_OFFSET -> ok;
+        OtherBinOffset -> error({bin_offset_stale, ?BIN_OFFSET, OtherBinOffset})
+    end.
+
+bin_index(Val) ->
+    % Select the exponent bits plus a few of the most significant bits from the
+    % mantissa. ?BIN_OFFSET shifts the index into the range starting with 1
+    % so we can index counter bins (those start with 1, just like tuples).
+    <<0:1, BinIndex:?INDEX_BITS, _/bitstring>> = <<Val/float>>,
+    BinIndex + ?BIN_OFFSET.
+
+bin_min(Index) when is_integer(Index), Index >= 1, Index =< ?BIN_COUNT ->
+    BiasedIndex = Index - ?BIN_OFFSET,
+    % 1 is the sign bit
+    BinBitSize = ?FLOAT_SIZE - 1 - ?INDEX_BITS,
+    % Minimum value is the one with all the rest of mantissa bits set to 0
+    <<Min/float>> = <<0:1, BiasedIndex:?INDEX_BITS, 0:BinBitSize>>,
+    Min.
+
+bin_max(Index) when is_integer(Index), Index >= 1, Index =< ?BIN_COUNT ->
+    BiasedIndex = Index - ?BIN_OFFSET,
+    % 1 is the sign bit
+    BinBitSize = ?FLOAT_SIZE - 1 - ?INDEX_BITS,
+    % For Max the intuition is that we first construct the next highest power
+    % of two by shifting 1 left by BinBitSize, then subtract 1, which sets all
+    % the bits to 1 (for ex.: 1 bsl 3 = 1000, 1000 - 1 = 111)
+    <<Max/float>> = <<0:1, BiasedIndex:?INDEX_BITS, ((1 bsl BinBitSize) - 1):BinBitSize>>,
+    Max.
+
+bin_middle(Index) when is_integer(Index), Index >= 1, Index =< ?BIN_COUNT ->
+    BiasedIndex = Index - ?BIN_OFFSET,
+    % 1 is the sign bit
+    BinBitSize = ?FLOAT_SIZE - 1 - ?INDEX_BITS,
+    % Shift left 1 bit less than we do in bin_max, which is effectively Max/2
+    <<Mid/float>> = <<0:1, BiasedIndex:?INDEX_BITS, (1 bsl (BinBitSize - 1)):BinBitSize>>,
+    Mid.
+
+read_fold(_, _, 0, Acc) ->
+    Acc;
+read_fold(Counters, Time, Ticks, Acc) ->
+    Acc1 = merge(Acc, hist_at(Counters, Time), ?BIN_COUNT),
+    read_fold(Counters, Time + 1, Ticks - 1, Acc1).
+
+clear_fold(_, _, 0) ->
+    ok;
+clear_fold(Counters, Time, Ticks) ->
+    reset(hist_at(Counters, Time), ?BIN_COUNT),
+    clear_fold(Counters, Time + 1, Ticks - 1).
+
+merge(A, _, 0) ->
+    A;
+merge(A, B, I) when is_integer(I), I > 0 ->
+    counters:add(A, I, counters:get(B, I)),
+    merge(A, B, I - 1).
+
+reset(_, 0) ->
+    ok;
+reset(Counters, I) when is_integer(I), I > 0 ->
+    counters:put(Counters, I, 0),
+    reset(Counters, I - 1).
+
+counter() ->
+    counters:new(?BIN_COUNT, [write_concurrency]).
+
+hist_at(Counters, Time) when is_tuple(Counters), is_integer(Time) ->
+    % Erlang monotonic time can be negative, so add a TimeWindow to it, to make
+    % it positive again. Add +1 because counter indices start with 1 but X rem
+    % Y returns values between 0 and Y-1.
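+    % As an illustrative sketch (assuming TimeWindow = 3), consecutive times
+    % map to tuple slots like this:
+    %
+    %   Time:  -2 -1  0  1  2  3
+    %   Slot:   2  3  1  2  3  1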
+    TimeWindow = tuple_size(Counters),
+    case Time rem TimeWindow of
+        Idx when Idx < 0 -> element(Idx + TimeWindow + 1, Counters);
+        Idx -> element(Idx + 1, Counters)
+    end.
+
+-ifdef(TEST).
+
+-include_lib("couch/include/couch_eunit.hrl").
+
+basics_test() ->
+    H = new(),
+    ?assert(is_tuple(H)),
+    ?assertEqual(1, tuple_size(H)),
+    ?assertEqual(2, tuple_size(new(2))),
+    ?assertEqual([], bins(H)),
+    ?assertMatch([{_, _} | _], stats(H)),
+    ?assertEqual(0, proplists:get_value(n, stats(H))),
+    ?assertEqual(0, proplists:get_value(min, stats(H))),
+    ?assertEqual(0, proplists:get_value(max, stats(H))).
+
+update_test() ->
+    H = new(),
+    ?assertMatch(#{size := ?BIN_COUNT}, counters:info(read(H))),
+    ?assertEqual([], bins(H)),
+    update(H, 10.42),
+    ?assertEqual([{81, 1}], bins(H)),
+    ?assertEqual(81, bin_index(10.42)),
+    % Update again to see how histogram gets bumped to 2
+    update(H, 10.42),
+    ?assertEqual([{81, 2}], bins(H)),
+    ?assertEqual(2, proplists:get_value(n, stats(H))),
+    ?assert(proplists:get_value(min, stats(H)) >= 10.0),
+    ?assert(proplists:get_value(max, stats(H)) =< 11.0),
+    clear(H),
+    ?assertEqual([], bins(H)).
+
+update_with_small_value_test() ->
+    H = new(),
+    % 0 is below the minimum value
+    update(H, 0),
+    ?assertEqual([{1, 1}], bins(H)),
+    ?assertMatch([{_, _} | _], stats(H)),
+    ?assertEqual(1, proplists:get_value(n, stats(H))),
+    ?assert(bin_min(1) =< proplists:get_value(min, stats(H))),
+    ?assert(proplists:get_value(max, stats(H)) =< bin_max(1)),
+    clear(H),
+    ?assertEqual([], bins(H)).
+
+update_negative_time_index_test() ->
+    H = new(3),
+    update(H, -12, 0.42),
+    update(H, -11, 4.2),
+    update(H, -10, 4.2001),
+
+    ?assertEqual(44, bin_index(0.42)),
+    ?assertEqual(71, bin_index(4.2)),
+    ?assertEqual(71, bin_index(4.2001)),
+
+    ?assertEqual([{44, 1}], bins(H, -12, 1)),
+    ?assertEqual([{71, 1}], bins(H, -11, 1)),
+    ?assertEqual([{71, 1}], bins(H, -10, 1)),
+
+    % Combine 1st and 2nd
+    ?assertEqual([{44, 1}, {71, 1}], bins(H, -12, 2)),
+    % Combine 2nd and 3rd
+    ?assertEqual([{71, 2}], bins(H, -11, 2)),
+    % Combine all three
+    ?assertEqual([{44, 1}, {71, 2}], bins(H, -12, 3)),
+    % Wrap around
+    ?assertEqual([{44, 1}, {71, 2}], bins(H, -10, 3)),
+
+    % Clear last two
+    clear(H, -11, 2),
+    ?assertEqual([{44, 1}], bins(H, -12, 3)),
+
+    % Clear all
+    clear(H, -12, 3),
+    ?assertEqual([], bins(H, -12, 3)).
+
+update_positive_time_index_test() ->
+    H = new(3),
+    update(H, 1, 0.42),
+    update(H, 2, 4.2),
+    update(H, 3, 4.2001),
+
+    ?assertEqual([{44, 1}], bins(H, 1, 1)),
+    ?assertEqual([{71, 1}], bins(H, 2, 1)),
+    ?assertEqual([{71, 1}], bins(H, 3, 1)),
+
+    % Combine 1st and 2nd
+    ?assertEqual([{44, 1}, {71, 1}], bins(H, 1, 2)),
+    % Combine 2nd and 3rd
+    ?assertEqual([{71, 2}], bins(H, 2, 2)),
+    % Combine all three
+    ?assertEqual([{44, 1}, {71, 2}], bins(H, 1, 3)),
+    % Wrap around
+    ?assertEqual([{44, 1}, {71, 2}], bins(H, 3, 3)),
+
+    % Clear last two
+    clear(H, 2, 2),
+    ?assertEqual([{44, 1}], bins(H, 1, 3)),
+
+    % Clear all
+    clear(H, 1, 3),
+    ?assertEqual([], bins(H, 1, 3)).
+
+update_negative_and_positive_time_index_test() ->
+    H = new(3),
+    update(H, -1, 0.42),
+    update(H, 0, 4.2),
+    update(H, 1, 4.2001),
+
+    ?assertEqual([{44, 1}], bins(H, -1, 1)),
+    ?assertEqual([{71, 1}], bins(H, 0, 1)),
+    ?assertEqual([{71, 1}], bins(H, 1, 1)),
+
+    % Combine 1st and 2nd
+    ?assertEqual([{44, 1}, {71, 1}], bins(H, -1, 2)),
+    % Combine 2nd and 3rd
+    ?assertEqual([{71, 2}], bins(H, 0, 2)),
+    % Combine all three
+    ?assertEqual([{44, 1}, {71, 2}], bins(H, -1, 3)),
+    % Wrap around
+    ?assertEqual([{44, 1}, {71, 2}], bins(H, 1, 3)),
+
+    % Clear all
+    clear(H, -1, 3),
+    ?assertEqual([], bins(H, -1, 3)).
+
+update_with_large_value_test() ->
+    H = new(),
+    % Update with value > max
+    [update(H, 1.0e300) || _ <- lists:seq(1, 1000)],
+    ?assertEqual([{?BIN_COUNT, 1000}], bins(H)),
+    clear(H),
+    ?assertEqual([], bins(H)).
+
+calculated_constants_test() ->
+    ?assertEqual(?BIN_OFFSET, calc_bin_offset()),
+    ?assertEqual(?BIN_COUNT, calc_bin_count()).
+
+get_bin_boundaries_test() ->
+    Boundaries = get_bin_boundaries(),
+    ?assertEqual(?BIN_COUNT, length(Boundaries)),
+    lists:foreach(
+        fun({Min, Max}) ->
+            ?assert(Min < Max)
+        end,
+        Boundaries
+    ).
+
+% Test utility functions
+
+bins(H) ->
+    bins(H, 1, 1).
+
+bins(H, Time, Ticks) ->
+    Counters = read(H, Time, Ticks),
+    lists:foldl(
+        fun(I, Acc) ->
+            case counters:get(Counters, I) of
+                C when C > 0 -> [{I, C} | Acc];
+                _ -> Acc
+            end
+        end,
+        [],
+        lists:seq(?BIN_COUNT, 1, -1)
+    ).
+
+-endif.
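The bin-index trick documented in the histogram module above can be sketched outside Erlang as well. The following is a minimal, illustrative Python rendering (not part of this commit) using the module's ?INDEX_BITS = 14 and ?BIN_OFFSET = -8129 constants:

```python
import struct

INDEX_BITS = 14     # 11 exponent bits + 3 mantissa MSBs, as in the module
BIN_OFFSET = -8129  # the module's precomputed offset for MIN_VAL = 0.01

def bin_index(val: float) -> int:
    # Reinterpret the float as a big-endian 64-bit integer, drop the sign
    # bit, and keep the next 14 bits -- the Python analogue of the Erlang
    # match <<0:1, BinIndex:14, _/bitstring>> = <<Val/float>>.
    bits = struct.unpack(">Q", struct.pack(">d", val))[0]
    top = (bits >> (64 - 1 - INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return top + BIN_OFFSET

# Reproduces the bin indices asserted in the Erlang tests above:
print(bin_index(10.42), bin_index(0.42), bin_index(4.2))  # 81 44 71
```

Note that bin_index(0.01) lands in bin 1, matching the calc_bin_offset/0 derivation in the module.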
diff --git a/src/couch_stats/src/couch_stats_httpd.erl b/src/couch_stats/src/couch_stats_httpd.erl
index b40ba6094..88ea169d0 100644
--- a/src/couch_stats/src/couch_stats_httpd.erl
+++ b/src/couch_stats/src/couch_stats_httpd.erl
@@ -19,7 +19,6 @@
 -export([transform_stats/1, nest/1, to_ejson/1, extract_path/2]).
 
 handle_stats_req(#httpd{method = 'GET', path_parts = [_ | Path]} = Req) ->
-    flush(Req),
     Stats0 = couch_stats:fetch(),
     Stats = transform_stats(Stats0),
     Nested = nest(Stats),
@@ -105,11 +104,3 @@ maybe_format_key(Key) when is_integer(Key) ->
     list_to_binary(integer_to_list(Key));
 maybe_format_key(Key) when is_binary(Key) ->
     Key.
-
-flush(Req) ->
-    case couch_util:get_value("flush", chttpd:qs(Req)) of
-        "true" ->
-            couch_stats_aggregator:flush();
-        _Else ->
-            ok
-    end.
diff --git a/src/couch_stats/src/couch_stats_math.erl b/src/couch_stats/src/couch_stats_math.erl
new file mode 100644
index 000000000..dfffc9a16
--- /dev/null
+++ b/src/couch_stats/src/couch_stats_math.erl
@@ -0,0 +1,406 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(couch_stats_math).
+
+-export([
+    summary/2
+]).
+
+% Stats are computed over two passes. The first pass, with #acc1, gathers some
+% basics; the second pass, with #acc2, computes additional, more complex
+% statistics.
+%
+-record(acc1, {
+    % Non-zero bins = [{BinIndex, Counts}, ...]
+    bins = [],
+    n = 0,
+    sum = 0,
+    sum_log = 0,
+    sum_inv = 0
+}).
+
+-record(acc2, {
+    diff_2_sum = 0,
+    diff_3_sum = 0,
+    diff_4_sum = 0,
+    % pXX = {Rank, PercentileValue}
+    p50 = {0, 0},
+    p75 = {0, 0},
+    p90 = {0, 0},
+    p95 = {0, 0},
+    p99 = {0, 0},
+    p999 = {0, 0}
+}).
+
+summary(Counter, BinCount) when is_tuple(Counter), is_integer(BinCount) ->
+    calc_stats(pass1(Counter, BinCount)).
+
+calc_stats(#acc1{n = 0}) ->
+    % Can't do much with 0 items. Instead of inserting division-by-zero
+    % checks everywhere, just return an empty stats object
+    n0_stats();
+calc_stats(#acc1{n = N} = Stats) when is_integer(N), N > 0 ->
+    #acc1{bins = Bins, sum = Sum, sum_log = SumLog, sum_inv = SumInv} = Stats,
+    Mean = Sum / N,
+    Acc2 = pass2(Bins, N, Mean),
+    #acc2{
+        diff_2_sum = Diff2Sum,
+        diff_3_sum = Diff3Sum,
+        diff_4_sum = Diff4Sum,
+        p50 = {_, P50},
+        p75 = {_, P75},
+        p90 = {_, P90},
+        p95 = {_, P95},
+        p99 = {_, P99},
+        p999 = {_, P999}
+    } = Acc2,
+    Variance = Diff2Sum / N,
+    StdDev = math:sqrt(Variance),
+    [
+        {n, N},
+        {min, couch_stats_histogram:bin_min(element(1, hd(Bins)))},
+        {max, couch_stats_histogram:bin_max(element(1, lists:last(Bins)))},
+        {arithmetic_mean, Mean},
+        {geometric_mean, math:exp(SumLog / N)},
+        {harmonic_mean, N / SumInv},
+        {median, P50},
+        {variance, Variance},
+        {standard_deviation, StdDev},
+        {skewness, skewness(N, Diff3Sum, StdDev)},
+        {kurtosis, kurtosis(N, Diff4Sum, StdDev)},
+        {percentile, [
+            {50, P50},
+            {75, P75},
+            {90, P90},
+            {95, P95},
+            {99, P99},
+            {999, P999}
+        ]},
+        % Emit for compatibility reasons
+        {histogram, [{0, 0}]}
+    ].
+
+% The first pass removes 0 bins and calculates some basics like sums, counts, sum
+% logs. Some of these can only be used after the second pass over the data.
+%
+pass1(Counter, BinCount) ->
+    lists:foldl(
+        fun(Index, #acc1{} = Acc) ->
+            case counters:get(Counter, Index) of
+                Count when is_integer(Count), Count =< 0 ->
+                    Acc;
+                Count when is_integer(Count), Count > 0 ->
+                    Val = couch_stats_histogram:bin_middle(Index),
+                    Acc#acc1{
+                        bins = [{Index, Count} | Acc#acc1.bins],
+                        n = Acc#acc1.n + Count,
+                        sum = Acc#acc1.sum + Count * Val,
+                        sum_log = Acc#acc1.sum_log + Count * math:log(Val),
+                        sum_inv = Acc#acc1.sum_inv + Count / Val
+                    }
+            end
+        end,
+        #acc1{},
+        lists:seq(BinCount, 1, -1)
+    ).
+
+% Second statistics pass. This calculates the diff squared, cubed and 4th power
+% sums. These are used later to get the 2nd, 3rd and 4th central moments to
+% calculate variance, skewness and kurtosis. In this pass we also calculate
+% percentiles.
+%
+pass2(Bins, N, Mean) ->
+    % Initialize each percentile's rank value as N * Q. During traversal each
+    % corresponding rank will be decremented by the bin's count value until we
+    % get to the bin where percentile rank < bin count value. After we
+    % calculate a percentile (as a non-0 value), we ignore that percentile
+    % entry and stop updating its rank.
+    %
+    Acc0 = #acc2{
+        p50 = {N * 0.50, 0},
+        p75 = {N * 0.75, 0},
+        p90 = {N * 0.90, 0},
+        p95 = {N * 0.95, 0},
+        p99 = {N * 0.99, 0},
+        p999 = {N * 0.999, 0}
+    },
+    lists:foldl(
+        fun({Index, Count}, #acc2{} = Acc) ->
+            Acc1 = percentiles(Acc, Index, Count),
+            Diff = couch_stats_histogram:bin_middle(Index) - Mean,
+            Diff2 = Diff * Diff,
+            Diff3 = Diff2 * Diff,
+            Diff4 = Diff2 * Diff2,
+            Acc1#acc2{
+                diff_2_sum = Acc1#acc2.diff_2_sum + Count * Diff2,
+                diff_3_sum = Acc1#acc2.diff_3_sum + Count * Diff3,
+                diff_4_sum = Acc1#acc2.diff_4_sum + Count * Diff4
+            }
+        end,
+        Acc0,
+        Bins
+    ).
+
+%% erlfmt-ignore
+percentiles(#acc2{} = Acc1, Index, Count) ->
+    Acc2 = Acc1#acc2{p50 = percentile(Acc1#acc2.p50, Index, Count)},
+    Acc3 = Acc2#acc2{p75 = percentile(Acc2#acc2.p75, Index, Count)},
+    Acc4 = Acc3#acc2{p90 = percentile(Acc3#acc2.p90, Index, Count)},
+    Acc5 = Acc4#acc2{p95 = percentile(Acc4#acc2.p95, Index, Count)},
+    Acc6 = Acc5#acc2{p99 = percentile(Acc5#acc2.p99, Index, Count)},
+    Acc7 = Acc6#acc2{p999 = percentile(Acc6#acc2.p999, Index, Count)},
+    Acc7.
+
+percentile({Rank, 0}, Index, Count) when Rank < Count ->
+    Min = couch_stats_histogram:bin_min(Index),
+    Max = couch_stats_histogram:bin_max(Index),
+    Width = Max - Min,
+    % Count should not be 0, we already filtered out Count == 0 bins
+    Frac = Rank / Count,
+    % Do a bit of extra work to get a nicer interpolated percentile value. Frac
+    % is the fractional part of the bin width based on the rank left-over. For
+    % example, if the Count = 1000:
+    %
+    %   If Rank = 1, then Frac = 0.001 : We're closer to bin min
+    %   If Rank = 500, then Frac = 0.5 : We're closer to the middle
+    %   If Rank = 950, then Frac = 0.95 : We're closer to bin max
+    %
+    Percentile = Min + Width * Frac,
+    {Rank, Percentile};
+percentile({Rank, 0}, _Index, Count) ->
+    % Haven't reached our bin yet, reduce this percentile's rank by Count
+    % amount and keep going.
+    {Rank - Count, 0};
+percentile({Rank, Percentile}, _, _) when is_number(Percentile), Percentile > 0 ->
+    % Nothing left to do, we already calculated this percentile value.
+    {Rank, Percentile}.
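+
+% As an illustrative sketch of the rank walk above: with non-zero bins
+% [{10, 4}, {20, 6}] (N = 10), the p50 rank starts at N * 0.5 = 5.0; the
+% first bin (Count = 4) reduces it to 1.0, and the second bin (Count = 6)
+% satisfies 1.0 < 6, so p50 interpolates 1/6th of the way into that bin.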
+
+skewness(N, Diff3Sum, StdDev) ->
+    % https://en.wikipedia.org/wiki/Skewness
+    %   Skewness = M3 / StdDev^3
+    %   M3 = mean(Diff3Sum) = Diff3Sum/N. (3rd central moment)
+    case math:pow(StdDev, 3) of
+        StdDev3 when StdDev3 < 1.0e-12 ->
+            % If StdDev is too low avoid dividing by 0, assume skewness = 0
+            0;
+        StdDev3 ->
+            M3 = Diff3Sum / N,
+            M3 / StdDev3
+    end.
+
+kurtosis(N, Diff4Sum, StdDev) ->
+    % http://en.wikipedia.org/wiki/Kurtosis
+    %     Kurtosis = M4 / StdDev^4 - 3
+    %     M4 = mean(Diff4Sum) = Diff4Sum/N. (4th central moment)
+    %
+    % Normal distribution kurtosis is 3 so we subtract 3 to get excess kurtosis
+    % to show how it's different from a normal distribution.
+    case math:pow(StdDev, 4) of
+        StdDev4 when StdDev4 < 1.0e-12 ->
+            0;
+        StdDev4 ->
+            M4 = Diff4Sum / N,
+            M4 / StdDev4 - 3
+    end.
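The two moment-based formulas above can be written out for a plain list of samples. This is a hedged Python sketch of the same math (population moments, with the same near-zero StdDev guard), not the histogram-bin version used here:

```python
# Skewness = M3 / StdDev^3, excess kurtosis = M4 / StdDev^4 - 3, where Mk is
# the k-th central moment. Mirrors skewness/3 and kurtosis/3 above.
def central_moments(values):
    n = len(values)
    mean = sum(values) / n
    diffs = [v - mean for v in values]
    m2 = sum(d ** 2 for d in diffs) / n  # population variance
    m3 = sum(d ** 3 for d in diffs) / n  # 3rd central moment
    m4 = sum(d ** 4 for d in diffs) / n  # 4th central moment
    return m2, m3, m4

def skewness(values):
    m2, m3, _ = central_moments(values)
    std = m2 ** 0.5
    # Avoid dividing by (near) zero, as in the Erlang code
    return 0 if std ** 3 < 1.0e-12 else m3 / std ** 3

def excess_kurtosis(values):
    m2, _, m4 = central_moments(values)
    std = m2 ** 0.5
    return 0 if std ** 4 < 1.0e-12 else m4 / std ** 4 - 3
```

A symmetric sample gives skewness 0, and a constant sample short-circuits to 0 for both, just like the `StdDev` guard clauses above.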
+
+n0_stats() ->
+    [
+        {n, 0},
+        {min, 0},
+        {max, 0},
+        {arithmetic_mean, 0},
+        {geometric_mean, 0},
+        {harmonic_mean, 0},
+        {median, 0},
+        {variance, 0},
+        {standard_deviation, 0},
+        {skewness, 0},
+        {kurtosis, 0},
+        {percentile, [
+            {50, 0},
+            {75, 0},
+            {90, 0},
+            {95, 0},
+            {99, 0},
+            {999, 0}
+        ]},
+        {histogram, [{0, 0}]}
+    ].
+
+-ifdef(TEST).
+
+-include_lib("couch/include/couch_eunit.hrl").
+
+basic_test() ->
+    H = couch_stats_histogram:new(),
+    Vals = [0.05, 0.9, 0.7, 0.7, 10.1, 11, 100.5, 0.10, 13.5],
+    [couch_stats_histogram:update(H, V) || V <- Vals],
+    Stats = couch_stats_histogram:stats(H),
+    Percentiles = prop(percentile, Stats),
+    ?assertEqual(length(Vals), prop(n, Stats)),
+    ?assert(flim(0.05, prop(min, Stats))),
+    ?assert(flim(104, prop(max, Stats))),
+    ?assert(flim(15.3, prop(arithmetic_mean, Stats))),
+    ?assert(flim(1.9, prop(geometric_mean, Stats))),
+    ?assert(flim(0.25, prop(harmonic_mean, Stats))),
+    ?assert(flim(0.9, prop(median, Stats))),
+    ?assert(flim(923, prop(variance, Stats))),
+    ?assert(flim(30.4, prop(standard_deviation, Stats))),
+    % Values are skewed toward the left so we should have a positive skew
+    % https://en.wikipedia.org/wiki/Skewness
+    ?assert(flim(2.3, prop(skewness, Stats))),
+    % We have more extreme tail outliers compared to a normal distribution, so
+    % excess kurtosis should be > 0. In stats-speak the distribution would
+    % be "leptokurtic".
+    ?assert(flim(3.7, prop(kurtosis, Stats))),
+    ?assert(flim(0.9, prop(50, Percentiles))),
+    ?assert(flim(11.7, prop(75, Percentiles))),
+    ?assert(flim(97, prop(90, Percentiles))),
+    ?assert(flim(100, prop(95, Percentiles))),
+    ?assert(flim(103, prop(99, Percentiles))),
+    ?assert(flim(104, prop(999, Percentiles))).
+
+min_extreme_test() ->
+    % All the values in the smallest bin
+    H = couch_stats_histogram:new(),
+    N = 1000000,
+    [couch_stats_histogram:update(H, 0) || _ <- lists:seq(1, N)],
+    Stats = couch_stats_histogram:stats(H),
+    Percentiles = prop(percentile, Stats),
+    ?assertEqual(N, prop(n, Stats)),
+    ?assert(flim(0, prop(min, Stats))),
+    ?assert(flim(0, prop(max, Stats))),
+    ?assert(flim(0, prop(arithmetic_mean, Stats))),
+    ?assert(flim(0, prop(geometric_mean, Stats))),
+    ?assert(flim(0, prop(harmonic_mean, Stats))),
+    ?assert(flim(0, prop(median, Stats))),
+    ?assert(flim(0, prop(variance, Stats))),
+    ?assert(flim(0, prop(standard_deviation, Stats))),
+    ?assert(flim(0, prop(skewness, Stats))),
+    ?assert(flim(0, prop(kurtosis, Stats))),
+    ?assert(flim(0, prop(50, Percentiles))),
+    ?assert(flim(0, prop(75, Percentiles))),
+    ?assert(flim(0, prop(90, Percentiles))),
+    ?assert(flim(0, prop(95, Percentiles))),
+    ?assert(flim(0, prop(99, Percentiles))),
+    ?assert(flim(0, prop(999, Percentiles))).
+
+max_extreme_test() ->
+    % All the values are in the largest bin
+    H = couch_stats_histogram:new(),
+    N = 1000000,
+    % ?BIN_COUNT in couch_stats_histogram.erl
+    HighestBin = 230,
+    [couch_stats_histogram:update(H, 10000000) || _ <- lists:seq(1, N)],
+    Stats = couch_stats_histogram:stats(H),
+    Percentiles = prop(percentile, Stats),
+    ?assertEqual(N, prop(n, Stats)),
+    % Min would be the lower bound of the highest bin
+    BinMin = couch_stats_histogram:bin_min(HighestBin),
+    ?assert(flim(BinMin, prop(min, Stats))),
+    % Max would be the highest bound of the highest bin
+    BinMax = couch_stats_histogram:bin_max(HighestBin),
+    ?assert(flim(BinMax, prop(max, Stats))),
+    BinMid = couch_stats_histogram:bin_middle(HighestBin),
+    ?assert(flim(BinMid, prop(arithmetic_mean, Stats))),
+    ?assert(flim(BinMid, prop(geometric_mean, Stats))),
+    ?assert(flim(BinMid, prop(harmonic_mean, Stats))),
+    ?assert(flim(BinMid, prop(median, Stats))),
+    ?assert(flim(0, prop(variance, Stats))),
+    ?assert(flim(0, prop(standard_deviation, Stats))),
+    ?assert(flim(0, prop(skewness, Stats))),
+    ?assert(flim(0, prop(kurtosis, Stats))),
+    ?assert(flim(BinMid, prop(50, Percentiles))),
+    ?assert(flim(4128767, prop(75, Percentiles))),
+    ?assert(flim(4168089, prop(90, Percentiles))),
+    ?assert(flim(4181196, prop(95, Percentiles))),
+    ?assert(flim(4191682, prop(99, Percentiles))),
+    ?assert(flim(4194041, prop(999, Percentiles))).
+
+normal_dist_test() ->
+    H = couch_stats_histogram:new(),
+    rand:seed(default, {1, 2, 3}),
+    N = 1000000,
+    Mean = 50,
+    Var = 100,
+    [couch_stats_histogram:update(H, rand:normal(Mean, Var)) || _ <- lists:seq(1, N)],
+    Stats = couch_stats_histogram:stats(H),
+    Percentiles = prop(percentile, Stats),
+    ?assertEqual(N, prop(n, Stats)),
+    ?assert(flim(3.7, prop(min, Stats))),
+    ?assert(flim(104, prop(max, Stats))),
+    ?assert(flim(Mean, prop(arithmetic_mean, Stats))),
+    ?assert(flim(49, prop(geometric_mean, Stats))),
+    ?assert(flim(48, prop(harmonic_mean, Stats))),
+    % Median and mean of a normal distribution should be the same
+    ?assert(flim(Mean, prop(median, Stats))),
+    ?assert(flim(Var, prop(variance, Stats))),
+    ?assert(flim(math:sqrt(Var), prop(standard_deviation, Stats))),
+    % Skewness should be close to 0 as the distribution is symmetric
+    ?assert(flim(0.0, prop(skewness, Stats))),
+    % Excess kurtosis should be 0. In stats-speak normal distribution is
+    % "mesokurtic".
+    ?assert(flim(0.0, prop(kurtosis, Stats))),
+    % P50 = Median = Mean
+    ?assert(flim(Mean, prop(50, Percentiles))),
+    ?assert(flim(56, prop(75, Percentiles))),
+    ?assert(flim(63, prop(90, Percentiles))),
+    ?assert(flim(68, prop(95, Percentiles))),
+    ?assert(flim(74, prop(99, Percentiles))),
+    ?assert(flim(82, prop(999, Percentiles))).
+
+uniform_dist_test() ->
+    H = couch_stats_histogram:new(),
+    rand:seed(default, {1, 2, 3}),
+    N = 1000000,
+    % rand:uniform/1 returns values in [1,N], so subtract 1 to get values closer to 0
+    RandFun = fun() -> rand:uniform(10000001) / 10 - 1 end,
+    [couch_stats_histogram:update(H, RandFun()) || _ <- lists:seq(1, N)],
+    Stats = couch_stats_histogram:stats(H),
+    Percentiles = prop(percentile, Stats),
+    ?assertEqual(N, prop(n, Stats)),
+    ?assert(flim(0, prop(min, Stats))),
+    ?assert(flim(1040000, prop(max, Stats))),
+    ?assert(flim(500000, prop(arithmetic_mean, Stats))),
+    ?assert(flim(368000, prop(geometric_mean, Stats))),
+    ?assert(flim(8800, prop(harmonic_mean, Stats))),
+    ?assert(flim(500000, prop(median, Stats))),
+    % Variance and stddev should be large for a uniform distribution
+    ?assert(flim(83.0e9, prop(variance, Stats))),
+    ?assert(flim(290000, prop(standard_deviation, Stats))),
+    % Skewness should be close to 0 as the distribution is symmetric
+    ?assert(flim(0.0, prop(skewness, Stats))),
+    % Uniform distribution would be platykurtic. Excess kurtosis should be
+    % negative since we'd have fewer extreme outliers (at the tails) than a
+    % normal distribution might have.
+    ?assert(flim(-1.2, prop(kurtosis, Stats))),
+    ?assert(flim(500000, prop(50, Percentiles))),
+    ?assert(flim(750000, prop(75, Percentiles))),
+    ?assert(flim(900000, prop(90, Percentiles))),
+    ?assert(flim(950000, prop(95, Percentiles))),
+    ?assert(flim(1010000, prop(99, Percentiles))),
+    ?assert(flim(1040000, prop(999, Percentiles))).
+
+prop(Prop, KVs) ->
+    proplists:get_value(Prop, KVs).
+
+% Since we can't compare floats exactly we use
+% a tolerance of 5% and a minimum of 0.05
+%
+flim(X, Y) ->
+    flim(X, Y, max(0.05, abs(X * 0.05))).
+
+flim(X, Y, Tol) ->
+    abs(X - Y) < Tol.
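The tolerant comparison used by the tests above is simple enough to restate. A Python sketch of the same check (illustrative names, not the test helper itself):

```python
# Sketch of flim/2 above: values match within 5% of the expected value,
# with an absolute floor of 0.05 so comparisons near zero still work.
def flim(expected, actual, min_tol=0.05, rel=0.05):
    tol = max(min_tol, abs(expected * rel))
    return abs(expected - actual) < tol
```

So 104 passes against an expected 100 (tolerance 5), 106 does not, and values within 0.05 of zero pass against an expected 0.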
+
+-endif.
diff --git a/src/couch_stats/src/couch_stats_server.erl b/src/couch_stats/src/couch_stats_server.erl
new file mode 100644
index 000000000..d58a8f061
--- /dev/null
+++ b/src/couch_stats/src/couch_stats_server.erl
@@ -0,0 +1,250 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+% couch_stats_server is in charge of:
+%   - Initial metric loading from application stats descriptions.
+%   - Recycling (resetting to 0) stale histogram counters.
+%   - Checking and reloading if stats descriptions change.
+%   - Checking and reloading if histogram interval config value changes.
+%
+
+-module(couch_stats_server).
+
+-behaviour(gen_server).
+
+-export([
+    reload/0
+]).
+
+-export([
+    start_link/0,
+    init/1,
+    handle_call/3,
+    handle_cast/2,
+    handle_info/2
+]).
+
+-define(RELOAD_INTERVAL_SEC, 600).
+
+-record(st, {
+    hist_interval,
+    histograms,
+    clean_tref,
+    reload_tref
+}).
+
+reload() ->
+    gen_server:call(?MODULE, reload).
+
+start_link() ->
+    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
+
+init([]) ->
+    St = #st{
+        hist_interval = config:get("stats", "interval"),
+        clean_tref = erlang:send_after(clean_msec(), self(), clean),
+        reload_tref = erlang:send_after(reload_msec(), self(), reload)
+    },
+    {_, Stats} = try_reload(St),
+    {ok, St#st{histograms = couch_stats_util:histograms(Stats)}}.
+
+handle_call(reload, _From, #st{} = St) ->
+    {reply, ok, do_reload(St)};
+handle_call(Msg, _From, #st{} = St) ->
+    {stop, {unknown_call, Msg}, unknown_call, St}.
+
+handle_cast(Msg, #st{} = St) ->
+    {stop, {unknown_cast, Msg}, St}.
+
+handle_info(reload, #st{} = St) ->
+    {noreply, do_reload(St)};
+handle_info(clean, #st{} = St) ->
+    {noreply, do_clean(St)};
+handle_info(Msg, #st{} = St) ->
+    {stop, {unknown_info, Msg}, St}.
+
+do_clean(#st{} = St) ->
+    timer:cancel(St#st.clean_tref),
+    HistTRef = erlang:send_after(clean_msec(), self(), clean),
+    NowSec = erlang:monotonic_time(second),
+    BufferSec = couch_stats_util:histogram_safety_buffer_size_sec(),
+    IntervalSec = couch_stats_util:histogram_interval_sec(),
+    % The histogram timeline looks something like:
+    %
+    %  |<--buffer-->|<--stale-->|<--buffer-->|<--current-->|
+    %                ^                                    ^
+    %                StartSec                             NowSec
+    %
+    % To get to the start of "stale" part to clean it, subtract one interval,
+    % then a buffer, then another interval from NowSec.
+    %
+    StartSec = NowSec - IntervalSec - BufferSec - (IntervalSec - 1),
+    % Last -1 is because the interval ends are inclusive
+    maps:foreach(
+        fun(_, {_, Ctx, _}) ->
+            couch_stats_histogram:clear(Ctx, StartSec, IntervalSec)
+        end,
+        St#st.histograms
+    ),
+    St#st{clean_tref = HistTRef}.
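The timeline arithmetic in `do_clean/1` above can be checked with small numbers. A hedged Python sketch of the same computation (names are illustrative; the real values come from `couch_stats_util`):

```python
# Walk back from "now" over the current interval, one safety buffer, and the
# stale interval itself; the final -1 is because interval ends are inclusive.
# Returns the inclusive (start, end) of the stale window to be cleared.
def stale_window(now_sec, interval_sec, buffer_sec):
    start_sec = now_sec - interval_sec - buffer_sec - (interval_sec - 1)
    return start_sec, start_sec + interval_sec - 1
```

With the default 10-second interval and 5-second buffer, at `now_sec = 100` the stale window is seconds 76 through 85: the current window covers 91..100 and the buffers sit on either side of the stale region.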
+
+do_reload(#st{} = St) ->
+    timer:cancel(St#st.reload_tref),
+    RTRef = erlang:send_after(reload_msec(), self(), reload),
+    case try_reload(St) of
+        {true, NewStats} ->
+            timer:cancel(St#st.clean_tref),
+            Histograms = couch_stats_util:histograms(NewStats),
+            HTRef = erlang:send_after(clean_msec(), self(), clean),
+            St#st{
+                histograms = Histograms,
+                clean_tref = HTRef,
+                reload_tref = RTRef,
+                hist_interval = config:get("stats", "interval")
+            };
+        {false, _} ->
+            St#st{reload_tref = RTRef}
+    end.
+
+try_reload(#st{} = St) ->
+    NewDefs = couch_stats_util:load_metrics_for_applications(),
+    Stats = couch_stats_util:stats(),
+    MetricsChanged = couch_stats_util:metrics_changed(Stats, NewDefs),
+    IntervalChanged = interval_changed(St),
+    case MetricsChanged orelse IntervalChanged of
+        true ->
+            couch_stats_util:reset_histogram_interval_sec(),
+            NewStats = couch_stats_util:create_metrics(NewDefs),
+            couch_stats_util:replace_stats(NewStats),
+            {true, NewStats};
+        false ->
+            {false, Stats}
+    end.
+
+interval_changed(#st{hist_interval = OldInterval}) ->
+    case config:get("stats", "interval") of
+        Interval when OldInterval =:= Interval ->
+            false;
+        _ ->
+            true
+    end.
+
+reload_msec() ->
+    1000 * ?RELOAD_INTERVAL_SEC.
+
+clean_msec() ->
+    % We want to wake up more often than our interval, so we wake up about
+    % twice as often. If the interval is 10 seconds, we'd wake up every 5
+    % seconds and clean the most stale 10 seconds. It's a bit wasteful but
+    % it's a safety feature to ensure we don't miss anything.
+    (couch_stats_util:histogram_interval_sec() * 1000) div 2.
+
+-ifdef(TEST).
+
+-include_lib("couch/include/couch_eunit.hrl").
+
+couch_stats_server_test_() ->
+    {
+        foreach,
+        fun setup/0,
+        fun teardown/1,
+        [
+            ?TDEF_FE(t_server_starts),
+            ?TDEF_FE(t_reload_with_no_changes_works),
+            ?TDEF_FE(t_reload_with_changes_works),
+            ?TDEF_FE(t_cleaning_works, 10),
+            ?TDEF_FE(t_invalid_call),
+            ?TDEF_FE(t_invalid_cast),
+            ?TDEF_FE(t_invalid_msg)
+        ]
+    }.
+
+setup() ->
+    test_util:start_couch().
+
+teardown(Ctx) ->
+    config:delete("stats", "interval", _Persist = false),
+    test_util:stop_couch(Ctx).
+
+t_server_starts(_) ->
+    ?assert(is_process_alive(whereis(?MODULE))).
+
+t_reload_with_no_changes_works(_) ->
+    Pid = whereis(?MODULE),
+    ?assert(is_process_alive(Pid)),
+    ?assertEqual(ok, reload()),
+    ?assertEqual(Pid, whereis(?MODULE)),
+    ?assert(is_process_alive(Pid)),
+    % Let's reload a hundred more times
+    lists:foreach(
+        fun(_) ->
+            ?assertEqual(ok, reload()),
+            ?assertEqual(Pid, whereis(?MODULE)),
+            ?assert(is_process_alive(Pid))
+        end,
+        lists:seq(1, 100)
+    ).
+
+t_reload_with_changes_works(_) ->
+    Pid = whereis(?MODULE),
+    ?assert(is_process_alive(Pid)),
+    #st{hist_interval = Interval0} = sys:get_state(Pid),
+    ?assertEqual(undefined, Interval0),
+
+    config:set("stats", "interval", "7", false),
+    ?assertEqual(ok, reload()),
+    ?assertEqual(Pid, whereis(?MODULE)),
+    ?assert(is_process_alive(Pid)),
+    #st{hist_interval = Interval1} = sys:get_state(Pid),
+    ?assertEqual("7", Interval1),
+
+    #st{histograms = Hists} = sys:get_state(Pid),
+    [{_Key, {histogram, HCtx1, _Desc}} | _] = maps:to_list(Hists),
+    % Histogram window size should now be shorter
+    % 7 (active time window) + 7 (stale) + 5 + 5 for buffers = 24.
+    ?assertEqual(24, tuple_size(HCtx1)).
+
+t_cleaning_works(_) ->
+    config:set("stats", "interval", "1", false),
+    sys:log(?MODULE, {true, 100}),
+    ok = reload(),
+    timer:sleep(2000),
+    {ok, Events} = sys:log(?MODULE, get),
+    ok = sys:log(?MODULE, false),
+    config:set("stats", "interval", "10", false),
+    ok = reload(),
+    % Events looks like: [{in, Msg} | {noreply, ...} | {out, ..}, ...]
+    CleanEvents = [clean || {in, clean} <- Events],
+    ?assert(length(CleanEvents) >= 3).
+
+t_invalid_call(_) ->
+    Pid = whereis(?MODULE),
+    ?assert(is_process_alive(Pid)),
+    ?assertEqual(unknown_call, gen_server:call(Pid, potato)),
+    test_util:wait_value(fun() -> is_process_alive(Pid) end, false),
+    ?assertNot(is_process_alive(Pid)).
+
+t_invalid_cast(_) ->
+    Pid = whereis(?MODULE),
+    ?assert(is_process_alive(Pid)),
+    ok = gen_server:cast(Pid, potato),
+    test_util:wait_value(fun() -> is_process_alive(Pid) end, false),
+    ?assertNot(is_process_alive(Pid)).
+
+t_invalid_msg(_) ->
+    Pid = whereis(?MODULE),
+    ?assert(is_process_alive(Pid)),
+    Pid ! potato,
+    test_util:wait_value(fun() -> is_process_alive(Pid) end, false),
+    ?assertNot(is_process_alive(Pid)).
+
+-endif.
diff --git a/src/couch_stats/src/couch_stats_sup.erl b/src/couch_stats/src/couch_stats_sup.erl
index 2a92ac69c..325372c3e 100644
--- a/src/couch_stats/src/couch_stats_sup.erl
+++ b/src/couch_stats/src/couch_stats_sup.erl
@@ -28,7 +28,7 @@ init([]) ->
     {ok,
         {
             {one_for_one, 5, 10}, [
-                ?CHILD(couch_stats_aggregator, worker),
+                ?CHILD(couch_stats_server, worker),
                 ?CHILD(couch_stats_process_tracker, worker)
             ]
         }}.
diff --git a/src/couch_stats/src/couch_stats_util.erl b/src/couch_stats/src/couch_stats_util.erl
new file mode 100644
index 000000000..5cbd54b1d
--- /dev/null
+++ b/src/couch_stats/src/couch_stats_util.erl
@@ -0,0 +1,190 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+%   http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+-module(couch_stats_util).
+
+-export([
+    % Load metrics from apps
+    create_metrics/1,
+    load_metrics_for_applications/0,
+    metrics_changed/2,
+
+    % Get various metric types
+    get_counter/2,
+    get_gauge/2,
+    get_histogram/2,
+
+    % Get histogram interval config settings
+    histogram_interval_sec/0,
+    histogram_safety_buffer_size_sec/0,
+    reset_histogram_interval_sec/0,
+
+    % Manage the main stats (metrics) persistent term map
+    replace_stats/1,
+    stats/0,
+    histograms/1,
+
+    % Fetch stats values
+    fetch/3,
+    sample/4
+]).
+
+-define(DEFAULT_INTERVAL_SEC, 10).
+
+% Histogram types
+-define(HIST, histogram).
+-define(CNTR, counter).
+-define(GAUGE, gauge).
+
+% Safety buffer before and after current window to prevent
+% overwrites from the cleaner process
+-define(HIST_WRAP_BUFFER_SIZE_SEC, 5).
+
+% Persistent term keys
+-define(STATS_KEY, {?MODULE, stats}).
+-define(HIST_TIME_INTERVAL_KEY, {?MODULE, hist_time_interval}).
+
+load_metrics_for_applications() ->
+    Apps = [element(1, A) || A <- application:loaded_applications()],
+    lists:foldl(fun load_metrics_for_application_fold/2, #{}, Apps).
+
+load_metrics_for_application_fold(AppName, #{} = Acc) ->
+    case code:priv_dir(AppName) of
+        {error, _Error} ->
+            Acc;
+        Dir ->
+            case file:consult(Dir ++ "/stats_descriptions.cfg") of
+                {ok, Descriptions} ->
+                    DescMap = maps:map(
+                        fun(_, TypeDesc) ->
+                            Type = proplists:get_value(type, TypeDesc, counter),
+                            Desc = proplists:get_value(desc, TypeDesc, <<>>),
+                            {Type, Desc}
+                        end,
+                        maps:from_list(Descriptions)
+                    ),
+                    maps:merge(Acc, DescMap);
+                {error, _Error} ->
+                    Acc
+            end
+    end.
+
+metrics_changed(#{} = Map1, #{} = Map2) when map_size(Map1) =/= map_size(Map2) ->
+    % If their sizes are different they are obviously not the same
+    true;
+metrics_changed(#{} = Map1, #{} = Map2) when map_size(Map1) =:= map_size(Map2) ->
+    % If their intersection size is not the same as their individual size
+    % they are also not the same
+    map_size(maps:intersect(Map1, Map2)) =/= map_size(Map1).
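The change check above compares only key sets: equal-size maps are "unchanged" when their key intersection is as large as either map (`maps:intersect/2` keeps keys present in both). A hedged Python sketch of the same logic, for illustration only:

```python
# Sketch of metrics_changed/2 above. Two definition maps are considered
# unchanged only when they are the same size and share all of their keys;
# values (the {Type, Desc} tuples) are not compared.
def metrics_changed(map1, map2):
    if len(map1) != len(map2):
        # Different sizes: obviously not the same
        return True
    # Same size: changed unless every key of one appears in the other
    return len(map1.keys() & map2.keys()) != len(map1)
```

Adding or renaming a metric is detected; editing only a value under an existing key is not, which matches the key-based `maps:intersect/2` comparison.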
+
+get_counter(Name, #{} = Stats) ->
+    get_metric(Name, ?CNTR, Stats).
+
+get_gauge(Name, #{} = Stats) ->
+    get_metric(Name, ?GAUGE, Stats).
+
+get_histogram(Name, #{} = Stats) ->
+    get_metric(Name, ?HIST, Stats).
+
+get_metric(Name, Type, Stats) when is_atom(Type), is_map(Stats) ->
+    case maps:get(Name, Stats, unknown_metric) of
+        {FoundType, Metric, _Desc} when FoundType =:= Type ->
+            {ok, Metric};
+        {OtherType, _, _} ->
+            error_logger:error_msg("invalid metric: ~p ~p =/= ~p", [Name, Type, OtherType]),
+            {error, invalid_metric};
+        unknown_metric ->
+            error_logger:error_msg("unknown metric: ~p", [Name]),
+            {error, unknown_metric}
+    end.
+
+histogram_interval_sec() ->
+    case persistent_term:get(?HIST_TIME_INTERVAL_KEY, not_cached) of
+        not_cached ->
+            Time = config:get_integer("stats", "interval", ?DEFAULT_INTERVAL_SEC),
+            persistent_term:put(?HIST_TIME_INTERVAL_KEY, Time),
+            Time;
+        Val when is_integer(Val) ->
+            Val
+    end.
+
+reset_histogram_interval_sec() ->
+    persistent_term:erase(?HIST_TIME_INTERVAL_KEY).
+
+histogram_safety_buffer_size_sec() ->
+    ?HIST_WRAP_BUFFER_SIZE_SEC.
+
+histogram_total_size_sec() ->
+    % Add a safety buffer before and after the window couch_stats_server will
+    % periodically clear.
+    histogram_interval_sec() * 2 + ?HIST_WRAP_BUFFER_SIZE_SEC * 2.
+
+replace_stats(#{} = Stats) ->
+    persistent_term:put(?STATS_KEY, Stats).
+
+stats() ->
+    persistent_term:get(?STATS_KEY, #{}).
+
+histograms(Stats) ->
+    maps:filter(fun(_, {Type, _, _}) -> Type =:= ?HIST end, Stats).
+
+create_metrics(MetricsDefs) ->
+    maps:fold(fun create_fold/3, #{}, MetricsDefs).
+
+create_fold(Name, {?CNTR, Desc}, #{} = Acc) ->
+    Acc#{Name => {?CNTR, couch_stats_counter:new(), Desc}};
+create_fold(Name, {?GAUGE, Desc}, #{} = Acc) ->
+    Acc#{Name => {?GAUGE, couch_stats_gauge:new(), Desc}};
+create_fold(Name, {?HIST, Desc}, #{} = Acc) ->
+    TotalSizeSec = histogram_total_size_sec(),
+    Acc#{Name => {?HIST, couch_stats_histogram:new(TotalSizeSec), Desc}};
+create_fold(Name, Unknown, #{} = _Acc) ->
+    throw({unknown_metric, {Name, Unknown}}).
+
+fetch(#{} = Stats, Now, Interval) when is_integer(Now), is_integer(Interval) ->
+    {Result, _, _} = maps:fold(fun fetch_fold/3, {[], Now, Interval}, Stats),
+    Result.
+
+fetch_fold(Name, {?CNTR, Ctx, Desc}, {Entries, Now, Interval}) ->
+    Entry = [
+        {value, couch_stats_counter:read(Ctx)},
+        {type, ?CNTR},
+        {desc, Desc}
+    ],
+    {[{Name, Entry} | Entries], Now, Interval};
+fetch_fold(Name, {?GAUGE, Ctx, Desc}, {Entries, Now, Interval}) ->
+    Entry = [
+        {value, couch_stats_gauge:read(Ctx)},
+        {type, ?GAUGE},
+        {desc, Desc}
+    ],
+    {[{Name, Entry} | Entries], Now, Interval};
+fetch_fold(Name, {?HIST, Ctx, Desc}, {Entries, Now, Interval}) ->
+    Stats = couch_stats_histogram:stats(Ctx, Now, Interval),
+    Entry = [
+        {value, Stats},
+        {type, ?HIST},
+        {desc, Desc}
+    ],
+    {[{Name, Entry} | Entries], Now, Interval}.
+
+sample(Name, #{} = Stats, Time, Ticks) when is_integer(Time), is_integer(Ticks) ->
+    case maps:get(Name, Stats, unknown_metric) of
+        {?CNTR, Ctx, _Desc} ->
+            couch_stats_counter:read(Ctx);
+        {?GAUGE, Ctx, _Desc} ->
+            couch_stats_gauge:read(Ctx);
+        {?HIST, Ctx, _Desc} ->
+            couch_stats_histogram:stats(Ctx, Time, Ticks);
+        unknown_metric ->
+            throw(unknown_metric)
+    end.
