chewbranca commented on code in PR #5602:
URL: https://github.com/apache/couchdb/pull/5602#discussion_r2221072289


##########
src/couch_stats/CSRT.md:
##########
@@ -0,0 +1,893 @@
+# Couch Stats Resource Tracker (CSRT)
+
+CSRT (Couch Stats Resource Tracker) is a real time stats tracking system that
+tracks the quantity of resources induced at the process level in a live
+queryable manner. It also generates process lifetime reports containing
+statistics on the total resource load of a request, as a function of things
+like dbs/docs opened, view and changes rows read, changes returned vs
+processed, JavaScript filter usage, duration, and more. This system is a
+paradigm shift in CouchDB visibility and introspection: it allows for
+expressive real time querying to introspect, understand, and aggregate
+CouchDB internal resource usage, as well as powerful filtering facilities for
+conditionally generating reports on "heavy usage" or "long/slow" requests.
+CSRT also extends `recon:proc_window` with `csrt:proc_window`, allowing for
+the same style of battle hardened introspection as Recon's excellent
+`proc_window`, but with the sample window over any of the CSRT tracked
+CouchDB stats!
+
+CSRT does this by piggy-backing off of the existing metrics tracked by way of
+`couch_stats:increment_counter`: at the time the local process induces those
+metrics inc calls, CSRT updates an ets entry containing the context
+information for the local process, such that global aggregate queries can be
+performed against the ets table, and process resource usage reports can be
+generated at the conclusion of the process's lifecycle. The ability to do
+aggregate querying in real time, in addition to the process lifecycle reports
+for post facto analysis over time, is a cornerstone of CSRT, and is the
+result of a series of iterations until a robust and scalable approach was
+built.
+
+The real time querying is achieved by way of a global ets table with
+`read_concurrency`, `write_concurrency`, and `decentralized_counters`
+enabled. Great care was taken to ensure that _zero_ concurrent writes to the
+same key occur in this model, and this entire system is predicated on the
+fact that incremental updates by way of `ets:update_counter` are *really*
+fast and efficient, in an atomic and isolated fashion, when coupled with
+decentralized counters and write concurrency. Each process that calls
+`couch_stats:increment_counter` tracks its local context in CSRT as well,
+with zero concurrent writes from any other processes. Outside of the context
+setup and teardown logic, _only_ operations to `ets:update_counter` are
+performed: one per process invocation of `couch_stats:increment_counter`,
+and one for coordinators to update worker deltas in a single batch, resulting
+in a 1:1 ratio of ets calls to real time stats updates for the primary
+workloads.
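
The table options and the single-writer counter update pattern described
above can be sketched as follows. This is a minimal illustration, not the
actual `csrt_server` code; the table name and field position are assumptions:

```erlang
%% Minimal sketch of the CSRT ets pattern: a global table tuned for
%% concurrent counter updates, where each process only ever writes to
%% its own row, so there are zero concurrent writes to the same key.
-module(csrt_sketch).
-export([create_table/0, bump/3]).

create_table() ->
    ets:new(csrt_server, [
        named_table,
        public,
        {read_concurrency, true},
        {write_concurrency, true},
        {decentralized_counters, true}
    ]).

%% Increment the counter at tuple position FieldPos by N for this
%% process's PidRef row; the row is assumed to have been inserted
%% during context setup.
bump(PidRef, FieldPos, N) ->
    ets:update_counter(csrt_server, PidRef, {FieldPos, N}).
```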
+
+The primary achievement of CSRT is the core framework itself for concurrent
+process local stats tracking and real time RPC delta accumulation in a
+scalable manner that allows for real time aggregate querying and process
+lifecycle reports. It took several versions to find a scalable and robust
+approach that induced minimal impact on maximum system throughput. Now that
+the framework is in place, it can be extended to track any further desired
+process local uses of `couch_stats:increment_counter`. That said, the
+currently selected set of stats to track was heavily influenced by the
+challenges in retroactively understanding the quantity of resources induced
+by a query like `/db/_changes?since=$SEQ`, or similarly, `/db/_find`.
+
+CSRT started as an extension of the Mango execution stats logic to `_changes`
+feeds, to get proper visibility into the quantity of docs read and filtered
+per changes request. The focus then inverted with the realization that we
+should instead use the existing stats tracking mechanisms, which have already
+been deemed critical information to track, and which then also allow for the
+real time tracking and aggregate query capabilities. The Mango execution
+stats can be ported into CSRT itself and become just one subset of the stats
+tracked as a whole; similarly, any additional desired stats tracking can be
+easily added and will be picked up in the RPC deltas and process lifetime
+reports.
+
+# CSRT Config Keys
+
+## -define(CSRT, "csrt").
+
+> config:get("csrt").
+
+Primary CSRT config namespace: contains core settings for enabling different
+layers of functionality in CSRT, along with global config settings for limiting
+data volume generation.
+
+## -define(CSRT_MATCHERS_ENABLED, "csrt_logger.matchers_enabled").
+
+> config:get("csrt_logger.matchers_enabled").
+
+Config toggles for enabling specific builtin logger matchers; see the
+dedicated section below on `# CSRT Default Matchers`.
+
+## -define(CSRT_MATCHERS_THRESHOLD, "csrt_logger.matchers_threshold").
+
+> config:get("csrt_logger.matchers_threshold").
+
+Config settings for defining the primary `Threshold` value of the builtin
+logger matchers; see the dedicated section below on `# CSRT Default
+Matchers`.
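
A hypothetical .ini example of the two sections above; the matcher name
`docs_read` here is illustrative only, see `# CSRT Default Matchers` for the
actual builtin matcher names:

```ini
; Enable a builtin matcher and set its threshold (matcher name is
; illustrative).
[csrt_logger.matchers_enabled]
docs_read = true

[csrt_logger.matchers_threshold]
docs_read = 1000
```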
+
+## -define(CSRT_MATCHERS_DBNAMES, "csrt_logger.dbnames_io").
+
> config:get("csrt_logger.dbnames_io").
+
+Config section for setting `$db_name = $threshold`, resulting in
+instantiating a "dbname_io" logger matcher for each `$db_name` that will
+generate a CSRT lifecycle report for any contexts that induced more
+operations on _any_ one field of
+`ioq_calls|get_kv_node|get_kp_node|docs_read|rows_read` than `$threshold`,
+and are on database `$db_name`.
+
+This is basically a simple matcher for finding heavy IO requests on a
+particular database, in a manner amenable to key/value pair specifications
+in this .ini file, until a more sophisticated declarative model exists. In
+particular, it's not easy to sequentially generate matchspecs by way of
+`ets:fun2ms/1`, and so an alternative mechanism for either dynamically
+assembling an `#rctx{}` to match against, or generating the raw matchspecs
+themselves, is warranted.
+
+## -define(CSRT_INIT_P, "csrt.init_p").
+
+> config:get("csrt.init_p").
+
+Config toggles for tracking counters on the spawning of RPC `fabric_rpc`
+workers by way of `rexi_server:init_p`. This allows us to conditionally
+enable new metrics for the desired RPC operations in an expandable manner,
+without having to add new stats for every single potential RPC operation.
+These are the individual metrics to track; the feature itself is enabled by
+way of the config toggle `config:get(?CSRT, "enable_init_p")`, and these
+configs can be left alone for the most part until new operations are
+tracked.
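
A hypothetical .ini example: the `enable_init_p` key is taken from the text
above, while the per-operation toggles under `[csrt.init_p]` are
illustrative assumptions:

```ini
; Globally enable init_p worker spawn tracking.
[csrt]
enable_init_p = true

; Per-RPC-operation toggles (operation names are illustrative).
[csrt.init_p]
all_docs = true
changes = true
```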
+
+# CSRT Code Markers
+
+## -define(CSRT_ETS, csrt_server).
+
+This is the reference to the CSRT ets table; it's managed by `csrt_server`,
+which is where the name originates from.
+
+## -define(MATCHERS_KEY, {csrt_logger, all_csrt_matchers}).
+
+This marker is where the active matchers are written to in `persistent_term`,
+for concurrent, parallel access to the logger matchers in the CSRT tracker
+processes for lifecycle reporting.
+
+# CSRT Process Dictionary Markers
+
+## -define(PID_REF, {csrt, pid_ref}).
+
+This marker is for storing the core `PidRef` identifier. The key idea here
+is that a context lifecycle is contained to within the given `PidRef`,
+meaning that a `Pid` can instantiate different CSRT lifecycles and pass
+those to different workers.
+
+This is specifically necessary for long running processes that need to
+handle many CSRT context lifecycles over the course of that individual
+process's lifecycle. In practice, this is immediately needed for the actual
+coordinator lifecycle tracking, as `chttpd` uses a worker pool of http
+request handlers that can be re-used, so we need a way to create a CSRT
+lifecycle corresponding to the given request currently being serviced. This
+is also intended to be used in other long running processes, like IOQ or
+`couch_js` pids, such that we can track the specific context inducing the
+operations on the `couch_file` pid or indexer or replicator or whatever.
+
+Worker processes have a more clear cut lifecycle, but either style of
+process can be exit'ed in a manner that skips the ability to do cleanup
+operations, so additionally there's a dedicated tracker process spawned to
+monitor the process that induced the CSRT context. That way we can do the
+dynamic logger matching directly in these tracker processes, and we can
+properly clean up the ets entries even if the Pid crashes.
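
The tracker pattern described above can be sketched as follows. This is a
minimal illustration, not the actual CSRT implementation; the table name,
stop message shape, and cleanup logic are assumptions:

```erlang
%% Sketch of a dedicated tracker process: it monitors the process that
%% induced the CSRT context and cleans up the ets entry even if that
%% process crashes without running its own teardown.
-module(csrt_tracker_sketch).
-export([spawn_tracker/1]).

spawn_tracker(PidRef = {Pid, _Ref}) ->
    spawn(fun() ->
        MonRef = erlang:monitor(process, Pid),
        receive
            %% Explicit context destruction, e.g. a chttpd worker
            %% finishing a request and stopping its tracker directly.
            {stop, PidRef} ->
                cleanup(PidRef);
            %% The tracked process died without doing cleanup itself.
            {'DOWN', MonRef, process, Pid, _Reason} ->
                cleanup(PidRef)
        end
    end).

cleanup(PidRef) ->
    %% This is also where the logger matchers would run against the
    %% final ets entry to decide whether to emit a lifecycle report.
    ets:delete(csrt_server, PidRef).
```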
+
+## -define(TRACKER_PID, {csrt, tracker}).
+
+A handle to the spawned tracker process that does cleanup and logger
+matching reports at the end of the process lifecycle. We store a reference
+to the tracker pid so that for explicit context destruction, like in
+`chttpd` workers after a request has been serviced, we can stop the tracker
+and perform the expected cleanup directly.

Review Comment:
   Yes, the key idea with CSRT is that it's a real time stats tracking engine 
where all the callers that induce `couch_stats:increment_counter` calls also 
immediately induce the corresponding CSRT call for tracking at the process 
level. This makes those values immediately available. The final "report" is 
just reading from those values once the process is dead or the context is 
closed, and then writing them to a report.
   
   This is especially critical for `rexi:ping`, as we _need_ to be streaming
the induced deltas over the wire through `rexi:ping` to keep the coordinators
apprised of the usage of long running requests. This was specifically done so
that we still get real time updates for long running `_find` queries that
rarely return data. Check out the comments in `rexi:ping`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
