On 2025-03-26 19:36, Mateusz Guzik wrote:
[...]

Hell, it may be your patch as is can be easily repurposed to
decentralize the main percpu counter? I mean perhaps there is no need
for any fancy hierarchical structure.

Here is an initial attempt at a design document for the hierarchical
percpu counters. Feedback welcome!

Split Counters With Binary Tree Approximation Propagation
=========================================================

Mathieu Desnoyers <[email protected]>
March 27, 2025

* Propagation diagram when reaching batch size thresholds (± batch size):

Example diagram for 8 CPUs:

log2(8) = 3 levels

At each level, each pair propagates its values to the next level when
reaching the batch size thresholds.

Counters at levels 0, 1, 2 can be kept on a single byte (±128 range).
Counter at level 3 can be kept on a 32/64-bit counter.

Level 0:  0    1    2    3    4    5    6    7
          |   /     |   /     |   /     |   /
          |  /      |  /      |  /      |  /
          | /       | /       | /       | /
Level 1:  0         1         2         3
          |       /           |       /
          |    /              |    /
          | /                 | /
Level 2:  0                   1
          |               /
          |         /
          |   /
Level 3:  0


* Inaccuracy:

BATCH(level N): Level N batch size.

Example for BATCH(level 0) = 4

BATCH(level 0) =  4
BATCH(level 1) =  8
BATCH(level 2) = 16
BATCH(level N) = BATCH(level 0) * 2^N

          per-counter     global
          inaccuracy      inaccuracy
Level 0:    ±  3          ± 24  (8 * 3)
Level 1:    ±  7          ± 28  (4 * 7)
Level 2:    ± 15          ± 30  (2 * 15)
Total:      ------        ± 82  (log2(nr_cpus) * BATCH(level 0) * nr_cpus - 
Sum[0 .. log2(nr_cpus) - 1](nr_cpus / 2^n)

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Reply via email to