nickva commented on PR #5625:
URL: https://github.com/apache/couchdb/pull/5625#issuecomment-3202941355
> Finally, ets_lru exists and has both size and time expiration. why invent a new way?
I looked at ets_lru but opted for something with O(1) reads and inserts. It's also more of a frequency-based cache with a delayed time component than a plain LRU: a true LRU would have to bump an entry's position in a sorted ets table or list on every access, while with a frequency counter we can rely on a single ets:update_counter/3 call and let the scanners for each sharded ets do the eviction work. The design mostly follows how we do the rate-limiting tables for couch_replicator and the couch_server sharding.
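To make the idea concrete, here is a minimal sketch of the access pattern described above. All names (module, table, functions) are illustrative, not the actual PR code; it only shows the O(1) counter bump on read plus a periodic scanner that decays counters and evicts cold entries:

```erlang
%% Hypothetical sketch of a frequency-counter cache (not the PR's code).
%% Entries are {Key, Value, Freq}; reads bump Freq in O(1) with
%% ets:update_counter, and a scanner periodically decays/evicts.
-module(freq_cache_sketch).
-export([init/0, get/2, put/3, scan_evict/1]).

init() ->
    ets:new(cache, [set, public, {write_concurrency, true}]).

%% O(1) read: on a hit, bump the frequency counter (3rd tuple element).
get(Tab, Key) ->
    case ets:lookup(Tab, Key) of
        [{Key, Val, _Freq}] ->
            ets:update_counter(Tab, Key, {3, 1}),
            {ok, Val};
        [] ->
            miss
    end.

%% O(1) insert with an initial frequency of 1.
put(Tab, Key, Val) ->
    ets:insert(Tab, {Key, Val, 1}).

%% Periodic scanner: halve each counter so frequencies decay over time,
%% and delete entries whose counter has dropped to zero.
scan_evict(Tab) ->
    ets:foldl(
        fun({Key, _Val, Freq}, Acc) ->
            case Freq div 2 of
                0 -> ets:delete(Tab, Key);
                NewFreq -> ets:update_element(Tab, Key, {3, NewFreq})
            end,
            Acc
        end, ok, Tab).
```

The point of the sketch is that hot-path reads never touch any ordering structure; only the out-of-band scanner pays the traversal cost.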
> I'd focus on benchmarking and more statistics, so we know what we're comparing. We'd want to know how many times we read the same term from disk in the current codebase versus how often we avoided it with the cache.
Yeah, I'll definitely try to run more benchmarks. These were run on my laptop; I'll see if I can get my hands on a real cluster. I think some of the stats already tell a good story. For instance, for the benchmark above we get:
```
> % http $DB/_node/_local/_stats/couchdb/bt_engine_cache
{
"full": {
"desc": "number of times bt_engine cache was full",
"type": "counter",
"value": 0
},
"hits": {
"desc": "number of bt_engine cache hits",
"type": "counter",
"value": 233296
},
"misses": {
"desc": "number of bt_engine cache misses",
"type": "counter",
"value": 7343
}
}
```
That is, caching only the level-2 btree tops already gets us 233296 kp_node hits against 7343 misses: that's how many kp_nodes we could fetch from the cache instead of reading them from disk.
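Working out the hit ratio from those two counters (a quick shell calculation, not part of the stats endpoint itself):

```erlang
%% Hit ratio from the counters reported above.
Hits = 233296,
Misses = 7343,
Ratio = Hits / (Hits + Misses).
%% roughly 0.97, i.e. about 97% of kp_node reads served from the cache
```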
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]