I ran some benchmarks and found that each (costly) GC collection cycle frees only about 0.7% of the application's used memory (measured on the GC page bin contents).

I wrote some tooling to gather these statistics, available as a patched druntime:

https://github.com/D-Programming-Language/druntime/pull/803

My proposal is to implement pointer sampling in the GC (using hypothesis testing based on the hypergeometric or Poisson distribution) to tune this collection efficiency. The idea is to let the user specify what percentage of the heap the GC should sweep, on average, per cycle, so that collection cycles run less frequently.
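To make the sampling idea concrete, here is a rough sketch (in Python rather than D, purely for illustration; all names, thresholds, and numbers are hypothetical, not part of the patch). It samples pages without replacement and uses the normal approximation to the hypergeometric distribution to get a lower confidence bound on the freeable fraction, only triggering a collection when that bound exceeds a target:

```python
import math
import random

def estimate_freeable_fraction(pages, sample_size, z=1.96):
    """Estimate the fraction of dead (freeable) GC pages by sampling.

    `pages` is a hypothetical list of booleans (True = freeable); in a
    real GC this would be a cheap reachability probe of sampled pages,
    not a full mark phase.  Returns (estimate, lower_bound), where
    lower_bound is a normal-approximation lower confidence bound for
    the true fraction (z=1.96 gives roughly 97.5% one-sided coverage).
    """
    sample = random.sample(pages, sample_size)  # without replacement
    n, N = sample_size, len(pages)
    p_hat = sum(sample) / n
    # Finite-population correction: sampling without replacement is
    # hypergeometric, which the corrected normal curve approximates.
    fpc = (N - n) / (N - 1)
    stderr = math.sqrt(p_hat * (1.0 - p_hat) * fpc / n)
    return p_hat, p_hat - z * stderr

# Hypothetical usage: a simulated heap where ~5% of pages are dead.
random.seed(1)
heap = [random.random() < 0.05 for _ in range(100_000)]
est, lower = estimate_freeable_fraction(heap, sample_size=2_000)
target = 0.02  # only collect if >= 2% of the heap looks reclaimable
if lower >= target:
    print("worth collecting")
```

The point of the lower bound (rather than the raw estimate) is to control the false-positive rate: the GC would only pay for a full sweep when the sample gives statistical evidence that at least the target fraction is reclaimable.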

I'm still looking to challenge this idea with someone knowledgeable in probability and statistics and/or quality assurance. Does anyone think implementing this would be a waste of time? Would it collide with a semi-precise GC?
