On 03/06/16 20:29, Sergey Fedorov wrote: > On 03/06/16 20:22, Emilio G. Cota wrote: >> On Sat, May 28, 2016 at 21:15:06 +0300, Sergey Fedorov wrote: >>> On 25/05/16 04:13, Emilio G. Cota wrote: >>> (snip) >>>> +double qdist_avg(const struct qdist *dist) >>>> +{ >>>> + unsigned long count; >>>> + size_t i; >>>> + double ret = 0; >>>> + >>>> + count = qdist_sample_count(dist); >>>> + if (!count) { >>>> + return NAN; >>>> + } >>>> + for (i = 0; i < dist->n; i++) { >>>> + struct qdist_entry *e = &dist->entries[i]; >>>> + >>>> + ret += e->x * e->count / count; >>> Please use Welford’s method or something like that, see >>> http://stackoverflow.com/a/1346890. >> Yes, the way the mean is computed right now, we might suffer >> from underflow if count is huge. But I'd rather take that, than the >> perf penalty of an iterative method (such as the one used >> in Welford's). Note that we might have huge amounts of >> items, e.g. one item per head bucket in qht's occupancy qdist >> (and 0.5M head buckets is easy to achieve). >> >> If we were to use an iterative method, we'd need to do something >> like: >> >> double qdist_avg(const struct qdist *dist) >> { >> size_t i, j; >> double ret = 0; >> >> if (!qdist_sample_count(dist)) { >> return NAN; >> } >> /* compute moving average to prevent under/overflow */ >> for (i = 0; i < dist->n; i++) { >> struct qdist_entry *e = &dist->entries[i]; >> >> for (j = 0; j < e->count; j++) { >> >> ret += (e->x - ret) / (i + j + 1); >> } >> } >> return ret; >> } >> >> Note that skipping the inner loop would be incorrect. > Ah, it's a shame. I'm wondering if there is some other algorithm that > could work for us?
Maybe something like https://en.wikipedia.org/wiki/Kahan_summation_algorithm could help? Regards, Sergey > >> I measured the time it takes to execute qdist_avg(&hst.occupancy) at the >> end of booting debian jessie for ARM. The difference is >> significant: >> >> Original: 0.000002 s >> Iterative: 0.002846 s > Have you compared the results of computing the average as well? > >> So really I think we should be OK with a potential underflow. If you want >> I can add a comment to remind our future selves of these findings. > Kind regards, > Sergey