> about percentages, i runned the bench with -c 200 to have instant
> results for development process. here in the benchmark file
> attached, it made more acceptable result when increased the -c flag
> to 2000.
This is much better, thanks!

However, there are still tests that show a difference of over 10% for
the same commit, in spite of a very large number of runs.  Increasing
the number of iterations is really a brute-force method; what are the
timings for the new defaults?  The tests must not be too slow,
otherwise I could run everything in a virtual machine like 'valgrind'
and use just a single iteration...

Apropos timings: Please add some information to the HTML page that
tells how long it takes to test a given font (or, even more detailed,
how long it takes to perform a certain test).

I suggest that you have a look at other statistical tools that do such
sampling, for example, Google's 'benchmark' project.  In particular,
have a look at its user manual:

  https://github.com/google/benchmark/blob/main/docs/user_guide.md

What especially caught my attention was the warmup-time option: Maybe
it helps if you add an option `--warmup=N` to the benchmark program
that makes it ignore the first N iterations before starting the
timing.

Maybe there are other things in the user manual (and/or the source
code) that you could use to improve the statistical quality of the
FreeType tests.  Other useful information on reducing the variance can
be found here; please do some research into what might be applicable!

  https://github.com/google/benchmark/blob/main/docs/reducing_variance.md

> I changed compiling and the linking process as the demo programs. i
> would like to continue to another build system if it seem ok.

Will test soon.


    Werner