> Something else: I think that the 'TOTAL' line doesn't make sense
> right now. Please separate this line slightly from the rest of the
> table and print the *cumulated timing* (in 's', not 'µs') of all
> tests, something like
>
> Total duration for all tests: 25.3s
>
> and
>
> Total
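The suggested 'Total' line boils down to summing the per-test timings and converting µs to s. A minimal sketch of that conversion (the function name and its µs input are assumptions for illustration, not the actual ftbench code):

```python
# Hypothetical sketch of the requested 'Total' line; `total_line` and
# its µs input are illustrative assumptions, not the actual ftbench code.
def total_line(times_us):
    """Cumulate all per-test timings (in µs) and print the total in
    seconds, as a line separated from the rest of the table."""
    return f"Total duration for all tests: {sum(times_us) / 1e6:.1f}s"
```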
>> I still think that for such cases the number of iterations of the
>> affected tests should be increased to get more precise values.
>
> The times are for a single iteration (chunk median / chunk size).
Yes, but the number of iterations is the same regardless of whether a
test takes 10µs or 1000µs –
> I still think that for such cases the number of iterations of the affected
> tests should be increased to get more precise values.
The times are for a single iteration (chunk median / chunk size).
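The "chunk median / chunk size" figure can be sketched roughly like this (a hypothetical illustration; the function name and inputs are assumptions, not the actual ftbench code):

```python
import statistics

# Hypothetical sketch of the "chunk median / chunk size" per-iteration
# timing discussed above; names are illustrative assumptions.
def per_iteration_time(samples, chunk_size=100):
    """Group raw per-call timings into fixed-size chunks, take the
    median of the chunk totals, and divide by the chunk size to get
    a noise-resistant time per single iteration."""
    totals = [sum(samples[i:i + chunk_size])
              for i in range(0, len(samples), chunk_size)
              if i + chunk_size <= len(samples)]  # drop a partial chunk
    return statistics.median(totals) / chunk_size
```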
> Please separate this line slightly from the rest of the table
> and print the *cumulated timing* (in
> Here are the results with Chris’s suggestion (thanks, Chris).
Much better, thanks!
> There is still a bit of noise, but only on `load` and
> `load_advances`. Are the results acceptable?
As far as I can see, the biggest differences occur if the 'Baseline'
and 'Benchmark' columns contain very small values. I still
Hi,
here are the results with Chris’s suggestion (thanks, Chris).
I will check hyperfine.
There is still a bit of noise, but only on `load` and `load_advances`.
Are the results acceptable?
Best,
Goksu
goksu.in
On 28 Aug 2023 21:19 +0300, Werner LEMBERG wrote:
>
> code
FreeType Benchmark Results
Warning: Baseline and
>> Should I proceed to detect outliers? Since we do not get the same
>> error rate consistently, I think we will not find the target we
>> expected via outlier detection.
>
> Why do you think so? Please explain your reasoning. Just remember
> that backup processes (like cleaning up the hard disk,
Ahmet,
> I have edited the code in line with Hin-Tak’s suggestion. Here are
> the two results pages, also pushed to GitLab.
Thanks. It seems to me we are getting nearer. However, there are
still large differences.
* Chris mentioned a potential problem with `clock_gettime` in the
code
>> To summarize: Benchmark comparisons only work if there is a sound
>> mathematical foundation to reduce the noise.
>
> I am probably not qualified, but I have been following the discussion
> for some time. And I think there is a problem with the benchmarking
> itself. If I understand correctly the
Hi,
I have edited the code in line with Hin-Tak’s suggestion. Here are the two
results pages, also pushed to GitLab.
Best,
Goksu
goksu.in
On 18 Aug 2023 14:02 +0300, Werner LEMBERG wrote:
> > > What happens if you use, say, `-c 10', just running the
> > > `Get_Char_Index` test? Are the
On Fri, 18 Aug 2023 11:02:49 +0000 (UTC), Werner LEMBERG wrote:
> To summarize: Benchmark comparisons only work if there is a sound
> mathematical foundation to reduce the noise.
I am probably not qualified, but I have been following the discussion for some
time. And I think there is a problem with the
>> What happens if you use, say, `-c 10', just running the
>> `Get_Char_Index` test? Are the percentage timing differences then
>> still that large?
> Actually, `Get_Char_Index`, on the three pages I sent in the
> previous mail, is higher than 6% only 4 times out of 15 (which is
> seen on
Hi,
The approach we initially took was, in fact, based on the principle of the
interquartile range (IQR) – a method that excludes outliers by determining the
range between the first and third quartiles. However, I understand from your
feedback that directly focusing on the median and quantiles
On Friday, 18 August 2023 at 00:21:41 BST, Ahmet Göksu wrote:
> About outliers: I split every test into chunks of size 100, made IQR
> calculations, and calculated the average time over the valid chunks. You can
> find the result in the attachment, also pushed to GitLab.
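The interquartile-range step described here can be sketched as follows (an illustrative assumption about the approach, not the actual benchmark code; `iqr_filter` and `k` are hypothetical names):

```python
import statistics

# Illustrative IQR outlier filter, per the approach described above;
# the function name and parameters are assumptions.
def iqr_filter(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR], the usual
    interquartile-range criterion for excluding outlier chunks."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = k * (q3 - q1)
    return [v for v in values if q1 - spread <= v <= q3 + spread]
```

Averaging over the chunk totals that survive this filter then gives the reported time.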
> also,
> What happens if you use, say, `-c 10', just running the
> `Get_Char_Index` test? Are the percentage timing differences then
> still that large?
Actually, `Get_Char_Index`, on the three pages I sent in the previous mail, is
higher than 6% only 4 times out of 15 (which is seen on the other
> Just remember that backup processes (like cleaning up the hard disk,
> running some cron jobs, etc.) can pop up anytime, thus influencing
> the result.
s/backup/background/
> I have added the total table that you suggested.
Thanks.
> I think Get_Char_Index is not the problem; the results vary all
> the time.
As far as I can see, there is a direct relationship between the total
cumulated time of a test and the timing variation: The smaller the
cumulated time,
Hi,
I have added the total table that you suggested.
I think Get_Char_Index is not the problem; the results vary all the time.
Here are the three results that I got within the same minute (one has different
flags).
Should I proceed to detect outliers?
Since we do not get the same error rate
>> What exactly does 'Baseline (ms)' mean? Is the shown number the time
>> for one loop? For all loops together? Please clarify and mention
>> this on the HTML page.
>
> Clarified that the times are in milliseconds, giving the cumulative
> time over all iterations.
Thanks. The sentence is not easily
Hi!
I changed the code to warm up with a number of iterations.
> What exactly does 'Baseline (ms)' mean? Is the shown number the time
> for one loop? For all loops together? Please clarify and mention
> this on the HTML page.
Clarified that the times are in milliseconds, giving the cumulative time over all
> It warms up for the given number of seconds, set with the `-w` flag,
> before every benchmark test.
>
> There are still differences of around 100%. Also, a 1 sec warmup means
> (test count) * (font count) = 70 secs for the results.
Mhmm, I'm not sure whether a warmup *time span* makes sense. I would
rather
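Warming up by iteration count rather than by a time span could look roughly like this (a minimal sketch under stated assumptions; `bench` and its parameters are hypothetical, not the real ftbench harness):

```python
import time

# Minimal sketch of warming up by iteration count instead of a time
# span; `bench` and its parameters are illustrative assumptions.
def bench(fn, iterations=1000, warmup=100):
    """Run `warmup` untimed calls first (to fill caches and trigger
    lazy setup), then time `iterations` calls and return the mean
    time per call in nanoseconds."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations
```

A fixed iteration count makes the total run time scale with the test's own cost, instead of spending (test count) × (font count) seconds regardless.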