On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote:
> Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> is now the winner on all my CPUs:

I thought it would be interesting to also measure the isolated effect
on just numeric_mul() without the query overhead.

I also included var1ndigits=5, var2ndigits=5, which should be unaffected,
just to get a sense of the noise level.

SELECT timeit.h('numeric_mul',array['9999','9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999','9999_9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999_9999','9999_9999_9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999_9999_9999','9999_9999_9999_9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999_9999_9999_9999','9999_9999_9999_9999_9999'],2,min_time:='1 s'::interval);

CPU                  | var1ndigits | var2ndigits | HEAD  | v7    | HEAD/v7
---------------------+-------------+-------------+-------+-------+---------
Apple M3 Max         |           1 |           1 | 28 ns | 18 ns | 1.56
Apple M3 Max         |           2 |           2 | 32 ns | 18 ns | 1.78
Apple M3 Max         |           3 |           3 | 38 ns | 21 ns | 1.81
Apple M3 Max         |           4 |           4 | 42 ns | 24 ns | 1.75
Intel Core i9-14900K |           1 |           1 | 25 ns | 20 ns | 1.25
Intel Core i9-14900K |           2 |           2 | 28 ns | 20 ns | 1.40
Intel Core i9-14900K |           3 |           3 | 33 ns | 24 ns | 1.38
Intel Core i9-14900K |           4 |           4 | 37 ns | 25 ns | 1.48
AMD Ryzen 9 7950X3D  |           1 |           1 | 37 ns | 29 ns | 1.28
AMD Ryzen 9 7950X3D  |           2 |           2 | 43 ns | 31 ns | 1.39
AMD Ryzen 9 7950X3D  |           3 |           3 | 50 ns | 37 ns | 1.35
AMD Ryzen 9 7950X3D  |           4 |           4 | 55 ns | 39 ns | 1.41

Impressive speed-up, between 25% and 81%.
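
The HEAD/v7 column is simply the ratio of the two timings, and the
percentage speed-up is that ratio minus one. A minimal sketch for
re-deriving both from the numbers in the table above; the column names
head_ns, v7_ns, head_v7 and speedup_pct are made up for this illustration,
and the timings are copied by hand from the table rather than measured by
the query:

```sql
-- Timings copied from the results table (nanoseconds per call).
SELECT cpu,
       ndigits,
       head_ns,
       v7_ns,
       -- Cast to numeric so the division is not integer division.
       round(head_ns::numeric / v7_ns, 2)          AS head_v7,
       round(100 * (head_ns::numeric / v7_ns - 1)) AS speedup_pct
FROM (VALUES
    ('Apple M3 Max',         1, 28, 18),
    ('Apple M3 Max',         2, 32, 18),
    ('Apple M3 Max',         3, 38, 21),
    ('Apple M3 Max',         4, 42, 24),
    ('Intel Core i9-14900K', 1, 25, 20),
    ('Intel Core i9-14900K', 2, 28, 20),
    ('Intel Core i9-14900K', 3, 33, 24),
    ('Intel Core i9-14900K', 4, 37, 25),
    ('AMD Ryzen 9 7950X3D',  1, 37, 29),
    ('AMD Ryzen 9 7950X3D',  2, 43, 31),
    ('AMD Ryzen 9 7950X3D',  3, 50, 37),
    ('AMD Ryzen 9 7950X3D',  4, 55, 39)
) AS t(cpu, ndigits, head_ns, v7_ns)
ORDER BY cpu, ndigits;
```

Wrapping this in a subquery and taking min(speedup_pct) and
max(speedup_pct) gives the two ends of the range quoted above.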

Regards,
Joel