On Thu, Mar 12, 2026 at 16:21, Bryan Green <[email protected]> wrote:
> I modified your memcpy1.c program to not inline the version functions. I
> changed the memcpy function call in version 1, added volatile to keep some
> DCE opportunities from happening and added a range of N values to keep the
> compiler from specializing the code for N = 4. Before it did DCE and the
> test1 function was just a ret.
>
> The interesting issue is the use of malloc versus the stack. The use of
> malloc will probably track closer with PG's use of palloc so I would say
> in that case this is an optimization. It might be fun to compile PG with
> and without the patch (in debug mode) and actually see what gets generated
> for this function.
>
> Here are the results I got using your modified benchmark:
>
> --- stack allocated ---
> stack  n=1  v1(patch):  49721599 ns  v2(original):  21477302 ns  ratio: 2.315  original wins
> stack  n=2  v1(patch):  52065462 ns  v2(original):  28765199 ns  ratio: 1.810  original wins
> stack  n=3  v1(patch):  58914958 ns  v2(original):  39726110 ns  ratio: 1.483  original wins
> stack  n=4  v1(patch):  64585275 ns  v2(original):  47046397 ns  ratio: 1.373  original wins
> stack  n=5  v1(patch):  73929844 ns  v2(original):  58588698 ns  ratio: 1.262  original wins
> stack  n=6  v1(patch):  95465376 ns  v2(original):  67807817 ns  ratio: 1.408  original wins
> stack  n=7  v1(patch):  86910226 ns  v2(original):  76999488 ns  ratio: 1.129  original wins
> stack  n=8  v1(patch): 107765417 ns  v2(original):  86046016 ns  ratio: 1.252  original wins
>
> --- malloc allocated ---
> malloc n=1  v1(patch): 133283824 ns  v2(original): 141361091 ns  ratio: 0.943  patch wins
> malloc n=2  v1(patch): 145625895 ns  v2(original): 180912711 ns  ratio: 0.805  patch wins
> malloc n=3  v1(patch): 153975594 ns  v2(original): 228459879 ns  ratio: 0.674  patch wins
> malloc n=4  v1(patch): 154483094 ns  v2(original): 248157408 ns  ratio: 0.623  patch wins
> malloc n=5  v1(patch): 157710598 ns  v2(original): 298795018 ns  ratio: 0.528  patch wins
> malloc n=6  v1(patch): 165196636 ns  v2(original): 332940132 ns  ratio: 0.496  patch wins
> malloc n=7  v1(patch): 169576370 ns  v2(original): 358438778 ns  ratio: 0.473  patch wins
> malloc n=8  v1(patch): 184463815 ns  v2(original): 403721513 ns  ratio: 0.457  patch wins

Thanks for your attention and tests.
I think that patch can continue then.

best regards,
Ranier Vilela
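
P.S. For anyone following the thread without memcpy1.c at hand, here is a
minimal sketch of the kind of harness the quoted message describes. It is not
the actual benchmark. The names (copy_memcpy, copy_loop, time_copy,
run_benchmark), the uint64_t element type, the loop-based form of the
"original" version and the exact placement of volatile are all assumptions made
for illustration. It only shows the three techniques mentioned: non-inlined
version functions, a volatile sink so the copies are not eliminated as dead
code, and a range of n values so the compiler cannot specialize for n = 4.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_N 8

/* GCC/Clang-specific attribute (assumption: one of those compilers is used). */
#define NOINLINE __attribute__((noinline))

/* Volatile sink: every iteration's result is observed, so the copy
 * cannot be removed as dead code. */
static volatile uint64_t sink;

/* Hypothetical "v1 (patch)": copy n elements with one memcpy call. */
NOINLINE void copy_memcpy(uint64_t *dst, const uint64_t *src, int n)
{
    memcpy(dst, src, (size_t) n * sizeof(uint64_t));
}

/* Hypothetical "v2 (original)": copy n elements one at a time. */
NOINLINE void copy_loop(uint64_t *dst, const uint64_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i];
}

typedef void (*copy_fn)(uint64_t *, const uint64_t *, int);

/* Time iters calls of fn and return the elapsed nanoseconds. */
int64_t time_copy(copy_fn fn, uint64_t *dst, const uint64_t *src,
                  int n, long iters)
{
    struct timespec t0, t1;

    assert(n >= 1 && n <= MAX_N);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
    {
        fn(dst, src, n);
        sink += dst[n - 1];     /* keep the result live */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (int64_t) (t1.tv_sec - t0.tv_sec) * 1000000000LL
        + (t1.tv_nsec - t0.tv_nsec);
}

/* Run both versions for n = 1..MAX_N against a stack and a malloc'd target. */
void run_benchmark(long iters)
{
    uint64_t    src[MAX_N];
    uint64_t    stack_dst[MAX_N];
    uint64_t   *heap_dst = malloc(MAX_N * sizeof(uint64_t));
    uint64_t   *targets[2];
    const char *labels[2] = {"stack", "malloc"};

    if (heap_dst == NULL)
        return;
    targets[0] = stack_dst;
    targets[1] = heap_dst;
    for (int i = 0; i < MAX_N; i++)
        src[i] = (uint64_t) i + 1;

    for (int t = 0; t < 2; t++)
    {
        for (int n = 1; n <= MAX_N; n++)
        {
            int64_t v1 = time_copy(copy_memcpy, targets[t], src, n, iters);
            int64_t v2 = time_copy(copy_loop, targets[t], src, n, iters);

            printf("%-6s n=%d v1(patch): %lld ns v2(original): %lld ns\n",
                   labels[t], n, (long long) v1, (long long) v2);
        }
    }
    free(heap_dst);
}
```

Calling run_benchmark(10000000) from main() prints one line per n for each
allocation kind. The absolute numbers will depend on compiler, flags and
libc, so this sketch should not be expected to reproduce the figures quoted
above.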
