[Bug 1951032] Re: AArch64: Backport memcpy improvements
This bug was fixed in the package glibc - 2.31-0ubuntu9.9

---
glibc (2.31-0ubuntu9.9) focal; urgency=medium

  * Disable testsuite on riscv64. It is failing maths tests intermittently in
    ways that cannot be a glibc regression and is disabled in later series
    anyway.

glibc (2.31-0ubuntu9.8) focal; urgency=medium

  * Update for 20.04. (LP: #1951033)

  [ Balint Reczey ]
  * Cherry-pick upstream patch to fix building with -moutline-atomics
  * Prevent rare deadlock in pthread_cond_signal (LP: #1899800)

  [ Matthias Klose ]
  * Revert: Use DH_COMPAT=8 for dh_strip to fix debug sections for valgrind.
    Enables debugging ld.so related issues. (LP: #1918035)
  * Don't strip ld.so on armhf. (LP: #1927192)

  [ Gunnar Hjalmarsson ]
  * d/local/usr_sbin/update-locale: improve sanity checks. (LP: #1892825)

  [ Heitor Alves de Siqueira ]
  * d/p/u/git-lp1928508-reversing-calculation-of-__x86_shared_non_temporal.patch:
    - Fix memcpy() performance regression on x86 AMD systems (LP: #1928508)

  [ Aurelien Jarno ]
  * debian/debhelper.in/libc.preinst: drop the check for kernel release > 255
    now that glibc and preinstall script are fixed. (LP: #1962225)

  [ Michael Hudson-Doyle ]
  * libc6 on arm64 is now built with -moutline-atomics so libc6-lse can now be
    an empty package that is safe to remove. (LP: #1912652)
  * d/patches/u/aarch64-memcpy-improvements.patch: Backport memcpy
    improvements. (LP: #1951032)
  * Add test-float64x-yn to xfails on riscv64.

 -- Michael Hudson-Doyle Thu, 07 Apr 2022 13:24:41 +1200

** Changed in: glibc (Ubuntu Focal)
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1951032

Title:
  AArch64: Backport memcpy improvements

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1951032/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Tags removed: verification-needed-focal
** Tags added: verification-done-focal
** Tags removed: verification-needed
[Bug 1951032] Re: AArch64: Backport memcpy improvements
2.31-0ubuntu9.7 vs 2.31-0ubuntu9.9

= X-Gene =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         122.27 |        122.26 | -0.00%
  65536 |         216.55 |        215.81 | -0.34%
 131072 |         321.97 |        322.25 |  0.09%
 262144 |         515.08 |        515.29 |  0.04%
 524288 |         934.42 |        934.58 |  0.02%
1048576 |        1781.58 |       1783.57 |  0.11%

= ThunderX =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          56.52 |         56.57 |  0.09%
  65536 |          86.45 |         86.54 |  0.10%
 131072 |         131.51 |        131.58 |  0.06%
 262144 |         235.38 |        235.50 |  0.05%
 524288 |         428.95 |        429.21 |  0.06%
1048576 |         578.01 |        578.32 |  0.05%

= ThunderX2 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         124.61 |        121.43 | -2.54%
  65536 |         267.12 |        267.78 |  0.25%
 131072 |         508.33 |        509.59 |  0.25%
 262144 |         976.22 |        986.22 |  1.02%
 524288 |        1876.75 |       1894.43 |  0.94%
1048576 |        3723.27 |       3747.04 |  0.64%

= Hi1616 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         135.41 |        132.37 | -2.25%
  65536 |         251.27 |        248.20 | -1.22%
 131072 |         473.71 |        471.16 | -0.54%
 262144 |         938.66 |        934.16 | -0.48%
 524288 |        1627.75 |       1611.05 | -1.03%
1048576 |        2919.80 |       2912.29 | -0.26%

= Hi1620 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         213.00 |        213.93 |  0.44%
  65536 |         407.69 |        408.29 |  0.15%
 131072 |         755.45 |        761.75 |  0.83%
 262144 |        1409.21 |       1418.65 |  0.67%
 524288 |        2364.40 |       2388.89 |  1.04%
1048576 |        4024.87 |       4080.98 |  1.39%

= Altra =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         268.34 |        285.36 |  6.34%
  65536 |         519.58 |        549.29 |  5.72%
 131072 |         999.22 |       1049.03 |  4.98%
 262144 |        1853.42 |       1942.76 |  4.82%
 524288 |        2654.58 |       2650.23 | -0.16%
1048576 |        4030.24 |       3955.51 | -1.85%
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Ran the tests against -updates and -proposed on a Pi 4, Pi 3B+, and Pi
Zero 2 with the following results:

= Raspberry Pi 4 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          83.58 |         83.03 | -0.66%
  65536 |         156.08 |        155.69 | -0.25%
 131072 |         292.88 |        291.93 | -0.33%
 262144 |         543.00 |        550.95 |  1.47%
 524288 |         596.54 |        612.14 |  2.62%
1048576 |         640.72 |        643.76 |  0.47%

= Raspberry Pi 3B+ =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          55.55 |         55.38 | -0.31%
  65536 |          98.39 |         98.36 | -0.03%
 131072 |         173.72 |        169.80 | -2.26%
 262144 |         165.05 |        161.24 | -2.31%
 524288 |         146.80 |        147.16 |  0.25%
1048576 |         221.23 |        221.32 |  0.04%

= Raspberry Pi Zero 2 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          39.09 |         39.32 |  0.58%
  65536 |          70.09 |         70.33 |  0.34%
 131072 |         125.40 |        122.88 | -2.01%
 262144 |         133.79 |        133.76 | -0.02%
 524288 |         134.59 |        133.84 | -0.56%
1048576 |         203.99 |        203.89 | -0.04%

Looking much better than before; some mild improvement (and not too much
cost) on the Pi 4 (the only platform where arm64 typically makes sense
in the Pi series), and only minor degradation in the rest.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
I ran the tests on a graviton2 instance and saw the improvement as
expected/desired:

 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         233.19 |        247.61 |  6.18%
  65536 |         443.44 |        468.68 |  5.69%
 131072 |         852.52 |        895.00 |  4.98%
 262144 |        1630.48 |       1704.48 |  4.54%
 524288 |        2480.18 |       2601.47 |  4.89%
1048576 |        3900.73 |       4112.08 |  5.42%

I'll try to get some people to test some other arm64 systems before
calling this verification-done though.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Dave, if you have time to test 2.31-0ubuntu9.8~ppa4 vs focal-updates,
that would be interesting -- given Dann's results, though, I think they
will be pretty neutral, so I wouldn't worry overmuch about it.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
On the performance regression on the Pi 0/3 models above, bear in mind
that the models showing the regression are all the models with <=1GB of
RAM. Assuming there's no regression on the armhf side of things (I
haven't tested this, but I got the impression this was an arm64-only
change?), we wouldn't be affecting users of armhf, which is, in several
regards, the preferable architecture on the smaller platforms.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Results for the 2.31-0ubuntu9.8~ppa3 build:

= X-Gene =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         122.23 |        122.24 |  0.01%
  65536 |         215.55 |        215.23 | -0.15%
 131072 |         321.10 |        323.41 |  0.72%
 262144 |         514.52 |        514.71 |  0.04%
 524288 |         934.18 |        934.98 |  0.09%
1048576 |        1783.41 |       1781.47 | -0.11%

= ThunderX =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         122.23 |        122.24 |  0.01%
  65536 |         215.55 |        215.23 | -0.15%
 131072 |         321.10 |        323.41 |  0.72%
 262144 |         514.52 |        514.71 |  0.04%
 524288 |         934.18 |        934.98 |  0.09%
1048576 |        1783.41 |       1781.47 | -0.11%

= ThunderX2 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         127.67 |        127.31 | -0.28%
  65536 |         267.54 |        267.83 |  0.11%
 131072 |         511.12 |        511.25 |  0.03%
 262144 |         984.24 |        984.64 |  0.04%
 524288 |        1894.43 |       1893.61 | -0.04%
1048576 |        3750.56 |       3750.02 | -0.01%

= Hi1616 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         135.17 |        134.92 | -0.19%
  65536 |         251.17 |        249.38 | -0.71%
 131072 |         473.54 |        471.53 | -0.42%
 262144 |         932.73 |        929.82 | -0.31%
 524288 |        1586.08 |       1596.86 |  0.68%
1048576 |        2837.49 |       2853.36 |  0.56%

= Hi1620 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         210.90 |        213.21 |  1.10%
  65536 |         407.92 |        408.19 |  0.07%
 131072 |         756.54 |        761.39 |  0.64%
 262144 |        1405.13 |       1420.10 |  1.07%
 524288 |        2376.22 |       2390.99 |  0.62%
1048576 |        4094.42 |       4062.20 | -0.79%

= Altra =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         270.03 |        286.89 |  6.25%
  65536 |         520.20 |        549.39 |  5.61%
 131072 |         999.18 |       1048.87 |  4.97%
 262144 |        1899.90 |       1931.89 |  1.68%
 524288 |        2694.23 |       2874.09 |  6.68%
1048576 |        4102.38 |       4356.14 |  6.19%
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Results for the 2.31-0ubuntu9.8~ppa2 build:

= X-Gene =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         122.20 |        122.20 |  0.00%
  65536 |         215.42 |        213.49 | -0.89%
 131072 |         321.09 |        321.35 |  0.08%
 262144 |         513.74 |        514.24 |  0.10%
 524288 |         933.63 |        934.49 |  0.09%
1048576 |        1782.39 |       1782.30 | -0.00%

= ThunderX =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          64.26 |         64.32 |  0.09%
  65536 |          96.20 |         96.22 |  0.03%
 131072 |         146.15 |        146.19 |  0.03%
 262144 |         261.63 |        261.69 |  0.02%
 524288 |         476.88 |        476.90 |  0.00%
1048576 |         642.50 |        642.45 | -0.01%

= ThunderX2 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         126.34 |        127.45 |  0.88%
  65536 |         267.46 |        267.44 | -0.01%
 131072 |         510.13 |        508.72 | -0.28%
 262144 |         985.29 |        982.52 | -0.28%
 524288 |        1898.92 |       1892.56 | -0.34%
1048576 |        3758.78 |       3737.30 | -0.57%

= Hi1616 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         134.48 |        134.65 |  0.13%
  65536 |         249.25 |        250.88 |  0.66%
 131072 |         471.34 |        473.74 |  0.51%
 262144 |         932.67 |        930.73 | -0.21%
 524288 |        1616.07 |       1611.49 | -0.28%
1048576 |        2915.93 |       2928.78 |  0.44%

= Hi1620 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         211.49 |        212.56 |  0.50%
  65536 |         407.57 |        408.56 |  0.24%
 131072 |         753.83 |        755.84 |  0.27%
 262144 |        1401.03 |       1399.78 | -0.09%
 524288 |        2371.48 |       2364.44 | -0.30%
1048576 |        4075.81 |       4041.50 | -0.84%

= Altra =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         270.18 |        270.38 |  0.08%
  65536 |         520.13 |        519.56 | -0.11%
 131072 |         990.88 |       1000.81 |  1.00%
 262144 |        1890.44 |       1914.64 |  1.28%
 524288 |        2711.94 |       2707.39 | -0.17%
1048576 |        4051.21 |       4057.54 |  0.16%
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Some more results:

= Raspberry Pi 3B 1GB =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          48.24 |         46.19 | -4.26%
  65536 |          85.99 |         79.96 | -7.02%
 131072 |         154.00 |        139.68 | -9.30%
 262144 |         178.72 |        164.12 | -8.17%
 524288 |         163.56 |        156.55 | -4.28%
1048576 |         246.15 |        234.32 | -4.81%

= Raspberry Pi 3A+ 512MB =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          57.11 |         54.22 | -5.06%
  65536 |         101.16 |         94.53 | -6.56%
 131072 |         186.94 |        168.37 | -9.94%
 262144 |         200.16 |        181.37 | -9.39%
 524288 |         175.91 |        168.93 | -3.97%
1048576 |         261.19 |        250.62 | -4.04%

= Raspberry Pi Zero 2 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          40.58 |         38.75 | -4.51%
  65536 |          72.51 |         67.57 | -6.81%
 131072 |         132.02 |        121.20 | -8.20%
 262144 |         165.26 |        149.13 | -9.76%
 524288 |         160.46 |        153.15 | -4.55%
1048576 |         241.92 |        230.87 | -4.57%

Worth noting that the Pi 4 uses the 2711 SoC, while these (the 3B, 3A+,
and Zero 2) all use the older 2837 SoC. In other words, while the new
memcpy seems "okay" on the 2711, it's got "some" performance regression
(roughly 4-10% in the tables above) on the 2837.
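To make it easier to see at a glance which rows regress, the figures can be run through a small filter like the one below. This is an illustrative sketch only: the helper names and the 3% cut-off are assumptions (the test case asks for "no significant regression" without defining a number), and the deltas are recomputed from the rounded MiB/s values of the Pi 3B table, so they can differ from the posted ones in the last digit.

```python
# Sketch only: helper names and the 3% threshold are assumptions.

# "length | before | after" rows copied from the Raspberry Pi 3B 1GB table.
PI_3B_ROWS = """\
32768 | 48.24 | 46.19
65536 | 85.99 | 79.96
131072 | 154.00 | 139.68
262144 | 178.72 | 164.12
524288 | 163.56 | 156.55
1048576 | 246.15 | 234.32
"""


def parse_rows(text):
    """Parse 'length | before | after' lines into (int, float, float) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        length, before, after = (field.strip() for field in line.split("|"))
        rows.append((int(length), float(before), float(after)))
    return rows


def find_regressions(rows, threshold_pct=3.0):
    """Return (length, delta%) for rows that got slower by more than threshold_pct."""
    flagged = []
    for length, before, after in rows:
        delta = (after - before) / before * 100.0
        if delta < -threshold_pct:
            flagged.append((length, delta))
    return flagged


if __name__ == "__main__":
    for length, delta in find_regressions(parse_rows(PI_3B_ROWS)):
        print(f"{length:>7}: {delta:.2f}%")
```

With the assumed 3% cut-off every length in the Pi 3B table is flagged; a looser cut-off such as 10% flags none of them.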
[Bug 1951032] Re: AArch64: Backport memcpy improvements
= Raspberry Pi 4B (rev 1.1) 4GB =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          83.88 |         84.21 |  0.39%
  65536 |         156.15 |        158.85 |  1.73%
 131072 |         292.38 |        298.58 |  2.12%
 262144 |         551.15 |        543.42 | -1.40%
 524288 |         606.33 |        599.74 | -1.09%
1048576 |         651.02 |        654.38 |  0.52%

(will test some more models when I'm back home this afternoon)
[Bug 1951032] Re: AArch64: Backport memcpy improvements
= ThunderX2 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         126.67 |        115.21 | -9.05%
  65536 |         267.91 |        244.92 | -8.58%
 131072 |         510.48 |        473.60 | -7.22%
 262144 |         985.45 |        928.21 | -5.81%
 524288 |        1893.52 |       1799.80 | -4.95%
1048576 |        3755.69 |       3576.35 | -4.78%

= Altra =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         271.34 |        287.24 |  5.86%
  65536 |         520.02 |        548.40 |  5.46%
 131072 |         998.58 |       1047.18 |  4.87%
 262144 |        1890.99 |       1970.43 |  4.20%
 524288 |        2571.28 |       2731.96 |  6.25%
1048576 |        3873.37 |       4134.21 |  6.73%

= X-Gene =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         122.15 |        125.95 |  3.11%
  65536 |         212.34 |        218.16 |  2.74%
 131072 |         321.07 |        317.36 | -1.15%
 262144 |         513.98 |        486.59 | -5.33%
 524288 |         935.09 |        866.97 | -7.28%
1048576 |        1785.51 |       1647.05 | -7.75%

= Hi1620 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         211.63 |        213.69 |  0.97%
  65536 |         408.22 |        407.80 | -0.11%
 131072 |         753.09 |        764.44 |  1.51%
 262144 |        1396.94 |       1414.23 |  1.24%
 524288 |        2360.75 |       2375.61 |  0.63%
1048576 |        4063.19 |       4085.62 |  0.55%

= ThunderX =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |          62.84 |         62.87 |  0.06%
  65536 |          96.08 |         96.10 |  0.02%
 131072 |         146.16 |        146.16 | -0.00%
 262144 |         261.62 |        261.64 |  0.01%
 524288 |         476.86 |        476.87 |  0.00%
1048576 |         642.25 |        642.53 |  0.04%

= Hi1616 =
 length | before (MiB/s) | after (MiB/s) |  delta
--------|----------------|---------------|-------
  32768 |         132.80 |        135.33 |  1.90%
  65536 |         245.60 |        259.11 |  5.50%
 131072 |         465.64 |        493.09 |  5.90%
 262144 |         917.01 |        967.84 |  5.54%
 524288 |        1607.14 |       1658.73 |  3.21%
1048576 |        2898.84 |       2893.28 | -0.19%
[Bug 1951032] Re: AArch64: Backport memcpy improvements
It was pointed out to me that my compare.py script printed the times the
wrong way around, so I fixed that and also changed it to print output in
MiB/s -- always easier to reason about a benchmark when "bigger is
better"!

** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
- Download the "bench.tar.gz" attachment from this report. It has a README
+ Download the "bench.tar.gz" attachment from this report. It has a README
  that explains what to do, but here it is for reference:

  benchmark for testing arm64 memcpy improvements in SRU

  This is a benchmark that was derived from the memcpy benchmarks in glibc
  but altered to benchmark the public 'memcpy' symbol and be linked to the
  installed libc.

  To use this there are 5 steps:

  1. build -- just run "make test"
  2. run before upgrade -- "make bench-before"
  3. upgrade libc6 package -- depends on what is being tested!
  4. run again -- "make bench-after"
  5. compare -- "make compare"

  It produces output like this:

-  length | before |  after |  delta
- --------|--------|--------|-------
-   32768 | 125995 | 133696 | -6.11%
-   65536 | 133349 | 140856 | -5.63%
-  131072 | 139653 | 146419 | -4.84%
-  262144 | 145441 | 152353 | -4.75%
-  524288 | 191951 | 199856 | -4.12%
- 1048576 | 240515 | 256623 | -6.70%
+  length | before (MiB/s) | after (MiB/s) |  delta
+ --------|----------------|---------------|-------
+   32768 |         233.74 |        248.03 |  6.11%
+   65536 |         443.72 |        468.69 |  5.63%
+  131072 |         853.71 |        895.08 |  4.84%
+  262144 |        1640.93 |       1718.91 |  4.75%
+  524288 |        2501.80 |       2604.83 |  4.12%
+ 1048576 |        3896.77 |       4157.74 |  6.70%

  On graviton2 systems, this should show an improvement of at least
  several percent. On other arm64 systems (raspberry pis of various
  vintage, thunderx2, xgene, etc etc) no significant regression should be
  seen.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.

** Attachment removed: "bench.tar.gz"
   https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1951032/+attachment/5566380/+files/bench.tar.gz

** Attachment added: "benchmark with fixed comparison script"
   https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1951032/+attachment/5566659/+files/bench.tar.gz
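For reference, the MiB/s and delta columns printed by the fixed comparison script can be reproduced roughly as follows. This is only a sketch, not the actual compare.py from bench.tar.gz: the function names and the idea that each measurement is a (bytes copied, elapsed seconds) pair are assumptions. The delta is taken relative to the "before" figure, which agrees with the sample rows in the description (233.74 -> 248.03 is about 6.11%).

```python
# Sketch only: not the compare.py shipped in bench.tar.gz.
# Function names and the (bytes, seconds) input shape are assumptions.

MIB = 1024 * 1024


def throughput_mib_s(bytes_copied: int, seconds: float) -> float:
    """Convert a raw measurement into MiB/s, so that bigger is better."""
    return bytes_copied / MIB / seconds


def delta_percent(before: float, after: float) -> float:
    """Relative change of 'after' versus 'before', in percent."""
    return (after - before) / before * 100.0


def format_row(length: int, before: float, after: float) -> str:
    """Render one row in the same column order as the tables in this bug."""
    return (f"{length:>7} | {before:>14.2f} | {after:>13.2f} | "
            f"{delta_percent(before, after):>5.2f}%")


if __name__ == "__main__":
    # Figures taken from the sample output in the description.
    print(format_row(32768, 233.74, 248.03))
```

Reporting throughput rather than elapsed time means a positive delta is always an improvement, which is the "bigger is better" convention the message describes.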
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
- Compile the test_memcpy.c that is attached to this report:
+ Download the "bench.tar.gz" attachment from this report. It has a README
+ that explains what to do, but here it is for reference:

- $ gcc -g -O3 test_memcpy.c -o test_memcpy64
+ benchmark for testing arm64 memcpy improvements in SRU

- "./test_memcpy64 1024" should be run before and after installing the
- libc packages from proposed. On graviton2 systems, this should show a
- substantial improvement. On other arm64 systems (raspberry pis of
- various vintage, thunderx2, xgene, etc etc) at least no significant
- regression should be seen.
+ This is a benchmark that was derived from the memcpy benchmarks in glibc but altered to benchmark the public 'memcpy' symbol and be linked to the
+ installed libc.
+
+ To use this there are 5 steps:
+
+ 1. build -- just run "make test"
+ 2. run before upgrade -- "make bench-before"
+ 3. upgrade libc6 package -- depends on what is being tested!
+ 4. run again -- "make bench-after"
+ 5. compare -- "make compare"
+
+ It produces output like this:
+
+  length | before |  after |  delta
+ --------|--------|--------|-------
+   32768 | 125995 | 133696 | -6.11%
+   65536 | 133349 | 140856 | -5.63%
+  131072 | 139653 | 146419 | -4.84%
+  262144 | 145441 | 152353 | -4.75%
+  524288 | 191951 | 199856 | -4.12%
+ 1048576 | 240515 | 256623 | -6.70%
+
+ On graviton2 systems, this should show an improvement of at least
+ several percent. On other arm64 systems (raspberry pis of various
+ vintage, thunderx2, xgene, etc etc) no significant regression should be
+ seen.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.

** Attachment added: "bench.tar.gz"
   https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1951032/+attachment/5566380/+files/bench.tar.gz
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Here's a set of regression tests I ran across various ARM Server SoCs:

https://docs.google.com/spreadsheets/d/1hSdI5XKgXXw2iKV1Ceab0w3kqaO8wcO_cwYKQn0cdwA/edit#gid=0

This does show a negative performance impact for all but 1, the worst
one being just over 2% (Altra).
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Bumping up the buffer sizes mentioned in [Test Case] from 32 -> 1024. In
my testing, 32 results are inconsistent, perhaps due to caching effects.
1024 results look far more reliable.

** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
  Compile the test_memcpy.c that is attached to this report:

  $ gcc -g -O3 test_memcpy.c -o test_memcpy64

- "./test_memcpy64 32" should be run before and after installing the libc
- packages from proposed. On graviton2 systems, this should show a
+ "./test_memcpy64 1024" should be run before and after installing the
+ libc packages from proposed. On graviton2 systems, this should show a
  substantial improvement. On other arm64 systems (raspberry pis of
  various vintage, thunderx2, xgene, etc etc) at least no significant
  regression should be seen.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.
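The sensitivity to buffer size mentioned in this message can be explored with a quick timing loop such as the following. It is a rough sketch and not the attached test_memcpy.c: it goes through Python's ctypes.memmove, which calls the C library's memmove() rather than memcpy(), and the function name, repeat count and best-of-N policy are assumptions chosen for illustration.

```python
# Sketch only: not test_memcpy.c. ctypes.memmove wraps the C library's
# memmove(), which may take a different code path than memcpy().
import ctypes
import time

MIB = 1024 * 1024


def copy_throughput_mib_s(size_mib: int, repeats: int = 5) -> float:
    """Time copying a size_mib MiB buffer and return the best rate in MiB/s."""
    size = size_mib * MIB
    src = ctypes.create_string_buffer(size)
    dst = ctypes.create_string_buffer(size)
    # Warm-up copy so first-touch costs are not part of the timed runs.
    ctypes.memmove(dst, src, size)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        ctypes.memmove(dst, src, size)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse clock.
    return size_mib / max(best, 1e-9)


if __name__ == "__main__":
    # 32 and 1024 are the two buffer sizes (in megabytes) discussed above;
    # the 1024 case needs roughly 2 GiB of free memory for the two buffers.
    for size_mib in (32, 1024):
        print(f"{size_mib:>5} MiB: {copy_throughput_mib_s(size_mib):.2f} MiB/s")
```

Running it several times at each size is one way to check whether the smaller buffer gives noisier numbers on a particular machine.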
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Here's a version that compiles on arm64. Do note it expects a command
line argument, the size of the buffer to use in megabytes.

** Attachment added: "test_memcpy.c"
   https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1951032/+attachment/5557684/+files/test_memcpy.c

** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
- Compile the test_memcpy.c that is attached to bug 1928508:
+ Compile the test_memcpy.c that is attached to this report:

  $ gcc -g -O3 test_memcpy.c -o test_memcpy64

  "./test_memcpy64 32" should be run before and after installing the libc
  packages from proposed. On graviton2 systems, this should show a
  substantial improvement. On other arm64 systems (raspberry pis of
  various vintage, thunderx2, xgene, etc etc) at least no significant
  regression should be seen.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
  Compile the test_memcpy.c that is attached to bug 1928508:

  $ gcc -g -O3 test_memcpy.c -o test_memcpy64

- This should be run before and after installing the libc packages from
- proposed. On graviton2 systems, this should show a substantial
- improvement. On other arm64 systems (raspberry pis of various vintage,
- thunderx2, xgene, etc etc) at least no significant regression should be
- seen.
+ "./test_memcpy64 32" should be run before and after installing the libc
+ packages from proposed. On graviton2 systems, this should show a
+ substantial improvement. On other arm64 systems (raspberry pis of
+ various vintage, thunderx2, xgene, etc etc) at least no significant
+ regression should be seen.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
I attempted to regression test this on a set of arm64 server SoCs, but:

$ gcc -g -O3 test_memcpy.c -o test_memcpy64
test_memcpy.c:6:10: fatal error: mm_malloc.h: No such file or directory
    6 | #include <mm_malloc.h>
      |          ^
compilation terminated.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Changed in: glibc (Ubuntu)
       Status: Invalid => Fix Released
[Bug 1951032] Re: AArch64: Backport memcpy improvements
Hello Michael, or anyone else affected,

Accepted glibc into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/glibc/2.31-0ubuntu9.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

** Changed in: glibc (Ubuntu Focal)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-focal
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
  Compile the test_memcpy.c that is attached to bug 1928508:

  $ gcc -g -O3 test_memcpy.c -o test_memcpy64

  This should be run before and after installing the libc packages from
- proposed. On graviton2 systems, this should show a substantial increase.
- On other arm64 systems (raspberry pis of various vintage, thunderx2,
- xgene, etc etc) at least no significant regression should be seen.
+ proposed. On graviton2 systems, this should show a substantial
+ improvement. On other arm64 systems (raspberry pis of various vintage,
+ thunderx2, xgene, etc etc) at least no significant regression should be
+ seen.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
- The testcase from https://bugs.launchpad.net/ubuntu/focal/+source/glibc/+bug/1928508 can be used for this too.
+ Compile the test_memcpy.c that is attached to bug 1928508:
+
+ $ gcc -g -O3 test_memcpy.c -o test_memcpy64
+
+ This should be run before and after installing the libc packages from
+ proposed. On graviton2 systems, this should show a substantial increase.
+ On other arm64 systems (raspberry pis of various vintage, thunderx2,
+ xgene, etc etc) at least no significant regression should be seen.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Changed in: glibc (Ubuntu Focal)
       Status: New => In Progress
[Bug 1951032] Re: AArch64: Backport memcpy improvements
** Description changed:

  [impact]
  glibc 2.32 contained a number of improvements to the memcpy routines for
  server-grade AArch64 implementations (in particular, graviton2 &
  graviton3). They should be backported to focal, as the LTS releases are
  by far the most used on servers.

  [test case]
- TBD.
+ The testcase from https://bugs.launchpad.net/ubuntu/focal/+source/glibc/+bug/1928508 can be used for this too.

  [regression potential]
  Rebuilding glibc is always a little risky (toolchain bugs and
  incompatibilities between the old and new versions can be surprising).
  But the autopkgtests and some manual general testing can help here.

  For this specific change, there is a potential risk that the new memcpy
  implementation could be used on a system where it is not in fact the
  fastest. We should run the test case not only on the systems where it is
  expected to help, but other systems such as the RPi4 and the launchpad
  build farm to ensure performance is not regressed there.