Bug#1037198: locales: please parallelise locale-gen
Hi,

On 2023-06-15 22:19, наб wrote:
> Hi!
>
> On Thu, Jun 15, 2023 at 09:26:43PM +0200, Aurelien Jarno wrote:
> > On 2023-06-07 16:04, наб wrote:
> > > Posting as a bug per comment from Andrej; originally posted 2022-05-06 as
> > >   https://salsa.debian.org/glibc-team/glibc/-/merge_requests/7
> > >
> > > Patch based on current Salsa HEAD attached, incl. analysis.
> >
> > Thanks for the patch. It looks good, I have a comment though.
> > > MemFree: in /proc/meminfo is available on all supported Debian kernels,
> > > and, indeed, exactly what procps free(1) uses
> > What is the reason to use MemFree instead of MemAvailable?
> That's what procps free(1) used, and all Debian kernels
> (kFreeBSD, Hurd, Linux) supported it.
>
> > The Linux
> > kernel tends to maintain MemFree close to 0 by using the free RAM as
> > cache. MemAvailable also includes reclaimable memory blocks like cache
> > or inactive pages and therefore sounds better suited.
> Since I first posted this, procps free(1) started using MemAvailable to
> evaluate free/used, so sure. I don't feel strongly either way.
>
> A Hurd image from 2021 I have (bullseye branding) and the 2023 release
> (bookworm branding) don't have MemAvailable, neither does kFreeBSD 10
> (from the 2017 installer ISO; appears to be the latest from
>  https://wiki.debian.org/Debian_GNU/kFreeBSD).
>
> I've updated the Salsa revision and am including an updated patch here,
> which overrides MemFree with MemAvailable if available.

Please note that I had to revert that patch in glibc 2.37-5 as it
corrupts /usr/lib/locale/locale-archive and breaks systems during
upgrade.

Regards
Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                     http://aurel32.net
Bug#1037198: locales: please parallelise locale-gen
Hi!

On Thu, Jun 15, 2023 at 09:26:43PM +0200, Aurelien Jarno wrote:
> On 2023-06-07 16:04, наб wrote:
> > Posting as a bug per comment from Andrej; originally posted 2022-05-06 as
> >   https://salsa.debian.org/glibc-team/glibc/-/merge_requests/7
> >
> > Patch based on current Salsa HEAD attached, incl. analysis.
>
> Thanks for the patch. It looks good, I have a comment though.
> > MemFree: in /proc/meminfo is available on all supported Debian kernels,
> > and, indeed, exactly what procps free(1) uses
> What is the reason to use MemFree instead of MemAvailable?
That's what procps free(1) used, and all Debian kernels
(kFreeBSD, Hurd, Linux) supported it.

> The Linux
> kernel tends to maintain MemFree close to 0 by using the free RAM as
> cache. MemAvailable also includes reclaimable memory blocks like cache
> or inactive pages and therefore sounds better suited.
Since I first posted this, procps free(1) started using MemAvailable to
evaluate free/used, so sure. I don't feel strongly either way.

A Hurd image from 2021 I have (bullseye branding) and the 2023 release
(bookworm branding) don't have MemAvailable, neither does kFreeBSD 10
(from the 2017 installer ISO; appears to be the latest from
 https://wiki.debian.org/Debian_GNU/kFreeBSD).

I've updated the Salsa revision and am including an updated patch here,
which overrides MemFree with MemAvailable if available.

Best,
наб

From d64e6b551948726dbe5cc6800e93a2d7b25d3f89 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=D0=BD=D0=B0=D0=B1?=
Date: Fri, 6 May 2022 01:22:10 +0200
Subject: [PATCH] Parallelise locale-gen if possible
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mutt-PGP: OS

Assuming a very generous 200M free/localedef
(because I saw a max RSS of 147M w/time(1)),
this will attempt to keep all jobs saturated, and usually succeed.
There's little starvation, since the vast majority of time is spent in
gzip(1) - 1:14 user vs 27:55 sys

At 2.2ish seconds per locale, even on a low-end system of today
with 4 CPUs (and 800 free MB), we can generate up to 4 locales at once
for 6.6s' speed-up. Assuming no super-pathological cases,
this globally scales in roughly ceil(locales/ncpus)*2.2s chunks,
which is a massive win

The only user-visible change is that, with nproc>1, the output is
  en_GB.UTF-8...
instead of
  en_GB.UTF-8... done

MemFree: in /proc/meminfo is available on all supported Debian kernels,
MemAvailable: only on Linux; procps free(1) uses MemAvailable to
estimate "used" space where available.
---
 debian/local/usr_sbin/locale-gen | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/debian/local/usr_sbin/locale-gen b/debian/local/usr_sbin/locale-gen
index 7fa3d772..30f70f5e 100755
--- a/debian/local/usr_sbin/locale-gen
+++ b/debian/local/usr_sbin/locale-gen
@@ -23,6 +23,19 @@ is_entry_ok() {
   fi
 }
 
+nproc="$(nproc 2>/dev/null)" || nproc=1
+if [ "$nproc" -gt 1 ]; then
+    mem_free=0
+    while read -r k v _; do
+        [ "$k" = "MemFree:" ] && mem_free="$v" || :
+        [ "$k" = "MemAvailable:" ] && mem_free="$v" && break || :  # Prefer using MemAvailable on Linux; other Debian kernels only have MemFree
+    done < /proc/meminfo || :
+    mem_free=$(( mem_free / 1024 / 200 ))
+    [ "$mem_free" -lt 1 ] && mem_free=1 || :
+    [ "$mem_free" -lt "$nproc" ] && nproc="$mem_free" || :
+    jobs=0; pids=
+fi 2>/dev/null
+
 echo "Generating locales (this might take a while)..."
 while read -r locale charset; do
     if [ -z "$locale" ] || [ "${locale#\#}" != "$locale" ]; then continue; fi
@@ -35,6 +48,7 @@ while read -r locale charset; do
     locale_at="${locale#*@}"
     [ "$locale_at" = "$locale" ] && locale_at= || locale_at="@$locale_at"
     printf " %s.%s%s..." "$locale_base" "$charset" "$locale_at"
+    [ "$nproc" -gt 1 ] && echo || :
 
     if [ -e "$USER_LOCALES/$locale" ]; then
         input="$USER_LOCALES/$locale"
@@ -46,7 +60,20 @@ while read -r locale charset; do
             input="$USER_LOCALES/$input"
         fi
     fi
-    localedef -i "$input" -c -f "$charset" -A /usr/share/locale/locale.alias "$locale" || :
-    echo " done"
+    localedef -i "$input" -c -f "$charset" -A /usr/share/locale/locale.alias "$locale" &
+    if [ "$nproc" -gt 1 ]; then
+        pids="$pids$! "
+        jobs=$(( jobs + 1 ))
+
+        if [ "$jobs" -ge "$nproc" ]; then
+            wait "${pids%% *}" || :
+            jobs=$(( jobs - 1 ))
+            pids="${pids#* }"
+        fi
+    else
+        wait
+        echo " done"
+    fi
 done < "$LOCALEGEN"
+wait
 echo "Generation complete."
-- 
2.39.2
Bug#1037198: locales: please parallelise locale-gen
Hi,

On 2023-06-07 16:04, наб wrote:
> Package: locales
> Version: 2.36-9
> Severity: wishlist
> Tags: patch
>
> Dear Maintainer,
>
> Posting as a bug per comment from Andrej; originally posted 2022-05-06 as
>   https://salsa.debian.org/glibc-team/glibc/-/merge_requests/7
>
> Patch based on current Salsa HEAD attached, incl. analysis.

Thanks for the patch. It looks good, I have a comment though.

> MemFree: in /proc/meminfo is available on all supported Debian kernels,
> and, indeed, exactly what procps free(1) uses

What is the reason to use MemFree instead of MemAvailable? The Linux
kernel tends to maintain MemFree close to 0 by using the free RAM as
cache. MemAvailable also includes reclaimable memory blocks like cache
or inactive pages and therefore sounds better suited.

Regards
Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                     http://aurel32.net
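
To make the difference between the two fields concrete, here is a minimal sketch of the job cap the patch computes, assuming the same budget of 200 MB per localedef process and the rule "MemAvailable overrides MemFree when present". The function name pick_jobs and its optional file argument are hypothetical and exist only so the logic can be tried against a sample file; they are not part of locale-gen.

```shell
#!/bin/sh
# pick_jobs NPROC [MEMINFO_FILE]
# Hypothetical helper mirroring the patch's logic: cap the number of
# parallel jobs at one per 200 MB of reported memory, never below 1.
pick_jobs() {
    n="$1"; file="${2:-/proc/meminfo}"
    # A single CPU never parallelises, so skip reading meminfo.
    [ "$n" -gt 1 ] || { echo 1; return 0; }
    mem_kb=0
    while read -r k v _; do
        case "$k" in
            MemFree:)      mem_kb="$v" ;;         # present on all Debian kernels
            MemAvailable:) mem_kb="$v"; break ;;  # Linux only; overrides MemFree
        esac
    done 2>/dev/null < "$file"
    by_mem=$(( mem_kb / 1024 / 200 ))             # kB -> MB -> 200 MB slots
    [ "$by_mem" -ge 1 ] || by_mem=1
    [ "$by_mem" -lt "$n" ] && n="$by_mem"
    echo "$n"
}

# Usage on the running system; falls back to 1 CPU if nproc is missing.
pick_jobs "$(nproc 2>/dev/null || echo 1)"
```

On a machine whose page cache has absorbed most of the RAM, the MemFree-only variant would tend to settle on very few jobs, whereas MemAvailable gives a cap closer to what the system can actually sustain.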
Bug#1037198: locales: please parallelise locale-gen
Package: locales
Version: 2.36-9
Severity: wishlist
Tags: patch

Dear Maintainer,

Posting as a bug per comment from Andrej; originally posted 2022-05-06 as
  https://salsa.debian.org/glibc-team/glibc/-/merge_requests/7

Patch based on current Salsa HEAD attached, incl. analysis.

Best,
наб

-- System Information:
Debian Release: 12.0
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: x32 (x86_64)
Foreign Architectures: amd64, i386

Kernel: Linux 6.1.0-2-amd64 (SMP w/2 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages locales depends on:
ii  debconf [debconf-2.0]  1.5.82
ii  libc-bin               2.36-9
ii  libc-l10n              2.36-9

locales recommends no packages.

locales suggests no packages.

-- debconf information:
* locales/locales_to_be_generated: en_GB.UTF-8 UTF-8
* locales/default_environment_locale: en_GB.UTF-8

From b6af0ad83f5517fd1987f9c7ac0493565bc0976d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=D0=BD=D0=B0=D0=B1?=
Date: Fri, 6 May 2022 01:22:10 +0200
Subject: [PATCH] Parallelise locale-gen if possible
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mutt-PGP: OS

Assuming a very generous 200M free/localedef
(because I saw a max RSS of 147M w/time(1)),
this will attempt to keep all jobs saturated, and usually succeed.
There's little starvation, since the vast majority of time is spent in
gzip(1) - 1:14 user vs 27:55 sys

At 2.2ish seconds per locale, even on a low-end system of today
with 4 CPUs (and 800 free MB), we can generate up to 4 locales at once
for 6.6s' speed-up. Assuming no super-pathological cases,
this globally scales in roughly ceil(locales/ncpus)*2.2s chunks,
which is a massive win

The only user-visible change is that, with nproc>1, the output is
  en_GB.UTF-8...
instead of
  en_GB.UTF-8... done

MemFree: in /proc/meminfo is available on all supported Debian kernels,
and, indeed, exactly what procps free(1) uses
---
 debian/local/usr_sbin/locale-gen | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/debian/local/usr_sbin/locale-gen b/debian/local/usr_sbin/locale-gen
index 7fa3d772..f1632f4e 100755
--- a/debian/local/usr_sbin/locale-gen
+++ b/debian/local/usr_sbin/locale-gen
@@ -23,6 +23,18 @@ is_entry_ok() {
   fi
 }
 
+nproc="$(nproc 2>/dev/null)" || nproc=1
+if [ "$nproc" -gt 1 ]; then
+    mem_free=0
+    while read -r k v _; do
+        [ "$k" = "MemFree:" ] && mem_free="$v" && break || :
+    done < /proc/meminfo || :
+    mem_free=$(( mem_free / 1024 / 200 ))
+    [ "$mem_free" -lt 1 ] && mem_free=1 || :
+    [ "$mem_free" -lt "$nproc" ] && nproc="$mem_free" || :
+    jobs=0; pids=
+fi 2>/dev/null
+
 echo "Generating locales (this might take a while)..."
 while read -r locale charset; do
     if [ -z "$locale" ] || [ "${locale#\#}" != "$locale" ]; then continue; fi
@@ -35,6 +47,7 @@ while read -r locale charset; do
     locale_at="${locale#*@}"
     [ "$locale_at" = "$locale" ] && locale_at= || locale_at="@$locale_at"
     printf " %s.%s%s..." "$locale_base" "$charset" "$locale_at"
+    [ "$nproc" -gt 1 ] && echo || :
 
     if [ -e "$USER_LOCALES/$locale" ]; then
         input="$USER_LOCALES/$locale"
@@ -46,7 +59,20 @@ while read -r locale charset; do
             input="$USER_LOCALES/$input"
         fi
     fi
-    localedef -i "$input" -c -f "$charset" -A /usr/share/locale/locale.alias "$locale" || :
-    echo " done"
+    localedef -i "$input" -c -f "$charset" -A /usr/share/locale/locale.alias "$locale" &
+    if [ "$nproc" -gt 1 ]; then
+        pids="$pids$! "
+        jobs=$(( jobs + 1 ))
+
+        if [ "$jobs" -ge "$nproc" ]; then
+            wait "${pids%% *}" || :
+            jobs=$(( jobs - 1 ))
+            pids="${pids#* }"
+        fi
+    else
+        wait
+        echo " done"
+    fi
 done < "$LOCALEGEN"
+wait
 echo "Generation complete."
-- 
2.30.2
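
The scheduling part of the patch (start a job, remember its PID, and once the limit is reached wait for the oldest one) can be tried in isolation. The following is only a sketch of that technique: run_pool and fake_build are hypothetical names, and fake_build stands in for localedef so that nothing touches /usr/lib/locale.

```shell
#!/bin/sh
# run_pool MAX CMD...
# Hypothetical helper: read one argument per line from stdin and run
# "CMD... ARG" in the background, keeping at most MAX jobs in flight.
run_pool() {
    max="$1"; shift
    jobs=0; pids=
    while read -r arg; do
        "$@" "$arg" &
        pids="$pids$! "                  # append the newest PID, space-terminated
        jobs=$(( jobs + 1 ))
        if [ "$jobs" -ge "$max" ]; then
            wait "${pids%% *}" || :      # block on the oldest job only
            jobs=$(( jobs - 1 ))
            pids="${pids#* }"            # drop the oldest PID from the list
        fi
    done
    wait                                 # reap whatever is still running
}

# Stand-in for localedef; it only reports which locale it was given.
fake_build() { echo "built $1"; }

printf '%s\n' en_GB.UTF-8 de_DE.UTF-8 fr_FR.UTF-8 | run_pool 2 fake_build
```

Waiting on the oldest PID rather than on whichever job finishes first keeps the script within plain POSIX sh, at the cost of occasionally leaving a slot idle. By the estimate in the commit message, 10 locales on 4 CPUs would take roughly ceil(10/4)*2.2s = 6.6s instead of 22s serially. The sketch covers scheduling only and says nothing about jobs that write to a shared output file, which is where the reverted patch ran into trouble with locale-archive.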