Bug#986749: apt-cacher-ng: acng still stops working with APOLL ADD failing and Bad file descriptor
Am Montag, dem 12.04.2021 um 21:13 +0200 schrieb Eduard Bloch: > Hallo, > * Christian Meyer [Sun, Apr 11 2021, 01:26:01PM]: > > Package: apt-cacher-ng > > Version: 3.6.3-1 > > Severity: important > > X-Debbugs-Cc: c2h...@web.de > > What you describe here does, all in one, not make much sense. Too > many symptoms coming together which are not adding up. Are you sure > that there is no bad memory or system instability involved? Until now I have no hint to that direction, but I will test memory and / or replace the machine. > Please send the output of > journalctl -b0 -u apt-cacher-ng For today (and cron driven restarts on errors every 3 minutes) it is: Apr 13 07:57:37 121-My apt-cacher-ng[202463]: [warn] Epoll ADD(1) on fd 13 failed. Old events were 0; read change was 1 (add); write change was 0 (none); close change was 0 (none): Bad file descriptor Apr 13 08:00:02 121-My systemd[1]: Stopping Apt-Cacher NG software download proxy... Apr 13 08:00:02 121-My systemd[1]: apt-cacher-ng.service: Succeeded. Apr 13 08:00:02 121-My systemd[1]: Stopped Apt-Cacher NG software download proxy. Apr 13 08:00:02 121-My systemd[1]: apt-cacher-ng.service: Consumed 55.182s CPU time. Apr 13 08:00:02 121-My systemd[1]: Starting Apt-Cacher NG software download proxy... Apr 13 08:00:02 121-My systemd[1]: Started Apt-Cacher NG software download proxy. Apr 13 08:00:40 121-My apt-cacher-ng[389015]: [warn] Epoll ADD(1) on fd 13 failed. Old events were 0; read change was 1 (add); write change was 0 (none); close change was 0 (none): Bad file descriptor Apr 13 08:03:02 121-My systemd[1]: Stopping Apt-Cacher NG software download proxy... Apr 13 08:03:02 121-My systemd[1]: apt-cacher-ng.service: Succeeded. Apr 13 08:03:02 121-My systemd[1]: Stopped Apt-Cacher NG software download proxy. Apr 13 08:03:02 121-My systemd[1]: apt-cacher-ng.service: Consumed 2.402s CPU time. Apr 13 08:03:02 121-My systemd[1]: Starting Apt-Cacher NG software download proxy... Apr 13 08:03:02 121-My systemd[1]: Started Apt-Cacher NG software download proxy. Apr 13 10:42:12 121-My apt-cacher-ng[391899]: [warn] Epoll ADD(1) on fd 13 failed. Old events were 0; read change was 1 (add); write change was 0 (none); close change was 0 (none): Bad file descriptor Apr 13 10:43:25 121-My apt-cacher-ng[391899]: [warn] Epoll ADD(1) on fd 13 failed. Old events were 0; read change was 1 (add); write change was 0 (none); close change was 0 (none): Bad file descriptor Apr 13 10:45:02 121-My systemd[1]: Stopping Apt-Cacher NG software download proxy... Apr 13 10:45:02 121-My systemd[1]: apt-cacher-ng.service: Succeeded. Apr 13 10:45:02 121-My systemd[1]: Stopped Apt-Cacher NG software download proxy. Apr 13 10:45:02 121-My systemd[1]: apt-cacher-ng.service: Consumed 56.112s CPU time. Apr 13 10:45:02 121-My systemd[1]: Starting Apt-Cacher NG software download proxy... Apr 13 10:45:02 121-My systemd[1]: Started Apt-Cacher NG software download proxy. Apr 13 11:36:26 121-My apt-cacher-ng[414217]: [warn] Epoll ADD(1) on fd 13 failed. Old events were 0; read change was 1 (add); write change was 0 (none); close change was 0 (none): Bad file descriptor Apr 13 11:39:02 121-My systemd[1]: Stopping Apt-Cacher NG software download proxy... Apr 13 11:39:02 121-My systemd[1]: apt-cacher-ng.service: Succeeded. Apr 13 11:39:02 121-My systemd[1]: Stopped Apt-Cacher NG software download proxy. Apr 13 11:39:02 121-My systemd[1]: apt-cacher-ng.service: Consumed 2min 47.837s CPU time. Apr 13 11:39:02 121-My systemd[1]: Starting Apt-Cacher NG software download proxy... Apr 13 11:39:02 121-My systemd[1]: Started Apt-Cacher NG software download proxy. > and also please set debug=7 in acng.conf or similar conf file. I did so an now I'm waiting for new errors. > If that's a crash, please install systemd-coredump service and we can > debug the corefiles later. > Oh, that's probably our "nice" apt hiding the actual error string and > showing this dummy instead. It should match some error in the error > log, > though. > > This doesn't make sense. ACNG does not render this message string > explicitly. > Is there some proxy server involved? Yes, there is an elderly squid upstream proxy: Proxy: http://user:pass@myProxy:8080 and I ReuseConnections: 0 These are tho only modifications of the config-file > > my workaround is to automatically restart the service by a cronjob > > (and after that /usr/lib/apt-cacher-ng/expire-caller.pl) when this > > line is seen. > > And now that is getting really odd. Do you have some kind of > application > firewall installed, like apparmor? I didn't install (or configure) it by hand, but obviously it is pulled in by $something in Bullseye: At least 'dpkg -i apparmor' says something like: ii apparmor2.13.6-10amd64user-space parser utility for AppArmor un apparmor-profiles-extra (keine Beschreibung vorhanden) un apparmor-utils (keine Beschreibung vorhanden) > We can have an interactive Jitsi session
Bug#986749: apt-cacher-ng: acng still stops working with APOLL ADD failing and Bad file descriptor
Hallo, * Christian Meyer [Sun, Apr 11 2021, 01:26:01PM]: > Package: apt-cacher-ng > Version: 3.6.3-1 > Severity: important > X-Debbugs-Cc: c2h...@web.de > > Dear Maintainer, > > even after the recent acng updates I still see apt-cacher-ng (3.6.3-1 amd64) > stop working. What you describe here does, all in one, not make much sense. Too many symptoms coming together which are not adding up. Are you sure that there is no bad memory or system instability involved? > Apt throws a bunch of errors like: > > E: Fehlschlag beim Holen von > http://deb.debian.org/debian/dists/bullseye/main/Contents-all Fehler beim > Lesen vom Server: Verbindung wurde durch den Se > rver auf der anderen Seite geschlossen. [IP: 127.0.0.1 3142] Closing connection is a potential outcome in worst situations but the connection wouldn't stay down permanently as seen in the rest of your log. Actually, that might be caused by periodic crashes so systemd keeps the server down for a while. Which brings me to the suspicion of system instability. Please send the output of journalctl -b0 -u apt-cacher-ng and also please set debug=7 in acng.conf or similar conf file. If that's a crash, please install systemd-coredump service and we can debug the corefiles later. > E: Einige Indexdateien konnten nicht heruntergeladen werden. Sie wurden > ignoriert oder alte an ihrer Stelle benutzt. > ... > > W: Das Depot »http://127.0.0.1:3142/deb.debian.org/debian bullseye Release« > enthält keine Release-Datei. > E: Fehlschlag beim Holen von > http://127.0.0.1:3142/deb.debian.org/debian/dists/bullseye/main/binary-i386/Packages > 503 Service Unavailable [IP: 127.0. > 0.1 3142] Oh, that's probably our "nice" apt hiding the actual error string and showing this dummy instead. It should match some error in the error log, though. This doesn't make sense. ACNG does not render this message string explicitly. Is there some proxy server involved? > Get:99 http://deb.debian.org/debian bullseye/main amd64 krb5-user amd64 > 1.18.3-4 [151 kB] > Get:100 http://deb.debian.org/debian bullseye/main amd64 ldap-utils amd64 > 2.4.57+dfsg-2 [206 kB] > Get:101 http://deb.debian.org/debian bullseye/main amd64 > linux-image-5.10.0-5-amd64 amd64 5.10.26-1 [53.4 MB] > Err:101 http://deb.debian.org/debian bullseye/main amd64 > linux-image-5.10.0-5-amd64 amd64 5.10.26-1 > Undetermined Error [IP: 127.0.0.1 3142] That's apt's insufficient error message which probably indicates a short read. Would fit to the connection refusal series. > Err:102 http://deb.debian.org/debian bullseye/main amd64 linux-image-amd64 > amd64 5.10.26-1 > Could not connect to 127.0.0.1:3142 (127.0.0.1). - connect (111: Connection > refused) [IP: 127.0.0.1 3142] > Err:103 http://deb.debian.org/debian bullseye/main amd64 whois amd64 5.5.8 > Unable to connect to 127.0.0.1:3142: [IP: 127.0.0.1 3142] > Since systemctl often tells me something about "Epoll ADD" failing: > # systemctl status apt-cacher-ng.service > apt-cacher-ng[646]: [warn] Epoll ADD(1) on fd 14 failed. Old events were 0; > read change was 1 (add); write change was 0 (none); close change was 0 > (none): Bad file descriptor > > my workaround is to automatically restart the service by a cronjob (and after > that /usr/lib/apt-cacher-ng/expire-caller.pl) when this line is seen. And now that is getting really odd. Do you have some kind of application firewall installed, like apparmor? > More information: > > Yes, my internet connection is very slow (around 8000 kbit/s) and that's my > reason to run acng). 8mbit/s isn't awful slow and it shouldn't make a difference anyway. > # less /var/log/apt-cacher-ng/apt-cacher.err.1 > Beside of tons of "Bad file descriptor" messages: > Fri Apr 9 11:13:05 > 2021|debrep/pool/main/f/ffmpeg/libavutil56_4.3.2-0+deb11u1_amd64.deb storage > error [HTTP/1.1 503 Bad file descriptor], last errno: Bad file descriptor > Fri Apr 9 11:13:06 > 2021|debrep/pool/main/c/codec2/libcodec2-0.9_0.9.2-4_amd64.deb storage error > [HTTP/1.1 503 Bad file descriptor], last errno: Bad file descriptor > Fri Apr 9 11:13:17 > 2021|debrep/pool/main/p/pulseaudio/libpulsedsp_14.2-2_amd64.deb storage error > [HTTP/1.1 503 Bad file descriptor], last errno: Bad file descriptor So I checked the code and it doesn't make sense. fileitem.cc:755 is the only location reporting that and it wouldn't get there when the filedescriptor is okay, and you cannot get that FD closed by any regular code path. We can have an interactive Jitsi session if you prefer to get this solved in realtime and it can be made reproducible somehow. Best regards, Eduard.
Bug#986749: apt-cacher-ng: acng still stops working with APOLL ADD failing and Bad file descriptor
Package: apt-cacher-ng Version: 3.6.3-1 Severity: important X-Debbugs-Cc: c2h...@web.de Dear Maintainer, even after the recent acng updates I still see apt-cacher-ng (3.6.3-1 amd64) stop working. Apt throws a bunch of errors like: E: Fehlschlag beim Holen von http://deb.debian.org/debian/dists/bullseye/main/Contents-all Fehler beim Lesen vom Server: Verbindung wurde durch den Se rver auf der anderen Seite geschlossen. [IP: 127.0.0.1 3142] E: Einige Indexdateien konnten nicht heruntergeladen werden. Sie wurden ignoriert oder alte an ihrer Stelle benutzt. ... W: Das Depot »http://127.0.0.1:3142/deb.debian.org/debian bullseye Release« enthält keine Release-Datei. E: Fehlschlag beim Holen von http://127.0.0.1:3142/deb.debian.org/debian/dists/bullseye/main/binary-i386/Packages 503 Service Unavailable [IP: 127.0. 0.1 3142] E: Einige Indexdateien konnten nicht heruntergeladen werden. Sie wurden ignoriert oder alte an ihrer Stelle benutzt. E: Paket adduser kann nicht gefunden werden. E: Paket apt kann nicht gefunden werden. ... ... Get:99 http://deb.debian.org/debian bullseye/main amd64 krb5-user amd64 1.18.3-4 [151 kB] Get:100 http://deb.debian.org/debian bullseye/main amd64 ldap-utils amd64 2.4.57+dfsg-2 [206 kB] Get:101 http://deb.debian.org/debian bullseye/main amd64 linux-image-5.10.0-5-amd64 amd64 5.10.26-1 [53.4 MB] Err:101 http://deb.debian.org/debian bullseye/main amd64 linux-image-5.10.0-5-amd64 amd64 5.10.26-1 Undetermined Error [IP: 127.0.0.1 3142] Err:102 http://deb.debian.org/debian bullseye/main amd64 linux-image-amd64 amd64 5.10.26-1 Could not connect to 127.0.0.1:3142 (127.0.0.1). - connect (111: Connection refused) [IP: 127.0.0.1 3142] Err:103 http://deb.debian.org/debian bullseye/main amd64 whois amd64 5.5.8 Unable to connect to 127.0.0.1:3142: [IP: 127.0.0.1 3142] Since systemctl often tells me something about "Epoll ADD" failing: # systemctl status apt-cacher-ng.service apt-cacher-ng[646]: [warn] Epoll ADD(1) on fd 14 failed. Old events were 0; read change was 1 (add); write change was 0 (none); close change was 0 (none): Bad file descriptor my workaround is to automatically restart the service by a cronjob (and after that /usr/lib/apt-cacher-ng/expire-caller.pl) when this line is seen. More information: Yes, my internet connection is very slow (around 8000 kbit/s) and that's my reason to run acng). # less /var/log/apt-cacher-ng/apt-cacher.err.1 Beside of tons of "Bad file descriptor" messages: Fri Apr 9 11:13:05 2021|debrep/pool/main/f/ffmpeg/libavutil56_4.3.2-0+deb11u1_amd64.deb storage error [HTTP/1.1 503 Bad file descriptor], last errno: Bad file descriptor Fri Apr 9 11:13:06 2021|debrep/pool/main/c/codec2/libcodec2-0.9_0.9.2-4_amd64.deb storage error [HTTP/1.1 503 Bad file descriptor], last errno: Bad file descriptor Fri Apr 9 11:13:17 2021|debrep/pool/main/p/pulseaudio/libpulsedsp_14.2-2_amd64.deb storage error [HTTP/1.1 503 Bad file descriptor], last errno: Bad file descriptor I only see a few different ones: Fri Apr 9 10:57:49 2021|Failed to move stale item debrep/dists/bullseye/main/i18n/Translation-de.bz2 out of the way: No such file or directory Fri Apr 9 10:59:29 2021|Failed to move stale item debrep/dists/bullseye/main/i18n/Translation-en.bz2 out of the way: No such file or directory Fri Apr 9 11:05:10 2021|debrep/pool/main/f/fonts-liberation/fonts-liberation_1.07.4-11_all.deb storage error [HTTP/1.1 503 Broken pipe], last errno: Broken pipe Fri Apr 9 11:06:56 2021|debrep/pool/main/l/lm-sensors/libsensors5_3.6.0-7_i386.deb storage error [HTTP/1.1 503 Connection reset by peer], last errno: Connection reset by peer Fri Apr 9 11:12:51 2021|debrep/pool/main/libr/libreoffice/libreoffice-base-drivers_7.0.4-3_amd64.deb storage error [HTTP/1.1 503 Broken pipe], last errno: Broken pipe # journalctl only shows something like: Apr 03 00:34:03 121-My apt-cacher-ng[2900630]: [warn] fcntl(15, F_SETFL): Bad file descriptor ... Apr 09 11:12:44 121-My apt-cacher-ng[2900630]: [warn] Epoll ADD(1) on fd 75 failed. Old events were 0; read change was 1 (add); write change was 0 (n> Apr 09 11:12:53 121-My apt-cacher-ng[2900630]: [warn] Epoll ADD(1) on fd 75 failed. Old events were 0; read change was 1 (add); write change was 0 (n> Apr 09 11:12:53 121-My apt-cacher-ng[2900630]: [warn] Epoll ADD(1) on fd 73 failed. Old events were 0; read change was 1 (add); write change was 0 Christian Meyer -- Package-specific info: -- System Information: Debian Release: bullseye/sid APT prefers testing APT policy: (500, 'testing') Architecture: amd64 (x86_64) Foreign Architectures: i386 Kernel: Linux 5.10.0-5-amd64 (SMP w/4 CPU threads) Kernel taint flags: TAINT_CRAP Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), LANGUAGE not set Shell: /bin/sh linked to /usr/bin/dash Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled Versions of packages apt-cacher-ng depends on: ii adduser 3.118 ii debconf