Bug#1072917: Cannot generate a certificate request for a RSA-PSS key
Package: tpm2-openssl
Version: 1.1.1-1
Severity: important

In order to use a TPM to store TLS keys, the key type must be usable for TLS. If the ECC algorithm family cannot be used, this has to be RSA-PSS.

RSA-PSS keys can be created with tpm2-tools and appear to function correctly outside openssl. Generating a certificate request with openssl, however, produces a request with invalid padding.

How to reproduce:

tpm2_createek -G rsa -c ek_pss.ctx
tpm2_createak -C ek_pss.ctx -G rsa -g sha256 -s pss -c ak_ecc.ctx
tpm2_evictcontrol -c ak_ecc.ctx 0x8101
OPENSSL_CONF=./openssl.cnf openssl req -provider tpm2 -provider default \
    -propquery '?provider=tpm2' -key handle:0x8101 -out testcsr.pem -new

The resulting CSR has invalid padding (200+ bytes instead of 32) and is rejected if passed to a CA.

-- System Information:
Debian Release: 12.5
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.1.0-13-amd64 (SMP w/12 CPU threads; PREEMPT)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages tpm2-openssl depends on:
ii libc6 2.36-9+deb12u4
ii libtss2-esys-3.0.2-0 3.2.1-3
ii libtss2-rc0 3.2.1-3
ii libtss2-tctildr0 3.2.1-3

tpm2-openssl recommends no packages.

tpm2-openssl suggests no packages.

-- no debconf information
Bug#1054115: closed by Colin Watson (Re: Bug#1054115: broken on NFS)
On 17/10/2023 13:00, Debian Bug Tracking System wrote:

This is an automatic notification regarding your Bug report which was filed against the man-db package:

#1054115: broken on NFS

It has been closed by Colin Watson.

Their explanation is attached below along with your original report. If this explanation is unsatisfactory and you have not received a better one in a separate message then please contact Colin Watson by replying to this email.

Apologies for reopening, you can close it again after that.

Actually, to be fair, this is not just usr.bin.man. As apparmor creeps across more and more things it needs to become aware of network filesystems.

What Debian as a whole needs is an extra profile which loads network inet and network inet6 as defaults ONLY if /usr and/or root is on a network filesystem. That is an apparmor bug, not a man bug.

Further to this, for documentation purposes: network inet and network inet6 need to be added to both the man and groff profiles. Fixing man alone results in a similar failure when it invokes groff. It starts working only after both have been fixed.

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#1054123: apparmor breaks nfs root
Package: apparmor
Version: 3.0.8-3
Severity: important

Dear Maintainer,

The default profile denies network functionality, and this breaks man and other software which has an apparmor profile. They stop working on NFS. For an example see Debian bug 1054115.

While it is possible to solve this on a case by case basis, the right bugfix is to check if root and/or /usr are on NFS and load an extra profile to allow network access.

Alternatively, the kernel should stop treating network filesystem access as network access for apparmor purposes. That, however, is likely to be a bit difficult.

-- System Information:
Debian Release: 12.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-22-amd64 (SMP w/12 CPU threads)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages apparmor depends on:
ii debconf [debconf-2.0] 1.5.82
ii libc6 2.36-9+deb12u2

apparmor recommends no packages.

Versions of packages apparmor suggests:
pn apparmor-profiles-extra
pn apparmor-utils

-- debconf information excluded
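The check proposed above (is root or /usr on NFS?) could look roughly like the following. This is only a sketch in C using statfs(2); the helper names fs_type_is_nfs() and path_on_nfs() are invented for this illustration and are not part of apparmor.

```c
#include <sys/vfs.h>   /* statfs(), struct statfs (Linux) */

/* Value of NFS_SUPER_MAGIC from <linux/magic.h>, repeated here so the
 * sketch does not depend on kernel headers being installed. */
#ifndef NFS_SUPER_MAGIC
#define NFS_SUPER_MAGIC 0x6969
#endif

/* Hypothetical helper: 1 if the filesystem magic number is NFS, else 0. */
int fs_type_is_nfs(long f_type)
{
	return f_type == NFS_SUPER_MAGIC;
}

/* Hypothetical helper: 1 if path lives on NFS, 0 if not, -1 on error. */
int path_on_nfs(const char *path)
{
	struct statfs sfs;

	if (statfs(path, &sfs) != 0)
		return -1;
	return fs_type_is_nfs((long)sfs.f_type);
}
```

A boot-time hook could call path_on_nfs("/") and path_on_nfs("/usr") and load the extra network profile only when either returns 1. Other network filesystems would need their own magic numbers added.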
Bug#1054115: broken on NFS
Package: man-db
Version: 2.11.2-2
Severity: important

Can the genius who denied man internet access please come forward and explain how it will now work on NFS-root systems?

[ 79.257369] audit: type=1400 audit(1697531933.690:139): apparmor="DENIED" operation="sendmsg" profile="/usr/bin/man" pid=3921 comm="man" laddr=192.168.3.98 lport=676 faddr=192.168.3.3 fport=2049 family="inet" sock_type="stream" protocol=6 requested_mask="send" denied_mask="send"

Genius. Sheer, unadulterated, crystallized and purified.

-- System Information:
Debian Release: 12.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-22-amd64 (SMP w/12 CPU threads)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages man-db depends on:
ii bsdextrautils 2.38.1-5+b1
ii bsdmainutils 12.1.8
ii debconf [debconf-2.0] 1.5.82
ii groff-base 1.22.4-10
ii libc6 2.36-9+deb12u2
ii libgdbm6 1.23-3
ii libpipeline1 1.5.7-1
ii libseccomp2 2.5.4-1+b3
ii zlib1g 1:1.2.13.dfsg-1

man-db recommends no packages.

Versions of packages man-db suggests:
ii apparmor 3.0.8-3
ii chromium [www-browser] 116.0.5845.180-1~deb12u1
ii firefox-esr [www-browser] 102.15.1esr-1~deb12u1
pn groff
ii less 590-2
ii lynx [www-browser] 2.9.0dev.12-1
ii w3m [www-browser] 0.5.3+git20230121-2

-- debconf information excluded
Bug#924664: ejabberd: node migration broken
Unfortunately, I have stopped using it. My use case disappeared the moment google started the process of shutting off the jabber gateway into chat. Best Regards, A. On 02/11/2022 18:02, Philipp Huebner wrote: Hello Anton, any news on this? On Wed, 15 Sep 2021 16:21:27 +0100 Anton Ivanov wrote: OK. I will retest when I upgrade to current stable which should happen in the next few days. A. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#1005900: [PATCH v2] um: Fix uml_mconsole stop/go
From: Anton Ivanov

Moving to an EPOLL based IRQ controller broke uml_mconsole stop/go commands. This fixes it and restores stop/go functionality.

Fixes: ff6a17989c08 ("Epoll based IRQ controller")
Signed-off-by: Anton Ivanov
---
 arch/um/drivers/mconsole_kern.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
index 6ead1e240457..8ca67a692683 100644
--- a/arch/um/drivers/mconsole_kern.c
+++ b/arch/um/drivers/mconsole_kern.c
@@ -224,7 +224,7 @@ void mconsole_go(struct mc_request *req)
 
 void mconsole_stop(struct mc_request *req)
 {
-	deactivate_fd(req->originating_fd, MCONSOLE_IRQ);
+	block_signals();
 	os_set_fd_block(req->originating_fd, 1);
 	mconsole_reply(req, "stopped", 0, 0);
 	for (;;) {
@@ -247,6 +247,7 @@ void mconsole_stop(struct mc_request *req)
 	}
 	os_set_fd_block(req->originating_fd, 0);
 	mconsole_reply(req, "", 0, 0);
+	unblock_signals();
 }
 
 static DEFINE_SPINLOCK(mc_devices_lock);
--
2.30.2
Bug#1005900: [PATCH] um: Fix uml_mconsole stop/go
On 22/02/2022 12:11, Johannes Berg wrote:

On Tue, 2022-02-22 at 10:57 +, anton.iva...@cambridgegreys.com wrote:

From: Anton Ivanov

Moving to an EPOLL based IRQ controller broke uml_mconsole stop/go commands. This fixes it and restores stop/go functionality.

Fixes: ff6a17989c08b0bb0fd490cc500b084581b3a9b9 Epoll based IRQ controller

The right format would be

Fixes: ff6a17989c08 ("Epoll based IRQ controller")

Ack, will resubmit shortly.

Don't think I can comment on the patch itself, sorry.

The old poll controller had all IO IRQs shared and disabled IRQ processing while in the IRQ loop. Thus a for (;;) in the IRQ loop combined with a blocking read was an effective way to stop processing.

That is no longer the case.

1. While individual IRQs are not reentrant (there is a check for that in the IRQ handler), other IRQs will be processed, and each FD is allocated a separate one. So looping inside one will not stop the kernel. It will still handle timer IRQs and other IO.

2. In the old controller disable_fd() was the reentrancy guard. It removed the fd from the poll set so that it was not triggered again until the IRQ had been handled. It was used everywhere at the beginning of each handler (followed by a re-enable at IRQ exit). In the epoll case it has different semantics and cost, and should not be used without need. In fact, I removed it throughout, but somehow missed the mconsole. Still having it there was a bug.

As we do not have any means to shut off the IRQs in the IRQ controller itself, the easiest way to stop them is to block signals, as per the patch.

johannes

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
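As a rough userspace analogy of "stop everything by blocking signals", the sketch below uses plain POSIX sigprocmask(). It is not how UML's block_signals()/unblock_signals() are implemented; the helper names (stop_block_signals(), stop_unblock_signals(), signal_is_blocked()) and the choice of SIGIO and SIGALRM are assumptions made for this illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: block SIGIO and SIGALRM, saving the old mask. */
int stop_block_signals(sigset_t *saved)
{
	sigset_t block;

	sigemptyset(&block);
	sigaddset(&block, SIGIO);
	sigaddset(&block, SIGALRM);
	return sigprocmask(SIG_BLOCK, &block, saved);
}

/* Hypothetical helper: restore the mask saved by stop_block_signals(). */
int stop_unblock_signals(const sigset_t *saved)
{
	return sigprocmask(SIG_SETMASK, saved, NULL);
}

/* Hypothetical helper: 1 if sig is currently blocked, 0 if not, -1 on error. */
int signal_is_blocked(int sig)
{
	sigset_t cur;

	/* With a NULL new set, sigprocmask() only reports the current mask. */
	if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
		return -1;
	return sigismember(&cur, sig);
}
```

While the signals are blocked, a loop that does a blocking read on the console fd is the only thing making progress in that thread; timer and IO signals stay pending until the saved mask is restored.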
Bug#1005900: [PATCH] um: Fix uml_mconsole stop/go
From: Anton Ivanov

Moving to an EPOLL based IRQ controller broke uml_mconsole stop/go commands. This fixes it and restores stop/go functionality.

Fixes: ff6a17989c08b0bb0fd490cc500b084581b3a9b9 Epoll based IRQ controller
Signed-off-by: Anton Ivanov
---
 arch/um/drivers/mconsole_kern.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
index 6ead1e240457..8ca67a692683 100644
--- a/arch/um/drivers/mconsole_kern.c
+++ b/arch/um/drivers/mconsole_kern.c
@@ -224,7 +224,7 @@ void mconsole_go(struct mc_request *req)
 
 void mconsole_stop(struct mc_request *req)
 {
-	deactivate_fd(req->originating_fd, MCONSOLE_IRQ);
+	block_signals();
 	os_set_fd_block(req->originating_fd, 1);
 	mconsole_reply(req, "stopped", 0, 0);
 	for (;;) {
@@ -247,6 +247,7 @@ void mconsole_stop(struct mc_request *req)
 	}
 	os_set_fd_block(req->originating_fd, 0);
 	mconsole_reply(req, "", 0, 0);
+	unblock_signals();
 }
 
 static DEFINE_SPINLOCK(mc_devices_lock);
--
2.30.2
Bug#1005900: linux.uml: uml_mconsole client becomes blocked on read after issuing commands stop and go
Hi Ritesh, hi Mihai, Apologies for the delay in the answer, I was traveling last week. 1. Your patch will not achieve the desired aim. The IRQS in UML nowadays are per fd and looping inside the IRQ handler for mconsole will not stop UML as it used to when it was using the old poll() based IRQ subsystem. It will still handle other IRQs. That is a bug, we need to see what can be done here. 2. Otherwise, just a - on the deactivate_fd() would do the trick. There is a reentrancy check on the IRQ handler and while you are looping inside it, the same IRQ will not be triggered again. No need to deactivate_fd(). However, as per "1", this is insufficient - all other IRQS will still be handled. Brgds, A. On 18/02/2022 08:38, Ritesh Raj Sarraf wrote: Hello Mihai, In this case, it is good to have the User Mode Linux upstream in the loop. Thanks, Ritesh --- linux-source-5.16/arch/um/drivers/mconsole_kern.c 2022-02-05 20:22:06.0 +0200 +++ linux-source-5.16.fix/arch/um/drivers/mconsole_kern.c 2022-02-16 23:35:39.562668086 +0200 @@ -224,6 +224,7 @@ void mconsole_stop(struct mc_request *req) { + int err; deactivate_fd(req->originating_fd, MCONSOLE_IRQ); os_set_fd_block(req->originating_fd, 1); mconsole_reply(req, "stopped", 0, 0); @@ -247,6 +248,11 @@ } os_set_fd_block(req->originating_fd, 0); mconsole_reply(req, "", 0, 0); + err=activate_fd(MCONSOLE_IRQ, req->originating_fd, IRQ_READ, + (void*)(req->originating_fd), NULL); + if (err) + mconsole_reply(req, "Failed to reactivate MCONSOLE_IRQ, \ + this will block the read for uml_mconsole", 1, 0); } static DEFINE_SPINLOCK(mc_devices_lock); --- linux-source-5.16/arch/um/kernel/irq.c 2022-02-05 20:22:06.0 +0200 +++ linux-source-5.16.fix/arch/um/kernel/irq.c 2022-02-16 23:39:15.650279367 +0200 @@ -249,7 +249,7 @@ free_irq_entry(entry, false); } -static int activate_fd(int irq, int fd, enum um_irq_type type, void *dev_id, +int activate_fd(int irq, int fd, enum um_irq_type type, void *dev_id, void (*timetravel_handler)(int, int, void *, 
struct time_travel_event *)) { @@ -304,6 +304,7 @@ out: return err; } +EXPORT_SYMBOL(activate_fd); /* * Remove the entry or entries for a specific FD, if you --- linux-source-5.16/arch/um/include/shared/irq_user.h 2022-02-05 20:22:06.0 +0200 +++ linux-source-5.16.fix/arch/um/include/shared/irq_user.h 2022-02-16 23:39:09.642292312 +0200 @@ -19,6 +19,7 @@ void sigio_run_timetravel_handlers(void); extern void free_irq_by_fd(int fd); extern void deactivate_fd(int fd, int irqnum); +extern int activate_fd(int irq, int fd, enum um_irq_type type, void *dev_id, void (*timetravel_handler)(int, int, void *, struct time_travel_event *)); extern int deactivate_all_fds(void); extern int activate_ipi(int fd, int pid); On Thu, 2022-02-17 at 01:06 +0200, Mihai Hanor wrote: Package: user-mode-linux Version: 5.16um1 Severity: normal File: /usr/bin/linux.uml X-Debbugs-Cc: quake2i...@gmail.com Dear Maintainer, * What is the problem: Issuing the commands stop followed by go, at the input of the uml_mconsole client, results in the client becoming blocked on read socket. This is because of logic in arch/um/drivers/mconsole_kern.c, where mconsole_stop() doesn't reactivate the MCONSOLE_IRQ before the function has exited. I've managed to find a fix which seems to be working, but I don't know if it's a proper fix. Please see the attached file. 
-- System Information: Debian Release: bookworm/sid APT prefers testing APT policy: (900, 'testing'), (500, 'unstable-debug'), (500, 'testing-debug'), (1, 'unstable') Architecture: amd64 (x86_64) Foreign Architectures: i386 Kernel: Linux 5.16.0-1-amd64 (SMP w/4 CPU threads; PREEMPT) Kernel taint flags: TAINT_WARN Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled Versions of packages user-mode-linux depends on: ii libc6 2.33-5 Versions of packages user-mode-linux recommends: ii uml-utilities 20070815.4-1 Versions of packages user-mode-linux suggests: ii mate-terminal [x-terminal-emulator] 1.26.0-1 ii pterm [x-terminal-emulator] 0.76-2 pn rootstrap pn slirp pn user-mode-linux-doc pn vde2 -- no debconf information ___ linux-um mailing list linux...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-um -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#1004392: systemd: Incorrect location of configuration files
Package: systemd
Version: 247.3-6
Severity: serious
Justification: Policy 10.7

Dear Maintainer,

/usr/lib/tmpfiles.d/x11.conf should be a configuration file. Entries in it must be disabled in order to run containers with accelerated X11 and DRI access. As it is under lib, changes to it are overwritten on every systemd update, breaking all containers which run X apps with direct access to the local X server.

1. There is no way to disable it permanently.
2. There is no way to override it in a way which disables the defaults.

Actually, most of that directory does not belong in /usr - it should be under /etc as per Debian policy for configuration files and should be handled as config on system upgrades and updates.

-- Package-specific info:

-- System Information:
Debian Release: 11.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-10-amd64 (SMP w/8 CPU threads)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages systemd depends on:
ii adduser 3.118
ii libacl1 2.2.53-10
ii libapparmor1 2.13.6-10
ii libaudit1 1:3.0-2
ii libblkid1 2.36.1-8
ii libc6 2.31-13+deb11u2
ii libcap2 1:2.44-1
ii libcrypt1 1:4.4.18-4
ii libcryptsetup12 2:2.3.5-1
ii libgcrypt20 1.8.7-6
ii libgnutls30 3.7.1-5
ii libgpg-error0 1.38-2
ii libip4tc2 1.8.7-1
ii libkmod2 28-1
ii liblz4-1 1.9.3-2
ii liblzma5 5.2.5-2
ii libmount1 2.36.1-8
ii libpam0g 1.4.0-9+deb11u1
ii libseccomp2 2.5.1-1+deb11u1
ii libselinux1 3.1-3
ii libsystemd0 247.3-6
ii libzstd1 1.4.8+dfsg-2.1
ii mount 2.36.1-8
ii ntp [time-daemon] 1:4.2.8p15+dfsg-1
ii util-linux 2.36.1-8

Versions of packages systemd recommends:
ii dbus 1.12.20-2

Versions of packages systemd suggests:
ii policykit-1 0.105-31
pn systemd-container

Versions of packages systemd is related to:
pn dracut
ii initramfs-tools 0.140
ii libnss-systemd 247.3-6
ii libpam-systemd 247.3-6
ii udev 247.3-6

-- Configuration Files:
/etc/systemd/logind.conf changed:
[Login]
KillUserProcesses=yes
KillExcludeUsers=root

-- no debconf information
Bug#924664: ejabberd: node migration broken
OK. I will retest when I upgrade to current stable, which should happen in the next few days.

A.

On 15/09/2021 15:38, Badlop wrote:

On Fri, 15 Mar 2019 at 17:42, Anton Ivanov wrote:

All files exist, retried several times with files both in /tmp/ and in /var/lib/jabberd/, no difference in either case. It fails with a "Table config" message.

I tried this old report, using a recent ejabberd, and it works correctly. Maybe the problem was related to some old bug?

For testing I set your hosts in /etc/hosts and the erlang node names in ejabberdctl.cfg, to obtain a scenario similar to yours. Some output, in case it gives some clue:

❯ ejabberdctl mnesia-change-nodename ejabberd@smaug ejabb...@jabber.kot-begemot.co.uk /tmp/e/ejabberd.backup /tmp/e/ejabberd.restore
* Checking table: 'roster'
+ Checking key: 'ram_copies'
+ Checking key: 'disc_copies'
+ Checking key: 'disc_only_copies'
- Replacing nodename: 'ejabberd@smaug' with: ''ejabb...@jabber.kot-begemot.co.uk''
...
* Checking table: 'push_session'
+ Checking key: 'ram_copies'
+ Checking key: 'disc_copies'
+ Checking key: 'disc_only_copies'
- Replacing nodename: 'ejabberd@smaug' with: ''ejabb...@jabber.kot-begemot.co.uk''
switched

❯ ls -la
total 124
-rw-r--r-- 1 badlop badlop 22950 de set. 15 16:17 ejabberd.backup
-rw-r--r-- 1 badlop badlop 23748 de set. 15 16:19 ejabberd.restore

❯ ejabberdctl registered_users localhost

❯ ejabberdctl restore /tmp/e/ejabberd.restore

❯ ejabberdctl registered_users localhost
admin
user1

By the way, a quick and dirty alternative is to not bother changing the mnesia backup nodename, and instead tell the new erlang node to use the old node name (using ERLANG_NODE in ejabberdctl.cfg).

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
Bug#989571: linux-image-5.10.0-0.bpo.3-amd64: Incorrect large USB disk sizing leading to data corruption
Close please. The 17G was from trying to blank the drive, which for some reason disconnected in the process resulting in a file written in /dev with the name sda. From there on the loop and so on. So there was a /dev/sda file as a left-over after that. Thanks for pointing me in the right direction and apologies. I am going to continue investigating why I got the data corruption in the first place, before I tried to blank it, but it looks like it may have been a hardware issue with the original USB-to-ATA bridge. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
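For anyone hitting the same symptom, the left-over regular file can be told apart from a real device node by checking the file type. A minimal sketch in C using stat(2) follows; the helper name is_block_device() is made up for this illustration.

```c
#include <sys/stat.h>

/* Hypothetical helper: 1 if path is a block device node, 0 if it exists
 * but is something else (for example a regular file written by mistake
 * under /dev), -1 if stat() fails. */
int is_block_device(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return S_ISBLK(st.st_mode) ? 1 : 0;
}
```

A result of 0 for /dev/sda while the drive is attached would point at exactly this situation: a plain file sitting where the device node should be.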
Bug#989571: linux-image-5.10.0-0.bpo.3-amd64: Incorrect large USB disk sizing leading to data corruption
Package: src:linux
Version: 5.10.13-1~bpo10+1
Severity: critical
Justification: causes serious data loss

Dear Maintainer,

Large USB drives (example - Seagate 4TB Backup) which work perfectly fine with 4.19 are identified with an incorrect size. In the case of the 4TB USB drive it is identified as a 17GB device and, for some unfathomable reason, mounted as loop.

The result is severe data corruption, making all 4TB of data on the drive unrecoverable.

Tested with the original USB bridge coming with the drive and after attaching the SATA drive inside to an alternative USB bridge. Same result in both cases.

-- Package-specific info:
** Version:
Linux version 5.10.0-0.bpo.3-amd64 (debian-ker...@lists.debian.org) (gcc-8 (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11)

** Command line:
BOOT_IMAGE=diskless/amd64/vmlinuz-5.10.0-0.bpo.3-amd64 initrd=diskless/amd64/initrd.img-5.10.0-0.bpo.3-amd64 root=/dev/nfs ip=dhcp nfsroot=192.168.3.3:/exports/boot/madding mitigations=off rw --

** Tainted: S (4)
 * SMP kernel oops on an officially SMP incapable processor

** Kernel log:
[754632.929276] nfs: server 192.168.3.3 OK
[754635.600887] rpc_check_timeout: 443 callbacks suppressed
[754635.600889] nfs: server 192.168.3.3 not responding, still trying
[754635.612996] nfs: server 192.168.3.3 not responding, still trying
[754635.625266] nfs: server 192.168.3.3 not responding, still trying
[754635.625462] nfs: server 192.168.3.3 not responding, still trying
[754635.637374] nfs: server 192.168.3.3 not responding, still trying
[754635.649472] nfs: server 192.168.3.3 not responding, still trying
[754635.661739] nfs: server 192.168.3.3 not responding, still trying
[754635.661922] nfs: server 192.168.3.3 not responding, still trying
[754635.673850] nfs: server 192.168.3.3 not responding, still trying
[754635.686131] nfs: server 192.168.3.3 not responding, still trying
[791938.374623] lxc-bridge0: port 3(tap-opsft2-0) entered blocking state
[791938.374628] lxc-bridge0: port 3(tap-opsft2-0) entered forwarding state
[791938.374654] lxc-bridge0: port 4(tap-opsft3-0) entered blocking state
[791938.374655] lxc-bridge0: port 4(tap-opsft3-0) entered forwarding state
[791938.375075] lxc-bridge0: port 2(tap-opsft1-0) entered blocking state
[791938.375078] lxc-bridge0: port 2(tap-opsft1-0) entered forwarding state
[791938.388241] k8-bridge0: port 2(tap-opsft1-1) entered blocking state
[791938.388243] k8-bridge0: port 2(tap-opsft1-1) entered forwarding state
[791938.388402] k8-bridge0: port 4(tap-opsft3-1) entered blocking state
[791938.388405] k8-bridge0: port 4(tap-opsft3-1) entered forwarding state
[791938.388481] k8-bridge0: port 3(tap-opsft2-1) entered blocking state
[791938.388484] k8-bridge0: port 3(tap-opsft2-1) entered forwarding state
[801076.265404] usb 4-2.4: new SuperSpeed Gen 1 USB device number 5 using xhci_hcd
[801076.289933] usb 4-2.4: New USB device found, idVendor=174c, idProduct=55aa, bcdDevice= 1.00
[801076.289937] usb 4-2.4: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[801076.289939] usb 4-2.4: Product: ASM105x
[801076.289940] usb 4-2.4: Manufacturer: ASMT
[801076.289942] usb 4-2.4: SerialNumber:
[801076.291139] scsi host10: uas
[801076.291557] scsi 10:0:0:0: Direct-Access ASMT 2115 0 PQ: 0 ANSI: 6
[801076.292065] sd 10:0:0:0: Attached scsi generic sg0 type 0
[801076.292232] sd 10:0:0:0: [sda] Spinning up disk...
[801077.321342] ..ready
[801082.447597] sd 10:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[801082.447600] sd 10:0:0:0: [sda] 4096-byte physical blocks
[801082.447673] sd 10:0:0:0: [sda] Write Protect is off
[801082.447674] sd 10:0:0:0: [sda] Mode Sense: 43 00 00 00
[801082.447832] sd 10:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[801082.448032] sd 10:0:0:0: [sda] Optimal transfer size 33553920 bytes not a multiple of physical block size (4096 bytes)
[801082.494646] sd 10:0:0:0: [sda] Attached SCSI disk
[801150.687429] loop: module loaded
[801150.815997] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[803002.579925] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803002.579960] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803017.725341] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[803081.125594] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803081.125635] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803085.522063] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[803239.336895] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803239.336950] blk_update_request: I/O err
Bug#983379: [PATCH] um: mark all kernel symbols as local
On 05/03/2021 20:43, Johannes Berg wrote:

From: Johannes Berg

Ritesh reported a bug [1] against UML, noting that it crashed on startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
#26 0x60015b5d in sem_init () at ipc/sem.c:268
#27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
#28 0x7f8990ab8fb2 in call_init (...) at dl-init.c:72
...
#40 0x7f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
#44 0x7f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
#45 0x7f8990968b85 in __getgrnam_r [...]
#46 0x7f89909d6b77 in grantpt [...]
#47 0x7f8990a9394e in __GI_openpty [...]
#48 0x604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
#49 0x604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
#50 0x60004a3d in start_uml () at arch/um/kernel/skas/process.c:45
#51 0x600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
#52 0x6000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(), which internally calls __getgrnam_r(), which causes the nsswitch machinery to get started.

This loads, through lots of indirection that I snipped, the libcom_err.so.2 library, which (in an unknown function, "??") calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since it's linked against libpthread. However, the dynamic linker looks up that symbol against the binary first, and gets the kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol, so the dynamic linker wouldn't do the lookup this way. I tried, but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to work, though I cannot entirely explain - it *also* works if I just add "VERSION { { global: *; }; }" instead, indicating that something else is happening that I don't really understand. It may be that explicitly doing that marks them with some kind of empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem to address the issue, so do that. Also do it for static link, nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf
Signed-off-by: Johannes Berg
---
 arch/um/kernel/dyn.lds.S | 6 ++++++
 arch/um/kernel/uml.lds.S | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/arch/um/kernel/dyn.lds.S b/arch/um/kernel/dyn.lds.S
index dacbfabf66d8..2f2a8ce92f1e 100644
--- a/arch/um/kernel/dyn.lds.S
+++ b/arch/um/kernel/dyn.lds.S
@@ -6,6 +6,12 @@ OUTPUT_ARCH(ELF_ARCH)
 ENTRY(_start)
 jiffies = jiffies_64;
 
+VERSION {
+  {
+    local: *;
+  };
+}
+
 SECTIONS
 {
   PROVIDE (__executable_start = START);
diff --git a/arch/um/kernel/uml.lds.S b/arch/um/kernel/uml.lds.S
index 45d957d7004c..7a8e2b123e29 100644
--- a/arch/um/kernel/uml.lds.S
+++ b/arch/um/kernel/uml.lds.S
@@ -7,6 +7,12 @@ OUTPUT_ARCH(ELF_ARCH)
 ENTRY(_start)
 jiffies = jiffies_64;
 
+VERSION {
+  {
+    local: *;
+  };
+}
+
 SECTIONS
 {
   /* This must contain the right address - not quite the default ELF one.*/

Acked-By: Anton Ivanov

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#983379: linux uml segfault
On 05/03/2021 18:32, Johannes Berg wrote: On 5 March 2021 18:39:42 CET, Anton Ivanov wrote: On 04/03/2021 07:47, Johannes Berg wrote: On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote: Now, I don't know how to fix it (short of changing your nsswitch configuration) - maybe we could somehow rename sem_init()? Or maybe we can somehow give the kernel binary a lower symbol resolution than the libc/libpthread. objcopy (from binutils) can localize symbols (i.e., objcopy -L sem_init $orig_file $new_file). It also does renaming symbols. But not sure this is the ideal solution. Yes, we started thinking about it but it was too late at night when I replied ... I think there's basically a way to have an external list of symbols to export, for symbol versioning, that we could/should use to basically not export any of the kernel symbols out to libs. How does UML handle symbol conflicts between userspace code and Linux kernel (like this case sem_init) ? AFAIK, libnl has a same symbol as Linux kernel (genlmsg_put) and others can possibly do as well. I fear it doesn't? Let's assume it does not, and try to fix this by de-conflicting the symbol. For the time being, also, let's aim for a Debian specific patch just to go into their "patches" dir for build so that UML is not dropped out of the release. This should make all internal uses of sem_init be um_sem_init in the actual object files. I will chase the issue of it picking up glibc memcpy separately. Upon close inspection it looks like a different issue - it is in the other direction (picking a dynamic symbol instead of the one from the tree). I spent all day chasing it today and I cannot reproduce it. At the same time it was reproducible yesterday without any problems :( +#ifdef CONFIG_UML +void __init um_sem_init(void) +#else void __init sem_init(void) +#endif Might be easier to just #define sem_init um_sem_init in an appropriate header file, perhaps even in arch/um/? 
I thought of that, but surrendered to the "dark side" of the quick and ugly fix. We can do that for the ipc/sem.c - it brings in uaccess.h which ultimately pulls uaccess from our asm tree. So if we do it there, it will end up in sem.c However, that function is also referenced and is invoked out of ipc/util.c which does not pull that include. I am going to dig through the rest of our includes to see if we can find a suitable one which will be picked up by both sem.c and util.c. I hope there is a place which we can use for a "proper" fix. By the way, I actually remember seeing a couple of includes like that somewhere dealing with other um symbol conflicts, just can't remember where I saw it. johannes -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#983379: linux uml segfault
On 04/03/2021 07:47, Johannes Berg wrote:

On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote:

Now, I don't know how to fix it (short of changing your nsswitch configuration) - maybe we could somehow rename sem_init()? Or maybe we can somehow give the kernel binary a lower symbol resolution than the libc/libpthread.

objcopy (from binutils) can localize symbols (i.e., objcopy -L sem_init $orig_file $new_file). It also does renaming symbols. But not sure this is the ideal solution.

Yes, we started thinking about it but it was too late at night when I replied ...

I think there's basically a way to have an external list of symbols to export, for symbol versioning, that we could/should use to basically not export any of the kernel symbols out to libs.

How does UML handle symbol conflicts between userspace code and Linux kernel (like this case sem_init) ? AFAIK, libnl has a same symbol as Linux kernel (genlmsg_put) and others can possibly do as well.

I fear it doesn't?

Let's assume it does not, and try to fix this by de-conflicting the symbol. For the time being, also, let's aim for a Debian specific patch just to go into their "patches" dir for build so that UML is not dropped out of the release.

This should make all internal uses of sem_init be um_sem_init in the actual object files.

I will chase the issue of it picking up glibc memcpy separately. Upon close inspection it looks like a different issue - it is in the other direction (picking a dynamic symbol instead of the one from the tree). I spent all day chasing it today and I cannot reproduce it. At the same time it was reproducible yesterday without any problems :(

Ritesh, can you give the following a spin - it renames sem_init as um_sem_init for UML only?

diff --git a/ipc/sem.c b/ipc/sem.c
index f6c30a85dadf..5157796daf54 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -263,7 +263,11 @@ void sem_exit_ns(struct ipc_namespace *ns)
 }
 #endif
 
+#ifdef CONFIG_UML
+void __init um_sem_init(void)
+#else
 void __init sem_init(void)
+#endif
 {
 	sem_init_ns(&init_ipc_ns);
 	ipc_init_proc_interface("sysvipc/sem",
diff --git a/ipc/util.h b/ipc/util.h
index 5766c61aed0e..b3356efb3c96 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -47,7 +47,12 @@ extern int ipc_min_cycle;
 #define IPCMNI_IDX_MASK ((1 << IPCMNI_SHIFT) - 1)
 #endif /* CONFIG_SYSVIPC_SYSCTL */
 
+#ifdef CONFIG_UML
+void um_sem_init(void);
+#define sem_init() um_sem_init()
+#else
 void sem_init(void);
+#endif
 void msg_init(void);
 void shm_init(void);

johannes

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
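The renaming trick in the patch above can be tried outside the kernel with a few lines of standalone C. This is only a sketch: CONFIG_UML is defined by hand here as a stand-in for the real config option, and sem_initialised and boot() are hypothetical names used to make the effect observable.

```c
#define CONFIG_UML 1	/* assumption: stands in for the kernel config option */

#ifdef CONFIG_UML
void um_sem_init(void);
/* Every call written as sem_init() now compiles to um_sem_init(). */
#define sem_init() um_sem_init()
#else
void sem_init(void);
#endif

static int sem_initialised;	/* hypothetical flag, for illustration only */

/* The definition must use the real name: with the macro active,
 * "sem_init(void)" would not expand cleanly, hence the #ifdef. */
#ifdef CONFIG_UML
void um_sem_init(void)
#else
void sem_init(void)
#endif
{
	sem_initialised = 1;
}

/* Hypothetical caller, playing the role of the call site in ipc/util.c. */
int boot(void)
{
	sem_init();
	return sem_initialised;
}
```

With CONFIG_UML defined, the object file contains a um_sem_init symbol and no sem_init at all, so there is nothing left for a library's sem_init reference to bind to by accident.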
Bug#983379: linux uml segfault
On 04/03/2021 18:41, Anton Ivanov wrote: On 04/03/2021 08:05, Benjamin Berg wrote: On Thu, 2021-03-04 at 08:47 +0100, Johannes Berg wrote: On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote: Now, I don't know how to fix it (short of changing your nsswitch configuration) - maybe we could somehow rename sem_init()? Or maybe we can somehow give the kernel binary a lower symbol resolution than the libc/libpthread. objcopy (from binutils) can localize symbols (i.e., objcopy -L sem_init $orig_file $new_file). It also does renaming symbols. But not sure this is the ideal solution. Yes, we started thinking about it but it was too late at night when I replied ... I think there's basically a way to have an external list of symbols to export, for symbol versioning, that we could/should use to basically not export any of the kernel symbols out to libs. Maybe using the ld --version-script= option here works to mark all kernel symbols as being "local" and prevent them from being picked up by libraries. Benjamin How does UML handle symbol conflicts between userspace code and Linux kernel (like this case sem_init) ? AFAIK, libnl has a same symbol as Linux kernel (genlmsg_put) and others can possibly do as well. I fear it doesn't? I can confirm that it did and this bug is bisect-able. with 5.7 # dd if=/dev/ubda of=/dev/null bs=1M 16384+1 records in 16384+1 records out 17179869696 bytes (17 GB, 16 GiB) copied, 10.6973 s, 1.6 GB/s with 5.10 the speed is 2.2 5.7 with "strings from glibc" patch speed is 2.2 As we did not do anything else in this timeframe to jack up the speed from 1.6GB/s to 2.2GB/s and as it is identical to the speed you get with the "use glibc strings.h" this looks like a good criteria to bisect on. I am going to do a bisect with 5.7 "good" and 5.10 "bad" using the speed test as a working hypothesis. This is proving very "interesting" to try to chase down, because the "picking the wrong library" does not happen every time. F.E. 
yesterday my 5.10 builds were picking glibc memcpy and friends. Today with the same config and everything else the same it is picking built-ins. I need to find some better way to reproduce this. A. A. johannes ___ linux-um mailing list linux...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-um -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#983379: linux uml segfault
On 04/03/2021 08:05, Benjamin Berg wrote: On Thu, 2021-03-04 at 08:47 +0100, Johannes Berg wrote: On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote: Now, I don't know how to fix it (short of changing your nsswitch configuration) - maybe we could somehow rename sem_init()? Or maybe we can somehow give the kernel binary a lower symbol resolution than the libc/libpthread. objcopy (from binutils) can localize symbols (i.e., objcopy -L sem_init $orig_file $new_file). It also does renaming symbols. But not sure this is the ideal solution. Yes, we started thinking about it but it was too late at night when I replied ... I think there's basically a way to have an external list of symbols to export, for symbol versioning, that we could/should use to basically not export any of the kernel symbols out to libs. Maybe using the ld --version-script= option here works to mark all kernel symbols as being "local" and prevent them from being picked up by libraries. Benjamin How does UML handle symbol conflicts between userspace code and Linux kernel (like this case sem_init) ? AFAIK, libnl has a same symbol as Linux kernel (genlmsg_put) and others can possibly do as well. I fear it doesn't? I can confirm that it did and this bug is bisect-able. with 5.7 # dd if=/dev/ubda of=/dev/null bs=1M 16384+1 records in 16384+1 records out 17179869696 bytes (17 GB, 16 GiB) copied, 10.6973 s, 1.6 GB/s with 5.10 the speed is 2.2 5.7 with "strings from glibc" patch speed is 2.2 As we did not do anything else in this timeframe to jack up the speed from 1.6GB/s to 2.2GB/s and as it is identical to the speed you get with the "use glibc strings.h" this looks like a good criteria to bisect on. I am going to do a bisect with 5.7 "good" and 5.10 "bad" using the speed test as a working hypothesis. A. 
johannes -- Anton R. Ivanov https://www.kot-begemot.co.uk/
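The speed-based bisect proposed above could in principle be driven by `git bisect run`, which treats exit status 0 as "good", 1 as "bad" and 125 as "skip". The following is only a sketch of the classification step: it reads a captured `dd` summary line on stdin and does not build or boot UML. The function name `classify_dd_rate` and the 1.9 GB/s cut-off are assumptions made for illustration, the latter picked between the 1.6 GB/s and 2.2 GB/s figures quoted in the thread.

```shell
#!/bin/sh
# Classify one dd summary line (stdin) for use from a `git bisect run` script.
# Slow (kernel's own string functions)  -> prints "good", returns 0.
# Fast (glibc string functions in use)  -> prints "bad",  returns 1.
# No dd summary line found              -> prints "skip", returns 125.
# THRESHOLD_GBS is an assumed cut-off, not a value from the thread.
THRESHOLD_GBS=${THRESHOLD_GBS:-1.9}

classify_dd_rate() {
    awk -v limit="$THRESHOLD_GBS" '
        /copied/ {
            # Field 1 is the byte count; the field three from the end is the
            # elapsed time in seconds ("... copied, 10.6973 s, 1.6 GB/s").
            rate = $1 / $(NF - 3) / 1e9
            found = 1
        }
        END {
            if (!found) { print "skip"; exit 125 }
            if (rate > limit) { print "bad"; exit 1 }
            print "good"
        }'
}
```

Computing the rate from bytes and seconds avoids depending on the unit `dd` chooses for its last field. A wrapper would still have to build the tree, boot the guest and capture the `dd` output before calling this.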
Bug#983379: linux uml segfault
On 04/03/2021 05:38, Hajime Tazaki wrote: On Thu, 04 Mar 2021 07:40:00 +0900, Johannes Berg wrote: I think the problem is here: #24 0x6080f234 in ipc_init_ids (ids=0x60c60de8 ) at ipc/util.c:119 #25 0x60813c6d in sem_init_ns (ns=0x60d895bb ) at ipc/sem.c:254 #26 0x60015b5d in sem_init () at ipc/sem.c:268 #27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux- gnu/libcom_err.so.2 You're in the init of libcom_err.so.2, which is loaded by "libnss_nis.so.2" which is loaded by normal NSS code (getgrnam): #40 0x7f89909bf3a6 in nss_load_library (ni=ni@entry=0x61497db0) at nsswitch.c:359 #41 0x7f89909bfc39 in __GI___nss_lookup_function (ni=0x61497db0, fct_name=, fct_name@entry=0x7f899089b020 "setgrent") at nsswitch.c:467 #42 0x7f899089554b in init_nss_interface () at nss_compat/compat- grp.c:83 #43 init_nss_interface () at nss_compat/compat-grp.c:79 #44 0x7f8990895e35 in _nss_compat_getgrnam_r (name=0x7f8990a2a1e0 "tty", grp=0x7ffe3e7a2910, buffer=0x7ffe3e7a24e0 "", buflen=1024, errnop=0x7f899089eb00) at nss_compat/compat-grp.c:486 #45 0x7f8990968b85 in __getgrnam_r (name=name@entry=0x7f8990a2a1e0 "tty", resbuf=resbuf@entry=0x7ffe3e7a2910, buffer=buffer@entry=0x7ffe3e7a24e0 "", buflen=1024, result=result@entry=0x7ffe3e7a2908) at ../nss/getXXbyYY_r.c:315 You have a strange nsswitch configuration that causes all of this (libnss_nis.so.2 -> libcom_err.so.2) to get loaded. Now libcom_err.so.2 is trying to call sem_init(), and that gets ... tada ... Linux's sem_init() instead of libpthread's. And then the crash. Now, I don't know how to fix it (short of changing your nsswitch configuration) - maybe we could somehow rename sem_init()? Or maybe we can somehow give the kernel binary a lower symbol resolution than the libc/libpthread. objcopy (from binutils) can localize symbols (i.e., objcopy -L sem_init $orig_file $new_file). It also does renaming symbols. But not sure this is the ideal solution. 
> How does UML handle symbol conflicts between userspace code and Linux kernel (like this case sem_init) ? AFAIK, libnl has a same symbol as Linux kernel (genlmsg_put) and others can possibly do as well.

It used to handle them. I do not think it does now - something broke and it's fairly recent.

I actually have something which confirms this. I worked on a patch around 5.8-5.9 which would give the option to pick up libc equivalents for the functions from string.h, and there was a clear performance difference of ~20%+. This is because UML has no means of optimizing them and picks up the worst-case x86 version.

I parked that for a while, because I had to look at other stuff at work. I restarted working on it after 5.10. My first observation was that despite not changing anything in the patches, the gain was no longer there. The performance was the same as if it picked up libc equivalents.

I can either try to reproduce the nss config which causes the sem_init issue or use my own libc patchset to try to bisect. The problem commit will be roughly around the time the performance difference from applying the "switch to libc" goes away.

Brgds,
A.

> -- Hajime

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
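To get a rough idea of which names could clash in the way sem_init and genlmsg_put do, one could intersect the symbol listings of the UML binary and of a shared library. This is a sketch under assumptions: the helper name `colliding_symbols` is made up here, the listings are expected in `nm`'s usual "address type name" layout, and whether a given kernel symbol is exported dynamically at all depends on how the binary was linked.

```shell
#!/bin/sh
# Print symbol names present in both of two nm-style listings.
# Only the last field (the name) is compared; a symbol-version suffix
# such as "@@GLIBC_2.2.5" is stripped before comparing.
colliding_symbols() {
    first=$(mktemp) && second=$(mktemp) || return 2
    awk 'NF { sub(/@.*/, "", $NF); print $NF }' "$1" | LC_ALL=C sort -u > "$first"
    awk 'NF { sub(/@.*/, "", $NF); print $NF }' "$2" | LC_ALL=C sort -u > "$second"
    # comm -12 keeps only the lines common to both sorted inputs.
    LC_ALL=C comm -12 "$first" "$second"
    rm -f "$first" "$second"
}

# Possible use (both paths are assumptions about the local system):
#   nm -D --defined-only ./linux > uml.syms
#   nm -D --defined-only /lib/x86_64-linux-gnu/libpthread.so.0 > lib.syms
#   colliding_symbols uml.syms lib.syms
```

A name showing up in the output only means both objects define it; it does not by itself show which definition the dynamic linker will pick.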
Bug#983379: linux uml segfault
On 03/03/2021 22:40, Johannes Berg wrote: I think the problem is here: #24 0x6080f234 in ipc_init_ids (ids=0x60c60de8 ) at ipc/util.c:119 #25 0x60813c6d in sem_init_ns (ns=0x60d895bb ) at ipc/sem.c:254 #26 0x60015b5d in sem_init () at ipc/sem.c:268 #27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux- gnu/libcom_err.so.2 You're in the init of libcom_err.so.2, which is loaded by "libnss_nis.so.2" which is loaded by normal NSS code (getgrnam): #40 0x7f89909bf3a6 in nss_load_library (ni=ni@entry=0x61497db0) at nsswitch.c:359 #41 0x7f89909bfc39 in __GI___nss_lookup_function (ni=0x61497db0, fct_name=, fct_name@entry=0x7f899089b020 "setgrent") at nsswitch.c:467 #42 0x7f899089554b in init_nss_interface () at nss_compat/compat- grp.c:83 #43 init_nss_interface () at nss_compat/compat-grp.c:79 #44 0x7f8990895e35 in _nss_compat_getgrnam_r (name=0x7f8990a2a1e0 "tty", grp=0x7ffe3e7a2910, buffer=0x7ffe3e7a24e0 "", buflen=1024, errnop=0x7f899089eb00) at nss_compat/compat-grp.c:486 #45 0x7f8990968b85 in __getgrnam_r (name=name@entry=0x7f8990a2a1e0 "tty", resbuf=resbuf@entry=0x7ffe3e7a2910, buffer=buffer@entry=0x7ffe3e7a24e0 "", buflen=1024, result=result@entry=0x7ffe3e7a2908) at ../nss/getXXbyYY_r.c:315 You have a strange nsswitch configuration that causes all of this (libnss_nis.so.2 -> libcom_err.so.2) to get loaded. Now libcom_err.so.2 is trying to call sem_init(), and that gets ... tada ... Linux's sem_init() instead of libpthread's. And then the crash. Now, I don't know how to fix it (short of changing your nsswitch configuration) - maybe we could somehow rename sem_init()? Or maybe we can somehow give the kernel binary a lower symbol resolution than the libc/libpthread. I have not looked in depth in how the linking process works, but it should have picked up the sem_init from the kernel library, not libc. We are already supposed to do that regarding kernel vs libc string.h functions - memcpy, etc. 
For all of them, though, the libc version does the same thing, so invoking the wrong one does not kill you; this may have been broken for a while without us noticing. johannes -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#983379: linux uml segfault
On 03/03/2021 10:45, Ritesh Raj Sarraf wrote:
> Hi Anton,
>
> On Wed, 2021-03-03 at 09:30 +0000, Anton Ivanov wrote:
>>> OTOH, I have one more user (other than you) who's not been able to reproduce the issue.
>>
>> I will do a bisect the moment I figure out how to reproduce it. I will try to do some more experiments on that tomorrow.
>>
>> I tried to alter the userspace a bit, but it makes no difference. Out of curiosity, what are you running it on?
>
> Bare-metal machines. 3 different machines, all Intel processors. And it fails on all 3 of them.

Hmmm... All mine are AMD. I can try to boot up an Intel later today with Bullseye to see if it makes a difference.

> On the distribution side, all 3 of them run Debian Unstable, with Linux 5.10.13
>
>> The code here is:
>>
>> static inline u32 printk_caller_id(void)
>> {
>>         return in_task() ? task_pid_nr(current) :
>>                 0x80000000 + raw_smp_processor_id();
>> }
>>
>> That is something which should not bomb out unless we have memory corruption or something along those lines - current being invalid.
>
> Must be something different. Not all machines could have bad memory at the same time.

I did not mean bad memory. I meant memory corruption as a result of a race, buffer overrun or anything else like that.

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
Bug#983379: linux uml segfault
On 02/03/2021 17:27, Ritesh Raj Sarraf wrote:
> On Tue, 2021-03-02 at 17:05 +0000, Anton Ivanov wrote:
>>> So the best I can extract for you is to compile the kernel with as much information as possible.
>>
>> Can you try using one of the older kernels so we can verify if this is indeed a 5.10 thing.
>
> That was the first thing I tried. I tested it with 5.10, 5.9 and 5.4. All 3 crashed. That's when I knew this one was going to be painful one to conclude.
>
> The only other input I have is that I have one more user who's reported to be able to reproduce the issue. OTOH, I have one more user (other than you) who's not been able to reproduce the issue.
>
>> I will do a bisect the moment I figure out how to reproduce it. I will try to do some more experiments on that tomorrow.

I tried to alter the userspace a bit, but it makes no difference. Out of curiosity, what are you running it on?

> Meanwhile, I enabled some debug info in the kernel. Here's what I have got so far:

```
(gdb) bt
#0 0x7f89908dc087 in kill () at ../sysdeps/unix/syscall-template.S:120
#1 0x604a3514 in uml_abort () at arch/um/os-Linux/util.c:94
#2 0x604a3791 in os_dump_core () at arch/um/os-Linux/util.c:149
#3 0x6048d126 in panic_exit (self=0x2e66d5, unused1=6, unused2=0x0) at arch/um/kernel/um_arch.c:217
#4 0x604c725a in notifier_call_chain (nl=0x2e66d5, val=0, v=0x60d82f40 , nr_to_call=-1, nr_calls=0x0) at kernel/notifier.c:83
#5 0x604c72f6 in atomic_notifier_call_chain (nh=0x2e66d5, val=6, v=0x0) at kernel/notifier.c:217
#6 0x60a54607 in panic (fmt=0x60a55225 "UH\211\345H\201\354", ) at kernel/panic.c:272
#7 0x6048cca3 in segv (fi=, ip=1615717312, is_user=0, regs=0x60c2ee58 ) at arch/um/kernel/trap.c:246
#8 0x6048ce64 in segv_handler (sig=3040981, unused_si=0x6, regs=0x60c2ee58 ) at arch/um/kernel/trap.c:190
#9 0x604a2556 in sig_handler_common (sig=11, si=0x60c2fbf0 , mc=0x60c2fae8 ) at arch/um/os-Linux/signal.c:48
#10 0x604a2aa2 in sig_handler (sig=3040981, si=0x6, mc=0x0) at arch/um/os-Linux/signal.c:81
#11 0x604a265f in hard_handler (sig=3040981, si=0x60c2fbf0 , p=0x0) at arch/um/os-Linux/signal.c:180
#12 <signal handler called>
```

The code here is:

static inline u32 printk_caller_id(void)
{
        return in_task() ? task_pid_nr(current) :
                0x80000000 + raw_smp_processor_id();
}

That is something which should not bomb out unless we have memory corruption or something along those lines - current being invalid.

A.

```
#13 0x604de3c0 in printk_caller_id () at kernel/printk/printk.c:1924
#14 log_output (text_len=, text=, dev_info=, lflags=, level=, facility=) at kernel/printk/printk.c:1932
#15 vprintk_store (facility=1624806843, level=5, dev_info=0x0, fmt=0x35 , args=0x1) at kernel/printk/printk.c:2004
#16 0x604de8b7 in vprintk_emit (facility=1624806843, level=1622768673, dev_info=0x35, fmt=0x1 , args=0x60b97c22) at kernel/printk/printk.c:2029
#17 0x604debad in vprintk_deferred (fmt=0x1 , args=0x60b97c21) at kernel/printk/printk.c:3079
#18 0x60a554de in printk_deferred (fmt=0x60d895bb "\n") at kernel/printk/printk.c:3091
#19 0x6092680f in _warn_unseeded_randomness (previous=, caller=, func_name=) at drivers/char/random.c:1534
#20 _warn_unseeded_randomness (func_name=0x60abf380 <__func__.38> "get_random_u32", caller=0x608b5f25 , previous=0x35) at drivers/char/random.c:1516
#21 0x60927d47 in get_random_u32 () at drivers/char/random.c:2221
#22 0x608b5f25 in bucket_table_alloc (nbuckets=64, gfp=3264, ht=) at lib/rhashtable.c:203
#23 0x608b6733 in rhashtable_init (ht=0x60c60e30 , params=0x608b5e06 ) at lib/rhashtable.c:1061
#24 0x6080f234 in ipc_init_ids (ids=0x60c60de8 ) at ipc/util.c:119
#25 0x60813c6d in sem_init_ns (ns=0x60d895bb ) at ipc/sem.c:254
#26 0x60015b5d in sem_init () at ipc/sem.c:268
#27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
#28 0x7f8990ab8fb2 in call_init (l=, argc=argc@entry=5, argv=argv@entry=0x7ffe3e7a4c98, env=env@entry=0x7ffe3e7a4cc8) at dl-init.c:72
#29 0x7f8990ab90b9 in call_init (env=0x7ffe3e7a4cc8, argv=0x7ffe3e7a4c98, argc=5, l=) at dl-init.c:30
#30 _dl_init (main_map=0x61497ea0, argc=5, argv=0x7ffe3e7a4c98, env=0x7ffe3e7a4cc8) at dl-init.c:119
#31 0x7f89909d82bd in __GI__dl_catch_exception (exception=exception@entry=0x0, operate=operate@entry=0x7f8990abc5a0 , args=args@entry=0x7ffe3e7a1e80) at dl-error-skeleton.c:182
#32 0x7f8990abd028 in dl_open_worker (a=a@entry=0x7ffe3e7a2020) at dl-open.c:758
#33 0x7f89909d8260 in __GI__dl_catch_exception (exception=exception@entry=0x7ffe3e7a2000, operate=operate@entry=0x7f8990abcc70 , args=args@entry=0x7ffe3e7a2020) at dl-error-skeleton.c:208
#34 0x7f8990abc8ca in _dl_open (file=0x7ffe3e7a22a0 "libnss_nis.so.2", mode=-2147483646, caller_dlopen=0x7f89909bf3a6 , nsid=-2, argc
```
Bug#983379: linux uml segfault
On 02/03/2021 14:23, Ritesh Raj Sarraf wrote:
> On Tue, 2021-03-02 at 11:34 +0000, Anton Ivanov wrote:
>> If gdb gives you the exact lines, that may be helpful.
>
> It doesn't. But it does show drawbacks in my packaging. The debug symbols packaged are not read/honored by gdb at all.

```
Reading symbols from /usr/bin/linux.uml...
Reading symbols from /usr/lib/debug/.build-id/6f/ea141539149074c72e80fb8004de124fda115b.debug...
(No debugging symbols found in /usr/lib/debug/.build-id/6f/ea141539149074c72e80fb8004de124fda115b.debug)
warning: Can't open file /dev/shm/#20817 (deleted) during file-backed mapping note processing
[New LWP 18788]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `linux ubd0=qemu-linux-image.img'.
Program terminated with signal SIGABRT, Aborted.
#0 0x7f51842c0087 in kill () at ../sysdeps/unix/syscall-template.S:120
120 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x7f51842c0087 in kill () at ../sysdeps/unix/syscall-template.S:120
#1 0x6049dc20 in uml_abort ()
#2 0x6049de7a in os_dump_core ()
#3 0x60486e47 in panic_exit ()
#4 0x604c0a03 in notifier_call_chain ()
#5 0x604c0a98 in atomic_notifier_call_chain ()
#6 0x60a26b85 in panic ()
#7 0x604869e1 in segv ()
#8 0x60486ba9 in segv_handler ()
#9 0x6049ccc0 in sig_handler_common ()
#10 0x6049d1ec in sig_handler ()
#11 0x6049cdc6 in hard_handler ()
#12 <signal handler called>
#13 0x604d45b4 in vprintk_store ()
#14 0x604d4aa8 in vprintk_emit ()
#15 0x604d4d86 in vprintk_deferred ()
#16 0x60a27a02 in printk_deferred ()
#17 0x609031b2 in get_random_u32 ()
#18 0x6088ff65 in bucket_table_alloc.isra ()
#19 0x60890740 in rhashtable_init ()
#20 0x607efaa2 in ipc_init_ids ()
#21 0x600153c9 in sem_init ()
```

> So the best I can extract for you is to compile the kernel with as much information as possible.

Can you try using one of the older kernels so we can verify if this is indeed a 5.10 thing.

I will do a bisect the moment I figure out how to reproduce it. I will try to do some more experiments on that tomorrow.

> Thanks,
> Ritesh

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
Bug#983379: linux uml segfault
On 02/03/2021 09:09, Ritesh Raj Sarraf wrote:
> On Wed, 2021-02-24 at 11:44 +0000, Anton Ivanov wrote:
>> In all cases it boots cleanly and there are no segfaults. So, frankly, no idea what is causing it to crash - I have run most combinations of 5.10 on a 5.10, all work fine here.
>
> Is there any other way I can help you with this issue ? I do have the core dump available on my local machine.

If gdb gives you the exact lines, that may be helpful.

I have looked through the bt several times; it is something my set-up cruises through. The actual moment you see in the backtrace is this one:

[0.08] random: get_random_u32 called from bucket_table_alloc.isra.0+0x115/0x13d with crng_init=0

However, in your case, instead of getting this printk warning out it blows up. Why - I don't know.

A.

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
Bug#940821: NFS Caching broken in 4.19.37
On 26/02/2021 15:03, Timo Rothenpieler wrote:
> I think I can reproduce this, or something that at least looks very similar to this, on 5.10. Namely on 5.10.17 (On both Client and Server).

I think this is a different issue - see below.

> We are running slurm, and since a while now (coincides with updating from 5.4 to 5.10, but a whole bunch of other stuff was updated at the same time, so it took me a while to correlate this) the logs it writes have been truncated, but only while they're being observed on the client, using tail -f or something like that.
>
> Looks like this then:
>
> On Server:
>
> store01 /srv/export/home/users/timo/TestRun # ls -l slurm-41101.out
> -rw-r--r-- 1 timo timo 1931 Feb 26 15:46 slurm-41101.out
> store01 /srv/export/home/users/timo/TestRun # wc -l slurm-41101.out
> 61 slurm-41101.out
>
> On Client:
>
> timo@login01 ~/TestRun $ ls -l slurm-41101.out
> -rw-r--r-- 1 timo timo 1931 Feb 26 15:46 slurm-41101.out
> timo@login01 ~/TestRun $ wc -l slurm-41101.out
> 24 slurm-41101.out
>
> See https://gist.github.com/BtbN/b9eb4fc08ccc53bb20087bce0bf9f826 for the respective file-contents.
>
> If I run the same test job, wait until its done, and then look at its slurm.out file, it matches between NFS Client and Server. If I tail -f the slurm.out on an NFS client, the file stops getting updated on the client, but keeps getting more logs written to it on the NFS server.
>
> The slurm.out file is being written to by another NFS client, which is running on one of the compute nodes of the system. It's being read from a login node.

These are two different clients, so what you see is possible on NFS with client side caching. If you have multiple clients reading/writing to the same files you usually need to tune the caching options and/or use locking. I suspect that if you leave it for a while (until the cache expires) it will sort itself out.

In my test-case it is just one client; it missed a file deletion and nothing short of an unmount and remount fixes that. I have waited for 30 mins+. It does not seem to refresh or expire.

I also see the opposite behavior - the bug shows up on 4.x up to at least 5.4. I do not see it on 5.10.

Brgds,

> Timo
>
> On 21.02.2021 16:53, Anton Ivanov wrote:
>> Client side. This seems to be an entirely client side issue.
>>
>> A variety of kernels on the clients starting from 4.9 and up to 5.10 using 4.19 servers. I have observed it on a 4.9 client versus 4.9 server earlier.
>>
>> 4.9 fails, 4.19 fails, 5.2 fails, 5.4 fails, 5.10 works.
>>
>> At present the server is at 4.19.67 in all tests.
>>
>> Linux jain 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux
>>
>> I can set-up a couple of alternative servers during the week, but so far everything is pointing towards a client fs cache issue, not a server one.
>>
>> Brgds,

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#983379: linux uml segfault
On 23/02/2021 17:26, Ritesh Raj Sarraf wrote: Added the debian bug report in CC. On Tue, 2021-02-23 at 17:19 +, Anton Ivanov wrote: The current Debian user-mode-linux package in unstable is based on the 5.10.5 stable source which includes the mentioned patch, but is still causing an error for some users. After updating the tree to 5.10.5 and applying all Debian patches from the package, I cannot reproduce the bug. I am running it on 5.10, 5.2 and 4.19 hosts with the same parameters without issues. Hosts are all up to date Debian 10.8 and so is the UML userspace. Did you mean 5.10, 5.2 and 4.19 (UML) guests ? We've seen this happen on Debian Testing and Unstable Host (of which the former would soon be the next stable i.e. Debian Bullseye). In our tests, when running the same linux uml binary (5.10) on a Debian Stable Host, it is working fine. I cannot reproduce it on a physical Bullseye host using the Debian user-mode-linux package compiled from source. Environment - Bullseye minimal install and build deps. 6 cores/12 threads Ryzen I cannot reproduce it using the upstream source and the patches from the user-mode-linux package Environment - same as above. I cannot reproduce it using the upstream source + patches and compiling on Buster using the following: 1. Bullseye physical host, minimal install, same hardware 2. Bullseye VM, minimal install, running with 4 vCPUs on the same host 3. Bullseye LXC container running on a Debian Buster host, minimal install, same hardware In all cases it boots cleanly and there are no segfaults. So, frankly, no idea what is causing it to crash - I have run most combinations of 5.10 on a 5.10, all work fine here. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#983379: linux uml segfault
On 23/02/2021 17:26, Ritesh Raj Sarraf wrote: Added the debian bug report in CC. On Tue, 2021-02-23 at 17:19 +, Anton Ivanov wrote: The current Debian user-mode-linux package in unstable is based on the 5.10.5 stable source which includes the mentioned patch, but is still causing an error for some users. After updating the tree to 5.10.5 and applying all Debian patches from the package, I cannot reproduce the bug. I am running it on 5.10, 5.2 and 4.19 hosts with the same parameters without issues. Hosts are all up to date Debian 10.8 and so is the UML userspace. Did you mean 5.10, 5.2 and 4.19 (UML) guests ? No. Hosts. I have several 6core/12thread Ryzens which are used for development testing. They all use identical userspace with the sole difference being the kernel. They all use a selection of 5.x because 4.19 does not support the hardware properly. The 4.19 testing is done on my old "test farm" which is all A8s and Athlon X760. We've seen this happen on Debian Testing and Unstable Host (of which the former would soon be the next stable i.e. Debian Bullseye). In our tests, when running the same linux uml binary (5.10) on a Debian Stable Host, it is working fine. OK. I will upgrade one of my systems to Debian testing to try to reproduce this. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#940821: NFS Caching broken in 4.19.37
On 21/02/2021 14:37, Bruce Fields wrote:
> On Sun, Feb 21, 2021 at 11:38:51AM +0000, Anton Ivanov wrote:
>> On 21/02/2021 09:13, Salvatore Bonaccorso wrote:
>>> On Sat, Feb 20, 2021 at 08:16:26PM +0000, Chuck Lever wrote:
>>>> Confirming you are varying client-side kernels. Should the Linux NFS client maintainers be Cc'd?
>>>
>>> Ok, agreed. Let's add them as well. NFS client maintainers any ideas on how to tackle this?
>>
>> This is not observed with Debian backports 5.10 package
>>
>> uname -a
>> Linux madding 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11) x86_64 GNU/Linux
>
> I'm still unclear: when you say you tested a certain kernel: are you varying the client-side kernel version, or the server side, or both at once?

Client side. This seems to be an entirely client side issue.

A variety of kernels on the clients starting from 4.9 and up to 5.10 using 4.19 servers. I have observed it on a 4.9 client versus 4.9 server earlier.

4.9 fails, 4.19 fails, 5.2 fails, 5.4 fails, 5.10 works.

At present the server is at 4.19.67 in all tests.

Linux jain 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

I can set-up a couple of alternative servers during the week, but so far everything is pointing towards a client fs cache issue, not a server one.

Brgds,

> --b.

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#940821: NFS Caching broken in 4.19.37
On 21/02/2021 09:13, Salvatore Bonaccorso wrote:
> Hi,
>
> On Sat, Feb 20, 2021 at 08:16:26PM +0000, Chuck Lever wrote:
>> On Feb 20, 2021, at 3:13 PM, Anton Ivanov wrote:
>>> On 20/02/2021 20:04, Salvatore Bonaccorso wrote:
>>>> Hi,
>>>>
>>>> On Mon, Jul 08, 2019 at 07:19:54PM +0100, Anton Ivanov wrote:
>>>>> Hi list,
>>>>>
>>>>> NFS caching appears broken in 4.19.37.
>>>>>
>>>>> The more cores/threads the easier to reproduce. Tested with identical results on Ryzen 1600 and 1600X.
>>>>>
>>>>> 1. Mount an openwrt build tree over NFS v4
>>>>> 2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a loop
>>>>> 3. Result after 3-4 iterations:
>>>>>
>>>>> State on the client
>>>>>
>>>>> ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
>>>>>
>>>>> total 8
>>>>> drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
>>>>> drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../
>>>>>
>>>>> State as seen on the server (mounted via nfs from localhost):
>>>>>
>>>>> ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
>>>>>
>>>>> total 12
>>>>> drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
>>>>> drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../
>>>>> -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h
>>>>>
>>>>> Actual state on the filesystem:
>>>>>
>>>>> ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
>>>>>
>>>>> total 12
>>>>> drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
>>>>> drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../
>>>>> -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h
>>>>>
>>>>> So the client has quite clearly lost the plot. Telling it to drop caches and re-reading the directory shows the file present.
>>>>>
>>>>> It is possible to reproduce this using a linux kernel tree too, just takes many more iterations - 10+ at least.
>>>>>
>>>>> Both client and server run 4.19.37 from Debian buster. This is filed as debian bug 931500. I originally thought it to be autofs related, but IMHO it is actually something fundamentally broken in nfs caching resulting in cache corruption.
>>>>
>>>> According to the reporter downstream in Debian, at https://bugs.debian.org/940821#26 this seems still reproducible with more recent kernels than the one initially reported. Is there anything Anton can provide to try to track down the issue?
>>>>
>>>> Anton, can you reproduce with current stable series?
>>>
>>> 100% reproducible with any kernel from 4.9 to 5.4, stable or backports. It may exist in earlier versions, but I do not have a machine with anything before 4.9 to test at present.
>>
>> Confirming you are varying client-side kernels. Should the Linux NFS client maintainers be Cc'd?
>
> Ok, agreed. Let's add them as well. NFS client maintainers any ideas on how to tackle this?

This is not observed with Debian backports 5.10 package

uname -a
Linux madding 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11) x86_64 GNU/Linux

I left the testcase running for ~ 4 hours on a 6core/12thread Ryzen. It should have blown up 10 times by now.

So one of the commits between 5.4 and 5.10.13 fixed it.

If nobody can think of a particular commit which fixes it, I can try bisecting it during the week.

A.

>>> From 1-2 make clean && make cycles to one afternoon depending on the number of machine cores. More cores/threads the faster it does it.
>>>
>>> I tried playing with protocol minor versions, caching options, etc - it is still reproducible for any nfs4 settings as long as there is client side caching of metadata.
>>>
>>> A.
>>>
>>>> Regards,
>>>> Salvatore
>>>
>>> --
>>> Anton R. Ivanov
>>> Cambridgegreys Limited. Registered in England. Company Number 10273661
>>> https://www.cambridgegreys.com/
>>
>> --
>> Chuck Lever
>
> Regards,
> Salvatore

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#940821: NFS Caching broken in 4.19.37
On 20/02/2021 20:04, Salvatore Bonaccorso wrote: Hi, On Mon, Jul 08, 2019 at 07:19:54PM +0100, Anton Ivanov wrote: Hi list, NFS caching appears broken in 4.19.37. The more cores/threads the easier to reproduce. Tested with identical results on Ryzen 1600 and 1600X. 1. Mount an openwrt build tree over NFS v4 2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a loop 3. Result after 3-4 iterations: State on the client ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 8 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ State as seen on the server (mounted via nfs from localhost): ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h Actual state on the filesystem: ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h So the client has quite clearly lost the plot. Telling it to drop caches and re-reading the directory shows the file present. It is possible to reproduce this using a linux kernel tree too, just takes much more iterations - 10+ at least. Both client and server run 4.19.37 from Debian buster. This is filed as debian bug 931500. I originally thought it to be autofs related, but IMHO it is actually something fundamentally broken in nfs caching resulting in cache corruption. 
According to the reporter downstream in Debian, at https://bugs.debian.org/940821#26 this still seems reproducible with more recent kernels than the one initially reported. Is there anything Anton can provide to try to track down the issue? Anton, can you reproduce with the current stable series?

100% reproducible with any kernel from 4.9 to 5.4, stable or backports. It may exist in earlier versions, but I do not have a machine with anything before 4.9 to test at present.

From 1-2 make clean && make cycles to one afternoon depending on the number of machine cores. The more cores/threads, the faster it does it. I tried playing with protocol minor versions, caching options, etc - it is still reproducible for any nfs4 settings as long as there is client side caching of metadata.

A.

Regards, Salvatore

-- Anton R. Ivanov Cambridgegreys Limited. Registered in England. Company Number 10273661 https://www.cambridgegreys.com/
Bug#940821: closed by Bastian Blank (No response by submitter)
On 20/02/2021 10:33, Debian Bug Tracking System wrote:

This is an automatic notification regarding your Bug report which was filed against the src:linux package:

#940821: linux-image-5.2.0-2-amd64: file cache corruption with nfs4

It has been closed by Bastian Blank. Their explanation is attached below along with your original report. If this explanation is unsatisfactory and you have not received a better one in a separate message then please contact Bastian Blank by replying to this email.

I missed the question. Probably hit the spam bucket for some reason.

I am able to reproduce it with more recent versions as well. The most recent one I have around is 5.4.0-0.bpo.2-amd64. Still reproducible 100% - just tested it.

It is trivial to reproduce if anyone actually bothers to do so.

Just grab a big enough tree where make runs truly in parallel - openwrt is best, but even the Linux kernel does the job. Mount it via nfs4 from another server (it will work even locally, but takes longer to reproduce - may take a whole afternoon).

Run

    while make -j 12 clean && make -j 12 ; do true ; done

Leave it to run. On 6 cores/12 threads it takes 2-3 builds of openwrt or ~ 5-8 linux kernel builds to blow up. More cores - faster. Fewer cores - slower.

I sent it to the mailing list too, but nobody could be bothered to even ask any questions.

-- Anton R. Ivanov https://www.kot-begemot.co.uk/
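The symptom reported in this thread is a client-side directory listing that disagrees with what the server has on disk. A minimal sketch of how that comparison could be automated between build iterations, assuming (as in the original report) that both the NFS-mounted path and the exported path are visible from one host. The function name `compare_listings` is hypothetical and not part of any tool mentioned in the thread.

```shell
#!/bin/sh
# Hypothetical helper: list names that are present in only one of two
# directories. Returns 0 when the listings agree, 1 when they differ.
compare_listings() {
    client_dir=$1
    server_dir=$2
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 2
    # ls -A includes dotfiles but skips "." and ".."
    ls -A "$client_dir" | sort > "$tmp_a"
    ls -A "$server_dir" | sort > "$tmp_b"
    # comm -3 drops the names common to both sorted listings
    diff_out=$(comm -3 "$tmp_a" "$tmp_b")
    rm -f "$tmp_a" "$tmp_b"
    [ -z "$diff_out" ] && return 0
    printf '%s\n' "$diff_out"
    return 1
}
```

Called with the NFS mount point as the first argument and the exported directory as the second, a file such as `ipcbuf.h` that is missing on the client would be printed in the second (tab-indented) column.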
Bug#928924: user-mode-linux: xterm functionality broken due to wrong path to port-helper
On 06/01/2020 16:21, Ritesh Raj Sarraf wrote:

Control: tag -1 +help

On Mon, 2020-01-06 at 10:38 +0100, Sjoerd Simons wrote:

On my sid system:

```
$ strings /usr/bin/linux.uml | grep port-helper
/usr/lib//uml/port-helper
```

So the path is still incorrect even with newer upstream kernels.

I spent some time today looking at the new build but I haven't been able to ascertain why this isn't setting the correct path.

It is used in a "user" side file - xterm.c. None of these sees "CONFIG_", so it considers it undef-ed, which defaults to 32 bit. You need to use some other way to figure out what the build is, or to set OS_LIB_PATH for this case.

```
$ strings `which linux.uml` | grep port-helper
/usr/lib/uml/port-helper
```

First, for context to the readers here, the port-helper binary is shipped with the uml-utilities package. This package, depending on the architecture, installs the binary to an architecture-specific location.

https://sources.debian.org/src/uml-utilities/20070815.2-1/Makefile/#L10

Which on an amd64 machine is: /usr/lib64/uml/port-helper

```
$ dpkg -S /usr/lib64/uml/port-helper
uml-utilities: /usr/lib64/uml/port-helper
```

The UML setup on my box always worked because long back, when I first encountered this problem, I had created a symlink of the path to /usr/lib/ too. And had completely forgotten about it. My apologies.

But that said, the current problem is with the UML binary built by the kernel sources. The problem is that, as mentioned above and in other reports on this bug report thread, the path resolved at build time is always "/usr/lib/uml/".

The build configuration and the code are all correct.

```
$ grep 64BIT .config
CONFIG_64BIT=y
CONFIG_64BIT_TIME=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
```

Snipped from: arch/um/include/shared/os.h

```
#ifdef CONFIG_64BIT
#define OS_LIB_PATH "/usr/lib64/"
#else
#define OS_LIB_PATH "/usr/lib/"
#endif
```

I also checked the generated include headers and they are correct for the amd64 .config file.

```
linux-source-5.4/include/generated$ grep 64BIT autoconf.h
#define CONFIG_64BIT_TIME 1
#define CONFIG_PHYS_ADDR_T_64BIT 1
#define CONFIG_64BIT 1
#define CONFIG_ARCH_DMA_ADDR_T_64BIT 1
```

I'll keep looking as time permits but if anyone else has ideas on what I may be doing wrong, please do mention.

Thanks, Ritesh

___ linux-um mailing list linux...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-um

-- Anton R. Ivanov https://www.kot-begemot.co.uk/
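The check done by hand with `strings` in this message can be wrapped in a small script that extracts the port-helper path compiled into a UML binary and reports whether it exists on the host. This is only a sketch: the function name `embedded_helper_path` is hypothetical, and it uses `grep -a -o` (as provided by GNU grep) instead of `strings` so that it does not depend on binutils.

```shell
#!/bin/sh
# Hypothetical helper: print the first port-helper path embedded in a binary.
# -a treats the binary as text, -o prints only the matching part.
embedded_helper_path() {
    LC_ALL=C grep -a -o '/usr/lib[0-9A-Za-z_/]*/port-helper' "$1" | head -n 1
}

# Hypothetical helper: report whether the embedded path exists on this host.
check_helper_path() {
    path=$(embedded_helper_path "$1")
    if [ -z "$path" ]; then
        echo "no port-helper path found in $1"
        return 2
    fi
    if [ -x "$path" ]; then
        echo "ok: $path"
    else
        echo "missing: $path"
        return 1
    fi
}
```

On the amd64 setup described above, `check_helper_path /usr/bin/linux.uml` would be expected to report the `/usr/lib/uml/port-helper` path as missing unless a symlink like the one Ritesh mentions is present.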
Bug#945213: Info received (Bug#945213: linux-image-5.2.0-3-amd64: OOM handling broken if hugepages are enabled)
[0.00] Linux version 5.2.0-3-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-22)) #1 SMP Debian 5.2.17-1 (2019-09-26) [0.00] Command line: BOOT_IMAGE=diskless/amd64/vmlinuz-5.2.0-3-amd64 initrd=diskless/amd64/initrd.img-5.2.0-3-amd64 root=/dev/nfs ip=dhcp nfsroot=192.168.3.3:/exports/boot/buster-bess mitigations=off rw -- [0.00] random: get_random_u32 called from bsp_init_amd+0x20b/0x2b0 with crng_init=0 [0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: [mem 0x-0x0009e7ff] usable [0.00] BIOS-e820: [mem 0x0009e800-0x0009] reserved [0.00] BIOS-e820: [mem 0x000e-0x000f] reserved [0.00] BIOS-e820: [mem 0x0010-0x9dc43fff] usable [0.00] BIOS-e820: [mem 0x9dc44000-0x9ddc] reserved [0.00] BIOS-e820: [mem 0x9ddd-0x9ddd] ACPI data [0.00] BIOS-e820: [mem 0x9dde-0x9e13bfff] ACPI NVS [0.00] BIOS-e820: [mem 0x9e13c000-0x9e694fff] reserved [0.00] BIOS-e820: [mem 0x9e695000-0x9e695fff] usable [0.00] BIOS-e820: [mem 0x9e696000-0x9e89bfff] ACPI NVS [0.00] BIOS-e820: [mem 0x9e89c000-0x9ecb1fff] usable [0.00] BIOS-e820: [mem 0x9ecb2000-0x9eff3fff] reserved [0.00] BIOS-e820: [mem 0x9eff4000-0x9eff] usable [0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved [0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved [0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] reserved [0.00] BIOS-e820: [mem 0xfed8-0xfed8] reserved [0.00] BIOS-e820: [mem 0xff00-0x] reserved [0.00] BIOS-e820: [mem 0x00011000-0x00015eff] usable [0.00] NX (Execute Disable) protection: active [0.00] SMBIOS 2.7 present. 
[0.00] DMI: System manufacturer System Product Name/F2A55, BIOS 5301 10/10/2012 [0.00] tsc: Fast TSC calibration using PIT [0.00] tsc: Detected 3501.783 MHz processor [0.003478] e820: update [mem 0x-0x0fff] usable ==> reserved [0.003479] e820: remove [mem 0x000a-0x000f] usable [0.003485] last_pfn = 0x15f000 max_arch_pfn = 0x4 [0.003490] MTRR default type: uncachable [0.003490] MTRR fixed ranges enabled: [0.003491] 0-9 write-back [0.003492] A-B write-through [0.003493] C-D2FFF write-protect [0.003494] D3000-E7FFF uncachable [0.003494] E8000-F write-protect [0.003495] MTRR variable ranges enabled: [0.003496] 0 base mask 8000 write-back [0.003497] 1 base 8000 mask E000 write-back [0.003498] 2 base 9F00 mask FF00 uncachable [0.003498] 3 disabled [0.003499] 4 disabled [0.003499] 5 disabled [0.003500] 6 disabled [0.003500] 7 disabled [0.003501] TOM2: 00015f00 aka 5616M [0.003713] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [0.003882] e820: update [mem 0x9f00-0x] usable ==> reserved [0.003887] last_pfn = 0x9f000 max_arch_pfn = 0x4 [0.007940] found SMP MP-table at [mem 0x000fd870-0x000fd87f] [0.030016] Using GB pages for direct mapping [0.030018] BRK [0x133801000, 0x133801fff] PGTABLE [0.030020] BRK [0x133802000, 0x133802fff] PGTABLE [0.030021] BRK [0x133803000, 0x133803fff] PGTABLE [0.030074] BRK [0x133804000, 0x133804fff] PGTABLE [0.030076] BRK [0x133805000, 0x133805fff] PGTABLE [0.030380] BRK [0x133806000, 0x133806fff] PGTABLE [0.030449] BRK [0x133807000, 0x133807fff] PGTABLE [0.030551] BRK [0x133808000, 0x133808fff] PGTABLE [0.030642] BRK [0x133809000, 0x133809fff] PGTABLE [0.030767] BRK [0x13380a000, 0x13380afff] PGTABLE [0.030857] BRK [0x13380b000, 0x13380bfff] PGTABLE [0.030919] BRK [0x13380c000, 0x13380cfff] PGTABLE [0.031040] RAMDISK: [mem 0x7e75-0x7fff] [0.031046] ACPI: Early table checksum verification disabled [0.039448] ACPI: RSDP 0x000F0490 24 (v02 ALASKA) [0.039451] ACPI: XSDT 0x9DDD8078 64 (v01 ALASKA A M I 01072009 AMI 00010013) [0.039457] 
ACPI: FACP 0x9DDDE868 00010C (v05 ALASKA A M I 01072009 AMI 00010013) [0.039461] ACPI BIOS Warni
Bug#945213: linux-image-5.2.0-3-amd64: OOM handling broken if hugepages are enabled
On 22/11/2019 19:32, Ben Hutchings wrote:

Control: reassign -1 src:linux 5.2.17-1
Control: tag -1 moreinfo

On Thu, 2019-11-21 at 08:58 +, Anton Ivanov wrote:

Package: linux-image-5.2.0-3-amd64
Version: 5.2.17+1
Severity: important

Dear Maintainer,

OOM handling appears to be broken in 5.2.17-1 if hugepages are enabled.

Test system: AMD A4-5300, 40G RAM, no swap, booted disklessly.

Without hugepages enabled I can compile dpdk without any issues. With huge pages enabled it will reproducibly OOM when trying to link one of the libraries. There are 20G+ of free RAM at that point according to free, with the rest being mostly used as buffers.

It is sufficient to just enable huge pages to trigger this (2G out of 40G); they are not allocated or used by anything.

What do you mean by "if hugepages are enabled"? hugetlbfs and THP are enabled by default.

$ tail -2 sysctl.conf
vm.nr_hugepages=1024

If you do not have that, the compile completes fine. If you have that, the compile blows up when linking one of the dpdk libraries. At that point the machine has ~ 20G free RAM.

You need to provide a log of the OOM messages.

Ack. I will re-run the tests tomorrow and update the bug with detailed logs and the OOM.

Ben.

-- Anton R. Ivanov https://www.kot-begemot.co.uk/
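The "2G out of 40G" figure in the report follows from `vm.nr_hugepages=1024` with a 2048 kB huge page size (1024 x 2048 kB = 2097152 kB). A minimal sketch that derives the same number from `/proc/meminfo`, assuming the usual `HugePages_Total` and `Hugepagesize` fields; the function name `hugepage_reserved_kb` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the memory set aside for the huge page pool, in kB.
# Reads a meminfo-format file; defaults to /proc/meminfo.
hugepage_reserved_kb() {
    awk '/^HugePages_Total:/ { n = $2 }
         /^Hugepagesize:/    { sz = $2 }
         END                 { print n * sz }' "${1:-/proc/meminfo}"
}
```

Running `hugepage_reserved_kb` next to `free -k` when the OOM triggers would show how much of the memory reported as free is actually usable for ordinary allocations.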
Bug#945213: linux-image-5.2.0-3-amd64: OOM handling broken if hugepages are enabled
Package: linux-image-5.2.0-3-amd64
Version: 5.2.17+1
Severity: important

Dear Maintainer,

OOM handling appears to be broken in 5.2.17-1 if hugepages are enabled.

Test system: AMD A4-5300, 40G RAM, no swap, booted disklessly.

Without hugepages enabled I can compile dpdk without any issues. With huge pages enabled it will reproducibly OOM when trying to link one of the libraries. There are 20G+ of free RAM at that point according to free, with the rest being mostly used as buffers.

It is sufficient to just enable huge pages to trigger this (2G out of 40G); they are not allocated or used by anything.

-- System Information:
Debian Release: 10.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 5.2.0-3-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Bug#938962: [PATCH] um: Add back support for extra userspace libraries
PCAP and VDE network transports require linking with userspace libraries. The current build system has no means of passing these as arguments. This patch adds a script to expand the library list for linking for these transports as well as any future driver that needs to rely on additional libraries on the userspace side. Signed-off-by: Anton Ivanov --- arch/um/scripts/extra-libs.sh | 10 ++ scripts/link-vmlinux.sh | 4 +++- 2 files changed, 13 insertions(+), 1 deletion(-) create mode 100644 arch/um/scripts/extra-libs.sh diff --git a/arch/um/scripts/extra-libs.sh b/arch/um/scripts/extra-libs.sh new file mode 100644 index ..0592485e0675 --- /dev/null +++ b/arch/um/scripts/extra-libs.sh @@ -0,0 +1,10 @@ +#!/bin/sh + +# This file should be included from link-vmlinux, not executed!!! + +if [ "${CONFIG_UML_NET_VDE}" = "y" ] ; then + UML_EXTRA_LIBS="$UML_EXTRA_LIBS -lvde -lvdeplug" +fi +if [ "${CONFIG_UML_NET_PCAP}" = "y" ] ; then + UML_EXTRA_LIBS="$UML_EXTRA_LIBS -lpcap" +fi diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 06495379fcd8..15f9e5096da0 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -90,11 +90,13 @@ vmlinux_link() -Wl,--end-group \ ${@}" + . arch/um/scripts/extra-libs.sh + ${CC} ${CFLAGS_vmlinux} \ -o ${output}\ -Wl,-T,${lds} \ ${objects} \ - -lutil -lrt -lpthread + -lutil -lrt -lpthread ${UML_EXTRA_LIBS} rm -f linux fi } -- 2.20.1
Bug#938962: [PATCH] um: Fix pcap and vde driver builds
On 16/10/2019 08:53, Anton Ivanov wrote: Signed-off-by: Anton Ivanov --- arch/um/drivers/Makefile | 8 scripts/link-vmlinux.sh | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/um/drivers/Makefile b/arch/um/drivers/Makefile index 693319839f69..34355057ec85 100644 --- a/arch/um/drivers/Makefile +++ b/arch/um/drivers/Makefile @@ -24,6 +24,14 @@ LDFLAGS_vde.o := -r $(shell $(CC) $(CFLAGS) -print-file-name=libvdeplug.a) targets := pcap_kern.o pcap_user.o vde_kern.o vde_user.o +ifeq ($(CONFIG_UML_NET_PCAP),y) + export UML_EXTRA_LIBS += -lpcap +endif +ifeq ($(CONFIG_UML_NET_VDE),y) + export UML_EXTRA_LIBS += -lvde -lvdeplug +endif + + $(obj)/pcap.o: $(obj)/pcap_kern.o $(obj)/pcap_user.o $(LD) -r -dp -o $@ $^ $(ld_flags) diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 915775eb2921..d3e6a6cdfc13 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -86,7 +86,7 @@ vmlinux_link() ${CC} ${CFLAGS_vmlinux} -o ${2} \ -Wl,-T,${lds} \ ${objects} \ - -lutil -lrt -lpthread + -lutil -lrt -lpthread ${UML_EXTRA_LIBS} rm -f linux fi } This will not work as advertised unfortunately - I have to write out the libs list somewhere and load it again in the link script instead of passing it as an environment variable. A fixed patch will follow shortly. -- Anton R. Ivanov Cambridgegreys Limited. Registered in England. Company Number 10273661 https://www.cambridgegreys.com/
Bug#938962: Build fix for VDE and PCAP drivers
Hi all, A patch to fix the build for these follows. I will stick to my original suggestion - pcap should be obsoleted in favour of vector raw + BPF firmware load. The latter will work on interfaces where gso/gro is enabled. The original pcap will fail on that due to the 1500 bytes size limit in the legacy net code. I had to dig the root cause here and figure out what is going on while working on an AF_XDP transport as that had the same problem - it needed to pass -lbpf -lelf -lz which could not be passed under the current build system. A.
Bug#938962: [PATCH] um: Fix pcap and vde driver builds
Signed-off-by: Anton Ivanov --- arch/um/drivers/Makefile | 8 scripts/link-vmlinux.sh | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/um/drivers/Makefile b/arch/um/drivers/Makefile index 693319839f69..34355057ec85 100644 --- a/arch/um/drivers/Makefile +++ b/arch/um/drivers/Makefile @@ -24,6 +24,14 @@ LDFLAGS_vde.o := -r $(shell $(CC) $(CFLAGS) -print-file-name=libvdeplug.a) targets := pcap_kern.o pcap_user.o vde_kern.o vde_user.o +ifeq ($(CONFIG_UML_NET_PCAP),y) + export UML_EXTRA_LIBS += -lpcap +endif +ifeq ($(CONFIG_UML_NET_VDE),y) + export UML_EXTRA_LIBS += -lvde -lvdeplug +endif + + $(obj)/pcap.o: $(obj)/pcap_kern.o $(obj)/pcap_user.o $(LD) -r -dp -o $@ $^ $(ld_flags) diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 915775eb2921..d3e6a6cdfc13 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -86,7 +86,7 @@ vmlinux_link() ${CC} ${CFLAGS_vmlinux} -o ${2} \ -Wl,-T,${lds} \ ${objects} \ - -lutil -lrt -lpthread + -lutil -lrt -lpthread ${UML_EXTRA_LIBS} rm -f linux fi } -- 2.20.1
Bug#940820: UML not loading on Debian buster with a 5.2 kernel from testing
This is a regression related to the va randomization setting.

UML will boot on a Debian 4.19 kernel host with kernel.randomize_va_space = 2.
UML will not boot on a Debian 5.2 kernel host with kernel.randomize_va_space = 2.
UML will boot on 5.2 once kernel.randomize_va_space is set to 0 on the host.

So something has changed in how the randomization is implemented between 4.19 and 5.2.

-- Anton R. Ivanov https://www.kot-begemot.co.uk/
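For readers reproducing this, the host setting can be inspected before starting UML. A minimal sketch, assuming the value meanings given in the kernel sysctl documentation (0, 1, 2); the function name `describe_aslr` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: translate a kernel.randomize_va_space value into text.
describe_aslr() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "stack, mmap base and VDSO randomized" ;;
        2) echo "as 1, plus heap (brk) randomized" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}
```

On the host, `describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"` prints the current mode, and `sysctl -w kernel.randomize_va_space=0` (as root) applies the workaround described above until the next reboot.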
Bug#941011: asterisk: Silently failing on weak certificates with no debug messages
On 04/10/2019 21:43, Bernhard Schmidt wrote:

On 23.09.19 at 14:19, Anton Ivanov wrote:

Dear Anton,

Package: asterisk
Version: 1:16.2.1~dfsg-1+deb10u1
Severity: minor

Dear Maintainer,

After an upgrade from stretch to buster, my asterisk installation lost tls support. Debug provided minimal information - it was failing to load the certificate in tcptls.c.

Root cause was openssl deciding that the old certificates were too weak.

There is no debug info. There is no easy fix because the openssl error api can print the error queue only to a file/bio. It is not possible to feed into another logging framework (f.e. asterisk) and dump it at that level. I was able to stick a couple of statements dumping openssl errors to stderr, but this approach is not fit for a proper fix.

IMHO the only thing that can be done here is to add a note to the changes file and relevant warnings apt-changes.

Are you using chan_sip or chan_pjsip?

chan_sip

Since these affect everything in Buster using SSL certificates (with both OpenSSL and GnuTLS) I don't think this is Asterisk specific and should not be handled as such. I had to replace quite a lot of internal/self signed certificates because they refused to load, including unbound's local control certificate.

However, I feel your pain. I had an issue with a remote certificate, and it drove me nuts to identify the failing peer, because it is not logged. That has been fixed fortunately.

https://issues.asterisk.org/jira/browse/ASTERISK-26006
https://issues.asterisk.org/jira/browse/ASTERISK-28444

I'd suggest filing an issue upstream.

Good idea. Though the way openssl handles this particular error reporting makes capturing it quite difficult. In any case, let's let upstream figure it out :)

Brgds, Bernhard

-- Anton R. Ivanov https://www.kot-begemot.co.uk/
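Since the failure is silent, one way to find affected certificates ahead of an upgrade is to inspect their key size. The thread does not say why the certificates were judged too weak, so the following is a sketch under the assumption that key length is the cause (a weak signature digest is another possibility). It parses the output of `openssl x509 -noout -text`; the function name `key_bits` and the file name `cert.pem` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "openssl x509 -noout -text" output on stdin and
# print the public key size in bits. Handles both "RSA Public-Key: (N bit)"
# and "Public-Key: (N bit)" spellings.
key_bits() {
    sed -n 's/.*Public-Key: (\([0-9][0-9]*\) bit).*/\1/p' | head -n 1
}

# Usage sketch (cert.pem is a placeholder file name):
#   bits=$(openssl x509 -in cert.pem -noout -text | key_bits)
#   [ "$bits" -lt 2048 ] && echo "cert.pem: $bits-bit key, likely rejected"
```

The 2048-bit threshold in the usage comment is an assumption based on OpenSSL security level 2 for RSA keys; the level actually in force depends on the local openssl.cnf.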
Bug#941637: linux-image-4.19.0-6-amd64: noht flag on command line has no effect for 6 core/12 Thread Ryzens
On 03/10/2019 16:06, Salvatore Bonaccorso wrote:

Control: tags -1 + moreinfo

Hi

On Thu, Oct 03, 2019 at 09:24:26AM +0100, Anton Ivanov wrote:

Package: src:linux
Version: 4.19.67-2+deb10u1
Severity: important

Dear Maintainer,

noht has no effect.

I have been trying to chase down a weird hang which occurs only on 6 core/12 thread Ryzens (I cannot reproduce it on 4/8 or older CPUs). As a part of that I tried to disable ht. Well, it cannot be disabled - the noht command line arg has no effect whatsoever. As ht can be a security hole this may have security implications as well.

Do you mean 'nosmt'? (See kernel-parameters.txt). You can find further information as well in Documentation/admin-guide/hw-vuln/l1tf.rst.

I picked up noht from an older document somewhere and I cannot remember the actual source. It was definitely in the older version of RHEL guides, etc.

I can see that the parameter is nosmt now. You can close the bug.

Regards, Salvatore

-- Anton R. Ivanov https://www.kot-begemot.co.uk/
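A quick way to confirm which parameter was actually passed, and whether SMT is really off, is to look at the boot command line and at the SMT control files in sysfs. A minimal sketch; the function name `has_boot_param` is hypothetical, and the sysfs path is assumed to be present on kernels that support SMT control.

```shell
#!/bin/sh
# Hypothetical helper: succeed if parameter $1 appears as a whole word
# (bare or as name=value) in the kernel command line string $2.
has_boot_param() {
    # $2 is deliberately unquoted so that it splits on whitespace
    for word in $2; do
        case "$word" in
            "$1"|"$1"=*) return 0 ;;
        esac
    done
    return 1
}

# Usage sketch on a live system:
#   has_boot_param nosmt "$(cat /proc/cmdline)" && echo "nosmt requested"
#   cat /sys/devices/system/cpu/smt/active   # 1 = SMT active, 0 = inactive
```

With the command line from the report (`... ro mitigations=off noht quiet`), `has_boot_param nosmt` fails while `has_boot_param noht` succeeds, which matches the observation that SMT stayed enabled.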
Bug#941637: linux-image-4.19.0-6-amd64: noht flag on command line has no effect for 6 core/12 Thread Ryzens
Package: src:linux
Version: 4.19.67-2+deb10u1
Severity: important

Dear Maintainer,

noht has no effect.

I have been trying to chase down a weird hang which occurs only on 6 core/12 thread Ryzens (I cannot reproduce it on 4/8 or older CPUs). As a part of that I tried to disable ht. Well, it cannot be disabled - the noht command line arg has no effect whatsoever. As ht can be a security hole this may have security implications as well.

-- Package-specific info:
** Version: Linux version 4.19.0-6-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.67-2+deb10u1 (2019-09-20)
** Command line: BOOT_IMAGE=/boot/vmlinuz-4.19.0-6-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet
** Not tainted
** Kernel log:
[4.833468] EDAC amd64: Node 0: DRAM ECC disabled.
[4.833470] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.)
[4.892875] EDAC amd64: Node 0: DRAM ECC disabled.
[4.892877] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.)
[4.932919] EDAC amd64: Node 0: DRAM ECC disabled.
[4.932920] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.)
[4.968846] audit: type=1400 audit(1570086470.642:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=638 comm="apparmor_parser" [4.969330] audit: type=1400 audit(1570086470.642:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=643 comm="apparmor_parser" [4.971460] audit: type=1400 audit(1570086470.642:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=636 comm="apparmor_parser" [4.972463] pktcdvd: pktcdvd0: writer mapped to sr0 [4.973798] audit: type=1400 audit(1570086470.646:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=639 comm="apparmor_parser" [4.973802] audit: type=1400 audit(1570086470.646:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=639 comm="apparmor_parser" [4.976702] EDAC amd64: Node 0: DRAM ECC disabled. [4.976704] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.977529] audit: type=1400 audit(1570086470.650:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=646 comm="apparmor_parser" [4.977534] audit: type=1400 audit(1570086470.650:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=646 comm="apparmor_parser" [4.977537] audit: type=1400 audit(1570086470.650:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=646 comm="apparmor_parser" [4.977935] audit: type=1400 audit(1570086470.650:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/tcpdump" pid=647 comm="apparmor_parser" [5.036714] EDAC amd64: Node 0: DRAM ECC disabled. 
[5.036716] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [5.057409] new mount options do not match the existing superblock, will be ignored [5.108619] EDAC amd64: Node 0: DRAM ECC disabled. [5.108621] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [5.130890] fuse init (API version 7.27) [5.164629] EDAC amd64: Node 0: DRAM ECC disabled. [5.164630] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [5.212714] EDAC amd64: Node 0: DRAM ECC disabled. [5.212716] EDAC amd64:
Bug#938962: [PATCH] um: Loadable BPF "Firmware" for vector drivers
On 01/10/2019 08:50, Johannes Berg wrote: On Mon, 2019-09-30 at 14:19 +0100, Anton Ivanov wrote: All vector drivers now allow a BPF program to be loaded and associated with the RX socket in the host kernel. 1. The program can be loaded as an extra kernel command line option to any of the drivers. 2. The program can also be loaded as "firmware", using the ethtool flash option. It is possible to turn this facility on or off using a command line option. A simplistic wrapper for generating the BPF firmware for the raw socket driver out of a tcpdump/libpcap filter expression can be found at: https://github.com/kot-begemot-uk/uml_vector_utilities/ That's kinda cool. Why just BPF though, not eBPF with all that brings? The filter language for the SOCKOPT is specified as BPF everywhere. I have not looked at what the sockopt does in the host kernel under the hood and if it takes eBPF. Also, the intention is to provide backward compatible wrappers for the existing pcap functionality as per the Debian bug which is cc-ed and that generates/uses basic BPF out of a pcap expression. We can add those to the "uml-utilities" package present in Debian and other distros. I will try to get around and write a wrapper which takes legacy UML network interface arguments and rewrites them as options for the new vector drivers. Or is that because the BPF filter is actually attached to the socket in the host, if I'm reading this correctly? Yes. The idea is to offload it from the guest to the host. I have had this idea as well as some PoC code to do that since like 2012. (e)BPF is an excellent way to represent "firmware" for vNICs, I am surprised it is not in active use :) It should be possible to expand the concept for other stuff like AF_XDP, etc but I need to get around to implement that in the first place. Couple of style nits below: +static bool get_bpf_flash(struct arglist *def) +{ + return uml_vector_fetch_arg(def, "bpfflash") != NULL; +} + + Needs just one blank line? 
@@ -1125,6 +1142,7 @@ static int vector_net_close(struct net_device *dev) netif_stop_queue(dev); del_timer(&vp->tl); + if (vp->fds == NULL) return 0; not needed @@ -1139,6 +1157,8 @@ static int vector_net_close(struct net_device *dev) } tasklet_kill(&vp->tx_poll); if (vp->fds->rx_fd > 0) { + if (vp->bpf) + uml_vector_detach_bpf(vp->fds->rx_fd, vp->bpf); os_close_file(vp->fds->rx_fd); vp->fds->rx_fd = -1; } I guess you moved some code here or something and the blank line was left? +/* + * We cannot use the firmware.c loader API here because this is not a module + * and we do not have a proper device structure to pass to it as required + * by the firmware API + */ You just have to make up a platform device, see e.g. net/wireless/reg.c. IMHO better than open-coding all this. Good idea. @@ -1528,8 +1618,9 @@ static void vector_eth_configure( .in_write_poll = false, .coalesce = 2, .req_size = get_req_size(def), - .in_error = false - }); + .in_error = false, + .bpf= NULL + }); That's not really needed, but I guess you have everything here anyway. +int uml_vector_detach_bpf(int fd, void *bpf) +{ + struct sock_fprog *prog = bpf; + + int err = setsockopt(fd, SOL_SOCKET, SO_DETACH_FILTER, bpf, sizeof(struct sock_fprog)); Spurious blank line, line too long. 
-void *uml_vector_default_bpf(int fd, void *mac) + if (err < 0) + printk(KERN_ERR BPF_DETACH_FAIL, prog->len, prog->filter, fd, -errno); also looks pretty long + return err; +} +void *uml_vector_default_bpf(void *mac) { struct sock_filter *bpf; uint32_t *mac1 = (uint32_t *)(mac + 2); uint16_t *mac2 = (uint16_t *) mac; - struct sock_fprog bpf_prog = { - .len = 6, - .filter = NULL, - }; + struct sock_fprog *bpf_prog; + bpf_prog = uml_kmalloc(sizeof(struct sock_fprog), UM_GFP_KERNEL); + if (bpf_prog != NULL) { generally, kernel coding style prefers to remove " != NULL" (per checkpatch, anyway) + bpf_prog->len = DEFAULT_BPF_LEN; + bpf_prog->filter = NULL; + } else + return NULL; and braces on all branches of if statements johannes Ack - I will look at the other bits, thanks for reviewing it. ___ linux-um mailing list linux...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-um -- Anton R. Ivanov Cambridgegreys Limited. Registered in England. Company Number 10273661 https://www.cambridgegreys.com/
Bug#938962: [PATCH] um: Loadable BPF "Firmware" for vector drivers
All vector drivers now allow a BPF program to be loaded and associated with the RX socket in the host kernel. 1. The program can be loaded as an extra kernel command line option to any of the drivers. 2. The program can also be loaded as "firmware", using the ethtool flash option. It is possible to turn this facility on or off using a command line option. A simplistic wrapper for generating the BPF firmware for the raw socket driver out of a tcpdump/libpcap filter expression can be found at: https://github.com/kot-begemot-uk/uml_vector_utilities/ Signed-off-by: Anton Ivanov --- arch/um/drivers/vector_kern.c | 109 +++--- arch/um/drivers/vector_kern.h | 8 ++- arch/um/drivers/vector_user.c | 94 +++-- arch/um/drivers/vector_user.h | 8 ++- 4 files changed, 190 insertions(+), 29 deletions(-) diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c index af27d5c41776..7453b99ac1d2 100644 --- a/arch/um/drivers/vector_kern.c +++ b/arch/um/drivers/vector_kern.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2017 - Cambridge Greys Limited + * Copyright (C) 2017 - 2019 Cambridge Greys Limited * Copyright (C) 2011 - 2014 Cisco Systems Inc * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) * Copyright (C) 2001 Lennert Buytenhek (buyt...@gnu.org) and @@ -21,6 +21,9 @@ #include #include #include +#include +#include +#include #include #include #include @@ -128,6 +131,17 @@ static int get_mtu(struct arglist *def) return ETH_MAX_PACKET; } +static char *get_bpf_file(struct arglist *def) +{ + return uml_vector_fetch_arg(def, "bpffile"); +} + +static bool get_bpf_flash(struct arglist *def) +{ + return uml_vector_fetch_arg(def, "bpfflash") != NULL; +} + + static int get_depth(struct arglist *def) { char *mtu = uml_vector_fetch_arg(def, "depth"); @@ -176,6 +190,7 @@ static int get_transport_options(struct arglist *def) int vec_rx = VECTOR_RX; int vec_tx = VECTOR_TX; long parsed; + int result = 0; if (vector != NULL) { if (kstrtoul(vector, 10, &parsed) == 0) { 
@@ -186,14 +201,16 @@ static int get_transport_options(struct arglist *def) } } + if (get_bpf_flash(def)) + result = VECTOR_BPF_FLASH; if (strncmp(transport, TRANS_TAP, TRANS_TAP_LEN) == 0) - return 0; + return result; if (strncmp(transport, TRANS_HYBRID, TRANS_HYBRID_LEN) == 0) - return (vec_rx | VECTOR_BPF); + return (result | vec_rx | VECTOR_BPF); if (strncmp(transport, TRANS_RAW, TRANS_RAW_LEN) == 0) - return (vec_rx | vec_tx | VECTOR_QDISC_BYPASS); - return (vec_rx | vec_tx); + return (result | vec_rx | vec_tx | VECTOR_QDISC_BYPASS); + return (result | vec_rx | vec_tx); } @@ -1125,6 +1142,7 @@ static int vector_net_close(struct net_device *dev) netif_stop_queue(dev); del_timer(&vp->tl); + if (vp->fds == NULL) return 0; @@ -1139,6 +1157,8 @@ static int vector_net_close(struct net_device *dev) } tasklet_kill(&vp->tx_poll); if (vp->fds->rx_fd > 0) { + if (vp->bpf) + uml_vector_detach_bpf(vp->fds->rx_fd, vp->bpf); os_close_file(vp->fds->rx_fd); vp->fds->rx_fd = -1; } @@ -1146,7 +1166,10 @@ static int vector_net_close(struct net_device *dev) os_close_file(vp->fds->tx_fd); vp->fds->tx_fd = -1; } + if (vp->bpf != NULL) + kfree(vp->bpf->filter); kfree(vp->bpf); + vp->bpf = NULL; kfree(vp->fds->remote_addr); kfree(vp->transport_data); kfree(vp->header_rxbuffer); @@ -1196,6 +1219,8 @@ static int vector_net_open(struct net_device *dev) vp->opened = true; spin_unlock_irqrestore(&vp->lock, flags); + vp->bpf = uml_vector_user_bpf(get_bpf_file(vp->parsed)); + vp->fds = uml_vector_user_open(vp->unit, vp->parsed); if (vp->fds == NULL) @@ -1267,8 +1292,11 @@ static int vector_net_open(struct net_device *dev) if (!uml_raw_enable_qdisc_bypass(vp->fds->rx_fd)) vp->options |= VECTOR_BPF; } - if ((vp->options & VECTOR_BPF) != 0) - vp->bpf = uml_vector_default_bpf(vp->fds->rx_fd, dev->dev_addr); + if (((vp->options & VECTOR_BPF) != 0) && (vp->bpf == NULL)) + vp->bpf = uml_vector_default_bpf(dev->dev_addr); + + if (vp->bpf != NULL) + uml_vector_attach_bpf(vp->fds->rx_fd, vp->bpf); 
netif_start_queue(de
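The commit message above mentions a wrapper that turns a tcpdump/libpcap filter expression into BPF "firmware". As a rough sketch of what such a conversion could look like, the shell functions below pack the decimal output of tcpdump -ddd into binary records. This assumes the driver expects a flat array of struct sock_filter entries (16-bit code, 8-bit jt, 8-bit jf, 32-bit k) in little-endian byte order, which the patch text above does not confirm. The names byte and ddd_to_bin are hypothetical, and the actual wrapper in uml_vector_utilities may work differently.

```shell
#!/bin/sh
# Emit a single byte given its decimal value (0-255).
byte() {
    printf "\\$(printf '%03o' "$1")"
}

# Read `tcpdump -ddd` style text on stdin and write packed records on stdout.
# Assumed record layout (little-endian): u16 code, u8 jt, u8 jf, u32 k.
ddd_to_bin() {
    read -r _count    # the first line is the instruction count; it is not needed here
    while read -r code jt jf k; do
        byte $(( code & 255 ))
        byte $(( (code >> 8) & 255 ))
        byte "$jt"
        byte "$jf"
        byte $(( k & 255 ))
        byte $(( (k >> 8) & 255 ))
        byte $(( (k >> 16) & 255 ))
        byte $(( (k >> 24) & 255 ))
    done
}

# Usage (not run here):
#   tcpdump -ddd 'udp port 53' | ddd_to_bin > filter.bpf
```

Reading from stdin keeps the converter independent of how the filter was compiled. The resulting file would presumably be the one named by the bpffile option that the patch parses.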
Bug#941011: asterisk: Silently failing on weak certificates with no debug messages
Package: asterisk Version: 1:16.2.1~dfsg-1+deb10u1 Severity: minor Dear Maintainer, After an upgrade from stretch to buster, my asterisk installation lost TLS support. Debug output provided minimal information: it only showed a failure to load the certificate in tcptls.c. The root cause was openssl deciding that the old certificates were too weak. There is no debug info for this. There is no easy fix because the openssl error API can print the error queue only to a file/BIO. It is not possible to feed it into another logging framework (e.g. asterisk's) and dump it at that level. I was able to stick in a couple of statements dumping openssl errors to stderr, but this approach is not fit for a proper fix. IMHO the only thing that can be done here is to add a note to the changes file and relevant warnings to apt-changes. -- System Information: Debian Release: 10.1 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-6-amd64 (SMP w/8 CPU cores) Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled Versions of packages asterisk depends on: ii adduser 3.118 ii asterisk-config 1:16.2.1~dfsg-1+deb10u1 ii asterisk-core-sounds-en 1.6.1-1 ii asterisk-modules 1:16.2.1~dfsg-1+deb10u1 ii libc6 2.28-10 ii libcap2 1:2.25-2 ii libedit2 3.1-20181209-1 ii libjansson4 2.12-1 ii libpopt0 1.16-12 ii libsqlite3-0 3.27.2-3 ii libssl1.1 1.1.1c-1 ii libsystemd0 241-7~deb10u1 ii liburiparser1 0.9.1-1 ii libuuid1 2.33.1-0.1 ii libxml2 2.9.4+dfsg1-7+b3 ii libxslt1.1 1.1.32-2.1~deb10u1 ii lsb-base 10.2019051400 Versions of packages asterisk recommends: ii asterisk-moh-opsound-gsm 2.03-1 ii asterisk-voicemail [asterisk-voicemail-storage] 1:16.2.1~dfsg-1+deb10u1 ii sox 14.4.2+git20190427-1 Versions of packages asterisk suggests: pn asterisk-dahdi pn asterisk-dev pn
asterisk-doc pn asterisk-ooh323 pn asterisk-opus pn asterisk-vpb -- no debconf information
Bug#940820: linux-image-5.2.0-2-amd64: breaks UML all versions, both debian stock and compiled from source.
Looks like the culprit is a different default ELF start address on 5.x. What changes is not sbrk(0) or _end - these are pretty much the same as on 4.x. It is START, which after some "fixups" in arch/um/kernel/uml.lds.S becomes __binary_start. I do not see an easy way to fix it :( A. On 20/09/2019 15:48, Anton Ivanov wrote: These are the Start (that is what sbrk(0) returns) and &_end values I get for the two kernels: Linux 4.19 on host - Start 1645867008 end 1631412224 diff 14454784 Linux 5.2 on host - Start 93825006145536 end 1631412224 diff 93823374733312 I think the whole logic in UML here is broken, because with memory model = large &_end is less than Start to begin with, so reserving the XM gap does not quite make sense. I am going to see if I can sort out the UML side, but I think we still need to check the host kernel side and what the reason is for the sudden change in behavior. A. On 20/09/2019 11:12, Anton Ivanov wrote: Package: src:linux Version: 5.2.9-2 Severity: important Dear Maintainer, Any attempt to run UML on a machine running 5.2.9-2 results in: Adding 9382334992 bytes to physical memory to account for exec-shield gap Too few physical memory! Needed=93823417974784, given=547037904896 Running the same UML images on 4.19 debian stock has no issues. A. 
-- Package-specific info: ** Version: Linux version 5.2.0-2-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-21)) #1 SMP Debian 5.2.9-2 (2019-08-21) ** Command line: BOOT_IMAGE=/boot/vmlinuz-5.2.0-2-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet ** Not tainted ** Kernel log: [ 3.684402] input: HD-Audio Generic Front Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input8 [ 3.684490] input: HD-Audio Generic Rear Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input9 [ 3.684555] input: HD-Audio Generic Line as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input10 [ 3.685553] input: HD-Audio Generic Line Out as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input11 [ 3.685627] input: HD-Audio Generic Front Headphone as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input12 [ 3.806626] kvm: Nested Virtualization enabled [ 3.806636] kvm: Nested Paging enabled [ 3.806637] SVM: Virtual VMLOAD VMSAVE supported [ 3.806637] SVM: Virtual GIF supported [ 3.820371] MCE: In-kernel MCE decoding enabled. [ 3.824533] EDAC amd64: Node 0: DRAM ECC disabled. [ 3.824536] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [ 3.872569] pktcdvd: pktcdvd0: writer mapped to sr0 [ 3.900858] EDAC amd64: Node 0: DRAM ECC disabled. [ 3.900860] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [ 3.948661] EDAC amd64: Node 0: DRAM ECC disabled. [ 3.948662] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. 
(Note that use of the override may cause unknown side effects.) [ 3.996651] EDAC amd64: Node 0: DRAM ECC disabled. [ 3.996652] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [ 4.002382] audit: type=1400 audit(1568973482.655:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=706 comm="apparmor_parser" [ 4.002712] audit: type=1400 audit(1568973482.655:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=701 comm="apparmor_parser" [ 4.005254] audit: type=1400 audit(1568973482.659:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=699 comm="apparmor_parser" [ 4.007555] audit: type=1400 audit(1568973482.659:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=702 comm="apparmor_parser" [ 4.007558] audit: type=1400 audit(1568973482.659:6): apparmor="STATUS" operation="profile_lo
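The two "diff" figures quoted in the message above are simply Start minus &_end. A small sketch that reproduces the arithmetic follows; gap is a hypothetical helper name, and the script assumes a shell with 64-bit integer arithmetic, which is the norm on amd64.

```shell
#!/bin/sh
# Distance between the program break ("Start", as returned by sbrk(0))
# and &_end, using the values quoted in the report.
# $1 = Start, $2 = &_end
gap() {
    echo $(( $1 - $2 ))
}

echo "4.19 host: $(gap 1645867008 1631412224) bytes"
echo "5.2 host:  $(gap 93825006145536 1631412224) bytes"
```

The first gap is about 13.8 MiB and the second about 85 TiB, which is the same order of magnitude as the Needed= figure in the error message.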
Bug#940821: linux-image-5.2.0-2-amd64: file cache corruption with nfs4
Package: src:linux Version: 5.2.9-2 Severity: critical Justification: breaks unrelated software Dear Maintainer, NFSv4 caching is completely broken on SMP. How to reproduce: Option 1. Clone openwrt, then run while make clean && make -j `nproc` ; do true ; done Depending on the number of CPUs, it breaks within several runs. Symptoms of breakage: a directory on the client looks empty. Example (mnt is an NFSv4 mount): ls -laF /mnt/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 8 drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./ drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../ While it actually has a file in it (same directory on the server): ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./ drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../ -rw-r--r-- 1 anivanov anivanov 32 Sep 20 10:51 ipcbuf.h This cache entry on the client does not expire as it should per the NFSv4 caching documentation - the only ways of dealing with it are a reboot, an unmount or dropping caches. Option 2. Have your $HOME on NFSv4 and use thunderbird. Move mails between folders. Sooner or later (usually sooner) you will lose an email. So this is both "breaks unrelated software" and "data loss", depending on what you are doing. Tested on: AMD Ryzen 5 2400G, AMD Ryzen 5 1600X, AMD Ryzen 5 1600, AMD A8-6500. It shows up on all of them: fastest on the 6 core 12 thread Ryzens, slowest on the AMD A8 (takes up to 3 iterations of make there). Brgds, A. 
-- Package-specific info: ** Version: Linux version 5.2.0-2-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-21)) #1 SMP Debian 5.2.9-2 (2019-08-21) ** Command line: BOOT_IMAGE=/boot/vmlinuz-5.2.0-2-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet ** Not tainted ** Kernel log: [3.684402] input: HD-Audio Generic Front Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input8 [3.684490] input: HD-Audio Generic Rear Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input9 [3.684555] input: HD-Audio Generic Line as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input10 [3.685553] input: HD-Audio Generic Line Out as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input11 [3.685627] input: HD-Audio Generic Front Headphone as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input12 [3.806626] kvm: Nested Virtualization enabled [3.806636] kvm: Nested Paging enabled [3.806637] SVM: Virtual VMLOAD VMSAVE supported [3.806637] SVM: Virtual GIF supported [3.820371] MCE: In-kernel MCE decoding enabled. [3.824533] EDAC amd64: Node 0: DRAM ECC disabled. [3.824536] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.872569] pktcdvd: pktcdvd0: writer mapped to sr0 [3.900858] EDAC amd64: Node 0: DRAM ECC disabled. [3.900860] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.948661] EDAC amd64: Node 0: DRAM ECC disabled. [3.948662] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) 
[3.996651] EDAC amd64: Node 0: DRAM ECC disabled. [3.996652] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.002382] audit: type=1400 audit(1568973482.655:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=706 comm="apparmor_parser" [4.002712] audit: type=1400 audit(1568973482.655:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=701 comm="apparmor_parser" [4.005254] audit: type=1400 audit(1568973482.659:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=699 comm="apparmor_parser" [4.007555] audit: type=1400 audit(1568973482.659:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=702 comm="apparmor_parser" [4.007558] audit: type=1400 audit(1568973482.659:6): apparmor="
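The symptom described in the report above (a directory that is empty through the NFS mount but populated on the server) can be checked mechanically. The following is a minimal sketch, assuming both views of the directory are reachable as local paths from the machine running it, for instance on a host that both exports the tree and mounts it back. listing_diff is a hypothetical helper name.

```shell
#!/bin/sh
# Compare the entries of one directory as seen through two paths.
# $1 = path through the NFSv4 mount, $2 = path on the exported filesystem
listing_diff() {
    client=$(ls -A "$1" | sort)
    server=$(ls -A "$2" | sort)
    if [ "$client" = "$server" ]; then
        echo "consistent"
    else
        echo "mismatch"
        return 1
    fi
}

# Usage (not run here):
#   listing_diff /mnt/src/openwrt/build_dir /exports/work/src/openwrt/build_dir
```

If a mismatch persists, the report lists dropping caches among the ways to clear it; on Linux that is done as root by writing to /proc/sys/vm/drop_caches.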
Bug#940820: linux-image-5.2.0-2-amd64: breaks UML all versions, both debian stock and compiled from source.
Package: src:linux Version: 5.2.9-2 Severity: important Dear Maintainer, Any attempt to run UML on a machine running 5.2.9-2 results in: Adding 9382334992 bytes to physical memory to account for exec-shield gap Too few physical memory! Needed=93823417974784, given=547037904896 Running the same UML images on 4.19 debian stock has no issues. A. -- Package-specific info: ** Version: Linux version 5.2.0-2-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-21)) #1 SMP Debian 5.2.9-2 (2019-08-21) ** Command line: BOOT_IMAGE=/boot/vmlinuz-5.2.0-2-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet ** Not tainted ** Kernel log: [3.684402] input: HD-Audio Generic Front Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input8 [3.684490] input: HD-Audio Generic Rear Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input9 [3.684555] input: HD-Audio Generic Line as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input10 [3.685553] input: HD-Audio Generic Line Out as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input11 [3.685627] input: HD-Audio Generic Front Headphone as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input12 [3.806626] kvm: Nested Virtualization enabled [3.806636] kvm: Nested Paging enabled [3.806637] SVM: Virtual VMLOAD VMSAVE supported [3.806637] SVM: Virtual GIF supported [3.820371] MCE: In-kernel MCE decoding enabled. [3.824533] EDAC amd64: Node 0: DRAM ECC disabled. [3.824536] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.872569] pktcdvd: pktcdvd0: writer mapped to sr0 [3.900858] EDAC amd64: Node 0: DRAM ECC disabled. [3.900860] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. 
(Note that use of the override may cause unknown side effects.) [3.948661] EDAC amd64: Node 0: DRAM ECC disabled. [3.948662] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.996651] EDAC amd64: Node 0: DRAM ECC disabled. [3.996652] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.002382] audit: type=1400 audit(1568973482.655:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=706 comm="apparmor_parser" [4.002712] audit: type=1400 audit(1568973482.655:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=701 comm="apparmor_parser" [4.005254] audit: type=1400 audit(1568973482.659:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=699 comm="apparmor_parser" [4.007555] audit: type=1400 audit(1568973482.659:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=702 comm="apparmor_parser" [4.007558] audit: type=1400 audit(1568973482.659:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=702 comm="apparmor_parser" [4.011004] audit: type=1400 audit(1568973482.663:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=709 comm="apparmor_parser" [4.011007] audit: type=1400 audit(1568973482.663:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=709 comm="apparmor_parser" [4.011009] audit: type=1400 audit(1568973482.663:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=709 
comm="apparmor_parser" [4.012542] audit: type=1400 audit(1568973482.667:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/ntpd" pid=705 comm="apparmor_parser" [4.052465] EDAC amd64: Node 0: DRAM ECC disabled. [4.052466] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.132680] EDAC amd64: Node 0: DRAM ECC disabled. [4.132682] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
Bug#938962: user-mode-linux needs update for new linux
On 12/09/2019 15:42, Anton Ivanov wrote: On 12/09/2019 13:14, Ritesh Raj Sarraf wrote: Hi, I am not sure if this has been reported upstream but with libpcap 1.9, user mode linux fails to build. The build failure happens with both 5.2 and 4.19 LTS kernels. A much more detailed report is available at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=938962 libpcap 1.9 introduces `pcap_open` which is also declared in the linux tree, in arch/um/drivers/pcap_user.c I think the best way forward here is to kill the old libpcap driver altogether. You get the same functionality from vector raw, including the ability to load a bpf filter. The only thing it needs is a wrapper to compile the filter before handing it to UML. A side effect is that it is roughly 10+ times faster - in the multigigabit range. Alternatively, I can wrap it so it looks like pcap to any existing scripts and is actually vector underneath, but that will lose some of the tunables, like offloads, vector depth, etc. Thanks, Ritesh On Sat, 2019-09-07 at 17:18 +0200, Romain Francoise wrote: Hi, On Tue, Sep 3, 2019 at 3:21 PM Ritesh Raj Sarraf wrote: [...] In file included from /usr/include/pcap.h:43, from arch/um/drivers/pcap_user.c:7: /usr/include/pcap/pcap.h:835:18: note: previous declaration of ‘pcap_open’ was here PCAP_API pcap_t *pcap_open(const char *source, int snaplen, int flags, ^ make[2]: *** [scripts/Makefile.build:309: arch/um/drivers/pcap_user.o] Error 1 libpcap 1.9 includes support for remote capture, which was originally a part of WinPcap extensions. The `pcap_open()' symbol is part of that API and that's why it's defined in the header file even though remote support is not enabled in Debian. I suggest you rename the function defined in your program so that it doesn't conflict with libpcap. 
___ linux-um mailing list linux...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-um I am going to try to write a wrapper to form the arguments for the current vector raw driver and see if there is anything that needs to be fixed in it. I will post it as a proposed patch against the Debian package once it is ready. Brgds, A -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#938962: user-mode-linux needs update for new linux
On 12/09/2019 13:14, Ritesh Raj Sarraf wrote: Hi, I am not sure if this has been reported upstream but with libpcap 1.9, user mode linux fails to build. The build failure happens with both 5.2 and 4.19 LTS kernels. A much more detailed report is available at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=938962 libpcap 1.9 introduces `pcap_open` which is also declared in the linux tree, in arch/um/drivers/pcap_user.c I think the best way forward here is to kill the old libpcap driver altogether. You get the same functionality from vector raw, including the ability to load a bpf filter. The only thing it needs is a wrapper to compile the filter before handing it to UML. A side effect is that it is roughly 10+ times faster - in the multigigabit range. Alternatively, I can wrap it so it looks like pcap to any existing scripts and is actually vector underneath, but that will lose some of the tunables, like offloads, vector depth, etc. Thanks, Ritesh On Sat, 2019-09-07 at 17:18 +0200, Romain Francoise wrote: Hi, On Tue, Sep 3, 2019 at 3:21 PM Ritesh Raj Sarraf wrote: [...] In file included from /usr/include/pcap.h:43, from arch/um/drivers/pcap_user.c:7: /usr/include/pcap/pcap.h:835:18: note: previous declaration of ‘pcap_open’ was here PCAP_API pcap_t *pcap_open(const char *source, int snaplen, int flags, ^ make[2]: *** [scripts/Makefile.build:309: arch/um/drivers/pcap_user.o] Error 1 libpcap 1.9 includes support for remote capture, which was originally a part of WinPcap extensions. The `pcap_open()' symbol is part of that API and that's why it's defined in the header file even though remote support is not enabled in Debian. I suggest you rename the function defined in your program so that it doesn't conflict with libpcap. ___ linux-um mailing list linux...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-um -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#939877: closed by Josue Ortega (Bug#939877: fixed in rpcbind 1.2.5-7)
Apologies, I missed the actual receive line in 1.2.5-7. It alone does NOT fix the problem though. There is breakage in libwrap to accompany it. Once the fix in 1.2.5-7 is in, rpcbind starts receiving messages (according to strace), and each receive is followed by interrogating addresses and interfaces via netlink. As I do not see any netlink references anywhere in rpcbind or libtirpc-dev, I believe this is libwrap, which now has a broken broadcast check. So anything compiled with libwrap which needs to receive broadcasts needs to be set as ALL:ALL in hosts.allow - otherwise the traffic is dropped. Upgrading to 1.2.5-7 _AND_ setting hosts.allow to ALL:ALL together provide a viable workaround. The remaining part of this bug is in libwrap; you can refile it against that. Best Regards, -- Anton R. Ivanov https://www.kot-begemot.co.uk/
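A minimal sketch of the hosts.allow half of that workaround, written so that it is safe to run more than once. The helper name allow_all is hypothetical. The rule text follows the "daemon_list : client_list" format of hosts_access(5), and the file path is a parameter so the function can be tried on a scratch file first.

```shell
#!/bin/sh
# Append an "ALL: ALL" rule to a hosts.allow-style file unless one is already there.
# $1 = path to the file (defaults to /etc/hosts.allow)
allow_all() {
    file=${1:-/etc/hosts.allow}
    # Look for an existing rule, tolerating whitespace around the colon
    if grep -Eq '^[[:space:]]*ALL[[:space:]]*:[[:space:]]*ALL[[:space:]]*$' "$file" 2>/dev/null; then
        echo "rule already present in $file"
    else
        printf 'ALL: ALL\n' >> "$file"
        echo "rule appended to $file"
    fi
}

# Usage (as root, not run here):
#   allow_all /etc/hosts.allow
```

Because ALL: ALL admits every client to every wrapped service, it is best read as the stopgap the message describes rather than as a permanent setting.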
Bug#939877: closed by Josue Ortega (Bug#939877: fixed in rpcbind 1.2.5-7)
That's not it. Same story with 1.2.5-7 from unstable. This is the capture on the NIS server after a NIS restart on the client: root@jain:# tcpdump -nvvv -i enp7s0f1.502 udp and port 111 tcpdump: listening on enp7s0f1.502, link-type EN10MB (Ethernet), capture size 262144 bytes 192.168.20.41.36268 > 192.168.20.63.111: [udp sum ok] UDP, length 92 09:02:57.820457 IP (tos 0x0, ttl 64, id 55627, offset 0, flags [DF], proto UDP (17), length 120) 192.168.20.41.36268 > 192.168.20.63.111: [udp sum ok] UDP, length 92 09:03:03.826888 IP (tos 0x0, ttl 64, id 55969, offset 0, flags [DF], proto UDP (17), length 120) And so on - the RPC client keeps retransmitting to the broadcast address (.63 on this subnet, which is a /26). Traffic goes only one way; strace on rpcbind shows only netlink messages and no UDP receive. The same test after setting an explicit NIS server address on the client and restarting NIS gets an immediate response: tcpdump -nvvv -i enp7s0f1.502 udp and port 111 192.168.20.41.800 > 192.168.3.3.111: [udp sum ok] UDP, length 56 09:05:00.429940 IP (tos 0x0, ttl 64, id 22755, offset 0, flags [DF], proto UDP (17), length 56) 192.168.3.3.111 > 192.168.20.41.800: [bad udp cksum 0x98b2 -> 0x1245!] UDP, strace of the rpcbind process: sendmsg(6, {msg_name={sa_family=AF_INET, sin_port=htons(800), sin_addr=inet_addr("192.168.20.41")}, msg_namelen=16, msg_iov=[{iov_base=".{\272q\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\2\265", iov_len=28}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO, cmsg_data={ipi_ifindex=0, ipi_spec_dst=inet_addr("192.168.3.3"), ipi_addr=inet_addr("192.168.3.3")}}], msg_controllen=32, msg_flags=0}, 0) = 28 That strace line never occurs in the broadcast case. It simply is not listening to broadcast queries. I will try to wade through the source to see exactly how it manages that, because listening on INADDR_ANY should in theory get you broadcasts. 
On 09/09/2019 22:00, Debian Bug Tracking System wrote: This is an automatic notification regarding your Bug report which was filed against the rpcbind package: #939877: rpcbind: Does not receive any broadcast queries resulting in complete breakage of NIS It has been closed by Josue Ortega . Their explanation is attached below along with your original report. If this explanation is unsatisfactory and you have not received a better one in a separate message then please contact Josue Ortega by replying to this email. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#939877: rpcbind: Does not receive any broadcast queries resulting in complete breakage of NIS
Package: rpcbind Version: 1.2.5-0.3 Severity: grave Justification: renders package unusable Dear Maintainer, After an upgrade to buster, rpcbind no longer receives any broadcast queries. Unicast works. This is verified via strace: it shows occasional netlink messages, but none of the broadcast traffic to port 111 ever reaches rpcbind. As a result, clients can no longer find a NIS server which has been upgraded to buster. -- System Information: Debian Release: 10.1 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-6-amd64 (SMP w/8 CPU cores) Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled Versions of packages rpcbind depends on: ii adduser 3.118 ii libc6 2.28-10 ii libsystemd0 241-7~deb10u1 ii libtirpc3 1.1.4-0.4 ii libwrap0 7.6.q-28 ii lsb-base 10.2019051400 rpcbind recommends no packages. rpcbind suggests no packages. -- no debconf information
Bug#939389: Info received (Bug#939389: light-locker: Does not work in buster with xfce4)
Root cause: the systemd-logind service does not work in a NIS environment without nscd installed. It ends up doing a yp call, which goes over TCP, and that is prohibited by the service configuration. The call fails, a rather meaningless error is returned and pam_systemd fails to set up the session. This does not prevent a login because session setup is marked as optional in the corresponding pam files. Installing nscd bypasses the problem because the lookup then goes over the nscd unix domain socket, which is allowed. IMHO this workaround is NOT stable and not guaranteed to work. The config for logind should be relaxed to allow it to make TCP calls so that it functions correctly in PAM environments which use network authentication. Brgds, A. On 05/09/2019 07:33, Debian Bug Tracking System wrote: Thank you for the additional information you have supplied regarding this Bug report. This is an automatically generated reply to let you know your message has been received. Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course. Your message has been sent to the package maintainer(s): Debian Xfce Maintainers If you wish to submit further information on this problem, please send it to 939...@bugs.debian.org. Please do not send mail to ow...@bugs.debian.org unless you wish to report a problem with the Bug-tracking system. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
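A minimal sketch of what relaxing the logind configuration might look like. IPAddressDeny= and RestrictAddressFamilies= are real systemd directives, but that they are the ones responsible for the failure described above is an assumption that should be checked against the shipped systemd-logind.service unit. The helper name write_logind_dropin and the file name nis.conf are hypothetical.

```shell
#!/bin/sh
# Write a drop-in that loosens the network sandboxing of systemd-logind.
# $1 = drop-in directory (defaults to the usual override location)
write_logind_dropin() {
    dir=${1:-/etc/systemd/system/systemd-logind.service.d}
    mkdir -p "$dir"
    cat > "$dir/nis.conf" <<'EOF'
[Service]
# An empty assignment resets the deny list inherited from the shipped unit
IPAddressDeny=
# Address families logind may use once NIS lookups are allowed (assumption)
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
EOF
    echo "wrote $dir/nis.conf"
}

# Usage (as root, not run here):
#   write_logind_dropin && systemctl daemon-reload
```

The drop-in only takes effect after a daemon-reload and a restart of logind (or a reboot). Narrowing the allowance to the NIS server's address range would be a tighter variant of the same idea.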
Bug#939389: light-locker: Does not work in buster with xfce4
On 05/09/2019 08:43, Yves-Alexis Perez wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On Thu, 2019-09-05 at 07:28 +0100, Anton Ivanov wrote: It is not set to start off with and even if it was 95dbus_update-activation-env will unset it. Lines 8-10 in that are: unset XDG_SEAT unset XDG_SESSION_ID unset XDG_VTNR This is done in a subshell and shouldn't touch the parent environment. XDG_SESSION_ID should be set to the logind session, so it's likely *something* in your environment is wrong, but I have no idea which and you'll have to investigate yourself (it works just fine on other Buster boxes I have). Note that if you don't have a logind session (check with loginctl) it's likely other stuff won't work correctly. Just in case, make sure libpam-systemd is installed. That looks to be the case and it is something specific to upgrade. I do not see the user session using loginctl, only the root console logins. I do not recall having any of these issues on a clean buster install. I will continue digging, thanks for your help. Regards, - -- Yves-Alexis -BEGIN PGP SIGNATURE- iQEzBAEBCAAdFiEE8vi34Qgfo83x35gF3rYcyPpXRFsFAl1wvLYACgkQ3rYcyPpX RFsM/gf+JGQBNZgFpy5N1cHYOw7vX2QYl5TRPuq6G3LupHzQzF5T05j9cURj6ypK qUIiOXnlH5+Y9fHNb5WHsQimEJj5ldj6CUQXwo3nb08ertH3+PWVBpLsNxf8MR7u TbaycI95cK6z0NLOi3Ux1pO7aOvQ2th5YbJLCCuOgFfaAjd6Z2izf2XVSJ/72XAw mC3sOPojoUEEDtN7dn+OFb6dau6BfLERsQ3BMBVyxhWuD7vBZ5EO8n1o+2H8f2Go W3WSd4sbhRKk+soPvOTJ7TDTe617SwLE7D35m+cKlldBEDlO3rGp8XcB4sXhCzDY +giqZlt+La61XPDyVijKPWDRmFAyoA== =4KrQ -END PGP SIGNATURE- -- Anton R. Ivanov https://www.kot-begemot.co.uk/
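The checks discussed in this exchange (is XDG_SESSION_ID set, does logind know about the session, is libpam-systemd installed) can be gathered into a short script. This is only a sketch: check_session is a hypothetical helper name, and the two commands listed in comments are left out of the function so that it also runs on machines without systemd.

```shell
#!/bin/sh
# Report whether the current environment carries a logind session id.
check_session() {
    if [ -n "${XDG_SESSION_ID:-}" ]; then
        echo "XDG_SESSION_ID=$XDG_SESSION_ID"
    else
        echo "XDG_SESSION_ID is not set"
        return 1
    fi
}

# Further checks suggested in the thread (not run here):
#   loginctl list-sessions      # is there a session for the user at all?
#   dpkg -s libpam-systemd      # is the PAM module package installed?
```

Run from a terminal inside the desktop session, a non-zero exit status from check_session corresponds to the "session_id is not set" error that light-locker reports.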
Bug#939389: light-locker: Does not work in buster with xfce4
I tried to trace it sticking debug echoes into Xsession.d scripts It is not set to start off with and even if it was 95dbus_update-activation-env will unset it. Lines 8-10 in that are: unset XDG_SEAT unset XDG_SESSION_ID unset XDG_VTNR A. On 04/09/2019 19:40, Yves-Alexis Perez wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 control: tag -1 unreproducible moreinfo On Wed, 2019-09-04 at 17:37 +0100, Anton Ivanov wrote: No These are the XDG environment variables: XDG_CONFIG_DIRS=/etc/xdg XDG_CURRENT_DESKTOP=XFCE XDG_DATA_DIRS=/usr/share/xfce4:/usr/local/share/:/usr/share/:/usr/share XDG_GREETER_DATA_DIR=/var/lib/lightdm/data/aivanov XDG_MENU_PREFIX=xfce- XDG_SEAT=seat0 XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0 XDG_SESSION_DESKTOP=lightdm-xsession XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session14 XDG_SESSION_TYPE=x11 XDG_VTNR=7 So there's definitely something fishy, because lightdm should definitely set XDG_SESSION_ID. Try looking at lightdm logs maybe? - -- Yves-Alexis -BEGIN PGP SIGNATURE- iQEzBAEBCAAdFiEE8vi34Qgfo83x35gF3rYcyPpXRFsFAl1wBRcACgkQ3rYcyPpX RFtcCQgA2LtYtspQrvEDoyvlDTtMeUCs4YhJxwC53WiJcNT7BhXzE2v1Y9tzGWGs 5sDZfQFfKqjgwrL4ofBD/QwGUa5/ZKmPx4IKRDIGiNNNLdmI5tXKYLaWncjOXZc5 5OGHl+wCBt4NP4BpPe0BL1zxME/Oy2KCUO2KnNfPgZ/m5oTJgpmBhYUdGLRqhF0O EknH73/0+0Umt+Wu9/WxCA3HCxgxla3L/m+lkcXDETGzVjM6fBP945oFynaUnd3E OQEkpAuCKXHToB1bLyyUK7oaaidaHyOhZ0Drt2QopFcBgXdcbIZclMKmKlJr8/0X d9Suq/A0ljctx+tz1a2OEn9lBR9biQ== =+8fp -END PGP SIGNATURE- -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#939389: light-locker: Does not work in buster with xfce4
On 04/09/2019 17:18, Yves-Alexis Perez wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On Wed, 2019-09-04 at 12:47 +0100, Anton Ivanov wrote: I do not. It is mounted normally and light-locker worked fine before the upgrade to buster. In your session, is XDG_SESSION_ID set (and to what)? No These are the XDG environment variables: XDG_CONFIG_DIRS=/etc/xdg XDG_CURRENT_DESKTOP=XFCE XDG_DATA_DIRS=/usr/share/xfce4:/usr/local/share/:/usr/share/:/usr/share XDG_GREETER_DATA_DIR=/var/lib/lightdm/data/aivanov XDG_MENU_PREFIX=xfce- XDG_SEAT=seat0 XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0 XDG_SESSION_DESKTOP=lightdm-xsession XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session14 XDG_SESSION_TYPE=x11 XDG_VTNR=7 Could you give us the debug output? light-locker --debug ** (light-locker:8185): ERROR **: 17:36:51.025: session_id is not set, is /proc mounted with hidepid>0? Trace/breakpoint trap Regards, - -- Yves-Alexis -BEGIN PGP SIGNATURE- iQEzBAEBCAAdFiEE8vi34Qgfo83x35gF3rYcyPpXRFsFAl1v48IACgkQ3rYcyPpX RFvXUggA4qG3JbgGWPtSu61824ScLXDk+zuX1xuvS5LSP59xxVMbU1uv7AsC6qt5 X6kZbH4EcFJoIsubf3clGCo6x012x1nQIHgXvfOV7KKuJq0Kf3LTLhdDvjg/El7l wo8XlPcsl/ULXlWs693VzgcMiCnWYAows1m+4phKEg4eAeagHKTOx5UwttMTmUGi dSEmTEjY7Jtmn0escL1oN52A0x99y3Unfy+hT6NxMYoF1Xr2OCSlmYTFXkOuyg8w U4G6onJXV+PN9iR8nOL+QRA75P5ogoooq+XFkPWivmb4b/LqAIFP77nFmxigT7kT ZoqXejAlWTwK9aQun8oztyYs6mpfDw== =627D -END PGP SIGNATURE- -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#939389: light-locker: Does not work in buster with xfce4
I do not. It is mounted normally and light-locker worked fine before the upgrade to buster. A On 04/09/2019 12:20, Yves-Alexis Perez wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On Wed, 2019-09-04 at 11:51 +0100, Anton Ivanov wrote: Fails to start with the following message: ERROR **: 11:46:17.929: session_id is not set, is /proc mounted with hidepid>0? Looks like this bug: https://github.com/the-cavalry/light-locker/issues/141 And do you have /proc mounted with hidepid>0? Regards, - -- Yves-Alexis -BEGIN PGP SIGNATURE- iQEzBAEBCAAdFiEE8vi34Qgfo83x35gF3rYcyPpXRFsFAl1vngYACgkQ3rYcyPpX RFt6/wf+Jkt0vsRu00nClS4rXiNAw4zBZSV1acorNbcLk1nBt/BvLDlZA2prPyQO EyrUPZLQ680rNMj6m8HWgtp4zaSewApiT8NaDgAv0q9UNlvTOSCw+oOAyAc8niaU qG5AgcLpjXx5E1wK1iNLilkZtiidb6nrEsyVDOnj/rWJO+7AoHU0XwG/+LkG72NM F7ZdVpblEA81OyABzAcoMuqnonsIVnCFa9FbzGvYcRvhcZZrABynKx2RPL262Swq yjIE2zJ/BQzaSZH4OFtaca3OVMk4aruqhiLGD4H7lxabUzbxox/Xpf9i2zrGzebm OMH5zuDCp5hioFlLOw4KxW1njp/htA== =oFdH -END PGP SIGNATURE- -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#939389: light-locker: Does not work in buster with xfce4
Package: light-locker Version: 1.8.0-3 Severity: important Dear Maintainer, light-locker fails to start with the following message: ERROR **: 11:46:17.929: session_id is not set, is /proc mounted with hidepid>0? Looks like this bug: https://github.com/the-cavalry/light-locker/issues/141 -- System Information: Debian Release: 10.0 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-5-amd64 (SMP w/12 CPU cores) Kernel taint flags: TAINT_UNSIGNED_MODULE Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled Versions of packages light-locker depends on: ii dconf-gsettings-backend [gsettings-backend] 0.30.1-2 ii libc6 2.28-10 ii libcairo2 1.16.0-4 ii libdbus-1-3 1.12.16-1 ii libdbus-glib-1-2 0.110-4 ii libglib2.0-0 2.58.3-2 ii libgtk-3-0 3.24.5-1 ii libpango-1.0-0 1.42.4-7~deb10u1 ii libpangocairo-1.0-0 1.42.4-7~deb10u1 ii libsystemd0 241-5 ii libx11-6 2:1.6.7-1 ii libxext6 2:1.3.3-1+b2 ii libxss1 1:1.2.3-1 ii lightdm 1.26.0-4 light-locker recommends no packages. light-locker suggests no packages. -- no debconf information
Bug#931500:
Same picture with different NFS minor versions (4.0, 4.1). Same picture with and without hyperthreading. Same picture with the various mitigations switched on or off via the kernel command line. 100% reproducible within 4-5 repeats of make -j `cat /proc/cpuinfo | grep processor | wc -l` ; make clean on an openwrt tree. Reproducing it on a linux tree takes a bit longer, but it is also reproducible there, after 10-12 repeats. So the executive summary is: NFS is broken. Completely. That is not a level 6 bug, it is a much higher one; please adjust the priority accordingly. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#931500: Acknowledgement (linux-image-4.19.0-5-amd64: kernel deadlock with autofs)
The most interesting part - it is always the same file. ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm/ipcbuf.h It becomes invisible from the client, but exists in the server. Usually takes ~4-5 builds in a loop to achieve that. A. On 08/07/2019 12:01, Anton Ivanov wrote: On 08/07/2019 11:59, Anton Ivanov wrote: There are clearly some issues with nfs across an autofs mount (maybe for hard mounts as well), so this may warrant an upgrade. Example test. Run make -j 12 ; make clean in a loop on an nfs mounted openwrt tree until it fails (usually 2-3 iterations). State on the client ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 8 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ State as seen on the server (mounted via nfs across localhost): ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h State on the filesystem: ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h So actually this looks like the caching on NFS is royally fubar Dropping caches restores things to normal, but that is not a solution. It is a diagnosis. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
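For documentation purposes, the client/server comparison done by hand with ls above can be sketched in a few lines of Python. This is only an illustration, not something that was run as part of the report: the function name missing_on_client and the demo directories are made up here, and on the affected machine the two arguments would be the autofs path and the /exports path quoted in the listings.

```python
import os
import tempfile


def missing_on_client(client_dir, server_dir):
    """Return the names visible in server_dir but not in client_dir, sorted."""
    client_names = set(os.listdir(client_dir))
    server_names = set(os.listdir(server_dir))
    return sorted(server_names - client_names)


def demo():
    """Simulate the reported state with two local temporary directories.

    The "server" directory holds ipcbuf.h, the "client" directory is empty,
    which mirrors the two ls listings in the report.
    """
    with tempfile.TemporaryDirectory() as client, \
            tempfile.TemporaryDirectory() as server:
        with open(os.path.join(server, "ipcbuf.h"), "w") as handle:
            handle.write("#include <asm-generic/ipcbuf.h>\n")
        return missing_on_client(client, server)


if __name__ == "__main__":
    print(demo())
```

Run in the build loop after each make, a non-empty result would flag the iteration at which the client view diverges from the exported filesystem.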
Bug#931500: Acknowledgement (linux-image-4.19.0-5-amd64: kernel deadlock with autofs)
On 08/07/2019 11:59, Anton Ivanov wrote: There are clearly some issues with nfs across an autofs mount (maybe for hard mounts as well), so this may warrant an upgrade. Example test. Run make -j 12 ; make clean in a loop on an nfs mounted openwrt tree until it fails (usually 2-3 iterations). State on the client ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 8 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ State as seen on the server (mounted via nfs across localhost): ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h State on the filesystem: ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h So actually this looks like the caching on NFS is royally fubar Dropping caches restores things to normal, but that is not a solution. It is a diagnosis. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#931500: Acknowledgement (linux-image-4.19.0-5-amd64: kernel deadlock with autofs)
There are clearly some issues with nfs across an autofs mount (maybe for hard mounts as well), so this may warrant an upgrade. Example test. Run make -j 12 ; make clean in a loop on an nfs mounted openwrt tree until it fails (usually 2-3 iterations). State on the client ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 8 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ State as seen on the server (mounted via nfs across localhost): ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h State on the filesystem: ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h So actually this looks like the caching on NFS is royally fubar -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#931500: linux-image-4.19.0-5-amd64: kernel deadlock with autofs
Package: src:linux Version: 4.19.37-5 Severity: normal File: linux-image-4.19.0-5-amd64 Dear Maintainer, An attempt to mount an nfs mount via autofs when it is being unmounted sometimes results in a deadlock. This is easier to reproduce with nfsv3. It is more difficult but still possible with nfs4. I have been unable to reproduce it on any CPU with lower number of threads/cores than Ryzen 5 1600 (6/12). It is reliably reproducible on any 6 core 12 thread or higher Ryzen. It is not easy to trigger - usually takes up to 1-2 days of regular mount/unmounts at the normal autofs 5 min unmount interval to do that. It may sometimes happen in less than 30 minutes. In my case the culprit were system stats scripts executed every 5 minutes from cron. Raising the autofs timeout to 600s eliminated the deadlocks. The deadlock is usually hard and it is impossible to use Alt-SysRQ. The only time I managed to obtain a trace it was as follows: Jun 28 12:56:01 sleer kernel: [101497.077162] rcu: INFO: rcu_sched self-detected stall on CPU Jun 28 12:56:01 sleer kernel: [101497.077172] rcu: #0118-...!: (5250 ticks this GP) idle=6fa/1/0x4002 softirq=514095/514095 fqs=175 Jun 28 12:56:01 sleer kernel: [101497.077174] rcu: #011 (t=5250 jiffies g=2596081 q=15) Jun 28 12:56:01 sleer kernel: [101497.077179] rcu: rcu_sched kthread starved for 4900 jiffies! g2596081 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=7 Jun 28 12:56:01 sleer kernel: [101497.077180] rcu: RCU grace-period kthread stack dump: Jun 28 12:56:01 sleer kernel: [101497.077182] rcu_sched R running task 010 2 0x8000 Jun 28 12:56:01 sleer kernel: [101497.077185] Call Trace: Jun 28 12:56:01 sleer kernel: [101497.077192] ? __schedule+0x2a2/0x870 Jun 28 12:56:01 sleer kernel: [101497.077194] schedule+0x28/0x80 Jun 28 12:56:01 sleer kernel: [101497.077196] schedule_timeout+0x16b/0x390 Jun 28 12:56:01 sleer kernel: [101497.077200] ? 
__next_timer_interrupt+0xc0/0xc0 Jun 28 12:56:01 sleer kernel: [101497.077203] rcu_gp_kthread+0x40d/0x850 Jun 28 12:56:01 sleer kernel: [101497.077205] ? call_rcu_sched+0x20/0x20 Jun 28 12:56:01 sleer kernel: [101497.077207] kthread+0x112/0x130 Jun 28 12:56:01 sleer kernel: [101497.077209] ? kthread_bind+0x30/0x30 Jun 28 12:56:01 sleer kernel: [101497.077211] ret_from_fork+0x1f/0x40 Jun 28 12:56:01 sleer kernel: [101497.077213] NMI backtrace for cpu 8 Jun 28 12:56:01 sleer kernel: [101497.077215] CPU: 8 PID: 21552 Comm: localStorage DB Tainted: GE 4.19.0-5-amd64 #1 Debian 4.19.37-5 Jun 28 12:56:01 sleer kernel: [101497.077216] Hardware name: System manufacturer System Product Name/PRIME B450M-A, BIOS 0604 12/07/2018 Jun 28 12:56:01 sleer kernel: [101497.077217] Call Trace: Jun 28 12:56:01 sleer kernel: [101497.077218] Jun 28 12:56:01 sleer kernel: [101497.077220] dump_stack+0x5c/0x80 Jun 28 12:56:01 sleer kernel: [101497.077223] nmi_cpu_backtrace.cold.4+0x13/0x50 Jun 28 12:56:01 sleer kernel: [101497.077225] ? lapic_can_unplug_cpu.cold.29+0x3b/0x3b Jun 28 12:56:01 sleer kernel: [101497.077227] nmi_trigger_cpumask_backtrace+0xf9/0xfb Jun 28 12:56:01 sleer kernel: [101497.077229] rcu_dump_cpu_stacks+0x9b/0xcb Jun 28 12:56:01 sleer kernel: [101497.077231] rcu_check_callbacks.cold.80+0x1db/0x338 Jun 28 12:56:01 sleer kernel: [101497.077234] ? tick_sched_do_timer+0x60/0x60 Jun 28 12:56:01 sleer kernel: [101497.077236] update_process_times+0x28/0x60 Jun 28 12:56:01 sleer kernel: [101497.077238] tick_sched_handle+0x22/0x60 Jun 28 12:56:01 sleer kernel: [101497.077240] tick_sched_timer+0x37/0x70 Jun 28 12:56:01 sleer kernel: [101497.077241] __hrtimer_run_queues+0x100/0x280 Jun 28 12:56:01 sleer kernel: [101497.077243] hrtimer_interrupt+0x100/0x220 Jun 28 12:56:01 sleer kernel: [101497.077245] ? 
handle_irq_event+0x47/0x5c Jun 28 12:56:01 sleer kernel: [101497.077247] smp_apic_timer_interrupt+0x6a/0x140 Jun 28 12:56:01 sleer kernel: [101497.077248] apic_timer_interrupt+0xf/0x20 Jun 28 12:56:01 sleer kernel: [101497.077249] Jun 28 12:56:01 sleer kernel: [101497.077251] RIP: 0010:smp_call_function_many+0x1f8/0x250 Jun 28 12:56:01 sleer kernel: [101497.077253] Code: c7 e8 0c c4 5e 00 3b 05 1a 86 01 01 0f 83 8c fe ff ff 48 63 d0 48 8b 0b 48 03 0c d5 00 b7 8c a4 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c8 48 c7 c2 60 e3 b2 a4 4c 89 fe 89 df Jun 28 12:56:01 sleer kernel: [101497.077254] RSP: 0018:b93dc9cd3bb8 EFLAGS: 0202 ORIG_RAX: ff13 Jun 28 12:56:01 sleer kernel: [101497.077256] RAX: RBX: 9309fec22c00 RCX: 9309fea27000 Jun 28 12:56:01 sleer kernel: [101497.077256] RDX: 0001 RSI: RDI: 9309fec22c08 Jun 28 12:56:01 sleer kernel: [101497.077257] RBP: 9309fec22c08 R08: 0004 R09: 9309fec22c48 Jun 28 12:56:01 sleer kernel: [101497.077258] R10: 9309fec22c08 R11: 0008 R12: a3a6ca90 Jun 28 12:56:01 sleer kernel: [1014
Bug#931048: linux-image-4.19.0-4-amd64: bridge MAC learning is broken
Package: src:linux Version: 4.19.28-1 Severity: normal File: linux-image-4.19.0-4-amd64 Dear Maintainer, Bridge MAC learning is completely broken at present. How to reproduce: 1. Build one or more MINIMAL vms or connect machines with MINIMAL installs to interfaces which join to a Linux bridge 2. Observe the bridge fdb using the bridge utility or brctl. 3. Run traffic. Obvious issues: 1. MACs expire even if there are gigabytes of traffic flowing to/from them. The refresh if used is completely broken 2. MACs are not immediately reinstated into the forwarding database if there is traffic upon expiry Observations: This seems to be a result of learning being tightly bound with the idea of neighbour and neighbour discovery code. MACs are learned instantaneously if one of the hosts issues a multicast join - f.e. performs IPv6 neighbour discovery or runs avahi. If either one of these is not present the bridge code does not function as it should. While as an idea this is good it should not completely replace learning from unicast traffic. -- Package-specific info: ** Version: Linux version 4.19.0-4-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-2)) #1 SMP Debian 4.19.28-1 (2019-03-12) ** Command line: BOOT_IMAGE=/boot/vmlinuz-4.19.0-4-amd64 root=UUID=3db3d925-a3d9-4c1d-b63d-c087261f1fb2 ro quiet ** Tainted: WE (8704) * Taint on warning. * Unsigned module has been loaded. ** Kernel log: Unable to read kernel log; any relevant messages should be attached ** Model information sys_vendor: System manufacturer product_name: System Product Name product_version: System Version chassis_vendor: Default string chassis_version: Default string bios_vendor: American Megatrends Inc. bios_version: 0409 board_vendor: ASUSTeK COMPUTER INC. 
board_name: PRIME B450M-A board_version: Rev X.0x ** Loaded modules: cfg80211(E) bnep(E) nfnetlink_queue(E) nfnetlink_log(E) nfnetlink(E) bluetooth(E) drbg(E) ansi_cprng(E) ecdh_generic(E) squashfs(E) loop(E) ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) ntfs(E) msdos(E) jfs(E) xfs(E) dm_mod(E) cpuid(E) uas(E) usb_storage(E) xt_nat(E) xt_tcpudp(E) xt_conntrack(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) ip6table_filter(E) ip6_tables(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) iptable_filter(E) veth(E) bridge(E) 8021q(E) garp(E) mrp(E) stp(E) llc(E) fuse(E) tun(E) binfmt_misc(E) nls_ascii(E) eeepc_wmi(E) asus_wmi(E) nls_cp437(E) sparse_keymap(E) rfkill(E) wmi_bmof(E) vfat(E) fat(E) edac_mce_amd(E) uvcvideo(E) videobuf2_vmalloc(E) videobuf2_memops(E) videobuf2_v4l2(E) kvm_amd(E) videobuf2_common(E) ccp(E) amdkfd(E) videodev(E) rng_core(E) media(E) snd_usb_audio(E) joydev(E) snd_usbmidi_lib(E) kvm(E) snd_rawmidi(E) evdev(E) snd_seq_device(E) irqbypass(E) efi_pstore(E) crct10dif_pclmul(E) crc32_pclmul(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) amdgpu(E) ghash_clmulni_intel(E) snd_hda_codec_hdmi(E) efivars(E) snd_hda_intel(E) pcspkr(E) chash(E) snd_hda_codec(E) gpu_sched(E) snd_hda_core(E) ttm(E) snd_hwdep(E) k10temp(E) sp5100_tco(E) snd_pcm_oss(E) snd_mixer_oss(E) drm_kms_helper(E) snd_pcm(E) snd_timer(E) drm(E) snd(E) soundcore(E) sg(E) wmi(E) video(E) button(E) pcc_cpufreq(E) acpi_cpufreq(E) hwmon_vid(E) parport_pc(E) nfsd(E) auth_rpcgss(E) ppdev(E) nfs_acl(E) lockd(E) lp(E) grace(E) parport(E) sunrpc(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) ecb(E) btrfs(E) zstd_decompress(E) zstd_compress(E) xxhash(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid0(E) multipath(E) linear(E) raid1(E) md_mod(E) sd_mod(E) hid_generic(E) 
usbhid(E) hid(E) crc32c_intel(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) ahci(E) mptsas(E) xhci_pci(E) libahci(E) igb(E) mptscsih(E) r8169(E) i2c_piix4(E) xhci_hcd(E) realtek(E) mptbase(E) i2c_algo_bit(E) libphy(E) libata(E) scsi_transport_sas(E) dca(E) usbcore(E) usb_common(E) scsi_mod(E) gpio_amdpt(E) gpio_generic(E) ** PCI devices: 00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:15d0] Subsystem: ASUSTeK Computer Inc. Device [1043:876b] Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452] Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: Kernel driver in use: pcieport 00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [A
Bug#931047: linux-image-4.19.0-4-amd64: bridge igmp snooping is throughly broken
Package: linux-image-4.19.0-4-amd64 Version: linux-image-4.19.0-4-amd64 Severity: minor Dear Maintainer, The multicast/igmp snooping code in the linux bridge is thoroughly broken: 1. If a program binds to a v4 multicast group, a v6 entry is created in the bridge mdb instead. 2. As a result, multicast appears to be flooded instead of being limited to the ports for which there is a known igmp join. The end result is that it sort of works, but is nowhere near correct or performant. -- System Information: Debian Release: 9.9 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-4-amd64 (SMP w/8 CPU cores) Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system)
Bug#930789: iproute2: -json flag to bridge produces broken JSON
Package: iproute2 Version: 4.20.0-2 Severity: normal Dear Maintainer, Running bridge -json mdb show prints: ["mdb":[]] This is not valid JSON, because a key/value pair appears directly inside square brackets. The outer brackets should be curly, i.e. {"mdb":[]}. A. -- System Information: Debian Release: 10.0 APT prefers testing APT policy: (500, 'testing') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-5-amd64 (SMP w/12 CPU cores) Kernel taint flags: TAINT_UNSIGNED_MODULE Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages iproute2 depends on: ii debconf [debconf-2.0] 1.5.71 ii libc6 2.28-10 ii libcap2 1:2.25-2 ii libcap2-bin 1:2.25-2 ii libdb5.3 5.3.28+dfsg1-0.5 ii libelf1 0.176-1.1 ii libmnl0 1.0.4-2 ii libselinux1 2.8-1+b1 ii libxtables12 1.8.2-4 Versions of packages iproute2 recommends: pn libatm1 Versions of packages iproute2 suggests: pn iproute2-doc -- debconf information: iproute2/setcaps: false
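A quick way to confirm the complaint is to feed the reported string to a JSON parser. The sketch below uses only Python's standard json module; the helper name is_valid_json is made up for illustration, and the "expected" string is simply the report's suggestion (curly outer brackets), not output captured from a fixed iproute2.

```python
import json

# Output quoted in the report, and the form the report says it should have.
REPORTED = '["mdb":[]]'
EXPECTED = '{"mdb":[]}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print("reported:", is_valid_json(REPORTED))
    print("expected:", is_valid_json(EXPECTED))
```

Any consumer that pipes bridge -json output into a strict parser will fail in the same way on the reported string.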
Bug#930481: python-pypcap: block/noblocking broken
Package: python-pypcap Version: 1.2.2-1 Severity: important Dear Maintainer, Setting block/nonblock mode in the pypcap variant of working with pcap from python is completely broken. To test, set iface and FILTER for the capture spec and run: p = pcap.pcap(iface.encode('ascii'), MAXPACKET, True, 1) p.setfilter(FILTER) p.setnonblock(True) If there are no packets, a call to p.next(), or using p in an iterator context, should immediately return None. Instead it sits there and waits. This makes the package unusable for all but the very simple blocking use cases. -- System Information: Debian Release: 10.0 APT prefers testing APT policy: (500, 'testing') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-5-amd64 (SMP w/12 CPU cores) Kernel taint flags: TAINT_UNSIGNED_MODULE Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages python-pypcap depends on: ii libc6 2.28-10 ii libpcap0.8 1.8.1-6 ii python 2.7.16-1 python-pypcap recommends no packages. python-pypcap suggests no packages. -- no debconf information
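Since the failure mode is "the call never comes back", a test for it has to run the call under a deadline. The following is a minimal sketch of such a harness using only the standard library; it does not import pypcap, and the name call_with_deadline and the stand-in callables are assumptions made for illustration. On an affected system the callable passed in would be the p.next of the capture object from the report.

```python
import threading


def call_with_deadline(fetch, limit=1.0):
    """Run fetch() in a daemon thread and wait at most `limit` seconds.

    Returns ("returned", value) if fetch() finished in time, or
    ("blocked", None) if it was still running when the deadline passed.
    If fetch() raises, the value is reported as None.
    """
    box = {}

    def worker():
        box["value"] = fetch()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(limit)
    if thread.is_alive():
        return ("blocked", None)
    return ("returned", box.get("value"))


if __name__ == "__main__":
    # Stand-ins for a capture handle: one behaves as non-blocking mode
    # should (returns None at once), the other waits like the reported bug.
    print(call_with_deadline(lambda: None))
    print(call_with_deadline(lambda: threading.Event().wait(3), limit=0.2))
```

The worker is a daemon thread so that a capture call which really does block forever cannot keep the test process alive.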
Bug#926305: closed by Elimar Riesebieter (Re: Bug#926305: nis startup scripts are completely broken)
Please reopen. Advice is no replacement for a Depends in the package control file. As shipped the package is still broken and at the reported severity - breaking most of the system A. On 18/04/2019 14:48, Debian Bug Tracking System wrote: This is an automatic notification regarding your Bug report which was filed against the nis package: #926305: nis startup scripts are completely broken It has been closed by Elimar Riesebieter . Their explanation is attached below along with your original report. If this explanation is unsatisfactory and you have not received a better one in a separate message then please contact Elimar Riesebieter by replying to this email. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#926305: nis startup scripts are completely broken
That is not an advice. If nscd is a required dependency, NIS should bring it in. Presently it is not. Still broken A. On 18/04/2019 14:43, Elimar Riesebieter wrote: * Elimar Riesebieter [2019-04-03 11:06 +0200]: * Anton Ivanov [2019-04-03 09:43 +0100]: Package: nis Version: 3.17.1-3+b1 Severity: critical Justification: breaks unrelated software Dear Maintainer, Startup scripts are completely broken. Something in the systemd conversion/autogeneration. The ypbind binary is never started, the script goes into "backgrounded" and fails. From there on the system is unusable - you cannot log in, UIDs and groups do not resolve, etc. The same system operated correctly before buster upgrade and will operate correctly if ypbind is invoked from the command line. This looks like a pure systemd conversion issue of some sort. At my systems installing nscd helped. As well setting "YPBINDARGS=" in /etc/default/nis must be. This bug should be closed as there is no response from the reporter. As well it seems to be fixed following the advices given above, though. Elimar -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#926443: xfce4-session: eats more entropy at startup than available in the system
Package: xfce4-session Version: 4.12.1-5 Severity: minor Dear Maintainer, This can be observed predominantly on systems with autologin. xfce will sit and wait, sometimes up to 5 minutes, until the startup goes through. The more the system is stripped down and optimized, the longer the wait. stracing xfce4-session points the finger firmly at trying to get more random data. Installing haveged fixes it - the boot becomes nearly instantaneous instead of taking up to 5 minutes. -- System Information: Debian Release: 9.8 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-4-amd64 (SMP w/8 CPU cores) Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages xfce4-session depends on: ii libatk1.0-0 2.22.0-1 ii libc6 2.24-11+deb9u4 ii libcairo2 1.14.8-1 ii libdbus-1-3 1.10.26-0+deb9u1 ii libdbus-glib-1-2 0.108-2 ii libfontconfig1 2.11.0-6.7+b1 ii libfreetype6 2.6.3-3.2 ii libgdk-pixbuf2.0-0 2.36.5-2+deb9u2 ii libglib2.0-0 2.50.3-2 ii libgtk2.0-0 2.24.31-2 ii libice6 2:1.0.9-2 ii libpango-1.0-0 1.40.5-1 ii libpangocairo-1.0-0 1.40.5-1 ii libpangoft2-1.0-0 1.40.5-1 ii libpolkit-gobject-1-0 0.105-18+deb9u1 ii libsm6 2:1.2.2-1+b3 ii libwnck22 2.30.7-5.1 ii libx11-6 2:1.6.4-3+deb9u1 ii libxfce4ui-1-0 4.12.1-2 ii libxfce4util7 4.12.1-3 ii libxfconf-0-2 4.12.1-1 ii xfce4-settings 4.12.1-1 ii xfconf 4.12.1-1 Versions of packages xfce4-session recommends: ii dbus-x11 1.10.26-0+deb9u1 ii libpam-systemd 232-25+deb9u9 ii light-locker 1.7.0-3 ii systemd-sysv 232-25+deb9u9 ii upower 0.99.4-4+b1 ii x11-xserver-utils 7.7+7+b1 ii xfdesktop4 4.12.3-3 ii xfwm4 4.12.4-1 Versions of packages xfce4-session suggests: pn fortunes-mod ii pm-utils 1.4.1-17 ii sudo 1.8.19p1-2.1 -- no debconf information
Bug#878170: Problem elsewhere, you can close it
I found the underlying root cause and the fix by chance when trawling the web for something else. The issue is compositing in the window manager. If compositing is enabled the video WILL tear. If compositing is disabled the videos display correctly. Definitely the case for xfce4. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#926305: nis startup scripts are completely broken
Package: nis Version: 3.17.1-3+b1 Severity: critical Justification: breaks unrelated software Dear Maintainer, Startup scripts are completely broken. Something in the systemd conversion/autogeneration. The ypbind binary is never started, the script goes into "backgrounded" and fails. From there on the system is unusable - you cannot log in, UIDs and groups do not resolve, etc. The same system operated correctly before buster upgrade and will operate correctly if ypbind is invoked from the command line. This looks like a pure systemd conversion issue of some sort. -- Package-specific info: NIS domain: home -- System Information: Debian Release: buster/sid APT prefers testing APT policy: (500, 'testing') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-4-amd64 (SMP w/2 CPU cores) Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled Versions of packages nis depends on: ii debconf [debconf-2.0] 1.5.71 ii hostname 3.21 ii libc6 2.28-8 ii libgdbm6 1.18.1-4 ii libsystemd0 241-1 ii lsb-base 10.2019031300 ii make 4.2.1-1.2 ii netbase 5.6 ii rpcbind [portmap] 1.2.5-0.3 nis recommends no packages. Versions of packages nis suggests: pn nscd -- Configuration Files: /etc/yp.conf changed [not included] -- debconf information: * nis/domain: home
Bug#878069: Acknowledgement (lightdm: xdmcp broken)
I can no longer observe it for up-to-date stretch clients vs stretch server and buster clients versus stretch server. Both sides are amd64 (the original reported crash was with a 32 bit client which I no longer have so cannot re-test). A. On 09/10/2017 14:06, Debian Bug Tracking System wrote: Thank you for filing a new Bug report with Debian. You can follow progress on this Bug here: 878069: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=878069. This is an automatically generated reply to let you know your message has been received. Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course. Your message has been sent to the package maintainer(s): Debian Xfce Maintainers If you wish to submit further information on this problem, please send it to 878...@bugs.debian.org. Please do not send mail to ow...@bugs.debian.org unless you wish to report a problem with the Bug-tracking system. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#925937: light-locker: makes debian/xfce unusable as server in thin client/X11 environment
Package: light-locker Version: 1.7.0-3 Severity: important Dear Maintainer, If the client is non-local (f.e. xterm via xdmcp/lightdm) a lock via light-locker is "fatal" - there is no means of unlocking besides ssh-ing in and killing the light-locker. Suggested - should not run if the DISPLAY has a non-null host part. I ended up having to wrap it as follows:

#!/bin/sh
TEST=`echo $DISPLAY | sed -e 's/:.*//g'`
if [ -n "$TEST" ] ; then
    exit 0
fi
exec /usr/bin/light-locker.real

-- System Information: Debian Release: 9.8 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-4-amd64 (SMP w/8 CPU cores) Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages light-locker depends on: ii dconf-gsettings-backend [gsettings-backend] 0.26.0-2+b1 ii libc6 2.24-11+deb9u4 ii libcairo2 1.14.8-1 ii libdbus-1-3 1.10.26-0+deb9u1 ii libdbus-glib-1-2 0.108-2 ii libglib2.0-0 2.50.3-2 ii libgtk-3-0 3.22.11-1 ii libpango-1.0-0 1.40.5-1 ii libpangocairo-1.0-0 1.40.5-1 ii libsystemd0 232-25+deb9u9 ii libx11-6 2:1.6.4-3+deb9u1 ii libxext6 2:1.3.3-1+b2 ii libxss1 1:1.2.2-1 ii lightdm 1.18.3-1 light-locker recommends no packages. light-locker suggests no packages. -- no debconf information
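The test the wrapper in this report performs (strip everything from the first colon of DISPLAY onwards and see whether anything is left) can be written down as a small function. This is a sketch only; the names display_has_host and should_start_locker are made up for illustration and it mirrors the sed expression in the report, nothing more.

```python
import os


def display_has_host(display):
    """Return True if the DISPLAY value has a non-empty part before the first ':'.

    Same effect as the wrapper's sed expression: ":0" gives an empty host,
    "thinclient:0" gives the host "thinclient".
    """
    host, _, _ = display.partition(":")
    return bool(host)


def should_start_locker(environ=None):
    """Decide whether the locker should start for the given environment.

    Follows the wrapper's logic: do not start when DISPLAY names a host.
    """
    if environ is None:
        environ = os.environ
    return not display_has_host(environ.get("DISPLAY", ""))


if __name__ == "__main__":
    for value in (":0", ":0.0", "thinclient:0", ""):
        print(repr(value), display_has_host(value))
```

Like the shell version, this treats any non-empty host part as remote, so it makes no attempt to recognise host names that still refer to the local machine.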
Bug#925935: light-locker: should not run concurrently with xscreensaver
Package: light-locker Version: 1.7.0-3 Severity: important Dear Maintainer, At present light-locker is allowed to coexist with xscreensaver, at least in xfce. The results are mostly harmless on a normal machine, but fatal in an xterm environment. While xscreensaver functions correctly for xterms, light-locker does not. These two should not be allowed to coexist; it should be either one or the other. -- System Information: Debian Release: 9.8 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.19.0-4-amd64 (SMP w/8 CPU cores) Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages light-locker depends on: ii dconf-gsettings-backend [gsettings-backend] 0.26.0-2+b1 ii libc6 2.24-11+deb9u4 ii libcairo2 1.14.8-1 ii libdbus-1-3 1.10.26-0+deb9u1 ii libdbus-glib-1-2 0.108-2 ii libglib2.0-0 2.50.3-2 ii libgtk-3-0 3.22.11-1 ii libpango-1.0-0 1.40.5-1 ii libpangocairo-1.0-0 1.40.5-1 ii libsystemd0 232-25+deb9u9 ii libx11-6 2:1.6.4-3+deb9u1 ii libxext6 2:1.3.3-1+b2 ii libxss1 1:1.2.2-1 ii lightdm 1.18.3-1 light-locker recommends no packages. light-locker suggests no packages. -- no debconf information
Bug#918978: Problem is fixed by upgrading to kernel 4.19 from unstable
This looks like DRM/amdgpu driver related. The problem goes away if the kernel on the box is upgraded to 4.19. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#924664: ejabberd: node migration broken
On 15/03/2019 16:17, Philipp Huebner wrote: Hi there, Can't restore backup from "/var/lib/ejabberd/restore.erl" at node 'ejabb...@jabber.kot-begemot.co.uk': Table config does not exist. The backup is taken off an up-to-date stretch and is being restored on an up-to-date stretch. this could be caused by a number of different issues, please state the full commands you have been using as well as the user you have been issuing them as. Kind regards, On original host (smaug): ejabberdctl backup ejabberd.backup On new host (jabber is a cname to jain): ejabberdctl mnesia-change-nodename ejabberd@smaug ejabb...@jabber.kot-begemot.co.uk ejabberd.backup ejabberd.restore root@jain:/var/lib/ejabberd# ejabberdctl restore restore.erl Can't restore backup from "/var/lib/ejabberd/restore.erl" at node 'ejabb...@jabber.kot-begemot.co.uk': Table config does not exist. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#924663: ejabberd: default apparmour profile broken
On 15/03/2019 16:15, Philipp Huebner wrote: Hi there, AFAIK apparmor is not enabled by default on Debian Stretch, I built a machine from scratch without touching any defaults and the apparmor is on. I can try retracing what got it enabled, but it got enabled by something in the default build, not by me manually. but even if it is, it's both apparmor's and systemd's job to make sure that ejabberd can not just read/write arbitrary files. So please state the exact commands and paths you were trying to use as well as the error messages you got in response. ejabberdctl restore restore.erl with the original apparmour profile results in a core dump Changing su to rx as in the profile attached to the bug report makes the command execute, but it fails on the other bug I filed. Everything is being executed as root. Regards, -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#924664: ejabberd: node migration broken
On 15/03/2019 16:35, Philipp Huebner wrote: On original host (smaug): ejabberdctl backup ejabberd.backup On new host (jabber is a cname to jain): ejabberdctl mnesia-change-nodename ejabberd@smaug ejabb...@jabber.kot-begemot.co.uk ejabberd.backup ejabberd.restore root@jain:/var/lib/ejabberd# ejabberdctl restore restore.erl ^^^ Can't restore backup from "/var/lib/ejabberd/restore.erl" at node 'ejabb...@jabber.kot-begemot.co.uk': Table config does not exist. Shouldn't 'restore.erl' be 'ejabberd.restore' or did you rename the file somewhere in between? I did rename it. Mea culpa, I was cut-and-pasting from history. All files exist. I retried several times with the files both in /tmp/ and in /var/lib/jabberd/; no difference in either case. It fails with the "Table config" message. -- Anton R. Ivanov https://www.kot-begemot.co.uk/
Bug#924664: ejabberd: node migration broken
Package: ejabberd
Version: 16.09-4
Severity: important

Dear Maintainer,

Converting the mnesia backup to a new node name, using either ejabberdctl or the Erlang code from the ejabberd website, does not result in a viable backup/restore file. Attempting to restore using this file results in:

Can't restore backup from "/var/lib/ejabberd/restore.erl" at node 'ejabb...@jabber.kot-begemot.co.uk': Table config does not exist.

The backup is taken off an up-to-date stretch and is being restored on an up-to-date stretch.

-- System Information:
Debian Release: 9.8
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-4-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages ejabberd depends on:
ii adduser 3.115
ii debconf [debconf-2.0] 1.5.61
ii erlang-asn1 1:19.2.1+dfsg-2+deb9u2
ii erlang-base [erlang-abi-17.0] 1:19.2.1+dfsg-2+deb9u2
ii erlang-crypto 1:19.2.1+dfsg-2+deb9u2
ii erlang-inets 1:19.2.1+dfsg-2+deb9u2
ii erlang-jiffy 0.14.8+dfsg-1
ii erlang-lager 3.2.4-1
ii erlang-mnesia 1:19.2.1+dfsg-2+deb9u2
ii erlang-odbc 1:19.2.1+dfsg-2+deb9u2
ii erlang-p1-cache-tab 1.0.4-2
ii erlang-p1-iconv 1.0.2-2
ii erlang-p1-stringprep 1.0.6-2
ii erlang-p1-tls 1.0.7-2+deb9u1
ii erlang-p1-utils 1.0.5-3
ii erlang-p1-xml 1.1.15-2
ii erlang-p1-yaml 1.0.6-2
ii erlang-p1-zlib 1.0.1-4
ii erlang-public-key 1:19.2.1+dfsg-2+deb9u2
ii erlang-ssl 1:19.2.1+dfsg-2+deb9u2
ii erlang-syntax-tools 1:19.2.1+dfsg-2+deb9u2
ii erlang-xmerl 1:19.2.1+dfsg-2+deb9u2
ii init-system-helpers 1.48
ii lsb-base 9.20161125
ii openssl 1.1.0j-1~deb9u1
ii ucf 3.0036

ejabberd recommends no packages.

Versions of packages ejabberd suggests: ii apparmor 2.11.0-3+deb9u2 pn apparmor-utils pn ejabberd-contrib pn erlang-luerl pn erlang-p1-mysql pn erlang-p1-oauth2 pn erlang-p1-pam pn erlang-p1-pgsql pn erlang-p1-sip pn erlang-p1-sqlite3 pn erlang-p1-stun pn erlang-redis-client ii imagemagick 8:6.9.7.4+dfsg-11+deb9u6 ii imagemagick-6.q16 [imagemagick] 8:6.9.7.4+dfsg-11+deb9u6 pn libunix-syslog-perl pn yamllint -- Configuration Files: /etc/apparmor.d/usr.sbin.ejabberdctl changed: /usr/sbin/ejabberdctl { #include #include #include capability net_bind_service, capability dac_override, /bin/bash rmix, /bin/dash rmix, /bin/date ix, /bin/grep ix, /bin/ps ix, /bin/sedix, /bin/sleep ix, /bin/su px -> /usr/sbin/ejabberdctl//su, profile su { #include #include #include #include capability audit_write, capability setgid, capability setuid, capability sys_resource, @{PROC}/@{pid}/loginuid r, @{PROC}/1/limitsr, /bin/bash px -> /usr/sbin/ejabberdctl, /bin/dash px -> /usr/sbin/ejabberdctl, /bin/su rm, /etc/environmentr, /etc/default/locale r, /etc/security/limits.d**r, /lib/@{multiarch}/libpam.so*rm, } /etc/default/ejabberd r, /etc/ejabberd** r, /etc/ImageMagick** r, /run/ejabberd** rw, /sys/devices/system/cpu** r, /sys/devices/system/node** r, /proc/sys/kernel/random/uuidr, /usr
Bug#924663: ejabberd: default apparmour profile broken
Package: ejabberd
Version: 16.09-4
Severity: important

Dear Maintainer,

The default AppArmor profile prohibits restoring backups.

-- System Information:
Debian Release: 9.8
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-4-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages ejabberd depends on:
ii adduser 3.115
ii debconf [debconf-2.0] 1.5.61
ii erlang-asn1 1:19.2.1+dfsg-2+deb9u2
ii erlang-base [erlang-abi-17.0] 1:19.2.1+dfsg-2+deb9u2
ii erlang-crypto 1:19.2.1+dfsg-2+deb9u2
ii erlang-inets 1:19.2.1+dfsg-2+deb9u2
ii erlang-jiffy 0.14.8+dfsg-1
ii erlang-lager 3.2.4-1
ii erlang-mnesia 1:19.2.1+dfsg-2+deb9u2
ii erlang-odbc 1:19.2.1+dfsg-2+deb9u2
ii erlang-p1-cache-tab 1.0.4-2
ii erlang-p1-iconv 1.0.2-2
ii erlang-p1-stringprep 1.0.6-2
ii erlang-p1-tls 1.0.7-2+deb9u1
ii erlang-p1-utils 1.0.5-3
ii erlang-p1-xml 1.1.15-2
ii erlang-p1-yaml 1.0.6-2
ii erlang-p1-zlib 1.0.1-4
ii erlang-public-key 1:19.2.1+dfsg-2+deb9u2
ii erlang-ssl 1:19.2.1+dfsg-2+deb9u2
ii erlang-syntax-tools 1:19.2.1+dfsg-2+deb9u2
ii erlang-xmerl 1:19.2.1+dfsg-2+deb9u2
ii init-system-helpers 1.48
ii lsb-base 9.20161125
ii openssl 1.1.0j-1~deb9u1
ii ucf 3.0036

ejabberd recommends no packages.

Versions of packages ejabberd suggests: ii apparmor 2.11.0-3+deb9u2 pn apparmor-utils pn ejabberd-contrib pn erlang-luerl pn erlang-p1-mysql pn erlang-p1-oauth2 pn erlang-p1-pam pn erlang-p1-pgsql pn erlang-p1-sip pn erlang-p1-sqlite3 pn erlang-p1-stun pn erlang-redis-client ii imagemagick 8:6.9.7.4+dfsg-11+deb9u6 ii imagemagick-6.q16 [imagemagick] 8:6.9.7.4+dfsg-11+deb9u6 pn libunix-syslog-perl pn yamllint -- Configuration Files: /etc/apparmor.d/usr.sbin.ejabberdctl changed: /usr/sbin/ejabberdctl { #include #include #include capability net_bind_service, capability dac_override, /bin/bash rmix, /bin/dash rmix, /bin/date ix, /bin/grep ix, /bin/ps ix, /bin/sedix, /bin/sleep ix, /bin/su px -> /usr/sbin/ejabberdctl//su, profile su { #include #include #include #include capability audit_write, capability setgid, capability setuid, capability sys_resource, @{PROC}/@{pid}/loginuid r, @{PROC}/1/limitsr, /bin/bash px -> /usr/sbin/ejabberdctl, /bin/dash px -> /usr/sbin/ejabberdctl, /bin/su rm, /etc/environmentr, /etc/default/locale r, /etc/security/limits.d**r, /lib/@{multiarch}/libpam.so*rm, } /etc/default/ejabberd r, /etc/ejabberd** r, /etc/ImageMagick** r, /run/ejabberd** rw, /sys/devices/system/cpu** r, /sys/devices/system/node** r, /proc/sys/kernel/random/uuidr, /usr/bin/cutix, /usr/bin/erlix, /usr/bin/expr ix, /usr/bin/flock ix, /usr/bin/getent ix, /usr/bin/id ix, /usr/bin/seq
Bug#924460: linux-image-4.19.0-0.bpo.2-amd64: Weird hangs on AMD Ryzen
Package: src:linux
Version: 4.19.16-1~bpo9+1
Severity: important

Dear Maintainer,

Occasional hangs, under X only. During a hang, no new processes can be spawned from any terminal window in the X session, and windows which use DRM (firefox, thunderbird, etc.) do not update. Windows can be moved and it is possible to switch to a new desktop. At the same time the rest of the machine works fine. Switching to a text console works fine, and any processes launched from there also work fine. Firefox and other processes relying on DRM are shown in D state during the hang.

The machine recovers by itself in less than a minute. The hangs occur roughly once every 3-4 hours.

I am using an up-to-date out-of-tree it87 version to get the right sensors on the motherboard. The bug shows both with and without this driver. I also had to pull the most recent firmware from kernel.org for the video.

The bug is not observed when using a plug-in video card (Nvidia Quadro 290 NVS), so this looks like something related to DRM or amdgpu power management.

-- Package-specific info:
** Version:
Linux version 4.19.0-0.bpo.2-amd64 (debian-ker...@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP Debian 4.19.16-1~bpo9+1 (2019-02-07)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-0.bpo.2-amd64 root=UUID=3db3d925-a3d9-4c1d-b63d-c087261f1fb2 ro quiet

** Tainted: WOE (12800)
 * Taint on warning.
 * Out-of-tree module has been loaded.
 * Unsigned module has been loaded.

** Kernel log:
[665617.595702] CR2: 557931720f18 CR3: 00024536e000 CR4: 003406e0
[665617.595703] Call Trace:
[665617.595751] optc1_lock+0x9e/0xb0 [amdgpu]
[665617.595796] dcn10_pipe_control_lock.part.25+0x2d/0x70 [amdgpu]
[665617.595840] dcn10_apply_ctx_for_surface+0xdf/0x540 [amdgpu]
[665617.595883] ? hubbub1_verify_allow_pstate_change_high+0x82/0x1a0 [amdgpu]
[665617.595924] dc_commit_state+0x23d/0x550 [amdgpu]
[665617.595963] ? set_freesync_on_streams.part.7+0xce/0x2c0 [amdgpu]
[665617.596002] ? mod_freesync_set_user_enable+0x16d/0x1b0 [amdgpu]
[665617.596046] amdgpu_dm_atomic_commit_tail+0x33e/0xe60 [amdgpu]
[665617.596079] ? amdgpu_bo_pin_restricted+0x68/0x280 [amdgpu]
[665617.596083] ? _cond_resched+0x16/0x40
[665617.596085] ? wait_for_completion_timeout+0x3b/0x1a0
[665617.596087] ? refcount_inc_checked+0x5/0x30
[665617.596119] ? amdgpu_bo_ref+0x17/0x20 [amdgpu]
[665617.596127] commit_tail+0x3d/0x70 [drm_kms_helper]
[665617.596133] drm_atomic_helper_commit+0xb4/0x120 [drm_kms_helper]
[665617.596147] drm_atomic_connector_commit_dpms+0xe5/0xf0 [drm]
[665617.596159] drm_mode_obj_set_property_ioctl+0x247/0x290 [drm]
[665617.596170] ? drm_connector_set_obj_prop+0x80/0x80 [drm]
[665617.596181] drm_connector_property_set_ioctl+0x3e/0x60 [drm]
[665617.596191] drm_ioctl_kernel+0xaa/0xf0 [drm]
[665617.596194] ? sock_write_iter+0x87/0x100
[665617.596204] drm_ioctl+0x2ff/0x390 [drm]
[665617.596215] ? drm_connector_set_obj_prop+0x80/0x80 [drm]
[665617.596217] ? do_iter_write+0xd6/0x180
[665617.596248] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[665617.596251] do_vfs_ioctl+0xa2/0x640
[665617.596254] ? do_sigaction+0xad/0x1e0
[665617.596256] ksys_ioctl+0x70/0x80
[665617.596258] __x64_sys_ioctl+0x16/0x20
[665617.596260] do_syscall_64+0x55/0x110
[665617.596262] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[665617.596264] RIP: 0033:0x7fb56083a017
[665617.596265] Code: 00 00 00 48 8b 05 81 7e 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 7e 2b 00 f7 d8 64 89 01 48
[665617.596266] RSP: 002b:7ffd64cbfd08 EFLAGS: 3246 ORIG_RAX: 0010
[665617.596267] RAX: ffda RBX: RCX: 7fb56083a017
[665617.596268] RDX: 7ffd64cbfd40 RSI: c01064ab RDI: 000e
[665617.596269] RBP: 7ffd64cbfd40 R08: 556b0190 R09: 556aff1154d0
[665617.596270] R10: R11: 3246 R12: c01064ab
[665617.596270] R13: 000e R14: 556afdb28fb0 R15: 556afd86d580
[665617.596272] ---[ end trace 070aabde88b649c0 ]---
[665929.195580] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 1us * 10 tries - optc1_lock line:628
[665929.195675] WARNING: CPU: 4 PID: 15694 at /build/linux-qcc0VE/linux-4.19.16/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:254 generic_reg_wait+0xe5/0x150 [amdgpu]
[665929.195676] Modules linked in: 8021q garp mrp stp llc nls_utf8 isofs uas usb_storage fuse ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs dm_mod cpuid nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache binfmt_misc eeepc_wmi asus_wmi sparse_keymap rfkill wmi_bmof nls_ascii uvcvideo nls_cp437 amdkfd vfat videobuf2_vmalloc videobuf2_memops fat videobuf2_v4l2 videobuf2_common efi_pstore videodev edac_mce_amd snd_usb_audio media amdgpu snd_hda_codec_realtek kvm_amd snd_hda_codec_generic joydev ccp snd_usbmidi_lib snd_rawmidi rng_core snd_
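The report notes that processes relying on DRM sit in D state (uninterruptible sleep) during a hang. A minimal sketch for capturing that from a text console while the hang is in progress follows; the helper name `filter_d_state` is introduced here for illustration, and only the first word of each command name is printed.

```shell
#!/bin/sh
# List processes in uninterruptible sleep (state D).
# Reads "PID STAT COMMAND" rows on stdin so it also works on saved output.
filter_d_state() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

# Live usage: the trailing "=" on each column suppresses the ps header line.
ps -eo pid=,stat=,comm= | filter_d_state
```

Running this from a text console during a hang, alongside the kernel log timestamps, would show whether the blocked processes coincide with the REG_WAIT timeout messages.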
Bug#918978: /usr/bin/firefox: immediate crash on startup on AMD KABINI
Package: firefox-esr Version: 60.4.0esr-1~deb9u1 Severity: important File: /usr/bin/firefox Dear Maintainer, Firefox crashes immediately (including safe mode) when starting on AMD KABINI. 00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Kabini [Radeon HD 8280E] The crash occurs with any possible combinations of the amdgpu and radeon kernel and Xorg drivers. The crash is local - does not apply to running firefox elsewhere and directing display back over SSH. It also crashes when reportbug tries to invoke it to get the relevant package information thus the next section is missing. The $HOME directory is shared via NFS and the same config with same extensions works fine on a variety of other (mostly AMD) hardware. -- Package-specific info: -- Addons package information -- System Information: Debian Release: 9.6 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.18.0-0.bpo.1-amd64 (SMP w/2 CPU cores) Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages firefox-esr depends on: ii debianutils 4.8.1.1 ii fontconfig2.11.0-6.7+b1 ii libasound21.1.3-5 ii libatk1.0-0 2.22.0-1 ii libc6 2.24-11+deb9u3 ii libcairo-gobject2 1.14.8-1 ii libcairo2 1.14.8-1 ii libdbus-1-3 1.10.26-0+deb9u1 ii libdbus-glib-1-2 0.108-2 ii libffi6 3.2.1-6 ii libfontconfig12.11.0-6.7+b1 ii libfreetype6 2.6.3-3.2 ii libgcc1 1:6.3.0-18+deb9u1 ii libgdk-pixbuf2.0-02.36.5-2+deb9u2 ii libglib2.0-0 2.50.3-2 ii libgtk-3-03.22.11-1 ii libjsoncpp1 1.7.4-3 ii libpango-1.0-01.40.5-1 ii libstartup-notification0 0.12-4+b2 ii libstdc++66.3.0-18+deb9u1 ii libvpx4 1.6.1-3+deb9u1 ii libx11-6 2:1.6.4-3+deb9u1 ii libx11-xcb1 2:1.6.4-3+deb9u1 ii libxcb-shm0 1.12-1 ii libxcb1 1.12-1 ii libxcomposite11:0.4.4-2 ii libxdamage1 1:1.1.4-2+b3 ii libxext6 2:1.3.3-1+b2 ii libxfixes31:5.0.3-1 ii 
libxrender1 1:0.9.10-1 ii libxt61:1.1.5-1 ii procps2:3.3.12-3+deb9u1 ii zlib1g1:1.2.8.dfsg-5 Versions of packages firefox-esr recommends: ii libavcodec57 7:3.2.12-1~deb9u1 Versions of packages firefox-esr suggests: pn fonts-lmodern ii fonts-stix [otf-stix] 1.1.1-4 ii libcanberra0 0.30-3 ii libgssapi-krb5-2 1.15-1+deb9u1 ii libgtk2.0-02.24.31-2 ii pulseaudio 10.0-1+deb9u1 -- no debconf information
Bug#884284: nfs-kernel-server: NFSv4 broken
Package: nfs-kernel-server
Version: 1:1.3.4-2.1
Severity: important

Dear Maintainer,

NFSv4 in stretch is broken and unusable. After some time the server exporting the directories starts throwing

[1130732.440356] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[1130734.801510] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[1173981.176268] NFS: nfs4_reclaim_open_state: Lock reclaim failed!

messages, reads and writes slow down to a crawl, and in the end there is no choice but to reboot the server. Restarting nfs-kernel-server, unmounting from all known clients and remounting does not help.

I have now been forced to downgrade back to NFSv3 across the board. The same setup works fine with NFSv3.

NFSv4 used to work perfectly fine in jessie and before that. I am not sure if this started with the stretch upgrade or after one of the stretch mid-life kernel updates (I think it is the latter).

Setup: standard mid-size classic Linux/Unix multiuser install. Server(s) exporting $HOME and other directories to a local network. Clients mount via autofs when needed. Most directories are mounted from at least 2 (usually more) clients.

-- Package-specific info: -- rpcinfo -- program vers proto port service 104 tcp111 portmapper 103 tcp111 portmapper 102 tcp111 portmapper 104 udp111 portmapper 103 udp111 portmapper 102 udp111 portmapper 151 udp 58357 mountd 151 tcp 37131 mountd 152 udp 54135 mountd 152 tcp 32951 mountd 153 udp 47587 mountd 153 tcp 41773 mountd 133 tcp 2049 nfs 134 tcp 2049 nfs 1002273 tcp 2049 133 udp 2049 nfs 134 udp 2049 nfs 1002273 udp 2049 1000211 udp 46283 nlockmgr 1000213 udp 46283 nlockmgr 1000214 udp 46283 nlockmgr 1000211 tcp 40039 nlockmgr 1000213 tcp 40039 nlockmgr 1000214 tcp 40039 nlockmgr 142 udp856 ypserv 141 udp856 ypserv 142 tcp857 ypserv 141 tcp857 ypserv 191 udp866 yppasswdd 6001000691 udp874 fypxfrd 6001000691 tcp875 fypxfrd 172 udp969 ypbind 171 udp969 ypbind 172 tcp970 ypbind 171 tcp970 ypbind 1000241 udp 44513 status 1000241 tcp 58657 status -- /etc/default/nfs-kernel-server -- RPCNFSDCOUNT=8 RPCNFSDPRIORITY=0 RPCMOUNTDOPTS="--manage-gids" NEED_SVCGSSD="" RPCSVCGSSDOPTS="" -- /etc/exports -- /exports 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide,fsid=root) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide,fsid=root) /exports/md0 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide) /exports/md1 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide) /exports/md2 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide) -- /proc/fs/nfs/exports -- # Version 1.1 # Path Client(Flags) # IPs /exports/md0 192.168.0.0/16(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,uuid=a114f04d:9e54427e:b051ce17:4dc02e9f,sec=1) /exports 192.168.0.0/16(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,fsid=0,uuid=a3734f7a:774744b7:b41d4cea:bc2a4f0f,sec=1) /exports 
127.0.0.0/8(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,fsid=0,uuid=a3734f7a:774744b7:b41d4cea:bc2a4f0f,sec=1) /exports/md0 127.0.0.0/8(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,uuid=a114f04d:9e54427e:b051ce17:4dc02e9f,sec=1) -- System Information: Debian Release: 9.2 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.9.0-4-amd64 (SMP w/2 CPU cores) Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages nfs-kernel-server depends on: ii init-system-helpers 1.48 ii keyutils 1.5.9-9 ii libblkid12.29.2-1 ii libc62.24-11+deb9u1 ii libcap2 1:2.25-1 ii libsqlite3-0 3.16.2-5 ii libtirpc10.2.5-1.2 ii libwrap0 7.6.q-26 ii lsb-base 9.20161125 ii netbase 5.4 ii nfs-common 1:1.3.4-2.1 ii ucf 3.0036 nfs-kernel-server recommends no packages. nfs-kernel-server suggests no packages. -- no debconf information
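To tell how often the lock reclaim failure recurs before the machine degrades, the kernel log can be counted for the message quoted in the report. This is a minimal sketch; the helper name `count_reclaim_failures` is introduced for illustration, and the server name and mount point in the usage comment are hypothetical.

```shell
#!/bin/sh
# Count "Lock reclaim failed" messages in kernel log text read from stdin.
# Note: grep -c prints 0 but exits non-zero when nothing matches.
count_reclaim_failures() {
    grep -c 'nfs4_reclaim_open_state: Lock reclaim failed'
}

# Usage (dmesg may require root):
#   dmesg | count_reclaim_failures
#
# Pinning a client to NFSv3, as in the workaround described in the report
# (server name and mount point are hypothetical):
#   mount -t nfs -o vers=3 server:/exports/md0 /mnt
```

With autofs, the equivalent of the `vers=3` option would go into the map entry's mount options.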
Bug#878046: amanda-server: Fails all backups if one or more hosts are down
I am OK to wait for the upload

On 22 October 2017 13:26:56 EEST, Jose M Calhariz wrote:
>That is an old problem of amanda that is solved on v3.5. But the error
>messages are usually different from what you see.
>
>I have been working on a new package that I should upload very shortly,
>to sid and backports. If you are dead on water I
>can provide my working in progress packages for stretch on amd64.
>
>Kind regards
>Jose M Calhariz
>
>On 09/10/17 06:55, Anton Ivanov wrote:
>> Package: amanda-server
>> Version: 1:3.3.9-5
>> Severity: grave
>> Justification: renders package unusable
>>
>> Dear Maintainer,
>>
>> If one or more backup host is unreachable, the backup of all hosts fails.
>>
>> Example - backing up two hosts - smaug and TerriblTerror:
>>
>> If the latter is unreachable
>>
>> TerribleTerror1 /etc lev 0 FAILED [Request to TerribleTerror1 failed: Connection timed out]
>>
>> The former (and all other hosts in the backup sequence) fail with:
>>
>> smaug /exports/md0/home/aivanov lev 0 FAILED [Request to smaug failed: error sending REQ: write error to: Broken pipe]
>>
>> -- System Information:
>> Debian Release: 9.0
>> APT prefers stable
>> APT policy: (500, 'stable')
>> Architecture: amd64 (x86_64)
>>
>> Kernel: Linux 4.9.0-3-amd64 (SMP w/4 CPU cores)
>> Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
>> Shell: /bin/sh linked to /bin/dash
>> Init: systemd (via /run/systemd/system)
>>
>> Versions of packages amanda-server depends on:
>> ii amanda-common 1:3.3.9-5
>> ii bsd-mailx [mailx] 8.1.2-0.20160123cvs-4
>> ii libc6 2.24-11+deb9u1
>> ii libcurl3 7.52.1-5
>> ii libglib2.0-0 2.50.3-2
>> ii libssl1.1 1.1.0f-3
>> ii perl 5.24.1-3
>>
>> amanda-server recommends no packages.
>>
>> Versions of packages amanda-server suggests:
>> ii amanda-client 1:3.3.9-5
>> ii cpio 2.11+dfsg-6
>> ii gnuplot 5.0.5+dfsg1-6
>> ii mt-st 1.3-1
>>
>> -- no debconf information

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Bug#878170: xserver-xorg-video-radeon: Fails to match video to vsync
It tears in full screen with and without compositing in xfce

On 11 October 2017 09:40:59 BST, "Michel Dänzer" wrote:
>On 10/10/17 07:50 PM, Anton Ivanov wrote:
>> Package: xserver-xorg-video-radeon
>> Version: 1:7.8.0-1+b1
>> Severity: important
>>
>> Dear Maintainer,
>>
>> Radeon (and amdgpu for that matter) in stretch no longer match frames
>> to vsync correctly. This is observable with vdpau, opengl and plain
>> xvideo.
>>
>> This used to work correctly in jessie so this is a recent regression.
>>
>> This is also observable in both full screen and windowed mode. The
>> bottom ~5-10% of the picture updates on the wrong vsycn which is
>> clearly visible especially in action sequences and animation.
>>
>> Tested with vlc, mplayer, xine and other software in a variety of
>> output modes. I think I have eliminated other possible common factors
>> leaving the video driver (and/or firmware) the most likely culprit.
>
>The only possibilities for reliably avoiding tearing in Xorg have
>always been:
>
>1. Using a compositing manager which uses OpenGL for rendering
>2. Running an application in fullscreen, using page flipping (i.e. the
>   application must use something like OpenGL / VDPAU / VA-API / ... for
>   rendering / presentation, but not something like XVideo or even pure
>   X11)
>3. Enabling TearFree
>
>Note that 1.+2. are not sufficient when using rotation or other
>transforms via the RandR extension.
>
>
>Does your setup fall under any of these cases? If not, you may just
>have gotten lucky before.
>
>
>--
>Earthling Michel Dänzer | http://www.amd.com
>Libre software enthusiast | Mesa and X developer

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
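Of the three options listed in the quoted reply, enabling TearFree is the only one that is a pure configuration change. A minimal sketch of how it could be switched on for the radeon driver follows, written as a shell helper that prints an Xorg configuration fragment. The helper name `tearfree_snippet`, the `Identifier` string and the file name in the usage note are assumptions made for illustration; the `TearFree` option itself is the one named in the reply.

```shell
#!/bin/sh
# Emit an xorg.conf.d fragment enabling TearFree for the radeon driver.
# The Identifier string is an arbitrary label chosen for this sketch.
tearfree_snippet() {
    cat <<'EOF'
Section "Device"
    Identifier "Radeon TearFree"
    Driver "radeon"
    Option "TearFree" "on"
EndSection
EOF
}

tearfree_snippet
```

Redirecting the output into a file such as /etc/X11/xorg.conf.d/20-radeon.conf (a hypothetical name) as root and restarting the X server would be one way to apply it. For amdgpu the `Driver` line would need to name that driver instead.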
Bug#878170: xserver-xorg-video-radeon: Fails to match video to vsync
Package: xserver-xorg-video-radeon
Version: 1:7.8.0-1+b1
Severity: important

Dear Maintainer,

Radeon (and amdgpu for that matter) in stretch no longer matches frames to vsync correctly. This is observable with vdpau, opengl and plain xvideo.

This used to work correctly in jessie, so this is a recent regression.

This is also observable in both full screen and windowed mode. The bottom ~5-10% of the picture updates on the wrong vsync, which is clearly visible, especially in action sequences and animation.

Tested with vlc, mplayer, xine and other software in a variety of output modes. I think I have eliminated other possible common factors, leaving the video driver (and/or firmware) the most likely culprit.

Brgds,

A.

-- Package-specific info:
X server symlink status:
lrwxrwxrwx 1 root root 13 Aug 25 2012 /etc/X11/X -> /usr/bin/Xorg
-rwxr-xr-x 1 root root 274 Jul 7 06:09 /usr/bin/Xorg

Diversions concerning libGL are in place

diversion of /usr/lib/arm-linux-gnueabihf/libGL.so.1.2.0 to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGL.so.1.2.0 by glx-diversions
diversion of /usr/lib/libGL.so.1 to /usr/lib/mesa-diverted/libGL.so.1 by glx-diversions
diversion of /usr/lib/arm-linux-gnueabihf/libGLESv2.so.2.0.0 to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGLESv2.so.2.0.0 by glx-diversions
diversion of /usr/lib/libGLESv2.so.2 to /usr/lib/mesa-diverted/libGLESv2.so.2 by glx-diversions
diversion of /usr/lib/arm-linux-gnueabihf/libGL.so to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGL.so by glx-diversions
diversion of /usr/lib/x86_64-linux-gnu/libGLESv1_CM.so.1.1.0 to /usr/lib/mesa-diverted/x86_64-linux-gnu/libGLESv1_CM.so.1.1.0 by glx-diversions
diversion of /usr/lib/arm-linux-gnueabihf/libGLESv1_CM.so to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGLESv1_CM.so by glx-diversions
diversion of /usr/lib/i386-linux-gnu/libGLESv2.so.2 to /usr/lib/mesa-diverted/i386-linux-gnu/libGLESv2.so.2 by glx-diversions
diversion of /usr/lib/x86_64-linux-gnu/libGLESv2.so.2 to
/usr/lib/mesa-diverted/x86_64-linux-gnu/libGLESv2.so.2 by glx-diversions diversion of /usr/lib/arm-linux-gnueabihf/libGL.so.1.2 to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGL.so.1.2 by glx-diversions diversion of /usr/lib/libGLESv1_CM.so.1.1.0 to /usr/lib/mesa-diverted/libGLESv1_CM.so.1.1.0 by glx-diversions diversion of /usr/lib/i386-linux-gnu/libGLESv1_CM.so.1 to /usr/lib/mesa-diverted/i386-linux-gnu/libGLESv1_CM.so.1 by glx-diversions diversion of /usr/lib/x86_64-linux-gnu/libGLESv1_CM.so to /usr/lib/mesa-diverted/x86_64-linux-gnu/libGLESv1_CM.so by glx-diversions diversion of /usr/lib/arm-linux-gnueabihf/libGLESv1_CM.so.1.1.0 to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGLESv1_CM.so.1.1.0 by glx-diversions diversion of /usr/lib/libGL.so.1.2.0 to /usr/lib/mesa-diverted/libGL.so.1.2.0 by glx-diversions diversion of /usr/lib/libGLESv2.so to /usr/lib/mesa-diverted/libGLESv2.so by glx-diversions diversion of /usr/lib/libGL.so.1.2 to /usr/lib/mesa-diverted/libGL.so.1.2 by glx-diversions diversion of /usr/lib/i386-linux-gnu/libGLESv1_CM.so.1.1.0 to /usr/lib/mesa-diverted/i386-linux-gnu/libGLESv1_CM.so.1.1.0 by glx-diversions diversion of /usr/lib/x86_64-linux-gnu/libGL.so.1.2.0 to /usr/lib/mesa-diverted/x86_64-linux-gnu/libGL.so.1.2.0 by glx-diversions diversion of /usr/lib/arm-linux-gnueabihf/libGLESv2.so to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGLESv2.so by glx-diversions diversion of /usr/lib/libGL.so to /usr/lib/mesa-diverted/libGL.so by glx-diversions diversion of /usr/lib/arm-linux-gnueabihf/libGLESv2.so.2 to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGLESv2.so.2 by glx-diversions diversion of /usr/lib/x86_64-linux-gnu/libGL.so.1.2 to /usr/lib/mesa-diverted/x86_64-linux-gnu/libGL.so.1.2 by glx-diversions diversion of /usr/lib/i386-linux-gnu/libGLESv2.so to /usr/lib/mesa-diverted/i386-linux-gnu/libGLESv2.so by glx-diversions diversion of /usr/lib/libGLESv1_CM.so to /usr/lib/mesa-diverted/libGLESv1_CM.so by glx-diversions diversion of 
/usr/lib/i386-linux-gnu/libGL.so.1.2.0 to /usr/lib/mesa-diverted/i386-linux-gnu/libGL.so.1.2.0 by glx-diversions diversion of /usr/lib/i386-linux-gnu/libGL.so to /usr/lib/mesa-diverted/i386-linux-gnu/libGL.so by glx-diversions diversion of /usr/lib/x86_64-linux-gnu/libGL.so.1 to /usr/lib/mesa-diverted/x86_64-linux-gnu/libGL.so.1 by glx-diversions diversion of /usr/lib/arm-linux-gnueabihf/libGL.so.1 to /usr/lib/mesa-diverted/arm-linux-gnueabihf/libGL.so.1 by glx-diversions diversion of /usr/lib/i386-linux-gnu/libGLESv2.so.2.0.0 to /usr/lib/mesa-diverted/i386-linux-gnu/libGLESv2.so.2.0.0 by glx-diversions diversion of /usr/lib/libGLESv1_CM.so.1 to /usr/lib/mesa-diverted/libGLESv1_CM.so.1 by glx-diversions diversion of /usr/lib/x86_64-linux-gnu/libGL.so to /usr/lib/mesa-diverted/x86_64-linux-gnu/libGL.so by glx-diversions diversion of /usr/lib/x86_64-linux-gnu/libGLESv2.so.2.0.0 to /usr/lib/mesa-diverted/
Bug#878069: [Pkg-xfce-devel] Bug#878069: lightdm: xdmcp broken
This looks related. https://ubuntuforums.org/showthread.php?t=2332313 It smells like something similar - not doing the correct auth for the X connection resulting in an IO error which immediately terminates the session. Anything else would have been less immediate (it returns to the login screen straight away). A On 10/09/17 15:02, Yves-Alexis Perez wrote: On Mon, 2017-10-09 at 14:03 +0100, Anton Ivanov wrote: After upgrade to stretch XDMCP no longer works. Setup has worked virtually unchanged for a decade, with the last 5+ years using lightdm. Lightdm shows a greeter, you can enter username and password after which it bombs out straight back to greeter. Hi, thanks for the report. Unfortunately I don't have an XDMCP setup so I can't really test that, you'll have to investigate that by yourself. Regards,
Bug#878069: lightdm: xdmcp broken
Package: lightdm
Version: 1.18.3-1
Severity: important

Dear Maintainer,

After the upgrade to stretch, XDMCP no longer works. The setup has worked virtually unchanged for a decade, with the last 5+ years using lightdm.

Lightdm shows a greeter; you can enter a username and password, after which it bombs out straight back to the greeter.

lightdm log snippet for the connecting client:

[+146895.21s] DEBUG: Greeter connected version=1.18.3 resettable=false
[+146895.66s] DEBUG: Greeter start authentication
[+146895.66s] DEBUG: Session: Not setting XDG_VTNR
[+146895.66s] DEBUG: Session pid=13183: Started with service 'lightdm', username '(null)'
[+146895.66s] DEBUG: Session pid=13183: Got 1 message(s) from PAM
[+146895.66s] DEBUG: Prompt greeter with 1 message(s)
[+147067.77s] DEBUG: Greeter start authentication for aivanov
[+147067.77s] DEBUG: Session pid=13183: Sending SIGTERM
[+147067.77s] DEBUG: Session: Not setting XDG_VTNR
[+147067.77s] DEBUG: Session pid=18969: Started with service 'lightdm', username 'aivanov'
[+147067.77s] DEBUG: Session pid=13183: Terminated with signal 15
[+147067.77s] DEBUG: Session: Failed during authentication
[+147067.77s] DEBUG: Seat (null): Session stopped
[+147067.78s] DEBUG: Session pid=18969: Got 1 message(s) from PAM
[+147067.78s] DEBUG: Prompt greeter with 1 message(s)
[+147072.36s] DEBUG: Continue authentication
[+147072.38s] DEBUG: Session pid=18969: Authentication complete with return value 0: Success
[+147072.38s] DEBUG: Authenticate result for user aivanov: Success
[+147072.38s] DEBUG: User aivanov authorized
[+147072.39s] DEBUG: Greeter sets language en_GB.utf8
[+147072.46s] DEBUG: Greeter requests session xfce
[+147072.47s] DEBUG: Seat (null): Stopping greeter; display server will be re-used for user session
[+147072.47s] DEBUG: Session pid=13148: Sending SIGTERM
[+147072.48s] DEBUG: Greeter closed communication channel
[+147072.48s] DEBUG: Session pid=13148: Exited with return value 0
[+147072.48s] DEBUG: Seat (null): Session stopped
[+147072.48s] DEBUG: Seat (null): Greeter stopped, running session [+147072.48s] DEBUG: Registering session with bus path /org/freedesktop/DisplayManager/Session12 [+147072.48s] DEBUG: Session pid=18969: Not setting XDG_VTNR [+147072.48s] DEBUG: Session pid=18969: Running command /etc/X11/Xsession startxfce4 [+147072.48s] DEBUG: Session pid=18969: Logging to .xsession-errors [+147072.51s] DEBUG: Activating login1 session 618 [+147072.51s] WARNING: Error activating login1 session: GDBus.Error:org.freedesktop.DBus.Error.NotSupported: Operation not supported [+147072.75s] DEBUG: Session pid=18969: Exited with return value 0 [+147072.75s] DEBUG: Seat (null): Session stopped [+147072.75s] DEBUG: Seat (null): Stopping display server, no sessions require it [+147072.75s] DEBUG: Seat (null): Display server stopped [+147072.75s] DEBUG: Seat (null): Active display server stopped, starting greeter [+147072.75s] DEBUG: Seat (null): Stopping; failed to start a greeter [+147072.75s] DEBUG: Seat (null): Stopping [+147072.75s] DEBUG: Seat (null): Stopped [+147072.76s] DEBUG: Got Query(authentication_names=[]) from 192.168.3.145:43839 [+147072.76s] DEBUG: Send Willing(authentication_name='' hostname='wyvern' status='') to 192.168.3.145:43839 [+147072.96s] DEBUG: Got Request(display_number=0 connections=[192.168.3.145 fe80::21e:bff:fe7b:6513] authentication_name='' authentication_data= authorization_names=['MIT-MAGIC-COOKIE-1' 'XDM-AUTHORIZATION-1' 'SUN-DES-1'] manufacturer_display_id='') from 192.168.3.145:43839 [+147072.96s] CRITICAL: g_object_unref: assertion 'G_IS_OBJECT (object)' failed [+147072.96s] DEBUG: Send Accept(session_id=51275 authentication_name='' authentication_data= authorization_name='MIT-MAGIC-COOKIE-1' authorization_data=936E7D99E7EF9DC62998E48D97972425) to 192.168.3.145:43839 [+147072.96s] DEBUG: Got Manage(session_id=51275 display_number=0 display_class='MIT-unspecified') from 192.168.3.145:43839 [+147072.96s] DEBUG: Seat (null): Loading properties from config 
section Seat:* [+147072.96s] DEBUG: Seat (null): Starting [+147072.96s] DEBUG: Seat (null): Creating greeter session [+147072.96s] DEBUG: Seat (null): Creating display server of type x [+147072.96s] DEBUG: DisplayServer x-192.168.3.145-0: Connecting to XServer 192.168.3.145:0 [+147072.96s] DEBUG: Seat (null): Display server ready, starting session authentication [+147072.96s] DEBUG: Session: Not setting XDG_VTNR [+147072.96s] DEBUG: Session pid=19127: Started with service 'lightdm-greeter', username 'lightdm' [+147072.96s] DEBUG: Registering seat with bus path /org/freedesktop/DisplayManager/Seat10 [+147072.98s] DEBUG: Session pid=19127: Authentication complete with return value 0: Success [+147072.98s] DEBUG: Seat (null): Session authenticated, running command [+147072.98s] DEBUG: Session pid=19127: Not setting XDG_VTNR [+147072.98s] DEBUG: Session pid=19127: Running command /usr/sbin/lightdm-gtk-greeter [+147072.98s] DEBUG: Session pid=19127: Logging to /var/log/l
Bug#878046: amanda-server: Fails all backups if one or more hosts are down
Package: amanda-server Version: 1:3.3.9-5 Severity: grave Justification: renders package unusable Dear Maintainer, If one or more backup hosts are unreachable, the backup of all hosts fails. Example: backing up two hosts, smaug and TerribleTerror1. If the latter is unreachable, it fails with: TerribleTerror1 /etc lev 0 FAILED [Request to TerribleTerror1 failed: Connection timed out] The former (and all other hosts in the backup sequence) then fails with: smaug /exports/md0/home/aivanov lev 0 FAILED [Request to smaug failed: error sending REQ: write error to: Broken pipe] -- System Information: Debian Release: 9.0 APT prefers stable APT policy: (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.9.0-3-amd64 (SMP w/4 CPU cores) Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages amanda-server depends on: ii amanda-common 1:3.3.9-5 ii bsd-mailx [mailx] 8.1.2-0.20160123cvs-4 ii libc6 2.24-11+deb9u1 ii libcurl3 7.52.1-5 ii libglib2.0-0 2.50.3-2 ii libssl1.1 1.1.0f-3 ii perl 5.24.1-3 amanda-server recommends no packages. Versions of packages amanda-server suggests: ii amanda-client 1:3.3.9-5 ii cpio 2.11+dfsg-6 ii gnuplot 5.0.5+dfsg1-6 ii mt-st 1.3-1 -- no debconf information
Bug#878045: amanda-server: Fails to format blank virtual tapes
Package: amanda-server Version: 1:3.3.9-5 Severity: important Dear Maintainer, Amanda no longer formats completely blank tapes. Tested with virtual tapes on disk, hence reporting only for virtual tapes. The only way to create a new virtual tape at present is to copy the label file out of an existing virtual tape. This allows amanda to overwrite it and function correctly. On a blank virtual tape: $ amlabel AutoSet3 AUTOE05 slot 6 Reading label... /usr/lib/amanda/chg-multi: 96: local: 0: bad variable name Malformed output from changer script -- no output The same command after copying the label file: $ amlabel AutoSet3 AUTOE05 slot 6 Reading label... Volume with label 'AUTOE04' is active and contains data from this configuration. Not writing label. After copying the label file, with -f: $ amlabel -f AutoSet3 AUTOE05 slot 6 Reading label... Volume with label 'AUTOE04' is active and contains data from this configuration. Consider using 'amrmtape' to remove volume 'AUTOE04' from the catalog. Writing label 'AUTOE05'... Checking label... Success! -- System Information: Debian Release: 9.0 APT prefers stable APT policy: (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.9.0-3-amd64 (SMP w/4 CPU cores) Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages amanda-server depends on: ii amanda-common 1:3.3.9-5 ii bsd-mailx [mailx] 8.1.2-0.20160123cvs-4 ii libc6 2.24-11+deb9u1 ii libcurl3 7.52.1-5 ii libglib2.0-0 2.50.3-2 ii libssl1.1 1.1.0f-3 ii perl 5.24.1-3 amanda-server recommends no packages. Versions of packages amanda-server suggests: ii amanda-client 1:3.3.9-5 ii cpio 2.11+dfsg-6 ii gnuplot 5.0.5+dfsg1-6 ii mt-st 1.3-1 -- no debconf information
Bug#215690: This is not dependent on -jump-pointer
I am observing the same behavior regardless of the jump pointer setting. Environment - xfce4, xvkbd is used with irexec/lirc to drive Thunar and vlc How to reproduce: xvkbd -text "a" It will produce a long string of "a" A.
Bug#867227: [pkg-ntp-maintainers] Bug#867227: ntpdc no longer works
ntpq works. You might as well remove ntpdc and/or make it emit a warning. A. On 04/07/17 22:53, Bernhard Schmidt wrote: > Control: tags -1 + moreinfo > > On 04.07.2017 23:21, Anton Ivanov wrote: > > Hi, > >> Package: ntp >> Version: 1:4.2.8p10+dfsg-3 >> Severity: important >> >> Dear Maintainer, >> >> It is no longer possible to query the ntpd state. ntpdc fails to work. > Upstream says (man ntpdc) > > DESCRIPTION > ntpdc is deprecated. Please use ntpq(1) instead - it can do > everything ntpdc used to do, and it does so using a much more sane > interface. > > Please try to use ntpq. > > Best Regards, > Bernhard >
Bug#867227: ntpdc no longer works
Package: ntp Version: 1:4.2.8p10+dfsg-3 Severity: important Dear Maintainer, It is no longer possible to query the ntpd state. ntpdc fails to work. -- System Information: Debian Release: 9.0 APT prefers stable APT policy: (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 4.9.0-3-amd64 (SMP w/2 CPU cores) Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages ntp depends on: ii adduser 3.115 ii dpkg 1.18.24 ii libc6 2.24-11+deb9u1 ii libcap2 1:2.25-1 ii libedit2 3.1-20160903-3 ii libopts25 1:5.18.12-3 ii libssl1.1 1.1.0f-3 ii lsb-base 9.20161125 ii netbase 5.4 Versions of packages ntp recommends: ii perl 5.24.1-3 Versions of packages ntp suggests: pn ntp-doc -- Configuration Files: /etc/ntp.conf changed [not included] -- no debconf information
Bug#395572: This one is security bug with high severity too
First of all, confirming the bug: it is still there in jessie. Second, staying after the logout prevents the pam component of ecryptfs from unmounting the filesystem. This defeats the security of ecryptfs or any other method that relies on unmounting at logout. Third, it breaks autofs/nfs deployments: mounts for $HOME remain mounted. Frankly, considering that there has been no intention to fix it since potato, I suggest we remove the package altogether, as it defeats key security measures in other packages. A.
Bug#844584: dhclient should perform additional validity checks
Package: isc-dhcp-client Version: 4.3.1-6+deb8u2 Severity: serious File: /sbin/dhclient Tags: security https://samy.pl/poisontap/ This is a variation on an ancient "gem" by a DSL modem vendor where the router pretends to be the entire internet by spoofing ARP so that it captures all traffic. The best way to deal with this is to set an upper limit on the size of the network an acceptable netmask may cover in /etc/default/isc-dhcp-client and verify it in a hook (which can be Debian-specific). This way a DHCP reply of 0.0.0.0/0, or anything larger than a class A, will raise a security alert instead of blindly exposing the machine to a spoofing attack. -- System Information: Debian Release: 8.6 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Foreign Architectures: i386 Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores) Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages isc-dhcp-client depends on: ii debianutils 4.4+b1 ii iproute2 3.16.0-2 ii isc-dhcp-common 4.3.1-6+deb8u2 ii libc6 2.19-18+deb8u6 ii libdns-export100 1:9.9.5.dfsg-9+deb8u7 ii libirs-export91 1:9.9.5.dfsg-9+deb8u7 ii libisc-export95 1:9.9.5.dfsg-9+deb8u7 isc-dhcp-client recommends no packages. Versions of packages isc-dhcp-client suggests: pn avahi-autoipd pn resolvconf -- no debconf information
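For illustration only, the netmask check proposed in the dhclient report above could look roughly like the sketch below. It is not an existing dhclient feature. The setting name MIN_PREFIX_LEN and both function names are hypothetical, and a real hook would have to read the offered mask from whatever the hook environment provides (dhclient-script normally exposes it as new_subnet_mask). The default of 8 corresponds to "nothing larger than a class A".

```python
import ipaddress

# Hypothetical setting, e.g. read from /etc/default/isc-dhcp-client.
# 8 means: reject any offered network larger than a class A (/8).
MIN_PREFIX_LEN = 8


def prefix_length(netmask: str) -> int:
    """Convert a dotted-quad netmask to a prefix length.

    Raises ValueError if the string is not an IPv4 address or the
    mask bits are not contiguous (e.g. 255.0.255.0).
    """
    mask = int(ipaddress.IPv4Address(netmask))
    inverted = mask ^ 0xFFFFFFFF
    # A valid netmask inverts to 0b000...0111...1, so adding one
    # must not share any bit with the inverted value.
    if inverted & (inverted + 1):
        raise ValueError(f"non-contiguous netmask: {netmask}")
    return 32 - inverted.bit_length()


def lease_is_acceptable(netmask: str, min_prefix_len: int = MIN_PREFIX_LEN) -> bool:
    """Return True only if the offered netmask is well-formed and the
    network it describes is no larger than the configured limit."""
    try:
        return prefix_length(netmask) >= min_prefix_len
    except ValueError:
        return False


if __name__ == "__main__":
    for offered in ("255.255.255.0", "255.0.0.0", "128.0.0.0", "0.0.0.0"):
        verdict = "accept" if lease_is_acceptable(offered) else "ALERT"
        print(f"{offered}: {verdict}")
```

A hook built on this idea would log or raise an alert and refuse to configure the interface when the check fails, rather than silently accepting the lease.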
Bug#838689: xorg: X hangs on Mac Mini G4
Thanks, Appears to be stable, you can close it. A. On 26/09/16 02:00, Michel Dänzer wrote: On 24/09/16 01:09 AM, Anton Ivanov wrote: Guaranteed hang within first 1 minute after upgrading to Jessie. Used to work with older Debian releases. 100% reproducible - in all cases the log file is full of (EE) [mi] EQ overflow continuing. 600 events have been dropped. [...] [ 13.223099] agpgart-uninorth :00:0b.0: putting AGP V2 device into 4x mode [ 13.223113] radeon :00:10.0: putting AGP V2 device into 4x mode Does radeon.agpmode=1 or radeon.agpmode=-1 on the kernel command line help?
Bug#838689: xorg: X hangs on Mac Mini G4
Package: xorg Version: 1:7.7+7 Severity: important Dear Maintainer, Guaranteed hang within first 1 minute after upgrading to Jessie. Used to work with older Debian releases. 100% reproducible - in all cases the log file is full of (EE) [mi] EQ overflow continuing. 600 events have been dropped. -- Package-specific info: X server symlink status: lrwxrwxrwx 1 root root 13 Jun 2 2014 /etc/X11/X -> /usr/bin/Xorg -rwxr-xr-x 1 root root 2498728 Feb 11 2015 /usr/bin/Xorg VGA-compatible devices on PCI bus: -- :00:10.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RV280 [Radeon 9200] [1002:5962] (rev 01) /etc/X11/xorg.conf does not exist. /etc/X11/xorg.conf.d does not exist. /etc/modprobe.d contains no KMS configuration files. Kernel version (/proc/version): --- Linux version 3.16.0-4-powerpc (debian-ker...@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 Debian 3.16.36-1+deb8u1 (2016-09-03) Xorg X server log files on system: -- -rw-r--r-- 1 root root 63540 Jun 2 2014 /var/log/Xorg.1.log -rw-r--r-- 1 root root 57605 Sep 23 16:59 /var/log/Xorg.0.log Contents of most recent Xorg X server log file (/var/log/Xorg.0.log): - [ 114.126] X.Org X Server 1.16.4 Release Date: 2014-12-20 [ 114.126] X Protocol Version 11, Revision 0 [ 114.126] Build Operating System: Linux 3.2.0-4-powerpc64 ppc Debian [ 114.126] Current Operating System: Linux aenea 3.16.0-4-powerpc #1 Debian 3.16.36-1+deb8u1 (2016-09-03) ppc [ 114.126] Kernel command line: root=UUID=0c078c50-417b-42bb-b84b-758867517e90 ro [ 114.126] Build Date: 11 February 2015 01:13:01AM [ 114.126] xorg-server 2:1.16.4-1 (http://www.debian.org/support) [ 114.126] Current version of pixman: 0.32.6 [ 114.126]Before reporting problems, check http://wiki.x.org to make sure that you have the latest version. [ 114.126] Markers: (--) probed, (**) from config file, (==) default setting, (++) from command line, (!!) notice, (II) informational, (WW) warning, (EE) error, (NI) not implemented, (??) unknown. 
[ 114.126] (==) Log file: "/var/log/Xorg.0.log", Time: Fri Sep 23 16:56:33 2016 [ 114.258] (==) Using system config directory "/usr/share/X11/xorg.conf.d" [ 114.324] (==) No Layout section. Using the first Screen section. [ 114.324] (==) No screen section available. Using defaults. [ 114.324] (**) |-->Screen "Default Screen Section" (0) [ 114.324] (**) | |-->Monitor "" [ 114.325] (==) No monitor specified for screen "Default Screen Section". Using a default monitor configuration. [ 114.325] (==) Automatically adding devices [ 114.325] (==) Automatically enabling devices [ 114.325] (==) Automatically adding GPU devices [ 114.513] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist. [ 114.513]Entry deleted from font path. [ 114.711] (==) FontPath set to: /usr/share/fonts/X11/misc, /usr/share/fonts/X11/100dpi/:unscaled, /usr/share/fonts/X11/75dpi/:unscaled, /usr/share/fonts/X11/Type1, /usr/share/fonts/X11/100dpi, /usr/share/fonts/X11/75dpi, built-ins [ 114.711] (==) ModulePath set to "/usr/lib/xorg/modules" [ 114.711] (II) The server relies on udev to provide the list of input devices. If no devices become available, reconfigure udev or disable AutoAddDevices. 
[ 114.740] (II) Loader magic: 0x209ae698 [ 114.740] (II) Module ABI versions: [ 114.740]X.Org ANSI C Emulation: 0.4 [ 114.740]X.Org Video Driver: 18.0 [ 114.740]X.Org XInput driver : 21.0 [ 114.740]X.Org Server Extension : 8.0 [ 114.741] (II) xfree86: Adding drm device (/dev/dri/card0) [ 114.743] (--) PCI:*(0:0:16:0) 1002:5962:1002:5962 rev 1, Mem @ 0x9800/134217728, 0x9000/65536, I/O @ 0x0400/256, BIOS @ 0x/131072 [ 114.788] (II) LoadModule: "glx" [ 114.939] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so [ 115.319] (II) Module glx: vendor="X.Org Foundation" [ 115.319]compiled for 1.16.4, module version = 1.0.0 [ 115.319]ABI class: X.Org Server Extension, version 8.0 [ 115.319] (==) AIGLX enabled [ 115.320] (==) Matched ati as autoconfigured driver 0 [ 115.320] (==) Matched ati as autoconfigured driver 1 [ 115.320] (==) Matched modesetting as autoconfigured driver 2 [ 115.320] (==) Matched fbdev as autoconfigured driver 3 [ 115.320] (==) Assigned the driver to the xf86ConfigLayout [ 115.320] (II) LoadModule: "ati" [ 115.425] (II) Loading /usr/lib/xorg/modules/drivers/ati_drv.so [ 115.437] (II) Module ati: vendor="X.Org Foundation" [ 115.438]compiled for 1.16.1, module version = 7.5.0 [ 115.438]Module class: X.Org Video Driver [ 115.438]ABI class: X.Org Video Driver, version 18.0 [ 115.438] (II) L