Re: [Intel-wired-lan] [PATCH net-next v2 00/10] Remove unnecessary (void*) conversions
[Cc: Remove mostr...@earthlink.net (550 5.5.1 Recipient rejected - ELNK001_403 -)]

On 11.07.23 at 10:53, Paul Menzel wrote:
> Dear Su,
>
> Thank you for your patch.
>
> On 10.07.23 at 08:38, Su Hui wrote:
>> From: wuych
>
> Can you please write the full name correctly? Maybe Yun Chuan?
>
>     git config --global user.name "Yun Chuan"
>     git commit --amend --author="Yun Chuan "
>
> I only got the cover letter by the way.
>
> Kind regards,
>
> Paul
>
>> Changes in v2:
>>   move declarations to be reverse xmas tree.
>>   compile it in net and net-next branch.
>>   remove some error patches in v1.
>>
>> PATCH v1 link:
>> https://lore.kernel.org/all/20230628024121.1439149-1-yunch...@nfschina.com/
>>
>> wuych (10):
>>   net: wan: Remove unnecessary (void*) conversions
>>   net: atlantic: Remove unnecessary (void*) conversions
>>   net: ppp: Remove unnecessary (void*) conversions
>>   net: hns3: remove unnecessary (void*) conversions
>>   net: hns: Remove unnecessary (void*) conversions
>>   ice: remove unnecessary (void*) conversions
>>   ethernet: smsc: remove unnecessary (void*) conversions
>>   net: mdio: Remove unnecessary (void*) conversions
>>   can: ems_pci: Remove unnecessary (void*) conversions
>>   net: bna: Remove unnecessary (void*) conversions
>>
>>  drivers/net/can/sja1000/ems_pci.c | 6 +++---
>>  .../aquantia/atlantic/hw_atl2/hw_atl2.c | 12 ++--
>>  .../atlantic/hw_atl2/hw_atl2_utils_fw.c | 2 +-
>>  drivers/net/ethernet/brocade/bna/bnad.c | 19 +--
>>  .../ethernet/hisilicon/hns3/hns3_ethtool.c | 2 +-
>>  drivers/net/ethernet/hisilicon/hns_mdio.c | 10 +-
>>  drivers/net/ethernet/intel/ice/ice_main.c | 4 ++--
>>  drivers/net/ethernet/smsc/smsc911x.c | 4 ++--
>>  drivers/net/ethernet/smsc/smsc9420.c | 4 ++--
>>  drivers/net/mdio/mdio-xgene.c | 4 ++--
>>  drivers/net/ppp/pppoe.c | 4 ++--
>>  drivers/net/ppp/pptp.c | 4 ++--
>>  drivers/net/wan/fsl_ucc_hdlc.c | 6 +++---
>>  13 files changed, 40 insertions(+), 41 deletions(-)
Re: [Intel-wired-lan] [PATCH net-next v2 00/10] Remove unnecessary (void*) conversions
Dear Su,

Thank you for your patch.

On 10.07.23 at 08:38, Su Hui wrote:
> From: wuych

Can you please write the full name correctly? Maybe Yun Chuan?

    git config --global user.name "Yun Chuan"
    git commit --amend --author="Yun Chuan "

I only got the cover letter by the way.

Kind regards,

Paul

> Changes in v2:
>   move declarations to be reverse xmas tree.
>   compile it in net and net-next branch.
>   remove some error patches in v1.
>
> PATCH v1 link:
> https://lore.kernel.org/all/20230628024121.1439149-1-yunch...@nfschina.com/
>
> wuych (10):
>   net: wan: Remove unnecessary (void*) conversions
>   net: atlantic: Remove unnecessary (void*) conversions
>   net: ppp: Remove unnecessary (void*) conversions
>   net: hns3: remove unnecessary (void*) conversions
>   net: hns: Remove unnecessary (void*) conversions
>   ice: remove unnecessary (void*) conversions
>   ethernet: smsc: remove unnecessary (void*) conversions
>   net: mdio: Remove unnecessary (void*) conversions
>   can: ems_pci: Remove unnecessary (void*) conversions
>   net: bna: Remove unnecessary (void*) conversions
>
>  drivers/net/can/sja1000/ems_pci.c | 6 +++---
>  .../aquantia/atlantic/hw_atl2/hw_atl2.c | 12 ++--
>  .../atlantic/hw_atl2/hw_atl2_utils_fw.c | 2 +-
>  drivers/net/ethernet/brocade/bna/bnad.c | 19 +--
>  .../ethernet/hisilicon/hns3/hns3_ethtool.c | 2 +-
>  drivers/net/ethernet/hisilicon/hns_mdio.c | 10 +-
>  drivers/net/ethernet/intel/ice/ice_main.c | 4 ++--
>  drivers/net/ethernet/smsc/smsc911x.c | 4 ++--
>  drivers/net/ethernet/smsc/smsc9420.c | 4 ++--
>  drivers/net/mdio/mdio-xgene.c | 4 ++--
>  drivers/net/ppp/pppoe.c | 4 ++--
>  drivers/net/ppp/pptp.c | 4 ++--
>  drivers/net/wan/fsl_ucc_hdlc.c | 6 +++---
>  13 files changed, 40 insertions(+), 41 deletions(-)
Re: bnx2x: ppc64le: Unable to set message level greater than 0x7fff
Dear Jakub,

Sorry, one more addition.

On 16.03.22 at 06:16, Paul Menzel wrote:
> On 16.03.22 at 02:35, Jakub Kicinski wrote:
>> On Tue, 15 Mar 2022 22:58:57 +0100 Paul Menzel wrote:
>>> On the POWER8 server IBM S822LC (ppc64le), I am unable to set the
>>> message level for the network device to 0x0100000; the attempt fails.
>>>
>>>     $ sudo ethtool -s enP1p1s0f2 msglvl 0x0100000
>>>     netlink error: cannot modify bits past kernel bitset size (offset 56)
>>>     netlink error: Invalid argument
>>>
>>> Below is more information. 0x7fff is the largest value I am able to set.
>>>
>>> ```
>>> $ sudo ethtool -i enP1p1s0f2
>>> driver: bnx2x
>>> version: 5.17.0-rc7+
>>> firmware-version: bc 7.10.4
>>> expansion-rom-version:
>>> bus-info: 0001:01:00.2
>>> supports-statistics: yes
>>> supports-test: yes
>>> supports-eeprom-access: yes
>>> supports-register-dump: yes
>>> supports-priv-flags: yes
>>> $ sudo ethtool -s enP1p1s0f2 msglvl 0x7fff
>>> $ sudo ethtool enP1p1s0f2
>>> Settings for enP1p1s0f2:
>>>         Supported ports: [ TP ]
>>>         Supported link modes:   10baseT/Half 10baseT/Full
>>>                                 100baseT/Half 100baseT/Full
>>>                                 1000baseT/Full
>>>         Supported pause frame use: Symmetric Receive-only
>>>         Supports auto-negotiation: Yes
>>>         Supported FEC modes: Not reported
>>>         Advertised link modes:  10baseT/Half 10baseT/Full
>>>                                 100baseT/Half 100baseT/Full
>>>                                 1000baseT/Full
>>>         Advertised pause frame use: Symmetric Receive-only
>>>         Advertised auto-negotiation: Yes
>>>         Advertised FEC modes: Not reported
>>>         Speed: Unknown!
>>>         Duplex: Unknown! (255)
>>>         Auto-negotiation: on
>>>         Port: Twisted Pair
>>>         PHYAD: 17
>>>         Transceiver: internal
>>>         MDI-X: Unknown
>>>         Supports Wake-on: g
>>>         Wake-on: d
>>>         Current message level: 0x7fff (32767)
>>>                                drv probe link timer ifdown ifup rx_err tx_err tx_queued intr tx_done rx_status pktdata hw wol
>>>         Link detected: no
>>> $ sudo ethtool -s enP1p1s0f2 msglvl 0x8000
>>> netlink error: cannot modify bits past kernel bitset size (offset 56)
>>> netlink error: Invalid argument
>>> ```
>>
>> The new ethtool-over-netlink API limits the msg levels to the ones
>> officially defined by the kernel (NETIF_MSG_CLASS_COUNT).
>>
>> CC: Michal
>
> Thank you for the prompt reply. So, it’s unrelated to the architecture,
> and to the Linux kernel version, as it works on x86_64 with Linux 5.10.x.
>
> Michal, how do I turn on certain bnx2x messages?
>
>     $ git grep BNX2X_MSG_SP drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
>     drivers/net/ethernet/broadcom/bnx2x/bnx2x.h:#define BNX2X_MSG_SP 0x0100000 /* was: NETIF_MSG_INTR */

Testing this on the x86_64 Dell OptiPlex 5055 with a Broadcom NetXtreme BCM5762 Gigabit Ethernet PCIe [14e4:1687], it still works.

```
$ uname -a
Linux serotimor.molgen.mpg.de 5.17.0-rc5.mx64.428 #1 SMP PREEMPT Mon Feb 21 04:00:47 CET 2022 x86_64 GNU/Linux
$ sudo ethtool -s net00 msglvl 0x0100000
$ ethtool net00
Settings for net00:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
                                             1000baseT/Full
        Link partner advertised pause frame use: No
        Link partner advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: off
Cannot get wake-on-lan settings: Operation not permitted
        Current message level: 0x00100000 (1048576)
                               0x100000
        Link detected: yes
$ sudo ethtool -s net00 msglvl 0xfff
$ ethtool net00
Settings for net00:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
```
Re: bnx2x: ppc64le: Unable to set message level greater than 0x7fff
Dear Jakub,

On 16.03.22 at 02:35, Jakub Kicinski wrote:
> On Tue, 15 Mar 2022 22:58:57 +0100 Paul Menzel wrote:
>> On the POWER8 server IBM S822LC (ppc64le), I am unable to set the
>> message level for the network device to 0x0100000; the attempt fails.
>>
>>     $ sudo ethtool -s enP1p1s0f2 msglvl 0x0100000
>>     netlink error: cannot modify bits past kernel bitset size (offset 56)
>>     netlink error: Invalid argument
>>
>> Below is more information. 0x7fff is the largest value I am able to set.
>>
>> ```
>> $ sudo ethtool -i enP1p1s0f2
>> driver: bnx2x
>> version: 5.17.0-rc7+
>> firmware-version: bc 7.10.4
>> expansion-rom-version:
>> bus-info: 0001:01:00.2
>> supports-statistics: yes
>> supports-test: yes
>> supports-eeprom-access: yes
>> supports-register-dump: yes
>> supports-priv-flags: yes
>> $ sudo ethtool -s enP1p1s0f2 msglvl 0x7fff
>> $ sudo ethtool enP1p1s0f2
>> Settings for enP1p1s0f2:
>>         Supported ports: [ TP ]
>>         Supported link modes:   10baseT/Half 10baseT/Full
>>                                 100baseT/Half 100baseT/Full
>>                                 1000baseT/Full
>>         Supported pause frame use: Symmetric Receive-only
>>         Supports auto-negotiation: Yes
>>         Supported FEC modes: Not reported
>>         Advertised link modes:  10baseT/Half 10baseT/Full
>>                                 100baseT/Half 100baseT/Full
>>                                 1000baseT/Full
>>         Advertised pause frame use: Symmetric Receive-only
>>         Advertised auto-negotiation: Yes
>>         Advertised FEC modes: Not reported
>>         Speed: Unknown!
>>         Duplex: Unknown! (255)
>>         Auto-negotiation: on
>>         Port: Twisted Pair
>>         PHYAD: 17
>>         Transceiver: internal
>>         MDI-X: Unknown
>>         Supports Wake-on: g
>>         Wake-on: d
>>         Current message level: 0x7fff (32767)
>>                                drv probe link timer ifdown ifup rx_err tx_err tx_queued intr tx_done rx_status pktdata hw wol
>>         Link detected: no
>> $ sudo ethtool -s enP1p1s0f2 msglvl 0x8000
>> netlink error: cannot modify bits past kernel bitset size (offset 56)
>> netlink error: Invalid argument
>> ```
>
> The new ethtool-over-netlink API limits the msg levels to the ones
> officially defined by the kernel (NETIF_MSG_CLASS_COUNT).
>
> CC: Michal

Thank you for the prompt reply. So, it’s unrelated to the architecture, and to the Linux kernel version, as it works on x86_64 with Linux 5.10.x.

Michal, how do I turn on certain bnx2x messages?

    $ git grep BNX2X_MSG_SP drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
    drivers/net/ethernet/broadcom/bnx2x/bnx2x.h:#define BNX2X_MSG_SP 0x0100000 /* was: NETIF_MSG_INTR */

Kind regards,

Paul
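The limit described above can be illustrated with a minimal sketch. It assumes that the fifteen class names printed by `ethtool` for message level 0x7fff are exactly the generic classes counted by NETIF_MSG_CLASS_COUNT; the helper names are made up for this illustration, and the code only mirrors the observed behaviour, not the kernel's actual bitset handling.

```python
# Sketch only: models which msglvl values fit into the generic classes.
# The class list is copied from the ethtool output quoted in this thread;
# the function names below are hypothetical helpers, not an ethtool API.
NETIF_MSG_CLASSES = [
    "drv", "probe", "link", "timer", "ifdown", "ifup", "rx_err", "tx_err",
    "tx_queued", "intr", "tx_done", "rx_status", "pktdata", "hw", "wol",
]
NETIF_MSG_CLASS_COUNT = len(NETIF_MSG_CLASSES)   # 15 classes
GENERIC_MASK = (1 << NETIF_MSG_CLASS_COUNT) - 1  # 0x7fff


def extra_bits(msglvl):
    """Return the bits of msglvl that lie outside the generic classes."""
    return msglvl & ~GENERIC_MASK


def fits_generic_classes(msglvl):
    """True if every set bit corresponds to one of the generic classes."""
    return extra_bits(msglvl) == 0


def class_names(msglvl):
    """Names of the generic classes enabled in msglvl."""
    return [name for bit, name in enumerate(NETIF_MSG_CLASSES)
            if msglvl & (1 << bit)]
```

Under these assumptions `fits_generic_classes(0x7fff)` is true and `fits_generic_classes(0x8000)` is false, which matches the two `ethtool -s` invocations in the report. A driver-private flag in a higher bit, such as bit 20 (decimal 1048576, the value shown in the x86_64 output earlier in this thread), falls entirely into `extra_bits()`.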
bnx2x: ppc64le: Unable to set message level greater than 0x7fff
Dear Linux folks,

On the POWER8 server IBM S822LC (ppc64le), I am unable to set the message level for the network device to 0x0100000; the attempt fails.

    $ sudo ethtool -s enP1p1s0f2 msglvl 0x0100000
    netlink error: cannot modify bits past kernel bitset size (offset 56)
    netlink error: Invalid argument

Below is more information. 0x7fff is the largest value I am able to set.

```
$ sudo ethtool -i enP1p1s0f2
driver: bnx2x
version: 5.17.0-rc7+
firmware-version: bc 7.10.4
expansion-rom-version:
bus-info: 0001:01:00.2
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
$ sudo ethtool -s enP1p1s0f2 msglvl 0x7fff
$ sudo ethtool enP1p1s0f2
Settings for enP1p1s0f2:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Port: Twisted Pair
        PHYAD: 17
        Transceiver: internal
        MDI-X: Unknown
        Supports Wake-on: g
        Wake-on: d
        Current message level: 0x7fff (32767)
                               drv probe link timer ifdown ifup rx_err tx_err tx_queued intr tx_done rx_status pktdata hw wol
        Link detected: no
$ sudo ethtool -s enP1p1s0f2 msglvl 0x8000
netlink error: cannot modify bits past kernel bitset size (offset 56)
netlink error: Invalid argument
```

Kind regards,

Paul

PS: This is unrelated to the other problem.
Re: rcutorture’s init segfaults in ppc64le VM
Dear Zhouyi,

Thank you for still looking into this.

On 10.03.22 at 03:37, Zhouyi Zhou wrote:
> I tried to reproduce the bug in a ppc64 VM at Oregon State University
> using the vmlinux extracted from
> https://owww.molgen.mpg.de/~pmenzel/rcutorture-2022.02.01-21.52.37-torture-locktorture-kasan-lock01.tar.xz
>
> The ppc64 VM in which I run QEMU without hardware acceleration is:
> Linux version 5.4.0-100-generic (buildd@bos02-ppc64el-021) (gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)) #113-Ubuntu SMP Thu Feb 3 18:43:11 UTC 2022 (Ubuntu 5.4.0-100.113-generic 5.4.166)
>
> The qemu command I use to test:
> cd /tmp/dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01$
> $qemu-system-ppc64 -nographic -smp cores=2,threads=1 -net none -M pseries -nodefaults -device spapr-vscsi -serial file:/tmp/console.log -m 512 -kernel ./vmlinux -append "debug_boot_weak_hash panic=-1 console=ttyS0 rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30 rcutree.gp_preinit_delay=12 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 rcutree.kthread_prio=2 threadirqs tree.use_softirq=0 rcutorture.n_barrier_cbs=4 rcutorture.stat_interval=15 rcutorture.shutdown_secs=1800 rcutorture.test_no_idle_hz=1 rcutorture.verbose=1"
>
> The console.log is uploaded to:
> http://154.223.142.244/logs/20220310/console.paul.log
>
> The log tells us it is an illegal instruction that causes the trouble:
> [4.246387][T1] init[1]: illegal instruction (4) at 1002c308 nip 1002c308 lr 10001684 code 1 in init[1000+d]
> [4.251400][T1] init[1]: code: f90d88c0 f92a0008 f9480008 7c2004ac 2c2d f949 386d88d0 38e8
> [4.253416][T1] init[1]: code: 41820098 e92d8f98 75290010 4182008c <4401> 2c2d 6000 8902f438
>
> Meanwhile, the vmlinux compiled by myself runs smoothly.

How did you build it? Using GCC or clang? I forgot whether the problem was only reproducible when the host Linux kernel was built with clang, or when the VM kernel was.

> Then I modified mkinitrd.sh to let it panic manually:
> http://154.223.142.244/logs/20220310/mkinitrd.sh

I only see the change:

    -
    + int *ptr = 0;
    + *ptr = 0;

> The log tells us it is a segfault (instead of an illegal instruction):
> http://154.223.142.244/logs/20220310/console.zhouyi.log
>
> Then I used gdb to debug the init on the host:
> ubuntu@zhouzhouyi-1:~/newkernel/linux-next$ gdb tools/testing/selftests/rcutorture/initrd/init
> (gdb) run
> Starting program: /home/ubuntu/newkernel/linux-next/tools/testing/selftests/rcutorture/initrd/init
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x1b2c in ?? ()
> (gdb) x/10i $pc
> => 0x1b2c: stw r9,0(r9)
>    0x1b30: trap
>    0x1b34: .long 0x0
>    0x1b38: .long 0x0
>    0x1b3c: .long 0x0
>    0x1b40: lis r2,4110
>    0x1b44: addi r2,r2,31488
>    0x1b48: mr r9,r1
>    0x1b4c: rldicr r1,r1,0,59
>    0x1b50: li r0,0
> (gdb) p $r9
> $1 = 0
> (gdb) x/30x $pc - 0x30
> 0x1afc: 0x38840040 0x387f0040 0xf8010040 0x48026919
> 0x1b0c: 0x6000 0xe8010040 0x7c0803a6 0x4b24
> 0x1b1c: 0x 0x0100 0x0180 0x3920
> 0x1b2c: 0x9129 0x7fe8 0x 0x
>
> which matches the hex content of
> http://154.223.142.244/logs/20220310/console.zhouyi.log:
> [5.077431][T1] init[1]: segfault (11) at 0 nip 1b2c lr 10001024 code 1 in init[1000+d]
> [5.087167][T1] init[1]: code: 38840040 387f0040 f8010040 48026919 6000 e8010040 7c0803a6 4b24
> [5.093987][T1] init[1]: code: 0100 0180 3920 <9129> 7fe8
>
> Conclusions:
> there might be something wrong when packing the init into vmlinux in
> your environment.
>
> I will continue to do research on this interesting problem with you.

As written, I think it’s a problem with LLVM/clang. Unfortunately, I won’t be able to retest before next week.

Kind regards,

Paul
Re: No Linux logs when doing `ppc64_cpu --smt=off/8`
Dear Michal,

Thank you for your reply.

On 14.02.22 at 10:43, Michal Suchánek wrote:
> On Mon, Feb 14, 2022 at 07:08:07AM +0100, Paul Menzel wrote:
>> Dear PPC folks,
>>
>> On the POWER8 server IBM S822LC running `ppc64_cpu --smt=off` or
>> `ppc64_cpu --smt=8`, Linux 5.17-rc4 does not log anything. I would have
>> expected a message about the change in number of processing units.
>
> IIRC it was considered too noisy for systems with many CPUs and the
> message was dropped.
>
> You can always check the resulting state with ppc64_cpu or examining
> sysfs.

Yes, a simple `nproc` suffices, but I was thinking more about the fact that the Linux log is often used for debugging, so a record of changes in the number of processing units might be good to have.

`ppc64_cpu --smt=off` or `=8` seems to block for quite some time, as each thread/processing unit seems to be powered down/on sequentially. So 140 messages would indeed be quite noisy. I have no idea how `ppc64_cpu` works, and whether it could log a message at the beginning and end.

Kind regards,

Paul
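As a sketch of the "examining sysfs" route mentioned above: the kernel exposes the online CPUs as a range list in `/sys/devices/system/cpu/online`. The snippet below parses that list format; the function names are hypothetical helpers written for this illustration, and the example strings are made up rather than taken from the S822LC.

```python
# Sketch only: parse the CPU range list that sysfs reports for online CPUs.
# parse_cpu_list() and online_cpus() are hypothetical helper names.
from pathlib import Path

ONLINE_PATH = Path("/sys/devices/system/cpu/online")


def parse_cpu_list(text):
    """Turn a list such as '0-7,16,24-25' into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file or stray comma yields no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def online_cpus(path=ONLINE_PATH):
    """Read and parse the online CPU list from sysfs."""
    return parse_cpu_list(path.read_text())
```

Calling `len(online_cpus())` before and after `ppc64_cpu --smt=off` would show the change in the number of processing units, which is the information the thread wishes the kernel log contained.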
[PATCH] powerpc/boot: Add `otheros-too-big.bld` to .gitignore
Currently, `git status` lists the file as untracked by git, so tell git to ignore it.

Fixes: aa3bc365ee73 ("powerpc/ps3: Add check for otheros image size")
Signed-off-by: Paul Menzel
---
 arch/powerpc/boot/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/boot/.gitignore b/arch/powerpc/boot/.gitignore
index 1eee61b82341..a4716d138cfc 100644
--- a/arch/powerpc/boot/.gitignore
+++ b/arch/powerpc/boot/.gitignore
@@ -16,6 +16,7 @@ kernel-vmlinux.strip.c
 kernel-vmlinux.strip.gz
 mktree
 otheros.bld
+otheros-too-big.bld
 uImage
 cuImage.*
 dtbImage.*
-- 
2.34.1
No Linux logs when doing `ppc64_cpu --smt=off/8`
Dear PPC folks,

On the POWER8 server IBM S822LC running `ppc64_cpu --smt=off` or `ppc64_cpu --smt=8`, Linux 5.17-rc4 does not log anything. I would have expected a message about the change in number of processing units.

Kind regards,

Paul
Re: BUG: sleeping function called from invalid context at include/linux/sched/mm.h:256
Dear Paul,

On 13.02.22 at 15:45, Paul E. McKenney wrote:
> On Sun, Feb 13, 2022 at 08:39:13AM +0100, Paul Menzel wrote:
>> On 13.02.22 at 00:48, Paul E. McKenney wrote:
>>> On Sun, Feb 13, 2022 at 12:05:50AM +0100, Paul Menzel wrote:

[…]

>>>> Running rcutorture on the POWER8 system IBM S822LC with Ubuntu 20.10,
>>>> it found the bug below. I more or less used rcu/dev (0ba8896d2fd7
>>>> (lib/irq_poll: Declare IRQ_POLL softirq vector as ksoftirqd-parking
>>>> safe)) [1]. The bug manifested for the four configurations below.
>>>>
>>>> 1. results-rcutorture-kasan/SRCU-T
>>>> 2. results-rcutorture-kasan/TINY02
>>>> 3. results-rcutorture/SRCU-T
>>>> 4. results-rcutorture/TINY02
>>>
>>> Adding Frederic on CC...
>>>
>>> I am dropping these three for the moment:
>>>
>>> 0ba8896d2fd75 lib/irq_poll: Declare IRQ_POLL softirq vector as ksoftirqd-parking safe
>>> efa8027149a1f tick/rcu: Stop allowing RCU_SOFTIRQ in idle
>>> d338d22b9d338 tick/rcu: Remove obsolete rcu_needs_cpu() parameters
>>>
>>> Though it might be that these are victims of circumstance, in other
>>> words, that the original bug that Paul Menzel reported was caused by
>>> something else.
>>
>> Even without these three patches, the issue is reproducible. I tested
>> commit 7a935b7ac61b (tools/nolibc/stdlib: implement abort()).
>
> Ah, I thought you were saying that the issue was caused by them.
> I will put them back. And apologies to Frederic for kicking his
> patches out!

Sorry for being unclear.

> Are you able to bisect to see what commit introduced the problem?

I have not checked yet whether it is a regression. I am going to test it next week.

[…]

Kind regards,
Re: rcutorture’s init segfaults in ppc64le VM
Dear Michael,

On 11.02.22 at 15:19, Paul Menzel wrote:
> On 11.02.22 at 02:48, Michael Ellerman wrote:
>> Paul Menzel writes:
>>> On 08.02.22 at 11:09, Michael Ellerman wrote:
>>>> Paul Menzel writes:

[…]

>>>>> On the POWER8 server IBM S822LC running Ubuntu 21.10, building Linux
>>>>> 5.17-rc2+ with rcutorture tests
>>>>
>>>> I'm not sure if that's the host kernel version or the version you're
>>>> using of rcutorture? Can you tell us the sha1 of your host kernel and
>>>> of the tree you're running rcutorture from?
>>>
>>> The host system runs Linux 5.17-rc1+ started with kexec. Unfortunately,
>>> I am unable to find the exact sha1.
>>>
>>>     $ more /proc/version
>>>     Linux version 5.17.0-rc1+ (x...@eddb.molgen.mpg.de) (Ubuntu clang version 13.0.0-2, LLD 13.0.0) #1 SMP Fri Jan 28 17:13:04 CET 2022
>>
>> OK. In general rc1 kernels can have issues, so it might be worth
>> rebooting the host into either v5.17-rc3 or a distro or stable kernel.
>> Just to rule out any issues on the host.
>
> Yes, that was a good test. It works with Ubuntu’s 5.13 Linux kernel.
>
>     $ more /proc/version
>     Linux version 5.13.0-28-generic (buildd@bos02-ppc64el-013) (gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.37) #31-Ubuntu SMP Thu Jan 13 17:40:19 UTC 2022
>
> I have to do more tests, but it could be LLVM/clang related.

Building commit f1baf68e1383 (Merge tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net) with the ata patches on top with GCC, I am unable to reproduce the issue.

Previously, I built it with

    make -j100 LLVM=1 LLVM_IAS=0 bindeb-pkg

[…]

Kind regards,

Paul
Re: [PATCH v5 2/6] powerpc/kexec_file: Add KEXEC_SIG support.
Dear Michal,

On 09.02.22 at 13:01, Michal Suchánek wrote:
> On Wed, Feb 09, 2022 at 07:44:15AM +0100, Paul Menzel wrote:
>> On 11.01.22 at 12:37, Michal Suchanek wrote:

[…]

>> How can this be tested?
>
> Apparently KEXEC_SIG_FORCE is x86 only although the use of the option is
> arch neutral:
>
> arch/x86/Kconfig:config KEXEC_SIG_FORCE
> kernel/kexec_file.c:if (IS_ENABLED(CONFIG_KEXEC_SIG_FORCE)) {
>
> Maybe it should be moved?

Sounds good.

> I used a patched kernel that enables lockdown in secure boot, and then
> verified that signed kernel can be loaded by kexec and unsigned not,
> with KEXEC_SIG enabled and IMA_KEXEC disabled.
>
> The lockdown support can be enabled on any platform, and although I
> can't find it documented anywhere there appears to be code in kexec_file
> to take it into account:
>
> kernel/kexec.c: result = security_locked_down(LOCKDOWN_KEXEC);
> kernel/kexec_file.c:security_locked_down(LOCKDOWN_KEXEC))
> kernel/module.c:return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
> kernel/params.c:security_locked_down(LOCKDOWN_MODULE_PARAMETERS))
>
> and lockdown can be enabled with a buildtime option, a kernel parameter,
> or a debugfs file.
>
> Still for testing lifting KEXEC_SIG_FORCE to some arch-neutral Kconfig
> file is probably the simplest option.
>
> kexec -s option should be used to select kexec_file rather than the old
> style kexec which would either fail always or succeed always regardless
> of signature.

Thank you.

>>> Signed-off-by: Michal Suchanek
>>> ---
>>> v3: - Philipp Rudo : Update the commit message with explanation why
>>>       the s390 code is usable on powerpc.
>>>     - Include correct header for mod_check_sig
>>>     - Nayna : Mention additional IMA features in kconfig text
>>> ---
>>>  arch/powerpc/Kconfig| 16
>>>  arch/powerpc/kexec/elf_64.c | 36
>>>  2 files changed, 52 insertions(+)
>>>
>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>> index dea74d7717c0..1cde9b6c5987 100644
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -560,6 +560,22 @@ config KEXEC_FILE
>>>  config ARCH_HAS_KEXEC_PURGATORY
>>>  	def_bool KEXEC_FILE
>>>
>>> +config KEXEC_SIG
>>> +	bool "Verify kernel signature during kexec_file_load() syscall"
>>> +	depends on KEXEC_FILE && MODULE_SIG_FORMAT
>>> +	help
>>> +	  This option makes kernel signature verification mandatory for
>>> +	  the kexec_file_load() syscall.
>>> +
>>> +	  In addition to that option, you need to enable signature
>>> +	  verification for the corresponding kernel image type being
>>> +	  loaded in order for this to work.
>>> +
>>> +	  Note: on powerpc IMA_ARCH_POLICY also implements kexec'ed kernel
>>> +	  verification. In addition IMA adds kernel hashes to the measurement
>>> +	  list, extends IMA PCR in the TPM, and implements kernel image
>>> +	  blacklist by hash.
>>
>> So, what is the takeaway for the user? IMA_ARCH_POLICY is preferred? What
>> is the disadvantage, and two implementations(?) needed then? More
>> overhead?
>
> IMA_KEXEC does more than KEXEC_SIG. The overhead is probably not big
> unless you are trying to really minimize the kernel code size.
>
> Arguably the simpler implementation has less potential for bugs, too.
> Both in code and in user configuration required to enable the feature.
>
> Interestingly IMA_ARCH_POLICY depends on KEXEC_SIG rather than
> IMA_KEXEC. Just mind-boggling.

I have not looked into that.

> The main problem with IMA_KEXEC from my point of view is it is not
> portable. To record the measurements TPM support is required which is
> not available on all platforms. It does not support PE so it cannot be
> used on platforms that use PE kernel signature format.

Could you add that to the comment please?

>>> +
>>>  config RELOCATABLE
>>>  	bool "Build a relocatable kernel"
>>>  	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
>>> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
>>> index eeb258002d1e..98d1cb5135b4 100644
>>> --- a/arch/powerpc/kexec/elf_64.c
>>> +++ b/arch/powerpc/kexec/elf_64.c
>>> @@ -23,6 +23,7 @@
>>>  #include
>>>  #include
>>>  #include
>>> +#include
>>>
>>>  static void *elf64_load(struct kimage *image, char *kernel_buf,
>>>  			unsigned long kernel_len, char *initrd,
>>> @@ -151,7 +152,42 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
>>>  	return ret ? ERR_PTR(ret) : NULL;
>>>  }
>>>
>>> +#ifdef CONFIG_KEXEC_SIG
>>> +int elf64_verify_sig(const char *kernel, unsigned long kernel_len)
>>> +{
>>> +	const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
>>> +	struct module_signature *ms;
>>> +	unsigned long sig_len;
>>
>> Use size_t to match the signature of `verify_pkcs7_signature()`?
>
> Nope. struct module_signature uses unsigned long, and this needs to be
> matched to avoid type errors on 32bit.

I meant for `sig_len`.

> Technically using size_t for in-memory buffers is misguided because
> AFAICT no memory buffer can be bigger than ULONG_MAX, and
Re: rcutorture’s init segfaults in ppc64le VM
Dear Michael, Am 11.02.22 um 02:48 schrieb Michael Ellerman: Paul Menzel writes: Am 08.02.22 um 11:09 schrieb Michael Ellerman: Paul Menzel writes: […] On the POWER8 server IBM S822LC running Ubuntu 21.10, building Linux 5.17-rc2+ with rcutorture tests I'm not sure if that's the host kernel version or the version you're using of rcutorture? Can you tell us the sha1 of your host kernel and of the tree you're running rcutorture from? The host system runs Linux 5.17-rc1+ started with kexec. Unfortunately, I am unable to find the exact sha1. $ more /proc/version Linux version 5.17.0-rc1+ (x...@eddb.molgen.mpg.de) (Ubuntu clang version 13.0.0-2, LLD 13.0.0) #1 SMP Fri Jan 28 17:13:04 CET 2022 OK. In general rc1 kernels can have issues, so it might be worth rebooting the host into either v5.17-rc3 or a distro or stable kernel. Just to rule out any issues on the host. Yes, that was a good test. It works with Ubuntu’s 5.13 Linux kernel. $ more /proc/version Linux version 5.13.0-28-generic (buildd@bos02-ppc64el-013) (gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.37) #31-Ubuntu SMP Thu Jan 13 17:40:19 UTC 2022 I have to do more tests, but it could be LLVM/clang related. 
The Linux tree, from where I run rcutorture from, is at commit dfd42facf1e4 (Linux 5.17-rc3) with four patches on top: $ git log --oneline -6 207cec79e752 (HEAD -> master, origin/master, origin/HEAD) Problems with rcutorture on ppc64le: allmodconfig(2) and other failures 8c82f96fbe57 ata: libata-sata: improve sata_link_debounce() a447541d925f ata: libata-sata: remove debounce delay by default afd84e1eeafc ata: libata-sata: introduce struct sata_deb_timing f4caf7e48b75 ata: libata-sata: Simplify sata_link_resume() interface dfd42facf1e4 (tag: v5.17-rc3) Linux 5.17-rc3 $ tools/testing/selftests/rcutorture/bin/torture.sh --duration 10 the built init $ file tools/testing/selftests/rcutorture/initrd/init tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=0ded0e45649184a296f30d611f7a03cc51ecb616, for GNU/Linux 3.10.0, stripped Mine looks pretty much identical: $ file tools/testing/selftests/rcutorture/initrd/init tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=86078bf6e5d54ab0860d36aa9a65d52818b972c8, for GNU/Linux 3.10.0, stripped segfaults in QEMU. From one of the log files But mine doesn't segfault, it runs fine and the test completes. What qemu version are you using? I tried 4.2.1 and 6.2.0, both worked. $ qemu-system-ppc64le --version QEMU emulator version 6.0.0 (Debian 1:6.0+dfsg-2expubuntu1.1) Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers OK, that's one difference between our setups, but I'd be surprised if it explains this bug, but I guess anything's possible. /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-rcutorture/TREE03/console.log Sorry, that was the wrong path/test. 
The correct one for the excerpt below is: /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01/console.log (For TREE03, QEMU does not start the Linux kernel at all, that means no output after: Booting Linux via __start() @ 0x0040 ... OK yeah I see that too. Removing "threadirqs" from tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot seems to fix it. Nice find. I have no idea, what that means though. I still see some preempt related warnings, we clearly have some bugs with preempt enabled. You can now download the content of `/dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01` [1, 65 MB]. Can you reproduce the segmentation fault with the line below? $ qemu-system-ppc64 -enable-kvm -nographic -smp cores=1,threads=8 \ -net none -enable-kvm -M pseries -nodefaults -device spapr-vscsi -serial stdio -m 512 \ -kernel /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01/vmlinux \ -append "debug_boot_weak_hash panic=-1 console=ttyS0 \ torture.disable_onoff_at_boot locktorture.onoff_interval=3 \ locktorture.onoff_holdoff=30 locktorture.stat_interval=15 \ locktorture.shutdown_secs=60 locktorture.verbose=1" That works fine for me, boots and runs the test, then shuts down. I assume you see the segfault on every boot, not intermittently? So the differences between our setups are the host kernel and the qemu version. Can you try a different host kernel easily? The other thing would be to tr
Re: [PATCH v5 2/6] powerpc/kexec_file: Add KEXEC_SIG support.
Dear Michal,

Thank you for the patch.

On 11.01.22 at 12:37, Michal Suchanek wrote:

Could you please remove the dot/period at the end of the git commit message summary?

> Copy the code from s390x
>
> Both powerpc and s390x use appended signature format (as opposed to EFI
> based patforms using PE format).

patforms → platforms

How can this be tested?

> Signed-off-by: Michal Suchanek
> ---
> v3: - Philipp Rudo : Update the commit message with explanation why
>       the s390 code is usable on powerpc.
>     - Include correct header for mod_check_sig
>     - Nayna : Mention additional IMA features in kconfig text
> ---
>  arch/powerpc/Kconfig| 16
>  arch/powerpc/kexec/elf_64.c | 36
>  2 files changed, 52 insertions(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index dea74d7717c0..1cde9b6c5987 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -560,6 +560,22 @@ config KEXEC_FILE
>  config ARCH_HAS_KEXEC_PURGATORY
>  	def_bool KEXEC_FILE
>
> +config KEXEC_SIG
> +	bool "Verify kernel signature during kexec_file_load() syscall"
> +	depends on KEXEC_FILE && MODULE_SIG_FORMAT
> +	help
> +	  This option makes kernel signature verification mandatory for
> +	  the kexec_file_load() syscall.
> +
> +	  In addition to that option, you need to enable signature
> +	  verification for the corresponding kernel image type being
> +	  loaded in order for this to work.
> +
> +	  Note: on powerpc IMA_ARCH_POLICY also implements kexec'ed kernel
> +	  verification. In addition IMA adds kernel hashes to the measurement
> +	  list, extends IMA PCR in the TPM, and implements kernel image
> +	  blacklist by hash.

So, what is the takeaway for the user? IMA_ARCH_POLICY is preferred? What is the disadvantage, and two implementations(?) needed then? More overhead?

> +
>  config RELOCATABLE
>  	bool "Build a relocatable kernel"
>  	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
> index eeb258002d1e..98d1cb5135b4 100644
> --- a/arch/powerpc/kexec/elf_64.c
> +++ b/arch/powerpc/kexec/elf_64.c
> @@ -23,6 +23,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  static void *elf64_load(struct kimage *image, char *kernel_buf,
>  			unsigned long kernel_len, char *initrd,
> @@ -151,7 +152,42 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
>  	return ret ? ERR_PTR(ret) : NULL;
>  }
>
> +#ifdef CONFIG_KEXEC_SIG
> +int elf64_verify_sig(const char *kernel, unsigned long kernel_len)
> +{
> +	const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
> +	struct module_signature *ms;
> +	unsigned long sig_len;

Use size_t to match the signature of `verify_pkcs7_signature()`?

> +	int ret;
> +
> +	if (marker_len > kernel_len)
> +		return -EKEYREJECTED;
> +
> +	if (memcmp(kernel + kernel_len - marker_len, MODULE_SIG_STRING,
> +		   marker_len))
> +		return -EKEYREJECTED;
> +	kernel_len -= marker_len;
> +
> +	ms = (void *)kernel + kernel_len - sizeof(*ms);
> +	ret = mod_check_sig(ms, kernel_len, "kexec");
> +	if (ret)
> +		return ret;
> +
> +	sig_len = be32_to_cpu(ms->sig_len);
> +	kernel_len -= sizeof(*ms) + sig_len;
> +
> +	return verify_pkcs7_signature(kernel, kernel_len,
> +				      kernel + kernel_len, sig_len,
> +				      VERIFY_USE_PLATFORM_KEYRING,
> +				      VERIFYING_MODULE_SIGNATURE,
> +				      NULL, NULL);
> +}
> +#endif /* CONFIG_KEXEC_SIG */
> +
>  const struct kexec_file_ops kexec_elf64_ops = {
>  	.probe = kexec_elf_probe,
>  	.load = elf64_load,
> +#ifdef CONFIG_KEXEC_SIG
> +	.verify_sig = elf64_verify_sig,
> +#endif
>  };

Kind regards,

Paul
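The length arithmetic in `elf64_verify_sig()` can be followed in a small user-space sketch. It assumes the appended-signature layout used for kernel modules: payload, then the PKCS#7 blob, then a 12-byte `struct module_signature` whose last field is a big-endian 32-bit `sig_len`, then the marker string `~Module signature appended~\n`. The function name is a hypothetical helper; the sketch only splits the buffer and performs neither the field checks of `mod_check_sig()` nor any cryptographic verification.

```python
# Sketch only: split an image with an appended signature into payload and
# signature blob, mirroring the offsets computed in the quoted patch.
import struct

MODULE_SIG_STRING = b"~Module signature appended~\n"
# Assumed layout of struct module_signature: five u8 fields (algo, hash,
# id_type, signer_len, key_id_len), three pad bytes, big-endian u32 sig_len.
MODULE_SIGNATURE = struct.Struct(">5B3xI")


def split_appended_signature(image):
    """Return (payload, signature); raise ValueError on a malformed image."""
    if not image.endswith(MODULE_SIG_STRING):
        raise ValueError("signature marker not found")
    remaining = len(image) - len(MODULE_SIG_STRING)   # kernel_len -= marker_len
    if remaining < MODULE_SIGNATURE.size:
        raise ValueError("image too short for signature trailer")
    trailer_start = remaining - MODULE_SIGNATURE.size  # where 'ms' points
    *_, sig_len = MODULE_SIGNATURE.unpack_from(image, trailer_start)
    if sig_len > trailer_start:
        raise ValueError("sig_len larger than the remaining image")
    payload_len = trailer_start - sig_len  # kernel_len -= sizeof(*ms) + sig_len
    return image[:payload_len], image[payload_len:trailer_start]
```

The two returned slices correspond to the `(kernel, kernel_len)` and `(kernel + kernel_len, sig_len)` argument pairs that the patch passes to `verify_pkcs7_signature()`.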
[PATCH v2 2/2] lib/raid6: Include for `VPERMXOR`
On Ubuntu 21.10 (ppc64le) building `raid6test` with gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0 fails with the error below.

    gcc -I.. -I ../../../include -g -O2 -I../../../arch/powerpc/include -DCONFIG_ALTIVEC -c -o vpermxor1.o vpermxor1.c
    vpermxor1.c: In function ‘raid6_vpermxor1_gen_syndrome_real’:
    vpermxor1.c:64:29: error: expected string literal before ‘VPERMXOR’
       64 | asm(VPERMXOR(%0,%1,%2,%3):"=v"(wq0):"v"(gf_high), "v"(gf_low), "v"(wq0));
          | ^~~~
    make: *** [Makefile:58: vpermxor1.o] Error 1

So, include the header `asm/ppc-opcode.h`, which defines this macro, also when building only this tool and not the Linux kernel.

Cc: Matt Brown
Signed-off-by: Paul Menzel
---
v2: Resend

 lib/raid6/vpermxor.uc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/raid6/vpermxor.uc b/lib/raid6/vpermxor.uc
index 10475dc423c1..1bfb127fbfe8 100644
--- a/lib/raid6/vpermxor.uc
+++ b/lib/raid6/vpermxor.uc
@@ -24,9 +24,9 @@
 #ifdef CONFIG_ALTIVEC
 #include
+#include
 #ifdef __KERNEL__
 #include
-#include
 #include
 #endif
--
2.34.1
[PATCH v2 1/2] lib/raid6/test/Makefile: Use `$(pound)` instead of `\#` for Make 4.3
Building `raid6test` on Ubuntu 21.10 (ppc64le) with GNU Make 4.3 shows the errors below:

    $ cd lib/raid6/test/
    $ make
    :1:1: error: stray ‘\’ in program
    :1:2: error: stray ‘#’ in program
    :1:11: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘<’ token
    cp -f ../int.uc int.uc
    awk -f ../unroll.awk -vN=1 < int.uc > int1.c
    gcc -I.. -I ../../../include -g -O2 -c -o int1.o int1.c
    awk -f ../unroll.awk -vN=2 < int.uc > int2.c
    gcc -I.. -I ../../../include -g -O2 -c -o int2.o int2.c
    awk -f ../unroll.awk -vN=4 < int.uc > int4.c
    gcc -I.. -I ../../../include -g -O2 -c -o int4.o int4.c
    awk -f ../unroll.awk -vN=8 < int.uc > int8.c
    gcc -I.. -I ../../../include -g -O2 -c -o int8.o int8.c
    awk -f ../unroll.awk -vN=16 < int.uc > int16.c
    gcc -I.. -I ../../../include -g -O2 -c -o int16.o int16.c
    awk -f ../unroll.awk -vN=32 < int.uc > int32.c
    gcc -I.. -I ../../../include -g -O2 -c -o int32.o int32.c
    rm -f raid6.a
    ar cq raid6.a int1.o int2.o int4.o int8.o int16.o int32.o recov.o algos.o tables.o
    ranlib raid6.a
    gcc -I.. -I ../../../include -g -O2 -o raid6test test.c raid6.a
    /usr/bin/ld: raid6.a(algos.o):/dev/shm/linux/lib/raid6/test/algos.c:28: multiple definition of `raid6_call'; /scratch/local/ccIJjN8s.o:/dev/shm/linux/lib/raid6/test/test.c:22: first defined here
    collect2: error: ld returned 1 exit status
    make: *** [Makefile:72: raid6test] Error 1

The first three errors come from the `HAS_ALTIVEC` test, which fails, so the POWER optimized versions are not built. That’s also the reason nobody noticed on the other architectures.

GNU Make 4.3 does not remove the backslash anymore. From the 4.3 release announcement:

> * WARNING: Backward-incompatibility!
>   Number signs (#) appearing inside a macro reference or function invocation
>   no longer introduce comments and should not be escaped with backslashes:
>   thus a call such as:
>     foo := $(shell echo '#')
>   is legal. Previously the number sign needed to be escaped, for example:
>     foo := $(shell echo '\#')
>   Now this latter will resolve to "\#". If you want to write makefiles
>   portable to both versions, assign the number sign to a variable:
>     H := \#
>     foo := $(shell echo '$H')
>   This was claimed to be fixed in 3.81, but wasn't, for some reason.
>   To detect this change search for 'nocomment' in the .FEATURES variable.

So, do the same as commit 9564a8cf422d ("Kbuild: fix # escaping in .cmd files for future Make") and commit 929bef467771 ("bpf: Use $(pound) instead of \# in Makefiles") and define and use a `$(pound)` variable.

Reference for the change in make: https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Cc: Matt Brown
Signed-off-by: Paul Menzel
---
v2: Fix checkpatch.pl errors by adding missing quotes around git commit message summary/title.

 lib/raid6/test/Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/raid6/test/Makefile b/lib/raid6/test/Makefile
index a4c7cd74cff5..4fb7700a741b 100644
--- a/lib/raid6/test/Makefile
+++ b/lib/raid6/test/Makefile
@@ -4,6 +4,8 @@
 # from userspace.
 #

+pound := \#
+
 CC = gcc
 OPTFLAGS = -O2 # Adjust as desired
 CFLAGS = -I.. -I ../../../include -g $(OPTFLAGS)
@@ -42,7 +44,7 @@ else ifeq ($(HAS_NEON),yes)
 OBJS += neon.o neon1.o neon2.o neon4.o neon8.o recov_neon.o recov_neon_inner.o
 CFLAGS += -DCONFIG_KERNEL_MODE_NEON=1
 else
-HAS_ALTIVEC := $(shell printf '\#include \nvector int a;\n' |\
+HAS_ALTIVEC := $(shell printf '$(pound)include \nvector int a;\n' |\
  gcc -c -x c - >/dev/null && rm ./-.o && echo yes)
 ifeq ($(HAS_ALTIVEC),yes)
 CFLAGS += -I../../../arch/powerpc/include
--
2.34.1
Re: rcutorture’s init segfaults in ppc64le VM
[Correct sha1 for test for 2022.02.01-21.52.37] Am 08.02.22 um 13:12 schrieb Paul Menzel: Dear Michael, Thank you for looking into this. Am 08.02.22 um 11:09 schrieb Michael Ellerman: Paul Menzel writes: […] On the POWER8 server IBM S822LC running Ubuntu 21.10, building Linux 5.17-rc2+ with rcutorture tests I'm not sure if that's the host kernel version or the version you're using of rcutorture? Can you tell us the sha1 of your host kernel and of the tree you're running rcutorture from? The host system runs Linux 5.17-rc1+ started with kexec. Unfortunately, I am unable to find the exact sha1. $ more /proc/version Linux version 5.17.0-rc1+ (pmen...@flughafenberlinbrandenburgwillybrandt.molgen.mpg.de) (Ubuntu clang version 13.0.0-2, LLD 13.0.0) #1 SMP Fri Jan 28 17:13:04 CET 2022 The Linux tree, from where I run rcutorture from, is at commit dfd42facf1e4 (Linux 5.17-rc3) with four patches on top: $ git log --oneline -6 207cec79e752 (HEAD -> master, origin/master, origin/HEAD) Problems with rcutorture on ppc64le: allmodconfig(2) and other failures 8c82f96fbe57 ata: libata-sata: improve sata_link_debounce() a447541d925f ata: libata-sata: remove debounce delay by default afd84e1eeafc ata: libata-sata: introduce struct sata_deb_timing f4caf7e48b75 ata: libata-sata: Simplify sata_link_resume() interface dfd42facf1e4 (tag: v5.17-rc3) Linux 5.17-rc3 I was able to reproduce this with the above, but the report and the attached logs at the end are from: $ git log --oneline -6 b37a34a8cf5a b37a34a8cf5a Problems with rcutorture on ppc64le: allmodconfig(2) and other failures 9a78ddead89a ata: libata-sata: improve sata_link_debounce() 567da2eaf099 ata: libata-sata: remove debounce delay by default 70ae61851660 ata: libata-sata: introduce struct sata_deb_timing 9ebb6433d9c3 ata: libata-sata: Simplify sata_link_resume() interface 26291c54e111 (tag: v5.17-rc2) Linux 5.17-rc2 $ tools/testing/selftests/rcutorture/bin/torture.sh --duration 10 the built init $ file 
tools/testing/selftests/rcutorture/initrd/init tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=0ded0e45649184a296f30d611f7a03cc51ecb616, for GNU/Linux 3.10.0, stripped Mine looks pretty much identical: $ file tools/testing/selftests/rcutorture/initrd/init tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=86078bf6e5d54ab0860d36aa9a65d52818b972c8, for GNU/Linux 3.10.0, stripped segfaults in QEMU. From one of the log files But mine doesn't segfault, it runs fine and the test completes. What qemu version are you using? I tried 4.2.1 and 6.2.0, both worked. $ qemu-system-ppc64le --version QEMU emulator version 6.0.0 (Debian 1:6.0+dfsg-2expubuntu1.1) Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-rcutorture/TREE03/console.log Sorry, that was the wrong path/test. The correct one for the excerpt below is: /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01/console.log (For TREE03, QEMU does not start the Linux kernel at all, that means no output after: Booting Linux via __start() @ 0x0040 ... ) [ 1.119803][ T1] Run /init as init process [ 1.122011][ T1] init[1]: segfault (11) at f0656d90 nip 1a18 lr 0 code 1 in init[1000+d] [ 1.124863][ T1] init[1]: code: 2c2903e7 f9210030 4081ff84 4b58 0100 0580 3c40100f [ 1.128823][ T1] init[1]: code: 38427c00 7c290b78 782106e4 3800 7c0803a6 f801 e9028010 The disassembly from 3c40100f is: lis r2,4111 addi r2,r2,31744 mr r9,r1 rldicr r1,r1,0,59 li r0,0 stdu r1,-128(r1) <- fault mtlr r0 std r0,0(r1) ld r8,-32752(r2) I think you'll find that's the code at the ELF entry point. 
You can check with: $ readelf -e tools/testing/selftests/rcutorture/initrd/init | grep Entry Entry point address: 0x1c0c $ objdump -d tools/testing/selftests/rcutorture/initrd/init | grep -m 1 -A 8 1c0c 1c0c: 0e 10 40 3c lis r2,4110 1c10: 00 7b 42 38 addi r2,r2,31488 1c14: 78 0b 29 7c mr r9,r1 1c18: e4 06 21 78 rldicr r1,r1,0,59 1c1c: 00 00 00 38 li r0,0 1c20: 81 ff 21 f8 stdu r1,-128(r1) 1c24: a6 03 08 7c mtlr r0 1c28: 00 00 01 f8 std r0,0(r1) 1c2c: 10 80 02 e9 ld r8,-32752(r2) T
Re: rcutorture’s init segfaults in ppc64le VM
Dear Michael,

Thank you for looking into this.

On 08.02.22 at 11:09, Michael Ellerman wrote:
> Paul Menzel writes:

[…]

>> On the POWER8 server IBM S822LC running Ubuntu 21.10, building Linux
>> 5.17-rc2+ with rcutorture tests
>
> I'm not sure if that's the host kernel version or the version you're
> using of rcutorture? Can you tell us the sha1 of your host kernel and of
> the tree you're running rcutorture from?

The host system runs Linux 5.17-rc1+ started with kexec. Unfortunately, I am unable to find the exact sha1.

    $ more /proc/version
    Linux version 5.17.0-rc1+ (pmen...@flughafenberlinbrandenburgwillybrandt.molgen.mpg.de) (Ubuntu clang version 13.0.0-2, LLD 13.0.0) #1 SMP Fri Jan 28 17:13:04 CET 2022

The Linux tree from which I run rcutorture is at commit dfd42facf1e4 (Linux 5.17-rc3) with four patches on top:

    $ git log --oneline -6
    207cec79e752 (HEAD -> master, origin/master, origin/HEAD) Problems with rcutorture on ppc64le: allmodconfig(2) and other failures
    8c82f96fbe57 ata: libata-sata: improve sata_link_debounce()
    a447541d925f ata: libata-sata: remove debounce delay by default
    afd84e1eeafc ata: libata-sata: introduce struct sata_deb_timing
    f4caf7e48b75 ata: libata-sata: Simplify sata_link_resume() interface
    dfd42facf1e4 (tag: v5.17-rc3) Linux 5.17-rc3

>> $ tools/testing/selftests/rcutorture/bin/torture.sh --duration 10
>>
>> the built init
>>
>> $ file tools/testing/selftests/rcutorture/initrd/init
>> tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=0ded0e45649184a296f30d611f7a03cc51ecb616, for GNU/Linux 3.10.0, stripped
>
> Mine looks pretty much identical:
>
> $ file tools/testing/selftests/rcutorture/initrd/init
> tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=86078bf6e5d54ab0860d36aa9a65d52818b972c8, for GNU/Linux 3.10.0, stripped
>
>> segfaults in QEMU. From one of the log files
>
> But mine doesn't segfault, it runs fine and the test completes.
>
> What qemu version are you using? I tried 4.2.1 and 6.2.0, both worked.

    $ qemu-system-ppc64le --version
    QEMU emulator version 6.0.0 (Debian 1:6.0+dfsg-2expubuntu1.1)
    Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

>> /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-rcutorture/TREE03/console.log

Sorry, that was the wrong path/test. The correct one for the excerpt below is:

    /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01/console.log

(For TREE03, QEMU does not start the Linux kernel at all, that means there is no output after:

    Booting Linux via __start() @ 0x0040 ...

)

>> [1.119803][T1] Run /init as init process
>> [1.122011][T1] init[1]: segfault (11) at f0656d90 nip 1a18 lr 0 code 1 in init[1000+d]
>> [1.124863][T1] init[1]: code: 2c2903e7 f9210030 4081ff84 4b58 0100 0580 3c40100f
>> [1.128823][T1] init[1]: code: 38427c00 7c290b78 782106e4 3800 7c0803a6 f801 e9028010
>
> The disassembly from 3c40100f is:
>
>   lis r2,4111
>   addi r2,r2,31744
>   mr r9,r1
>   rldicr r1,r1,0,59
>   li r0,0
>   stdu r1,-128(r1) <- fault
>   mtlr r0
>   std r0,0(r1)
>   ld r8,-32752(r2)
>
> I think you'll find that's the code at the ELF entry point. You can check with:
>
> $ readelf -e tools/testing/selftests/rcutorture/initrd/init | grep Entry
>   Entry point address: 0x1c0c
>
> $ objdump -d tools/testing/selftests/rcutorture/initrd/init | grep -m 1 -A 8 1c0c
>   1c0c: 0e 10 40 3c lis r2,4110
>   1c10: 00 7b 42 38 addi r2,r2,31488
>   1c14: 78 0b 29 7c mr r9,r1
>   1c18: e4 06 21 78 rldicr r1,r1,0,59
>   1c1c: 00 00 00 38 li r0,0
>   1c20: 81 ff 21 f8 stdu r1,-128(r1)
>   1c24: a6 03 08 7c mtlr r0
>   1c28: 00 00 01 f8 std r0,0(r1)
>   1c2c: 10 80 02 e9 ld r8,-32752(r2)
>
> The fault you're seeing is the first store using the stack pointer (r1),
> which is set up by the kernel. The fault address f0656d90 is weirdly low,
> the stack should be up near 128TB. I'm not sure how we end up with a bad r1.
>
> Can you dump some info about the kernel that was built, something like:
>
> $ file /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-rcutorture/TREE03/vmlinux
>
> And maybe paste/attach the full log, maybe there's a clue somewhere.

You can now download the content of `/dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01` [1, 65 MB].
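As a cross-check of the disassembly quoted above, the faulting instruction word can be decoded by hand. The following is a minimal sketch, assuming the byte sequence `81 ff 21 f8` from the objdump listing and the DS-form field layout of the Power ISA (primary opcode 62, extended opcode 0 for `std` and 1 for `stdu`). The names `struct ds_insn`, `load_le32()` and `decode_ds()` are made up for illustration.

```c
#include <stdint.h>

/* Fields of a DS-form instruction such as std/stdu. */
struct ds_insn {
	unsigned opcode;	/* primary opcode, top six bits */
	unsigned rs;		/* source register */
	unsigned ra;		/* base register */
	unsigned xo;		/* extended opcode, low two bits */
	int32_t disp;		/* sign-extended byte displacement */
};

/* Assemble a 32-bit word from the little-endian bytes objdump prints. */
static uint32_t load_le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Split a DS-form instruction word into its fields. */
static struct ds_insn decode_ds(uint32_t insn)
{
	struct ds_insn d;
	/* The low two bits of the 16-bit field belong to xo, not to disp. */
	int32_t disp = (int32_t)(insn & 0xfffcu);

	if (disp & 0x8000)
		disp -= 0x10000;	/* sign-extend from 16 bits */

	d.opcode = insn >> 26;
	d.rs = (insn >> 21) & 0x1f;
	d.ra = (insn >> 16) & 0x1f;
	d.xo = insn & 0x3;
	d.disp = disp;
	return d;
}
```

Decoding `81 ff 21 f8` this way gives primary opcode 62 with extended opcode 1 (`stdu`), RS = RA = r1 and a displacement of -128, which matches the listing. Since the preceding `rldicr r1,r1,0,59` clears the low four bits of r1, the store address is the 16-byte-aligned stack pointer minus 128; that is consistent with a fault address whose last hex digit is zero (`...d90`).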
ppc64le: `NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!` when turning off SMT
Dear Linux folks,

On the POWER8 server IBM S822LC running Ubuntu 21.10, Linux 5.17-rc1+ built with

    $ grep HZ /boot/config-5.17.0-rc1+
    CONFIG_NO_HZ_COMMON=y
    # CONFIG_HZ_PERIODIC is not set
    CONFIG_NO_HZ_IDLE=y
    # CONFIG_NO_HZ_FULL is not set
    CONFIG_NO_HZ=y
    # CONFIG_HZ_100 is not set
    CONFIG_HZ_250=y
    # CONFIG_HZ_300 is not set
    # CONFIG_HZ_1000 is not set
    CONFIG_HZ=250

once warned about a NOHZ tick-stop error, when I executed `sudo /usr/sbin/ppc64_cpu --smt=off` (so that KVM would work).

```
$ dmesg
[0.00] Linux version 5.17.0-rc1+ (pmen...@flughafenberlinbrandenburgwillybrandt.molgen.mpg.de) (Ubuntu clang version 13.0.0-2, LLD 13.0.0) #1 SMP Fri Jan 28 17:13:04 CET 2022
[…]
[271272.030262] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271272.305726] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271272.549790] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271274.885167] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271275.113896] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271275.412902] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271275.625245] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271275.833107] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271276.041391] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
[271277.244880] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #20!!!
```

Kind regards,

Paul
ppc64le: rcutorture warns about improperly set `CONFIG_HYPERVISOR_GUEST` and `CONFIG_PARAVIRT`
Dear Sebastian, dear Paul,

In commit a6fda6dab9 (rcutorture: Tweak kvm options), `tools/testing/selftests/rcutorture/configs/rcu/CFcommon` was extended by the three selections below:

    CONFIG_HYPERVISOR_GUEST=y
    CONFIG_PARAVIRT=y
    CONFIG_KVM_GUEST=y

Unfortunately, `CONFIG_HYPERVISOR_GUEST` is x86 specific and `CONFIG_PARAVIRT` is only available on x86 and ARM. Thus, when running the tests on a ppc64le system (POWER8 IBM S822LC), the script shows the warnings below:

    :CONFIG_HYPERVISOR_GUEST=y: improperly set
    :CONFIG_PARAVIRT=y: improperly set

Do you have a way to work around that?

Kind regards,

Paul
rcutorture’s init segfaults in ppc64le VM
Dear Linux folks, On the POWER8 server IBM S822LC running Ubuntu 21.10, building Linux 5.17-rc2+ with rcutorture tests $ tools/testing/selftests/rcutorture/bin/torture.sh --duration 10 the built init $ file tools/testing/selftests/rcutorture/initrd/init tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=0ded0e45649184a296f30d611f7a03cc51ecb616, for GNU/Linux 3.10.0, stripped segfaults in QEMU. From one of the log files /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-rcutorture/TREE03/console.log [1.119803][T1] Run /init as init process [1.122011][T1] init[1]: segfault (11) at f0656d90 nip 1a18 lr 0 code 1 in init[1000+d] [1.124863][T1] init[1]: code: 2c2903e7 f9210030 4081ff84 4b58 0100 0580 3c40100f [1.128823][T1] init[1]: code: 38427c00 7c290b78 782106e4 3800 7c0803a6 f801 e9028010 Executing the init, which just seems to be an endless loop, from userspace work: $ strace ./tools/testing/selftests/rcutorture/initrd/init execve("./tools/testing/selftests/rcutorture/initrd/init", ["./tools/testing/selftests/rcutor"...], 0x7db9e860 /* 31 vars */) = 0 brk(NULL) = 0x1001d94 brk(0x1001d940b98) = 0x1001d940b98 set_tid_address(0x1001d9400d0) = 2890832 set_robust_list(0x1001d9400e0, 24) = 0 uname({sysname="Linux", nodename="flughafenberlinbrandenburgwillybrandt.molgen.mpg.de", ...}) = 0 prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 readlink("/proc/self/exe", "/dev/shm/linux/tools/testing/sel"..., 4096) = 61 getrandom("\xf1\x30\x4c\x9e\x82\x8d\x26\xd7", 8, GRND_NONBLOCK) = 8 brk(0x1001d970b98) = 0x1001d970b98 brk(0x1001d98) = 0x1001d98 mprotect(0x100e, 65536, PROT_READ) = 0 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, 0x7b22c8a8) = 0 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, 0x7b22c8a8) = 0 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, ^C{tv_sec=0, 
tv_nsec=872674044}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) strace: Process 2890832 detached Any ideas, what `mkinitrd.sh` [2] should do differently? ``` cat > init.c << '___EOF___' #ifndef NOLIBC #include #include #endif volatile unsigned long delaycount; int main(int argc, int argv[]) { int i; struct timeval tv; struct timeval tvb; for (;;) { sleep(1); /* Need some userspace time. */ if (gettimeofday(&tvb, NULL)) continue; do { for (i = 0; i < 1000 * 100; i++) delaycount = i * i; if (gettimeofday(&tv, NULL)) break; tv.tv_sec -= tvb.tv_sec; if (tv.tv_sec > 1) break; tv.tv_usec += tv.tv_sec * 1000 * 1000; tv.tv_usec -= tvb.tv_usec; } while (tv.tv_usec < 1000); } return 0; } ___EOF___ # build using nolibc on supported archs (smaller executable) and fall # back to regular glibc on other ones. if echo -e "#if __x86_64__||__i386__||__i486__||__i586__||__i686__" \ "||__ARM_EABI__||__aarch64__\nyes\n#endif" \ | ${CROSS_COMPILE}gcc -E -nostdlib -xc - \ | grep -q '^yes'; then # architecture supported by nolibc ${CROSS_COMPILE}gcc -fno-asynchronous-unwind-tables -fno-ident \ -nostdlib -include ../../../../include/nolibc/nolibc.h \ -s -static -Os -o init init.c -lgcc else ${CROSS_COMPILE}gcc -s -static -Os -o init init.c fi ``` Kind regards, Paul [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/rcutorture/doc/initrd.txt [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
Re: [PATCH 1/3] lib/raid6/test/Makefile: Use `$(pound)` instead of `\#` for Make 4.3
Dear David,

On 26.01.22 at 13:06, David Laight wrote:
> From: Paul Menzel
>> Sent: 26 January 2022 11:42
> ..
>> +pound := \#
>
> Please use 'hash' not 'pound'.
> Only american greengrocers use that horrid name.
> A 'pound' is '£'.

Sure, I can change that, if you send a patch cleaning this up for the other files already using that in the tree? ;-) Or can it be different all over the Linux code base?

Kind regards,

Paul

PS:

> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)

If you care, the standard signature delimiter is `-- ` [1].

[1]: https://en.wikipedia.org/wiki/Signature_block#Standard_delimiter
[PATCH 3/3] lib/raid6/test: Rename variable to avoid `raid6_call` name clash
On Ubuntu 21.10 (ppc64le) building `raid6test` with gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0 fails with the error below. $ cd lib/raid6/test $ make […] gcc -I.. -I ../../../include -g -O2 -I../../../arch/powerpc/include -DCONFIG_ALTIVEC -o raid6test test.c raid6.a /usr/bin/ld: raid6.a(algos.o):/dev/shm/linux/lib/raid6/test/algos.c:28: multiple definition of `raid6_call'; /scratch/local/ccHnUnID.o:/dev/shm/linux/lib/raid6/test/test.c:22: first defined here collect2: error: ld returned 1 exit status make: *** [Makefile:74: raid6test] Error 1 Renaming the variable in `test.c` to `raid6_call2` works around that. The resulting binary terminates with a segmentation fault: $ ./raid6test using recovery intx1 Segmentation fault (core dumped) $ dmesg | tail -3 [75519.758988] raid6test[1891185]: segfault (11) at 0 nip 0 lr 708aa3fe197c code 1 in libc.so.6[708aa3ca+26] [75519.759006] raid6test[1891185]: code: [75519.759024] raid6test[1891185]: code: Cc: Matt Brown Signed-off-by: Paul Menzel --- lib/raid6/test/test.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/lib/raid6/test/test.c b/lib/raid6/test/test.c index a3cf071941ab..937d2a8bb294 100644 --- a/lib/raid6/test/test.c +++ b/lib/raid6/test/test.c @@ -19,7 +19,7 @@ #define NDISKS 16 /* Including P and Q */ const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); -struct raid6_calls raid6_call; +struct raid6_calls raid6_call2; char *dataptrs[NDISKS]; char data[NDISKS][PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); @@ -71,7 +71,7 @@ static int test_disks(int i, int j) erra = errb = 0; } else { printf("algo=%-8s faila=%3d(%c) failb=%3d(%c) %s\n", - raid6_call.name, + raid6_call2.name, i, disk_type(i), j, disk_type(j), (!erra && !errb) ? 
"OK" : @@ -107,30 +107,30 @@ int main(int argc, char *argv[]) if ((*algo)->valid && !(*algo)->valid()) continue; - raid6_call = **algo; + raid6_call2 = **algo; /* Nuke syndromes */ memset(data[NDISKS-2], 0xee, 2*PAGE_SIZE); /* Generate assumed good syndrome */ - raid6_call.gen_syndrome(NDISKS, PAGE_SIZE, + raid6_call2.gen_syndrome(NDISKS, PAGE_SIZE, (void **)&dataptrs); for (i = 0; i < NDISKS-1; i++) for (j = i+1; j < NDISKS; j++) err += test_disks(i, j); - if (!raid6_call.xor_syndrome) + if (!raid6_call2.xor_syndrome) continue; for (p1 = 0; p1 < NDISKS-2; p1++) for (p2 = p1; p2 < NDISKS-2; p2++) { /* Simulate rmw run */ - raid6_call.xor_syndrome(NDISKS, p1, p2, PAGE_SIZE, + raid6_call2.xor_syndrome(NDISKS, p1, p2, PAGE_SIZE, (void **)&dataptrs); makedata(p1, p2); - raid6_call.xor_syndrome(NDISKS, p1, p2, PAGE_SIZE, + raid6_call2.xor_syndrome(NDISKS, p1, p2, PAGE_SIZE, (void **)&dataptrs); for (i = 0; i < NDISKS-1; i++) -- 2.34.1
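The link error suggests that both `algos.c` and `test.c` define an object named `raid6_call`. A likely reason this now fails, assuming GCC 10 or later where `-fno-common` is the default, is that two uninitialised definitions are no longer merged into one common symbol. One conventional alternative to renaming, sketched below with hypothetical names (`struct demo_calls`, `demo_call`, `select_algo()`), is to keep a single definition and let every other user see only an `extern` declaration. This is not the approach taken in the patch, only an illustration of the declaration/definition split.

```c
#include <string.h>

/* Hypothetical stand-in for struct raid6_calls; only the name is modelled. */
struct demo_calls {
	const char *name;
};

/* What a shared header would carry: a declaration, which reserves no storage. */
extern struct demo_calls demo_call;

/* What exactly one translation unit would carry: the definition. */
struct demo_calls demo_call;

/* Example algorithm descriptor, comparable to one entry of an algorithm table. */
static const struct demo_calls demo_algo = { "intx1" };

/* Copy the chosen descriptor into the single shared object and report its name. */
static const char *select_algo(const struct demo_calls *algo)
{
	demo_call = *algo;
	return demo_call.name;
}
```

With this split, a test program would include the header and assign to the shared object, rather than defining a second object of the same name.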
[PATCH 2/3] lib/raid6: Include for `VPERMXOR`
On Ubuntu 21.10 (ppc64le) building `raid6test` with gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0 fails with the error below.

    gcc -I.. -I ../../../include -g -O2 -I../../../arch/powerpc/include -DCONFIG_ALTIVEC -c -o vpermxor1.o vpermxor1.c
    vpermxor1.c: In function ‘raid6_vpermxor1_gen_syndrome_real’:
    vpermxor1.c:64:29: error: expected string literal before ‘VPERMXOR’
       64 | asm(VPERMXOR(%0,%1,%2,%3):"=v"(wq0):"v"(gf_high), "v"(gf_low), "v"(wq0));
          | ^~~~
    make: *** [Makefile:58: vpermxor1.o] Error 1

So, include the header `asm/ppc-opcode.h`, which defines this macro, also when building only this tool and not the Linux kernel.

Cc: Matt Brown
Signed-off-by: Paul Menzel
---
 lib/raid6/vpermxor.uc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/raid6/vpermxor.uc b/lib/raid6/vpermxor.uc
index 10475dc423c1..1bfb127fbfe8 100644
--- a/lib/raid6/vpermxor.uc
+++ b/lib/raid6/vpermxor.uc
@@ -24,9 +24,9 @@
 #ifdef CONFIG_ALTIVEC
 #include
+#include
 #ifdef __KERNEL__
 #include
-#include
 #include
 #endif
--
2.34.1
[PATCH 1/3] lib/raid6/test/Makefile: Use `$(pound)` instead of `\#` for Make 4.3
Building `raid6test` on Ubuntu 21.10 (ppc64le) with GNU Make 4.3 shows the errors below:

    $ cd lib/raid6/test/
    $ make
    :1:1: error: stray ‘\’ in program
    :1:2: error: stray ‘#’ in program
    :1:11: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘<’ token
    cp -f ../int.uc int.uc
    awk -f ../unroll.awk -vN=1 < int.uc > int1.c
    gcc -I.. -I ../../../include -g -O2 -c -o int1.o int1.c
    awk -f ../unroll.awk -vN=2 < int.uc > int2.c
    gcc -I.. -I ../../../include -g -O2 -c -o int2.o int2.c
    awk -f ../unroll.awk -vN=4 < int.uc > int4.c
    gcc -I.. -I ../../../include -g -O2 -c -o int4.o int4.c
    awk -f ../unroll.awk -vN=8 < int.uc > int8.c
    gcc -I.. -I ../../../include -g -O2 -c -o int8.o int8.c
    awk -f ../unroll.awk -vN=16 < int.uc > int16.c
    gcc -I.. -I ../../../include -g -O2 -c -o int16.o int16.c
    awk -f ../unroll.awk -vN=32 < int.uc > int32.c
    gcc -I.. -I ../../../include -g -O2 -c -o int32.o int32.c
    rm -f raid6.a
    ar cq raid6.a int1.o int2.o int4.o int8.o int16.o int32.o recov.o algos.o tables.o
    ranlib raid6.a
    gcc -I.. -I ../../../include -g -O2 -o raid6test test.c raid6.a
    /usr/bin/ld: raid6.a(algos.o):/dev/shm/linux/lib/raid6/test/algos.c:28: multiple definition of `raid6_call'; /scratch/local/ccIJjN8s.o:/dev/shm/linux/lib/raid6/test/test.c:22: first defined here
    collect2: error: ld returned 1 exit status
    make: *** [Makefile:72: raid6test] Error 1

The first three errors come from the `HAS_ALTIVEC` test, which fails, so the POWER optimized versions are not built. That’s also the reason nobody noticed on the other architectures.

GNU Make 4.3 does not remove the backslash anymore. From the 4.3 release announcement:

> * WARNING: Backward-incompatibility!
>   Number signs (#) appearing inside a macro reference or function invocation
>   no longer introduce comments and should not be escaped with backslashes:
>   thus a call such as:
>     foo := $(shell echo '#')
>   is legal. Previously the number sign needed to be escaped, for example:
>     foo := $(shell echo '\#')
>   Now this latter will resolve to "\#". If you want to write makefiles
>   portable to both versions, assign the number sign to a variable:
>     H := \#
>     foo := $(shell echo '$H')
>   This was claimed to be fixed in 3.81, but wasn't, for some reason.
>   To detect this change search for 'nocomment' in the .FEATURES variable.

So, do the same as commit 9564a8cf422d (Kbuild: fix # escaping in .cmd files for future Make) and commit 929bef467771 (bpf: Use $(pound) instead of \# in Makefiles) and define and use a `$(pound)` variable.

Reference for the change in make: https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Cc: Matt Brown
Signed-off-by: Paul Menzel
---
 lib/raid6/test/Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/raid6/test/Makefile b/lib/raid6/test/Makefile
index a4c7cd74cff5..4fb7700a741b 100644
--- a/lib/raid6/test/Makefile
+++ b/lib/raid6/test/Makefile
@@ -4,6 +4,8 @@
 # from userspace.
 #

+pound := \#
+
 CC = gcc
 OPTFLAGS = -O2 # Adjust as desired
 CFLAGS = -I.. -I ../../../include -g $(OPTFLAGS)
@@ -42,7 +44,7 @@ else ifeq ($(HAS_NEON),yes)
 OBJS += neon.o neon1.o neon2.o neon4.o neon8.o recov_neon.o recov_neon_inner.o
 CFLAGS += -DCONFIG_KERNEL_MODE_NEON=1
 else
-HAS_ALTIVEC := $(shell printf '\#include \nvector int a;\n' |\
+HAS_ALTIVEC := $(shell printf '$(pound)include \nvector int a;\n' |\
  gcc -c -x c - >/dev/null && rm ./-.o && echo yes)
 ifeq ($(HAS_ALTIVEC),yes)
 CFLAGS += -I../../../arch/powerpc/include
--
2.34.1
Re: ppc64le: BUG: Kernel NULL pointer dereference on write at 0x00000000
Dear Paul, Am 25.01.22 um 21:33 schrieb Paul E. McKenney: On Mon, Jan 24, 2022 at 07:48:59PM +0100, Paul Menzel wrote: Building Linux 5.17-rc1+ (dd81e1c7d5fb) under Ubuntu 21.04 with CONFIG_TORTURE_TEST=y CONFIG_RCU_TORTURE_TEST=y and $ clang --version Ubuntu clang version 12.0.0-3ubuntu1~21.04.2 Target: powerpc64le-unknown-linux-gnu Thread model: posix InstalledDir: /usr/bin $ make -j100 LLVM=1 LLVM_IAS=0 bindeb-pkg and booting it on an IBM S822LC, it runs into a null pointer dereference. (Please note, the log below is captured using the Serial over LAN (SOL) interface, and I hope it’s all correct and there is no garbage due to overridden lines or characters.): ``` [0.037960][T1] rcu: rcu_spawn_gp_kthread(): Limited prio to 2 from 1 [0.399013][T1] BUG: using smp_processor_id() in preemptible [] code: swapper/0/1 [0.399527][T1] BUG: using smp_processor_id() in preemptible [] code: swapper/0/1 [0.400063][T1] BUG: using smp_processor_id() in preemptible [] code: swapper/0/1 [0.702349][T1] BUG: using smp_processor_id() in preemptible [] code: swapper/0/1 [0.702928][T1] rcu-torture:--- Start of test: nreaders=15 nfakewriters=4 stat_interval=60 verbose=1 test_n_idle_hz=1 sohuffl_eintrvale= 3stut 0.705260][T1] rcu-torture: Creating rcu_torture_fakewriter task [0.705265][ T145] rcu-torture: rcu_torture_fakewriter task started [0.705447][T1] rcu-torture: Creating rcu_torture_reader task [0.705449][ T146] rcu-torture: rcu_torture_fakewriter task started [0.705653][T1] rcu-torture: Creating rcu_torture_reader task [0.705656][ T149] rcu-torture: rcu_torture_reader task started [0.705691][T1] rcu-torture: Creating rcu_torture_reader task [0.705694][ T150] rcu-torture: rcu_torture_reader task started [0.705741][T1] rcu-torture: Creating rcu_torture_reader task [0.705745][ T151] rcu-torture: rcu_torture_reader task started [0.705927][T1] rcu-torture: Creating rcu_torture_reader task [0.705930][ T152] rcu-torture: rcu_torture_reader task started [0.705981][T1] rcu-torture: 
Creating rcu_torture_reader task [0.705986][ T153] rcu-torture: rcu_torture_reader task started [0.706110][T1] rcu-torture: Creating rcu_torture_reader task [0.706114][ T154] rcu-torture: rcu_torture_reader task started [0.706248][T1] rcu-torture: Creating rcu_torture_reader task [0.706251][ T155] rcu-torture: rcu_torture_reader task started [0.706320][T1] rcu-torture: Creating rcu_torture_reader task [0.706327][ T156] rcu-torture: rcu_torture_reader task started [0.706385][T1] rcu-torture: Creating rcu_torture_reader task [0.706389][ T157] rcu-torture: rcu_torture_reader task started [0.706485][T1] rcu-torture: Creating rcu_torture_reader task [0.706492][ T158] rcu-torture: rcu_torture_reader task started [0.706560][T1] rcu-torture: Creating rcu_torture_reader task [0.706567][ T159] rcu-torture: rcu_torture_reader task started [0.706628][T1] rcu-torture: Creating rcu_torture_reader task [0.706633][ T160] rcu-torture: rcu_torture_reader task started [0.706705][T1] rcu-torture: Creating rcu_torture_reader task [0.706713][ T161] rcu-torture: rcu_torture_reader task started [0.706797][T1] rcu-torture: Creating rcu_torture_reader task [0.706862][ T162] rcu-torture: rcu_torture_reader task started [0.710471][T1] rcu-torture: Creating rcu_torture_stats task [0.710475][ T163] rcu-torture: rcu_torture_reader task started [0.710516][T1] rcu-torture: Creating torture_shuffle task [0.710525][ T166] rcu-torture: rcu_torture_stats task started [0.710607][T1] rcu-torture: Creating torture_stutter task [0.710609][ T167] rcu-torture: torture_shuffle task started [0.710657][ T168] rcu-torture: torture_stutter task started [0.710662][ T18] rcu-torture: Creating rcu_torture_boost task [0.710722][ T169] rcu-torture: rcu_torture_boost started [0.710726][ T19] rcu-torture: Creating rcu_torture_boost task [0.710782][ T170] rcu-torture: rcu_torture_boost started [0.710787][ T25] rcu-torture: Creating rcu_torture_boost task [0.710840][ T171] rcu-torture: rcu_torture_boost started 
[0.710846][ T32] rcu-torture: Creating rcu_torture_boost task [0.710900][ T172] rcu-torture: rcu_torture_boost started [0.710906][ T38] rcu-torture: Creating rcu_torture_boost task [0.710992][ T173] rcu-torture: rcu_torture_boost started [0.710997][ T45] rcu-torture: Creating rcu_torture_boost task [0.711059][ T174] rcu-torture: rcu_torture_boost started [0.711065][ T51] rcu-torture: Creating rcu_torture_boost task [0.711126][ T175] rcu-torture: rcu_torture_boost started [0.711132][ T58] rcu-torture: Creating rcu_torture_boost task
Re: [PATCH 1/2] firmware: include drivers/firmware/Kconfig unconditionally
[Cc: +linuxppc-dev@lists.ozlabs.org]

Dear Arnd,

On 28.09.21 at 09:50, Arnd Bergmann wrote:

From: Arnd Bergmann

Compile-testing drivers that require access to a firmware layer fails when that firmware symbol is unavailable. This happened twice this week:

- My proposed change to rework the QCOM_SCM firmware symbol broke on ppc64 and others.
- The cs_dsp firmware patch added a device-specific firmware loader into drivers/firmware, which broke on the same set of architectures.

We should probably do the same thing for other subsystems as well, but fix this one first as this is a dependency for other patches getting merged.

Cc: Mark Brown
Cc: Liam Girdwood
Cc: Charles Keepax
Cc: Simon Trimmer
Cc: Arnd Bergmann
Cc: Michael Ellerman
Signed-off-by: Arnd Bergmann
---
Not sure how we'd want to merge this patch, if two other things need it. I'd prefer to merge it along with the QCOM_SCM change through the soc tree, but that leaves the cirrus firmware broken unless we also merge it the same way (rather than through ASoC as it is now).

Alternatively, we can try to find a different home for the Cirrus firmware to decouple the two problems. I'd argue that it's actually misplaced here, as drivers/firmware is meant for kernel code that interfaces with system firmware, not for device drivers to load their own firmware blobs from user space.
--- arch/arm/Kconfig| 2 -- arch/arm64/Kconfig | 2 -- arch/ia64/Kconfig | 2 -- arch/mips/Kconfig | 2 -- arch/parisc/Kconfig | 2 -- arch/riscv/Kconfig | 2 -- arch/x86/Kconfig| 2 -- drivers/Kconfig | 2 ++ 8 files changed, 2 insertions(+), 14 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index ad96f3dd7e83..194d10bbff9e 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1993,8 +1993,6 @@ config ARCH_HIBERNATION_POSSIBLE endmenu -source "drivers/firmware/Kconfig" - if CRYPTO source "arch/arm/crypto/Kconfig" endif diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index ebb49585a63f..8749517482ae 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1931,8 +1931,6 @@ source "drivers/cpufreq/Kconfig" endmenu -source "drivers/firmware/Kconfig" - source "drivers/acpi/Kconfig" source "arch/arm64/kvm/Kconfig" diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 045792cde481..1e33666fa679 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -388,8 +388,6 @@ config CRASH_DUMP help Generate crash dump after being started by kexec. 
-source "drivers/firmware/Kconfig" - endmenu menu "Power management and ACPI options" diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 771ca53af06d..6b8f591c5054 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -3316,8 +3316,6 @@ source "drivers/cpuidle/Kconfig" endmenu -source "drivers/firmware/Kconfig" - source "arch/mips/kvm/Kconfig" source "arch/mips/vdso/Kconfig" diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig index 4742b6f169b7..27a8b49af11f 100644 --- a/arch/parisc/Kconfig +++ b/arch/parisc/Kconfig @@ -384,6 +384,4 @@ config KEXEC_FILE endmenu -source "drivers/firmware/Kconfig" - source "drivers/parisc/Kconfig" diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 301a54233c7e..6a6fa9e976d5 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -561,5 +561,3 @@ menu "Power management options" source "kernel/power/Kconfig" endmenu - -source "drivers/firmware/Kconfig" diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index e5ba8afd29a0..5dcec5f13a82 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2834,8 +2834,6 @@ config HAVE_ATOMIC_IOMAP def_bool y depends on X86_32 -source "drivers/firmware/Kconfig" - source "arch/x86/kvm/Kconfig" source "arch/x86/Kconfig.assembler" diff --git a/drivers/Kconfig b/drivers/Kconfig index 30d2db37cc87..0d399ddaa185 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -17,6 +17,8 @@ source "drivers/bus/Kconfig" source "drivers/connector/Kconfig" +source "drivers/firmware/Kconfig" + source "drivers/gnss/Kconfig" source "drivers/mtd/Kconfig" With this change, I have the new entries below in my .config: ``` $ diff -u .config.old .config --- .config.old 2021-10-07 11:38:39.54400 +0200 +++ .config 2021-10-09 10:02:03.15600 +0200 @@ -1992,6 +1992,25 @@ CONFIG_CONNECTOR=y CONFIG_PROC_EVENTS=y + +# +# Firmware Drivers +# + +# +# ARM System Control and Management Interface Protocol +# +# end of ARM System Control and Management Interface Protocol + +# CONFIG_FIRMWARE_MEMMAP is not set +# 
CONFIG_GOOGLE_FIRMWARE is not set + +# +# Tegra firmware driver +# +# end of Tegra firmware driver +# end of Firmware Drivers + # CONFIG_GNSS is not set CONFIG_MTD=m # CONFIG_MTD_TESTS is not set ``` No idea if the entries could be hidden for platforms not supporting them. ARM System Control and Management Interface Protocol [
Re: [PATCH v3] lib/zlib_inflate/inffast: Check config in C to avoid unused function warning
Dear Linux folks, Am 20.09.21 um 17:45 schrieb Nathan Chancellor: On Mon, Sep 20, 2021 at 10:43:33AM +0200, Paul Menzel wrote: Building Linux for ppc64le with Ubuntu clang version 12.0.0-3ubuntu1~21.04.1 shows the warning below. arch/powerpc/boot/inffast.c:20:1: warning: unused function 'get_unaligned16' [-Wunused-function] get_unaligned16(const unsigned short *p) ^ 1 warning generated. Fix it, by moving the check from the preprocessor to C, so the compiler sees the use. Signed-off-by: Paul Menzel Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor --- v2: Use IS_ENABLED v3: Use if statement over ternary operator as requested by Christophe lib/zlib_inflate/inffast.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/lib/zlib_inflate/inffast.c b/lib/zlib_inflate/inffast.c index f19c4fbe1be7..2843f9bb42ac 100644 --- a/lib/zlib_inflate/inffast.c +++ b/lib/zlib_inflate/inffast.c @@ -253,13 +253,12 @@ void inflate_fast(z_streamp strm, unsigned start) sfrom = (unsigned short *)(from); loops = len >> 1; - do -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS - *sout++ = *sfrom++; -#else - *sout++ = get_unaligned16(sfrom++); -#endif - while (--loops); + do { + if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) + *sout++ = *sfrom++; + else + *sout++ = get_unaligned16(sfrom++); + } while (--loops); out = (unsigned char *)sout; from = (unsigned char *)sfrom; } else { /* dist == 1 or dist == 2 */ -- 2.33.0 Just for the record, I compared both object files by running `objdump -d`, and the result is the same. The binary differed (`vbindiff`), but I guess this is due to the increased revision (`make bindeb-pkg`). without a change (Linus’ current master): 0B50: 00 00 00 00 00 00 00 00 1F 01 00 00 36 00 00 00 6... ^ 0B60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0B70: 00 00 00 00 00 00 00 00 29 01 00 00 32 00 00 00 )...2... ^ v2 (ternary operator): 0B50: 00 00 00 00 00 00 00 00 1C 01 00 00 36 00 00 00 6... 
0B60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0B70: 00 00 00 00 00 00 00 00 26 01 00 00 32 00 00 00 &...2... v3 (if-else statement): 0B50: 00 00 00 00 00 00 00 00 1E 01 00 00 36 00 00 00 6... 0B60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0B70: 00 00 00 00 00 00 00 00 28 01 00 00 32 00 00 00 (...2... Kind regards, Paul
Re: [PATCH v2] lib/zlib_inflate/inffast: Check config in C to avoid unused function warning
Dear Christophe, Thank you for the review. Am 20.09.21 um 10:36 schrieb Christophe Leroy: Le 20/09/2021 à 09:46, Paul Menzel a écrit : Building Linux for ppc64le with Ubuntu clang version 12.0.0-3ubuntu1~21.04.1 shows the warning below. arch/powerpc/boot/inffast.c:20:1: warning: unused function 'get_unaligned16' [-Wunused-function] get_unaligned16(const unsigned short *p) ^ 1 warning generated. Fix it, by moving the check from the preprocessor to C, so the compiler sees the use. Signed-off-by: Paul Menzel --- lib/zlib_inflate/inffast.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/lib/zlib_inflate/inffast.c b/lib/zlib_inflate/inffast.c index f19c4fbe1be7..fb87a3120f0f 100644 --- a/lib/zlib_inflate/inffast.c +++ b/lib/zlib_inflate/inffast.c @@ -254,11 +254,8 @@ void inflate_fast(z_streamp strm, unsigned start) sfrom = (unsigned short *)(from); loops = len >> 1; do -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS - *sout++ = *sfrom++; -#else - *sout++ = get_unaligned16(sfrom++); -#endif + *sout++ = IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) ? + *sfrom++ : get_unaligned16(sfrom++); I think it would be more readable as do { if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) *sout++ = *sfrom++; else *sout++ = get_unaligned16(sfrom++); } while (--loops); I prefer the ternary operator, as it’s less lines, and it’s clear, that only the variable assignment is affected by the condition. But as style is subjective, I sent v3. while (--loops); out = (unsigned char *)sout; from = (unsigned char *)sfrom; Kind regards, Paul
[PATCH v3] lib/zlib_inflate/inffast: Check config in C to avoid unused function warning
Building Linux for ppc64le with Ubuntu clang version 12.0.0-3ubuntu1~21.04.1 shows the warning below. arch/powerpc/boot/inffast.c:20:1: warning: unused function 'get_unaligned16' [-Wunused-function] get_unaligned16(const unsigned short *p) ^ 1 warning generated. Fix it, by moving the check from the preprocessor to C, so the compiler sees the use. Signed-off-by: Paul Menzel --- v2: Use IS_ENABLED v3: Use if statement over ternary operator as requested by Christophe lib/zlib_inflate/inffast.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/lib/zlib_inflate/inffast.c b/lib/zlib_inflate/inffast.c index f19c4fbe1be7..2843f9bb42ac 100644 --- a/lib/zlib_inflate/inffast.c +++ b/lib/zlib_inflate/inffast.c @@ -253,13 +253,12 @@ void inflate_fast(z_streamp strm, unsigned start) sfrom = (unsigned short *)(from); loops = len >> 1; - do -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS - *sout++ = *sfrom++; -#else - *sout++ = get_unaligned16(sfrom++); -#endif - while (--loops); + do { + if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) + *sout++ = *sfrom++; + else + *sout++ = get_unaligned16(sfrom++); + } while (--loops); out = (unsigned char *)sout; from = (unsigned char *)sfrom; } else { /* dist == 1 or dist == 2 */ -- 2.33.0
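The idea behind the v3 patch above can be sketched outside the kernel. This is a minimal illustration, not the kernel code: `DEMO_IS_ENABLED`, `DEMO_EFFICIENT_UNALIGNED_ACCESS`, `demo_get_unaligned16` and `demo_copy16` are hypothetical names, and the stand-in macro assumes the option is always defined to 0 or 1, whereas the kernel's `IS_ENABLED()` also copes with undefined `CONFIG_` symbols. The `memcpy`-based helper is likewise only an assumption for the sketch.

```c
#include <string.h>

/*
 * Hypothetical stand-in for a kernel config option: always defined,
 * either to 0 or to 1. Not the kernel's IS_ENABLED() implementation.
 */
#ifndef DEMO_EFFICIENT_UNALIGNED_ACCESS
#define DEMO_EFFICIENT_UNALIGNED_ACCESS 0
#endif
#define DEMO_IS_ENABLED(option) (option)

/* Read a 16-bit value from a possibly unaligned address. */
static unsigned short demo_get_unaligned16(const unsigned short *p)
{
	unsigned short v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Copy `loops` 16-bit units; `loops` must be at least 1. */
static void demo_copy16(unsigned short *sout, const unsigned short *sfrom,
			unsigned int loops)
{
	do {
		/*
		 * Both branches are parsed and type-checked, so the helper
		 * is referenced regardless of the option's value. The
		 * condition is a compile-time constant, so an optimizing
		 * compiler can discard the branch that is never taken.
		 */
		if (DEMO_IS_ENABLED(DEMO_EFFICIENT_UNALIGNED_ACCESS))
			*sout++ = *sfrom++;
		else
			*sout++ = demo_get_unaligned16(sfrom++);
	} while (--loops);
}
```

With an `#ifdef` instead, the preprocessor removes the call before the compiler sees it, which is what leaves the helper looking unused when the option is set.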
[PATCH v2] lib/zlib_inflate/inffast: Check config in C to avoid unused function warning
Building Linux for ppc64le with Ubuntu clang version 12.0.0-3ubuntu1~21.04.1 shows the warning below. arch/powerpc/boot/inffast.c:20:1: warning: unused function 'get_unaligned16' [-Wunused-function] get_unaligned16(const unsigned short *p) ^ 1 warning generated. Fix it, by moving the check from the preprocessor to C, so the compiler sees the use. Signed-off-by: Paul Menzel --- lib/zlib_inflate/inffast.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/lib/zlib_inflate/inffast.c b/lib/zlib_inflate/inffast.c index f19c4fbe1be7..fb87a3120f0f 100644 --- a/lib/zlib_inflate/inffast.c +++ b/lib/zlib_inflate/inffast.c @@ -254,11 +254,8 @@ void inflate_fast(z_streamp strm, unsigned start) sfrom = (unsigned short *)(from); loops = len >> 1; do -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS - *sout++ = *sfrom++; -#else - *sout++ = get_unaligned16(sfrom++); -#endif + *sout++ = IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) ? + *sfrom++ : get_unaligned16(sfrom++); while (--loops); out = (unsigned char *)sout; from = (unsigned char *)sfrom; -- 2.33.0
[PATCH] lib/zlib_inflate/inffast: Check config in C to avoid unused function warning
Building Linux for ppc64le with Ubuntu clang version 12.0.0-3ubuntu1~21.04.1 shows the warning below. arch/powerpc/boot/inffast.c:20:1: warning: unused function 'get_unaligned16' [-Wunused-function] get_unaligned16(const unsigned short *p) ^ 1 warning generated. Fix it, by moving the check from the preprocessor to C, so the compiler sees the use. Signed-off-by: Paul Menzel --- lib/zlib_inflate/inffast.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/lib/zlib_inflate/inffast.c b/lib/zlib_inflate/inffast.c index f19c4fbe1be7..444ad3c3ccd3 100644 --- a/lib/zlib_inflate/inffast.c +++ b/lib/zlib_inflate/inffast.c @@ -254,11 +254,7 @@ void inflate_fast(z_streamp strm, unsigned start) sfrom = (unsigned short *)(from); loops = len >> 1; do -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS - *sout++ = *sfrom++; -#else - *sout++ = get_unaligned16(sfrom++); -#endif + *sout++ = CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS ? *sfrom++ : get_unaligned16(sfrom++); while (--loops); out = (unsigned char *)sout; from = (unsigned char *)sfrom; -- 2.33.0
LLVM/clang ias build fails with unsupported arguments '-mpower4' and '-many' to option 'Wa,'
Dear Linux folks,

Building Linux with LLVM’s integrated assembler fails with the error below [1].

```
$ ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- make LLVM=1 LLVM_IAS=1 -j72 pseries_defconfig arch/powerpc/kernel/vdso32/gettimeofday.o
...
arch/powerpc/kernel/vdso32/gettimeofday.S:72:8: error: unsupported directive '.stabs'
.stabs "_restgpr_31_x:F-1",36,0,0,_restgpr_31_x; .globl _restgpr_31_x; _restgpr_31_x:
^
arch/powerpc/kernel/vdso32/gettimeofday.S:73:8: error: unsupported directive '.stabs'
.stabs "_rest32gpr_31_x:F-1",36,0,0,_rest32gpr_31_x; .globl _rest32gpr_31_x; _rest32gpr_31_x:
^
```

The LLVM developers are not planning to implement the `.stabs` directive, as the stabs debug format has been succeeded by DWARF [2].

Kind regards,

Paul

[1]: https://github.com/ClangBuiltLinux/linux/issues/1418
[2]: https://bugs.llvm.org/show_bug.cgi?id=31134
Re: clang/ld.lld build fails with `can't create dynamic relocation R_PPC64_ADDR64 against local symbol in readonly segment`
Dear Christophe,

On 11.08.21 at 16:10, Christophe Leroy wrote:

On 10/08/2021 at 20:38, Paul Menzel wrote:

On 29.07.21 at 10:23, Paul Menzel wrote:

I just wanted to make you aware that building Linux for ppc64le with clang/ld.lld fails with [1]:

ld.lld: error: can't create dynamic relocation R_PPC64_ADDR64 against symbol: empty_zero_page in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
>>> defined in arch/powerpc/kernel/head_64.o
>>> referenced by arch/powerpc/kernel/head_64.o:(___ksymtab+empty_zero_page+0x0)

The patch below from one of the comments [2] fixes it.

--- i/arch/powerpc/Makefile
+++ w/arch/powerpc/Makefile
@@ -122,7 +122,7 @@ cflags-$(CONFIG_STACKPROTECTOR) += -mstack-protector-guard-reg=r2
 endif
 LDFLAGS_vmlinux-y := -Bstatic
-LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
+LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie -z notext
 LDFLAGS_vmlinux := $(LDFLAGS_vmlinux-y)
 LDFLAGS_vmlinux += $(call ld-option,--orphan-handling=warn)

Any comments, if this is the right fix? Current Linux master branch still fails to build with `LLVM=1` on Ubuntu 21.04 without this change.

Which kernel version are you building?

Since https://github.com/linuxppc/linux/commit/45b30fafe528601f1a4449c9d68d8ebe7bbc39ad , empty_zero_page[] is in arch/powerpc/mm/mem.c, not in arch/powerpc/kernel/head_64.o.

Do you still have the issue with kernel 5.14?

Yes, before sending the message, I reproduced it with

$ git describe
v5.14-rc5-2-g9a73fa375d58

containing the commit you mentioned.

Kind regards,

Paul
Re: clang/ld.lld build fails with `can't create dynamic relocation R_PPC64_ADDR64 against local symbol in readonly segment`
Dear Linux folks,

On 29.07.21 at 10:23, Paul Menzel wrote:

I just wanted to make you aware that building Linux for ppc64le with clang/ld.lld fails with [1]:

ld.lld: error: can't create dynamic relocation R_PPC64_ADDR64 against symbol: empty_zero_page in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
>>> defined in arch/powerpc/kernel/head_64.o
>>> referenced by arch/powerpc/kernel/head_64.o:(___ksymtab+empty_zero_page+0x0)

The patch below from one of the comments [2] fixes it.

--- i/arch/powerpc/Makefile
+++ w/arch/powerpc/Makefile
@@ -122,7 +122,7 @@ cflags-$(CONFIG_STACKPROTECTOR) += -mstack-protector-guard-reg=r2
 endif
 LDFLAGS_vmlinux-y := -Bstatic
-LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
+LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie -z notext
 LDFLAGS_vmlinux := $(LDFLAGS_vmlinux-y)
 LDFLAGS_vmlinux += $(call ld-option,--orphan-handling=warn)

Any comments, if this is the right fix? Current Linux master branch still fails to build with `LLVM=1` on Ubuntu 21.04 without this change.

Kind regards,

Paul

[1]: https://github.com/ClangBuiltLinux/linux/issues/811
[2]: https://github.com/ClangBuiltLinux/linux/issues/811#issuecomment-568316320
Re: [PATCH] powerpc/vdso: Don't use r30 to avoid breaking Go lang
Dear Michael,

On 29.07.21 at 15:12, Michael Ellerman wrote:

The Go runtime uses r30 for some special value called 'g'. It assumes that value will remain unchanged even when calling VDSO functions. Although r30 is non-volatile across function calls, the callee is free to use it, as long as the callee saves the value and restores it before returning.

It used to be true by accident that the VDSO didn't use r30, because the VDSO was hand-written asm. When we switched to building the VDSO from C the compiler started using r30, at least in some builds, leading to crashes in Go. eg:

~/go/src$ ./all.bash
Building Go cmd/dist using /usr/lib/go-1.16. (go1.16.2 linux/ppc64le)
Building Go toolchain1 using /usr/lib/go-1.16.
go build os/exec: /usr/lib/go-1.16/pkg/tool/linux_ppc64le/compile: signal: segmentation fault
go build reflect: /usr/lib/go-1.16/pkg/tool/linux_ppc64le/compile: signal: segmentation fault
go tool dist: FAILED: /usr/lib/go-1.16/bin/go install -gcflags=-l -tags=math_big_pure_go compiler_bootstrap bootstrap/cmd/...: exit status 1

There are patches in flight to fix Go[1], but until they are released and widely deployed we can workaround it in the VDSO by avoiding use of

Nit: work around is spelled with a space.

r30.

Note this only works with GCC, clang does not support -ffixed-rN.

Maybe the clang/LLVM build support folks (in CC) have an idea.

1: https://go-review.googlesource.com/c/go/+/328110

Fixes: ab037dd87a2f ("powerpc/vdso: Switch VDSO to generic C implementation.")
Cc: sta...@vger.kernel.org # v5.11+
Reported-by: Paul Menzel
Tested-by: Paul Menzel
Signed-off-by: Michael Ellerman
---
arch/powerpc/kernel/vdso64/Makefile | 7 +++
1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/kernel/vdso64/Makefile b/arch/powerpc/kernel/vdso64/Makefile
index 2813e3f98db6..3c5baaa6f1e7 100644
--- a/arch/powerpc/kernel/vdso64/Makefile
+++ b/arch/powerpc/kernel/vdso64/Makefile
@@ -27,6 +27,13 @@ KASAN_SANITIZE := n
 ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
 -Wl,-soname=linux-vdso64.so.1 -Wl,--hash-style=both
+
+# Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used to be true
+# by accident when the VDSO was hand-written asm code, but may not be now that the VDSO is
+# compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on code
+# generation is minimal, it will just use r29 instead.
+ccflags-y += $(call cc-option, -ffixed-r30)
+
 asflags-y := -D__VDSO64__ -s
 targets += vdso64.lds

The rest looks good.

Kind regards,

Paul
Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
Dear Michael, Am 29.07.21 um 09:41 schrieb Michael Ellerman: Paul Menzel writes: Am 28.07.21 um 14:43 schrieb Michael Ellerman: Paul Menzel writes: Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt: On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote: On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation fault [1], and it might be related to *[release-branch.go1.16] runtime: fix crash during VDSO calls on PowerPC* [2], conjecturing that commit ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.) added in Linux 5.11 causes this. If this is indeed the case, this would be a regression in userspace. Is there a generic fix or should the change be reverted? From the look at the links you posted, this appears to be completely broken assumptions by Go that some registers don't change while calling what essentially are external library functions *while inside those functions* (ie in this case from a signal handler). I suppose it would be possible to build the VDSO with gcc arguments to make it not use r30, but that's just gross... Thank you for looking into this. No idea, if it falls under Linux’ no regression policy or not. Reluctantly yes, I think it does. Though it would have been good if it had been reported to us sooner. It looks like that Go fix is only committed to master, and neither of the latest Go 1.16 or 1.15 releases contain the fix? ie. there's no way for a user to get a working version of Go other than building master? I heard it is going to be in Go 1.16.7, but I do not know much about Go. Maybe the folks in Cc can chime in. I'll see if we can work around it in the kernel. Are you able to test a kernel patch if I send you one? Yes, I could test a Linux kernel patch on ppc64le (POWER 8) running Ubuntu 21.04. Thanks, would be great if you can test on your setup. Patch below. 
I haven't been able to reproduce the crash by following the instructions in your bug report, I have go1.13.8, I guess the crash is only in newer versions? I only used go version 1.16.2 packaged in Ubuntu 21.04 (1.16~0ubuntu1). diff --git a/arch/powerpc/kernel/vdso64/Makefile b/arch/powerpc/kernel/vdso64/Makefile index 2813e3f98db6..3c5baaa6f1e7 100644 --- a/arch/powerpc/kernel/vdso64/Makefile +++ b/arch/powerpc/kernel/vdso64/Makefile @@ -27,6 +27,13 @@ KASAN_SANITIZE := n ccflags-y := -shared -fno-common -fno-builtin -nostdlib \ -Wl,-soname=linux-vdso64.so.1 -Wl,--hash-style=both + +# Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used to be true Probably that needs to be 1.16.7. +# by accident when the VDSO was hand-written asm code, but may not be now that the VDSO is +# compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on code +# generation is minimal, it will just use r29 instead. +ccflags-y += $(call cc-option, -ffixed-r30) + asflags-y := -D__VDSO64__ -s targets += vdso64.lds With this applied to Linux, go does not crash with a segmentation fault anymore. Tested-by: Paul Menzel (Probably the commit should be tagged for the stable series too.) Kind regards, Paul
clang/ld.lld build fails with `can't create dynamic relocation R_PPC64_ADDR64 against local symbol in readonly segment`
Dear Linux folks,

I just wanted to make you aware that building Linux for ppc64le with clang/ld.lld fails with [1]:

ld.lld: error: can't create dynamic relocation R_PPC64_ADDR64 against symbol: empty_zero_page in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
>>> defined in arch/powerpc/kernel/head_64.o
>>> referenced by arch/powerpc/kernel/head_64.o:(___ksymtab+empty_zero_page+0x0)

The patch below from one of the comments [2] fixes it.

--- i/arch/powerpc/Makefile
+++ w/arch/powerpc/Makefile
@@ -122,7 +122,7 @@ cflags-$(CONFIG_STACKPROTECTOR) += -mstack-protector-guard-reg=r2
 endif
 LDFLAGS_vmlinux-y := -Bstatic
-LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
+LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie -z notext
 LDFLAGS_vmlinux := $(LDFLAGS_vmlinux-y)
 LDFLAGS_vmlinux += $(call ld-option,--orphan-handling=warn)

Kind regards,

Paul

[1]: https://github.com/ClangBuiltLinux/linux/issues/811
[2]: https://github.com/ClangBuiltLinux/linux/issues/811#issuecomment-568316320
Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
Dear Michael, Am 28.07.21 um 14:43 schrieb Michael Ellerman: Paul Menzel writes: Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt: On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote: On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation fault [1], and it might be related to *[release-branch.go1.16] runtime: fix crash during VDSO calls on PowerPC* [2], conjecturing that commit ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.) added in Linux 5.11 causes this. If this is indeed the case, this would be a regression in userspace. Is there a generic fix or should the change be reverted? From the look at the links you posted, this appears to be completely broken assumptions by Go that some registers don't change while calling what essentially are external library functions *while inside those functions* (ie in this case from a signal handler). I suppose it would be possible to build the VDSO with gcc arguments to make it not use r30, but that's just gross... Thank you for looking into this. No idea, if it falls under Linux’ no regression policy or not. Reluctantly yes, I think it does. Though it would have been good if it had been reported to us sooner. It looks like that Go fix is only committed to master, and neither of the latest Go 1.16 or 1.15 releases contain the fix? ie. there's no way for a user to get a working version of Go other than building master? I heard it is going to be in Go 1.16.7, but I do not know much about Go. Maybe the folks in Cc can chime in. I'll see if we can work around it in the kernel. Are you able to test a kernel patch if I send you one? Yes, I could test a Linux kernel patch on ppc64le (POWER 8) running Ubuntu 21.04. Kind regards, Paul
Re: Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
Dear Benjamin, Am 28.07.21 um 01:14 schrieb Benjamin Herrenschmidt: On Tue, 2021-07-27 at 10:45 +0200, Paul Menzel wrote: On ppc64le Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation fault [1], and it might be related to *[release-branch.go1.16] runtime: fix crash during VDSO calls on PowerPC* [2], conjecturing that commit ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.) added in Linux 5.11 causes this. If this is indeed the case, this would be a regression in userspace. Is there a generic fix or should the change be reverted? From the look at the links you posted, this appears to be completely broken assumptions by Go that some registers don't change while calling what essentially are external library functions *while inside those functions* (ie in this case from a signal handler). I suppose it would be possible to build the VDSO with gcc arguments to make it not use r30, but that's just gross... Thank you for looking into this. No idea, if it falls under Linux’ no regression policy or not. Kind regards, Paul
Possible regression by ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.)
Dear Christophe,

On ppc64le, Go 1.16.2 from Ubuntu 21.04 terminates with a segmentation fault [1], and it might be related to *[release-branch.go1.16] runtime: fix crash during VDSO calls on PowerPC* [2], conjecturing that commit ab037dd87a2f (powerpc/vdso: Switch VDSO to generic C implementation.) [3], added in Linux 5.11, causes this.

If this is indeed the case, this would be a regression in userspace. Is there a generic fix or should the change be reverted?

Kind regards,

Paul

[1]: https://github.com/9elements/converged-security-suite/issues/268
[2]: https://go-review.googlesource.com/c/go/+/334410/
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab037dd87a2f946556850e204c06cbd7a2a19390
Re: UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56
Dear Christophe,

On 07.05.21 at 10:59, Christophe Leroy wrote:

On 07/05/2021 at 10:42, Paul Menzel wrote:

[+Andrey]

On 07.05.21 at 10:31, Christophe Leroy wrote:

On 06/05/2021 at 21:32, Paul Menzel wrote:

[corrected subject]

On 06.05.21 at 21:31, Paul Menzel wrote:

On the POWER8 system IBM S822LC, Linux 5.13+, built with UBSAN, logs the warning below.

```
[ 0.030091]
[ 0.030295] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56
[ 0.030325] index -1 is out of range for type 'legacy_serial_info [8]'
[ 0.030350] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[ 0.030360] Call Trace:
[ 0.030363] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[ 0.030386] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[ 0.030400] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[ 0.030414] [c00024f1bc20] [c1711588] ioremap_legacy_serial_console+0x54/0x144
[ 0.030430] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[ 0.030444] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[ 0.030458] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[ 0.030471] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 0.030484]
[ 0.030641]
[ 0.030668] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:360:58
[ 0.030697] index -1 is out of range for type 'plat_serial8250_port [9]'
[ 0.030721] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[ 0.030730] Call Trace:
[ 0.030733] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[ 0.030749] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[ 0.030762] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[ 0.030775] [c00024f1bc20] [c17115a0] ioremap_legacy_serial_console+0x6c/0x144
[ 0.030790] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[ 0.030802] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[ 0.030816] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[ 0.030829] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 0.030842]
```

The function is as follows, so when legacy_serial_console == -1 as in your situation, the pointers are just not used.

static int __init ioremap_legacy_serial_console(void)
{
	struct legacy_serial_info *info = &legacy_serial_infos[legacy_serial_console];
	struct plat_serial8250_port *port = &legacy_serial_ports[legacy_serial_console];
	void __iomem *vaddr;

	if (legacy_serial_console < 0)
		return 0;
	...
}

When I look into the generated code (UBSAN not selected), we see the verification and the bail-out is done prior to any calculation based on legacy_serial_console.

   0: 94 21 ff e0  stwu r1,-32(r1)
   4: 3d 20 00 00  lis r9,0
      6: R_PPC_ADDR16_HA .data
   8: 7c 08 02 a6  mflr r0
   c: bf 81 00 10  stmw r28,16(r1)
  10: 3b 80 00 00  li r28,0
  14: 83 a9 00 00  lwz r29,0(r9)
      16: R_PPC_ADDR16_LO .data
  18: 90 01 00 24  stw r0,36(r1)
  1c: 2c 1d 00 00  cmpwi r29,0
  20: 41 80 00 80  blt a0

So, is it normal that UBSAN reports an error here?

If it’s useful, I could disassemble the code here. But please tell me how.

Sorry, I do not know. I just selected the option, and saw the error. Maybe Andrey has an idea.

No need for you to disassemble, I just wanted to show that without UBSAN there is no problem with the index as it is used only after boundary checking. (But if you want to do so, it is just an 'objdump -dr legacy_serial.o')

Thank you for the hint.

Now, with UBSAN, I see that UBSAN does the verification of the index earlier than expected. So what to do here, we can modify the code, but that modification would just be to make UBSAN happy as there is no problem in itself.

In #g...@irc.freenode.net I was told the following by zid (they weren’t so happy with the wording), but maybe you understand it:

It's not legal C to generate pointers to things other than 0, objects, or 1 past the end of an object, not just dereference them, so technically that's not legal per the C spec. In practice it won't matter until it's dereferenced of course unless you're doing something weird, let's say.. instrumenting the code

Kind regards,

Paul
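The point raised in this message can be illustrated with a small stand-alone sketch: forming `&array[-1]` is already outside what the C standard allows, even if the pointer is never dereferenced, so one way to satisfy both the standard and the sanitizer is to compute the pointer only after the index has been checked. The names below (`demo_port_info`, `demo_infos`, `demo_console`, `demo_console_info`) are hypothetical, and this is not necessarily the change that was applied to `legacy_serial.c`.

```c
#include <stddef.h>

#define DEMO_MAX_PORTS 8

/* Hypothetical per-port record, standing in for the kernel structures. */
struct demo_port_info {
	int irq;
};

static struct demo_port_info demo_infos[DEMO_MAX_PORTS];

/* Index of the console port; -1 means that no console was found. */
static int demo_console = -1;

/*
 * Return the record for the console port, or NULL if there is none.
 * The address is formed only after the index has been validated, so no
 * out-of-range pointer is ever created.
 */
static struct demo_port_info *demo_console_info(void)
{
	struct demo_port_info *info;

	if (demo_console < 0 || demo_console >= DEMO_MAX_PORTS)
		return NULL;

	info = &demo_infos[demo_console];
	return info;
}
```

The generated code may well be the same as for the initializer-first version, as the disassembly quoted above suggests; the reordering only changes what the source promises to the compiler and to instrumentation.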
Re: UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56
[+Andrey]

Dear Christophe,


On 07.05.21 at 10:31, Christophe Leroy wrote:
On 06/05/2021 at 21:32, Paul Menzel wrote:
[corrected subject]

On 06.05.21 at 21:31, Paul Menzel wrote:

On the POWER8 system IBM S822LC, Linux 5.13+, built with UBSAN, logs the warning below.

```
[ 0.030091]
[ 0.030295] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56
[ 0.030325] index -1 is out of range for type 'legacy_serial_info [8]'
[ 0.030350] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[ 0.030360] Call Trace:
[ 0.030363] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[ 0.030386] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[ 0.030400] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[ 0.030414] [c00024f1bc20] [c1711588] ioremap_legacy_serial_console+0x54/0x144
[ 0.030430] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[ 0.030444] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[ 0.030458] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[ 0.030471] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 0.030484]
[ 0.030641]
[ 0.030668] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:360:58
[ 0.030697] index -1 is out of range for type 'plat_serial8250_port [9]'
[ 0.030721] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[ 0.030730] Call Trace:
[ 0.030733] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[ 0.030749] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[ 0.030762] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[ 0.030775] [c00024f1bc20] [c17115a0] ioremap_legacy_serial_console+0x6c/0x144
[ 0.030790] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[ 0.030802] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[ 0.030816] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[ 0.030829] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 0.030842]
```

The function is as follows, so when legacy_serial_console == -1 as in your situation, the pointers are just not used.

    static int __init ioremap_legacy_serial_console(void)
    {
            struct legacy_serial_info *info = &legacy_serial_infos[legacy_serial_console];
            struct plat_serial8250_port *port = &legacy_serial_ports[legacy_serial_console];
            void __iomem *vaddr;

            if (legacy_serial_console < 0)
                    return 0;
            ...
    }

When I look into the generated code (UBSAN not selected), we see the verification and the bail-out is done prior to any calculation based on legacy_serial_console.

    :
       0: 94 21 ff e0 stwu r1,-32(r1)
       4: 3d 20 00 00 lis r9,0
          6: R_PPC_ADDR16_HA .data
       8: 7c 08 02 a6 mflr r0
       c: bf 81 00 10 stmw r28,16(r1)
      10: 3b 80 00 00 li r28,0
      14: 83 a9 00 00 lwz r29,0(r9)
          16: R_PPC_ADDR16_LO .data
      18: 90 01 00 24 stw r0,36(r1)
      1c: 2c 1d 00 00 cmpwi r29,0
      20: 41 80 00 80 blt a0

So, is it normal that UBSAN reports an error here?

If it’s useful, I could disassemble the code here. But please tell me how.

Sorry, I do not know. I just selected the option, and saw the error. Maybe Andrey has an idea.


Kind regards,

Paul
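
A possible way to look at the report, offered only as a sketch: UBSAN instruments the index expression where it appears in the source, so `&legacy_serial_infos[-1]` in the initialisers is flagged even though the resulting pointer is never dereferenced. Doing the range check first and indexing afterwards avoids forming the out-of-range address at all. The user-space snippet below illustrates that ordering; `serial_info`, `serial_infos` and `console_info()` are hypothetical stand-ins, not the kernel's definitions, and the array size of 8 is taken from the type printed in the warning.

```c
#include <stddef.h>

#define MAX_LEGACY_SERIAL_PORTS 8

/* Hypothetical stand-in for the kernel's struct legacy_serial_info. */
struct serial_info {
	int speed;
};

static struct serial_info serial_infos[MAX_LEGACY_SERIAL_PORTS];

/*
 * Return the entry for `console`, or NULL when no console was selected.
 * The range check runs before the array is indexed, so the address of
 * an out-of-range element is never computed.
 */
static struct serial_info *console_info(int console)
{
	if (console < 0 || console >= MAX_LEGACY_SERIAL_PORTS)
		return NULL;

	return &serial_infos[console];
}
```

With this ordering, a caller passing -1 simply gets NULL back and there is no index expression left for the sanitizer to complain about.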
UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56
[corrected subject]

On 06.05.21 at 21:31, Paul Menzel wrote:

Dear Linux folks,


On the POWER8 system IBM S822LC, Linux 5.13+, built with UBSAN, logs the warning below.

```
[ 0.030091]
[ 0.030295] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56
[ 0.030325] index -1 is out of range for type 'legacy_serial_info [8]'
[ 0.030350] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[ 0.030360] Call Trace:
[ 0.030363] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[ 0.030386] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[ 0.030400] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[ 0.030414] [c00024f1bc20] [c1711588] ioremap_legacy_serial_console+0x54/0x144
[ 0.030430] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[ 0.030444] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[ 0.030458] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[ 0.030471] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 0.030484]
[ 0.030641]
[ 0.030668] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:360:58
[ 0.030697] index -1 is out of range for type 'plat_serial8250_port [9]'
[ 0.030721] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[ 0.030730] Call Trace:
[ 0.030733] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[ 0.030749] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[ 0.030762] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[ 0.030775] [c00024f1bc20] [c17115a0] ioremap_legacy_serial_console+0x6c/0x144
[ 0.030790] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[ 0.030802] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[ 0.030816] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[ 0.030829] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 0.030842]
```


Kind regards,

Paul
WARNING: CPU: 0 PID: 1 at arch/powerpc/lib/feature-fixups.c:109 do_feature_fixups+0xb0/0xf0
Dear Linux folks,


On the POWER8 system IBM S822LC, Linux 5.13+, built with UBSAN, logs the warning below.

```
[0.030091]
[0.030295] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:359:56
[0.030325] index -1 is out of range for type 'legacy_serial_info [8]'
[0.030350] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[0.030360] Call Trace:
[0.030363] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[0.030386] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[0.030400] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[0.030414] [c00024f1bc20] [c1711588] ioremap_legacy_serial_console+0x54/0x144
[0.030430] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[0.030444] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[0.030458] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[0.030471] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[0.030484]
[0.030641]
[0.030668] UBSAN: array-index-out-of-bounds in arch/powerpc/kernel/legacy_serial.c:360:58
[0.030697] index -1 is out of range for type 'plat_serial8250_port [9]'
[0.030721] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0+ #2
[0.030730] Call Trace:
[0.030733] [c00024f1bad0] [c09f4330] dump_stack+0xc4/0x114 (unreliable)
[0.030749] [c00024f1bb20] [c09efed0] ubsan_epilogue+0x18/0x78
[0.030762] [c00024f1bb80] [c09efafc] __ubsan_handle_out_of_bounds+0xac/0xd0
[0.030775] [c00024f1bc20] [c17115a0] ioremap_legacy_serial_console+0x6c/0x144
[0.030790] [c00024f1bc70] [c00123c0] do_one_initcall+0x60/0x2c0
[0.030802] [c00024f1bd40] [c1704bc4] kernel_init_freeable+0x19c/0x25c
[0.030816] [c00024f1bda0] [c0012a2c] kernel_init+0x2c/0x180
[0.030829] [c00024f1be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70
[0.030842]
```


Kind regards,

Paul
Re: sysctl: setting key "net.core.bpf_jit_enable": Invalid argument
Dear Christophe,


On 11.04.21 at 18:23, Christophe Leroy wrote:
On 11/04/2021 at 13:09, Paul Menzel wrote:

Related to *[CVE-2021-29154] Linux kernel incorrect computation of branch displacements in BPF JIT compiler can be abused to execute arbitrary code in Kernel mode* [1], on the POWER8 system IBM S822LC with self-built Linux 5.12.0-rc5+, I am unable to disable `bpf_jit_enable`.

    $ /sbin/sysctl net.core.bpf_jit_enable
    net.core.bpf_jit_enable = 1
    $ sudo /sbin/sysctl -w net.core.bpf_jit_enable=0
    sysctl: setting key "net.core.bpf_jit_enable": Invalid argument

It works on an x86 with Debian sid/unstable and Linux 5.10.26-1.

Maybe you have selected CONFIG_BPF_JIT_ALWAYS_ON in your self-built kernel?

    config BPF_JIT_ALWAYS_ON
            bool "Permanently enable BPF JIT and remove BPF interpreter"
            depends on BPF_SYSCALL && HAVE_EBPF_JIT && BPF_JIT
            help
              Enables BPF JIT and removes BPF interpreter to avoid
              speculative execution of BPF instructions by the interpreter

Thank you. Indeed. In contrast to Debian, Ubuntu’s Linux configuration selects that option, and I copied that.

    $ grep _BPF_JIT /boot/config-5.8.0-49-generic
    /boot/config-5.8.0-49-generic:CONFIG_BPF_JIT_ALWAYS_ON=y
    /boot/config-5.8.0-49-generic:CONFIG_BPF_JIT_DEFAULT_ON=y
    /boot/config-5.8.0-49-generic:CONFIG_BPF_JIT=y

I wonder if there is a way to better integrate that option into `/proc/sys`, so it’s clear that it’s always enabled.


Kind regards,

Paul
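
The "Invalid argument" above is consistent with a range-checked sysctl whose allowed minimum and maximum are both 1 when CONFIG_BPF_JIT_ALWAYS_ON is set. The following user-space sketch only models that behaviour; `struct int_knob` and `knob_write()` are made-up names, this is not the kernel's handler, and the 0..2 range mentioned for the ordinary case is an assumption based on the documented values of `bpf_jit_enable`.

```c
#include <errno.h>

/* Hypothetical model of an integer sysctl with an allowed range. */
struct int_knob {
	int value;
	int min;
	int max;
};

/* Reject writes outside [min, max], as a range-checked handler would. */
static int knob_write(struct int_knob *knob, int new_value)
{
	if (new_value < knob->min || new_value > knob->max)
		return -EINVAL;

	knob->value = new_value;
	return 0;
}
```

A knob initialised as `{ 1, 1, 1 }` (the always-on case in this model) refuses a write of 0 with -EINVAL, while one initialised as `{ 1, 0, 2 }` accepts it. From user space the two cases are hard to tell apart, which is what the question about `/proc/sys` is getting at.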
sysctl: setting key "net.core.bpf_jit_enable": Invalid argument
Dear Linux folks,


Related to *[CVE-2021-29154] Linux kernel incorrect computation of branch displacements in BPF JIT compiler can be abused to execute arbitrary code in Kernel mode* [1], on the POWER8 system IBM S822LC with self-built Linux 5.12.0-rc5+, I am unable to disable `bpf_jit_enable`.

    $ /sbin/sysctl net.core.bpf_jit_enable
    net.core.bpf_jit_enable = 1
    $ sudo /sbin/sysctl -w net.core.bpf_jit_enable=0
    sysctl: setting key "net.core.bpf_jit_enable": Invalid argument

It works on an x86 with Debian sid/unstable and Linux 5.10.26-1.


Kind regards,

Paul


[1]: https://seclists.org/oss-sec/2021/q2/12
WARNING: CPU: 0 PID: 1 at arch/powerpc/lib/feature-fixups.c:109 do_feature_fixups+0xb0/0xf0
Dear Linux folks, On the POWER8 system IBM S822LC, Linux 5.12-rc5+ logs the warning below. ``` [0.723485] [ cut here ] [0.723491] WARNING: CPU: 0 PID: 1 at arch/powerpc/lib/feature-fixups.c:109 do_feature_fixups+0xb0/0xf0 [0.723512] Modules linked in: [0.723524] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc5+ #3 [0.723537] NIP: c00bbc70 LR: c00bbc3c CTR: [0.723547] REGS: c0800d48b800 TRAP: 0700 Not tainted (5.12.0-rc5+) [0.723556] MSR: 92029033 CR: 42008244 XER: 2000 [0.723613] CFAR: c00bbc40 IRQMASK: 0 GPR00: c1707610 c0800d48baa0 c21bb200 0001 GPR04: c21d0d88 aaab ffc0 c21c0034 GPR08: 000c03a0 8000 GPR12: c1211458 c28e c00129f8 GPR16: c1f003f0 GPR20: c1f003d0 c1f003b0 c1f00390 c1703520 GPR24: 0004 c16ddfe0 c1791048 c1791088 GPR28: c14fd870 00fb8f5db187 c21d0d88 0001 [0.723822] NIP [c00bbc70] do_feature_fixups+0xb0/0xf0 [0.723835] LR [c00bbc3c] do_feature_fixups+0x7c/0xf0 [0.723848] Call Trace: [0.723854] [c0800d48baa0] [c01b4f4c] parse_one+0x11c/0x3c0 (unreliable) [0.723875] [c0800d48bb20] [c1707610] vdso_fixup_features+0xbc/0x11c [0.723896] [c0800d48bb60] [c17078bc] vdso_init+0x154/0x1b0 [0.723914] [c0800d48bb90] [c00123c0] do_one_initcall+0x60/0x2c0 [0.723933] [c0800d48bc60] [c1704944] do_initcalls+0x1e0/0x248 [0.723951] [c0800d48bd40] [c1704c38] kernel_init_freeable+0x1f0/0x25c [0.723969] [c0800d48bda0] [c0012a14] kernel_init+0x24/0x170 [0.723987] [c0800d48be10] [c000d6ec] ret_from_kernel_thread+0x5c/0x70 [0.724005] Instruction dump: [0.724014] 40820030 37ff 3bde0030 4082ffe4 38210080 e8010010 eb81ffe0 eba1ffe8 [0.724057] ebc1fff0 ebe1fff8 7c0803a6 4e800020 <0fe0> e8fe0028 e8de0020 e8be0018 [0.724102] ---[ end trace 9bbb55f5cd8ca2ba ]--- [0.724118] Unable to patch feature section at (ptrval) - (ptrval) with (ptrval) - (ptrval) [0.724185] pstore: Registered nvram as persistent store backend ``` Please find the output of `dmesg` attached. 
Kind regards, Paul [0.00] hash-mmu: Page sizes from device-tree: [0.00] hash-mmu: base_shift=12: shift=12, sllp=0x, avpnm=0x, tlbiel=1, penc=0 [0.00] hash-mmu: base_shift=12: shift=16, sllp=0x, avpnm=0x, tlbiel=1, penc=7 [0.00] hash-mmu: base_shift=12: shift=24, sllp=0x, avpnm=0x, tlbiel=1, penc=56 [0.00] hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x, tlbiel=1, penc=1 [0.00] hash-mmu: base_shift=16: shift=24, sllp=0x0110, avpnm=0x, tlbiel=1, penc=8 [0.00] hash-mmu: base_shift=20: shift=20, sllp=0x0130, avpnm=0x, tlbiel=0, penc=2 [0.00] hash-mmu: base_shift=24: shift=24, sllp=0x0100, avpnm=0x0001, tlbiel=0, penc=0 [0.00] hash-mmu: base_shift=34: shift=34, sllp=0x0120, avpnm=0x07ff, tlbiel=0, penc=3 [0.00] Enabling pkeys with max key count 32 [0.00] Activating Kernel Userspace Execution Prevention [0.00] Activating Kernel Userspace Access Prevention [0.00] Page orders: linear mapping = 24, virtual = 16, io = 16, vmemmap = 24 [0.00] Using 1TB segments [0.00] hash-mmu: Initializing hash mmu with SLB [0.00] Linux version 5.12.0-rc5+ (pmenzel@flughafenberlinbrandenburgwillybrandt) (gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0, GNU ld (GNU Binutils for Ubuntu) 2.35.1) #3 SMP Mon Mar 29 21:13:13 CEST 2021 [0.00] Found initrd at 0xc3de:0xc647d5eb [0.00] OPAL: Found non-mapped LPC bus on chip 0 [0.00] Using PowerNV machine description [0.00] printk: bootconsole [udbg0] enabled [0.00] CPU maps initialized for 8 threads per core [0.00] (thread shift is 3) [0.00] Allocated 5760 bytes for 160 pacas [0.00] - [0.00] phys_mem_size = 0x100 [0.00] dcache_bsize = 0x80 [0.00] icache_bsize = 0x80 [0.00] cpu_features = 0x00fb8f5db187 [0.00] possible= 0x000ffbfbcf5fb187 [0.00] always = 0x000380008181 [0.00] cpu_user_features = 0xdc0065c2 0xef00 [0.00] mmu_features = 0x7c006e01 [0.00] firmware_feature
Re: [PATCH] powerpc/pseries: Only register vio drivers if vio bus exists
Dear Michael,


On 16.03.21 at 02:09, Michael Ellerman wrote:

The vio bus is a fake bus, which we use on pseries LPARs (guests) to discover devices provided by the hypervisor.

There's no need or sense in creating the vio bus on bare metal systems. Which is why commit 4336b9337824 ("powerpc/pseries: Make vio and ibmebus initcalls pseries specific") made the initialisation of the vio bus only happen in LPARs.

However as a result of that commit we now see errors at boot on bare metal systems:

  Driver 'hvc_console' was unable to register with bus_type 'vio' because the bus was not initialized.
  Driver 'tpm_ibmvtpm' was unable to register with bus_type 'vio' because the bus was not initialized.

This happens because those drivers are built-in, and are calling vio_register_driver(). It in turn calls driver_register() with a reference to vio_bus_type, but we haven't registered vio_bus_type with the driver core.

Fix it by also guarding vio_register_driver() with a check to see if we are on pseries.

Fixes: 4336b9337824 ("powerpc/pseries: Make vio and ibmebus initcalls pseries specific")
Reported-by: Paul Menzel
Signed-off-by: Michael Ellerman
---
 arch/powerpc/platforms/pseries/vio.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/vio.c b/arch/powerpc/platforms/pseries/vio.c
index 9cb4fc839fd5..429053d0402a 100644
--- a/arch/powerpc/platforms/pseries/vio.c
+++ b/arch/powerpc/platforms/pseries/vio.c
@@ -1285,6 +1285,10 @@ static int vio_bus_remove(struct device *dev)
 int __vio_register_driver(struct vio_driver *viodrv, struct module *owner,
 			  const char *mod_name)
 {
+	// vio_bus_type is only initialised for pseries
+	if (!machine_is(pseries))
+		return -ENODEV;
+
 	pr_debug("%s: driver %s registering\n", __func__, viodrv->name);
 
 	/* fill in 'struct driver' fields */

Thank you. The errors are gone now.

Tested-by: Paul Menzel # IBM S822L (POWER8)

As it fixes a commit from Linux 5.8, should it be tagged for the stable releases, or is it going to be picked up automatically due to the Fixes tag?


Kind regards,

Paul
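
For readers skimming the thread, the effect of the guard can be modelled outside the kernel roughly as follows. This is only an illustrative sketch: `struct bus`, `struct driver` and `register_driver()` are hypothetical stand-ins rather than the driver core's types, and the `on_pseries` flag takes the place of the `machine_is(pseries)` check from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the bus and driver objects. */
struct bus {
	bool registered;
};

struct driver {
	const char *name;
	bool bound;
};

static int register_driver(struct bus *bus, struct driver *drv, bool on_pseries)
{
	/* Bail out quietly on platforms where the bus is never set up. */
	if (!on_pseries)
		return -ENODEV;

	/* A bus that should exist but was not initialised is a real error. */
	if (!bus->registered)
		return -EINVAL;

	drv->bound = true;
	return 0;
}
```

The point of checking the platform first is that bare metal systems take the silent -ENODEV path and never reach the registration step that logged the error.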
Re: VIO bus not initialized
Dear Michael,


Thank you very much for your response.


On 15.03.21 at 08:53, Michael Ellerman wrote:
Paul Menzel writes:

On the POWER8 system IBM S822LC, Linux 5.12-rc2+ logs the errors below.

That's a bare metal system, you can see that from the line "Using PowerNV machine description" in the boot log.

    $ dmesg --level=err
    [1.555668] Driver 'hvc_console' was unable to register with bus_type 'vio' because the bus was not initialized.
    [1.558434] Driver 'tpm_ibmvtpm' was unable to register with bus_type 'vio' because the bus was not initialized.
    $ grep VIO /boot/config-5.12.0-rc2+
    CONFIG_IBMVIO=y

The "vio" bus is not a real bus, it's a fake bus we use for hypervisor provided devices in LPARs (guests).

So on bare metal machines there is no vio bus, the devices that would appear on the vio bus are found via other mechanisms.

Thank you for the explanation. Two questions:

1. Could a bare metal system be detected, and the VIO “be skipped”?
2. Should the log level be changed to notice or info then, as it’s an expected failure?

[…]


Kind regards,

Paul
Re: /sys/kernel/debug/kmemleak empty despite kmemleak reports
Dear Catalin, Am 13.07.20 um 20:27 schrieb Catalin Marinas: On Thu, Jul 09, 2020 at 11:08:52PM +0200, Paul Menzel wrote: Am 09.07.20 um 19:57 schrieb Catalin Marinas: On Thu, Jul 09, 2020 at 04:37:10PM +0200, Paul Menzel wrote: Despite Linux 5.8-rc4 reporting memory leaks on the IBM POWER 8 S822LC, the file does not contain more information. $ dmesg […] > [48662.953323] perf: interrupt took too long (2570 > 2500), lowering kernel.perf_event_max_sample_rate to 77750 [48854.810636] perf: interrupt took too long (3216 > 3212), lowering kernel.perf_event_max_sample_rate to 62000 [52300.044518] perf: interrupt took too long (4244 > 4020), lowering kernel.perf_event_max_sample_rate to 47000 [52751.373083] perf: interrupt took too long (5373 > 5305), lowering kernel.perf_event_max_sample_rate to 37000 [53354.000363] perf: interrupt took too long (6793 > 6716), lowering kernel.perf_event_max_sample_rate to 29250 [53850.215606] perf: interrupt took too long (8672 > 8491), lowering kernel.perf_event_max_sample_rate to 23000 [57542.266099] perf: interrupt took too long (10940 > 10840), lowering kernel.perf_event_max_sample_rate to 18250 [57559.645404] perf: interrupt took too long (13714 > 13675), lowering kernel.perf_event_max_sample_rate to 14500 [61608.697728] Can't find PMC that caused IRQ [71774.463111] kmemleak: 12 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [92372.044785] process '@/usr/bin/gnatmake-5' started with executable stack [92849.380672] FS-Cache: Loaded [92849.417269] FS-Cache: Netfs 'nfs' registered for caching [92849.595974] NFS: Registering the id_resolver key type [92849.596000] Key type id_resolver registered [92849.596000] Key type id_legacy registered [101808.079143] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [106904.323471] Can't find PMC that caused IRQ [129416.391456] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [158171.604221] kmemleak: 34 new suspected memory leaks (see 
/sys/kernel/debug/kmemleak)
$ sudo cat /sys/kernel/debug/kmemleak

When they are no longer present, they are most likely false positives.

How can this be? Shouldn’t the false positive also be logged in `/sys/kernel/debug/kmemleak`?

Sorry, I wasn't clear. It can be a transient false positive. At a subsequent scan, kmemleak found pointers referring to the previously reported objects and no longer shows them.

Interesting. Is it possible to print a message in that case to avoid confusion?

Was this triggered during boot? Or under some workload?

From the timestamps it looks like under some load.

Was it during boot? I put a delay of 60s to avoid this but, depending on the platform, it can still trigger.

No, it happened after several hours of runtime.


Kind regards,

Paul
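
To tell a transient report from a persistent one, it may help to trigger a scan by hand and look at the file again afterwards. The sketch below shows one way to do that from C, assuming the documented debugfs interface where writing `scan` to the kmemleak file starts a scan and reading the file lists the current suspects. The helper names are made up for this example, and the path is passed in so the helpers can be tried on an ordinary file.

```c
#include <stdio.h>

/* Write a command such as "scan" to the kmemleak control file. */
static int kmemleak_command(const char *path, const char *cmd)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(cmd, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/* Count the lines currently in the file; 0 means nothing is listed. */
static long kmemleak_report_lines(const char *path)
{
	FILE *f = fopen(path, "r");
	long lines = 0;
	int c, last = '\n';

	if (!f)
		return -1;
	while ((c = fgetc(f)) != EOF) {
		if (c == '\n')
			lines++;
		last = c;
	}
	if (last != '\n')
		lines++;	/* final line without a trailing newline */
	fclose(f);
	return lines;
}
```

On a real system one would call `kmemleak_command("/sys/kernel/debug/kmemleak", "scan")` as root and then check whether `kmemleak_report_lines()` still returns 0. An empty file after a fresh scan would match the transient false positive explanation given above.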
Re: /sys/kernel/debug/kmemleak empty despite kmemleak reports
Dear Catalin, Am 09.07.20 um 19:57 schrieb Catalin Marinas: On Thu, Jul 09, 2020 at 04:37:10PM +0200, Paul Menzel wrote: Despite Linux 5.8-rc4 reporting memory leaks on the IBM POWER 8 S822LC, the file does not contain more information. $ dmesg […] > [48662.953323] perf: interrupt took too long (2570 > 2500), lowering kernel.perf_event_max_sample_rate to 77750 [48854.810636] perf: interrupt took too long (3216 > 3212), lowering kernel.perf_event_max_sample_rate to 62000 [52300.044518] perf: interrupt took too long (4244 > 4020), lowering kernel.perf_event_max_sample_rate to 47000 [52751.373083] perf: interrupt took too long (5373 > 5305), lowering kernel.perf_event_max_sample_rate to 37000 [53354.000363] perf: interrupt took too long (6793 > 6716), lowering kernel.perf_event_max_sample_rate to 29250 [53850.215606] perf: interrupt took too long (8672 > 8491), lowering kernel.perf_event_max_sample_rate to 23000 [57542.266099] perf: interrupt took too long (10940 > 10840), lowering kernel.perf_event_max_sample_rate to 18250 [57559.645404] perf: interrupt took too long (13714 > 13675), lowering kernel.perf_event_max_sample_rate to 14500 [61608.697728] Can't find PMC that caused IRQ [71774.463111] kmemleak: 12 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [92372.044785] process '@/usr/bin/gnatmake-5' started with executable stack [92849.380672] FS-Cache: Loaded [92849.417269] FS-Cache: Netfs 'nfs' registered for caching [92849.595974] NFS: Registering the id_resolver key type [92849.596000] Key type id_resolver registered [92849.596000] Key type id_legacy registered [101808.079143] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [106904.323471] Can't find PMC that caused IRQ [129416.391456] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [158171.604221] kmemleak: 34 new suspected memory leaks (see /sys/kernel/debug/kmemleak) $ sudo cat /sys/kernel/debug/kmemleak When they are no longer present, they are most 
likely false positives.

How can this be? Shouldn’t the false positive also be logged in `/sys/kernel/debug/kmemleak`?

Was this triggered during boot? Or under some workload?

From the timestamps it looks like under some load.


Kind regards,

Paul
/sys/kernel/debug/kmemleak empty despite kmemleak reports
Dear Linux folks, Despite Linux 5.8-rc4 reporting memory leaks on the IBM POWER 8 S822LC, the file does not contain more information. $ dmesg […] > [48662.953323] perf: interrupt took too long (2570 > 2500), lowering kernel.perf_event_max_sample_rate to 77750 [48854.810636] perf: interrupt took too long (3216 > 3212), lowering kernel.perf_event_max_sample_rate to 62000 [52300.044518] perf: interrupt took too long (4244 > 4020), lowering kernel.perf_event_max_sample_rate to 47000 [52751.373083] perf: interrupt took too long (5373 > 5305), lowering kernel.perf_event_max_sample_rate to 37000 [53354.000363] perf: interrupt took too long (6793 > 6716), lowering kernel.perf_event_max_sample_rate to 29250 [53850.215606] perf: interrupt took too long (8672 > 8491), lowering kernel.perf_event_max_sample_rate to 23000 [57542.266099] perf: interrupt took too long (10940 > 10840), lowering kernel.perf_event_max_sample_rate to 18250 [57559.645404] perf: interrupt took too long (13714 > 13675), lowering kernel.perf_event_max_sample_rate to 14500 [61608.697728] Can't find PMC that caused IRQ [71774.463111] kmemleak: 12 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [92372.044785] process '@/usr/bin/gnatmake-5' started with executable stack [92849.380672] FS-Cache: Loaded [92849.417269] FS-Cache: Netfs 'nfs' registered for caching [92849.595974] NFS: Registering the id_resolver key type [92849.596000] Key type id_resolver registered [92849.596000] Key type id_legacy registered [101808.079143] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [106904.323471] Can't find PMC that caused IRQ [129416.391456] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [158171.604221] kmemleak: 34 new suspected memory leaks (see /sys/kernel/debug/kmemleak) $ sudo cat /sys/kernel/debug/kmemleak $ Kind regards, Paul
Re: Using Firefox hangs system
Dear Nicholas, Am 07.07.20 um 09:03 schrieb Nicholas Piggin: Excerpts from Paul Menzel's message of July 6, 2020 3:20 pm: Am 06.07.20 um 02:41 schrieb Nicholas Piggin: Excerpts from Paul Menzel's message of July 5, 2020 8:30 pm: Am 05.07.20 um 11:22 schrieb Paul Menzel: [ 572.253008] Oops: Exception in kernel mode, sig: 5 [#1] [ 572.253198] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV [ 572.253232] Modules linked in: tcp_diag inet_diag unix_diag xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bridge stp llc overlay xfs kvm_hv kvm binfmt_misc joydev uas usb_storage vmx_crypto bnx2x crct10dif_vpmsum ofpart cmdlinepart powernv_flash mtd mdio ibmpowernv at24 ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd powernv_rng sch_fq_codel parport_pc ppdev lp nfsd parport auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic libcrc32c xor zstd_compress raid6_pq input_leds mac_hid hid_generic ast drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci drm_panel_orientation_quirks libahci usbhid hid crc32c_vpmsum uio_pdrv_genirq uio [ 572.253639] CPU: 4 PID: 6728 Comm: Web Content Not tainted 5.8.0-rc3+ #1 [ 572.253659] NIP: c000ff5c LR: c001a8f8 CTR: c01d5f00 [ 572.253835] REGS: c07f31f0f420 TRAP: 1500 Not tainted (5.8.0-rc3+) [ 572.253854] MSR: 9290b033 CR: 28c48482 XER: 2000 [ 572.253888] CFAR: c000fecc IRQMASK: 1 [ 572.253888] GPR00: c001b228 c07f31f0f6b0 c1f9a900 c07f351544d0 [ 572.253888] GPR04: c07f31f0fe90 c07f351544f0 c07f32e522b0 [ 572.253888] GPR08: 2000 90009033 c07fbcd85800 [ 572.253888] GPR12: 8800 c07fb680 0005 0004 [ 572.253888] GPR16: c07f35153800 c07f35154130 0005 0001 [ 572.253888] GPR20: 0024 c07f32e51e68 c07f35154028 007fd8da [ 572.253888] GPR24: 007fd8da 
c07f351544d0 c07e9a4024d0 c1665f18 [ 572.253888] GPR28: c07f351544d0 c07f35153800 9290f033 c07f35153800 [ 572.254079] NIP [c000ff5c] save_fpu+0xa8/0x2ac [ 572.254098] LR [c001a8f8] __giveup_fpu+0x28/0x80 [ 572.254114] Call Trace: [ 572.254128] [c07f31f0f6b0] [c07f35153980] 0xc07f35153980 (unreliable) [ 572.254156] [c07f31f0f6e0] [c001b228] giveup_all+0x128/0x150 [ 572.254327] [c07f31f0f710] [c001c124] __switch_to+0x104/0x490 [ 572.254352] [c07f31f0f770] [c10d2e34] __schedule+0x2e4/0xa10 [ 572.254374] [c07f31f0f840] [c10d35d4] schedule+0x74/0x140 [ 572.254397] [c07f31f0f870] [c10d9478] schedule_timeout+0x358/0x5d0 [ 572.254424] [c07f31f0f980] [c10d5638] wait_for_completion+0xc8/0x210 [ 572.254451] [c07f31f0fa00] [c0608ed4] do_coredump+0x3a4/0xd60 [ 572.254625] [c07f31f0fba0] [c018d1cc] get_signal+0x1dc/0xd00 [ 572.254648] [c07f31f0fcc0] [c001f088] do_notify_resume+0x158/0x450 [ 572.254672] [c07f31f0fda0] [c0037d04] interrupt_exit_user_prepare+0x1c4/0x230 [ 572.254699] [c07f31f0fe20] [c000f2b4] interrupt_return+0x14/0x1c0 [ 572.254720] Instruction dump: [ 572.254882] dae60170 db060180 db260190 db4601a0 db6601b0 db8601c0 dba601d0 dbc601e0 [ 572.254912] dbe601f0 48000204 3880 f250 <7c062798> f250 38800010 f0210a50 [ 572.254946] ---[ end trace ba4452ee5c77d58e ]--- Please find all the messages attached. "Oops: Exception in kernel mode, sig: 5 [#1]" Unfortunately it's a very poor error message. I think it is a 0x1500 exception triggering in the kernel FP register saving. Do you have the CONFIG_PPC_DENORMALISATION config option set? Yes, as it’s set in the Ubuntu Linux kernel configuration, I have it set too. $ grep DENORMALI /boot/config-* /boot/config-4.15.0-23-generic:CONFIG_PPC_DENORMALISATION=y /boot/config-5.4.0-40-generic:CONFIG_PPC_DENORMALISATION=y /boot/config-5.7.0-rc5+:CONFIG_PPC_DENORMALISATION=y /boot/config-5.8.0-rc3+:CONFIG_PPC_DENORMALISATION=y Ah thanks I was able to reproduce with a little denorm test case. 
The denorm interrupt handler got broken by some careless person. This patch should hopefully fix it for you? Yes, it does. Thank you. --- arch/powerpc/kernel/exceptions-64s.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index fa080694e581..0fc8bad878b2 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/ker
Re: Using Firefox hangs system
Dear Nicholas, Thank you for the quick response. Am 06.07.20 um 02:41 schrieb Nicholas Piggin: Excerpts from Paul Menzel's message of July 5, 2020 8:30 pm: Am 05.07.20 um 11:22 schrieb Paul Menzel: With an IBM S822LC with Ubuntu 20.04, after updating to Firefox 78.0, using Firefox seems to hang the system. This happened with self-built Linux 5.7-rc5+ and now with 5.8-rc3+. (At least I believe the Firefox update is causing this.) Log in is impossible, and using the Serial over LAN over IPMI shows the messages below. [ 2620.579187] watchdog: BUG: soft lockup - CPU#125 stuck for 22s! [swapper/125:0] [ 2620.579378] Modules linked in: tcp_diag inet_diag unix_diag xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bridge stp llc overlay xfs kvm_hv kvm joydev binfmt_misc uas usb_storage vmx_crypto ofpart cmdlinepart bnx2x powernv_flash mtd mdio crct10dif_vpmsum at24 ibmpowernv ipmi_powernv ipmi_devintf powernv_rng ipmi_msghandler opal_prd sch_fq_codel parport_pc nfsd ppdev lp auth_rpcgss nfs_acl parport lockd grace sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic libcrc32c xor zstd_compress raid6_pq input_leds mac_hid hid_generic ast drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm drm_panel_orientation_quirks ahci libahci usbhid hid crc32c_vpmsum uio_pdrv_genirq uio [ 2620.579537] CPU: 125 PID: 0 Comm: swapper/125 Tainted: G D W L 5.8.0-rc3+ #1 [ 2620.579552] NIP: c10dad38 LR: c10dad30 CTR: c0237830 [ 2620.579568] REGS: c0ffcb8c7600 TRAP: 0900 Tainted: G D W L (5.8.0-rc3+) [ 2620.579582] MSR: 90009033 CR: 44004228 XER: [ 2620.579599] CFAR: c10dad44 IRQMASK: 0 [ 2620.579599] GPR00: c023718c c0ffcb8c7890 c1f9a900 [ 2620.579599] GPR04: c1fce438 0078 00010008c1f2 [ 2620.579599] GPR08: 00ffd96a 8087 c1fd25e0 [ 
2620.579599] GPR12: 4400 c072f680 c1ea36d8 c0ffcb859800 [ 2620.579599] GPR16: c166c880 c16f8e00 000a c0ffcb859800 [ 2620.579599] GPR20: 0100 c166c918 c1fd21e8 c0ffcb859800 [ 2620.579599] GPR24: 00ffd96a c1d44b80 c1d53780 0008 [ 2620.579599] GPR28: c1fd21e0 0001 c1d44b80 [ 2620.579711] NIP [c10dad38] _raw_spin_lock_irqsave+0x98/0x120 [ 2620.579724] LR [c10dad30] _raw_spin_lock_irqsave+0x90/0x120 [ 2620.579737] Call Trace: [ 2620.579746] [c0ffcb8c7890] [c13c84a0] ncsi_ops+0x209f50/0x2dc1d8 (unreliable) [ 2620.579763] [c0ffcb8c78d0] [c023718c] rcu_core+0xfc/0x7a0 [ 2620.579777] [c0ffcb8c7970] [c10db81c] __do_softirq+0x17c/0x534 [ 2620.579791] [c0ffcb8c7aa0] [c01786f4] irq_exit+0xd4/0x130 [ 2620.579805] [c0ffcb8c7ad0] [c0025eec] timer_interrupt+0x13c/0x370 [ 2620.579821] [c0ffcb8c7b40] [c00165c0] replay_soft_interrupts+0x320/0x3f0 [ 2620.579837] [c0ffcb8c7d30] [c00166d8] arch_local_irq_restore+0x48/0xa0 [ 2620.579853] [c0ffcb8c7d50] [c0de2fe0] cpuidle_enter_state+0x100/0x780 [snip] I have to warm reset the system to get it working again. I am unable to reproduce this with Ubuntu’s Linux Okay, not sure what that would be from, looks like RCU perhaps. Anyway if it comes up again, let us know. Ah, it’s a different trace. I think it’s just an effect of the first error (as below), as some CPUs lock up. I wasn’t able to capture the start of the trace above. In the attachment for the hang *below* you can also see [ 664.705193] watchdog: BUG: soft lockup - CPU#134 stuck for 26s! [swapper/134:0] after the first Oops. With Linux 5.8-rc3+, I got now the beginning of the Linux messages. 
[ 572.253008] Oops: Exception in kernel mode, sig: 5 [#1] [ 572.253198] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV [ 572.253232] Modules linked in: tcp_diag inet_diag unix_diag xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bridge stp llc overlay xfs kvm_hv kvm binfmt_misc joydev uas usb_storage vmx_crypto bnx2x crct10dif_vpmsum ofpart cmdlinepart powernv_flash mtd mdio ibmpowernv at24 ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd powernv_rng sch_fq_codel parport_pc ppdev lp nfsd parport auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic libcrc32c xor zstd_compress raid6_pq input_leds ma
Using Firefox hangs system
Dear Linux folks,

With an IBM S822LC with Ubuntu 20.04, after updating to Firefox 78.0, using Firefox seems to hang the system. This happened with self-built Linux 5.7-rc5+ and now with 5.8-rc3+. (At least I believe the Firefox update is causing this.)

Log in is impossible, and using the Serial over LAN over IPMI shows the messages below.

[ 2620.579187] watchdog: BUG: soft lockup - CPU#125 stuck for 22s! [swapper/125:0]
[ 2620.579378] Modules linked in: tcp_diag inet_diag unix_diag xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bridge stp llc overlay xfs kvm_hv kvm joydev binfmt_misc uas usb_storage vmx_crypto ofpart cmdlinepart bnx2x powernv_flash mtd mdio crct10dif_vpmsum at24 ibmpowernv ipmi_powernv ipmi_devintf powernv_rng ipmi_msghandler opal_prd sch_fq_codel parport_pc nfsd ppdev lp auth_rpcgss nfs_acl parport lockd grace sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic libcrc32c xor zstd_compress raid6_pq input_leds mac_hid hid_generic ast drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm drm_panel_orientation_quirks ahci libahci usbhid hid crc32c_vpmsum uio_pdrv_genirq uio
[ 2620.579537] CPU: 125 PID: 0 Comm: swapper/125 Tainted: G D WL 5.8.0-rc3+ #1
[ 2620.579552] NIP: c10dad38 LR: c10dad30 CTR: c0237830
[ 2620.579568] REGS: c0ffcb8c7600 TRAP: 0900 Tainted: G D WL (5.8.0-rc3+)
[ 2620.579582] MSR: 90009033 CR: 44004228 XER:
[ 2620.579599] CFAR: c10dad44 IRQMASK: 0
[ 2620.579599] GPR00: c023718c c0ffcb8c7890 c1f9a900
[ 2620.579599] GPR04: c1fce438 0078 00010008c1f2
[ 2620.579599] GPR08: 00ffd96a 8087 c1fd25e0
[ 2620.579599] GPR12: 4400 c072f680 c1ea36d8 c0ffcb859800
[ 2620.579599] GPR16: c166c880 c16f8e00 000a c0ffcb859800
[ 2620.579599] GPR20: 0100 c166c918 c1fd21e8 c0ffcb859800
[ 2620.579599] GPR24: 00ffd96a c1d44b80 c1d53780 0008
[ 2620.579599] GPR28: c1fd21e0 0001 c1d44b80
[ 2620.579711] NIP [c10dad38] _raw_spin_lock_irqsave+0x98/0x120
[ 2620.579724] LR [c10dad30] _raw_spin_lock_irqsave+0x90/0x120
[ 2620.579737] Call Trace:
[ 2620.579746] [c0ffcb8c7890] [c13c84a0] ncsi_ops+0x209f50/0x2dc1d8 (unreliable)
[ 2620.579763] [c0ffcb8c78d0] [c023718c] rcu_core+0xfc/0x7a0
[ 2620.579777] [c0ffcb8c7970] [c10db81c] __do_softirq+0x17c/0x534
[ 2620.579791] [c0ffcb8c7aa0] [c01786f4] irq_exit+0xd4/0x130
[ 2620.579805] [c0ffcb8c7ad0] [c0025eec] timer_interrupt+0x13c/0x370
[ 2620.579821] [c0ffcb8c7b40] [c00165c0] replay_soft_interrupts+0x320/0x3f0
[ 2620.579837] [c0ffcb8c7d30] [c00166d8] arch_local_irq_restore+0x48/0xa0
[ 2620.579853] [c0ffcb8c7d50] [c0de2fe0] cpuidle_enter_state+0x100/0x780
[ 2620.579869] [c0ffcb8c7dd0] [c0de36fc] cpuidle_enter+0x4c/0x70
[ 2620.579883] [c0ffcb8c7e10] [c01c6bb4] do_idle+0x3c4/0x590
[ 2620.579896] [c0ffcb8c7ee0] [c01c6fcc] cpu_startup_entry+0x3c/0x50
[ 2620.579911] [c0ffcb8c7f10] [c00615f4] start_secondary+0x2d4/0x3b0
[ 2620.579927] [c0ffcb8c7f90] [c000c454] start_secondary_prolog+0x10/0x14
[ 2620.579941] Instruction dump:
[ 2620.579950] 6000 6000 7c0802a6 fba10028 fbe10038 7c7f1b78 f8010050 8bad0988
[ 2620.579967] 7fc3f378 4af3b96d 6000 7c210b78 <6000> 813f 2c29 4082fff0
[ 2645.907192] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2660.067201] watchdog: CPU 0 detected hard LOCKUP on other CPUs 113
[ 2660.067385] watchdog: CPU 0 TB:1390608252047, last SMP heartbeat TB:1382840188990 (15171ms ago)
[ 2708.927190] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2724.067205] watchdog: CPU 0 detected hard LOCKUP on other CPUs 87
[ 2724.067396] watchdog: CPU 0 TB:1423376252137, last SMP heartbeat TB:1415618427864 (15152ms ago)
[ 2771.947188] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
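The hard-lockup messages in this log print two raw timebase (TB) readings next to the elapsed time in milliseconds, so the two can be cross-checked. The following is a minimal sketch of that arithmetic, assuming a 512 MHz timebase; the frequency is not stated anywhere in the log and is an assumption here, although the printed values are consistent with it. The function name `tb_delta_ms` is made up for this illustration.

```python
# Sketch: convert the difference between two timebase readings to milliseconds.
# ASSUMPTION: the timebase ticks at 512 MHz (not stated in the log itself).
ASSUMED_TB_HZ = 512_000_000


def tb_delta_ms(now_tb: int, last_tb: int, tb_hz: int = ASSUMED_TB_HZ) -> int:
    """Return the elapsed whole milliseconds between two timebase values."""
    return (now_tb - last_tb) * 1000 // tb_hz


# Values copied from the two "last SMP heartbeat" lines in the log.
print(tb_delta_ms(1390608252047, 1382840188990))  # log says 15171ms
print(tb_delta_ms(1423376252137, 1415618427864))  # log says 15152ms
```

Under that assumption both results agree with the "ms ago" figures in the log, which suggests the heartbeat on the affected CPUs had been missing for about 15 seconds when the watchdog fired.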
Re: Why is suspend with s2idle available on POWER8 systems?
Dear Rafael,

On 04/29/2019 09:17 AM, Rafael J. Wysocki wrote:
> On Sat, Apr 27, 2019 at 12:54 PM Paul Menzel wrote:
>> Updating an IBM S822LC from Ubuntu 18.10 to 19.04 some user space stuff
>> seems to have changed, so that going into sleep/suspend is enabled.
>>
>> That raises two questions.
>>
>> 1. Is suspend actually supported on a POWER8 processor?
>
> Suspend-to-idle is a special variant of system suspend that does not
> depend on any special platform support. It works by suspending
> devices and letting all of the CPUs in the system go idle (hence the
> name).
>
> Also see
> https://www.kernel.org/doc/html/latest/admin-guide/pm/sleep-states.html#suspend-to-idle

Thanks. I guess I mixed it up with the new S0ix-states [1].

>>> Apr 27 10:18:13 power NetworkManager[7534]: [1556353093.7224] manager: sleep: sleep requested (sleeping: no e
>>> Apr 27 10:18:13 power systemd[1]: Reached target Sleep.
>>> Apr 27 10:18:13 power systemd[1]: Starting Suspend...
>>> Apr 27 10:18:13 power systemd-sleep[82190]: Suspending system...
>>> Apr 27 10:18:13 power kernel: PM: suspend entry (s2idle)
>>> -- Reboot --
>>
>>> $ uname -m
>>> ppc64le
>>> $ more /proc/version
>>> Linux version 5.1.0-rc6+ (joey@power) (gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1)) #1 SMP Sat Apr 27 10:01:48 CEST 2019
>>> $ more /sys/power/mem_sleep
>>> [s2idle]
>>> $ more /sys/power/state
>>> freeze mem
>>> $ grep _SUSPEND /boot/config-5.0.0-14-generic # also enabled in Ubuntu’s configuration
>>> CONFIG_ARCH_SUSPEND_POSSIBLE=y
>>> CONFIG_SUSPEND=y
>>> CONFIG_SUSPEND_FREEZER=y
>>> # CONFIG_SUSPEND_SKIP_SYNC is not set
>>> # CONFIG_PM_TEST_SUSPEND is not set
>>
>> Should the Kconfig symbol `SUSPEND` be selectable? If yes, should there
>> be some detection during runtime?
>>
>> 2. If it is supported, what are the ways of getting it to resume? What
>> would the IPMI command be?
>
> That would depend on the distribution.
>
> Generally, you need to set up at least one device to generate wakeup
> interrupts.
>
> The interface to do that are the /sys/devices/.../power/wakeup files,
> but that has to cause enable_irq_wake() to be called for the given IRQ,
> so some support in the underlying drivers needs to be present for it to
> work.
>
> USB devices generally work as wakeup sources if the controllers reside
> on a PCI bus, for example.

```
$ find /sys/devices/ -name wakeup | xargs grep enabled
/sys/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:09.0/0021:0d:00.0/usb1/1-3/1-3.4/power/wakeup:enabled
/sys/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:09.0/0021:0d:00.0/power/wakeup:enabled
$ lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    |__ Port 3: Dev 2, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 1: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 480M
        |__ Port 2: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 480M
        |__ Port 3: Dev 5, If 0, Class=Mass Storage, Driver=usb-storage, 480M
        |__ Port 4: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 4: Dev 6, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
$ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 006: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard and Mouse
Bus 001 Device 005: ID 046b:ff31 American Megatrends, Inc.
Bus 001 Device 004: ID 046b:ff40 American Megatrends, Inc.
Bus 001 Device 003: ID 046b:ff20 American Megatrends, Inc.
Bus 001 Device 002: ID 046b:ff01 American Megatrends, Inc.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
```

Kind regards,

Paul

[1]: https://01.org/blogs/qwang59/2018/how-achieve-s0ix-states-linux
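For scripting the wakeup-source check discussed in this message, the `find ... | xargs grep enabled` pipeline can be approximated in Python. This is only a sketch: the function name `enabled_wakeup_sources` is hypothetical, and it looks solely at regular `power/wakeup` files, which is slightly narrower than what `find -name wakeup` matches.

```python
import os


def enabled_wakeup_sources(root="/sys/devices"):
    """Return device directories below `root` whose power/wakeup reads 'enabled'."""
    found = []
    # os.walk does not follow symlinks by default, which avoids sysfs loops.
    for dirpath, _dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) != "power" or "wakeup" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "wakeup")) as fh:
                state = fh.read().strip()
        except OSError:
            continue  # attribute not readable; skip it
        if state == "enabled":
            found.append(os.path.dirname(dirpath))
    return sorted(found)


# Usage on a real system (output depends on the machine):
#     for device in enabled_wakeup_sources():
#         print(device)
```

Writing `enabled` or `disabled` to the same file changes the setting, but as noted in the quoted reply, that only has an effect if the underlying driver supports wakeup for that device.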
Why is suspend with s2idle available on POWER8 systems?
Dear Linux folks,

Updating an IBM S822LC from Ubuntu 18.10 to 19.04 some user space stuff seems to have changed, so that going into sleep/suspend is enabled.

That raises two questions.

1. Is suspend actually supported on a POWER8 processor?

Apr 27 10:18:13 power NetworkManager[7534]: [1556353093.7224] manager: sleep: sleep requested (sleeping: no e
Apr 27 10:18:13 power systemd[1]: Reached target Sleep.
Apr 27 10:18:13 power systemd[1]: Starting Suspend...
Apr 27 10:18:13 power systemd-sleep[82190]: Suspending system...
Apr 27 10:18:13 power kernel: PM: suspend entry (s2idle)
-- Reboot --

$ uname -m
ppc64le
$ more /proc/version
Linux version 5.1.0-rc6+ (joey@power) (gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1)) #1 SMP Sat Apr 27 10:01:48 CEST 2019
$ more /sys/power/mem_sleep
[s2idle]
$ more /sys/power/state
freeze mem
$ grep _SUSPEND /boot/config-5.0.0-14-generic # also enabled in Ubuntu’s configuration
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
# CONFIG_SUSPEND_SKIP_SYNC is not set
# CONFIG_PM_TEST_SUSPEND is not set

Should the Kconfig symbol `SUSPEND` be selectable? If yes, should there be some detection during runtime?

2. If it is supported, what are the ways of getting it to resume? What would the IPMI command be?

For now I disabled the automatic suspend, masking the targets [1].

Kind regards,

Paul

[1]: https://wiki.debian.org/Suspend#Disable_suspend_and_hibernation
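The `/sys/power/mem_sleep` output in this message uses the usual sysfs convention of listing the available variants and marking the selected one with square brackets. A small sketch of how that line can be parsed follows; the helper name `parse_mem_sleep` is hypothetical, and the second sample string is an illustrative value that does not come from this system.

```python
def parse_mem_sleep(text):
    """Split a mem_sleep line into (active variant, list of available variants)."""
    tokens = text.split()
    # The active variant is the token wrapped in square brackets.
    active = next(
        (t[1:-1] for t in tokens if t.startswith("[") and t.endswith("]")),
        None,
    )
    available = [t.strip("[]") for t in tokens]
    return active, available


# "[s2idle]" is what the S822LC in this thread reports.
print(parse_mem_sleep("[s2idle]"))
# Illustrative only: a machine offering two variants with "deep" selected.
print(parse_mem_sleep("s2idle [deep]"))
```

On the system in this thread the only entry is `s2idle`, which matches the explanation that suspend-to-idle needs no platform support.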
Re: Reading `/sys/kernel/debug/kmemleak` takes 3 s and content not shown
Dear Christophe,

On 26.03.19 13:55, Christophe Leroy wrote:
> On 26/03/2019 at 13:49, Paul Menzel wrote:
>> On 19.02.19 10:44, Paul Menzel wrote:
>>> On the IBM S822LC (8335-GTA) with Ubuntu 18.10, and Linux 5.0-rc5+
>>> accessing `/sys/kernel/debug/kmemleak` takes a long time. According
>>> to strace it takes three seconds.
>>>
>>> ```
>>> $ dmesg | grep leak
>>> [ 4.407957] kmemleak: Kernel memory leak detector initialized
>>> [ 4.407959] kmemleak: Automatic memory scanning thread started
>>> [745989.625624] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
>>> [1002619.951902] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
>>> ```
>>>
>>> Unfortunately, the leaks supposedly stored in that file are not shown
>>> either.
>>
>> The problem is still present with Linux 5.0. Do you have an idea how
>> to fix this?
>
> Have you identified a previous version that works properly?

It seems to have worked with Linux 4.18-rc4+ [1].

> If so, have you been able to bisect the problem?

No, sorry, I have not. I do not have resources for this.

Kind regards,

Paul

[1]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-July/175658.html
Re: Reading `/sys/kernel/debug/kmemleak` takes 3 s and content not shown
Dear Linux folks,

On 19.02.19 10:44, Paul Menzel wrote:
> On the IBM S822LC (8335-GTA) with Ubuntu 18.10, and Linux 5.0-rc5+
> accessing `/sys/kernel/debug/kmemleak` takes a long time. According to
> strace it takes three seconds.
>
> ```
> $ sudo strace -tt -T cat /sys/kernel/debug/kmemleak
> 10:35:49.861641 execve("/bin/cat", ["cat", "/sys/kernel/debug/kmemleak"], 0x7dbcb518 /* 16 vars */) = 0 <0.000293>
> 10:35:49.862112 brk(NULL) = 0x75b12a5 <0.12>
> 10:35:49.862190 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.15>
> 10:35:49.862261 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.15>
> 10:35:49.862324 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.18>
> 10:35:49.862389 fstat(3, {st_mode=S_IFREG|0644, st_size=143482, ...}) = 0 <0.11>
> 10:35:49.862444 mmap(NULL, 143482, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ce4a115 <0.17>
> 10:35:49.862501 close(3) = 0 <0.11>
> 10:35:49.862550 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.15>
> 10:35:49.862615 openat(AT_FDCWD, "/lib/powerpc64le-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.19>
> 10:35:49.862676 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0pN\2\0\0\0\0\0"..., 832) = 832 <0.11>
> 10:35:49.862731 fstat(3, {st_mode=S_IFREG|0755, st_size=2310856, ...}) = 0 <0.11>
> 10:35:49.862783 mmap(NULL, 2380672, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ce4a0f0 <0.18>
> 10:35:49.862842 mprotect(0x7ce4a112, 65536, PROT_NONE) = 0 <0.19>
> 10:35:49.862899 mmap(0x7ce4a113, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22) = 0x7ce4a113 <0.19>
> 10:35:49.862990 close(3) = 0 <0.10>
> 10:35:49.863110 mprotect(0x7ce4a113, 65536, PROT_READ) = 0 <0.17>
> 10:35:49.863192 mprotect(0x75ad43b, 65536, PROT_READ) = 0 <0.16>
> 10:35:49.863252 mprotect(0x7ce4a11e, 65536, PROT_READ) = 0 <0.15>
> 10:35:49.863305 munmap(0x7ce4a115, 143482) = 0 <0.22>
> 10:35:49.863446 brk(NULL) = 0x75b12a5 <0.11>
> 10:35:49.863495 brk(0x75b12a8) = 0x75b12a8 <0.14>
> 10:35:49.863561 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 <0.19>
> 10:35:49.863624 fstat(3, {st_mode=S_IFREG|0644, st_size=6035920, ...}) = 0 <0.10>
> 10:35:49.863677 mmap(NULL, 6035920, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ce4a093 <0.17>
> 10:35:49.863736 close(3) = 0 <0.11>
> 10:35:49.863828 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.10>
> 10:35:49.863881 openat(AT_FDCWD, "/sys/kernel/debug/kmemleak", O_RDONLY) = 3 <0.34>
> 10:35:49.863956 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.29>
> 10:35:49.864028 fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0 <0.11>
> 10:35:49.864076 mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ce4a08f <0.17>
> 10:35:49.864146 read(3, "", 131072) = 0 <3.528503>
> 10:35:53.392797 munmap(0x7ce4a08f, 262144) = 0 <0.92>
> 10:35:53.392957 close(3) = 0 <0.29>
> 10:35:53.393038 close(1) = 0 <0.10>
> 10:35:53.393078 close(2) = 0 <0.09>
> 10:35:53.393123 exit_group(0) = ?
> 10:35:53.393280 +++ exited with 0 +++
> $ uname -a
> Linux flughafenberlinbrandenburgwillybrandt 5.0.0-rc5+ #1 SMP Thu Feb 7 11:23:11 CET 2019 ppc64le ppc64le ppc64le GNU/Linux
> $ more /proc/version
> Linux version 5.0.0-rc5+ (pmenzel@flughafenberlinbrandenburgwillybrandt) (gcc version 8.2.0 (Ubuntu 8.2.0-7ubuntu1)) #1 SMP Thu Feb 7 11:23:11 CET 2019
> $ more /proc/cmdline
> root=UUID=2c3dd738-785a-469b-843e-9f0ba8b47b0d ro rootflags=subvol=@ quiet splash
> $ grep KMEMLEAK /boot/config-5.0.0-rc5+
> CONFIG_HAVE_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=1
> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
> CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
> $ grep KMEMLEAK /boot/config-4.18.0-rc4+
> CONFIG_HAVE_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=1
> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
> $ dmesg | grep leak
> [4.407957] kmemleak: Kernel memory leak detector initialized
> [4.407959] kmemleak: Automatic memory scanning thread started
> [745989.625624] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> [1002619.951902] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> ```
>
> Unfortunately, the leaks supposedly stored in that file are not shown
> either.

The problem is still present with Linux 5.0. Do you have an idea how to fix this?

Kind regards,

Paul
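To spot the slow call in a long `strace -tt -T` capture like the one quoted above without reading every line, the trailing `<seconds>` field can be parsed and sorted. The sketch below is only an illustration; the regular expression and the helper name `slowest_syscalls` are assumptions, and the sample uses just the two trace lines whose timings survived intact in this archive copy.

```python
import re

# Matches "<timestamp> <syscall>(<args>) = <result> <seconds>" as printed by
# strace -tt -T. ASSUMPTION: one complete syscall per line, no "unfinished" lines.
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<name>\w+)\(.*<(?P<secs>\d+\.\d+)>\s*$")


def slowest_syscalls(lines, top=3):
    """Return up to `top` (seconds, syscall, timestamp) tuples, slowest first."""
    calls = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            calls.append(
                (float(match.group("secs")), match.group("name"), match.group("ts"))
            )
    return sorted(calls, reverse=True)[:top]


sample = [
    '10:35:49.861641 execve("/bin/cat", ["cat", "/sys/kernel/debug/kmemleak"], '
    "0x7dbcb518 /* 16 vars */) = 0 <0.000293>",
    '10:35:49.864146 read(3, "", 131072) = 0 <3.528503>',
    "10:35:53.393123 exit_group(0) = ?",  # no duration, so it is skipped
]

for secs, name, ts in slowest_syscalls(sample):
    print(f"{secs:.6f}s {name} at {ts}")
```

For this trace the `read()` on the kmemleak file stands out at about 3.5 seconds and returns 0 bytes, which is the behaviour reported in this thread.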
Reading `/sys/kernel/debug/kmemleak` takes 3 s and content not shown
Dear Linux folks,

On the IBM S822LC (8335-GTA) with Ubuntu 18.10, and Linux 5.0-rc5+ accessing `/sys/kernel/debug/kmemleak` takes a long time. According to strace it takes three seconds.

```
$ sudo strace -tt -T cat /sys/kernel/debug/kmemleak
10:35:49.861641 execve("/bin/cat", ["cat", "/sys/kernel/debug/kmemleak"], 0x7dbcb518 /* 16 vars */) = 0 <0.000293>
10:35:49.862112 brk(NULL) = 0x75b12a5 <0.12>
10:35:49.862190 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.15>
10:35:49.862261 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.15>
10:35:49.862324 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.18>
10:35:49.862389 fstat(3, {st_mode=S_IFREG|0644, st_size=143482, ...}) = 0 <0.11>
10:35:49.862444 mmap(NULL, 143482, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ce4a115 <0.17>
10:35:49.862501 close(3) = 0 <0.11>
10:35:49.862550 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.15>
10:35:49.862615 openat(AT_FDCWD, "/lib/powerpc64le-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.19>
10:35:49.862676 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0pN\2\0\0\0\0\0"..., 832) = 832 <0.11>
10:35:49.862731 fstat(3, {st_mode=S_IFREG|0755, st_size=2310856, ...}) = 0 <0.11>
10:35:49.862783 mmap(NULL, 2380672, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ce4a0f0 <0.18>
10:35:49.862842 mprotect(0x7ce4a112, 65536, PROT_NONE) = 0 <0.19>
10:35:49.862899 mmap(0x7ce4a113, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22) = 0x7ce4a113 <0.19>
10:35:49.862990 close(3) = 0 <0.10>
10:35:49.863110 mprotect(0x7ce4a113, 65536, PROT_READ) = 0 <0.17>
10:35:49.863192 mprotect(0x75ad43b, 65536, PROT_READ) = 0 <0.16>
10:35:49.863252 mprotect(0x7ce4a11e, 65536, PROT_READ) = 0 <0.15>
10:35:49.863305 munmap(0x7ce4a115, 143482) = 0 <0.22>
10:35:49.863446 brk(NULL) = 0x75b12a5 <0.11>
10:35:49.863495 brk(0x75b12a8) = 0x75b12a8 <0.14>
10:35:49.863561 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 <0.19>
10:35:49.863624 fstat(3, {st_mode=S_IFREG|0644, st_size=6035920, ...}) = 0 <0.10>
10:35:49.863677 mmap(NULL, 6035920, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ce4a093 <0.17>
10:35:49.863736 close(3) = 0 <0.11>
10:35:49.863828 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.10>
10:35:49.863881 openat(AT_FDCWD, "/sys/kernel/debug/kmemleak", O_RDONLY) = 3 <0.34>
10:35:49.863956 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.29>
10:35:49.864028 fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0 <0.11>
10:35:49.864076 mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ce4a08f <0.17>
10:35:49.864146 read(3, "", 131072) = 0 <3.528503>
10:35:53.392797 munmap(0x7ce4a08f, 262144) = 0 <0.92>
10:35:53.392957 close(3) = 0 <0.29>
10:35:53.393038 close(1) = 0 <0.10>
10:35:53.393078 close(2) = 0 <0.09>
10:35:53.393123 exit_group(0) = ?
10:35:53.393280 +++ exited with 0 +++
$ uname -a
Linux flughafenberlinbrandenburgwillybrandt 5.0.0-rc5+ #1 SMP Thu Feb 7 11:23:11 CET 2019 ppc64le ppc64le ppc64le GNU/Linux
$ more /proc/version
Linux version 5.0.0-rc5+ (pmenzel@flughafenberlinbrandenburgwillybrandt) (gcc version 8.2.0 (Ubuntu 8.2.0-7ubuntu1)) #1 SMP Thu Feb 7 11:23:11 CET 2019
$ more /proc/cmdline
root=UUID=2c3dd738-785a-469b-843e-9f0ba8b47b0d ro rootflags=subvol=@ quiet splash
$ grep KMEMLEAK /boot/config-5.0.0-rc5+
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=1
# CONFIG_DEBUG_KMEMLEAK_TEST is not set
# CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
$ grep KMEMLEAK /boot/config-4.18.0-rc4+
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=1
# CONFIG_DEBUG_KMEMLEAK_TEST is not set
# CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
$ dmesg | grep leak
[4.407957] kmemleak: Kernel memory leak detector initialized
[4.407959] kmemleak: Automatic memory scanning thread started
[745989.625624] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[1002619.951902] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
```

Unfortunately, the leaks supposedly stored in that file are not shown either.

Kind regards,

Paul

[0.00] hash-mmu: Page sizes from device-tree:
[0.00] hash-mmu: base_shift=12: shift=12, sllp=0x, avpnm=0x, tlbiel=1, penc=0
[0.00] hash-mmu: base_shift=12: shift=16, sllp=0x, avpnm=0x, tlbiel=1, penc=7
[0.00] hash-mmu: base_shift=12: shift=24, sllp=0x,
Re: [PATCH] powerpc/mm: Don't report PUDs as memory leaks when using kmemleak
Dear Michael,

On 07/30/18 08:43, Michael Ellerman wrote:
> Paul Menzel writes:
>> On 19.07.2018 at 16:33, Michael Ellerman wrote:
> ...
>>>
>>> The fix is fairly simple. We need to tell kmemleak to ignore PUD
>>> allocations and never report them as leaks. We can also tell it not to
>>> scan the PGD, because it will never find pointers in there. However it
>>> will still notice if we allocate a PGD and then leak it.
>>>
>>> Reported-by: Paul Menzel
>>> Signed-off-by: Michael Ellerman
>>> ---
>>> arch/powerpc/include/asm/book3s/64/pgalloc.h | 23 +++++--
>>> 1 file changed, 21 insertions(+), 2 deletions(-)
>>
>> […]
>>
>> Tested-by: Paul Menzel on IBM S822LC
>
> Thanks.

No problem. I forgot to add that it would be great if you tagged this for the stable series too.

Cc: sta...@vger.kernel.org

Kind regards,

Paul
Re: [PATCH] powerpc/mm: Don't report PUDs as memory leaks when using kmemleak
Dear Michael,

On 19.07.2018 at 16:33, Michael Ellerman wrote:
> Paul Menzel reported that kmemleak was producing reports such as:
>
>   unreferenced object 0xc000f8b8 (size 16384):
>     comm "init", pid 1, jiffies 4294937416 (age 312.240s)
>     hex dump (first 32 bytes):
>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>     backtrace:
>       [<d997deb7>] __pud_alloc+0x80/0x190
>       [<87f2e8a3>] move_page_tables+0xbac/0xdc0
>       [<091e51c2>] shift_arg_pages+0xc0/0x210
>       [<ab88670c>] setup_arg_pages+0x22c/0x2a0
>       [<60871529>] load_elf_binary+0x41c/0x1648
>       [<ecd9d2d4>] search_binary_handler.part.11+0xbc/0x280
>       [<34e0cdd7>] __do_execve_file.isra.13+0x73c/0x940
>       [<5f953a6e>] sys_execve+0x58/0x70
>       [<9700a858>] system_call+0x5c/0x70
>
> Indicating that a PUD was being leaked.
>
> However what's really happening is that kmemleak is not able to
> recognise the references from the PGD to the PUD, because they are not
> fully qualified pointers.
>
> We can confirm that in xmon, eg:
>
> Find the task struct for pid 1 "init":
>   0:mon> P
>   task_struct ->thread.ksp PID PPID S P CMD
>   c001fe7c c001fe803960 1 0 S 13 systemd
>
> Dump virtual address 0 to find the PGD:
>   0:mon> dv 0 c001fe7c
>   pgd @ 0xc000f8b01000
>
> Dump the memory of the PGD:
>   0:mon> d c000f8b01000
>   c000f8b01000 f8b9 ||
>   c000f8b01010 ||
>   c000f8b01020 ||
>   c000f8b01030 f8b8 ||
>
> There we can see the reference to our supposedly leaked PUD. But
> because it's missing the leading 0xc, kmemleak won't recognise it.
>
> We can confirm it's still in use by translating an address that is
> mapped via it:
>   0:mon> dv 7fff9400 c001fe7c
>   pgd @ 0xc000f8b01000
>   pgdp @ 0xc000f8b01038 = 0xf8b8 <--
>   pudp @ 0xc000f8b81ff8 = 0x037c4000
>   pmdp @ 0xc37c5ca0 = 0xfbd89000
>   ptep @ 0xc000fbd89000 = 0xc081d5ce0386
>   Maps physical address = 0x0001d5ce
>   Flags = Accessed Dirty Read Write
>
> The fix is fairly simple. We need to tell kmemleak to ignore PUD
> allocations and never report them as leaks. We can also tell it not to
> scan the PGD, because it will never find pointers in there. However it
> will still notice if we allocate a PGD and then leak it.
>
> Reported-by: Paul Menzel
> Signed-off-by: Michael Ellerman
> ---
> arch/powerpc/include/asm/book3s/64/pgalloc.h | 23 +--
> 1 file changed, 21 insertions(+), 2 deletions(-)

[…]

Tested-by: Paul Menzel on IBM S822LC

Kind regards,

Paul
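The false positive described in the quoted patch comes from how a scanning leak detector decides that an object is still referenced: a scanned word only counts if its value falls inside a tracked allocation. The following is a deliberately simplified model of that idea, not kmemleak's actual implementation. The address is a hypothetical full-width value chosen for illustration (the addresses in this archive copy are truncated), and the name `find_referenced` is made up.

```python
def find_referenced(objects, words):
    """Return names of tracked objects that some scanned word points into.

    objects: mapping of name -> (start address, size in bytes)
    words:   iterable of integer values found while scanning memory
    """
    referenced = set()
    for word in words:
        for name, (start, size) in objects.items():
            if start <= word < start + size:
                referenced.add(name)
    return referenced


# HYPOTHETICAL kernel virtual address of a 16 KiB PUD allocation.
PUD_VADDR = 0xC0000000F8B80000
objects = {"pud": (PUD_VADDR, 16384)}

# The PGD entry holds the same address without the leading 0xc nibble.
pgd_entry = PUD_VADDR & ~(0xC << 60)

print(find_referenced(objects, [pgd_entry]))  # set(): the PUD looks leaked
print(find_referenced(objects, [PUD_VADDR]))  # {'pud'}: a full pointer matches
```

In this model, scanning the PGD can never produce a match for the PUD, which is why the patch takes the approach of exempting PUDs from reporting and skipping the PGD during scans.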
Re: arch/powerpc/kernel/head_32.S:1106: Error: missing operand
Dear Christophe,

On 26.05.2018 at 18:02, christophe leroy wrote:
> On 26/05/2018 at 06:35, Paul Menzel wrote:
>> Building the configuration created with `make tinyconfig` on the
>> Power 8 system IBM S822LC with Ubuntu 18.04 fails with the error below.
>>
>> ```
>> $ git describe --dirty
>> v4.17-rc6-296-gbc2dbc5420e8
>> $ git log --oneline -1
>> bc2dbc5420e8 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'akpm' (patches from Andrew)
>> $ make tinyconfig
>> $ make -j120
>> […]
>> AS arch/powerpc/kernel/head_32.o
>> arch/powerpc/kernel/head_32.S: Assembler messages:
>> arch/powerpc/kernel/head_32.S:1106: Error: missing operand
>> ```
>
> There was a similar problem in 2015, see
> http://linux-kernel.2935.n7.nabble.com/Missing-operand-for-tlbie-instruction-on-Power7-td1206917.html
>
> Which version of binutils do you use?

2.30-15ubuntu1 is installed.

Kind regards,

Paul
arch/powerpc/kernel/head_32.S:1106: Error: missing operand
Dear Linux folks,

Building the configuration created with `make tinyconfig` on the Power 8 system IBM S822LC with Ubuntu 18.04 fails with the error below.

```
$ git describe --dirty
v4.17-rc6-296-gbc2dbc5420e8
$ git log --oneline -1
bc2dbc5420e8 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'akpm' (patches from Andrew)
$ make tinyconfig
$ make -j120
[…]
AS arch/powerpc/kernel/head_32.o
arch/powerpc/kernel/head_32.S: Assembler messages:
arch/powerpc/kernel/head_32.S:1106: Error: missing operand
scripts/Makefile.build:413: recipe for target 'arch/powerpc/kernel/head_32.o' failed
make[1]: *** [arch/powerpc/kernel/head_32.o] Error 1
[…]
```

Is this expected? *ppc64_defconfig* and *ppc64e_defconfig* build fine.

Kind regards,

Paul