Re: [PATCH v3 3/4] dt-bindings: interrupt-controller: fsl, ls-extirq: convert to YAML
On 27/04/2022 22:08, Leo Li wrote: >> Convert the fsl,ls-extirq binding to the new YAML format. >> >> In contrast to the original binding documentation, there are three >> compatibles which are used in their corresponding device trees which have a >> specific compatible and the (already documented) fallback >> compatible: >> - "fsl,ls1046a-extirq", "fsl,ls1043a-extirq" >> - "fsl,ls2080a-extirq", "fsl,ls1088a-extirq" >> - "fsl,lx2160a-extirq", "fsl,ls1088a-extirq" >> >> Depending on the number of the number of the external IRQs which is >> usually 12 except for the LS1021A where there are only 6, the interrupt-map- >> mask was reduced from 0x to 0xf and 0x7 respectively and the number >> of interrupt-map entries have to match. > > I assume this change won't prevent driver to be compatible with older device > trees using the 0x? The original 0x should work for both > 6/12 interrupts or whatever reasonable number of interrupts that maybe used > in future SoCs. So the purpose of this change is to make the binding more > specific to catch more errors in device tree? Yes. Best regards, Krzysztof
Re: serial hang in qemu-system-ppc64 -M pseries
On 4/28/22 00:41, Rob Landley wrote: > On 4/27/22 10:27, Thomas Huth wrote: >> On 26/04/2022 12.26, Rob Landley wrote: >>> When I cut and paste 80-ish characters of text into the Linux serial >>> console, it >>> reads 16 characters and stops. When I hit space, it reads another 16 >>> characters, >>> and if I keep at it will eventually catch up without losing data. If I type, >>> every character shows up immediately. >> >> That "16" certainly comes from VTERM_BUFSIZE in hw/char/spapr_vty.c in the >> QEMU sources, I think. >> >>> (On other qemu targets and kernels I can cut and paste an entire uuencoded >>> binary and it goes through just fine in one go, but this target hangs with >>> big >>> pastes until I hit keys.) >>> >>> Is this a qemu-side bug, or a kernel-side bug? >>> >>> Kernel config attached (linux 5.18-rc3 or thereabouts), qemu invocation is: >>> >>> qemu-system-ppc64 -M pseries -vga none -nographic -no-reboot -m 256 -kernel >>> vmlinux -initrd powerpc64leroot.cpio.gz -append "panic=1 HOST=powerpc64le >>> console=hvc0" >> >> Which version of QEMU are you using? > > $ qemu-system-ppc64 --version > QEMU emulator version 6.2.92 (v6.2.0-rc2) > Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers Just confirmed it behaves the same with current git (commit cf6f26d6f9b2). Rob
[PATCH] powerpc/perf: Add support for caps under sysfs in powerpc
Add caps support under "/sys/bus/event_source/devices//" for powerpc. This directory can be used to expose some of the specific features that powerpc PMU supports to the user. Example: pmu_name. The name of PMU registered will depend on platform, say power9 or power10 or it could be Generic Compat PMU. Currently the only way to know which is the registered PMU is from the dmesg logs. But clearing the dmesg will make it difficult to know exact PMU backend used. And even extracting from dmesg will be complicated, as we need to parse the dmesg logs and add filters for pmu name. Whereas by exposing it via caps will make it easy as we just need to directly read it from the sysfs. Add a caps directory to /sys/bus/event_source/devices/cpu/ for power8, power9, power10 and generic compat PMU. The information exposed currently: - pmu_name : Underlying PMU name from the driver Example result with power9 pmu: # ls /sys/bus/event_source/devices/cpu/caps pmu_name # cat /sys/bus/event_source/devices/cpu/caps/pmu_name power9 Signed-off-by: Athira Rajeev Reviewed-by: Madhavan Srinivasan --- arch/powerpc/perf/generic-compat-pmu.c | 20 arch/powerpc/perf/power10-pmu.c| 20 arch/powerpc/perf/power8-pmu.c | 20 arch/powerpc/perf/power9-pmu.c | 20 4 files changed, 80 insertions(+) diff --git a/arch/powerpc/perf/generic-compat-pmu.c b/arch/powerpc/perf/generic-compat-pmu.c index f3db88aee4dd..7b5fe2d89007 100644 --- a/arch/powerpc/perf/generic-compat-pmu.c +++ b/arch/powerpc/perf/generic-compat-pmu.c @@ -151,9 +151,29 @@ static const struct attribute_group generic_compat_pmu_format_group = { .attrs = generic_compat_pmu_format_attr, }; +static ssize_t pmu_name_show(struct device *cdev, + struct device_attribute *attr, + char *buf) +{ + return snprintf(buf, PAGE_SIZE, "generic_compat_pmu"); +} + +static DEVICE_ATTR_RO(pmu_name); + +static struct attribute *generic_compat_pmu_caps_attrs[] = { + &dev_attr_pmu_name.attr, + NULL +}; + +static struct attribute_group generic_compat_pmu_caps_group = { + .name = "caps", + .attrs = generic_compat_pmu_caps_attrs, +}; + static const struct attribute_group *generic_compat_pmu_attr_groups[] = { &generic_compat_pmu_format_group, &generic_compat_pmu_events_group, + &generic_compat_pmu_caps_group, NULL, }; diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c index d3398100a60f..a622ff783719 100644 --- a/arch/powerpc/perf/power10-pmu.c +++ b/arch/powerpc/perf/power10-pmu.c @@ -258,6 +258,25 @@ static const struct attribute_group power10_pmu_format_group = { .attrs = power10_pmu_format_attr, }; +static ssize_t pmu_name_show(struct device *cdev, + struct device_attribute *attr, + char *buf) +{ + return snprintf(buf, PAGE_SIZE, "power10"); +} + +static DEVICE_ATTR_RO(pmu_name); + +static struct attribute *power10_pmu_caps_attrs[] = { + &dev_attr_pmu_name.attr, + NULL +}; + +static struct attribute_group power10_pmu_caps_group = { + .name = "caps", + .attrs = power10_pmu_caps_attrs, +}; + static const struct attribute_group *power10_pmu_attr_groups_dd1[] = { &power10_pmu_format_group, &power10_pmu_events_group_dd1, @@ -267,6 +286,7 @@ static const struct attribute_group *power10_pmu_attr_groups_dd1[] = { static const struct attribute_group *power10_pmu_attr_groups[] = { &power10_pmu_format_group, &power10_pmu_events_group, + &power10_pmu_caps_group, NULL, }; diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c index e37b1e714d2b..fbcd2bfa05ba 100644 --- a/arch/powerpc/perf/power8-pmu.c +++ b/arch/powerpc/perf/power8-pmu.c @@ -187,9 +187,29 @@ static const struct attribute_group power8_pmu_events_group = { .attrs = power8_events_attr, }; +static ssize_t pmu_name_show(struct device *cdev, + struct device_attribute *attr, + char *buf) +{ + return snprintf(buf, PAGE_SIZE, "power8"); +} + +static DEVICE_ATTR_RO(pmu_name); + +static struct attribute *power8_pmu_caps_attrs[] = { + &dev_attr_pmu_name.attr, + NULL +}; + +static struct attribute_group power8_pmu_caps_group = { + .name = "caps", + .attrs = power8_pmu_caps_attrs, +}; + static const struct attribute_group *power8_pmu_attr_groups[] = { &isa207_pmu_format_group, &power8_pmu_events_group, + &power8_pmu_caps_group, NULL, }; diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c index c9eb5232e68b..11f290b6287f 100644 --- a/arch/powerpc/perf/power9-pmu.c +++ b/arch/powerpc/perf/power9-pmu.c @@ -258,9 +258,29 @@ static const struct attribute_group power9_pmu_format_group = { .attrs = power9_pmu_format_attr, }; +static ssi
Re: serial hang in qemu-system-ppc64 -M pseries
On 4/27/22 10:27, Thomas Huth wrote: > On 26/04/2022 12.26, Rob Landley wrote: >> When I cut and paste 80-ish characters of text into the Linux serial >> console, it >> reads 16 characters and stops. When I hit space, it reads another 16 >> characters, >> and if I keep at it will eventually catch up without losing data. If I type, >> every character shows up immediately. > > That "16" certainly comes from VTERM_BUFSIZE in hw/char/spapr_vty.c in the > QEMU sources, I think. > >> (On other qemu targets and kernels I can cut and paste an entire uuencoded >> binary and it goes through just fine in one go, but this target hangs with >> big >> pastes until I hit keys.) >> >> Is this a qemu-side bug, or a kernel-side bug? >> >> Kernel config attached (linux 5.18-rc3 or thereabouts), qemu invocation is: >> >> qemu-system-ppc64 -M pseries -vga none -nographic -no-reboot -m 256 -kernel >> vmlinux -initrd powerpc64leroot.cpio.gz -append "panic=1 HOST=powerpc64le >> console=hvc0" > > Which version of QEMU are you using? $ qemu-system-ppc64 --version QEMU emulator version 6.2.92 (v6.2.0-rc2) Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers >From november. I can pull and rebuild but it'll take a bit. (Hopefully rebuilding would fix the need to echo -e '\e[?7h' afterwards to undo the "bash command line history marches up the screen because qemu's x86 bios disabled line wrap and then left it that way" issue...) > Which frontend (GTK or terminal?) ... The above command line has -nographic, forcing terminal. Running ldd on the binary doesn't pull up anything gtk. (It pulls up libncursesw though.) If you want to reproduce my test locally: wget https://landley.net/toybox/downloads/binaries/mkroot/0.8.5/powerpc64le.tgz tar xvzf powerpc64le.tgz cd powerpc64le ./qemu-*.sh Then paste something longer than 16 characters at the eventual command prompt once the kernel finishes booting. If you want to reproduce it all from source: git clone https://github.com/landley/toybox cd toybox && mkdir ccc && cd ccc wget https://landley.net/toybox/downloads/binaries/toolchains/latest/powerpc64le-linux-musl-cross.tar.xz -O - | tar xv cd .. CROSS=powerpc64le LINUX=~/linux scripts/mkroot.sh cd root/powerpc64le ./qemu-*.sh This assumes your linux kernel source directory is in ~/linux of course, and that qemu-system-ppc64 is in the $PATH... > this rings a distant bell, but I thought we had fixed these issues long ago > in the past... e.g.: > > https://yhbt.net/lore/all/1380113886-16845-10-git-send-email-mdr...@linux.vnet.ibm.com/ > > https://git.qemu.org/?p=qemu.git;a=commitdiff;h=8a273cbe53221d28 The qemu I'm running is newer than 2016. :) Most targets are fine with this: I cut and paste entire uuencoded binaries into the serial console as an easy way to insert a file into an initramfs. It can usually take multiple megabytes without dropping a character, so you just "uudecode" enter, and then paste. Even my 32 bit powerpc target works fine with this. (Although -M g3beige is a very different machine from -M pseries...) Alas this target (and sh4 -m r2d) stop at 16 chars. (On sh4 the extra is discarded, not delivered progressively as more interrupts are generated.) > ... but maybe my memory also just fails and this has never been properly > fixed. > > Thomas Rob
Re: [PATCH] powerpc/time: Always set decrementer in timer_interrupt()
Excerpts from Nicholas Piggin's message of April 21, 2022 12:07 pm: > Excerpts from Michal Suchánek's message of April 21, 2022 12:28 am: >> Hello, >> >> On Thu, Apr 21, 2022 at 12:16:57AM +1000, Michael Ellerman wrote: >>> This is a partial revert of commit 0faf20a1ad16 ("powerpc/64s/interrupt: >>> Don't enable MSR[EE] in irq handlers unless perf is in use"). >>> >>> Prior to that commit, we always set the decrementer in >>> timer_interrupt(), to clear the timer interrupt. Otherwise we could end >>> up continuously taking timer interrupts. >>> >>> When high res timers are enabled there is no problem seen with leaving >>> the decrementer untouched in timer_interrupt(), because it will be >>> programmed via hrtimer_interrupt() -> tick_program_event() -> >>> clockevents_program_event() -> decrementer_set_next_event(). >>> >>> However with CONFIG_HIGH_RES_TIMERS=n or booting with highres=off, we >> >> How difficult is it to detect this condition? >> >> Maybe detecting this could be just added? > > Possibly not too difficult but I'd like to see if we can get this to work > in core timer code - > > https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-April/242212.html > > I'll resend as a patch and see what flamage I get... tglx merged it into his tree, so we could try again after its upstream. https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=timers/core&id=62c1256d544747b38e77ca9b5bfe3a26f9592576 I'm kind of worried the patch will explode some strange clock event device in an obscure way so we may wait for a release or two first. Thanks, Nick
Re: [PATCH] KVM: PPC: Book3S HV: Initialize AMOR in nested entry
Excerpts from Fabiano Rosas's message of April 26, 2022 12:21 am: > The hypervisor always sets AMOR to ~0, but let's ensure we're not > passing stale values around. > Reviewed-by: Nicholas Piggin Looks like our L0 doesn't do anything with hvregs.amor ? Thanks, Nick > Signed-off-by: Fabiano Rosas > --- > arch/powerpc/kvm/book3s_hv.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 6fa518f6501d..b5f504576765 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -3967,6 +3967,7 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu > *vcpu, u64 time_limit, uns > > kvmhv_save_hv_regs(vcpu, &hvregs); > hvregs.lpcr = lpcr; > + hvregs.amor = ~0; > vcpu->arch.regs.msr = vcpu->arch.shregs.msr; > hvregs.version = HV_GUEST_STATE_VERSION; > if (vcpu->arch.nested) { > -- > 2.35.1 > >
Re: [PATCH 20/30] panic: Add the panic informational notifier list
On Wed, Apr 27, 2022 at 07:49:14PM -0300, Guilherme G. Piccoli wrote: > The goal of this new panic notifier is to allow its users to > register callbacks to run earlier in the panic path than they > currently do. This aims at informational mechanisms, like dumping > kernel offsets and showing device error data (in case it's simple > registers reading, for example) as well as mechanisms to disable > log flooding (like hung_task detector / RCU warnings) and the > tracing dump_on_oops (when enabled). > > Any (non-invasive) information that should be provided before > kmsg_dump() as well as log flooding preventing code should fit > here, as long it offers relatively low risk for kdump. > > For now, the patch is almost a no-op, although it changes a bit > the ordering in which some panic notifiers are executed - specially > affected by this are the notifiers responsible for disabling the > hung_task detector / RCU warnings, which now run first. In a > subsequent patch, the panic path will be refactored, then the > panic informational notifiers will effectively run earlier, > before ksmg_dump() (and usually before kdump as well). > > We also defer documenting it all properly in the subsequent > refactor patch. Finally, while at it, we removed some useless > header inclusions too. > > Cc: Benjamin Herrenschmidt > Cc: Catalin Marinas > Cc: Florian Fainelli > Cc: Frederic Weisbecker > Cc: "H. Peter Anvin" > Cc: Hari Bathini > Cc: Joel Fernandes > Cc: Jonathan Hunter > Cc: Josh Triplett > Cc: Lai Jiangshan > Cc: Leo Yan > Cc: Mathieu Desnoyers > Cc: Mathieu Poirier > Cc: Michael Ellerman > Cc: Mike Leach > Cc: Mikko Perttunen > Cc: Neeraj Upadhyay > Cc: Nicholas Piggin > Cc: Paul Mackerras > Cc: Suzuki K Poulose > Cc: Thierry Reding > Cc: Thomas Bogendoerfer > Signed-off-by: Guilherme G. Piccoli >From an RCU perspective: Acked-by: Paul E. McKenney > --- > arch/arm64/kernel/setup.c | 2 +- > arch/mips/kernel/relocate.c | 2 +- > arch/powerpc/kernel/setup-common.c| 2 +- > arch/x86/kernel/setup.c | 2 +- > drivers/bus/brcmstb_gisb.c| 2 +- > drivers/hwtracing/coresight/coresight-cpu-debug.c | 4 ++-- > drivers/soc/tegra/ari-tegra186.c | 3 ++- > include/linux/panic_notifier.h| 1 + > kernel/hung_task.c| 3 ++- > kernel/panic.c| 4 > kernel/rcu/tree.c | 1 - > kernel/rcu/tree_stall.h | 3 ++- > kernel/trace/trace.c | 2 +- > 13 files changed, 19 insertions(+), 12 deletions(-) > > diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c > index 3505789cf4bd..ac2c7e8c9c6a 100644 > --- a/arch/arm64/kernel/setup.c > +++ b/arch/arm64/kernel/setup.c > @@ -444,7 +444,7 @@ static struct notifier_block arm64_panic_block = { > > static int __init register_arm64_panic_block(void) > { > - atomic_notifier_chain_register(&panic_notifier_list, > + atomic_notifier_chain_register(&panic_info_list, > &arm64_panic_block); > return 0; > } > diff --git a/arch/mips/kernel/relocate.c b/arch/mips/kernel/relocate.c > index 56b51de2dc51..650811f2436a 100644 > --- a/arch/mips/kernel/relocate.c > +++ b/arch/mips/kernel/relocate.c > @@ -459,7 +459,7 @@ static struct notifier_block kernel_location_notifier = { > > static int __init register_kernel_offset_dumper(void) > { > - atomic_notifier_chain_register(&panic_notifier_list, > + atomic_notifier_chain_register(&panic_info_list, > &kernel_location_notifier); > return 0; > } > diff --git a/arch/powerpc/kernel/setup-common.c > b/arch/powerpc/kernel/setup-common.c > index 1468c3937bf4..d04b8bf8dbc7 100644 > --- a/arch/powerpc/kernel/setup-common.c > +++ b/arch/powerpc/kernel/setup-common.c > @@ -757,7 +757,7 @@ void __init setup_panic(void) > &ppc_fadump_block); > > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0) > - atomic_notifier_chain_register(&panic_notifier_list, > + atomic_notifier_chain_register(&panic_info_list, > &kernel_offset_notifier); > > /* Low-level platform-specific routines that should run on panic */ > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > index c95b9ac5a457..599b25346964 100644 > --- a/arch/x86/kernel/setup.c > +++ b/arch/x86/kernel/setup.c > @@ -1266,7 +1266,7 @@ static struct notifier_block kernel_offset_notifier = { > > static int __init register_kernel_offset_dumper(void) > { > - atomic_notifier_chain_register(&panic_notifier_list, > + atomic_notifier_chain_register(&panic_info_list, > &kernel_offset_notifier); >
[PATCH 00/30] The panic notifiers refactor
Hey folks, this is an attempt to improve/refactor the dated panic notifiers infrastructure. This is strongly based in a suggestion made by Pter Mladek [0] some time ago, and it's finally ready. Below I'll detail the patch ordering, testing made, etc. First, a bit about the reason behind this. The panic notifiers list is an infrastructure that allows callbacks to execute during panic time. Happens that anybody can add functions there, no ordering is enforced (by default) and the decision to execute or not such notifiers before kdump may lead to high risk of failure in crash scenarios - default is not to execute any of them. There is a parameter acting as a switch for that. But some architectures require some notifiers, so..it's messy. The suggestion from Petr came after a patch submission to add a notifiers filter, allowing the notifiers selection by function name, which was welcomed by some people, but not by Petr, which claimed the code should indeed have a refactor - and it made a lot of sense, his suggestion makes code more clear and reliable. So, this series might be split in 3 portions: Part 1: the first 18 patches are mostly fixes (one or two might be considered improvements), mostly replacing spinlocks/mutexes with safer alternatives for atomic contexts, like spin_trylock, etc. We also focused on commenting everything that is possible and clean-up code. Part 2, the core: patches 19-25 are the main refactor, which splits the panic notifiers list in three, introduce the concept of panic notifier level and clean-up and highly comment the code, effectively leading to a more reliable and clear, yet highly customizable panic path. Part 3: The remaining 5 patches are fixes that _require the main refactor_ patches, they don't make sense without the core changes - but again, these are small fixes and not part of the main goal of refactoring the panic code. I've tried my best to make the patches the more "bisectable" as possible, so they tend to be self-contained and easy to backport (specially patches from part 1). Notice that the series is *based on 5.18-rc4* - usually a refactor like this would be based on linux-next, but since we have many fixes in the series, I kept it based on mainline tree. Of course I could change that in a subsequent iteration, if desired. Since this touches multiple architectures and drivers, it's very difficult to test it really (by executing all touched code). So, my tests split in two approaches: build tests and real tests, that involves panic triggering with and without kdump, changing panic notifiers level, etc. Build tests (using cross-compilers): alpha, arm, arm64, mips (sgi 22 and 32), parisc, s390, sparc, um, x86_64 (couldn't get a functional xtensa cross compiler). Real/full tests: x86_64 (Hyper-V and QEMU guests) + PowerPC (pseries guest). Here is the link with the .config files used: https://people.igalia.com/gpiccoli/panic_notifiers_configs/ (tried my best to build all the affected code). Finally, a bit about my CCing strategy: I've included everybody present in the original thread [0] plus some maintainers and other interested parties as CC in the full series. But the patches have individual CC lists, for people that are definitely related to them but might not care much for the whole series; nevertheless, _everybody_ mentioned at least once in some patch is CCed in this cover-letter. Hopefully I didn't forget to include anybody - all the mailing lists were CCed in the whole series. Apologies in advance if (a) you received emails you didn't want to or, (b) I forgot to include you but it was something considered interesting by you. Thanks in advance for reviews / comments / suggestions! Cheers, Guilherme [0] https://lore.kernel.org/lkml/YfPxvzSzDLjO5ldp@alley/ Guilherme G. Piccoli (30): x86/crash,reboot: Avoid re-disabling VMX in all CPUs on crash/restart ARM: kexec: Disable IRQs/FIQs also on crash CPUs shutdown path notifier: Add panic notifiers info and purge trailing whitespaces firmware: google: Convert regular spinlock into trylock on panic path misc/pvpanic: Convert regular spinlock into trylock on panic path soc: bcm: brcmstb: Document panic notifier action and remove useless header mips: ip22: Reword PANICED to PANICKED and remove useless header powerpc/setup: Refactor/untangle panic notifiers coresight: cpu-debug: Replace mutex with mutex_trylock on panic notifier alpha: Clean-up the panic notifier code um: Improve panic notifiers consistency and ordering parisc: Replace regular spinlock with spin_trylock on panic path s390/consoles: Improve panic notifiers reliability panic: Properly identify the panic event to the notifiers' callbacks bus: brcmstb_gisb: Clean-up panic/die notifiers drivers/hv/vmbus, video/hyperv_fb: Untangle and refactor Hyper-V panic notifiers tracing: Improve panic/die notifiers notifier: Show function names on notifier routines if DEBUG_NOTIFIERS is set panic: Add the panic hypervisor
[PATCH 03/30] notifier: Add panic notifiers info and purge trailing whitespaces
Although many notifiers are mentioned in the comments, the panic notifiers infrastructure is not. Also, the file contains some trailing whitespaces. This commit fix both issues. Cc: Arjan van de Ven Cc: Cong Wang Cc: Sebastian Andrzej Siewior Cc: Valentin Schneider Cc: Xiaoming Ni Signed-off-by: Guilherme G. Piccoli --- include/linux/notifier.h | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/include/linux/notifier.h b/include/linux/notifier.h index 87069b8459af..0589896fc7bd 100644 --- a/include/linux/notifier.h +++ b/include/linux/notifier.h @@ -201,12 +201,12 @@ static inline int notifier_to_errno(int ret) /* * Declared notifiers so far. I can imagine quite a few more chains - * over time (eg laptop power reset chains, reboot chain (to clean + * over time (eg laptop power reset chains, reboot chain (to clean * device units up), device [un]mount chain, module load/unload chain, - * low memory chain, screenblank chain (for plug in modular screenblankers) + * low memory chain, screenblank chain (for plug in modular screenblankers) * VC switch chains (for loadable kernel svgalib VC switch helpers) etc... */ - + /* CPU notfiers are defined in include/linux/cpu.h. */ /* netdevice notifiers are defined in include/linux/netdevice.h */ @@ -217,6 +217,8 @@ static inline int notifier_to_errno(int ret) /* Virtual Terminal events are defined in include/linux/vt.h. */ +/* Panic notifiers are defined in include/linux/panic_notifier.h. */ + #define NETLINK_URELEASE 0x0001 /* Unicast netlink socket released */ /* Console keyboard events. -- 2.36.0
[PATCH 06/30] soc: bcm: brcmstb: Document panic notifier action and remove useless header
The panic notifier of this driver is very simple code-wise, just a memory write to a special position with some numeric code. But this is not clear from the semantic point-of-view, and there is no public documentation about that either. After discussing this in the mailing-lists [0] and having Florian explained it very well, this patch just document that in the code for the future generations asking the same questions. Also, it removes a useless header. [0] https://lore.kernel.org/lkml/781cafb0-8d06-8b56-907a-5175c2da1...@gmail.com Fixes: 0b741b8234c8 ("soc: bcm: brcmstb: Add support for S2/S3/S5 suspend states (ARM)") Cc: Brian Norris Cc: Doug Berger Cc: Florian Fainelli Cc: Justin Chen Cc: Lee Jones Cc: Markus Mayer Signed-off-by: Guilherme G. Piccoli --- drivers/soc/bcm/brcmstb/pm/pm-arm.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/soc/bcm/brcmstb/pm/pm-arm.c b/drivers/soc/bcm/brcmstb/pm/pm-arm.c index 3cbb165d6e30..870686ae042b 100644 --- a/drivers/soc/bcm/brcmstb/pm/pm-arm.c +++ b/drivers/soc/bcm/brcmstb/pm/pm-arm.c @@ -25,7 +25,6 @@ #include #include #include -#include #include #include #include @@ -664,7 +663,20 @@ static void __iomem *brcmstb_ioremap_match(const struct of_device_id *matches, return of_io_request_and_map(dn, index, dn->full_name); } - +/* + * The AON is a small domain in the SoC that can retain its state across + * various system wide sleep states and specific reset conditions; the + * AON DATA RAM is a small RAM of a few words (< 1KB) which can store + * persistent information across such events. + * + * The purpose of the below panic notifier is to help with notifying + * the bootloader that a panic occurred and so that it should try its + * best to preserve the DRAM contents holding that buffer for recovery + * by the kernel as opposed to wiping out DRAM clean again. + * + * Reference: comment from Florian Fainelli, at + * https://lore.kernel.org/lkml/781cafb0-8d06-8b56-907a-5175c2da1...@gmail.com + */ static int brcmstb_pm_panic_notify(struct notifier_block *nb, unsigned long action, void *data) { -- 2.36.0
[PATCH 01/30] x86/crash, reboot: Avoid re-disabling VMX in all CPUs on crash/restart
In the panic path we have a list of functions to be called, the panic notifiers - such callbacks perform various actions in the machine's last breath, and sometimes users want them to run before kdump. We have the parameter "crash_kexec_post_notifiers" for that. When such parameter is used, the function "crash_smp_send_stop()" is executed to poweroff all secondary CPUs through the NMI-shootdown mechanism; part of this process involves disabling virtualization features in all CPUs (except the main one). Now, in the emergency restart procedure we have also a way of disabling VMX in all CPUs, using the same NMI-shootdown mechanism; what happens though is that in case we already NMI-disabled all CPUs, the emergency restart fails due to a second addition of the same items in the NMI list, as per the following log output: sysrq: Trigger a crash Kernel panic - not syncing: sysrq triggered crash [...] Rebooting in 2 seconds.. list_add double add: new=, prev=, next=. [ cut here ] kernel BUG at lib/list_debug.c:29! invalid opcode: [#1] PREEMPT SMP PTI In order to reproduce the problem, users just need to set the kernel parameter "crash_kexec_post_notifiers" *without* kdump set in any system with the VMX feature present. Since there is no benefit in re-disabling VMX in all CPUs in case it was already done, this patch prevents that by guarding the restart routine against doubly issuing NMIs unnecessarily. Notice we still need to disable VMX locally in the emergency restart. Fixes: ed72736183c4 ("x86/reboot: Force all cpus to exit VMX root if VMX is supported) Fixes: 0ee59413c967 ("x86/panic: replace smp_send_stop() with kdump friendly version in panic path") Cc: David P. Reed Cc: Hidehiro Kawai Cc: Paolo Bonzini Cc: Sean Christopherson Signed-off-by: Guilherme G. Piccoli --- arch/x86/include/asm/cpu.h | 1 + arch/x86/kernel/crash.c| 8 arch/x86/kernel/reboot.c | 14 -- 3 files changed, 17 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h index 86e5e4e26fcb..b6a9062d387f 100644 --- a/arch/x86/include/asm/cpu.h +++ b/arch/x86/include/asm/cpu.h @@ -36,6 +36,7 @@ extern int _debug_hotplug_cpu(int cpu, int action); #endif #endif +extern bool crash_cpus_stopped; int mwait_usable(const struct cpuinfo_x86 *); unsigned int x86_family(unsigned int sig); diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index e8326a8d1c5d..71dd1a990e8d 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -42,6 +42,8 @@ #include #include +bool crash_cpus_stopped; + /* Used while preparing memory map entries for second kernel */ struct crash_memmap_data { struct boot_params *params; @@ -108,9 +110,7 @@ void kdump_nmi_shootdown_cpus(void) /* Override the weak function in kernel/panic.c */ void crash_smp_send_stop(void) { - static int cpus_stopped; - - if (cpus_stopped) + if (crash_cpus_stopped) return; if (smp_ops.crash_stop_other_cpus) @@ -118,7 +118,7 @@ void crash_smp_send_stop(void) else smp_send_stop(); - cpus_stopped = 1; + crash_cpus_stopped = true; } #else diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index fa700b46588e..2fc42b8402ac 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -589,8 +589,18 @@ static void native_machine_emergency_restart(void) int orig_reboot_type = reboot_type; unsigned short mode; - if (reboot_emergency) - emergency_vmx_disable_all(); + /* +* We can reach this point in the end of panic path, having +* NMI-disabled all secondary CPUs. This process involves +* disabling the CPU virtualization technologies, so if that +* is the case, we only miss disabling the local CPU VMX... +*/ + if (reboot_emergency) { + if (!crash_cpus_stopped) + emergency_vmx_disable_all(); + else + cpu_emergency_vmxoff(); + } tboot_shutdown(TB_SHUTDOWN_REBOOT); -- 2.36.0
[PATCH 04/30] firmware: google: Convert regular spinlock into trylock on panic path
Currently the gsmi driver registers a panic notifier as well as reboot and die notifiers. The callbacks registered are called in atomic and very limited context - for instance, panic disables preemption, local IRQs and all other CPUs that aren't running the current panic function. With that said, taking a spinlock in this scenario is a dangerous invitation for a deadlock scenario. So, we fix that in this commit by changing the regular spinlock with a trylock, which is a safer approach. Fixes: 74c5b31c6618 ("driver: Google EFI SMI") Cc: Ard Biesheuvel Cc: David Gow Cc: Evan Green Cc: Julius Werner Signed-off-by: Guilherme G. Piccoli --- drivers/firmware/google/gsmi.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/firmware/google/gsmi.c b/drivers/firmware/google/gsmi.c index adaa492c3d2d..b01ed02e4a87 100644 --- a/drivers/firmware/google/gsmi.c +++ b/drivers/firmware/google/gsmi.c @@ -629,7 +629,10 @@ static int gsmi_shutdown_reason(int reason) if (saved_reason & (1 << reason)) return 0; - spin_lock_irqsave(&gsmi_dev.lock, flags); + if (!spin_trylock_irqsave(&gsmi_dev.lock, flags)) { + rc = -EBUSY; + goto out; + } saved_reason |= (1 << reason); @@ -646,6 +649,7 @@ static int gsmi_shutdown_reason(int reason) spin_unlock_irqrestore(&gsmi_dev.lock, flags); +out: if (rc < 0) printk(KERN_ERR "gsmi: Log Shutdown Reason failed\n"); else -- 2.36.0
[PATCH 02/30] ARM: kexec: Disable IRQs/FIQs also on crash CPUs shutdown path
Currently the regular CPU shutdown path for ARM disables IRQs/FIQs in the secondary CPUs - smp_send_stop() calls ipi_cpu_stop(), which is responsible for that. This makes sense, since we're turning off such CPUs, putting them in an endless busy-wait loop. Problem is that there is an alternative path for disabling CPUs, in the form of function crash_smp_send_stop(), used for kexec/panic paths. This functions relies in a SMP call that also triggers a busy-wait loop [at machine_crash_nonpanic_core()], but *without* disabling interrupts. This might lead to odd scenarios, like early interrupts in the boot of kexec'd kernel or even interrupts in other CPUs while the main one still works in the panic path and assumes all secondary CPUs are (really!) off. This patch mimics the ipi_cpu_stop() interrupt disable mechanism in the crash CPU shutdown path, hence disabling IRQs/FIQs in all secondary CPUs in the kexec/panic path as well. Cc: Marc Zyngier Cc: Russell King Signed-off-by: Guilherme G. Piccoli --- arch/arm/kernel/machine_kexec.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c index f567032a09c0..ef788ee00519 100644 --- a/arch/arm/kernel/machine_kexec.c +++ b/arch/arm/kernel/machine_kexec.c @@ -86,6 +86,9 @@ void machine_crash_nonpanic_core(void *unused) set_cpu_online(smp_processor_id(), false); atomic_dec(&waiting_for_crash_ipi); + local_fiq_disable(); + local_irq_disable(); + while (1) { cpu_relax(); wfe(); -- 2.36.0
[PATCH 05/30] misc/pvpanic: Convert regular spinlock into trylock on panic path
The pvpanic driver relies on panic notifiers to execute a callback on panic event. Such function is executed in atomic context - the panic function disables local IRQs, preemption and all other CPUs that aren't running the panic code. With that said, it's dangerous to use regular spinlocks in such path, as introduced by commit b3c0f8774668 ("misc/pvpanic: probe multiple instances"). This patch fixes that by replacing regular spinlocks with the trylock safer approach. It also fixes an old comment (about a long gone framebuffer code) and the notifier priority - we should execute hypervisor notifiers early, deferring this way the panic action to the hypervisor, as expected by the users that are setting up pvpanic. Fixes: b3c0f8774668 ("misc/pvpanic: probe multiple instances") Cc: Christophe JAILLET Cc: Mihai Carabas Cc: Shile Zhang Cc: Wang ShaoBo Cc: zhenwei pi Signed-off-by: Guilherme G. Piccoli --- drivers/misc/pvpanic/pvpanic.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/misc/pvpanic/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c index 4b8f1c7d726d..049a12006348 100644 --- a/drivers/misc/pvpanic/pvpanic.c +++ b/drivers/misc/pvpanic/pvpanic.c @@ -34,7 +34,9 @@ pvpanic_send_event(unsigned int event) { struct pvpanic_instance *pi_cur; - spin_lock(&pvpanic_lock); + if (!spin_trylock(&pvpanic_lock)) + return; + list_for_each_entry(pi_cur, &pvpanic_list, list) { if (event & pi_cur->capability & pi_cur->events) iowrite8(event, pi_cur->base); @@ -55,9 +57,13 @@ pvpanic_panic_notify(struct notifier_block *nb, unsigned long code, void *unused return NOTIFY_DONE; } +/* + * Call our notifier very early on panic, deferring the + * action taken to the hypervisor. + */ static struct notifier_block pvpanic_panic_nb = { .notifier_call = pvpanic_panic_notify, - .priority = 1, /* let this called before broken drm_fb_helper() */ + .priority = INT_MAX, }; static void pvpanic_remove(void *param) -- 2.36.0
[PATCH 30/30] um: Avoid duplicate call to kmsg_dump()
Currently the panic notifier panic_exit() calls kmsg_dump() and some console flushing routines - this makes sense since such panic notifier exits UserMode Linux and never returns. Happens that after a panic refactor, kmsg_dump() is now always called *before* the pre_reboot list of panic notifiers, in which panic_exit() belongs, leading to a double call situation. This patch changes that by removing such call from the panic notifier, but leaving the console flushing calls since the pre_reboot list still runs before console flushing on panic(). Cc: Anton Ivanov Cc: Johannes Berg Cc: Richard Weinberger Signed-off-by: Guilherme G. Piccoli --- arch/um/kernel/um_arch.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index fc6e443299da..651310e3e86f 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -241,7 +241,6 @@ static void __init uml_postsetup(void) static int panic_exit(struct notifier_block *self, unsigned long unused1, void *unused2) { - kmsg_dump(KMSG_DUMP_PANIC); bust_spinlocks(1); bust_spinlocks(0); uml_exitcode = 1; -- 2.36.0
[PATCH 29/30] powerpc: ps3, pseries: Avoid duplicate call to kmsg_dump() on panic
Currently both pseries and ps3 are platforms that define special panic notifiers that run as callbacks inside powerpc generic panic notifier. In both cases kmsg_dump() is called, and the reason seems to be that both of these callbacks aims to effectively stop the machine, so nothing would execute after that - hence, both force a series of console flushing related operations, after calling the kmsg dumpers. Happens that recently the panic path was refactored, and now kmsg_dump() is *certainly* called before the pre_reboot panic notifiers, category in which both pseries/ps3 callbacks belong. In other words: kmsg_dump() will execute twice in both platforms, on panic path. This patch prevents that by disabling the kmsg_dump() for both platform's notifiers. But worth to notice that PowerNV still has a legit use for executing kmsg_dump() in its unrecoverable error path, so we rely in parameter passing to differentiate both cases. Also, since the pre_reboot notifiers still run earlier than console flushing routines, we kept that for both pseries and ps3 platforms, only skipping kmsg_dump(). Fixes: 35adacd6fc48 ("powerpc/pseries, ps3: panic flush kernel messages before halting system") Cc: Benjamin Herrenschmidt Cc: Hari Bathini Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Paul Mackerras Signed-off-by: Guilherme G. Piccoli --- We'd like to thanks specially the MiniCloud infrastructure [0] maintainers, that allow us to test PowerPC code in a very complete, functional and FREE environment. [0] https://openpower.ic.unicamp.br/minicloud arch/powerpc/include/asm/bug.h | 2 +- arch/powerpc/kernel/traps.c| 6 -- arch/powerpc/platforms/powernv/opal.c | 2 +- arch/powerpc/platforms/ps3/setup.c | 2 +- arch/powerpc/platforms/pseries/setup.c | 2 +- 5 files changed, 8 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h index ecbae1832de3..49e5f6f86869 100644 --- a/arch/powerpc/include/asm/bug.h +++ b/arch/powerpc/include/asm/bug.h @@ -166,7 +166,7 @@ extern void die(const char *, struct pt_regs *, long); void die_mce(const char *str, struct pt_regs *regs, long err); extern bool die_will_crash(void); extern void panic_flush_kmsg_start(void); -extern void panic_flush_kmsg_end(void); +extern void panic_flush_kmsg_end(bool dump); #endif /* !__ASSEMBLY__ */ #endif /* __KERNEL__ */ diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index a08bb7cefdc5..837a5ed98d62 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -169,9 +169,11 @@ extern void panic_flush_kmsg_start(void) bust_spinlocks(1); } -extern void panic_flush_kmsg_end(void) +extern void panic_flush_kmsg_end(bool dump) { - kmsg_dump(KMSG_DUMP_PANIC); + if (dump) + kmsg_dump(KMSG_DUMP_PANIC); + bust_spinlocks(0); debug_locks_off(); console_flush_on_panic(CONSOLE_FLUSH_PENDING); diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c index 55a8fbfdb5b2..d172ceedece2 100644 --- a/arch/powerpc/platforms/powernv/opal.c +++ b/arch/powerpc/platforms/powernv/opal.c @@ -641,7 +641,7 @@ void __noreturn pnv_platform_error_reboot(struct pt_regs *regs, const char *msg) show_regs(regs); smp_send_stop(); - panic_flush_kmsg_end(); + panic_flush_kmsg_end(true); /* * Don't bother to shut things down because this will diff --git a/arch/powerpc/platforms/ps3/setup.c b/arch/powerpc/platforms/ps3/setup.c index 3de9145c20bc..7cb78e508fb3 100644 --- a/arch/powerpc/platforms/ps3/setup.c +++ b/arch/powerpc/platforms/ps3/setup.c @@ -102,7 +102,7 @@ static void ps3_panic(char *str) printk(" System does not reboot automatically.\n"); printk(" Please press POWER button.\n"); printk("\n"); - panic_flush_kmsg_end(); + panic_flush_kmsg_end(false); while(1) lv1_pause(1); diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 955ff8aa1644..d6eea473eafd 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -856,7 +856,7 @@ static void __init pSeries_setup_arch(void) static void pseries_panic(char *str) { - panic_flush_kmsg_end(); + panic_flush_kmsg_end(false); rtas_os_term(str); } -- 2.36.0
[PATCH 28/30] panic: Unexport crash_kexec_post_notifiers
There is no users anymore of this variable that requires it to be "exported" in the headers; also, it was deprecated by the kernel parameter "panic_notifiers_level". Signed-off-by: Guilherme G. Piccoli --- include/linux/panic.h | 2 -- include/linux/panic_notifier.h | 1 - 2 files changed, 3 deletions(-) diff --git a/include/linux/panic.h b/include/linux/panic.h index 34175d0188d0..d301db07a8af 100644 --- a/include/linux/panic.h +++ b/include/linux/panic.h @@ -34,8 +34,6 @@ extern int sysctl_panic_on_rcu_stall; extern int sysctl_max_rcu_stall_to_panic; extern int sysctl_panic_on_stackoverflow; -extern bool crash_kexec_post_notifiers; - /* * panic_cpu is used for synchronizing panic() and crash_kexec() execution. It * holds a CPU number which is executing panic() currently. A value of diff --git a/include/linux/panic_notifier.h b/include/linux/panic_notifier.h index b5041132321d..8fda7045e2f7 100644 --- a/include/linux/panic_notifier.h +++ b/include/linux/panic_notifier.h @@ -11,7 +11,6 @@ extern struct atomic_notifier_head panic_pre_reboot_list; extern struct atomic_notifier_head panic_post_reboot_list; bool panic_notifiers_before_kdump(void); -extern bool crash_kexec_post_notifiers; enum panic_notifier_val { PANIC_UNUSED, -- 2.36.0
[PATCH 27/30] powerpc: Do not force all panic notifiers to execute before kdump
Commit 06e629c25daa ("powerpc/fadump: Fix inaccurate CPU state info in vmcore generated with panic") introduced a hardcoded setting of kernel parameter "crash_kexec_post_notifiers", effectively forcing all the panic notifiers to execute earlier in the panic path, before kdump. The reason for that was a fadump issue on collecting data accurately, due to smp_send_stop() setting all CPUs offline, so the net effect desired with this change was to avoid calling the regular CPU shutdown function, and instead rely on crash_smp_send_stop(), which copes fine with fadump. The collateral effect was to increase the risk for kdump if fadump is not used, since it forces all panic notifiers to execute early, before kdump. Happens that, after a panic refactor, crash_smp_send_stop() is now used by default in the panic path, so there is no reason to mess with the notifiers ordering (which was also improved in the refactor) from within arch code. Fixes: 06e629c25daa ("powerpc/fadump: Fix inaccurate CPU state info in vmcore generated with panic") Cc: Benjamin Herrenschmidt Cc: Hari Bathini Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Paul Mackerras Signed-off-by: Guilherme G. Piccoli --- We'd like to thanks specially the MiniCloud infrastructure [0] maintainers, that allow us to test PowerPC code in a very complete, functional and FREE environment. [0] https://openpower.ic.unicamp.br/minicloud arch/powerpc/kernel/fadump.c | 8 1 file changed, 8 deletions(-) diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index 65562c4a0a69..35ae8c09af66 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -1649,14 +1649,6 @@ int __init setup_fadump(void) register_fadump(); } - /* -* In case of panic, fadump is triggered via ppc_panic_event() -* panic notifier. Setting crash_kexec_post_notifiers to 'true' -* lets panic() function take crash friendly path before panic -* notifiers are invoked. -*/ - crash_kexec_post_notifiers = true; - return 1; } /* -- 2.36.0
[PATCH 26/30] Drivers: hv: Do not force all panic notifiers to execute before kdump
Since commit a11589563e96 ("x86/Hyper-V: Report crash register data or kmsg before running crash kernel") Hyper-V forcibly sets the kernel parameter "crash_kexec_post_notifiers"; with that, it did enforce the execution of *all* panic notifiers before kdump. The main reason behind that is that Hyper-V has an hypervisor notification mechanism that has the ability of warning the hypervisor when the guest panics. Happens that after the panic notifiers refactor, we now have 3 lists and a level mechanism that defines the ordering of the notifiers execution with regards to kdump. And for Hyper-V, the specific notifier to inform the hypervisor about a panic lies in the first list, which *by default* is set to execute before kdump. Hence, this patch removes the hardcoded setting, effectively reverting the aforementioned commit. One of the problems with the forced approach was greatly exposed by commit d57d6fe5bf34 ("drivers: hv: log when enabling crash_kexec_post_notifiers") which ended-up confusing the user that didn't expect the notifiers to execute before kdump, since it's a user setting and wasn't enabled by such user. With the patch hereby proposed, that kind of issue doesn't happen anymore, the panic notifiers level is well-documented and users can expect a predictable behavior. Fixes: a11589563e96 ("x86/Hyper-V: Report crash register data or kmsg before running crash kernel") Fixes: d57d6fe5bf34 ("drivers: hv: log when enabling crash_kexec_post_notifiers" Cc: Andrea Parri (Microsoft) Cc: Dexuan Cui Cc: Haiyang Zhang Cc: "K. Y. Srinivasan" Cc: Michael Kelley Cc: Stephen Brennan Cc: Stephen Hemminger Cc: Tianyu Lan Cc: Wei Liu Tested-by: Fabio A M Martins Signed-off-by: Guilherme G. Piccoli --- Special thanks to Michael Kelley for the good information about the Hyper-V panic path in email threads some months ago, and to Fabio for the testing performed. drivers/hv/hv_common.c | 12 1 file changed, 12 deletions(-) diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c index ae68298c0dca..af59793de523 100644 --- a/drivers/hv/hv_common.c +++ b/drivers/hv/hv_common.c @@ -73,18 +73,6 @@ int __init hv_common_init(void) { int i; - /* -* Hyper-V expects to get crash register data or kmsg when -* crash enlightment is available and system crashes. Set -* crash_kexec_post_notifiers to be true to make sure that -* calling crash enlightment interface before running kdump -* kernel. -*/ - if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { - crash_kexec_post_notifiers = true; - pr_info("Hyper-V: enabling crash_kexec_post_notifiers\n"); - } - /* * Allocate the per-CPU state for the hypercall input arg. * If this allocation fails, we will not be able to setup -- 2.36.0
[PATCH 25/30] panic, printk: Add console flush parameter and convert panic_print to a notifier
Currently the parameter "panic_print" relies in a function called directly on panic path; one of the flags the users can set for panic_print triggers a console replay mechanism, to show the entire kernel log buffer (from the beginning) in a panic event. Two problems with that: the dual nature of the panic_print isn't really appropriate, the function was originally meant to allow users dumping system information on panic events, and was "overridden" to also force a console flush of the full kernel log buffer. It also turns the code a bit more complex and duplicate than it needs to be. This patch proposes 2 changes: first, we decouple panic_print from the console flushing mechanism, in the form of a new kernel core parameter (panic_console_replay); we kept the functionality on panic_print to avoid userspace breakage, although we comment in both code and documentation that this panic_print usage is deprecated. We converted panic_print function to a panic notifier too, adding it on the panic informational notifier list, executed as the final callback. This allows a more clear code and makes sense, as panic_print_sys_info() is really a panic-time only function. We also moved its code to kernel/printk.c, it seems to make more sense given it's related to printing stuff. Suggested-by: Petr Mladek Signed-off-by: Guilherme G. Piccoli --- .../admin-guide/kernel-parameters.txt | 12 +++- Documentation/admin-guide/sysctl/kernel.rst | 5 +- include/linux/console.h | 2 + include/linux/panic.h | 1 - include/linux/printk.h| 1 + kernel/panic.c| 51 +++ kernel/printk/printk.c| 62 +++ 7 files changed, 87 insertions(+), 47 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 8d3524060ce3..c99da8b2b216 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3791,6 +3791,14 @@ timeout < 0: reboot immediately Format: + panic_console_replay + [KNL] Force a kernel log replay in the console on + panic event. Notice that there is already a flush + mechanism for pending messages; this option is meant + for users that wish to replay the *full* buffer. + It deprecates the bit 5 setting on "panic_print", + both having the same functionality. + panic_notifiers_level= [KNL] Set the panic notifiers execution order. Format: @@ -3825,12 +3833,14 @@ bit 2: print timer info bit 3: print locks info if CONFIG_LOCKDEP is on bit 4: print ftrace buffer - bit 5: print all printk messages in buffer + bit 5: print all printk messages in buffer (DEPRECATED) bit 6: print all CPUs backtrace (if available in the arch) *Be aware* that this option may print a _lot_ of lines, so there are risks of losing older messages in the log. Use this option carefully, maybe worth to setup a bigger log buffer with "log_buf_len" along with this. + Also, notice that bit 5 was deprecated in favor of the + parameter "panic_console_replay". panic_on_taint= Bitmask for conditionally calling panic() in add_taint() Format: [,nousertaint] diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst index 1144ea3229a3..17b293a0e566 100644 --- a/Documentation/admin-guide/sysctl/kernel.rst +++ b/Documentation/admin-guide/sysctl/kernel.rst @@ -763,10 +763,13 @@ bit 1 print system memory info bit 2 print timer info bit 3 print locks info if ``CONFIG_LOCKDEP`` is on bit 4 print ftrace buffer -bit 5 print all printk messages in buffer +bit 5 print all printk messages in buffer (DEPRECATED) bit 6 print all CPUs backtrace (if available in the arch) = +Notice that bit 5 was deprecated in favor of kernel core parameter +"panic_console_replay" (see kernel-parameters.txt documentation). + So for example to print tasks and memory info on panic, user can:: echo 3 > /proc/sys/kernel/panic_print diff --git a/include/linux/console.h b/include/linux/console.h index 7cd758a4f44e..351c14f623ad 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -169,6 +169,8 @@ enum con_flush_mode { CONSOLE_REPLAY_ALL, }; +extern bool panic_console_replay; + extern int add_preferred_console(char *name, int idx,
[PATCH 24/30] panic: Refactor the panic path
The panic() function is somewhat convoluted - a lot of changes were made over the years, adding comments that might be misleading/outdated now, it has a code structure that is a bit complex to follow, with lots of conditionals, for example. The panic notifier list is something else - a single list, with multiple callbacks of different purposes, that run in a non-deterministic order and may affect hardly kdump reliability - see the "crash_kexec_post_notifiers" workaround-ish flag. This patch proposes a major refactor on the panic path based on Petr's idea [0] - basically we split the notifiers list in three, having a set of different call points in the panic path. Below a list of changes proposed in this patch, culminating in the panic notifiers level concept: (a) First of all, we improved comments all over the function and removed useless variables / includes. Also, as part of this clean-up we concentrate the console flushing functions in a helper. (b) As mentioned before, there is a split of the panic notifier list in three, based on the purpose of the callback. The code contains good documentation in form of comments, but a summary of the three lists follows: - the hypervisor list aims low-risk procedures to inform hypervisors or firmware about the panic event, also includes LED-related functions; - the informational list contains callbacks that provide more details, like kernel offset or trace dump (if enabled) and also includes the callbacks aimed at reducing log pollution or warns, like the RCU and hung task disable callbacks; - finally, the pre_reboot list is the old notifier list renamed, containing the more risky callbacks that didn't fit the previous lists. There is also a 4th list (the post_reboot one), but it's not related with the original list - it contains late time architecture callbacks aimed at stopping the machine, for example. The 3 notifiers lists execute in different moments, hypervisor being the first, followed by informational and finally the pre_reboot list. (c) But then, there is the ordering problem of the notifiers against the crash_kernel() call - kdump must be as reliable as possible. For that, a simple binary "switch" as "crash_kexec_post_notifiers" is not enough, hence we introduce here concept of panic notifier levels: there are 5 levels, from 0 (no notifier executes before kdump) until 4 (all notifiers run before kdump); the default level is 2, in which the hypervisor and (iff we have any kmsg dumper) the informational notifiers execute before kdump. The detailed documentation of the levels is present in code comments and in the kernel-parameters.txt file; as an analogy with the previous panic() implementation, the level 0 is exactly the same as the old behavior of notifiers, running all after kdump, and the level 4 is the same as "crash_kexec_post_notifiers=Y" (we kept this parameter as a deprecated one). (d) Finally, an important change made here: we now use only the function "crash_smp_send_stop()" to shut all the secondary CPUs in the panic path. Before, there was a case of using the regular "smp_send_stop()", but the better approach is to simplify the code and try to use the function which was created exclusively for the panic path. Experiments showed that it works fine, and code was very simplified with that. Functional change is expected from this refactor, since now we call some notifiers by default before kdump, but the goal here besides code clean-up is to have a better panic path, more reliable and deterministic, but also very customizable. [0] https://lore.kernel.org/lkml/YfPxvzSzDLjO5ldp@alley/ Suggested-by: Petr Mladek Signed-off-by: Guilherme G. Piccoli --- Special thanks to Petr and Baoquan for the suggestion and feedback in a previous email thread. There's some important design decisions that worth mentioning and discussing: * The default panic notifiers level is 2, based on Petr Mladek's suggestion, which makes a lot of sense. Of course, this is customizable through the parameter, but would be something worthwhile to have a KConfig option to set the default level? It would help distros that want the old behavior (no notifiers before kdump) as default. * The implementation choice was to _avoid_ intricate if conditionals in the panic path, which would _definitely_ be present with the panic notifiers levels idea; so, instead of lots of if conditionals, the set/clear bits approach with functions called in 2 points (but executing only in one of them) is much easier to follow an was used here; the ordering helper function and the comments also help a lot to avoid confusion (hopefully). * Choice was to *always* use crash_smp_send_stop() instead of sometimes making use of the regular smp_send_stop(); for most architectures they are the same, including Xen (on x86). For the ones that override it, all should work fine, in the powerpc case it's even more correct (see the subsequent patch "powerpc: Do not force all panic notifiers to execute before
[PATCH 23/30] printk: kmsg_dump: Introduce helper to inform number of dumpers
Currently we don't have a way to check if there are dumpers set, except counting the list members maybe. This patch introduces a very simple helper to provide this information, by just keeping track of registered/unregistered kmsg dumpers. It's going to be used on the panic path in the subsequent patch. Notice that the spinlock guarding kmsg_dumpers list also guards increment/decrement of the dumper's counter, but there's no need for that when reading the counter in the panic path, since that is an atomic path and there's no other (planned) user. Signed-off-by: Guilherme G. Piccoli --- include/linux/kmsg_dump.h | 7 +++ kernel/printk/printk.c| 14 ++ 2 files changed, 21 insertions(+) diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h index 906521c2329c..abea1974bff8 100644 --- a/include/linux/kmsg_dump.h +++ b/include/linux/kmsg_dump.h @@ -65,6 +65,8 @@ bool kmsg_dump_get_buffer(struct kmsg_dump_iter *iter, bool syslog, void kmsg_dump_rewind(struct kmsg_dump_iter *iter); +bool kmsg_has_dumpers(void); + int kmsg_dump_register(struct kmsg_dumper *dumper); int kmsg_dump_unregister(struct kmsg_dumper *dumper); @@ -91,6 +93,11 @@ static inline void kmsg_dump_rewind(struct kmsg_dump_iter *iter) { } +static inline bool kmsg_has_dumpers(void) +{ + return false; +} + static inline int kmsg_dump_register(struct kmsg_dumper *dumper) { return -EINVAL; diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index da03c15ecc89..e3a1c429fbbc 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -3399,6 +3399,18 @@ EXPORT_SYMBOL(printk_timed_ratelimit); static DEFINE_SPINLOCK(dump_list_lock); static LIST_HEAD(dump_list); +static int num_dumpers; + +/** + * kmsg_has_dumpers - inform if there is any kmsg dumper registered. + * + * Returns true if there's at least one registered dumper, or false + * if otherwise. + */ +bool kmsg_has_dumpers(void) +{ + return num_dumpers ? true : false; +} /** * kmsg_dump_register - register a kernel log dumper. @@ -3423,6 +3435,7 @@ int kmsg_dump_register(struct kmsg_dumper *dumper) dumper->registered = 1; list_add_tail_rcu(&dumper->list, &dump_list); err = 0; + num_dumpers++; } spin_unlock_irqrestore(&dump_list_lock, flags); @@ -3447,6 +3460,7 @@ int kmsg_dump_unregister(struct kmsg_dumper *dumper) dumper->registered = 0; list_del_rcu(&dumper->list); err = 0; + num_dumpers--; } spin_unlock_irqrestore(&dump_list_lock, flags); synchronize_rcu(); -- 2.36.0
[PATCH 22/30] panic: Introduce the panic post-reboot notifier list
Currently we have 3 notifier lists in the panic path, which will be wired in a way to allow the notifier callbacks to run in different moments at panic time, in a subsequent patch. But there is also an odd set of architecture calls hardcoded in the end of panic path, after the restart machinery. They're responsible for late time tunings / events, like enabling a stop button (Sparc) or effectively stopping the machine (s390). This patch introduces yet another notifier list to offer the architectures a way to add callbacks in such late moment on panic path without the need of ifdefs / hardcoded approaches. Cc: Alexander Gordeev Cc: Christian Borntraeger Cc: "David S. Miller" Cc: Heiko Carstens Cc: Sven Schnelle Cc: Vasily Gorbik Signed-off-by: Guilherme G. Piccoli --- arch/s390/kernel/setup.c | 19 ++- arch/sparc/kernel/setup_32.c | 27 +++ arch/sparc/kernel/setup_64.c | 29 - include/linux/panic_notifier.h | 1 + kernel/panic.c | 19 +++ 5 files changed, 73 insertions(+), 22 deletions(-) diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index d860ac300919..d816b2045f1e 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -39,7 +39,6 @@ #include #include #include -#include #include #include #include @@ -51,6 +50,7 @@ #include #include #include +#include #include #include @@ -943,6 +943,20 @@ static void __init log_component_list(void) } } +/* + * The following notifier executes as one of the latest things in the panic + * path, only if the restart routines weren't executed (or didn't succeed). + */ +static int panic_event(struct notifier_block *n, unsigned long ev, void *unused) +{ + disabled_wait(); + return NOTIFY_DONE; +} + +static struct notifier_block post_reboot_panic_block = { + .notifier_call = panic_event, +}; + /* * Setup function called from init/main.c just after the banner * was printed. @@ -1058,4 +1072,7 @@ void __init setup_arch(char **cmdline_p) /* Add system specific data to the random pool */ setup_randomness(); + + atomic_notifier_chain_register(&panic_post_reboot_list, + &post_reboot_panic_block); } diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c index c8e0dd99f370..4e2428972f76 100644 --- a/arch/sparc/kernel/setup_32.c +++ b/arch/sparc/kernel/setup_32.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include @@ -51,6 +52,7 @@ #include "kernel.h" +int stop_a_enabled = 1; struct screen_info screen_info = { 0, 0, /* orig-x, orig-y */ 0, /* unused */ @@ -293,6 +295,24 @@ void __init sparc32_start_kernel(struct linux_romvec *rp) start_kernel(); } +/* + * The following notifier executes as one of the latest things in the panic + * path, only if the restart routines weren't executed (or didn't succeed). + */ +static int panic_event(struct notifier_block *n, unsigned long ev, void *unused) +{ + /* Make sure the user can actually press Stop-A (L1-A) */ + stop_a_enabled = 1; + pr_emerg("Press Stop-A (L1-A) from sun keyboard or send break\n" + "twice on console to return to the boot prom\n"); + + return NOTIFY_DONE; +} + +static struct notifier_block post_reboot_panic_block = { + .notifier_call = panic_event, +}; + void __init setup_arch(char **cmdline_p) { int i; @@ -368,9 +388,10 @@ void __init setup_arch(char **cmdline_p) paging_init(); smp_setup_cpu_possible_map(); -} -extern int stop_a_enabled; + atomic_notifier_chain_register(&panic_post_reboot_list, + &post_reboot_panic_block); +} void sun_do_break(void) { @@ -384,8 +405,6 @@ void sun_do_break(void) } EXPORT_SYMBOL(sun_do_break); -int stop_a_enabled = 1; - static int __init topology_init(void) { int i, ncpus, err; diff --git a/arch/sparc/kernel/setup_64.c b/arch/sparc/kernel/setup_64.c index 48abee4eee29..9066c25ecc07 100644 --- a/arch/sparc/kernel/setup_64.c +++ b/arch/sparc/kernel/setup_64.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include @@ -62,6 +63,8 @@ #include "entry.h" #include "kernel.h" +int stop_a_enabled = 1; + /* Used to synchronize accesses to NatSemi SUPER I/O chip configure * operations in asm/ns87303.h */ @@ -632,6 +635,24 @@ void __init alloc_irqstack_bootmem(void) } } +/* + * The following notifier executes as one of the latest things in the panic + * path, only if the restart routines weren't executed (or didn't succeed). + */ +static int panic_event(struct notifier_block *n, unsigned long ev, void *unused) +{ + /* Make sure the user can actually press Stop-A (L1-A) */ + stop_a_enabled = 1; + pr_emerg("Press Stop
[PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
This patch renames the panic_notifier_list to panic_pre_reboot_list; the idea is that a subsequent patch will refactor the panic path in order to better split the notifiers, running some of them very early, some of them not so early [but still before kmsg_dump()] and finally, the rest should execute late, after kdump. The latter ones are now in the panic pre-reboot list - the name comes from the idea that these notifiers execute before panic() attempts rebooting the machine (if that option is set). We also took the opportunity to clean-up useless header inclusions, improve some notifier block declarations (e.g. in ibmasm/heartbeat.c) and more important, change some priorities - we hereby set 2 notifiers to run late in the list [iss_panic_event() and the IPMI panic_event()] due to the risks they offer (may not return, for example). Proper documentation is going to be provided in a subsequent patch, that effectively refactors the panic path. Cc: Alex Elder Cc: Alexander Gordeev Cc: Anton Ivanov Cc: Benjamin Herrenschmidt Cc: Bjorn Andersson Cc: Boris Ostrovsky Cc: Chris Zankel Cc: Christian Borntraeger Cc: Corey Minyard Cc: Dexuan Cui Cc: "H. Peter Anvin" Cc: Haiyang Zhang Cc: Heiko Carstens Cc: Helge Deller Cc: Ivan Kokshaysky Cc: "James E.J. Bottomley" Cc: James Morse Cc: Johannes Berg Cc: Juergen Gross Cc: "K. Y. Srinivasan" Cc: Mathieu Poirier Cc: Matt Turner Cc: Mauro Carvalho Chehab Cc: Max Filippov Cc: Michael Ellerman Cc: Paul Mackerras Cc: Pavel Machek Cc: Richard Henderson Cc: Richard Weinberger Cc: Robert Richter Cc: Stefano Stabellini Cc: Stephen Hemminger Cc: Sven Schnelle Cc: Tony Luck Cc: Vasily Gorbik Cc: Wei Liu Signed-off-by: Guilherme G. Piccoli --- Notice that, with this name change, out-of-tree code that relies in the global exported "panic_notifier_list" will fail to build. We could easily keep the retro-compatibility by making the old symbol to still exist and point to the pre_reboot list (or even, keep the old naming). But our design choice was to allow the breakage, making users rethink their notifiers, adding them in the list that fits best. If that wasn't a good decision, we're open to change it, of course. Thanks in advance for the review! arch/alpha/kernel/setup.c | 4 ++-- arch/parisc/kernel/pdc_chassis.c | 3 +-- arch/powerpc/kernel/setup-common.c| 2 +- arch/s390/kernel/ipl.c| 4 ++-- arch/um/drivers/mconsole_kern.c | 2 +- arch/um/kernel/um_arch.c | 2 +- arch/x86/xen/enlighten.c | 2 +- arch/xtensa/platforms/iss/setup.c | 4 ++-- drivers/char/ipmi/ipmi_msghandler.c | 12 +++- drivers/edac/altera_edac.c| 3 +-- drivers/hv/vmbus_drv.c| 4 ++-- drivers/leds/trigger/ledtrig-panic.c | 3 +-- drivers/misc/ibmasm/heartbeat.c | 16 +--- drivers/net/ipa/ipa_smp2p.c | 5 ++--- drivers/parisc/power.c| 4 ++-- drivers/remoteproc/remoteproc_core.c | 6 -- drivers/s390/char/con3215.c | 2 +- drivers/s390/char/con3270.c | 2 +- drivers/s390/char/sclp_con.c | 2 +- drivers/s390/char/sclp_vt220.c| 2 +- drivers/staging/olpc_dcon/olpc_dcon.c | 6 -- drivers/video/fbdev/hyperv_fb.c | 4 ++-- include/linux/panic_notifier.h| 2 +- kernel/panic.c| 9 - 24 files changed, 54 insertions(+), 51 deletions(-) diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c index d88bdf852753..8ace0d7113b6 100644 --- a/arch/alpha/kernel/setup.c +++ b/arch/alpha/kernel/setup.c @@ -472,8 +472,8 @@ setup_arch(char **cmdline_p) } /* Register a call for panic conditions. */ - atomic_notifier_chain_register(&panic_notifier_list, - &alpha_panic_block); + atomic_notifier_chain_register(&panic_pre_reboot_list, + &alpha_panic_block); #ifndef alpha_using_srm /* Assume that we've booted from SRM if we haven't booted from MILO. diff --git a/arch/parisc/kernel/pdc_chassis.c b/arch/parisc/kernel/pdc_chassis.c index da154406d368..0fd8d87fb4f9 100644 --- a/arch/parisc/kernel/pdc_chassis.c +++ b/arch/parisc/kernel/pdc_chassis.c @@ -22,7 +22,6 @@ #include #include #include -#include #include #include #include @@ -135,7 +134,7 @@ void __init parisc_pdc_chassis_init(void) PDC_CHASSIS_VER); /* initialize panic notifier chain */ - atomic_notifier_chain_register(&panic_notifier_list, + atomic_notifier_chain_register(&panic_pre_reboot_list, &pdc_chassis_panic_block); /* initialize reboot notifier chain */ diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index d04b8bf8dbc7..3518bebc10ad 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/
[PATCH 20/30] panic: Add the panic informational notifier list
The goal of this new panic notifier is to allow its users to register callbacks to run earlier in the panic path than they currently do. This aims at informational mechanisms, like dumping kernel offsets and showing device error data (in case it's simple registers reading, for example) as well as mechanisms to disable log flooding (like hung_task detector / RCU warnings) and the tracing dump_on_oops (when enabled). Any (non-invasive) information that should be provided before kmsg_dump() as well as log flooding preventing code should fit here, as long it offers relatively low risk for kdump. For now, the patch is almost a no-op, although it changes a bit the ordering in which some panic notifiers are executed - specially affected by this are the notifiers responsible for disabling the hung_task detector / RCU warnings, which now run first. In a subsequent patch, the panic path will be refactored, then the panic informational notifiers will effectively run earlier, before ksmg_dump() (and usually before kdump as well). We also defer documenting it all properly in the subsequent refactor patch. Finally, while at it, we removed some useless header inclusions too. Cc: Benjamin Herrenschmidt Cc: Catalin Marinas Cc: Florian Fainelli Cc: Frederic Weisbecker Cc: "H. Peter Anvin" Cc: Hari Bathini Cc: Joel Fernandes Cc: Jonathan Hunter Cc: Josh Triplett Cc: Lai Jiangshan Cc: Leo Yan Cc: Mathieu Desnoyers Cc: Mathieu Poirier Cc: Michael Ellerman Cc: Mike Leach Cc: Mikko Perttunen Cc: Neeraj Upadhyay Cc: Nicholas Piggin Cc: Paul Mackerras Cc: Suzuki K Poulose Cc: Thierry Reding Cc: Thomas Bogendoerfer Signed-off-by: Guilherme G. Piccoli --- arch/arm64/kernel/setup.c | 2 +- arch/mips/kernel/relocate.c | 2 +- arch/powerpc/kernel/setup-common.c| 2 +- arch/x86/kernel/setup.c | 2 +- drivers/bus/brcmstb_gisb.c| 2 +- drivers/hwtracing/coresight/coresight-cpu-debug.c | 4 ++-- drivers/soc/tegra/ari-tegra186.c | 3 ++- include/linux/panic_notifier.h| 1 + kernel/hung_task.c| 3 ++- kernel/panic.c| 4 kernel/rcu/tree.c | 1 - kernel/rcu/tree_stall.h | 3 ++- kernel/trace/trace.c | 2 +- 13 files changed, 19 insertions(+), 12 deletions(-) diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 3505789cf4bd..ac2c7e8c9c6a 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -444,7 +444,7 @@ static struct notifier_block arm64_panic_block = { static int __init register_arm64_panic_block(void) { - atomic_notifier_chain_register(&panic_notifier_list, + atomic_notifier_chain_register(&panic_info_list, &arm64_panic_block); return 0; } diff --git a/arch/mips/kernel/relocate.c b/arch/mips/kernel/relocate.c index 56b51de2dc51..650811f2436a 100644 --- a/arch/mips/kernel/relocate.c +++ b/arch/mips/kernel/relocate.c @@ -459,7 +459,7 @@ static struct notifier_block kernel_location_notifier = { static int __init register_kernel_offset_dumper(void) { - atomic_notifier_chain_register(&panic_notifier_list, + atomic_notifier_chain_register(&panic_info_list, &kernel_location_notifier); return 0; } diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 1468c3937bf4..d04b8bf8dbc7 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -757,7 +757,7 @@ void __init setup_panic(void) &ppc_fadump_block); if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0) - atomic_notifier_chain_register(&panic_notifier_list, + atomic_notifier_chain_register(&panic_info_list, &kernel_offset_notifier); /* Low-level platform-specific routines that should run on panic */ diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index c95b9ac5a457..599b25346964 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1266,7 +1266,7 @@ static struct notifier_block kernel_offset_notifier = { static int __init register_kernel_offset_dumper(void) { - atomic_notifier_chain_register(&panic_notifier_list, + atomic_notifier_chain_register(&panic_info_list, &kernel_offset_notifier); return 0; } diff --git a/drivers/bus/brcmstb_gisb.c b/drivers/bus/brcmstb_gisb.c index 1ea7b015e225..c64e087fba7a 100644 --- a/drivers/bus/brcmstb_gisb.c +++ b/drivers/bus/brcmstb_gisb.c @@ -486,7 +486,7 @@ static int __init brcmstb_gisb_arb_probe(struct platform_device *pdev) if (list_is_singular(&brc
[PATCH 19/30] panic: Add the panic hypervisor notifier list
The goal of this new panic notifier is to allow its users to register callbacks to run very early in the panic path. This aims hypervisor/FW notification mechanisms as well as simple LED functions, and any other simple and safe mechanism that should run early in the panic path; more dangerous callbacks should execute later. For now, the patch is almost a no-op (although it changes a bit the ordering in which some panic notifiers are executed). In a subsequent patch, the panic path will be refactored, then the panic hypervisor notifiers will effectively run very early in the panic path. We also defer documenting it all properly in the subsequent refactor patch. While at it, we removed some useless header inclusions and fixed some notifiers return too (by using the standard NOTIFY_DONE). Cc: Alexander Gordeev Cc: Andrea Parri (Microsoft) Cc: Ard Biesheuvel Cc: Benjamin Herrenschmidt Cc: Brian Norris Cc: Christian Borntraeger Cc: Christophe JAILLET Cc: David Gow Cc: "David S. Miller" Cc: Dexuan Cui Cc: Doug Berger Cc: Evan Green Cc: Florian Fainelli Cc: Haiyang Zhang Cc: Hari Bathini Cc: Heiko Carstens Cc: Julius Werner Cc: Justin Chen Cc: "K. Y. Srinivasan" Cc: Lee Jones Cc: Markus Mayer Cc: Michael Ellerman Cc: Michael Kelley Cc: Mihai Carabas Cc: Nicholas Piggin Cc: Paul Mackerras Cc: Pavel Machek Cc: Scott Branden Cc: Sebastian Reichel Cc: Shile Zhang Cc: Stephen Hemminger Cc: Sven Schnelle Cc: Thomas Bogendoerfer Cc: Tianyu Lan Cc: Vasily Gorbik Cc: Wang ShaoBo Cc: Wei Liu Cc: zhenwei pi Signed-off-by: Guilherme G. Piccoli --- arch/mips/sgi-ip22/ip22-reset.c | 2 +- arch/mips/sgi-ip32/ip32-reset.c | 3 +-- arch/powerpc/kernel/setup-common.c | 2 +- arch/sparc/kernel/sstate.c | 3 +-- drivers/firmware/google/gsmi.c | 4 ++-- drivers/hv/vmbus_drv.c | 4 ++-- drivers/leds/trigger/ledtrig-activity.c | 4 ++-- drivers/leds/trigger/ledtrig-heartbeat.c | 4 ++-- drivers/misc/bcm-vk/bcm_vk_dev.c | 6 +++--- drivers/misc/pvpanic/pvpanic.c | 4 ++-- drivers/power/reset/ltc2952-poweroff.c | 4 ++-- drivers/s390/char/zcore.c| 5 +++-- drivers/soc/bcm/brcmstb/pm/pm-arm.c | 2 +- include/linux/panic_notifier.h | 1 + kernel/panic.c | 4 15 files changed, 28 insertions(+), 24 deletions(-) diff --git a/arch/mips/sgi-ip22/ip22-reset.c b/arch/mips/sgi-ip22/ip22-reset.c index 8f0861c58080..3023848acbf1 100644 --- a/arch/mips/sgi-ip22/ip22-reset.c +++ b/arch/mips/sgi-ip22/ip22-reset.c @@ -195,7 +195,7 @@ static int __init reboot_setup(void) } timer_setup(&blink_timer, blink_timeout, 0); - atomic_notifier_chain_register(&panic_notifier_list, &panic_block); + atomic_notifier_chain_register(&panic_hypervisor_list, &panic_block); return 0; } diff --git a/arch/mips/sgi-ip32/ip32-reset.c b/arch/mips/sgi-ip32/ip32-reset.c index 18d1c115cd53..9ee1302c9d13 100644 --- a/arch/mips/sgi-ip32/ip32-reset.c +++ b/arch/mips/sgi-ip32/ip32-reset.c @@ -15,7 +15,6 @@ #include #include #include -#include #include #include #include @@ -145,7 +144,7 @@ static __init int ip32_reboot_setup(void) pm_power_off = ip32_machine_halt; timer_setup(&blink_timer, blink_timeout, 0); - atomic_notifier_chain_register(&panic_notifier_list, &panic_block); + atomic_notifier_chain_register(&panic_hypervisor_list, &panic_block); return 0; } diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 52f96b209a96..1468c3937bf4 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -753,7 +753,7 @@ static struct notifier_block ppc_panic_block = { void __init setup_panic(void) { /* Hard-disables IRQs + deal with FW-assisted dump (fadump) */ - atomic_notifier_chain_register(&panic_notifier_list, + atomic_notifier_chain_register(&panic_hypervisor_list, &ppc_fadump_block); if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0) diff --git a/arch/sparc/kernel/sstate.c b/arch/sparc/kernel/sstate.c index 3bcc4ddc6911..82b7b68e0bdc 100644 --- a/arch/sparc/kernel/sstate.c +++ b/arch/sparc/kernel/sstate.c @@ -5,7 +5,6 @@ */ #include -#include #include #include #include @@ -106,7 +105,7 @@ static int __init sstate_init(void) do_set_sstate(HV_SOFT_STATE_TRANSITION, booting_msg); - atomic_notifier_chain_register(&panic_notifier_list, + atomic_notifier_chain_register(&panic_hypervisor_list, &sstate_panic_block); register_reboot_notifier(&sstate_reboot_notifier); diff --git a/drivers/firmware/google/gsmi.c b/drivers/firmware/google/gsmi.c index b01ed02e4a87..ff0bebe2f444 100644 --- a/drivers/firmware/google/gsmi.c +++ b/drivers/firmware/google/gsmi.c @@ -1034,7 +1034,7 @@ s
[PATCH 18/30] notifier: Show function names on notifier routines if DEBUG_NOTIFIERS is set
Currently we have a debug infrastructure in the notifiers file, but it's very simple/limited. This patch extends it by: (a) Showing all registered/unregistered notifiers' callback names; (b) Adding a dynamic debug tuning to allow showing called notifiers' function names. Notice that this should be guarded as a tunable since it can flood the kernel log buffer. Cc: Arjan van de Ven Cc: Cong Wang Cc: Sebastian Andrzej Siewior Cc: Valentin Schneider Cc: Xiaoming Ni Signed-off-by: Guilherme G. Piccoli --- We have some design decisions that worth discussing here: (a) First of call, using C99 helps a lot to write clear and concise code, but due to commit 4d94f910e79a ("Kbuild: use -Wdeclaration-after-statement") we have a warning if mixing variable declarations with code. For this patch though, doing that makes the code way clear, so decision was to add the debug code inside brackets whenever this warning pops up. We can change that, but that'll cause more ifdefs in the same function. (b) In the symbol lookup helper function, we modify the parameter passed but even more, we return it as well! This is unusual and seems unnecessary, but was the strategy taken to allow embedding such function in the pr_debug() call. Not doing that would likely requiring 3 symbol_name variables to avoid concurrency (registering notifier A while calling notifier B) - we rely in local variables as a serialization mechanism. We're open for suggestions in case this design is not appropriate; thanks in advance! kernel/notifier.c | 48 +-- 1 file changed, 46 insertions(+), 2 deletions(-) diff --git a/kernel/notifier.c b/kernel/notifier.c index ba005ebf4730..21032ebcde57 100644 --- a/kernel/notifier.c +++ b/kernel/notifier.c @@ -7,6 +7,22 @@ #include #include +#ifdef CONFIG_DEBUG_NOTIFIERS +#include + +/* + * Helper to get symbol names in case DEBUG_NOTIFIERS is set. + * Return the modified parameter is a strategy used to achieve + * the pr_debug() functionality - with this, function is only + * executed if the dynamic debug tuning is effectively set. + */ +static inline char *notifier_name(struct notifier_block *nb, char *sym_name) +{ + lookup_symbol_name((unsigned long)(nb->notifier_call), sym_name); + return sym_name; +} +#endif + /* * Notifier list for kernel code which wants to be called * at shutdown. This is used to stop any idling DMA operations @@ -34,20 +50,41 @@ static int notifier_chain_register(struct notifier_block **nl, } n->next = *nl; rcu_assign_pointer(*nl, n); + +#ifdef CONFIG_DEBUG_NOTIFIERS + { + char sym_name[KSYM_NAME_LEN]; + + pr_info("notifiers: registered %s()\n", + notifier_name(n, sym_name)); + } +#endif return 0; } static int notifier_chain_unregister(struct notifier_block **nl, struct notifier_block *n) { + int ret = -ENOENT; + while ((*nl) != NULL) { if ((*nl) == n) { rcu_assign_pointer(*nl, n->next); - return 0; + ret = 0; + break; } nl = &((*nl)->next); } - return -ENOENT; + +#ifdef CONFIG_DEBUG_NOTIFIERS + if (!ret) { + char sym_name[KSYM_NAME_LEN]; + + pr_info("notifiers: unregistered %s()\n", + notifier_name(n, sym_name)); + } +#endif + return ret; } /** @@ -80,6 +117,13 @@ static int notifier_call_chain(struct notifier_block **nl, nb = next_nb; continue; } + + { + char sym_name[KSYM_NAME_LEN]; + + pr_debug("notifiers: calling %s()\n", +notifier_name(nb, sym_name)); + } #endif ret = nb->notifier_call(nb, val, v); -- 2.36.0
[PATCH 17/30] tracing: Improve panic/die notifiers
Currently the tracing dump_on_oops feature is implemented through separate notifiers, one for die/oops and the other for panic. With the addition of panic notifier "id", this patch makes use of such "id" to unify both functions. It also comments the function and changes the priority of the notifier blocks, in order they run early compared to other notifiers, to prevent useless trace data (like the callback names for the other notifiers). Finally, we also removed an unnecessary header inclusion. Signed-off-by: Guilherme G. Piccoli --- kernel/trace/trace.c | 57 +--- 1 file changed, 32 insertions(+), 25 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index f4de111fa18f..c1d8a3622ccc 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -19,7 +19,6 @@ #include #include #include -#include #include #include #include @@ -9767,38 +9766,46 @@ static __init int tracer_init_tracefs(void) fs_initcall(tracer_init_tracefs); -static int trace_panic_handler(struct notifier_block *this, - unsigned long event, void *unused) +/* + * The idea is to execute the following die/panic callback early, in order + * to avoid showing irrelevant information in the trace (like other panic + * notifier functions); we are the 2nd to run, after hung_task/rcu_stall + * warnings get disabled (to prevent potential log flooding). + */ +static int trace_die_panic_handler(struct notifier_block *self, + unsigned long ev, void *unused) { - if (ftrace_dump_on_oops) + int do_dump; + + if (!ftrace_dump_on_oops) + return NOTIFY_DONE; + + switch (ev) { + case DIE_OOPS: + do_dump = 1; + break; + case PANIC_NOTIFIER: + do_dump = 1; + break; + default: + do_dump = 0; + break; + } + + if (do_dump) ftrace_dump(ftrace_dump_on_oops); - return NOTIFY_OK; + + return NOTIFY_DONE; } static struct notifier_block trace_panic_notifier = { - .notifier_call = trace_panic_handler, - .next = NULL, - .priority = 150 /* priority: INT_MAX >= x >= 0 */ + .notifier_call = trace_die_panic_handler, + .priority = INT_MAX - 1, }; -static int trace_die_handler(struct notifier_block *self, -unsigned long val, -void *data) -{ - switch (val) { - case DIE_OOPS: - if (ftrace_dump_on_oops) - ftrace_dump(ftrace_dump_on_oops); - break; - default: - break; - } - return NOTIFY_OK; -} - static struct notifier_block trace_die_notifier = { - .notifier_call = trace_die_handler, - .priority = 200 + .notifier_call = trace_die_panic_handler, + .priority = INT_MAX - 1, }; /* -- 2.36.0
[PATCH 16/30] drivers/hv/vmbus, video/hyperv_fb: Untangle and refactor Hyper-V panic notifiers
Currently Hyper-V guests are among the most relevant users of the panic infrastructure, like panic notifiers, kmsg dumpers, etc. The reasons rely both in cleaning-up procedures (closing a hypervisor <-> guest connection, disabling a paravirtualized timer) as well as to data collection (sending panic information to the hypervisor) and framebuffer management. The thing is: some notifiers are related to others, ordering matters, some functionalities are duplicated and there are lots of conditionals behind sending panic information to the hypervisor. This patch, as part of an effort to clean-up the panic notifiers mechanism and better document things, address some of the issues/complexities of Hyper-V panic handling through the following changes: (a) We have die and panic notifiers on vmbus_drv.c and both have goals of sending panic information to the hypervisor, though the panic notifier is also responsible for a cleaning-up procedure. This commit clears the code by splitting the panic notifier in two, one for closing the vmbus connection whereas the other is only for sending panic info to hypervisor. With that, it was possible to merge the die and panic notifiers in a single/well-documented function, and clear some conditional complexities on sending such information to the hypervisor. (b) The new panic notifier created after (a) is only doing a single thing: cleaning the vmbus connection. This procedure might cause a delay (due to hypervisor I/O completion), so we postpone that to run late. But more relevant: this *same* vmbus unloading happens in the crash_shutdown() handler, so if kdump is set, we can safely skip this panic notifier and defer such clean-up to the kexec crash handler. (c) There is also a Hyper-V framebuffer panic notifier, which relies in doing a vmbus operation that demands a valid connection. So, we must order this notifier with the panic notifier from vmbus_drv.c, in order to guarantee that the framebuffer code executes before the vmbus connection is unloaded. Also, this commit removes a useless header. Although there is code rework and re-ordering, we expect that this change has no functional regressions but instead optimize the path and increase panic reliability on Hyper-V. This was tested on Hyper-V with success. Fixes: 792f232d57ff ("Drivers: hv: vmbus: Fix potential crash on module unload") Fixes: 74347a99e73a ("x86/Hyper-V: Unload vmbus channel in hv panic callback") Cc: Andrea Parri (Microsoft) Cc: Dexuan Cui Cc: Haiyang Zhang Cc: "K. Y. Srinivasan" Cc: Michael Kelley Cc: Stephen Hemminger Cc: Tianyu Lan Cc: Wei Liu Tested-by: Fabio A M Martins Signed-off-by: Guilherme G. Piccoli --- Special thanks to Michael Kelley for the good information about the Hyper-V panic path in email threads some months ago, and to Fabio for the testing performed. Michael and all Microsoft folks: a careful analysis to double-check our changes and assumptions here is really appreciated, this code is complex and intricate, it is possible some corner case might have been overlooked. Thanks in advance! drivers/hv/vmbus_drv.c | 109 drivers/video/fbdev/hyperv_fb.c | 8 +++ 2 files changed, 76 insertions(+), 41 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 14de17087864..f37f12d48001 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -24,11 +24,11 @@ #include #include -#include #include #include #include #include +#include #include #include #include @@ -68,51 +68,75 @@ static int hyperv_report_reg(void) return !sysctl_record_panic_msg || !hv_panic_page; } -static int hyperv_panic_event(struct notifier_block *nb, unsigned long val, +/* + * The panic notifier below is responsible solely for unloading the + * vmbus connection, which is necessary in a panic event. But notice + * that this same unloading procedure is executed in the Hyper-V + * crash_shutdown() handler [see hv_crash_handler()], which basically + * means that we can postpone its execution if we have kdump set, + * since it will run the crash_shutdown() handler anyway. Even more + * intrincated is the relation of this notifier with Hyper-V framebuffer + * panic notifier - we need vmbus connection alive there in order to + * succeed, so we need to order both with each other [for reference see + * hvfb_on_panic()] - this is done using notifiers' priorities. + */ +static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val, void *args) +{ + if (!kexec_crash_loaded()) + vmbus_initiate_unload(true); + + return NOTIFY_DONE; +} +static struct notifier_block hyperv_panic_vmbus_unload_block = { + .notifier_call = hv_panic_vmbus_unload, + .priority = INT_MIN + 1, /* almost the latest one to execute */ +}; + +/* + * The following callback works both as die and panic notifier; its + * goal is to provide panic informatio
[PATCH 15/30] bus: brcmstb_gisb: Clean-up panic/die notifiers
This patch improves the panic/die notifiers in this driver by making use of a passed "id" instead of comparing pointer address; also, it removes an useless prototype declaration and unnecessary header inclusion. This is part of a panic notifiers refactor - this notifier in the future will be moved to a new list, that encompass the information notifiers only. Fixes: 9eb60880d9a9 ("bus: brcmstb_gisb: add notifier handling") Cc: Brian Norris Cc: Florian Fainelli Signed-off-by: Guilherme G. Piccoli --- drivers/bus/brcmstb_gisb.c | 26 +++--- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/drivers/bus/brcmstb_gisb.c b/drivers/bus/brcmstb_gisb.c index 183d5cc37d42..1ea7b015e225 100644 --- a/drivers/bus/brcmstb_gisb.c +++ b/drivers/bus/brcmstb_gisb.c @@ -19,7 +19,6 @@ #include #include #include -#include #ifdef CONFIG_MIPS #include @@ -347,25 +346,14 @@ static irqreturn_t brcmstb_gisb_bp_handler(int irq, void *dev_id) /* * Dump out gisb errors on die or panic. */ -static int dump_gisb_error(struct notifier_block *self, unsigned long v, - void *p); - -static struct notifier_block gisb_die_notifier = { - .notifier_call = dump_gisb_error, -}; - -static struct notifier_block gisb_panic_notifier = { - .notifier_call = dump_gisb_error, -}; - static int dump_gisb_error(struct notifier_block *self, unsigned long v, void *p) { struct brcmstb_gisb_arb_device *gdev; - const char *reason = "panic"; + const char *reason = "die"; - if (self == &gisb_die_notifier) - reason = "die"; + if (v == PANIC_NOTIFIER) + reason = "panic"; /* iterate over each GISB arb registered handlers */ list_for_each_entry(gdev, &brcmstb_gisb_arb_device_list, next) @@ -374,6 +362,14 @@ static int dump_gisb_error(struct notifier_block *self, unsigned long v, return NOTIFY_DONE; } +static struct notifier_block gisb_die_notifier = { + .notifier_call = dump_gisb_error, +}; + +static struct notifier_block gisb_panic_notifier = { + .notifier_call = dump_gisb_error, +}; + static DEVICE_ATTR(gisb_arb_timeout, S_IWUSR | S_IRUGO, gisb_arb_get_timeout, gisb_arb_set_timeout); -- 2.36.0
[PATCH 14/30] panic: Properly identify the panic event to the notifiers' callbacks
The notifiers infrastructure provides a way to pass an "id" to the callbacks to determine what kind of event happened, i.e., what is the reason behind they getting called. The panic notifier currently pass 0, but this is soon to be used in a multi-targeted notifier, so let's pass a meaningful "id" over there. Signed-off-by: Guilherme G. Piccoli --- include/linux/panic_notifier.h | 5 + kernel/panic.c | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/include/linux/panic_notifier.h b/include/linux/panic_notifier.h index 41e32483d7a7..07dced83a783 100644 --- a/include/linux/panic_notifier.h +++ b/include/linux/panic_notifier.h @@ -9,4 +9,9 @@ extern struct atomic_notifier_head panic_notifier_list; extern bool crash_kexec_post_notifiers; +enum panic_notifier_val { + PANIC_UNUSED, + PANIC_NOTIFIER = 0xDEAD, +}; + #endif /* _LINUX_PANIC_NOTIFIERS_H */ diff --git a/kernel/panic.c b/kernel/panic.c index eb4dfb932c85..523bc9ccd0e9 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -287,7 +287,7 @@ void panic(const char *fmt, ...) * Run any panic handlers, including those that might need to * add information to the kmsg dump output. */ - atomic_notifier_call_chain(&panic_notifier_list, 0, buf); + atomic_notifier_call_chain(&panic_notifier_list, PANIC_NOTIFIER, buf); panic_print_sys_info(false); -- 2.36.0
[PATCH 13/30] s390/consoles: Improve panic notifiers reliability
Currently many console drivers for s390 rely on panic/reboot notifiers to invoke callbacks on these events. The panic() function disables local IRQs, secondary CPUs and preemption, so callbacks invoked on panic are effectively running in atomic context. Happens that most of these console callbacks from s390 doesn't take the proper care with regards to atomic context, like taking spinlocks that might be taken in other function/CPU and hence will cause a lockup situation. The goal for this patch is to improve the notifiers reliability, acting on 4 console drivers, as detailed below: (1) con3215: changed a regular spinlock to the trylock alternative. (2) con3270: also changed a regular spinlock to its trylock counterpart, but here we also have another problem: raw3270_activate_view() takes a different spinlock. So, we worked a helper to validate if this other lock is safe to acquire, and if so, raw3270_activate_view() should be safe. Notice though that there is a functional change here: it's now possible to continue the notifier code [reaching con3270_wait_write() and con3270_rebuild_update()] without executing raw3270_activate_view(). (3) sclp: a global lock is used heavily in the functions called from the notifier, so we added a check here - if the lock is taken already, we just bail-out, preventing the lockup. (4) sclp_vt220: same as (3), a lock validation was added to prevent the potential lockup problem. Besides (1)-(4), we also removed useless void functions, adding the code called from the notifier inside its own body, and changed the priority of such notifiers to execute late, since they are "heavyweight" for the panic environment, so we aim to reduce risks here. Changed return values to NOTIFY_DONE as well, the standard one. Cc: Alexander Gordeev Cc: Christian Borntraeger Cc: Heiko Carstens Cc: Sven Schnelle Cc: Vasily Gorbik Signed-off-by: Guilherme G. Piccoli --- As a design choice, the option used here to verify a given spinlock is taken was the function "spin_is_locked()" - but we noticed that it is not often used. An alternative would to take the lock with a spin_trylock() and if it succeeds, just release the spinlock and continue the code. But that seemed weird... Also, we'd like to ask a good validation of case (2) potential functionality change from the s390 console experts - far from expert here, and in our naive code observation, that seems fine, but that analysis might be missing some corner case. Thanks in advance! drivers/s390/char/con3215.c| 36 +++-- drivers/s390/char/con3270.c| 34 +++ drivers/s390/char/raw3270.c| 18 +++ drivers/s390/char/raw3270.h| 1 + drivers/s390/char/sclp_con.c | 28 +-- drivers/s390/char/sclp_vt220.c | 42 +++--- 6 files changed, 96 insertions(+), 63 deletions(-) diff --git a/drivers/s390/char/con3215.c b/drivers/s390/char/con3215.c index f356607835d8..192198bd3dc4 100644 --- a/drivers/s390/char/con3215.c +++ b/drivers/s390/char/con3215.c @@ -771,35 +771,37 @@ static struct tty_driver *con3215_device(struct console *c, int *index) } /* - * panic() calls con3215_flush through a panic_notifier - * before the system enters a disabled, endless loop. + * The below function is called as a panic/reboot notifier before the + * system enters a disabled, endless loop. + * + * Notice we must use the spin_trylock() alternative, to prevent lockups + * in atomic context (panic routine runs with secondary CPUs, local IRQs + * and preemption disabled). */ -static void con3215_flush(void) -{ - struct raw3215_info *raw; - unsigned long flags; - - raw = raw3215[0]; /* console 3215 is the first one */ - spin_lock_irqsave(get_ccwdev_lock(raw->cdev), flags); - raw3215_make_room(raw, RAW3215_BUFFER_SIZE); - spin_unlock_irqrestore(get_ccwdev_lock(raw->cdev), flags); -} - static int con3215_notify(struct notifier_block *self, unsigned long event, void *data) { - con3215_flush(); - return NOTIFY_OK; + struct raw3215_info *raw; + unsigned long flags; + + raw = raw3215[0]; /* console 3215 is the first one */ + if (!spin_trylock_irqsave(get_ccwdev_lock(raw->cdev), flags)) + return NOTIFY_DONE; + + raw3215_make_room(raw, RAW3215_BUFFER_SIZE); + spin_unlock_irqrestore(get_ccwdev_lock(raw->cdev), flags); + + return NOTIFY_DONE; } static struct notifier_block on_panic_nb = { .notifier_call = con3215_notify, - .priority = 0, + .priority = INT_MIN + 1, /* run the callback late */ }; static struct notifier_block on_reboot_nb = { .notifier_call = con3215_notify, - .priority = 0, + .priority = INT_MIN + 1, /* run the callback late */ }; /* diff --git a/drivers/s390/char/con3270.c b/drivers/s390/char/con3270.c index e4592890f20a..476202f3d8a0 100644 --- a/drivers/s
[PATCH 12/30] parisc: Replace regular spinlock with spin_trylock on panic path
The panic notifiers' callbacks execute in an atomic context, with interrupts/preemption disabled, and all CPUs not running the panic function are off, so it's very dangerous to wait on a regular spinlock, there's a risk of deadlock. This patch refactors the panic notifier of parisc/power driver to make use of spin_trylock - for that, we've added a second version of the soft-power function. Also, some comments were reorganized and trailing white spaces, useless header inclusion and blank lines were removed. Cc: Helge Deller Cc: "James E.J. Bottomley" Signed-off-by: Guilherme G. Piccoli --- arch/parisc/include/asm/pdc.h | 1 + arch/parisc/kernel/firmware.c | 27 +++ drivers/parisc/power.c| 17 ++--- 3 files changed, 34 insertions(+), 11 deletions(-) diff --git a/arch/parisc/include/asm/pdc.h b/arch/parisc/include/asm/pdc.h index b643092d4b98..7a106008e258 100644 --- a/arch/parisc/include/asm/pdc.h +++ b/arch/parisc/include/asm/pdc.h @@ -83,6 +83,7 @@ int pdc_do_firm_test_reset(unsigned long ftc_bitmap); int pdc_do_reset(void); int pdc_soft_power_info(unsigned long *power_reg); int pdc_soft_power_button(int sw_control); +int pdc_soft_power_button_panic(int sw_control); void pdc_io_reset(void); void pdc_io_reset_devices(void); int pdc_iodc_getc(void); diff --git a/arch/parisc/kernel/firmware.c b/arch/parisc/kernel/firmware.c index 6a7e315bcc2e..0e2f70b592f4 100644 --- a/arch/parisc/kernel/firmware.c +++ b/arch/parisc/kernel/firmware.c @@ -1232,15 +1232,18 @@ int __init pdc_soft_power_info(unsigned long *power_reg) } /* - * pdc_soft_power_button - Control the soft power button behaviour - * @sw_control: 0 for hardware control, 1 for software control + * pdc_soft_power_button{_panic} - Control the soft power button behaviour + * @sw_control: 0 for hardware control, 1 for software control * * * This PDC function places the soft power button under software or * hardware control. - * Under software control the OS may control to when to allow to shut - * down the system. Under hardware control pressing the power button + * Under software control the OS may control to when to allow to shut + * down the system. Under hardware control pressing the power button * powers off the system immediately. + * + * The _panic version relies in spin_trylock to prevent deadlock + * on panic path. */ int pdc_soft_power_button(int sw_control) { @@ -1254,6 +1257,22 @@ int pdc_soft_power_button(int sw_control) return retval; } +int pdc_soft_power_button_panic(int sw_control) +{ + int retval; + unsigned long flags; + + if (!spin_trylock_irqsave(&pdc_lock, flags)) { + pr_emerg("Couldn't enable soft power button\n"); + return -EBUSY; /* ignored by the panic notifier */ + } + + retval = mem_pdc_call(PDC_SOFT_POWER, PDC_SOFT_POWER_ENABLE, __pa(pdc_result), sw_control); + spin_unlock_irqrestore(&pdc_lock, flags); + + return retval; +} + /* * pdc_io_reset - Hack to avoid overlapping range registers of Bridges devices. * Primarily a problem on T600 (which parisc-linux doesn't support) but diff --git a/drivers/parisc/power.c b/drivers/parisc/power.c index 456776bd8ee6..8512884de2cf 100644 --- a/drivers/parisc/power.c +++ b/drivers/parisc/power.c @@ -37,7 +37,6 @@ #include #include #include -#include #include #include #include @@ -175,16 +174,21 @@ static void powerfail_interrupt(int code, void *x) -/* parisc_panic_event() is called by the panic handler. - * As soon as a panic occurs, our tasklets above will not be - * executed any longer. This function then re-enables the - * soft-power switch and allows the user to switch off the system +/* + * parisc_panic_event() is called by the panic handler. + * + * As soon as a panic occurs, our tasklets above will not + * be executed any longer. This function then re-enables + * the soft-power switch and allows the user to switch off + * the system. We rely in pdc_soft_power_button_panic() + * since this version spin_trylocks (instead of regular + * spinlock), preventing deadlocks on panic path. */ static int parisc_panic_event(struct notifier_block *this, unsigned long event, void *ptr) { /* re-enable the soft-power switch */ - pdc_soft_power_button(0); + pdc_soft_power_button_panic(0); return NOTIFY_DONE; } @@ -193,7 +197,6 @@ static struct notifier_block parisc_panic_block = { .priority = INT_MAX, }; - static int __init power_init(void) { unsigned long ret; -- 2.36.0
[PATCH 11/30] um: Improve panic notifiers consistency and ordering
Currently the panic notifiers from user mode linux don't follow the convention for most of the other notifiers present in the kernel (indentation, priority setting, numeric return). More important, the priorities could be improved, since it's a special case (userspace), hence we could run the notifiers earlier; user mode linux shouldn't care much with other panic notifiers but the ordering among the mconsole and arch notifier is important, given that the arch one effectively triggers a core dump. This patch fixes that by running the mconsole notifier as the first panic notifier, followed by the architecture one (that coredumps). Also, we remove a useless header inclusion. Cc: Anton Ivanov Cc: Johannes Berg Cc: Richard Weinberger Signed-off-by: Guilherme G. Piccoli --- arch/um/drivers/mconsole_kern.c | 8 +++- arch/um/kernel/um_arch.c| 8 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c index 8ca67a692683..2ea0421bcc3f 100644 --- a/arch/um/drivers/mconsole_kern.c +++ b/arch/um/drivers/mconsole_kern.c @@ -11,7 +11,6 @@ #include #include #include -#include #include #include #include @@ -846,13 +845,12 @@ static int notify_panic(struct notifier_block *self, unsigned long unused1, mconsole_notify(notify_socket, MCONSOLE_PANIC, message, strlen(message) + 1); - return 0; + return NOTIFY_DONE; } static struct notifier_block panic_exit_notifier = { - .notifier_call = notify_panic, - .next = NULL, - .priority = 1 + .notifier_call = notify_panic, + .priority = INT_MAX, /* run as soon as possible */ }; static int add_notifier(void) diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index 0760e24f2eba..4485b1a7c8e4 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -246,13 +246,13 @@ static int panic_exit(struct notifier_block *self, unsigned long unused1, bust_spinlocks(0); uml_exitcode = 1; os_dump_core(); - return 0; + + return NOTIFY_DONE; } static struct notifier_block panic_exit_notifier = { - .notifier_call = panic_exit, - .next = NULL, - .priority = 0 + .notifier_call = panic_exit, + .priority = INT_MAX - 1, /* run as 2nd notifier, won't return */ }; void uml_finishsetup(void) -- 2.36.0
[PATCH 10/30] alpha: Clean-up the panic notifier code
The alpha panic notifier has some code issues, not following the conventions of other notifiers. Also, it might halt the machine but still it is set to run as early as possible, which doesn't seem to be a good idea. This patch cleans the code, and set the notifier to run as the latest, following the same approach other architectures are doing. Also, we remove the unnecessary include of a header already included indirectly. Cc: Ivan Kokshaysky Cc: Matt Turner Cc: Richard Henderson Signed-off-by: Guilherme G. Piccoli --- arch/alpha/kernel/setup.c | 36 +++- 1 file changed, 15 insertions(+), 21 deletions(-) diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c index b4fbbba30aa2..d88bdf852753 100644 --- a/arch/alpha/kernel/setup.c +++ b/arch/alpha/kernel/setup.c @@ -41,19 +41,11 @@ #include #include #endif -#include #include #include #include #include -static int alpha_panic_event(struct notifier_block *, unsigned long, void *); -static struct notifier_block alpha_panic_block = { - alpha_panic_event, -NULL, -INT_MAX /* try to do it first */ -}; - #include #include #include @@ -435,6 +427,21 @@ static const struct sysrq_key_op srm_sysrq_reboot_op = { }; #endif +static int alpha_panic_event(struct notifier_block *this, +unsigned long event, void *ptr) +{ + /* If we are using SRM and serial console, just hard halt here. */ + if (alpha_using_srm && srmcons_output) + __halt(); + + return NOTIFY_DONE; +} + +static struct notifier_block alpha_panic_block = { + .notifier_call = alpha_panic_event, + .priority = INT_MIN, /* may not return, do it last */ +}; + void __init setup_arch(char **cmdline_p) { @@ -1427,19 +1434,6 @@ const struct seq_operations cpuinfo_op = { .show = show_cpuinfo, }; - -static int -alpha_panic_event(struct notifier_block *this, unsigned long event, void *ptr) -{ -#if 1 - /* FIXME FIXME FIXME */ - /* If we are using SRM and serial console, just hard halt here. */ - if (alpha_using_srm && srmcons_output) - __halt(); -#endif -return NOTIFY_DONE; -} - static __init int add_pcspkr(void) { struct platform_device *pd; -- 2.36.0
[PATCH 09/30] coresight: cpu-debug: Replace mutex with mutex_trylock on panic notifier
The panic notifier infrastructure executes registered callbacks when a panic event happens - such callbacks are executed in atomic context, with interrupts and preemption disabled in the running CPU and all other CPUs disabled. That said, mutexes in such context are not a good idea. This patch replaces a regular mutex with a mutex_trylock safer approach; given the nature of the mutex used in the driver, it should be pretty uncommon being unable to acquire such mutex in the panic path, hence no functional change should be observed (and if it is, that would be likely a deadlock with the regular mutex). Fixes: 2227b7c74634 ("coresight: add support for CPU debug module") Cc: Leo Yan Cc: Mathieu Poirier Cc: Mike Leach Cc: Suzuki K Poulose Signed-off-by: Guilherme G. Piccoli --- drivers/hwtracing/coresight/coresight-cpu-debug.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-cpu-debug.c b/drivers/hwtracing/coresight/coresight-cpu-debug.c index 8845ec4b4402..1874df7c6a73 100644 --- a/drivers/hwtracing/coresight/coresight-cpu-debug.c +++ b/drivers/hwtracing/coresight/coresight-cpu-debug.c @@ -380,9 +380,10 @@ static int debug_notifier_call(struct notifier_block *self, int cpu; struct debug_drvdata *drvdata; - mutex_lock(&debug_lock); + /* Bail out if we can't acquire the mutex or the functionality is off */ + if (!mutex_trylock(&debug_lock)) + return NOTIFY_DONE; - /* Bail out if the functionality is disabled */ if (!debug_enable) goto skip_dump; @@ -401,7 +402,7 @@ static int debug_notifier_call(struct notifier_block *self, skip_dump: mutex_unlock(&debug_lock); - return 0; + return NOTIFY_DONE; } static struct notifier_block debug_notifier = { -- 2.36.0
[PATCH 08/30] powerpc/setup: Refactor/untangle panic notifiers
The panic notifiers infrastructure is a bit limited in the scope of the callbacks - basically every kind of functionality is dropped in a list that runs in the same point during the kernel panic path. This is not really on par with the complexities and particularities of architecture / hypervisors' needs, and a refactor is ongoing. As part of this refactor, it was observed that powerpc has 2 notifiers, with mixed goals: one is just a KASLR offset dumper, whereas the other aims to hard-disable IRQs (necessary on panic path), warn firmware of the panic event (fadump) and run low-level platform-specific machinery that might stop kernel execution and never come back. Clearly, the 2nd notifier has opposed goals: disable IRQs / fadump should run earlier while low-level platform actions should run late since it might not even return. Hence, this patch decouples the notifiers splitting them in three: - First one is responsible for hard-disable IRQs and fadump, should run early; - The kernel KASLR offset dumper is really an informative notifier, harmless and may run at any moment in the panic path; - The last notifier should run last, since it aims to perform low-level actions for specific platforms, and might never return. It is also only registered for 2 platforms, pseries and ps3. The patch better documents the notifiers and clears the code too, also removing a useless header. Currently no functionality change should be observed, but after the planned panic refactor we should expect more panic reliability with this patch. Cc: Benjamin Herrenschmidt Cc: Hari Bathini Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Paul Mackerras Signed-off-by: Guilherme G. Piccoli --- We'd like to thanks specially the MiniCloud infrastructure [0] maintainers, that allow us to test PowerPC code in a very complete, functional and FREE environment (there's no need even for adding a credit card, like many "free" clouds require ¬¬ ). [0] https://openpower.ic.unicamp.br/minicloud arch/powerpc/kernel/setup-common.c | 74 ++ 1 file changed, 54 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 518ae5aa9410..52f96b209a96 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -23,7 +23,6 @@ #include #include #include -#include #include #include #include @@ -680,8 +679,25 @@ int check_legacy_ioport(unsigned long base_port) } EXPORT_SYMBOL(check_legacy_ioport); -static int ppc_panic_event(struct notifier_block *this, - unsigned long event, void *ptr) +/* + * Panic notifiers setup + * + * We have 3 notifiers for powerpc, each one from a different "nature": + * + * - ppc_panic_fadump_handler() is a hypervisor notifier, which hard-disables + * IRQs and deal with the Firmware-Assisted dump, when it is configured; + * should run early in the panic path. + * + * - dump_kernel_offset() is an informative notifier, just showing the KASLR + * offset if we have RANDOMIZE_BASE set. + * + * - ppc_panic_platform_handler() is a low-level handler that's registered + * only if the platform wishes to perform final actions in the panic path, + * hence it should run late and might not even return. Currently, only + * pseries and ps3 platforms register callbacks. + */ +static int ppc_panic_fadump_handler(struct notifier_block *this, + unsigned long event, void *ptr) { /* * panic does a local_irq_disable, but we really @@ -691,45 +707,63 @@ static int ppc_panic_event(struct notifier_block *this, /* * If firmware-assisted dump has been registered then trigger -* firmware-assisted dump and let firmware handle everything else. +* its callback and let the firmware handles everything else. */ crash_fadump(NULL, ptr); - if (ppc_md.panic) - ppc_md.panic(ptr); /* May not return */ + return NOTIFY_DONE; } -static struct notifier_block ppc_panic_block = { - .notifier_call = ppc_panic_event, - .priority = INT_MIN /* may not return; must be done last */ -}; - -/* - * Dump out kernel offset information on panic. - */ static int dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p) { pr_emerg("Kernel Offset: 0x%lx from 0x%lx\n", kaslr_offset(), KERNELBASE); - return 0; + return NOTIFY_DONE; } +static int ppc_panic_platform_handler(struct notifier_block *this, + unsigned long event, void *ptr) +{ + /* +* This handler is only registered if we have a panic callback +* on ppc_md, hence NULL check is not needed. +* Also, it may not return, so it runs really late on panic path. +*/ + ppc_md.panic(ptr); + + return NOTIFY_DONE; +} + +static struct notifier_b
[PATCH 07/30] mips: ip22: Reword PANICED to PANICKED and remove useless header
Many other place in the kernel prefer the latter, so let's keep it consistent in MIPS code as well. Also, removes a useless header. Cc: Thomas Bogendoerfer Signed-off-by: Guilherme G. Piccoli --- arch/mips/sgi-ip22/ip22-reset.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/arch/mips/sgi-ip22/ip22-reset.c b/arch/mips/sgi-ip22/ip22-reset.c index 9028dbbb45dd..8f0861c58080 100644 --- a/arch/mips/sgi-ip22/ip22-reset.c +++ b/arch/mips/sgi-ip22/ip22-reset.c @@ -11,7 +11,6 @@ #include #include #include -#include #include #include #include @@ -41,7 +40,7 @@ static struct timer_list power_timer, blink_timer, debounce_timer; static unsigned long blink_timer_timeout; -#define MACHINE_PANICED1 +#define MACHINE_PANICKED 1 #define MACHINE_SHUTTING_DOWN 2 static int machine_state; @@ -112,7 +111,7 @@ static void debounce(struct timer_list *unused) return; } - if (machine_state & MACHINE_PANICED) + if (machine_state & MACHINE_PANICKED) sgimc->cpuctrl0 |= SGIMC_CCTRL0_SYSINIT; enable_irq(SGI_PANEL_IRQ); @@ -120,7 +119,7 @@ static void debounce(struct timer_list *unused) static inline void power_button(void) { - if (machine_state & MACHINE_PANICED) + if (machine_state & MACHINE_PANICKED) return; if ((machine_state & MACHINE_SHUTTING_DOWN) || @@ -167,9 +166,9 @@ static irqreturn_t panel_int(int irq, void *dev_id) static int panic_event(struct notifier_block *this, unsigned long event, void *ptr) { - if (machine_state & MACHINE_PANICED) + if (machine_state & MACHINE_PANICKED) return NOTIFY_DONE; - machine_state |= MACHINE_PANICED; + machine_state |= MACHINE_PANICKED; blink_timer_timeout = PANIC_FREQ; blink_timeout(&blink_timer); -- 2.36.0
Re: [PATCH 24/30] panic: Refactor the panic path
On 4/27/22 15:49, Guilherme G. Piccoli wrote: > + crash_kexec_post_notifiers > + This was DEPRECATED - users should always prefer the This is DEPRECATED - users should always prefer the > + parameter "panic_notifiers_level" - check its entry > + in this documentation for details on how it works. > + Setting this parameter is exactly the same as setting > + "panic_notifiers_level=4". -- ~Randy
Re: [PATCH net-next 13/14] eth: spider: remove a copy of the NAPI_POLL_WEIGHT define
Hi Jakub, On 4/27/22 08:41, Jakub Kicinski wrote: > Defining local versions of NAPI_POLL_WEIGHT with the same > values in the drivers just makes refactoring harder. > --- a/drivers/net/ethernet/toshiba/spider_net.c > +++ b/drivers/net/ethernet/toshiba/spider_net.c > > netif_napi_add(netdev, &card->napi, > -spider_net_poll, SPIDER_NET_NAPI_WEIGHT); > +spider_net_poll, NAPI_POLL_WEIGHT); This seems fine. Both SPIDER_NET_NAPI_WEIGHT and NAPI_POLL_WEIGHT are defined as 64. Thanks for your contribution. Acked-by: Geoff Levand
RE: [PATCH v3 3/4] dt-bindings: interrupt-controller: fsl, ls-extirq: convert to YAML
> -Original Message- > From: Michael Walle > Sent: Wednesday, April 27, 2022 2:54 AM > To: Rob Herring ; Krzysztof Kozlowski > > Cc: Leo Li ; Michael Walle ; Shawn > Guo ; Thomas Gleixner ; Marc > Zyngier ; linuxppc-dev@lists.ozlabs.org; linux-arm- > ker...@lists.infradead.org; devicet...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: [PATCH v3 3/4] dt-bindings: interrupt-controller: fsl,ls-extirq: > convert to YAML > > Convert the fsl,ls-extirq binding to the new YAML format. > > In contrast to the original binding documentation, there are three > compatibles which are used in their corresponding device trees which have a > specific compatible and the (already documented) fallback > compatible: > - "fsl,ls1046a-extirq", "fsl,ls1043a-extirq" > - "fsl,ls2080a-extirq", "fsl,ls1088a-extirq" > - "fsl,lx2160a-extirq", "fsl,ls1088a-extirq" > > Depending on the number of the number of the external IRQs which is > usually 12 except for the LS1021A where there are only 6, the interrupt-map- > mask was reduced from 0x to 0xf and 0x7 respectively and the number > of interrupt-map entries have to match. I assume this change won't prevent driver to be compatible with older device trees using the 0x? The original 0x should work for both 6/12 interrupts or whatever reasonable number of interrupts that maybe used in future SoCs. So the purpose of this change is to make the binding more specific to catch more errors in device tree? > > Signed-off-by: Michael Walle > --- > changes since v2: > - drop $ref to interrupt-controller.yaml > - use a more strict interrupt-map-mask and make it conditional on SoC > > changes since v1: > - new patch > > .../interrupt-controller/fsl,ls-extirq.txt| 53 > .../interrupt-controller/fsl,ls-extirq.yaml | 118 ++ > 2 files changed, 118 insertions(+), 53 deletions(-) delete mode 100644 > Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.txt > create mode 100644 Documentation/devicetree/bindings/interrupt- > controller/fsl,ls-extirq.yaml > > diff --git a/Documentation/devicetree/bindings/interrupt-controller/fsl,ls- > extirq.txt b/Documentation/devicetree/bindings/interrupt-controller/fsl,ls- > extirq.txt > deleted file mode 100644 > index 4d47df1a5c91.. > --- a/Documentation/devicetree/bindings/interrupt-controller/fsl,ls- > extirq.txt > +++ /dev/null > @@ -1,53 +0,0 @@ > -* Freescale Layerscape external IRQs > - > -Some Layerscape SOCs (LS1021A, LS1043A, LS1046A -LS1088A, LS208xA, > LX216xA) support inverting -the polarity of certain external interrupt lines. > - > -The device node must be a child of the node representing the - > Supplemental Configuration Unit (SCFG). > - > -Required properties: > -- compatible: should be "fsl,-extirq", e.g. "fsl,ls1021a-extirq". > - "fsl,ls1043a-extirq": for LS1043A, LS1046A. > - "fsl,ls1088a-extirq": for LS1088A, LS208xA, LX216xA. > -- #interrupt-cells: Must be 2. The first element is the index of the > - external interrupt line. The second element is the trigger type. > -- #address-cells: Must be 0. > -- interrupt-controller: Identifies the node as an interrupt controller > -- reg: Specifies the Interrupt Polarity Control Register (INTPCR) in > - the SCFG or the External Interrupt Control Register (IRQCR) in > - the ISC. > -- interrupt-map: Specifies the mapping from external interrupts to GIC > - interrupts. > -- interrupt-map-mask: Must be <0x 0>. > - > -Example: > - scfg: scfg@157 { > - compatible = "fsl,ls1021a-scfg", "syscon"; > - reg = <0x0 0x157 0x0 0x1>; > - big-endian; > - #address-cells = <1>; > - #size-cells = <1>; > - ranges = <0x0 0x0 0x157 0x1>; > - > - extirq: interrupt-controller@1ac { > - compatible = "fsl,ls1021a-extirq"; > - #interrupt-cells = <2>; > - #address-cells = <0>; > - interrupt-controller; > - reg = <0x1ac 4>; > - interrupt-map = > - <0 0 &gic GIC_SPI 163 > IRQ_TYPE_LEVEL_HIGH>, > - <1 0 &gic GIC_SPI 164 > IRQ_TYPE_LEVEL_HIGH>, > - <2 0 &gic GIC_SPI 165 > IRQ_TYPE_LEVEL_HIGH>, > - <3 0 &gic GIC_SPI 167 > IRQ_TYPE_LEVEL_HIGH>, > - <4 0 &gic GIC_SPI 168 > IRQ_TYPE_LEVEL_HIGH>, > - <5 0 &gic GIC_SPI 169 > IRQ_TYPE_LEVEL_HIGH>; > - interrupt-map-mask = <0x 0x0>; > - }; > - }; > - > - > - interrupts-extended = <&gic GIC_SPI 88 IRQ_TYPE_LEVEL_HIGH>, > - <&extirq 1 IRQ_TYPE_LEVEL_LOW>; > diff --git a/Documentation/devicetree/bindings/interrupt-controller/fsl,ls- > extirq.yaml b/Documentation/devicetree/bindings/interrupt- >
[PATCH net-next v5 18/18] team: Remove use of list iterator variable for list_for_each_entry_from()
In preparation to limit the scope of the list iterator variable to the list traversal loop, use a dedicated pointer to iterate through the list [1]. Since that variable should not be used past the loop iteration, a separate variable is used to 'remember the current location within the loop'. To either continue iterating from that position or skip the iteration (if the previous iteration was complete) list_prepare_entry() is used. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/team/team.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index b07dde6f0abf..688c4393f099 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2425,17 +2425,17 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, int flags, team_nl_send_func_t *send_func, struct list_head *sel_opt_inst_list) { + struct team_option_inst *opt_inst, *tmp = NULL; struct nlattr *option_list; struct nlmsghdr *nlh; void *hdr; - struct team_option_inst *opt_inst; int err; struct sk_buff *skb = NULL; bool incomplete; int i; - opt_inst = list_first_entry(sel_opt_inst_list, - struct team_option_inst, tmp_list); + tmp = list_first_entry(sel_opt_inst_list, + struct team_option_inst, tmp_list); start_again: err = __send_and_alloc_skb(&skb, team, portid, send_func); @@ -2456,7 +2456,9 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, goto nla_put_failure; i = 0; + opt_inst = list_prepare_entry(tmp, sel_opt_inst_list, tmp_list); incomplete = false; + tmp = NULL; list_for_each_entry_from(opt_inst, sel_opt_inst_list, tmp_list) { err = team_nl_fill_one_option_get(skb, team, opt_inst); if (err) { @@ -2464,6 +2466,7 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, if (!i) goto errout; incomplete = true; + tmp = opt_inst; break; } goto errout; @@ -2707,14 +2710,14 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, struct nlattr *port_list; struct nlmsghdr *nlh; void *hdr; - struct team_port *port; + struct team_port *port, *tmp = NULL; int err; struct sk_buff *skb = NULL; bool incomplete; int i; - port = list_first_entry_or_null(&team->port_list, - struct team_port, list); + tmp = list_first_entry_or_null(&team->port_list, + struct team_port, list); start_again: err = __send_and_alloc_skb(&skb, team, portid, send_func); @@ -2744,7 +2747,9 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, err = team_nl_fill_one_port_get(skb, one_port); if (err) goto errout; - } else if (port) { + } else { + port = list_prepare_entry(tmp, &team->port_list, list); + tmp = NULL; list_for_each_entry_from(port, &team->port_list, list) { err = team_nl_fill_one_port_get(skb, port); if (err) { @@ -2752,6 +2757,7 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, if (!i) goto errout; incomplete = true; + tmp = port; break; } goto errout; -- 2.25.1
[PATCH net-next v5 17/18] ipvlan: Remove usage of list iterator variable for the loop body
In preparation to limit the scope of the list iterator variable to the list traversal loop, use a dedicated pointer to iterate through the list [1]. Since that variable should not be used past the loop iteration, a separate variable is used to 'remember the current location within the loop'. To either continue iterating from that position or start a new iteration (if the previous iteration was complete) list_prepare_entry() is used. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ipvlan/ipvlan_main.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index 696e245f6d00..063d7c30e944 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -9,7 +9,7 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, struct netlink_ext_ack *extack) { - struct ipvl_dev *ipvlan; + struct ipvl_dev *ipvlan, *tmp = NULL; unsigned int flags; int err; @@ -26,8 +26,10 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, flags & ~IFF_NOARP, extack); } - if (unlikely(err)) + if (unlikely(err)) { + tmp = ipvlan; goto fail; + } } if (nval == IPVLAN_MODE_L3S) { /* New mode is L3S */ @@ -43,6 +45,7 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, return 0; fail: + ipvlan = list_prepare_entry(tmp, &port->ipvlans, pnode); /* Undo the flags changes that have been done so far. */ list_for_each_entry_continue_reverse(ipvlan, &port->ipvlans, pnode) { flags = ipvlan->dev->flags; -- 2.25.1
[PATCH net-next v5 16/18] ps3_gelic: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use a found variable and simply checking if the variable was set, can determine if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- .../net/ethernet/toshiba/ps3_gelic_wireless.c | 30 +-- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c index dc14a66583ff..c8a016c902cd 100644 --- a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c +++ b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c @@ -1495,14 +1495,14 @@ static int gelic_wl_start_scan(struct gelic_wl_info *wl, int always_scan, */ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) { + struct gelic_wl_scan_info *target = NULL, *iter, *tmp; struct gelic_eurus_cmd *cmd = NULL; - struct gelic_wl_scan_info *target, *tmp; struct gelic_wl_scan_info *oldest = NULL; struct gelic_eurus_scan_info *scan_info; unsigned int scan_info_size; union iwreq_data data; unsigned long this_time = jiffies; - unsigned int data_len, i, found, r; + unsigned int data_len, i, r; void *buf; pr_debug("%s:start\n", __func__); @@ -1539,14 +1539,14 @@ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) wl->scan_stat = GELIC_WL_SCAN_STAT_GOT_LIST; /* mark all entries are old */ - list_for_each_entry_safe(target, tmp, &wl->network_list, list) { - target->valid = 0; + list_for_each_entry_safe(iter, tmp, &wl->network_list, list) { + iter->valid = 0; /* expire too old entries */ - if (time_before(target->last_scanned + wl->scan_age, + if (time_before(iter->last_scanned + wl->scan_age, this_time)) { - kfree(target->hwinfo); - target->hwinfo = NULL; - list_move_tail(&target->list, &wl->network_free_list); + kfree(iter->hwinfo); + iter->hwinfo = NULL; + list_move_tail(&iter->list, &wl->network_free_list); } } @@ -1569,22 +1569,22 @@ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) continue; } - found = 0; + target = NULL; oldest = NULL; - list_for_each_entry(target, &wl->network_list, list) { - if (ether_addr_equal(&target->hwinfo->bssid[2], + list_for_each_entry(iter, &wl->network_list, list) { + if (ether_addr_equal(&iter->hwinfo->bssid[2], &scan_info->bssid[2])) { - found = 1; + target = iter; pr_debug("%s: same BBS found scanned list\n", __func__); break; } if (!oldest || - (target->last_scanned < oldest->last_scanned)) - oldest = target; + (iter->last_scanned < oldest->last_scanned)) + oldest = iter; } - if (!found) { + if (!target) { /* not found in the list */ if (list_empty(&wl->network_free_list)) { /* expire oldest */ -- 2.25.1
[PATCH net-next v5 15/18] net: netcp: Remove usage of list iterator for list_add() after loop body
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer pointing to the location where the element should be inserted [1]. Before, the code implicitly used the head when no element was found when using &next->list. The new 'pos' variable is set to the list head by default and overwritten if the list exits early, marking the insertion point for list_add(). Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/ti/netcp_core.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c index 16507bff652a..f25104b5a31b 100644 --- a/drivers/net/ethernet/ti/netcp_core.c +++ b/drivers/net/ethernet/ti/netcp_core.c @@ -471,6 +471,7 @@ struct netcp_hook_list { int netcp_register_txhook(struct netcp_intf *netcp_priv, int order, netcp_hook_rtn *hook_rtn, void *hook_data) { + struct list_head *pos = &netcp_priv->txhook_list_head; struct netcp_hook_list *entry; struct netcp_hook_list *next; unsigned long flags; @@ -485,10 +486,12 @@ int netcp_register_txhook(struct netcp_intf *netcp_priv, int order, spin_lock_irqsave(&netcp_priv->lock, flags); list_for_each_entry(next, &netcp_priv->txhook_list_head, list) { - if (next->order > order) + if (next->order > order) { + pos = &next->list; break; + } } - __list_add(&entry->list, next->list.prev, &next->list); + list_add_tail(&entry->list, pos); spin_unlock_irqrestore(&netcp_priv->lock, flags); return 0; @@ -520,6 +523,7 @@ EXPORT_SYMBOL_GPL(netcp_unregister_txhook); int netcp_register_rxhook(struct netcp_intf *netcp_priv, int order, netcp_hook_rtn *hook_rtn, void *hook_data) { + struct list_head *pos = &netcp_priv->rxhook_list_head; struct netcp_hook_list *entry; struct netcp_hook_list *next; unsigned long flags; @@ -534,10 +538,12 @@ int netcp_register_rxhook(struct netcp_intf *netcp_priv, int order, spin_lock_irqsave(&netcp_priv->lock, flags); list_for_each_entry(next, &netcp_priv->rxhook_list_head, list) { - if (next->order > order) + if (next->order > order) { + pos = &next->list; break; + } } - __list_add(&entry->list, next->list.prev, &next->list); + list_add_tail(&entry->list, pos); spin_unlock_irqrestore(&netcp_priv->lock, flags); return 0; -- 2.25.1
[PATCH net-next v5 14/18] sfc: Remove usage of list iterator for list_add() after the loop body
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer pointing to the location where the element should be inserted [1]. Before, the code implicitly used the head when no element was found when using &new->list. The new 'pos' variable is set to the list head by default and overwritten if the list exits early, marking the insertion point for list_add(). Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/sfc/rx_common.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c index fa8b9aacca11..74c056210e0b 100644 --- a/drivers/net/ethernet/sfc/rx_common.c +++ b/drivers/net/ethernet/sfc/rx_common.c @@ -560,14 +560,17 @@ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) { struct list_head *head = &efx->rss_context.list; struct efx_rss_context *ctx, *new; + struct list_head *pos = head; u32 id = 1; /* Don't use zero, that refers to the master RSS context */ WARN_ON(!mutex_is_locked(&efx->rss_lock)); /* Search for first gap in the numbering */ list_for_each_entry(ctx, head, list) { - if (ctx->user_id != id) + if (ctx->user_id != id) { + pos = &ctx->list; break; + } id++; /* Check for wrap. If this happens, we have nearly 2^32 * allocated RSS contexts, which seems unlikely. @@ -585,7 +588,7 @@ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) /* Insert the new entry into the gap */ new->user_id = id; - list_add_tail(&new->list, &ctx->list); + list_add_tail(&new->list, pos); return new; } -- 2.25.1
[PATCH net-next v5 13/18] net: qede: Remove check of list iterator against head past the loop body
When list_for_each_entry() completes the iteration over the whole list without breaking the loop, the iterator value will be a bogus pointer computed based on the head element. While it is safe to use the pointer to determine if it was computed based on the head element, either with list_entry_is_head() or &pos->member == head, using the iterator variable after the loop should be avoided. In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer to point to the found element [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qede/qede_filter.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c index 3010833ddde3..3d167e37e654 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_filter.c +++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c @@ -829,18 +829,21 @@ int qede_configure_vlan_filters(struct qede_dev *edev) int qede_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid) { struct qede_dev *edev = netdev_priv(dev); - struct qede_vlan *vlan; + struct qede_vlan *vlan = NULL; + struct qede_vlan *iter; int rc = 0; DP_VERBOSE(edev, NETIF_MSG_IFDOWN, "Removing vlan 0x%04x\n", vid); /* Find whether entry exists */ __qede_lock(edev); - list_for_each_entry(vlan, &edev->vlan_list, list) - if (vlan->vid == vid) + list_for_each_entry(iter, &edev->vlan_list, list) + if (iter->vid == vid) { + vlan = iter; break; + } - if (list_entry_is_head(vlan, &edev->vlan_list, list)) { + if (!vlan) { DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN), "Vlan isn't configured\n"); goto out; -- 2.25.1
[PATCH net-next v5 12/18] net: qede: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use a found variable and simply checking if the variable was set, can determine if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qede/qede_rdma.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_rdma.c b/drivers/net/ethernet/qlogic/qede/qede_rdma.c index 6304514a6f2c..2eb03ffe2484 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_rdma.c +++ b/drivers/net/ethernet/qlogic/qede/qede_rdma.c @@ -246,18 +246,17 @@ static void qede_rdma_change_mtu(struct qede_dev *edev) static struct qede_rdma_event_work * qede_rdma_get_free_event_node(struct qede_dev *edev) { - struct qede_rdma_event_work *event_node = NULL; - bool found = false; + struct qede_rdma_event_work *event_node = NULL, *iter; - list_for_each_entry(event_node, &edev->rdma_info.rdma_event_list, + list_for_each_entry(iter, &edev->rdma_info.rdma_event_list, list) { - if (!work_pending(&event_node->work)) { - found = true; + if (!work_pending(&iter->work)) { + event_node = iter; break; } } - if (!found) { + if (!event_node) { event_node = kzalloc(sizeof(*event_node), GFP_ATOMIC); if (!event_node) { DP_NOTICE(edev, -- 2.25.1
[PATCH net-next v5 11/18] qed: Remove usage of list iterator variable after the loop
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. Since "found" and "p_ent" need to be equal, "found" should be used consistently to limit the scope of "p_ent" to the list traversal in the future. Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_spq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c index d01b9245f811..cbaa2abed660 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_spq.c +++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c @@ -934,10 +934,10 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, u8 fw_return_code, union event_ring_data *p_data) { + struct qed_spq_entry*found = NULL; struct qed_spq *p_spq; - struct qed_spq_entry*p_ent = NULL; + struct qed_spq_entry*p_ent; struct qed_spq_entry*tmp; - struct qed_spq_entry*found = NULL; if (!p_hwfn) return -EINVAL; @@ -980,7 +980,7 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, DP_VERBOSE(p_hwfn, QED_MSG_SPQ, "Complete EQE [echo %04x]: func %p cookie %p)\n", le16_to_cpu(echo), - p_ent->comp_cb.function, p_ent->comp_cb.cookie); + found->comp_cb.function, found->comp_cb.cookie); if (found->comp_cb.function) found->comp_cb.function(p_hwfn, found->comp_cb.cookie, p_data, fw_return_code); -- 2.25.1
[PATCH net-next v5 10/18] qed: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use a found variable and simply checking if the variable was set, can determine if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 26 ++--- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c index 1d1d4caad680..198c9321bf51 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -1630,38 +1630,36 @@ static struct qed_iwarp_listener * qed_iwarp_get_listener(struct qed_hwfn *p_hwfn, struct qed_iwarp_cm_info *cm_info) { - struct qed_iwarp_listener *listener = NULL; + struct qed_iwarp_listener *listener = NULL, *iter; static const u32 ip_zero[4] = { 0, 0, 0, 0 }; - bool found = false; - list_for_each_entry(listener, + list_for_each_entry(iter, &p_hwfn->p_rdma_info->iwarp.listen_list, list_entry) { - if (listener->port == cm_info->local_port) { - if (!memcmp(listener->ip_addr, + if (iter->port == cm_info->local_port) { + if (!memcmp(iter->ip_addr, ip_zero, sizeof(ip_zero))) { - found = true; + listener = iter; break; } - if (!memcmp(listener->ip_addr, + if (!memcmp(iter->ip_addr, cm_info->local_ip, sizeof(cm_info->local_ip)) && - (listener->vlan == cm_info->vlan)) { - found = true; + iter->vlan == cm_info->vlan) { + listener = iter; break; } } } - if (found) { + if (listener) DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener found = %p\n", listener); - return listener; - } + else + DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener not found\n"); - DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener not found\n"); - return NULL; + return listener; } static int -- 2.25.1
[PATCH net-next v5 09/18] qed: Use dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_dev.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index 672480c9d195..e920e7dcf66a 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -174,7 +174,7 @@ int qed_db_recovery_add(struct qed_dev *cdev, int qed_db_recovery_del(struct qed_dev *cdev, void __iomem *db_addr, void *db_data) { - struct qed_db_recovery_entry *db_entry = NULL; + struct qed_db_recovery_entry *db_entry = NULL, *iter; struct qed_hwfn *p_hwfn; int rc = -EINVAL; @@ -190,12 +190,13 @@ int qed_db_recovery_del(struct qed_dev *cdev, /* Protect the list */ spin_lock_bh(&p_hwfn->db_recovery_info.lock); - list_for_each_entry(db_entry, + list_for_each_entry(iter, &p_hwfn->db_recovery_info.list, list_entry) { /* search according to db_data addr since db_addr is not unique (roce) */ - if (db_entry->db_data == db_data) { - qed_db_recovery_dp_entry(p_hwfn, db_entry, "Deleting"); - list_del(&db_entry->list_entry); + if (iter->db_data == db_data) { + qed_db_recovery_dp_entry(p_hwfn, iter, "Deleting"); + list_del(&iter->list_entry); + db_entry = iter; rc = 0; break; } -- 2.25.1
[PATCH net-next v5 08/18] net: sparx5: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use a found variable and simply checking if the variable was set, can determine if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- .../microchip/sparx5/sparx5_mactable.c| 25 +-- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c index a5837dbe0c7e..bb8d9ce79ac2 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c @@ -362,8 +362,7 @@ static void sparx5_mact_handle_entry(struct sparx5 *sparx5, unsigned char mac[ETH_ALEN], u16 vid, u32 cfg2) { - struct sparx5_mact_entry *mact_entry; - bool found = false; + struct sparx5_mact_entry *mact_entry = NULL, *iter; u16 port; if (LRN_MAC_ACCESS_CFG_2_MAC_ENTRY_ADDR_TYPE_GET(cfg2) != @@ -378,28 +377,28 @@ static void sparx5_mact_handle_entry(struct sparx5 *sparx5, return; mutex_lock(&sparx5->mact_lock); - list_for_each_entry(mact_entry, &sparx5->mact_entries, list) { - if (mact_entry->vid == vid && - ether_addr_equal(mac, mact_entry->mac)) { - found = true; - mact_entry->flags |= MAC_ENT_ALIVE; - if (mact_entry->port != port) { + list_for_each_entry(iter, &sparx5->mact_entries, list) { + if (iter->vid == vid && + ether_addr_equal(mac, iter->mac)) { + iter->flags |= MAC_ENT_ALIVE; + if (iter->port != port) { dev_warn(sparx5->dev, "Entry move: %d -> %d\n", -mact_entry->port, port); - mact_entry->port = port; - mact_entry->flags |= MAC_ENT_MOVED; +iter->port, port); + iter->port = port; + iter->flags |= MAC_ENT_MOVED; } /* Entry handled */ + mact_entry = iter; break; } } mutex_unlock(&sparx5->mact_lock); - if (found && !(mact_entry->flags & MAC_ENT_MOVED)) + if (mact_entry && !(mact_entry->flags & MAC_ENT_MOVED)) /* Present, not moved */ return; - if (!found) { + if (!mact_entry) { /* Entry not found - now add */ mact_entry = alloc_mact_entry(sparx5, mac, vid, port); if (!mact_entry) -- 2.25.1
[PATCH net-next v5 07/18] net: dsa: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use a found variable and simply checking if the variable was set, can determine if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel Reviewed-by: Vladimir Oltean Reviewed-by: Florian Fainelli --- net/dsa/dsa.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index 89c6c86e746f..645522c4dd4a 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -112,22 +112,21 @@ const struct dsa_device_ops *dsa_find_tagger_by_name(const char *buf) const struct dsa_device_ops *dsa_tag_driver_get(int tag_protocol) { - struct dsa_tag_driver *dsa_tag_driver; + struct dsa_tag_driver *dsa_tag_driver = NULL, *iter; const struct dsa_device_ops *ops; - bool found = false; request_module("%s%d", DSA_TAG_DRIVER_ALIAS, tag_protocol); mutex_lock(&dsa_tag_drivers_lock); - list_for_each_entry(dsa_tag_driver, &dsa_tag_drivers_list, list) { - ops = dsa_tag_driver->ops; + list_for_each_entry(iter, &dsa_tag_drivers_list, list) { + ops = iter->ops; if (ops->proto == tag_protocol) { - found = true; + dsa_tag_driver = iter; break; } } - if (found) { + if (dsa_tag_driver) { if (!try_module_get(dsa_tag_driver->owner)) ops = ERR_PTR(-ENOPROTOOPT); } else { -- 2.25.1
[PATCH net-next v5 06/18] net: dsa: mv88e6xxx: refactor mv88e6xxx_port_vlan()
From: Vladimir Oltean To avoid bugs and speculative execution exploits due to type-confused pointers at the end of a list_for_each_entry() loop, one measure is to restrict code to not use the iterator variable outside the loop block. In the case of mv88e6xxx_port_vlan(), this isn't a problem, as we never let the loops exit through "natural causes" anyway, by using a "found" variable and then using the last "dp" iterator prior to the break, which is a safe thing to do. Nonetheless, with the expected new syntax, this pattern will no longer be possible. Profit off of the occasion and break the two port finding methods into smaller sub-functions. Somehow, returning a copy of the iterator pointer is still accepted. This change makes it redundant to have a "bool found", since the "dp" from mv88e6xxx_port_vlan() now holds NULL if we haven't found what we were looking for. Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel Reviewed-by: Florian Fainelli --- drivers/net/dsa/mv88e6xxx/chip.c | 54 ++-- 1 file changed, 31 insertions(+), 23 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index b3aa0e5bc842..1f35e89053e6 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1378,42 +1378,50 @@ static int mv88e6xxx_set_mac_eee(struct dsa_switch *ds, int port, return 0; } +static struct dsa_port *mv88e6xxx_find_port(struct dsa_switch_tree *dst, + int sw_index, int port) +{ + struct dsa_port *dp; + + list_for_each_entry(dp, &dst->ports, list) + if (dp->ds->index == sw_index && dp->index == port) + return dp; + + return NULL; +} + +static struct dsa_port * +mv88e6xxx_find_port_by_bridge_num(struct dsa_switch_tree *dst, + unsigned int bridge_num) +{ + struct dsa_port *dp; + + list_for_each_entry(dp, &dst->ports, list) + if (dsa_port_bridge_num_get(dp) == bridge_num) + return dp; + + return NULL; +} + /* Mask of the local ports allowed to receive frames from a given fabric port */ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port) { struct dsa_switch *ds = chip->ds; struct dsa_switch_tree *dst = ds->dst; struct dsa_port *dp, *other_dp; - bool found = false; u16 pvlan; - /* dev is a physical switch */ if (dev <= dst->last_switch) { - list_for_each_entry(dp, &dst->ports, list) { - if (dp->ds->index == dev && dp->index == port) { - /* dp might be a DSA link or a user port, so it -* might or might not have a bridge. -* Use the "found" variable for both cases. -*/ - found = true; - break; - } - } - /* dev is a virtual bridge */ + /* dev is a physical switch */ + dp = mv88e6xxx_find_port(dst, dev, port); } else { - list_for_each_entry(dp, &dst->ports, list) { - unsigned int bridge_num = dsa_port_bridge_num_get(dp); - - if (bridge_num + dst->last_switch != dev) - continue; - - found = true; - break; - } + /* dev is a virtual bridge */ + dp = mv88e6xxx_find_port_by_bridge_num(dst, + dev - dst->last_switch); } /* Prevent frames from unknown switch or virtual bridge */ - if (!found) + if (!dp) return 0; /* Frames from DSA links and CPU ports can egress any local port */ -- 2.25.1
[PATCH net-next v5 05/18] net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()
From: Vladimir Oltean We know that "dev > dst->last_switch" in the "else" block. In other words, that "dev - dst->last_switch" is > 0. dsa_port_bridge_num_get(dp) can be 0, but the check "if (bridge_num + dst->last_switch != dev) continue", rewritten as "if (bridge_num != dev - dst->last_switch) continue", aka "if (bridge_num != something which cannot be 0) continue", makes it redundant to have the extra "if (!bridge_num) continue" logic, since a bridge_num of zero would have been skipped anyway. Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel Reviewed-by: Florian Fainelli --- drivers/net/dsa/mv88e6xxx/chip.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 64f4fdd02902..b3aa0e5bc842 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1404,9 +1404,6 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port) list_for_each_entry(dp, &dst->ports, list) { unsigned int bridge_num = dsa_port_bridge_num_get(dp); - if (!bridge_num) - continue; - if (bridge_num + dst->last_switch != dev) continue; -- 2.25.1
[PATCH net-next v5 04/18] net: dsa: sja1105: use list_add_tail(pos) instead of list_add(pos->prev)
From: Vladimir Oltean When passed a non-head list element, list_add_tail() actually adds the new element to its left, which is what we want. Despite the slightly confusing name, use the dedicated function which does the same thing as the open-coded list_add(pos->prev). Suggested-by: Jakub Kicinski Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel Reviewed-by: Florian Fainelli --- drivers/net/dsa/sja1105/sja1105_vl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c index e5ea8eb9ec4e..7fe9b18f1cbd 100644 --- a/drivers/net/dsa/sja1105/sja1105_vl.c +++ b/drivers/net/dsa/sja1105/sja1105_vl.c @@ -49,7 +49,7 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, e->rule = rule; e->gate_state = gate_state; e->interval = entry_time; - list_add(&e->list, pos->prev); + list_add_tail(&e->list, pos); gating_cfg->num_entries++; -- 2.25.1
[PATCH net-next v5 03/18] net: dsa: sja1105: reorder sja1105_first_entry_longer_than with memory allocation
From: Vladimir Oltean sja1105_first_entry_longer_than() does not make use of the full struct sja1105_gate_entry *e, just of e->interval which is set from the passed entry_time. This means that if there is a gate conflict, we have allocated e for nothing, just to free it later. Reorder the memory allocation and the function call, to avoid that and simplify the error unwind path. Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel Reviewed-by: Florian Fainelli --- drivers/net/dsa/sja1105/sja1105_vl.c | 17 + 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c index 369be2ac3587..e5ea8eb9ec4e 100644 --- a/drivers/net/dsa/sja1105/sja1105_vl.c +++ b/drivers/net/dsa/sja1105/sja1105_vl.c @@ -36,7 +36,11 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, { struct sja1105_gate_entry *e; struct list_head *pos; - int rc; + + pos = sja1105_first_entry_longer_than(&gating_cfg->entries, + entry_time, extack); + if (IS_ERR(pos)) + return PTR_ERR(pos); e = kzalloc(sizeof(*e), GFP_KERNEL); if (!e) @@ -45,22 +49,11 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, e->rule = rule; e->gate_state = gate_state; e->interval = entry_time; - - pos = sja1105_first_entry_longer_than(&gating_cfg->entries, - e->interval, extack); - if (IS_ERR(pos)) { - rc = PTR_ERR(pos); - goto err; - } - list_add(&e->list, pos->prev); gating_cfg->num_entries++; return 0; -err: - kfree(e); - return rc; } /* The gate entries contain absolute times in their e->interval field. Convert -- 2.25.1
[PATCH net-next v5 02/18] net: dsa: sja1105: remove use of iterator after list_for_each_entry() loop
From: Vladimir Oltean The link below explains that there is a desire to syntactically change list_for_each_entry() and list_for_each() such that it becomes impossible to use the iterator variable outside the scope of the loop. Although sja1105_insert_gate_entry() makes legitimate use of the iterator pointer when it breaks out, the pattern it uses may become illegal, so it needs to change. It is deemed acceptable to use a copy of the loop iterator, and sja1105_insert_gate_entry() only needs to know the list_head element before which the list insertion should be made. So let's profit from the occasion and refactor the list iteration to a dedicated function. An additional benefit is given by the fact that with the helper function in place, we no longer need to special-case the empty list, since it is equivalent to not having found any gating entry larger than the specified interval in the list. We just need to insert at the tail of that list (list_add vs list_add_tail on an empty list does the same thing). Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220407102900.3086255-3-jakobkosc...@gmail.com/#24810127 Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel Reviewed-by: Florian Fainelli --- drivers/net/dsa/sja1105/sja1105_vl.c | 46 ++-- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c index b7e95d60a6e4..369be2ac3587 100644 --- a/drivers/net/dsa/sja1105/sja1105_vl.c +++ b/drivers/net/dsa/sja1105/sja1105_vl.c @@ -7,6 +7,27 @@ #define SJA1105_SIZE_VL_STATUS 8 +static struct list_head * +sja1105_first_entry_longer_than(struct list_head *entries, + s64 interval, + struct netlink_ext_ack *extack) +{ + struct sja1105_gate_entry *p; + + list_for_each_entry(p, entries, list) { + if (p->interval == interval) { + NL_SET_ERR_MSG_MOD(extack, "Gate conflict"); + return ERR_PTR(-EBUSY); + } + + if (interval < p->interval) + return &p->list; + } + + /* Empty list, or specified interval is largest within the list */ + return entries; +} + /* Insert into the global gate list, sorted by gate action time. */ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, struct sja1105_rule *rule, @@ -14,6 +35,7 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, struct netlink_ext_ack *extack) { struct sja1105_gate_entry *e; + struct list_head *pos; int rc; e = kzalloc(sizeof(*e), GFP_KERNEL); @@ -24,25 +46,15 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, e->gate_state = gate_state; e->interval = entry_time; - if (list_empty(&gating_cfg->entries)) { - list_add(&e->list, &gating_cfg->entries); - } else { - struct sja1105_gate_entry *p; - - list_for_each_entry(p, &gating_cfg->entries, list) { - if (p->interval == e->interval) { - NL_SET_ERR_MSG_MOD(extack, - "Gate conflict"); - rc = -EBUSY; - goto err; - } - - if (e->interval < p->interval) - break; - } - list_add(&e->list, p->list.prev); + pos = sja1105_first_entry_longer_than(&gating_cfg->entries, + e->interval, extack); + if (IS_ERR(pos)) { + rc = PTR_ERR(pos); + goto err; } + list_add(&e->list, pos->prev); + gating_cfg->num_entries++; return 0; -- 2.25.1
[PATCH net-next v5 01/18] connector: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use a found variable and simply checking if the variable was set, can determine if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/connector/cn_queue.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c index 996f025eb63c..ed77599b0b25 100644 --- a/drivers/connector/cn_queue.c +++ b/drivers/connector/cn_queue.c @@ -92,20 +92,19 @@ int cn_queue_add_callback(struct cn_queue_dev *dev, const char *name, void cn_queue_del_callback(struct cn_queue_dev *dev, const struct cb_id *id) { - struct cn_callback_entry *cbq, *n; - int found = 0; + struct cn_callback_entry *cbq = NULL, *iter, *n; spin_lock_bh(&dev->queue_lock); - list_for_each_entry_safe(cbq, n, &dev->queue_list, callback_entry) { - if (cn_cb_equal(&cbq->id.id, id)) { - list_del(&cbq->callback_entry); - found = 1; + list_for_each_entry_safe(iter, n, &dev->queue_list, callback_entry) { + if (cn_cb_equal(&iter->id.id, id)) { + list_del(&iter->callback_entry); + cbq = iter; break; } } spin_unlock_bh(&dev->queue_lock); - if (found) + if (cbq) cn_queue_release_callback(cbq); } -- 2.25.1
[PATCH net-next v5 00/18] Remove use of list iterator after loop body
When the list iterator loop does not exit early the list iterator variable contains a type-confused pointer to a 'bogus' list element computed based on the head [1]. Often a 'found' variable is used to ensure the list iterator variable is only accessed after the loop body if the loop did exit early (using a break or goto). In other cases that list iterator variable is used in combination to access the list member which reverses the invocation of container_of() and brings back a "safe" pointer to the head of the list. Since, due to this code patten, there were quite a few bugs discovered [2], Linus concluded that the rule should be to never use the list iterator after the loop and introduce a dedicated pointer for that [3]. With the new gnu11 standard, it will now be possible to limit the scope of the list iterator variable to the traversal loop itself by defining the variable within the for loop. This, however, requires to remove all uses of the list iterator after the loop. Based on input from Paolo Abeni [4], Vinicius Costa Gomes [5], and Jakub Kicinski [6], I've splitted all the net-next related changes into two patch sets, where this is part 1.a v4->v5: - fix reverse-xmas tree in efx_alloc_rss_context_entry() (Martin Habets) v3->v4: - fix build issue in efx_alloc_rss_context_entry() (Jakub Kicinski) v2->v3: - fix commit authors and signed-off order regarding Vladimir's patches (Sorry about that, wasn't intentional.) v1->v2: - Fixed commit message for PATCH 14/18 and used dedicated variable pointing to the position (Edward Cree) - Removed redundant check in mv88e6xxx_port_vlan() (Vladimir Oltean) - Refactor mv88e6xxx_port_vlan() using separate list iterator functions (Vladimir Oltean) - Refactor sja1105_insert_gate_entry() to use separate list iterator functions (Vladimir Oltean) - Allow early return in sja1105_insert_gate_entry() if sja1105_first_entry_longer_than() didn't find any element (Vladimir Oltean) - Use list_add_tail() instead of list_add() in sja1105_insert_gate_entry() (Jakub Kicinski) - net: netcp: also use separate 'pos' variable instead of duplicating list_add() Link: https://lwn.net/Articles/887097/ [1] Link: https://lore.kernel.org/linux-kernel/20220217184829.1991035-4-jakobkosc...@gmail.com/ [2] Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [3] Link: https://lore.kernel.org/linux-kernel/7393b673c626fd75f2b4f8509faa5459254fb87c.ca...@redhat.com/ [4] Link: https://lore.kernel.org/linux-kernel/877d8a3sww@intel.com/ [5] Link: https://lore.kernel.org/linux-kernel/20220403205502.1b344...@kernel.org/ [6]
[PATCH net-next 13/14] eth: spider: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski --- CC: kou.ishiz...@toshiba.co.jp CC: ge...@infradead.org CC: linuxppc-dev@lists.ozlabs.org --- drivers/net/ethernet/toshiba/spider_net.c | 2 +- drivers/net/ethernet/toshiba/spider_net.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ethernet/toshiba/spider_net.c b/drivers/net/ethernet/toshiba/spider_net.c index f47b8358669d..c09cd961edbb 100644 --- a/drivers/net/ethernet/toshiba/spider_net.c +++ b/drivers/net/ethernet/toshiba/spider_net.c @@ -2270,7 +2270,7 @@ spider_net_setup_netdev(struct spider_net_card *card) timer_setup(&card->aneg_timer, spider_net_link_phy, 0); netif_napi_add(netdev, &card->napi, - spider_net_poll, SPIDER_NET_NAPI_WEIGHT); + spider_net_poll, NAPI_POLL_WEIGHT); spider_net_setup_netdev_ops(netdev); diff --git a/drivers/net/ethernet/toshiba/spider_net.h b/drivers/net/ethernet/toshiba/spider_net.h index 05b1a0736835..51948e2b3a34 100644 --- a/drivers/net/ethernet/toshiba/spider_net.h +++ b/drivers/net/ethernet/toshiba/spider_net.h @@ -44,7 +44,6 @@ extern char spider_net_driver_name[]; #define SPIDER_NET_RX_CSUM_DEFAULT 1 #define SPIDER_NET_WATCHDOG_TIMEOUT50*HZ -#define SPIDER_NET_NAPI_WEIGHT 64 #define SPIDER_NET_FIRMWARE_SEQS 6 #define SPIDER_NET_FIRMWARE_SEQWORDS 1024 -- 2.34.1
Re: serial hang in qemu-system-ppc64 -M pseries
On 27/04/2022 17.27, Thomas Huth wrote: On 26/04/2022 12.26, Rob Landley wrote: When I cut and paste 80-ish characters of text into the Linux serial console, it reads 16 characters and stops. When I hit space, it reads another 16 characters, and if I keep at it will eventually catch up without losing data. If I type, every character shows up immediately. That "16" certainly comes from VTERM_BUFSIZE in hw/char/spapr_vty.c in the QEMU sources, I think. (On other qemu targets and kernels I can cut and paste an entire uuencoded binary and it goes through just fine in one go, but this target hangs with big pastes until I hit keys.) Is this a qemu-side bug, or a kernel-side bug? Kernel config attached (linux 5.18-rc3 or thereabouts), qemu invocation is: qemu-system-ppc64 -M pseries -vga none -nographic -no-reboot -m 256 -kernel vmlinux -initrd powerpc64leroot.cpio.gz -append "panic=1 HOST=powerpc64le console=hvc0" Which version of QEMU are you using? Which frontend (GTK or terminal?) ... this rings a distant bell, but I thought we had fixed these issues long ago in the past... e.g.: https://yhbt.net/lore/all/1380113886-16845-10-git-send-email-mdr...@linux.vnet.ibm.com/ https://git.qemu.org/?p=qemu.git;a=commitdiff;h=8a273cbe53221d28 ... but maybe my memory also just fails and this has never been properly fixed. In case you're using GTK, I think I just found the problem that I was remembering again: https://lists.gnu.org/archive/html/qemu-ppc/2016-11/msg00142.html https://lists.gnu.org/archive/html/qemu-ppc/2016-11/msg00143.html I assume this has never been properly fixed. Thomas
Re: serial hang in qemu-system-ppc64 -M pseries
On 26/04/2022 12.26, Rob Landley wrote: When I cut and paste 80-ish characters of text into the Linux serial console, it reads 16 characters and stops. When I hit space, it reads another 16 characters, and if I keep at it will eventually catch up without losing data. If I type, every character shows up immediately. That "16" certainly comes from VTERM_BUFSIZE in hw/char/spapr_vty.c in the QEMU sources, I think. (On other qemu targets and kernels I can cut and paste an entire uuencoded binary and it goes through just fine in one go, but this target hangs with big pastes until I hit keys.) Is this a qemu-side bug, or a kernel-side bug? Kernel config attached (linux 5.18-rc3 or thereabouts), qemu invocation is: qemu-system-ppc64 -M pseries -vga none -nographic -no-reboot -m 256 -kernel vmlinux -initrd powerpc64leroot.cpio.gz -append "panic=1 HOST=powerpc64le console=hvc0" Which version of QEMU are you using? Which frontend (GTK or terminal?) ... this rings a distant bell, but I thought we had fixed these issues long ago in the past... e.g.: https://yhbt.net/lore/all/1380113886-16845-10-git-send-email-mdr...@linux.vnet.ibm.com/ https://git.qemu.org/?p=qemu.git;a=commitdiff;h=8a273cbe53221d28 ... but maybe my memory also just fails and this has never been properly fixed. Thomas
Re: [PATCH v3 4/4] dt-bindings: fsl: convert fsl, layerscape-scfg to YAML
On Wed, 27 Apr 2022 09:53:38 +0200, Michael Walle wrote: > Convert the fsl,layerscape-scfg binding to the new YAML format. > > In the device trees, the device node always have a "syscon" > compatible, which wasn't mentioned in the previous binding. > > Also added, compared to the original binding, is the > interrupt-controller subnode as used in arch/arm/boot/dts/ls1021a.dtsi > as well as the litte-endian and big-endian properties. > > Signed-off-by: Michael Walle > Reviewed-by: Krzysztof Kozlowski > --- > changes since v2: > - none > > changes since v1: > - moved to soc/fsl/fsl,layerscape-scfg.yaml > - generic name for node in example > - mention added "syscon" compatible in commit message > - reference specific interrupt controller > > .../arm/freescale/fsl,layerscape-scfg.txt | 19 -- > .../bindings/soc/fsl/fsl,layerscape-scfg.yaml | 58 +++ > 2 files changed, 58 insertions(+), 19 deletions(-) > delete mode 100644 > Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-scfg.txt > create mode 100644 > Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-scfg.yaml > Running 'make dtbs_check' with the schema in this patch gives the following warnings. Consider if they are expected or the schema is incorrect. These may not be new warnings. Note that it is not yet a requirement to have 0 warnings for dtbs_check. This will change in the future. Full log is available here: https://patchwork.ozlabs.org/patch/ scfg@157: interrupt-controller@1ac:interrupt-map-mask:0:0: 15 was expected arch/arm64/boot/dts/freescale/fsl-ls1043a-qds.dtb arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dtb arch/arm64/boot/dts/freescale/fsl-ls1046a-frwy.dtb arch/arm64/boot/dts/freescale/fsl-ls1046a-qds.dtb arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dtb scfg@157: interrupt-controller@1ac:interrupt-map-mask:0:0: 7 was expected arch/arm/boot/dts/ls1021a-moxa-uc-8410a.dtb arch/arm/boot/dts/ls1021a-qds.dtb arch/arm/boot/dts/ls1021a-tsn.dtb arch/arm/boot/dts/ls1021a-twr.dtb
Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
On Wed, 27 Apr 2022 15:01:22 +0530 "Naveen N. Rao" wrote: > If one or both of these weak functions are overridden in future, in the > final vmlinux mcount table, references to these will change over to the > non-weak variant which has its own mcount location entry. As such, there > will now be two entries for these functions, both pointing to the same > non-weak location. But is that really true in all cases? x86 uses fentry these days, and other archs do things differently too. But the original mcount (-pg) call happened *after* the frame setup. That means the offset of the mcount call would be at different offsets wrt the start of the function. If you have one of these architectures that still use mcount, and the weak function doesn't have the same size frame setup as the overriding function, then the addresses will not be the same. -- Steve > We don't need the non-weak locations since they will > never be executed. Ftrace skips the duplicate entries due to a previous > commit.
Re: [PATCH 1/2] ftrace: Drop duplicate mcount locations
On Wed, 27 Apr 2022 15:01:21 +0530 "Naveen N. Rao" wrote: > In the absence of section symbols [1], objtool (today) and recordmcount > (with a subsequent patch) generate __mcount_loc relocation records with > weak symbols as the base. This works fine as long as those weak symbols > are not overridden, but if they are, these can result in duplicate > entries in the final vmlinux mcount location table. This will cause > ftrace to fail when trying to patch the same location twice. Fix this by > dropping duplicate locations during ftrace init. > > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1 > > Signed-off-by: Naveen N. Rao > --- > kernel/trace/ftrace.c | 13 - > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index 4f1d2f5e726341..8bc4f282bb3ff4 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -6496,7 +6496,7 @@ static int ftrace_process_locs(struct module *mod, > struct dyn_ftrace *rec; > unsigned long count; > unsigned long *p; > - unsigned long addr; > + unsigned long addr, prev_addr = 0; > unsigned long flags = 0; /* Shut up gcc */ > int ret = -ENOMEM; > > @@ -6550,6 +6550,16 @@ static int ftrace_process_locs(struct module *mod, > while (p < end) { > unsigned long end_offset; > addr = ftrace_call_adjust(*p++); > + > + /* > + * Drop duplicate entries, which can happen when weak > + * functions are overridden, and __mcount_loc relocation > + * records were generated against function names due to > + * absence of non-weak section symbols > + */ > + if (addr == prev_addr) > + addr = 0; Please don't use the side effect of addr == 0 causing the loop to continue for this logic. The two are not related. Simply call continue. if (addr == prev_addr) continue; -- Steve > + > /* >* Some architecture linkers will pad between >* the different mcount_loc sections of different > @@ -6569,6 +6579,7 @@ static int ftrace_process_locs(struct module *mod, > > rec = &pg->records[pg->index++]; > rec->ip = addr; > + prev_addr = addr; > } > > /* We should have used all pages */
[PATCH 1/2] ftrace: Drop duplicate mcount locations
In the absence of section symbols [1], objtool (today) and recordmcount (with a subsequent patch) generate __mcount_loc relocation records with weak symbols as the base. This works fine as long as those weak symbols are not overridden, but if they are, these can result in duplicate entries in the final vmlinux mcount location table. This will cause ftrace to fail when trying to patch the same location twice. Fix this by dropping duplicate locations during ftrace init. [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1 Signed-off-by: Naveen N. Rao --- kernel/trace/ftrace.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 4f1d2f5e726341..8bc4f282bb3ff4 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -6496,7 +6496,7 @@ static int ftrace_process_locs(struct module *mod, struct dyn_ftrace *rec; unsigned long count; unsigned long *p; - unsigned long addr; + unsigned long addr, prev_addr = 0; unsigned long flags = 0; /* Shut up gcc */ int ret = -ENOMEM; @@ -6550,6 +6550,16 @@ static int ftrace_process_locs(struct module *mod, while (p < end) { unsigned long end_offset; addr = ftrace_call_adjust(*p++); + + /* +* Drop duplicate entries, which can happen when weak +* functions are overridden, and __mcount_loc relocation +* records were generated against function names due to +* absence of non-weak section symbols +*/ + if (addr == prev_addr) + addr = 0; + /* * Some architecture linkers will pad between * the different mcount_loc sections of different @@ -6569,6 +6579,7 @@ static int ftrace_process_locs(struct module *mod, rec = &pg->records[pg->index++]; rec->ip = addr; + prev_addr = addr; } /* We should have used all pages */ -- 2.35.1
[PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
This solves a build issue on powerpc with binutils v2.36 and newer [1]. Since commit d1bcae833b32f1 ("ELF: Don't generate unused section symbols") [2], binutils started dropping section symbols that it thought were unused. Due to this, in certain scenarios, recordmcount is unable to find a non-weak symbol to generate a relocation record against. Clang integrated assembler is also aggressive in dropping section symbols [3]. In the past, there have been various workarounds to address this. See commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch: https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n@linux.vnet.ibm.com/T/#u Fix this issue by using the weak symbol in the relocation record. This can result in duplicate locations in the mcount table if those weak functions are overridden, so have ftrace skip dupicate entries. Objtool already follows this approach, so patch 2 updates recordmcount to do the same. Patch 1 updates ftrace to skip duplicate entries. - Naveen [1] https://github.com/linuxppc/issues/issues/388 [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1 [3] https://github.com/ClangBuiltLinux/linux/issues/981 Naveen N. Rao (2): ftrace: Drop duplicate mcount locations recordmcount: Handle sections with no non-weak symbols kernel/trace/ftrace.c | 13 ++- scripts/recordmcount.h | 86 +++--- 2 files changed, 85 insertions(+), 14 deletions(-) base-commit: 83d8a0d166119de813cad27ae7d61f54f9aea707 -- 2.35.1
[PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
Kernel builds on powerpc are failing with the below error [1]: CC kernel/kexec_file.o Cannot find symbol for section 9: .text.unlikely. kernel/kexec_file.o: failed Since commit d1bcae833b32f1 ("ELF: Don't generate unused section symbols") [2], binutils started dropping section symbols that it thought were unused. This isn't an issue in general, but with kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a separate .text.unlikely section and the section symbol ".text.unlikely" is being dropped. Due to this, recordmcount is unable to find a non-weak symbol in .text.unlikely to generate a relocation record against. Handle this by falling back to a weak symbol, similar to what objtool does in commit 44f6a7c0755d8d ("objtool: Fix seg fault with Clang non-section symbols"). Note that this approach can result in duplicate addresses in the final vmlinux mcount location table. A previous commit updated ftrace to skip such duplicate entries. After this commit, relocation records for __mcount_loc for kexec_file.o now include two entries with the weak functions arch_kexec_apply_relocations() and arch_kexec_apply_relocation_add() as the relocation bases: ... 0080 R_PPC64_ADDR64.text+0x1d34 0088 R_PPC64_ADDR64.text+0x1fec 0090 R_PPC64_ADDR64 arch_kexec_apply_relocations_add+0x000c 0098 R_PPC64_ADDR64 arch_kexec_apply_relocations+0x000c Powerpc does not override these functions today, so these get converted to correct offsets in the mcount location table in vmlinux. If one or both of these weak functions are overridden in future, in the final vmlinux mcount table, references to these will change over to the non-weak variant which has its own mcount location entry. As such, there will now be two entries for these functions, both pointing to the same non-weak location. We don't need the non-weak locations since they will never be executed. Ftrace skips the duplicate entries due to a previous commit. [1] https://github.com/linuxppc/issues/issues/388 [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1 Signed-off-by: Naveen N. Rao --- scripts/recordmcount.h | 86 +++--- 1 file changed, 73 insertions(+), 13 deletions(-) diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h index 1e9baa5c4fc6ef..0c79a2d2b47493 100644 --- a/scripts/recordmcount.h +++ b/scripts/recordmcount.h @@ -25,6 +25,7 @@ #undef sift_rel_mcount #undef nop_mcount #undef find_secsym_ndx +#undef find_sym_ndx #undef __has_rel_mcount #undef has_rel_mcount #undef tot_relsize @@ -60,6 +61,7 @@ # define sift_rel_mcount sift64_rel_mcount # define nop_mcountnop_mcount_64 # define find_secsym_ndx find64_secsym_ndx +# define find_sym_ndx find64_sym_ndx # define __has_rel_mcount __has64_rel_mcount # define has_rel_mcounthas64_rel_mcount # define tot_relsize tot64_relsize @@ -98,6 +100,7 @@ # define sift_rel_mcount sift32_rel_mcount # define nop_mcountnop_mcount_32 # define find_secsym_ndx find32_secsym_ndx +# define find_sym_ndx find32_sym_ndx # define __has_rel_mcount __has32_rel_mcount # define has_rel_mcounthas32_rel_mcount # define tot_relsize tot32_relsize @@ -392,6 +395,51 @@ static void get_sym_str_and_relp(Elf_Shdr const *const relhdr, *relp = rel0; } +/* + * Find a symbol in the given section containing the instruction offset passed + * in r_offset, to be used in generating the relocation record for the mcount + * location. This is used if there were no local/global symbols in the given + * section to be used as the base. Weak symbols are ok, and are expected to + * result in duplicate mcount locations which get dropped by ftrace. + */ +static int find_sym_ndx(unsigned const txtndx, +char const *const txtname, +uint_t *const recvalp, +unsigned int *sym_index, +Elf_Shdr const *const symhdr, +Elf32_Word const *symtab, +Elf32_Word const *symtab_shndx, +unsigned const r_offset, +Elf_Ehdr const *const ehdr) +{ + Elf_Sym const *const sym0 = (Elf_Sym const *)(_w(symhdr->sh_offset) + + (void *)ehdr); + unsigned const nsym = _w(symhdr->sh_size) / _w(symhdr->sh_entsize); + Elf_Sym const *symp; + unsigned t; + + for (symp = sym0, t = nsym; t; --t, ++symp) { + if (txtndx == get_symindex(symp, symtab, symtab_shndx)) { + /* function symbols on ARM have quirks, avoid them */ + if (w2(ehdr->e_machine) == EM_ARM && + ELF_ST_TYPE(symp->st_info) == STT_FUNC) +
[PATCH v6 05/17] powerpc: define get_cycles macro for arch-override
PowerPC defines a get_cycles() function, but it forgot to do the usual `#define get_cycles get_cycles` dance, making it impossible for generic code to see if an arch-specific function was defined. Cc: Thomas Gleixner Cc: Arnd Bergmann Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Jason A. Donenfeld --- arch/powerpc/include/asm/timex.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/include/asm/timex.h b/arch/powerpc/include/asm/timex.h index fa2e76e4093a..14b4489de52c 100644 --- a/arch/powerpc/include/asm/timex.h +++ b/arch/powerpc/include/asm/timex.h @@ -19,6 +19,7 @@ static inline cycles_t get_cycles(void) { return mftb(); } +#define get_cycles get_cycles #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_TIMEX_H */ -- 2.35.1
[PATCH v3 0/4] dt-bindings: convert freescale extirq and scfg schemas
The first two patches will be resend to the soc tree once the schema is approved/picked up. Please note, I'm still getting these weird "is too short" errors for for interrupt-map entries, but it seems to work for you, so.. ;) Michael Walle (4): ARM: dts: ls1021a: reduce the interrupt-map-mask arm64: dts: freescale: reduce the interrup-map-mask dt-bindings: interrupt-controller: fsl,ls-extirq: convert to YAML dt-bindings: fsl: convert fsl,layerscape-scfg to YAML .../arm/freescale/fsl,layerscape-scfg.txt | 19 --- .../interrupt-controller/fsl,ls-extirq.txt| 53 .../interrupt-controller/fsl,ls-extirq.yaml | 118 ++ .../bindings/soc/fsl/fsl,layerscape-scfg.yaml | 58 + arch/arm/boot/dts/ls1021a.dtsi| 2 +- .../arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 2 +- .../arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +- .../arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 2 +- .../arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 2 +- .../arm64/boot/dts/freescale/fsl-lx2160a.dtsi | 2 +- 10 files changed, 182 insertions(+), 78 deletions(-) delete mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-scfg.txt delete mode 100644 Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.txt create mode 100644 Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.yaml create mode 100644 Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-scfg.yaml -- 2.30.2
[PATCH v3 1/4] ARM: dts: ls1021a: reduce the interrupt-map-mask
Reduce the interrupt-map-mask of the external interrupt controller to 7 to align with the devicetree schema. Signed-off-by: Michael Walle --- arch/arm/boot/dts/ls1021a.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi index 2e69d6eab4d1..5354104cae12 100644 --- a/arch/arm/boot/dts/ls1021a.dtsi +++ b/arch/arm/boot/dts/ls1021a.dtsi @@ -192,7 +192,7 @@ extirq: interrupt-controller@1ac { <3 0 &gic GIC_SPI 167 IRQ_TYPE_LEVEL_HIGH>, <4 0 &gic GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>, <5 0 &gic GIC_SPI 169 IRQ_TYPE_LEVEL_HIGH>; - interrupt-map-mask = <0x 0x0>; + interrupt-map-mask = <0x7 0x0>; }; }; -- 2.30.2
[PATCH v3 2/4] arm64: dts: freescale: reduce the interrup-map-mask
Reduce the interrupt-map-mask of the external interrupt controller to 0xf to align with the devicetree schema. Signed-off-by: Michael Walle --- arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 2 +- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +- arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 2 +- arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 2 +- arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi index 35d1939e690b..46cc8d45ca65 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi @@ -335,7 +335,7 @@ extirq: interrupt-controller@1ac { <9 0 &gic GIC_SPI 149 IRQ_TYPE_LEVEL_HIGH>, <10 0 &gic GIC_SPI 150 IRQ_TYPE_LEVEL_HIGH>, <11 0 &gic GIC_SPI 151 IRQ_TYPE_LEVEL_HIGH>; - interrupt-map-mask = <0x 0x0>; + interrupt-map-mask = <0xf 0x0>; }; }; diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi index 4e7bd04d9798..3e8def8fe1b4 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -341,7 +341,7 @@ extirq: interrupt-controller@1ac { <9 0 &gic GIC_SPI 149 IRQ_TYPE_LEVEL_HIGH>, <10 0 &gic GIC_SPI 150 IRQ_TYPE_LEVEL_HIGH>, <11 0 &gic GIC_SPI 151 IRQ_TYPE_LEVEL_HIGH>; - interrupt-map-mask = <0x 0x0>; + interrupt-map-mask = <0xf 0x0>; }; }; diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi index 18e529118476..33c5ad1b9b96 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi @@ -265,7 +265,7 @@ extirq: interrupt-controller@14 { <9 0 &gic GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>, <10 0 &gic GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>, <11 0 &gic GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>; - interrupt-map-mask = <0x 0x0>; + interrupt-map-mask = <0xf 0x0>; }; }; diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi index 1282b61da8a5..3f767994b02d 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi @@ -305,7 +305,7 @@ extirq: interrupt-controller@14 { <9 0 &gic GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>, <10 0 &gic GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>, <11 0 &gic GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>; - interrupt-map-mask = <0x 0x0>; + interrupt-map-mask = <0xf 0x0>; }; }; diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi index 82bd8c0f318b..47ea854720ce 100644 --- a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi @@ -698,7 +698,7 @@ extirq: interrupt-controller@14 { <9 0 &gic GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>, <10 0 &gic GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>, <11 0 &gic GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>; - interrupt-map-mask = <0x 0x0>; + interrupt-map-mask = <0xf 0x0>; }; }; -- 2.30.2
[PATCH v3 3/4] dt-bindings: interrupt-controller: fsl, ls-extirq: convert to YAML
Convert the fsl,ls-extirq binding to the new YAML format. In contrast to the original binding documentation, there are three compatibles which are used in their corresponding device trees which have a specific compatible and the (already documented) fallback compatible: - "fsl,ls1046a-extirq", "fsl,ls1043a-extirq" - "fsl,ls2080a-extirq", "fsl,ls1088a-extirq" - "fsl,lx2160a-extirq", "fsl,ls1088a-extirq" Depending on the number of the number of the external IRQs which is usually 12 except for the LS1021A where there are only 6, the interrupt-map-mask was reduced from 0x to 0xf and 0x7 respectively and the number of interrupt-map entries have to match. Signed-off-by: Michael Walle --- changes since v2: - drop $ref to interrupt-controller.yaml - use a more strict interrupt-map-mask and make it conditional on SoC changes since v1: - new patch .../interrupt-controller/fsl,ls-extirq.txt| 53 .../interrupt-controller/fsl,ls-extirq.yaml | 118 ++ 2 files changed, 118 insertions(+), 53 deletions(-) delete mode 100644 Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.txt create mode 100644 Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.yaml diff --git a/Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.txt b/Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.txt deleted file mode 100644 index 4d47df1a5c91.. --- a/Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.txt +++ /dev/null @@ -1,53 +0,0 @@ -* Freescale Layerscape external IRQs - -Some Layerscape SOCs (LS1021A, LS1043A, LS1046A -LS1088A, LS208xA, LX216xA) support inverting -the polarity of certain external interrupt lines. - -The device node must be a child of the node representing the -Supplemental Configuration Unit (SCFG). - -Required properties: -- compatible: should be "fsl,-extirq", e.g. "fsl,ls1021a-extirq". - "fsl,ls1043a-extirq": for LS1043A, LS1046A. - "fsl,ls1088a-extirq": for LS1088A, LS208xA, LX216xA. -- #interrupt-cells: Must be 2. The first element is the index of the - external interrupt line. The second element is the trigger type. -- #address-cells: Must be 0. -- interrupt-controller: Identifies the node as an interrupt controller -- reg: Specifies the Interrupt Polarity Control Register (INTPCR) in - the SCFG or the External Interrupt Control Register (IRQCR) in - the ISC. -- interrupt-map: Specifies the mapping from external interrupts to GIC - interrupts. -- interrupt-map-mask: Must be <0x 0>. - -Example: - scfg: scfg@157 { - compatible = "fsl,ls1021a-scfg", "syscon"; - reg = <0x0 0x157 0x0 0x1>; - big-endian; - #address-cells = <1>; - #size-cells = <1>; - ranges = <0x0 0x0 0x157 0x1>; - - extirq: interrupt-controller@1ac { - compatible = "fsl,ls1021a-extirq"; - #interrupt-cells = <2>; - #address-cells = <0>; - interrupt-controller; - reg = <0x1ac 4>; - interrupt-map = - <0 0 &gic GIC_SPI 163 IRQ_TYPE_LEVEL_HIGH>, - <1 0 &gic GIC_SPI 164 IRQ_TYPE_LEVEL_HIGH>, - <2 0 &gic GIC_SPI 165 IRQ_TYPE_LEVEL_HIGH>, - <3 0 &gic GIC_SPI 167 IRQ_TYPE_LEVEL_HIGH>, - <4 0 &gic GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>, - <5 0 &gic GIC_SPI 169 IRQ_TYPE_LEVEL_HIGH>; - interrupt-map-mask = <0x 0x0>; - }; - }; - - - interrupts-extended = <&gic GIC_SPI 88 IRQ_TYPE_LEVEL_HIGH>, - <&extirq 1 IRQ_TYPE_LEVEL_LOW>; diff --git a/Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.yaml b/Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.yaml new file mode 100644 index ..887e565b9573 --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/fsl,ls-extirq.yaml @@ -0,0 +1,118 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/interrupt-controller/fsl,ls-extirq.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Freescale Layerscape External Interrupt Controller + +maintainers: + - Shawn Guo + - Li Yang + +description: | + Some Layerscape SOCs (LS1021A, LS1043A, LS1046A LS1088A, LS208xA, + LX216xA) support inverting the polarity of certain external interrupt + lines. + +properties: + compatible: +oneOf: + - enum: + - fsl,ls1021a-extirq + - fsl,ls1043a-extirq + - fsl,ls1088a-extirq + - items: + - enum: + - fsl,ls1046a-extirq + - const: fsl,ls1043a-extirq + - ite
[PATCH v3 4/4] dt-bindings: fsl: convert fsl,layerscape-scfg to YAML
Convert the fsl,layerscape-scfg binding to the new YAML format. In the device trees, the device node always have a "syscon" compatible, which wasn't mentioned in the previous binding. Also added, compared to the original binding, is the interrupt-controller subnode as used in arch/arm/boot/dts/ls1021a.dtsi as well as the litte-endian and big-endian properties. Signed-off-by: Michael Walle Reviewed-by: Krzysztof Kozlowski --- changes since v2: - none changes since v1: - moved to soc/fsl/fsl,layerscape-scfg.yaml - generic name for node in example - mention added "syscon" compatible in commit message - reference specific interrupt controller .../arm/freescale/fsl,layerscape-scfg.txt | 19 -- .../bindings/soc/fsl/fsl,layerscape-scfg.yaml | 58 +++ 2 files changed, 58 insertions(+), 19 deletions(-) delete mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-scfg.txt create mode 100644 Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-scfg.yaml diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-scfg.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-scfg.txt deleted file mode 100644 index 0ab67b0b216d.. --- a/Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-scfg.txt +++ /dev/null @@ -1,19 +0,0 @@ -Freescale SCFG - -SCFG is the supplemental configuration unit, that provides SoC specific -configuration and status registers for the chip. Such as getting PEX port -status. - -Required properties: - - compatible: Should contain a chip-specific compatible string, - Chip-specific strings are of the form "fsl,-scfg", - The following s are known to be supported: - ls1012a, ls1021a, ls1043a, ls1046a, ls2080a. - - - reg: should contain base address and length of SCFG memory-mapped registers - -Example: - scfg: scfg@157 { - compatible = "fsl,ls1021a-scfg"; - reg = <0x0 0x157 0x0 0x1>; - }; diff --git a/Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-scfg.yaml b/Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-scfg.yaml new file mode 100644 index ..8d088b5fe823 --- /dev/null +++ b/Documentation/devicetree/bindings/soc/fsl/fsl,layerscape-scfg.yaml @@ -0,0 +1,58 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/soc/fsl/fsl,layerscape-scfg.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Freescale Layerscape Supplemental Configuration Unit + +maintainers: + - Shawn Guo + - Li Yang + +description: | + SCFG is the supplemental configuration unit, that provides SoC specific + configuration and status registers for the chip. Such as getting PEX port + status. + +properties: + compatible: +items: + - enum: + - fsl,ls1012a-scfg + - fsl,ls1021a-scfg + - fsl,ls1028a-scfg + - fsl,ls1043a-scfg + - fsl,ls1046a-scfg + - const: syscon + + reg: +maxItems: 1 + + little-endian: true + big-endian: true + + '#address-cells': +const: 1 + + '#size-cells': +const: 1 + + ranges: true + +patternProperties: + "^interrupt-controller@[a-z0-9]+$": +$ref: /schemas/interrupt-controller/fsl,ls-extirq.yaml# + +required: + - compatible + - reg + +additionalProperties: false + +examples: + - | +syscon@157 { +compatible = "fsl,ls1021a-scfg", "syscon"; +reg = <0x157 0x1>; +}; -- 2.30.2